Version: 2.0.0

Create a voice

This endpoint creates a voice and optionally starts training.


The Create voice API is only available to Pro users. Please contact us to upgrade your plan.

Voice types

There are two types of voices you can create on the Resemble platform: Rapid Voice Clone and Professional Voice Clone.

Rapid Voice Clone

A Rapid Voice Clone is a quick and easy way to create a voice for your content. Using as little as 10 seconds of recordings, you can create a voice clone in under a minute.

Professional Voice Clone

A Professional Voice Clone provides a more accurate way of creating a voice. It requires at least 10 minutes of recordings and takes around 40 minutes to create. This allows for a more detailed and personalized voice, as the AI has more data to work with.
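The minimum durations above can be turned into a quick pre-check. The sketch below is only an illustration: `eligibleVoiceTypes` and `MIN_SECONDS` are hypothetical names, not part of the Resemble API, and the thresholds are simply the 10-second and 10-minute figures quoted in this section.

```javascript
// Minimum audio durations quoted in this section, in seconds.
const MIN_SECONDS = {
  rapid: 10,             // "as little as 10 seconds"
  professional: 10 * 60, // "at least 10 minutes"
};

// Hypothetical helper: returns the voice types that the given
// amount of recorded audio (in seconds) could support.
function eligibleVoiceTypes(totalSeconds) {
  return Object.keys(MIN_SECONDS).filter(
    (type) => totalSeconds >= MIN_SECONDS[type]
  );
}

console.log(eligibleVoiceTypes(45));  // [ 'rapid' ]
console.log(eligibleVoiceTypes(900)); // [ 'rapid', 'professional' ]
```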

Voice Data

There are 2 ways to provide data for a voice:

  1. Providing a URL to a dataset when creating the voice

  2. Uploading individual recordings using the recording API

Option 1: Providing a URL to a dataset when creating the voice

Rapid Voice

  1. Create a voice using the "Create a voice" endpoint and provide a URL to the dataset in the dataset_url attribute. The dataset must be a WAV file of at least 10 seconds.
  2. After creating the voice, follow the Build a voice documentation to start training.

Professional Voice

  1. Create a voice using the "Create a voice" endpoint and provide a URL to the dataset in the dataset_url attribute. Please see here for acceptable dataset formats.
  2. The dataset will first be analyzed and then training will begin automatically.

Option 2: Uploading individual recordings using the recording API

Rapid Voice

  1. Create a voice using the "Create a voice" endpoint and omit the dataset_url attribute.
  2. Use the instructions on the "Create a recording" page to upload recordings to your voice.
  3. Upon uploading at least 3 recordings, follow the Build a voice documentation to start training.

Professional Voice

  1. Create a voice using the "Create a voice" endpoint and omit the dataset_url attribute.
  2. Use the instructions on the "Create a recording" page to upload recordings to your voice.
  3. Upon uploading at least 20 recordings, follow the Build a voice documentation to start training.
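When uploading recordings one at a time, it can help to check the count before requesting a build. The following is a minimal sketch under the minimums stated above (3 recordings for a rapid voice, 20 for a professional voice); `readyToBuild` is a hypothetical helper, not a Resemble API call.

```javascript
// Minimum number of uploaded recordings before training, per this section.
const MIN_RECORDINGS = { rapid: 3, professional: 20 };

// Hypothetical helper: true once enough recordings have been
// uploaded to start a build for the given voice type.
function readyToBuild(voiceType, recordingCount) {
  if (!Object.prototype.hasOwnProperty.call(MIN_RECORDINGS, voiceType)) {
    throw new Error(`Unknown voice type: ${voiceType}`);
  }
  return recordingCount >= MIN_RECORDINGS[voiceType];
}

console.log(readyToBuild('rapid', 3));         // true
console.log(readyToBuild('professional', 12)); // false
```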

Voice Consent

To clone a voice, you must be an authorized uploader or provide consent to clone your voice using the Resemble AI platform. To provide consent, upload an audio recording containing the following message:

I am aware that recordings of my voice will be used by [name of your company] to train and create a synthetic version of my voice by Resemble AI.

This audio content will be used by the Resemble platform for the purposes of authorizing your voice clone.

HTTP Request

| JSON Body Parameters | Type | Description |
| --- | --- | --- |
| name | string | Name of the voice. |
| consent | string | A base-64 encoded WAV file string containing your consent and authorization to create and clone a voice. Please see the Voice Consent section for more details. |
| voice_type | (optional) string | The type of voice to create: either rapid or professional. Defaults to professional if not provided. |
| dataset_url | (optional) string | A URL to a dataset on which to train the voice. Please see here for acceptable dataset formats. |
| callback_uri | (optional) string | A URL (webhook) that will be notified upon voice training completion. Please see here for callback details. |
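As an illustration of how these parameters fit together, the sketch below assembles and validates a request body before it is sent. `buildCreateVoiceBody` and its camelCase option names are hypothetical; only the JSON field names and the rapid/professional values come from the list above.

```javascript
const VOICE_TYPES = ['rapid', 'professional'];

// Hypothetical helper: builds the JSON body from the documented parameters.
function buildCreateVoiceBody({ name, consent, voiceType, datasetUrl, callbackUri }) {
  if (typeof name !== 'string' || name.length === 0) {
    throw new Error('name is required');
  }
  if (typeof consent !== 'string' || consent.length === 0) {
    throw new Error('consent (base-64 encoded WAV) is required');
  }
  if (voiceType !== undefined && !VOICE_TYPES.includes(voiceType)) {
    throw new Error('voice_type must be "rapid" or "professional"');
  }
  const body = { name, consent };
  // Optional fields are sent only when set; leaving out voice_type
  // lets it default to professional.
  if (voiceType !== undefined) body.voice_type = voiceType;
  if (datasetUrl !== undefined) body.dataset_url = datasetUrl;
  if (callbackUri !== undefined) body.callback_uri = callbackUri;
  return body;
}

// Placeholder values for illustration only.
const body = buildCreateVoiceBody({
  name: 'Chef',
  consent: '<base-64 consent audio>',
  voiceType: 'rapid',
});
console.log(JSON.stringify(body));
```

Leaving optional fields out of the body, rather than sending empty strings, keeps the documented defaults in effect.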

Base 64 Encoding

The required consent field must be a valid base-64 encoded string containing the contents of your consent audio file. To convert the file to a base-64 encoded string, you can use the standard library of your programming language of choice. The following example shows an implementation in Node.js.

    const fs = require('fs');
    const path = require('path');

    // Read the contents of the file into a string
    const filePath = 'path/to/consent.wav';
    const fileContents = fs.readFileSync(filePath, { encoding: 'base64' });

    // Output the Base64-encoded string to stdout
    console.log(fileContents);
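Because the consent audio is expected to be a WAV file, it may be worth confirming the file header before encoding. The sketch below assumes the standard RIFF/WAVE layout (the ASCII tag "RIFF" at bytes 0 to 3 and "WAVE" at bytes 8 to 11); `encodeConsent` is a hypothetical helper, and the check is a local sanity test, not the validation performed by the platform.

```javascript
// Hypothetical helper: checks the RIFF/WAVE header, then base-64 encodes the buffer.
function encodeConsent(buffer) {
  const isWav =
    buffer.length >= 12 &&
    buffer.toString('ascii', 0, 4) === 'RIFF' &&
    buffer.toString('ascii', 8, 12) === 'WAVE';
  if (!isWav) {
    throw new Error('Consent audio does not look like a WAV file');
  }
  return buffer.toString('base64');
}

// Demonstration with an in-memory 12-byte header. In practice, pass
// the result of fs.readFileSync('path/to/consent.wav') instead.
const header = Buffer.concat([
  Buffer.from('RIFF', 'ascii'),
  Buffer.alloc(4), // chunk size, zeroed for the demo
  Buffer.from('WAVE', 'ascii'),
]);
console.log(encodeConsent(header));
```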

HTTP Response

    {
      "success": true,
      "item": {
        "uuid": <string>,
        "name": <string>,
        "status": <string>,
        "dataset_url": <string>,
        "created_at": <UTC Date>,
        "updated_at": <UTC Date>
      }
    }

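A caller will usually want the uuid from this response, for example to upload recordings to the new voice. The sketch below shows one way to read it defensively; `readCreateVoiceResponse` is a hypothetical helper, and the sample values are placeholders, not real API output.

```javascript
// Hypothetical helper: returns the new voice's uuid and status,
// or throws if the response does not report success.
function readCreateVoiceResponse(response) {
  if (!response || response.success !== true || !response.item) {
    throw new Error('Voice creation did not succeed');
  }
  const { uuid, status } = response.item;
  return { uuid, status };
}

// Placeholder response shaped like the example above.
const sample = {
  success: true,
  item: { uuid: '<uuid>', name: 'Chef', status: '<status>', dataset_url: null },
};
console.log(readCreateVoiceResponse(sample));
```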

If you provided a callback_uri when you created the voice, you will receive the following POST request when the voice has completed training.

Training Completion Callback

This callback happens when your training completes without any issues.

    {
      "ok": true,
      "id": "<string>",
      "status": "finished",
      "recordings": [],
      "issue": null
    }

Dataset Issue Callback

If the status is set to dataset_issue, this callback will contain detailed information about the issue and problematic recordings:

    {
      "ok": "<boolean>",
      "id": "<string>",
      "status": "dataset_issue",
      "issue": "Detailed description of the dataset issue.",
      "recordings": [
        {
          "uuid": "<string>",
          "name": "<string>",
          "transcript": "<string>",
          "stoi_score": "<number>",
          "pesq_score": "<number>",
          "si_dr_score": "<number>",
          "resemble_sample_score": "<number>",
          "is_active": "<boolean>",
          "is_outlier": "<boolean>",
          "is_silent": "<boolean>"
        }
      ]
    }

If the status is consent_validation_failed, this callback provides information about consent issues:

    {
      "ok": "<boolean>",
      "id": "<string>",
      "status": "consent_validation_failed",
      "issue": "Consent statement issue description.",
      "recordings": []
    }

| JSON Body Parameters | Type | Description |
| --- | --- | --- |
| id | string | The UUID of the voice this callback is for. |
| status | string | The status of the voice, such as finished, dataset_issue, or consent_validation_failed. |
| issue | string | A detailed description of the issue, if any. |
| recordings | array | The recordings array provides detailed feedback for each problematic recording, including scores for STOI, PESQ, and SI-SDR, as well as flags indicating whether the recording is active, an outlier, or silent. |
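A webhook receiver can branch on the status field to decide what to do next. The following is a minimal sketch of that dispatch, assuming the request body has already been parsed as JSON; `summarizeCallback` and the shape of its return value are hypothetical, and only the field names and the three status values come from the callbacks described above.

```javascript
// Hypothetical handler: turns a training callback body into a short summary.
function summarizeCallback(payload) {
  const recordings = Array.isArray(payload.recordings) ? payload.recordings : [];
  switch (payload.status) {
    case 'finished':
      return { voiceId: payload.id, trained: true, issue: null, flagged: [] };
    case 'dataset_issue':
      return {
        voiceId: payload.id,
        trained: false,
        issue: payload.issue,
        // Names of recordings marked silent or as outliers in the callback.
        flagged: recordings
          .filter((r) => r.is_silent || r.is_outlier)
          .map((r) => r.name),
      };
    case 'consent_validation_failed':
      return { voiceId: payload.id, trained: false, issue: payload.issue, flagged: [] };
    default:
      // Other statuses may exist; report them without assuming success.
      return { voiceId: payload.id, trained: false, issue: payload.issue ?? null, flagged: [] };
  }
}

// Placeholder payload for illustration only.
console.log(
  summarizeCallback({
    ok: false,
    id: '<uuid>',
    status: 'dataset_issue',
    issue: 'Detailed description of the dataset issue.',
    recordings: [
      { name: 'take-1', is_silent: true, is_outlier: false },
      { name: 'take-2', is_silent: false, is_outlier: false },
    ],
  })
);
```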


    import { Resemble } from '@resemble/node'

    Resemble.setApiKey('YOUR_API_TOKEN')

    await Resemble.v2.voices.create({ name: "Chef", dataset_url: "https://../", callback_uri: "" })
