Version: 2.0.0

Create a voice

This endpoint creates a voice and optionally starts training.

note

The Create a voice API is only available to pro users. Please contact us to upgrade your plan.

Voice types

There are two types of voices you can create on the Resemble platform: Rapid Voice Clone and Professional Voice Clone.

Rapid Voice Clone

A Rapid Voice Clone is a quick and easy way to create a voice for your content. Using as little as 10 seconds of recordings, you can create a voice clone in under a minute.

Professional Voice Clone

A Professional Voice Clone provides a more accurate way of creating a voice. It requires at least 10 minutes of recordings and takes around 40 minutes to create. This allows for a more detailed and personalized voice, as the AI has more data to work with.

Voice Data

There are 2 ways to provide data for a voice:

Providing a URL to a dataset when creating the voice
Uploading individual recordings using the recording API

Option 1: Providing a URL to a dataset when creating the voice

Rapid Voice

Create a voice using the "Create a voice" endpoint and provide a URL to the dataset in the dataset_url attribute. The dataset must be a wav file of at least 10 seconds.
After creating the voice, follow the Build a voice documentation to start training.

Professional Voice

Create a voice using the "Create a voice" endpoint and provide a URL to the dataset in the dataset_url attribute. Please see here for acceptable dataset formats.
The dataset will first be analyzed and then training will begin automatically.

Option 2: Uploading individual recordings using the recording API

Rapid Voice

Create a voice using the "Create a voice" endpoint and omit the dataset_url attribute.
Use the instructions on the "Create a recording" page to upload recordings to your voice.
Upon uploading at least 3 recordings, follow the Build a voice documentation to start training.

Professional Voice

Create a voice using the "Create a voice" endpoint and omit the dataset_url attribute.
Use the instructions on the "Create a recording" page to upload recordings to your voice.
Upon uploading at least 20 recordings, follow the Build a voice documentation to start training.
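
The recording thresholds in Option 2 can be checked in code before starting a build. The following is a minimal sketch; the helper name canStartTraining and the MIN_RECORDINGS map are hypothetical and not part of the Resemble API. Only the counts (3 for rapid, 20 for professional) come from the steps above.

```javascript
// Minimum number of uploaded recordings per voice type, taken from the steps above.
const MIN_RECORDINGS = new Map([
  ['rapid', 3],
  ['professional', 20],
]);

// Hypothetical helper: returns true once enough recordings have been
// uploaded to start training a voice of the given type.
function canStartTraining(voiceType, uploadedCount) {
  const minimum = MIN_RECORDINGS.get(voiceType);
  if (minimum === undefined) {
    throw new Error(`Unknown voice type: ${voiceType}`);
  }
  return uploadedCount >= minimum;
}

console.log(canStartTraining('rapid', 3));         // true
console.log(canStartTraining('professional', 12)); // false
```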

Voice Consent

In order to clone a voice, you must be an authorized uploader or provide consent to clone your voice using the Resemble AI platform. To provide consent, upload an audio recording containing the following message:

I am aware that recordings of my voice will be used by [name of your company] to train and create a synthetic version of my voice by Resemble AI.

This audio content will be used by the Resemble platform for the purposes of authorizing your voice clone.

HTTP Request

POST https://app.resemble.ai/api/v2/voices

JSON Body Parameters	Type	Description
name	string	Name of the voice
consent	string	A base-64 encoded string of the WAV file containing your consent and authorization to create and clone a voice. Please see the Voice Consent section for more details.
voice_type	(optional) string	The type of voice to create. Either `rapid` or `professional`. If not provided, defaults to `professional`.
dataset_url	(optional) string	A URL to a dataset on which to train the voice. Please see here for acceptable dataset formats.
callback_uri	(optional) string	A URL (webhook) that will be notified upon voice training completion. See the Callback section below for callback details.
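
As an illustration of how the parameters above fit together, here is a minimal sketch that assembles the request without using the SDK. The helper buildCreateVoiceRequest is hypothetical. This section does not specify the Authorization header format, so the sketch takes the full header value as an argument instead of assuming one.

```javascript
const CREATE_VOICE_URL = 'https://app.resemble.ai/api/v2/voices';

// Hypothetical helper: builds the URL and fetch() options for the endpoint.
// `authorization` is the complete Authorization header value (an assumption:
// the caller knows the correct scheme for their account).
function buildCreateVoiceRequest(params, authorization) {
  if (!params.name) throw new Error('name is required');
  if (!params.consent) throw new Error('consent is required');

  const body = { name: params.name, consent: params.consent };

  // Optional fields are only sent when the caller supplies them.
  for (const key of ['voice_type', 'dataset_url', 'callback_uri']) {
    if (params[key] !== undefined) body[key] = params[key];
  }
  if (body.voice_type !== undefined &&
      !['rapid', 'professional'].includes(body.voice_type)) {
    throw new Error('voice_type must be "rapid" or "professional"');
  }

  return {
    url: CREATE_VOICE_URL,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: authorization,
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed to any HTTP client, for example `fetch(request.url, request.options)` on Node 18 or later. Omitting voice_type leaves the server default (`professional`) in effect.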

Base 64 Encoding

The required consent field must be a valid base-64 encoded string containing your consent audio file content. To convert your consent audio file to a base-64 encoded string, you can use the standard library of your programming language of choice. The following example shows an implementation in NodeJS.

NodeJS

const fs = require('fs');

// Read the file and encode its contents as a base-64 string
const filePath = 'path/to/consent.wav';
const fileContents = fs.readFileSync(filePath, { encoding: 'base64' });

// Output the Base64-encoded string to stdout
console.log(fileContents);
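
Because the consent field expects WAV audio, it can help to confirm that the file is a WAV file before encoding it. The sketch below is one possible approach, not something the API requires. The helpers looksLikeWav and encodeConsentFile are hypothetical, and the check only inspects the standard RIFF/WAVE header bytes.

```javascript
const fs = require('fs');

// Returns true when the buffer starts with a RIFF header whose form type is WAVE.
function looksLikeWav(buffer) {
  return buffer.length >= 12 &&
    buffer.toString('ascii', 0, 4) === 'RIFF' &&
    buffer.toString('ascii', 8, 12) === 'WAVE';
}

// Hypothetical helper: reads the consent file, checks the header,
// and returns the base-64 string to place in the consent field.
function encodeConsentFile(filePath) {
  const buffer = fs.readFileSync(filePath);
  if (!looksLikeWav(buffer)) {
    throw new Error(`${filePath} does not look like a WAV file`);
  }
  return buffer.toString('base64');
}
```

Calling `encodeConsentFile('path/to/consent.wav')` returns the same string as the example above for a valid WAV file and throws for other input.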

HTTP Response

{
  "success": true,
  "item": {
    "uuid": <string>,
    "name": <string>,
    "status": <string>,
    "dataset_url": <string>,
    "created_at": <UTC Date>,
    "updated_at": <UTC Date>
  }
}
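
The response shape above suggests checking the success flag before reading the voice from item. A minimal sketch of that check follows. The helper extractCreatedVoice is hypothetical, and the sample values in the usage lines are placeholders, not real API output.

```javascript
// Hypothetical helper: validates a parsed response body and returns
// the fields most callers need from the created voice.
function extractCreatedVoice(responseBody) {
  if (!responseBody || responseBody.success !== true || !responseBody.item) {
    throw new Error('Create voice request did not succeed');
  }
  const { uuid, name, status } = responseBody.item;
  return { uuid, name, status };
}

// Usage with placeholder values:
const voice = extractCreatedVoice({
  success: true,
  item: { uuid: 'example-uuid', name: 'Chef', status: 'example-status' },
});
console.log(voice.uuid); // example-uuid
```

Keep the uuid: the callback payload identifies the voice by this value in its id field.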

Callback

If you've provided a callback_uri when you created a voice, you will receive the following POST request when the voice has completed training.

Training Completion Callback

This callback happens when your training completes without any issues.

{
    "ok": true,
    "id": "<string>",
    "status": "finished",
    "recordings": [],
    "issue": null
}

Dataset Issue Callback

If the status is set to dataset_issue, this callback will contain detailed information about the issue and problematic recordings:

{
  "ok": "<boolean>",
  "id": "<string>",
  "status": "dataset_issue",
  "issue": "Detailed description of the dataset issue.",
  "recordings": [
    {
      "uuid": "<string>",
      "name": "<string>",
      "transcript": "<string>",
      "stoi_score": "<number>",
      "pesq_score": "<number>",
      "si_dr_score": "<number>",
      "resemble_sample_score": "<number>",
      "is_active": "<boolean>",
      "is_outlier": "<boolean>",
      "is_silent": "<boolean>"
    },
    ...
  ]
}

Consent Validation Failure Callback

If the status is consent_validation_failed, this callback provides information about consent issues:

{
    "ok": "<boolean>",
    "id": "<string>",
    "status": "consent_validation_failed",
    "issue": "Consent statement issue description.",
    "recordings": []
}

JSON Body Parameters	Type	Description
id	string	The UUID of the voice this callback is for.
status	string	The status of the voice, such as `finished`, `dataset_issue`, or `consent_validation_failed`.
issue	string	A detailed description of the issue, if any.
recordings	array	The `recordings` array provides detailed feedback for each problematic recording, including scores for STOI, PESQ, and SI-SDR, as well as flags indicating whether the recording is active, an outlier, or silent.
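
A webhook receiver can branch on the status field described above. The following sketch shows one way to do that once the POST body has been parsed as JSON. The helper summarizeCallback and the shape of its return value are hypothetical. Only the field names and the three status values come from this page.

```javascript
// Hypothetical helper: reduces a callback payload to a short summary.
function summarizeCallback(payload) {
  const recordings = Array.isArray(payload.recordings) ? payload.recordings : [];

  switch (payload.status) {
    case 'finished':
      return { voiceId: payload.id, trained: true, issue: null, flagged: [] };
    case 'dataset_issue':
      return {
        voiceId: payload.id,
        trained: false,
        issue: payload.issue,
        // Collect the UUIDs of recordings marked as outliers or silent.
        flagged: recordings
          .filter((r) => r.is_outlier || r.is_silent)
          .map((r) => r.uuid),
      };
    case 'consent_validation_failed':
      return { voiceId: payload.id, trained: false, issue: payload.issue, flagged: [] };
    default:
      // Statuses not listed on this page are passed through untouched.
      return { voiceId: payload.id, trained: false, issue: payload.issue || null, flagged: [] };
  }
}
```

For a dataset_issue payload, the flagged list gives the recordings to review or replace before retrying the build.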

Examples

NodeJS

import { Resemble } from '@resemble/node'

Resemble.setApiKey('YOUR_API_TOKEN')

await Resemble.v2.voices.create({ name: "Chef", dataset_url: "https://../dataset.zip", callback_uri: "http://example.com/cb" })
