How to Create a Voice

Creating a custom voice with Resemble is a two-step process: generate voice design samples, then build a voice from the sample you prefer. This guide walks through both steps with example API requests and responses, then shows how to check the build status and create an audio clip with the new voice.

Overview of the Voice Creation Process

  1. Generate three different voice samples using the voice design endpoint and choose your favorite
  2. Build the finalized voice using the voice design UUID and your chosen sample index

Step 1: Generate Voice Design Samples

First, make a POST request to the voice design endpoint, https://app.resemble.ai/api/v2/voice-design, with the user_prompt parameter. The request generates three distinct voice samples. For more information on how to prompt for a voice, see How to Prompt.

For this example, we'll use the following prompt: "A middle-aged female voice with an Australian accent. She speaks with confidence and warmth, at a moderate pace. Her tone is friendly and approachable, like a knowledgeable tour guide."

Be sure to replace YOUR_API_TOKEN with your actual API token.

API Request:

curl 'https://app.resemble.ai/api/v2/voice-design' \
--header 'Authorization: Bearer YOUR_API_TOKEN' \
--form 'user_prompt="A middle-aged female voice with an Australian accent. She speaks with confidence and warmth, at a moderate pace. Her tone is friendly and approachable, like a knowledgeable tour guide."'
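For readers scripting this step, the same request can be sketched in Python with only the standard library. The function name build_voice_design_request is hypothetical, and the hand-built multipart body is an assumption meant to mirror what curl's --form flag sends. The sketch only constructs the request; passing the result to urllib.request.urlopen would actually send it.

```python
import urllib.request
import uuid

VOICE_DESIGN_URL = "https://app.resemble.ai/api/v2/voice-design"


def build_voice_design_request(api_token, user_prompt):
    """Build (but do not send) the Step 1 POST request.

    Hypothetical helper: the multipart body is assembled by hand so that it
    resembles what `curl --form 'user_prompt=...'` would send.
    """
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="user_prompt"\r\n'
        "\r\n"
        f"{user_prompt}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return urllib.request.Request(
        VOICE_DESIGN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Keeping construction separate from sending makes it easy to inspect the headers and body before any call is made.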

This will return a JSON response with the voice design UUID and the three generated samples.

API Response:

{
"voice_candidates": [
{
"audio_url": "<audio_url>",
"voice_sample_index": 0,
"uuid": "abcd1234"
},
{
"audio_url": "<audio_url>",
"voice_sample_index": 1,
"uuid": "abcd1234"
},
{
"audio_url": "<audio_url>",
"voice_sample_index": 2,
"uuid": "abcd1234"
}
]
}
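A response of this shape can be reduced to the two things Step 2 needs: the shared voice design UUID and the audio URL for each sample index. The following is a minimal sketch that assumes the response has already been parsed into a Python dict; the helper name summarize_candidates is hypothetical.

```python
def summarize_candidates(response):
    """Return (voice_design_uuid, {sample_index: audio_url}).

    Hypothetical helper that assumes the response shape shown above, where
    every candidate carries the same voice design UUID.
    """
    candidates = response["voice_candidates"]
    uuids = {candidate["uuid"] for candidate in candidates}
    if len(uuids) != 1:
        raise ValueError(f"Expected one shared voice design UUID, got {sorted(uuids)}")
    samples = {
        candidate["voice_sample_index"]: candidate["audio_url"]
        for candidate in candidates
    }
    return uuids.pop(), samples
```

Listening to each URL in the returned mapping is then enough to pick an index for the next step.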

Step 2: Choose and Build Your Voice

After receiving the three samples, listen to each one and decide which you prefer. Note the index (0, 1, or 2) of your chosen sample. For this example, we'll choose sample index 1.

Next, make a POST request to https://app.resemble.ai/api/v2/voice-design/{voice_design_uuid}/{voice_sample_index}/create_rapid_voice, putting the voice design UUID and the chosen sample index in the URL path and sending the voice_name parameter in the request body.

Note

The uuid identifies the voice design request and is the same for all three samples generated in Step 1. Do not confuse it with voice_uuid, which is the UUID of the voice you are building.

The voice_sample_index is the index of the sample you want to use to build your voice.

The voice design uuid, voice_sample_index, and voice_name are all required for the request.

API Request:

curl 'https://app.resemble.ai/api/v2/voice-design/abcd1234/1/create_rapid_voice' \
--header 'Authorization: Bearer YOUR_API_TOKEN' \
--form 'voice_name="Australian Female"'

This will return a JSON response with the voice UUID. Your voice will now begin building in the background.

API Response:

{
"voice_uuid": "1234567890"
}
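Because the voice design UUID and the sample index both travel in the URL path, it helps to assemble the URL in one place. The sketch below does that with a hypothetical helper, create_rapid_voice_url; the index check reflects the three samples (0, 1, 2) described in this guide and is an assumption about what the endpoint accepts.

```python
from urllib.parse import quote

VOICE_DESIGN_URL = "https://app.resemble.ai/api/v2/voice-design"


def create_rapid_voice_url(voice_design_uuid, voice_sample_index):
    """Build the Step 2 endpoint URL (hypothetical helper)."""
    # This guide describes exactly three samples, indexed 0, 1, and 2.
    if voice_sample_index not in (0, 1, 2):
        raise ValueError("voice_sample_index must be 0, 1, or 2")
    # Percent-encode the UUID so it is always safe as a single path segment.
    design = quote(str(voice_design_uuid), safe="")
    return f"{VOICE_DESIGN_URL}/{design}/{voice_sample_index}/create_rapid_voice"
```

With the example values from this guide, the helper reproduces the URL used in the curl request above.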

Step 3: Check the Status of Your Voice Build

You can check the status of your voice build in your dashboard, or programmatically with the voice UUID returned in the response from Step 2.

To check programmatically, make a GET request to https://app.resemble.ai/api/v2/voices/1234567890, where 1234567890 is the voice UUID.

API Request:

curl 'https://app.resemble.ai/api/v2/voices/1234567890' \
--header 'Authorization: Bearer YOUR_API_TOKEN'

API Response:

{
"success": true,
"item": {
"uuid": <string>,
"name": <string>,
"status": "running",
"default_language": <string>,
"voice_type": <string>,
"supported_languages": <string[]>,
"dataset_url": <string*>,
"callback_uri": <string*>,
"source": <string>,
"component_status": {
"text_to_speech": { "status": <string> },
"speech_to_speech": { "status": <string> },
"fill": { "status": <string> }
},
"api_support": {
"sync": <boolean>,
"async": <boolean>,
"direct_synthesis": <boolean>,
"streaming": <boolean>
},
"created_at": <UTC Date>,
"updated_at": <UTC Date>
}
}

Once the voice build is complete, the status field changes from "running" to "finished".
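Checking the status repeatedly until the build completes can be sketched as a small polling loop. In this sketch, wait_for_voice and its parameters are hypothetical; fetch_voice stands for any caller-supplied function that performs the GET request above and returns the parsed JSON. Only the "running" and "finished" statuses appear in this guide, so the loop treats every other value as not yet finished, which is an assumption.

```python
import time


def wait_for_voice(fetch_voice, voice_uuid, interval_seconds=10.0,
                   max_attempts=60, sleep=time.sleep):
    """Poll until the voice's status is "finished" and return its item.

    fetch_voice is a caller-supplied callable (hypothetical) that takes the
    voice UUID and returns the parsed JSON response of the status request.
    """
    for attempt in range(max_attempts):
        item = fetch_voice(voice_uuid)["item"]
        if item["status"] == "finished":
            return item
        # Pause between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(
        f"Voice {voice_uuid} was not finished after {max_attempts} checks"
    )
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays.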

Step 4: Create an Audio Clip

Once the voice build is complete, you can create an audio clip with your new custom voice. For more information on how to create an audio clip, see Generating Audio With Text to Speech.

Be sure to replace YOUR_API_TOKEN with your actual API token.

API Request:

curl --request POST "https://f.cluster.resemble.ai/synthesize" \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept-Encoding: gzip" \
--data '{
"voice_uuid": "1234567890",
"data": "Hello from Resemble!",
"sample_rate": 48000,
"output_format": "wav"
}'
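The same synthesis request can be sketched in Python with the standard library. The helper name build_synthesize_request is hypothetical, and the sketch deliberately omits the Accept-Encoding: gzip header because urllib does not decompress responses automatically. As before, the request is only constructed, not sent.

```python
import json
import urllib.request

SYNTHESIZE_URL = "https://f.cluster.resemble.ai/synthesize"


def build_synthesize_request(api_token, voice_uuid, text):
    """Build (but do not send) the synthesis POST request (hypothetical helper)."""
    payload = {
        "voice_uuid": voice_uuid,
        "data": text,
        "sample_rate": 48000,
        "output_format": "wav",
    }
    return urllib.request.Request(
        SYNTHESIZE_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```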

API Response:

{
"audio_content": <base64 encoded string of the raw audio bytes>,
"audio_timestamps": {
"graph_chars": string[],
"graph_times": float[][],
"phon_chars": string[],
"phon-times": float[][]
},
"duration": float,
"issues": string[],
"output_format": string,
"sample_rate": float,
"success": boolean,
"synth_duration": float,
"title": string|null
}
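Since audio_content arrives as a base64 string, it must be decoded before it can be played or stored. The sketch below decodes it and writes the bytes to disk; save_audio is a hypothetical helper, and it assumes the decoded bytes form a complete file in the requested output_format.

```python
import base64


def save_audio(response, path):
    """Decode audio_content and write it to path; return the byte count.

    Hypothetical helper that assumes the response has been parsed into a
    dict with the fields shown above.
    """
    if not response.get("success"):
        raise RuntimeError(f"Synthesis failed: {response.get('issues')}")
    audio_bytes = base64.b64decode(response["audio_content"])
    with open(path, "wb") as audio_file:
        audio_file.write(audio_bytes)
    return len(audio_bytes)
```

For example, save_audio(parsed_response, "hello.wav") would write the clip requested above to hello.wav.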

All Done!

You have now created a custom voice and generated your first audio clip with it. For more information on how to use your new voice, check out the following guides: