How to Create a Voice
Creating a custom voice with Resemble involves a two-step process: generating voice design samples and then building your preferred voice. This guide walks you through both steps with example API requests and responses, then shows how to check the build status and generate an audio clip with the new voice.
Overview of the Voice Creation Process
- Generate three different voice samples using the voice design endpoint and choose your favorite
- Build the finalized voice using the voice design UUID and your chosen sample index
Step 1: Generate Voice Design Samples
First, you'll make a POST request to the voice design endpoint https://app.resemble.ai/api/v2/voice-design with the user_prompt parameter, which will generate three distinct voice samples. For more information on how to prompt for a voice, see How to Prompt.
For this example, we'll use the following prompt: "A middle-aged female voice with an Australian accent. She speaks with confidence and warmth, at a moderate pace. Her tone is friendly and approachable, like a knowledgeable tour guide."
Be sure to replace YOUR_API_TOKEN with your actual API token.
API Request:
curl 'https://app.resemble.ai/api/v2/voice-design' \
--header 'Authorization: Bearer YOUR_API_TOKEN' \
--form 'user_prompt="A middle-aged female voice with an Australian accent. She speaks with confidence and warmth, at a moderate pace. Her tone is friendly and approachable, like a knowledgeable tour guide."'
This will return a JSON response with the voice design UUID and the three generated samples.
API Response:
{
"voice_candidates": [
{
"audio_url": "<audio_url>",
"voice_sample_index": 0,
"uuid": "abcd1234"
},
{
"audio_url": "<audio_url>",
"voice_sample_index": 1,
"uuid": "abcd1234"
},
{
"audio_url": "<audio_url>",
"voice_sample_index": 2,
"uuid": "abcd1234"
}
]
}
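To make the response easier to work with, the following is a minimal Python sketch that parses it and collects the shared voice design UUID along with each sample's audio URL. The function name summarize_candidates is hypothetical, and the sketch assumes the response has exactly the shape shown above.

```python
import json


def summarize_candidates(response_text):
    """Return (design_uuid, {sample_index: audio_url}) from a voice design response."""
    payload = json.loads(response_text)
    candidates = payload["voice_candidates"]
    # Every candidate should carry the same voice design UUID; check before relying on it.
    uuids = {candidate["uuid"] for candidate in candidates}
    if len(uuids) != 1:
        raise ValueError(f"expected one shared design UUID, got {sorted(uuids)}")
    urls = {c["voice_sample_index"]: c["audio_url"] for c in candidates}
    return uuids.pop(), urls


# Example response copied from this guide (audio URLs are placeholders).
sample_response = """
{
  "voice_candidates": [
    {"audio_url": "<audio_url>", "voice_sample_index": 0, "uuid": "abcd1234"},
    {"audio_url": "<audio_url>", "voice_sample_index": 1, "uuid": "abcd1234"},
    {"audio_url": "<audio_url>", "voice_sample_index": 2, "uuid": "abcd1234"}
  ]
}
"""

design_uuid, audio_urls = summarize_candidates(sample_response)
print(design_uuid, sorted(audio_urls))
```

Each URL in audio_urls can then be opened or downloaded so you can listen to the sample before choosing an index.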
Step 2: Choose and Build Your Voice
After receiving the three samples, listen to each one and decide which you prefer. Note the index (0, 1, or 2) of your chosen sample. For this example, we'll choose sample index 1.
We will make a POST request to the voice design endpoint https://app.resemble.ai/api/v2/voice-design/{voice_design_uuid}/{voice_sample_index}/create_rapid_voice with the voice design UUID, the chosen sample index, and the voice_name parameter.
The uuid is for the voice design request and is the same for all three generated samples from Step 1. It is not to be confused with voice_uuid, which is the UUID of the voice you're building.
The voice_sample_index is the index of the sample you want to use to build your voice.
The voice design uuid, voice_sample_index, and voice_name are all required for the request.
API Request:
curl 'https://app.resemble.ai/api/v2/voice-design/abcd1234/1/create_rapid_voice' \
--header 'Authorization: Bearer YOUR_API_TOKEN' \
--form 'voice_name="Australian Female"'
This will return a JSON response with the voice UUID. Your voice will now begin building in the background.
API Response:
{
"voice_uuid": "1234567890"
}
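Because the build endpoint embeds both the voice design UUID and the sample index in its path, a small helper can reduce copy-and-paste mistakes. The sketch below is illustrative only: rapid_voice_url is a hypothetical name, and the check that the index is 0, 1, or 2 is an assumption based on the three samples returned in Step 1.

```python
from urllib.parse import quote

# Base endpoint taken from this guide.
VOICE_DESIGN_URL = "https://app.resemble.ai/api/v2/voice-design"


def rapid_voice_url(design_uuid, sample_index):
    """Build the create_rapid_voice URL for a design UUID and a chosen sample index."""
    # Assumption: only the three indices returned in Step 1 are valid.
    if sample_index not in (0, 1, 2):
        raise ValueError("sample_index must be 0, 1, or 2")
    # Percent-encode the UUID so unexpected characters cannot alter the path.
    return f"{VOICE_DESIGN_URL}/{quote(design_uuid, safe='')}/{sample_index}/create_rapid_voice"


url = rapid_voice_url("abcd1234", 1)
print(url)
```

The resulting URL matches the one used in the curl request above and can be passed to any HTTP client together with the Authorization header and the voice_name form field.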
Step 3: Check the Status of Your Voice Build
You can check the status of your voice build in your dashboard, or programmatically with the voice UUID returned in the response from Step 2.
We will use the voice UUID to check the status of the voice build by making a GET request to the endpoint https://app.resemble.ai/api/v2/voices/1234567890
API Request:
curl 'https://app.resemble.ai/api/v2/voices/1234567890' \
--header 'Authorization: Bearer YOUR_API_TOKEN'
API Response:
{
"success": true,
"item": {
"uuid": <string>,
"name": <string>,
"status": "running",
"default_language": <string>,
"voice_type": <string>,
"supported_languages": <string[]>,
"dataset_url": <string*>,
"callback_uri": <string*>,
"source": <string>,
"component_status": {
"text_to_speech": { "status": <string> },
"speech_to_speech": { "status": <string> },
"fill": { "status": <string> }
},
"api_support": {
"sync": <boolean>,
"async": <boolean>,
"direct_synthesis": <boolean>,
"streaming": <boolean>
},
"created_at": <UTC Date>,
"updated_at": <UTC Date>
}
}
Once the voice build is complete, the status field will be finished.
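If you prefer to wait for the build programmatically, one option is to poll the endpoint until the status field reads finished. The following Python sketch uses only the standard library. The names fetch_status and wait_until_finished, the polling interval, and the timeout are all assumptions for illustration. This guide only shows the running and finished statuses, so the sketch treats any other value as "keep waiting" and relies on the timeout to stop.

```python
import json
import time
import urllib.request

VOICES_URL = "https://app.resemble.ai/api/v2/voices"  # endpoint from Step 3


def fetch_status(voice_uuid, api_token):
    """GET the voice and return item.status (performs a real network call)."""
    request = urllib.request.Request(
        f"{VOICES_URL}/{voice_uuid}",
        headers={"Authorization": f"Bearer {api_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["item"]["status"]


def wait_until_finished(get_status, interval_s=10.0, timeout_s=900.0, sleep=time.sleep):
    """Poll get_status() until it returns "finished" or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "finished":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"voice still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


# Offline demonstration with canned statuses instead of a network call.
canned = iter(["running", "running", "finished"])
result = wait_until_finished(lambda: next(canned), interval_s=0, sleep=lambda _: None)
print(result)
```

Against the live API, the call would look like wait_until_finished(lambda: fetch_status("1234567890", "YOUR_API_TOKEN")), using the voice UUID returned in Step 2.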
Step 4: Create an Audio Clip
Once the voice build is complete, you can create an audio clip with your new custom voice. For more information on how to create an audio clip, see Generating Audio With Text to Speech.
Be sure to replace YOUR_API_TOKEN with your actual API token.
API Request:
curl --request POST "https://f.cluster.resemble.ai/synthesize" \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept-Encoding: gzip" \
--data '{
"voice_uuid": "1234567890",
"data": "Hello from Resemble!",
"sample_rate": 48000,
"output_format": "wav"
}'
API Response:
{
"audio_content": <base64 encoded string of the raw audio bytes>,
"audio_timestamps": {
"graph_chars": string[],
"graph_times": float[][],
"phon_chars": string[],
"phon_times": float[][]
},
"duration": float,
"issues": string[],
"output_format": string,
"sample_rate": float,
"success": boolean,
"synth_duration": float,
"title": string|null
}
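Since audio_content arrives as a base64 string, it has to be decoded before it can be played. The sketch below shows one way to do that in Python. The function name save_audio is hypothetical, the demo payload is synthetic rather than real API output, and the sketch assumes the response body has already been read as JSON text.

```python
import base64
import json
import tempfile
from pathlib import Path


def save_audio(response_text, out_path):
    """Decode audio_content from a synthesize response and write it to out_path."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        # The issues field is listed in the response above; surface it on failure.
        raise RuntimeError(f"synthesis failed: {payload.get('issues')}")
    audio_bytes = base64.b64decode(payload["audio_content"])
    Path(out_path).write_bytes(audio_bytes)
    return len(audio_bytes)


# Synthetic demo payload (not real audio), used only to exercise the function.
demo_response = json.dumps({
    "success": True,
    "audio_content": base64.b64encode(b"RIFFdemo").decode("ascii"),
    "issues": [],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    bytes_written = save_audio(demo_response, Path(tmp_dir) / "clip.wav")

print(bytes_written)
```

With a real response, choose a file extension that matches the output_format you requested, such as .wav for the request above.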
All Done!
You have now created a custom voice and generated your first audio clip with it. For more information on how to use your new voice, check out the following guides: