Version: 1.0.0

Dataset formats for voice upload

Resemble accepts data in two formats:

  1. A single audio/wav file.
  2. A gzipped tar archive (.tar.gz) that includes a metadata.csv file and a wavs directory. Each audio file in the wavs directory should be between 1 and 12 seconds long. Each wav file should have an entry in metadata.csv consisting of the filename without the extension, followed by the transcript for that file. The | character serves as the delimiter in metadata.csv.

For example:

data/
  metadata.csv
  wavs/
    file1.wav
    file2.wav
    file3.wav

The metadata.csv file is formatted as follows, with one entry per wav file:

file1|This is the text that is included in file one.
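
The layout above can be assembled with a short script. The following is a minimal sketch using only the Python standard library, not an official Resemble tool. The function names (wav_duration, build_dataset), the top-level data/ folder inside the archive, and the rejection of transcripts that contain the | delimiter are assumptions made for illustration; the 1 to 12 second limits come from the description above.

```python
import io
import tarfile
import wave
from pathlib import Path

MIN_SECONDS = 1.0   # lower duration bound stated in the docs
MAX_SECONDS = 12.0  # upper duration bound stated in the docs


def wav_duration(path):
    """Return the length of a PCM wav file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_dataset(transcripts, wav_dir, out_path):
    """Write a .tar.gz holding data/metadata.csv and data/wavs/*.wav.

    transcripts maps a filename without its extension to the transcript.
    The "data/" prefix mirrors the example layout and is an assumption.
    """
    wav_dir = Path(wav_dir)
    rows = []
    # Validate everything first so a bad entry never yields a partial archive.
    for stem, text in transcripts.items():
        if "|" in text or "\n" in text:
            # Assumption: the delimiter cannot be escaped inside a transcript.
            raise ValueError(f"{stem}: transcript contains '|' or a newline")
        duration = wav_duration(wav_dir / f"{stem}.wav")
        if not MIN_SECONDS <= duration <= MAX_SECONDS:
            raise ValueError(f"{stem}: {duration:.2f}s is outside 1 to 12 seconds")
        rows.append(f"{stem}|{text}")

    metadata = ("\n".join(rows) + "\n").encode("utf-8")
    with tarfile.open(str(out_path), "w:gz") as tar:
        # Add metadata.csv from memory rather than from a temporary file.
        info = tarfile.TarInfo("data/metadata.csv")
        info.size = len(metadata)
        tar.addfile(info, io.BytesIO(metadata))
        for stem in transcripts:
            tar.add(str(wav_dir / f"{stem}.wav"), arcname=f"data/wavs/{stem}.wav")
    return rows
```

For example, build_dataset({"file1": "This is the text that is included in file one."}, "wavs", "dataset.tar.gz") reads wavs/file1.wav, checks its duration, and writes dataset.tar.gz. Validating before writing means a clip that is too short or too long raises ValueError instead of producing an archive that might be rejected on upload.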