Dataset formats for voice upload
Resemble accepts data in two formats:
- A single audio/wav file.
- A gzipped tar archive (.tar.gz) that includes a metadata.csv file and a wavs directory.

The wavs directory should contain audio files between 1.5 and 15 seconds in length. Each wav file should have an entry in the metadata.csv file that includes the filename without the extension and the transcript for that file. The | character serves as the delimiter in the metadata file, which is formatted as follows:
file1|This is the text that is included in file one.
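
The packaging steps described above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not Resemble's own tooling. It makes several assumptions: the helper names (wav_duration_seconds, build_dataset_archive) are hypothetical, metadata.csv and wavs are placed at the top level of the archive, the 1.5 and 15 second bounds are treated as inclusive, and the audio is uncompressed PCM wav that Python's wave module can read.

```python
import tarfile
import wave
from pathlib import Path

# Duration bounds from the text above; treating them as inclusive is an assumption.
MIN_SECONDS = 1.5
MAX_SECONDS = 15.0


def wav_duration_seconds(path):
    """Return the duration of a PCM wav file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_dataset_archive(wav_dir, transcripts, archive_path):
    """Hypothetical helper: write metadata.csv and bundle it with wav_dir.

    transcripts maps a filename without its extension (e.g. "file1")
    to the transcript for that file. Returns the metadata lines written.
    """
    wav_dir = Path(wav_dir)
    rows = []
    for wav_path in sorted(wav_dir.glob("*.wav")):
        stem = wav_path.stem
        if stem not in transcripts:
            raise ValueError(f"no transcript for {wav_path.name}")
        text = transcripts[stem]
        # The pipe is the delimiter, so it must not appear in a transcript.
        if "|" in text or "\n" in text:
            raise ValueError(f"transcript for {stem} contains '|' or a newline")
        duration = wav_duration_seconds(wav_path)
        if not MIN_SECONDS <= duration <= MAX_SECONDS:
            raise ValueError(f"{wav_path.name} is {duration:.2f}s long")
        rows.append(f"{stem}|{text}")

    # metadata.csv is written next to the wavs directory.
    metadata_path = wav_dir.parent / "metadata.csv"
    metadata_path.write_text("\n".join(rows) + "\n", encoding="utf-8")

    # Assumed layout: metadata.csv and wavs/ at the top level of the archive.
    with tarfile.open(str(archive_path), "w:gz") as tar:
        tar.add(str(metadata_path), arcname="metadata.csv")
        tar.add(str(wav_dir), arcname="wavs")
    return rows
```

Calling build_dataset_archive("wavs", {"file1": "This is the text that is included in file one."}, "dataset.tar.gz") would write the metadata line shown above and produce dataset.tar.gz, provided wavs/file1.wav exists and falls within the duration bounds. Rejecting transcripts that contain the | character is a design choice of this sketch; the source does not say how such text should be escaped.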