VoiceCraft

This is a duplicated space, as the original space was broken. The description of the license is based on the original GitHub page.

License

The codebase is under CC BY-NC-SA 4.0 (LICENSE-CODE), and the model weights are under Coqui Public Model License 1.0.0 (LICENSE-MODEL). Note that we use some code from other repositories that are under different licenses: ./models/codebooks_patterns.py is under the MIT license; ./models/modules, ./steps/optim.py, and data/tokenizer.py are under the Apache License, Version 2.0; the phonemizer we used is under the GNU 3.0 License.

If enabled, the target transcript will be constructed for you:

  • In TTS and Long TTS mode, just write the text you want to synthesize.
  • In Edit mode, just write the text that replaces the selected editing segment.

If disabled, you should write the target transcript yourself:

  • In TTS mode, write the prompt transcript followed by the generation transcript.
  • In Long TTS mode, select split by newline (SENTENCE SPLIT WON'T WORK) and start each line with the prompt transcript.
  • In Edit mode, write the full prompt.
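
The rules above can be sketched in a few lines of Python. This is only an illustration of how a target transcript might be assembled from the original transcript and the new text, not the code this space actually runs. The function names, the word-index parameters (`prompt_end`, `edit_start`, `edit_end`), and the whitespace-based word splitting are all assumptions made for this example.

```python
from typing import List


def build_target_transcript(mode: str, words: List[str], new_text: str,
                            prompt_end: int = 0,
                            edit_start: int = 0, edit_end: int = 0) -> str:
    """Hypothetical helper: assemble a target transcript for TTS or Edit mode.

    words       -- the original transcript, split into words
    new_text    -- the text to synthesize (TTS) or to substitute (Edit)
    prompt_end  -- index of the last word kept as the prompt (TTS)
    edit_start, edit_end -- inclusive word range to replace (Edit)
    """
    if mode == "TTS":
        # Prompt transcript followed by the generation transcript.
        return " ".join(words[: prompt_end + 1] + [new_text])
    if mode == "Edit":
        # Keep everything outside the selected segment, replace the inside.
        return " ".join(words[:edit_start] + [new_text] + words[edit_end + 1:])
    raise ValueError(f"unsupported mode: {mode}")


def build_long_tts_transcript(words: List[str], sentences: List[str],
                              prompt_end: int) -> str:
    """Hypothetical helper: one line per sentence, each starting with the prompt."""
    prompt = " ".join(words[: prompt_end + 1])
    return "\n".join(f"{prompt} {sentence}" for sentence in sentences)


if __name__ == "__main__":
    original = "But when I had approached so near to them".split()
    print(build_target_transcript("TTS", original, "hello world", prompt_end=2))
    print(build_target_transcript("Edit", original, "if we", edit_start=1, edit_end=2))
    print(build_long_tts_transcript(original, ["First.", "Second."], prompt_end=1))
```

Writing the transcript by hand with the option disabled amounts to producing the same strings yourself: the prompt words stay unchanged, and only the generated or edited part differs.
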
Mode
Last word in prompt
First word to edit
Last word to edit
Sentence

Select the sentence you want to regenerate.

stop_repetition

If there are long silences in the generated audio, reduce stop_repetition to 1 or 2. Set it to -1 to disable it.

kvcache

Set to 0 to use less VRAM, at the cost of slower inference.