Questions tagged [voice]
4 questions
5 votes, 2 answers
How can you efficiently cluster speech segments by speaker?
We have ~30 audio snippets, about half of which come from a single speaker (our target speaker); the rest come from various other speakers. We want to extract all audio snippets from our target speaker, so basically figure out which…
Yes
1 vote, 0 answers
How can I get tortoise-tts to pronounce acronyms correctly?
I'm trying to get tortoise-tts to pronounce acronyms correctly. Example of text that I'd like tortoise-tts to generate an audio file for: "OpenAI ChatGPT is a new language model."
The audio file generated by tortoise-tts says: "OpenAI Chat is a new…
Franck Dernoncourt
0 votes, 1 answer
How to create an AI voice generator for a fantasy language?
I have a "fantasy language" (a conlang), which has a very simple pronunciation system. Every letter represents one sound, as opposed to English, where you can have the same sound with different spellings ("here" and "hear", for example). In the…
Lance Pollard
0 votes, 1 answer
What is the difference between VAD and Speaker Segmentation?
I'm not sure I understand the difference between:
VAD (Voice Activity Detection) and
Speaker Segmentation
I understand that:
VAD - splits audio into segments of speech and non-speech
Speaker Segmentation - splits audio into segments…
user3668129