Questions tagged [voice]

4 questions
5 votes, 2 answers

How can you efficiently cluster speech segments by speaker?

We have ~30 audio snippets. Around 50% are from the same speaker, who is our target speaker, and the rest are from various other speakers. We want to extract all audio snippets from our target speaker, so basically figure out which…
Yes
1 vote, 0 answers

How can I get tortoise-tts to pronounce acronyms correctly?

I'm trying to get tortoise-tts to pronounce acronyms correctly. Example of text that I'd like tortoise-tts to generate an audio file for: OpenAI ChatGPT is a new language model. The audio file generated by tortoise-tts is: OpenAI Chat is a new…
Franck Dernoncourt
0 votes, 1 answer

How to create an AI voice generator for a fantasy language?

I have a "fantasy language" (a conlang), which has a very simple pronunciation system. Every letter represents one sound, as opposed to English, where you can have the same sound with different spellings ("here" and "hear", for example). In the…
Lance Pollard
0 votes, 1 answer

What is the difference between VAD and Speaker Segmentation?

I'm not sure I understand the difference between VAD (Voice Activity Detection) and Speaker Segmentation. I understand that: VAD splits audio into segments of speech or non-speech; Speaker Segmentation splits audio into segments…
user3668129