3

I have a text and a narration of exactly that text. What is the best way to synchronize them? By synchronizing I mean finding the location of each word in the audio. For example, if the sentence is "I took a cab", I want four markers on the accompanying audio that indicate the start of each word.

Obviously any speech-to-text algorithm could be applied here, but given that we already know the text, I wonder whether there is an algorithm for this simplified problem that either runs much faster or gives much better results, or a general algorithm that can be tuned for this specific problem.

Nikolay Shmyrev
Ameer Jewdaki

1 Answer

1

Papers on algorithms:

Implementations:
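To illustrate the idea (this is not one of the implementations referred to above), the problem in the question is commonly called forced alignment. Because the word sequence is known, the search only has to decide where each word boundary falls, which is a monotonic Viterbi pass over the audio frames. The sketch below assumes that some acoustic model has already produced a per-frame log-likelihood for each word; the function name `force_align`, the score values, and the 8-frame example are all made up for illustration.

```python
import math


def force_align(log_probs, num_words):
    """Return the start frame of each word.

    log_probs[t][w] is an assumed log-likelihood that frame t belongs to
    word w. Words must appear in order and each takes at least one frame.
    """
    num_frames = len(log_probs)
    if num_words < 1 or num_frames < num_words:
        raise ValueError("need at least one frame per word")

    neg_inf = -math.inf
    # best[t][w]: best score of any in-order path that puts frame t in word w
    best = [[neg_inf] * num_words for _ in range(num_frames)]
    # advanced[t][w]: True if that best path entered word w exactly at frame t
    advanced = [[False] * num_words for _ in range(num_frames)]
    best[0][0] = log_probs[0][0]

    for t in range(1, num_frames):
        for w in range(num_words):
            stay = best[t - 1][w]
            move = best[t - 1][w - 1] if w > 0 else neg_inf
            if move > stay:
                best[t][w] = move + log_probs[t][w]
                advanced[t][w] = True
            else:
                best[t][w] = stay + log_probs[t][w]

    # Backtrace from the last frame of the last word to recover boundaries.
    starts = [0] * num_words
    w = num_words - 1
    for t in range(num_frames - 1, 0, -1):
        if advanced[t][w]:
            starts[w] = t
            w -= 1
    return starts


# Toy scores for "I took a cab": 8 frames, with the matching word given a
# higher likelihood in each frame (I: 0, took: 1-3, a: 4, cab: 5-7).
true_word = [0, 1, 1, 1, 2, 3, 3, 3]
log_probs = [
    [math.log(0.7) if w == tw else math.log(0.1) for w in range(4)]
    for tw in true_word
]
print(force_align(log_probs, 4))  # [0, 1, 4, 5]
```

Multiplying each start frame by the frame shift of the feature extractor (often around 10 ms, but that depends on the front end) gives the marker times. Real aligners work on phoneme or sub-phoneme states rather than whole words and usually allow optional silence between words, so this shows only the shape of the search.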

Nikolay Shmyrev