Highest Voted 'speech-recognition' Questions - Computer Science Stack Exchange

9

votes

2 answers

Why are HMMs appropriate for speech recognition when the problem doesn't seem to satisfy the Markov property?

I'm learning about HMMs and their applications and trying to understand their uses. My knowledge is a bit spotty, so please correct any incorrect assumptions I'm making. The specific example I'm wondering about is using HMMs for speech…

hidden-markov-models markov-chains speech-recognition

asked Jan 28 '15 at 19:02

sooniln

275
1
4

8

votes

1 answer

Why do mainstream speech models no longer require a personalized training step?

Back in the Windows XP era, when setting up Windows OS-built-in speech/dictation, I had to speak out a bunch of programmed-in text samples to the speech-to-text engine to personalize my voice profile. Today, with networked speech-to-text engines…

algorithms machine-learning speech-recognition

asked Jan 05 '19 at 22:17

tsutsu

113
3

3

votes

1 answer

Synchronizing speech and text

I have a text and a narration of the exact same text. What is the best way to synchronize them? By synchronizing I mean finding, for example, the location of each word in the audio. For example, if the sentence is "I took a cab", I want…

speech-recognition

asked Jan 10 '16 at 15:47

Ameer Jewdaki

539
2
14

2

votes

0 answers

State of the art in multi-modal command recognition

I'm currently researching various fusion methods in a multi-modal (video, audio, identity, user position and gesture) human-computer interaction environment (think in terms of a smart-home system). What is the current state of the art in this field…

reference-request speech-recognition

asked Aug 31 '15 at 21:16

Seanny123

651
8
23

2

votes

1 answer

What type of HMM-GMM do I need?

Context: I have 100 speech sentences that I asked my friend to speak. The vocabulary in the sentences is the same; only the order of the words changes. My friend says that he spoke exactly what was asked for each sentence. But I don't know whether…

hidden-markov-models speech-recognition

asked May 25 '14 at 07:30

Pupil

21
1

1

vote

0 answers

How much training data for speech recognition?

How much training data is needed to build a speech-to-text engine based on machine learning? (To within an order of magnitude or so.) Big companies like Google and Facebook have massive amounts of data. For ordinary people it's not possible to acquire…

machine-learning natural-language-processing speech-recognition

asked Jul 03 '17 at 05:44

Tahlil

19
4

1

vote

1 answer

How to use frame-based speech features for learning using a neural network classifier?

I am doing supervised learning on speech audio files using neural networks. For this purpose, I'll have to extract features from the audio file. But since an audio file is a time-varying signal, it is generally divided into multiple frames and then…

machine-learning speech-recognition features

asked Apr 29 '15 at 22:48

ksb

801
1
7
5

0

votes

0 answers

Examples for speech recognition systems and spoken dialogue systems

I am collecting material for a MOOC about speech technology. My aim is that students also have examples to try rather than just watching the lecture and some complementary YouTube videos. So the idea was that they could call up some spoken dialogue…

machine-learning natural-language-processing speech-recognition

asked Mar 18 '16 at 14:40

Martin

101

0

votes

1 answer

Is it computationally possible to voice-recognize and word-tag-time-align audiobooks to their actual text?

I would like to know whether it is computationally possible for a computer to go through the words of an audiobook as input and output a file containing both the original audio and the text corresponding to each word (which could be reviewed by a…

speech-recognition

asked Feb 06 '16 at 17:33

Jack Maddington

101
2

0

votes

1 answer

Build Automatic Speech Recognition (ASR) from scratch

I want to build an Automatic Speech Recognition (ASR) engine for myself, but I've no idea where to start. I've read that most ASRs are built upon Hidden Markov Models, but I've also read that HMMs are limited somehow and that a better approach is to…

machine-learning hidden-markov-models speech-recognition

asked Feb 01 '15 at 19:07

0xdeadcode

128
1
6

0

votes

1 answer

Speaker-independent voice command recognition

I am looking for software, a library or an algorithm that can be trained to recognize about a dozen speaker-independent voice commands. The commands will be very distinct phrases of 4-5 words each. They can be chosen to sound very different from…

speech-recognition

asked Oct 01 '14 at 13:47

Sigman

9
1

0

votes

1 answer

How to understand an equation related to speaker recognition?

This question refers to the following paper: Support Vector Machines for Speaker and Language Recognition, W. M. Campbell, J. P. Campbell, D. A. Reynolds, E. Singer, P. A. Torres-Carrasquillo, Computer Speech and Language 20 (2006) 210-229. I am…

algorithms machine-learning svm speech-recognition

asked Apr 16 '19 at 03:05

Creator

3
4
