How much training data is needed to build a speech-to-text engine based on machine learning? (To within an order of magnitude or so.)
Big companies like Google and Facebook have massive amounts of data. For ordinary people it isn't possible to acquire that much data, and without it, training a model for this purpose seems impossible right now. Is it possible for a few individuals to build a speech-to-text model comparable to Google's?