This post by Robert Fortner (via Hacker News) provides a brief but detailed history of speech recognition from its inception to the present. It is hard to argue with his conclusion that the technology peaked a few years ago and is unlikely to improve again. The post clearly explains why even Google’s recent efforts won’t help and why the anecdotal evidence about the ubiquitous speech recognition on newer versions of Android is so mixed.
Fortner does suggest that there is a way forward, an inversion of the original priorities in AI research.
Originally, however, speech recognition was going to lead to artificial intelligence. Computing pioneer Alan Turing suggested in 1950 that we “provide the machine with the best sense organs that money can buy, and then teach it to understand and speak English.” Over half a century later, artificial intelligence has become prerequisite to understanding speech. We have neither the chicken nor the egg.
I am not so pessimistic, though I am still cautious in my optimism about renewed AI efforts. Regardless, speech as a practical component of general-purpose user interfaces is clearly quite a bit further out than everyone is predicting. If the current upper bound on speech recognition is eventually cracked, the implication is that we will also have achieved truly useful, autonomous software agents.