- Acoustic Variability. For example, there is no firm boundary between the vowels in "sit" and "seat". Differences in the size and shape of the vocal tract and in dialect habits lead to significant variation. Whereas people deal well with fuzzy, relative boundaries, our systems do not.
- Style Variability. People take shortcuts in casual speech. "Did you eat yet" might sound like "Jee chet". Recovering the intent from what is left in the signal can be difficult for people but disastrous for our systems.
- Noise. Though people have difficulty hearing in very noisy conditions, our systems are much more susceptible to disruption from small levels of noise, particularly noise from competing speech.
Although people cope better with these challenges than current technology does, there are many instances in which our technology outperforms people, for example at correctly transcribing twelve consecutive digits or recognizing one of sixty thousand company names. The real challenge is finding ways for humans and machines to collaborate that make the most of our respective skills, a "social" problem discussed in this session!