Voice recognition
VOICE RECOGNITION
Department of Computer Engineering
2007152025
Yoseop Shin
Voice recognition (also known as automatic speech recognition) converts ⋯⋯ using the binary code for a string of character codes).
Introduction to Voice Recognition
Voice Recognition
▪ Recognize what is being said
▪ Identify the person speaking
The market growth for voice recognition
[Chart: sales (in billions), 2004 to 2009]
▪ The market for voice-recognition technology topped $1 billion for the first time in 2006.
▪ 100 percent increase in just two years (2006-2008).
▪ The market for server-based voice-recognition technology to power call centers and the like reached nearly $600 million in 2006 and is expected to double by 2009.
Distributed Speech Recognition
▪ Commonly used in mobile devices
▪ e.g. Motorola, Google iPhone
[Diagram: conventional path (Speech Coder, Speech Decoder, ASR Front-end, ASR Decoder) compared with the DSR path (ASR Front-end, ASR Decoder). Other labels: ISDN, Pitch Analysis, Noise Reduction, Contents server]
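The split between front-end and decoder can be illustrated with a minimal sketch. It assumes a front-end that reduces each frame of audio to a single log-energy value and sends only those values to the server; real DSR front-ends send richer features, and the names `front_end`, `pack_features`, `unpack_features` and the frame size are hypothetical choices made for this illustration.

```python
import math
import struct

FRAME_SIZE = 80  # samples per frame (assumption: 10 ms at 8 kHz)

def front_end(samples):
    """Runs on the device: reduce raw samples to one log-energy value per frame."""
    features = []
    for start in range(0, len(samples) - FRAME_SIZE + 1, FRAME_SIZE):
        frame = samples[start:start + FRAME_SIZE]
        energy = sum(s * s for s in frame) / FRAME_SIZE
        features.append(math.log(energy + 1.0))
    return features

def pack_features(features):
    """Serialize the features as little-endian 32-bit floats for transmission."""
    return struct.pack(f"<{len(features)}f", *features)

def unpack_features(payload):
    """Runs on the server: recover the feature sequence for the ASR decoder."""
    count = len(payload) // 4
    return list(struct.unpack(f"<{count}f", payload))

# One second of a synthetic 440 Hz tone sampled at 8 kHz.
samples = [int(10000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
payload = pack_features(front_end(samples))
raw_bytes = len(samples) * 2  # size if the same second were sent as 16-bit PCM
print(len(payload), "feature bytes vs", raw_bytes, "raw audio bytes")
```

In this toy setup the device transmits far fewer bytes than the raw audio would need, which is one reason the approach suits mobile devices.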
Hidden Markov Model (HMM)
▪ Modern general-purpose speech recognition systems are generally based on HMMs.
▪ Statistical model.
▪ The most popular statistical model in natural language processing.
▪ Can be trained automatically; simple and computationally feasible to use.
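As an illustration of why HMMs are computationally feasible, here is a minimal sketch of Viterbi decoding, the standard dynamic-programming method for finding the most likely hidden-state sequence. The two-state model (silence vs. speech, observed through low/high energy) and all of its probabilities are made up for this example and do not come from the slides.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]

# Hypothetical toy model: is each frame silence or speech?
states = ["sil", "speech"]
start_p = {"sil": 0.8, "speech": 0.2}
trans_p = {"sil": {"sil": 0.7, "speech": 0.3},
           "speech": {"sil": 0.2, "speech": 0.8}}
emit_p = {"sil": {"low": 0.9, "high": 0.1},
          "speech": {"low": 0.2, "high": 0.8}}

path, prob = viterbi(["low", "high", "high", "low"], states, start_p, trans_p, emit_p)
print(path)  # ['sil', 'speech', 'speech', 'sil']
```

The cost grows linearly with the number of frames and quadratically with the number of states, rather than exponentially as it would if every possible path were enumerated.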
Artificial neural network
▪ Computational model based on biological neural networks.
▪ Learns non-linear relations directly from training data.
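To show what a non-linear relation means here, the sketch below builds a tiny two-layer network that computes XOR, a function no single linear unit can represent. The weights are hand-picked for clarity rather than learned; a real network would obtain them through training. All function names are hypothetical.

```python
def step(x):
    """Threshold activation: fire (1) when the weighted input is positive."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs followed by the activation."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def xor_network(x1, x2):
    """Two hidden neurons (OR and AND) feed one output neuron (OR and not AND)."""
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)
    h_and = neuron([x1, x2], [1.0, 1.0], -1.5)
    return neuron([h_or, h_and], [1.0, -1.0], -0.5)

outputs = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)  # {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

The hidden layer is what makes the non-linear mapping possible; removing it leaves a single neuron that can only separate its inputs with a straight line.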
Performance of Speech Recognition
▪ Isolated Word Recognition (approx. 95~97%)
- High accuracy
- Very limited vocabulary
- Short commands or simple control
▪ Continuous Word Recognition (approx. 85~90%)
- Lower accuracy
- Higher than 95% with a small vocabulary (1,000 ~ 3,000 words)
Performance of Speech Recognition
Researcher  Feature                      DB        Words   Accuracy
IBM         Isolated Word Recognition    English   20,000  95.0%
NEC         Isolated Word Recognition    Japanese   1,800  97.5%
ATR         Continuous Word Recognition  Japanese   1,035  95.3%
SRI         Continuous Word Recognition  English    1,000  95.2%
CMU         Continuous Word Recognition  ATIS       3,000  95.0%
Ney         Continuous Word Recognition  NAB’94    20,000  84.6%
Cambridge   Continuous Word Recognition  HUB4      32,800  83.8%
KAIST       Continuous Word Recognition  Korean     3,064  96.7%
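The slides do not say how these accuracy figures were measured. A common convention, assumed here, is word accuracy: one minus the word error rate, where errors are the minimum number of substituted, deleted and inserted words needed to turn the recognizer output into the reference transcript. The sketch below computes it with a word-level edit distance; the function names and the sample sentences are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Minimum substitutions + deletions + insertions (edit distance over words)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word deleted
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - (errors / number of reference words)."""
    return 1.0 - word_errors(reference, hypothesis) / len(reference)

reference = "turn the lights on".split()
hypothesis = "turn lights off".split()
print(word_errors(reference, hypothesis))    # 2 (one deletion, one substitution)
print(word_accuracy(reference, hypothesis))  # 0.5
```

Under this convention, a reported accuracy of 95.0% corresponds to roughly one word error per twenty reference words.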
Optimal conditions
Optimal conditions assume that users:
▪ have speech characteristics which match the training data.
▪ can achieve proper speaker adaptation.
▪ work in a low-noise environment (e.g. quiet office or laboratory space).
Applications – medical transcription
▪ MT (Medical Transcription)
- Searches, queries, and form filling may all be faster to perform by voice than by using a keyboard.
Applications - People with Disabilities
▪ Deaf Telephony
▪ Voice To Text
▪ Captioned Telephone
▪ Operating a mouse with the mouth
Prof. Sang-Mook Lee, Seoul National University,
School of Earth and Environmental Science
Perl Scripting with Windows Vista
Applications – Further Applications
▪ Automatic Translation
▪ Automotive Speech Recognition (e.g. Ford Sync)
▪ Telematics (e.g. vehicle navigation systems)
▪ Court reporting
▪ Hands-free computing: voice-command computer user interface
▪ Mobile telephony, including mobile email
▪ Transcription (digital speech-to-text)
▪ Air Traffic Control Speech Recognition
Tom Clancy’s ENDWAR
Speech Recognition Software
▪ Korea
VoiceTech ByVoice 2.0
▪ U.S.
Dragon NaturallySpeaking