Speak up: how to build speech-based apps

Agenda
• The Microsoft Language Development Center (MLDC)
• Speech technology:
– Speech recognition.
– Text-to-speech synthesis.
• Developing speech applications (client-side):
– The managed SpeechFX API.
– Development and demos.
• Developing speech applications (server-side):
– Microsoft Office Communications Server 2007: Speech Server.
– Development and demos.
• Public downloads and resources:
– Beta Program + TAP Program
– Bits for Portuguese!

MLDC - Microsoft Language Development Center, Portugal
• A Microsoft Development Center created in Portugal
– Established in November 2005; operations began in March 2006
– http://www.microsoft.com/portugal/mldc
– Miguel Dias (Director, FTE) + 10 staff (engineers and linguists)
– One of 4 Microsoft Development Centers in Europe, and the first outside Redmond (USA) dedicated to local language development.
– An expansion of Microsoft's speech processing components group, based in Redmond, USA
– Co-funded by PRIME-NITEC
http://www.microsoft.com/portugal/mldc

MLDC long-term mission and lines of action
Long-term mission: expand local development of language technologies in the EMEA region for a range of Microsoft products and platforms (Vista, Exchange, Office, Mobility, Media Center, Xbox)
Starting with the Portuguese language
Lines of action:
1. Cooperation with the most innovative universities and R&D institutes in Portugal and the EMEA region, in the speech and natural language domains
2. Development of language resources and technologies in Portugal and EMEA
3. Participation in consortium R&D projects under national (FCT, PRIME-IDEA, PRIME-NITEC) and European (FP7) programmes

Speech recognition and speech synthesis technology

Speech Recognition - SR
• Also known as Automatic Speech Recognition.
• Characteristics of an SR system:
– Operating modes:
  • Command and control,
  • Dictation (or spontaneous speech)
– Speaker dependence.
– Speaker adaptation.
– Main evaluation metrics: accuracy and speed

Speech Recognition
• How it works
– Hidden Markov Models: statistical models based on probabilities.
– Speech is a sequence of words.
– Each word consists of a sequence of sounds (phonemes).
– Confidence scoring.
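
To make the Hidden Markov Model idea concrete, here is a minimal sketch of Viterbi decoding, the standard way to find the most likely sequence of hidden states (for example phonemes) behind a sequence of observations. It is not taken from the Microsoft recognizer: the two states, the two observation symbols and every probability are invented for illustration.

```csharp
using System;

static class ViterbiSketch
{
    // Returns the most likely hidden-state path for a sequence of observation indices.
    // start[s] = P(first state is s); trans[p, s] = P(s | p); emit[s, o] = P(o | s).
    public static int[] Decode(int[] obs, double[] start, double[,] trans, double[,] emit)
    {
        int n = start.Length, t = obs.Length;
        var logp = new double[t, n]; // best log-probability of a path ending in state s at step i
        var back = new int[t, n];    // predecessor state on that best path

        for (int s = 0; s < n; s++)
            logp[0, s] = Math.Log(start[s]) + Math.Log(emit[s, obs[0]]);

        for (int i = 1; i < t; i++)
            for (int s = 0; s < n; s++)
            {
                int best = 0;
                double bestScore = double.NegativeInfinity;
                for (int p = 0; p < n; p++)
                {
                    double score = logp[i - 1, p] + Math.Log(trans[p, s]);
                    if (score > bestScore) { bestScore = score; best = p; }
                }
                logp[i, s] = bestScore + Math.Log(emit[s, obs[i]]);
                back[i, s] = best;
            }

        // Pick the best final state, then follow the back-pointers.
        int last = 0;
        for (int s = 1; s < n; s++)
            if (logp[t - 1, s] > logp[t - 1, last]) last = s;

        var path = new int[t];
        path[t - 1] = last;
        for (int i = t - 1; i > 0; i--)
            path[i - 1] = back[i, path[i]];
        return path;
    }

    static void Main()
    {
        // Toy model (all numbers invented): state 0 = "vowel", state 1 = "consonant";
        // observation 0 = "high-energy frame", observation 1 = "low-energy frame".
        double[] start = { 0.5, 0.5 };
        double[,] trans = { { 0.7, 0.3 }, { 0.4, 0.6 } };
        double[,] emit = { { 0.9, 0.1 }, { 0.2, 0.8 } };

        int[] path = Decode(new[] { 0, 0, 1, 1 }, start, trans, emit);
        Console.WriteLine(string.Join(" ", Array.ConvertAll(path, p => p.ToString())));
        // prints "0 0 1 1"
    }
}
```

A real recognizer works on acoustic feature vectors and far larger models, and the probability of the winning path is one possible input to a confidence score.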

Text-to-speech synthesis
Turning the text “É fácil sintetizar fala” (“Synthesizing speech is easy”) into a waveform.
[Figure: speech waveform, amplitude against time]

Text-to-speech synthesis - TTS
• Artificial production of human speech.
• Typically, converting a textual representation into speech in an audio format.
• How does it work? Techniques:
– Concatenative synthesis
– Formant
– Articulatory
– HMMs
• The voice font: the voice talent's speech, stored as a set of individual sound segments.
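
As a rough illustration of concatenative synthesis over a voice font, the sketch below looks up a stored segment for each unit and joins the segments into one sample buffer. The unit names and sample values are hypothetical, and a production engine would also select among many candidate segments and smooth the joins, which this sketch does not attempt.

```csharp
using System;
using System.Collections.Generic;

static class ConcatenativeSketch
{
    // Joins the stored segment for each requested unit into a single sample buffer.
    public static short[] Synthesize(IDictionary<string, short[]> voiceFont, IEnumerable<string> units)
    {
        var output = new List<short>();
        foreach (string unit in units)
        {
            short[] segment;
            if (!voiceFont.TryGetValue(unit, out segment))
                throw new KeyNotFoundException("No recorded segment for unit: " + unit);
            output.AddRange(segment);
        }
        return output.ToArray();
    }

    static void Main()
    {
        // Hypothetical voice font: three units with made-up sample values.
        var voiceFont = new Dictionary<string, short[]>
        {
            { "f", new short[] { 10, 20 } },
            { "a", new short[] { 300, 400, 300 } },
            { "l", new short[] { 50 } },
        };

        short[] audio = Synthesize(voiceFont, new[] { "f", "a", "l", "a" });
        Console.WriteLine(audio.Length + " samples"); // prints "9 samples"
    }
}
```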

Engines and Language Packs
• Microsoft Speech Technology.
• Two main core engines:
– The SR engine.
– The TTS engine.
– Both are language-independent.
• Speech Language Packs (LPs): language-specific files.
• Typically, LPs contain:
– language-dependent recognizer data.
– language-dependent synthesizer data.
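
One way to see which recognizers and voices are installed on a machine is to enumerate them through System.Speech. This is a small sketch, assuming a Windows machine with .NET Framework 3.0 or later and a reference to System.Speech.dll; what it prints depends entirely on what is installed.

```csharp
using System;
using System.Speech.Recognition;
using System.Speech.Synthesis;

static class InstalledEnginesSketch
{
    static void Main()
    {
        // Installed recognizers, each tied to a culture such as en-US.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
            Console.WriteLine("SR:  {0} ({1})", info.Name, info.Culture.Name);

        // Installed text-to-speech voices.
        using (var synthesizer = new SpeechSynthesizer())
        {
            foreach (InstalledVoice voice in synthesizer.GetInstalledVoices())
                Console.WriteLine("TTS: {0} ({1})", voice.VoiceInfo.Name, voice.VoiceInfo.Culture.Name);
        }
    }
}
```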

TalkToMe

Developing speech applications (client-side)

The new Speech API
• The managed SpeechFX API.
• What is in the System.Speech namespace:
– System.Speech.Recognition
– System.Speech.Synthesis
• Publicly available in .NET Framework 3.0
– .NET Framework 3.0 components: WPF, WCF, WWF, CardSpace, SpeechFX!

The new Speech API
• What already ships with Vista:
– The .NET Framework 3.0 runtime, including SpeechFX.
– The English recognizer (French, German, Spanish, Japanese and Chinese are also available).
– The English synthesizer: the “Anna” voice.
– The “Windows Speech Recognition User Experience”
• For XP: download .NET Framework 3.0

System.Speech.Synthesis
• How to use it?
• Includes support for custom spoken pronunciations, the W3C SSML XML standard, saving output to a wave file, and changing synthesis rate and volume.
    using System.Speech.Synthesis;
    SpeechSynthesizer synthesizer = new SpeechSynthesizer();
    synthesizer.Speak("Olá mundo!"); // speaks synchronously on the default audio device
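
The features listed on this slide can be sketched as follows. This is a minimal example rather than the presenters' demo code: the file name hello.wav and the spoken sentences are placeholders, and it assumes a Windows machine with an installed voice and a reference to System.Speech.dll.

```csharp
using System;
using System.Speech.Synthesis;

static class SynthesisSketch
{
    static void Main()
    {
        using (var synthesizer = new SpeechSynthesizer())
        {
            synthesizer.Rate = 2;    // valid range is -10 (slowest) to 10 (fastest)
            synthesizer.Volume = 80; // valid range is 0 to 100

            // Send the audio to a wave file instead of the speakers.
            synthesizer.SetOutputToWaveFile("hello.wav"); // placeholder file name

            // PromptBuilder assembles a prompt that is rendered as SSML.
            var prompt = new PromptBuilder();
            prompt.AppendText("Welcome to the demo.");
            prompt.AppendBreak(PromptBreak.Small);
            prompt.AppendText("This sentence is spoken slowly.", PromptRate.Slow);
            synthesizer.Speak(prompt);

            // Inspect the SSML markup that was generated.
            Console.WriteLine(prompt.ToXml());

            // Switch back to the default audio device for audible output.
            synthesizer.SetOutputToDefaultAudioDevice();
            synthesizer.Speak("Done.");
        }
    }
}
```

Hand-written SSML can be spoken with SpeakSsml instead, and PromptBuilder also offers AppendTextWithPronunciation for custom pronunciations.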

System.Speech.Recognition
• How to use it?
– Build a grammar.
– Load the grammar into the recognizer.
– Register for events (SpeechRecognized, SpeechHypothesized, SpeechDetected, …)
– Start recognition!
– Includes support for complex grammars, semantic values, the W3C SRGS XML standard, wave file input, recognition confidence values and alternate recognition choices.
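
The four steps above can be sketched as a small console program that listens continuously and logs each event. The command words are invented for the example, and the code assumes an installed en-US recognizer, a working microphone and a reference to System.Speech.dll.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

static class RecognitionEventsSketch
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // 1. Build a grammar from a fixed set of commands (invented for this sketch).
            var builder = new GrammarBuilder(new Choices("start", "stop", "pause"));
            builder.Culture = recognizer.RecognizerInfo.Culture; // must match the recognizer

            // 2. Load the grammar into the recognizer.
            recognizer.LoadGrammar(new Grammar(builder));

            // 3. Register for events.
            recognizer.SpeechDetected += (s, e) => Console.WriteLine("Speech detected at {0}", e.AudioPosition);
            recognizer.SpeechHypothesized += (s, e) => Console.WriteLine("Hypothesis: {0}", e.Result.Text);
            recognizer.SpeechRecognized += (s, e) => Console.WriteLine("Recognized: {0}", e.Result.Text);
            recognizer.SpeechRecognitionRejected += (s, e) => Console.WriteLine("Rejected");

            // 4. Start recognition; Multiple keeps listening after each result.
            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening. Press Enter to quit.");
            Console.ReadLine();
            recognizer.RecognizeAsyncStop();
        }
    }
}
```

Calling SetInputToWaveFile in place of SetInputToDefaultAudioDevice runs the same grammar against recorded audio.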

System.Speech.Recognition
• How to use it?
    using System;
    using System.Speech.Recognition;
    using System.Windows.Forms; // for MessageBox

    SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));
    void init()
    {
        // Build a grammar from a fixed list of phrases.
        Choices pizzaChoices = new Choices();
        pizzaChoices.Add("I'd like a cheese pizza");
        pizzaChoices.Add("I'd like a pepperoni pizza");
        pizzaChoices.Add("I'd like a large pepperoni pizza");
        Grammar pizzaGrammar = new Grammar(new GrammarBuilder(pizzaChoices));
        recognizer.LoadGrammar(pizzaGrammar);
        // Subscribe before starting recognition.
        pizzaGrammar.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(PizzaGrammar_SpeechRecognized);
        // An input source must be set before recognizing.
        recognizer.SetInputToDefaultAudioDevice();
        // The slide elides the arguments; the parameterless overload performs one synchronous recognition.
        recognizer.Recognize();
    }
    void PizzaGrammar_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        MessageBox.Show(e.Result.Text);
    }
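
The pizza example can be extended to use semantic values, the confidence score and the alternate choices mentioned earlier. This is a sketch under assumptions: the 0.6 threshold, the semantic key "topping" and the reply texts are all invented, and it needs an installed en-US recognizer, a microphone and a reference to System.Speech.dll.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

static class PizzaOrderSketch
{
    const float MinConfidence = 0.6f; // hypothetical threshold; tune per application

    // Decides how to respond to a recognized topping given its confidence.
    public static string Decide(string topping, float confidence)
    {
        return confidence >= MinConfidence
            ? "Ordering a " + topping + " pizza."
            : "Sorry, did you say " + topping + "?";
    }

    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Map each spoken topping to a semantic value the application can act on.
            var toppings = new Choices();
            toppings.Add(new GrammarBuilder(new SemanticResultValue("cheese", "CHEESE")));
            toppings.Add(new GrammarBuilder(new SemanticResultValue("pepperoni", "PEPPERONI")));

            var builder = new GrammarBuilder("I'd like a");
            builder.Culture = recognizer.RecognizerInfo.Culture; // must match the recognizer
            builder.Append(new SemanticResultKey("topping", new GrammarBuilder(toppings)));
            builder.Append("pizza");
            recognizer.LoadGrammar(new Grammar(builder));

            recognizer.SetInputToDefaultAudioDevice();
            RecognitionResult result = recognizer.Recognize(); // null if nothing was recognized
            if (result == null)
            {
                Console.WriteLine("Nothing recognized.");
                return;
            }

            string topping = (string)result.Semantics["topping"].Value;
            Console.WriteLine(Decide(topping, result.Confidence));

            // Other phrases the engine considered, each with its own confidence.
            foreach (RecognizedPhrase alternate in result.Alternates)
                Console.WriteLine("  alternate: {0} ({1:F2})", alternate.Text, alternate.Confidence);
        }
    }
}
```

Keeping the decision in a separate method makes the threshold logic easy to check without audio hardware.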

Speech Sample

RecoTuga

SpeechWiki

Media Center

Speech in Robotics

Microsoft Robotics Studio
• A development environment that makes it easy to create applications for a wide variety of platforms (robots).
• Provides a virtual environment that simulates the real world!
• A very easy-to-use interface!
• For experts and beginners alike!

Lego Mindstorms
1. 32-bit ARM7 microcontroller
2. Touch sensor
3. Sound sensor
4. Light sensor
5. Distance sensor
6. Motors

Developing speech applications (server-side)

OCS 2007 Speech Server
• OCS 2007 Speech Server is included in Microsoft® Office Communications Server 2007
• Main components:
– Authoring and debugging
– Reporting, Analysis and Tuning
– Telephony
– Operations, Administration, Maintenance

Supported languages: 14 (SR / TTS)
SR and TTS support:
• North American English
• UK English
• Canadian French
• German
• American Spanish
TTS support:
• Chinese (Mandarin + Traditional), English (Australia), French (France), Italian (Italy), Japanese (Japan), Korean (Korea), Portuguese (Brazil), Spanish (Spain).

Dialog Workflow Designer

InfoService

InfoService dialog flow
• Welcome: “Welcome to the Microsoft Portugal information service.”
• Main menu: “Main menu. Say news, traffic or weather.”
• News: “Say [category], search or main menu.”
– Category: “You chose the Sports category. There are 3 new stories. First story...”
– Search: “Say one or more search terms.”, then “Found 2 stories containing the term...”
• Traffic: “Say the name of the road you want to check, for example IC19, or main menu.”
– Result: “IC19. Traffic restricted in the direction of...”
• Weather: “Say Lisboa, Porto or main menu.”
– Result: “Lisboa. Current conditions..., for tomorrow...”
• Global commands/grammar: “Menu Principal” (main menu); “Iniciar” (start); “Reiniciar” (restart); “Voltar” (back); “Terminar” (end)
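
The shape of this dialog, a main menu with three branches plus commands that work from any state, can be sketched as a small state machine. This is plain C# for illustration and not the Speech Server or Dialog Workflow Designer API. The state names are invented, only two of the global commands are handled, and treating “Menu Principal” as a jump to the main menu and “Terminar” as ending the call is an assumption about their behaviour.

```csharp
using System;

enum State { MainMenu, News, Traffic, Weather, Ended }

static class InfoServiceSketch
{
    // Returns the next dialog state for an utterance.
    // Global commands are checked first, so they work from every state.
    public static State Next(State current, string utterance)
    {
        string said = utterance.Trim().ToLowerInvariant();

        // Global commands (assumed behaviour; "iniciar", "reiniciar" and "voltar" are omitted here).
        if (said == "menu principal") return State.MainMenu;
        if (said == "terminar") return State.Ended;

        // Menu choices are only valid in the main menu.
        if (current == State.MainMenu)
        {
            switch (said)
            {
                case "notícias": return State.News;    // news
                case "trânsito": return State.Traffic; // traffic
                case "tempo": return State.Weather;    // weather
            }
        }

        return current; // not understood: stay in the same state and re-prompt
    }

    static void Main()
    {
        State state = State.MainMenu;
        foreach (string said in new[] { "trânsito", "menu principal", "tempo", "terminar" })
        {
            state = Next(state, said);
            Console.WriteLine("{0} -> {1}", said, state);
        }
    }
}
```

In a speech application, each state would load its own grammar alongside one shared grammar for the global commands.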

Get the bits!
• For desktop clients:
– The SpeechFX API in .NET Framework 3.0 (included in Vista; download required for XP).
– Language Packs are included in Vista.
• For servers: OCS 2007 Speech Server
– Several Language Packs are included.

Portuguese Language Packs
• MLDC offers Beta Evaluation Programs for its technology.
• Beta LPs for European and Brazilian Portuguese for:
– Desktop client:
  • Portuguese speech recognizer.
– Speech Server:
  • Portuguese speech recognizer + synthesizer.
• Full details at:
– http://www.microsoft.com/portugal/mldc/betaprograms/
• Invitation code: MLDC-BKBY-DTBD
• http://connect.microsoft.com

Resources
• Microsoft Language Development Center
– http://www.microsoft.com/portugal/mldc
– Beta Program + Projects + Videos + Demos + News
• MLDC Blog
– http://blogs.msdn.com/tagarela/
• Microsoft Speech
– http://www.microsoft.com/speech/

Resources
• .NET Framework 3.0 SpeechFX API for client-side speech development:
– Intro article:
  • http://msdn.microsoft.com/msdnmag/issues/06/01/speechinWindowsVista/
– .NET Framework 3.0 runtime download (for XP):
  • http://www.microsoft.com/downloads/details.aspx?FamilyId=10CC340B-F857-4A14-83F5-25634C3BF043&displaylang=en
– Managed SpeechFX API documentation (MSDN):
  • http://msdn2.microsoft.com/en-us/library/system.speech.recognition.aspx
  • http://msdn2.microsoft.com/en-us/library/system.speech.synthesis.aspx
– “Windows Speech Recognition” User Experience in Windows Vista:
  • http://www.microsoft.com/enable/products/windowsvista/speech.aspx
  • http://www.microsoft.com/windows/products/windowsvista/features/details/speechrecognition.mspx
• MLDC Client Beta Program:
– http://www.microsoft.com/portugal/mldc/betaprograms/winclientdesktop.mspx
• MLDC Client Demo Videos:
– http://www.microsoft.com/portugal/mldc/projects/speechapps.mspx

Resources
• “Microsoft Office Communications Server 2007 Speech Server” for IVR server-side speech development:
– Microsoft Office Communications Server 2007 Speech Server Developer Edition download:
  • http://www.microsoft.com/downloads/details.aspx?FamilyId=BB183640-4B8F-4828-80C9-E83C3B2E7A2C&displaylang=en
– OCS 2007 Speech Server documentation (MSDN):
  • http://msdn2.microsoft.com/en-us/library/bb857803.aspx
– Books and webcasts are also available.
• MLDC TAP Program:
– http://www.microsoft.com/portugal/mldc/betaprograms/officecomserv07spserv.mspx
• MLDC Server Demo Videos:
– http://www.microsoft.com/portugal/mldc/projects/europtconnect.mspx
– http://www.microsoft.com/portugal/mldc/news/feb08_Techdays2008.mspx
– http://www.microsoft.com/portugal/mldc/projects/ExchangeServer2007.mspx
www.microsoft.com/portugal/mldc