UPA NYC, Jonathan Bloom

The Wheat and Chaff of Speech Recognition:

Jonathan Bloom, Ph.D.

Agenda:

- When to use speech
- How to spec
- How to test

Definitions:

- Dictation (PC mostly)
- Command and Control (PC, phone, PDA, cell, cars)
- Multimodal (PC, cars)
  - asynchronous
  - synchronous
- TTS - “Text to Speech”

GUI: Warts and All:

SUI: Warts and All:

SUI: Warts in Places You Didn’t Check:

- Taxes computer memory – requires tradeoffs
  - Speaker dependence
  - Vocabulary size
- Taxes human memory
  - Remember command wording
  - Remember how to speak

Speech for the Right Reasons:

- Hands busy or disabled?
- Eyes busy?
- Repetitive task?
- Small form factor?
- Noisy environment?

Good examples:

- CAD on desktop
  - Hands available for mouse and keyboard tasks
- Dictation on desktop
  - For RSI
  - For loosely formatted text
- Navigation destination entry in car
- Auto attendant on phone

Bad Examples:

- Email by phone
- Speech to replace long touch-tone menus
- Browse web by voice

This could get me fired.

How to write a spec (speech only):

- Start with sample interactions
  - not comprehensive
  - represent main legs
  - validate feel
  - audio version as well
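As an illustration of the sample-interaction idea above, here is a minimal Python sketch of one "main leg" written as a list of turns. Everything in it is hypothetical and not from the talk: the `Turn` type, the `render` helper, and the auto-attendant wording are assumptions chosen only to show the shape of such a script.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # "System" or "Caller"
    text: str     # what is said on this turn


# Hypothetical main leg for an auto attendant (wording is illustrative only).
MAIN_LEG = [
    Turn("System", "Thanks for calling. Who would you like to speak with?"),
    Turn("Caller", "Jane Doe."),
    Turn("System", "Connecting you to Jane Doe."),
]


def render(turns):
    """Format a sample interaction as a readable script, one turn per line."""
    return "\n".join(f"{t.speaker}: {t.text}" for t in turns)


print(render(MAIN_LEG))
```

A script in this form covers one path rather than every branch, and it can be read aloud or recorded to produce the audio version used to validate the feel.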

How to write a spec (speech only):

ETC...

Directing Voice Talent:

- Huge part of usability
- Common problems
  - No pause before commands
  - Too ‘friendly’ or too ‘cold’
  - Not a conversation
- Engages caller and makes app more understandable

Usability Testing Speech Apps:

- More is the same than different, except…
  - No real-time discussion
  - Leading wording becomes major concern
  - Need to capture two audio channels (THAT-1)
  - 2 audio and 2 video for multimodal
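One way to keep the two captured audio channels together for later review is to store them as the left and right channels of a single stereo file. The sketch below is an assumption about tooling, not the setup used in the talk: it uses Python's standard `wave` module, assumes both sides are already available as 16-bit mono samples at the same rate, and uses 8000 Hz (typical telephone audio) as a default. The function names are hypothetical.

```python
import array
import sys
import wave


def interleave_stereo(left, right):
    """Interleave two mono 16-bit sample sequences as L/R stereo frames.

    The shorter channel is padded with silence (zeros).
    """
    n = max(len(left), len(right))
    frames = array.array("h")
    for i in range(n):
        frames.append(left[i] if i < len(left) else 0)
        frames.append(right[i] if i < len(right) else 0)
    return frames


def write_stereo_wav(target, left, right, rate=8000):
    """Write caller (left) and system (right) audio to one stereo WAV.

    `target` may be a file path or a writable, seekable binary file object.
    """
    frames = interleave_stereo(left, right)
    if sys.byteorder == "big":
        frames.byteswap()  # WAV sample data is little-endian
    with wave.open(target, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(frames.tobytes())
```

Keeping both sides in one file preserves their timing relative to each other, so a reviewer can hear exactly when the participant spoke relative to each prompt.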

Changing Field:

- SpeakFreely®, SayAnything®, How May I Help You
- Synchronous multimodal
- NLU?

Thank you.