MLI Site Visit 4 98 overview


Presentation Transcript

Serbo-Croatian LVCSR on the Dictation and Broadcast News Domain: 

Michael Finke, Petra Geutner, Peter Scheytt, Alex Waibel
Interactive Systems Laboratories
Carnegie Mellon University, University of Karlsruhe
Multilingual Informedia Project

Multilingual Informedia Project Partners: 

Informedia (English):
    CMU Informedia Group (Howard Wactlar, Alex Hauptmann, Ricky Houghton, et al.)
    CMU Sphinx Group
Multilingual Speech Recognition:
    CMU/UKA Interactive Systems Labs - JanusRTk (Alex Waibel, Michael Finke, Petra Geutner, Peter Scheytt)
Translation/Cross Language Retrieval:
    CMU Language Technologies Institute (Jaime Carbonell, Eric Nyberg, Bob Frederking, Paul Kennedy, et al.)
DARPA N66001-97-D-8502 Delivery Order 0001

Slide3: 

Introduction and Motivation

Serbo-Croatian BN Speech Performance: 

Broadcast News System

Vocabulary Growth Per Broadcast: 

Broadcast News System
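The chart for this slide is not part of the transcript. As a rough illustration of what a vocabulary growth curve measures, the sketch below counts, for each broadcast in sequence, how many previously unseen word forms it adds and how large the cumulative vocabulary becomes. The function name `vocabulary_growth`, the lowercasing step, and the toy broadcasts are assumptions made for this example, not details from the presentation.

```python
def vocabulary_growth(broadcasts):
    """Return one (new_words, cumulative_vocabulary_size) pair per broadcast.

    `broadcasts` is a list of token lists, in chronological order.
    Tokens are lowercased before counting (an assumption of this sketch).
    """
    seen = set()
    growth = []
    for tokens in broadcasts:
        size_before = len(seen)
        seen.update(token.lower() for token in tokens)
        growth.append((len(seen) - size_before, len(seen)))
    return growth


# Hypothetical toy data: three very short "broadcasts".
toy_broadcasts = [
    ["vlada", "je", "danas"],
    ["vlada", "je", "juče", "odlučila"],
    ["danas", "je"],
]

for index, (new_words, total) in enumerate(vocabulary_growth(toy_broadcasts), start=1):
    print(f"broadcast {index}: {new_words} new word forms, {total} in total")
```

In a highly inflected language, the number of new word forms per broadcast tends to stay high for longer than in English, which is one reason a fixed recognition vocabulary leaves many words uncovered.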

Language Normalization: 

Transcription and pronunciation variants
One single form for Serbian/Croatian variants
Unique orthography for foreign words
Transcription errors
Vocabulary: 48K
OOV rate: 7.9% (without normalization: 10.2%)
Interpolated language models: larger improvement than without normalization
Broadcast News System
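The effect of normalization on the out-of-vocabulary (OOV) rate can be illustrated with a minimal sketch. It assumes that normalization is a simple table lookup mapping each variant spelling to one canonical form before the vocabulary check. The variant table, the toy vocabulary, and the function names (`normalize`, `oov_rate`, `relative_reduction`) are hypothetical and do not describe the authors' implementation. Only the 10.2% and 7.9% figures come from the slide.

```python
def normalize(token, variant_map):
    """Lowercase a token and map it to its canonical form if one is listed."""
    token = token.lower()
    return variant_map.get(token, token)


def oov_rate(tokens, vocabulary, variant_map=None):
    """Percentage of running tokens that are not in the vocabulary."""
    variant_map = variant_map or {}
    normalized = [normalize(token, variant_map) for token in tokens]
    if not normalized:
        return 0.0
    misses = sum(1 for token in normalized if token not in vocabulary)
    return 100.0 * misses / len(normalized)


def relative_reduction(before, after):
    """Relative reduction, in percent, when going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Hypothetical toy data: two Ekavian spellings mapped to Ijekavian forms.
toy_variant_map = {"mleko": "mlijeko", "reka": "rijeka"}
toy_vocabulary = {"mlijeko", "rijeka", "je", "dobro"}
toy_tokens = ["mleko", "je", "dobro", "rijeka", "reka", "vesti"]

print(oov_rate(toy_tokens, toy_vocabulary))                   # without normalization
print(oov_rate(toy_tokens, toy_vocabulary, toy_variant_map))  # with normalization

# Relative OOV reduction implied by the figures on the slide.
print(round(relative_reduction(10.2, 7.9), 1))
```

With the slide's figures, going from 10.2% to 7.9% is an absolute reduction of 2.3 points, or roughly 22.5% relative.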

Hypothesis Driven Lexicon Adaptation: 

Broadcast News System

Information: 

Publications and Presentation:
    ISL: http://www.is.cs.cmu.edu
    INFORMEDIA: http://www.informedia.cs.cmu.edu
    LTI: http://www.lti.cs.cmu.edu
    http://www.cs.cmu.edu/~scheytt
Email:
    scheytt@cs.cmu.edu
    pgeutner@ira.uka.de
    finkem@cs.cmu.edu
    waibel@cs.cmu.edu