A Spoken Dialog System to Access a Newspaper Web Site
César González Ferreras (UVA), Rubén San-Segundo Hernández (UPM), Valentín Cardeñoso Payo (UVA)
Universidad Politécnica de Madrid
Dialog Systems based on XML Technologies, Berliner XML Tage 2004

Contents:
- Introduction and Related Work
- System Overview
- System Architecture
- Interaction Model
- Information Model
- Sample Interaction
- Conclusions and Future Work

Introduction:
- Goal: provide vocal access to existing Internet content.
- Advantages of vocal interaction over traditional visual-only web browsing:
  - Speech is more natural for most people.
  - It suits users with special needs (e.g. blind people).
  - It is ideal for hands-free, eyes-busy environments.
  - It is a solution for mobile devices, which allow web access anytime, anywhere but still have limited display capabilities.
- Spoken dialog systems for accessing structured information stored in databases are mature [La99, Zu00].
- Textual information is massive, and a speech interface has limitations: it is sequential and not persistent.
- An efficient and natural way of interaction is required.

Related Work:
Approaches to making web content available through speech:
- Add a vocal interface to an existing web browser [HT95, Ve03].
- Convert HTML content into VoiceXML [Go00, FKL01].
- Restrict the solution to selected on-line resources [La97, PCS03].
- Extend a traditional information retrieval system with a speech interface [Cr99, Ch02].

System Overview:
- Objective: develop a spoken dialog system to access a newspaper web site.
- We use two strategies to access information:
  - Browse: review which information is available.
  - Query: satisfy a specific information need.
- To describe each strategy, we use two models:
  - Interaction model: describes how the system converses with the user.
  - Information model: describes how the web content must be processed and structured to support that interaction.

Browse:
- The user does not have a specific information need and wants to know which information is available.
- Interaction Model: the information must be presented gradually, at different levels of detail.
- Information Model: the information must be organized in groups of items, with each item available at different levels of detail: first a headline, then a short description, and finally the full text.

Query:
- The user has a specific information need that can be expressed as a query.
- Interaction Model: the system searches and presents the results to the user.
- Information Model: an inverted index is used. It contains, for each term in the lexicon, the list of documents in which that term appears. We use the vector space model [SWY75].

System Architecture:
- Information Manager:
  - HTML pages are converted into XML using Tidy and XSLT.
  - A browsing tree is built (based on sections and news stories).
  - An inverted index is built.
- Dialog Manager:
  - VoiceXML is used as the language to describe dialogs.
  - Java Servlet technology (Tomcat).
- VoiceXML Browser:
  - The system works for the Spanish language.
  - Our own VoiceXML interpreter.
  - Speech recognition and synthesis from Universidad Politécnica de Cataluña.
  - Dialogic telephone card.

Interaction Model:
- A system-initiative strategy controls the dialog flow (finite state diagrams mapped into VoiceXML).
- A large, dynamically generated vocabulary (2000 words) is divided into several smaller ones (50-100 words), each associated with one state of the dialog. This yields a higher speech recognition rate.
- The system uses two confirmation strategies, depending on the size of the vocabulary (implicit <25, explicit >25).
- The user can interrupt the system at any time (barge-in).

Interaction Model (Browse):
[Figure: state diagram. States: SECTION, BLOCK, SUMMARY, NEWS. Transitions: <section>, <news>, body, next, previous, back.]

Interaction Model (Query):
[Figure: state diagram. States: SECTION, TERM, OPTIONS, SUMMARY, NEWS. Transitions: <section>, <term> AND results>1, <term> AND results=1, <news>, body, next, previous, back.]

Information Model:
- Built by extracting information from the web site of a local newspaper (El Norte de Castilla).
- Decision tree:
  - The contents of the newspaper are divided into sections.
  - Each section contains several news stories.
  - Each news story is composed of several elements: a headline, a short summary and a body.

Information Model (Browse):
[Figure: browsing tree. Node labels: SECTION1, SECTION2, ...; BLOCK1, B2, B3; SUMMARY1, S2 to S5; NEWS1, N2 to N5.]

Information Model (Query):
- Inverted index:
  - For each news story we extract all the terms.
  - A stemmer is used to remove affixes.
  - For each term we calculate the weight using tf-idf.
  - We use the 25 most relevant components of each news story.
- Term weight: term frequency-inverse document frequency (tf-idf). The following formula is used to compute the weight w of each term in the document.
- tf is the number of times the term occurs in the document.
- df is the number of documents in which that term appears.
- N is the number of documents in the collection.

$$ w = (1 + \log(tf)) \cdot \log\frac{N}{df} $$

- Document collection: stories collected from the newspaper web site over more than a year (71,141 news stories).

Sample Interaction (Browse):
System: Welcome to “El Norte de Castilla” newspaper. Remember that you can interrupt the system anytime. You can ask for help whenever you need it. Choose browse or query to access information.
User: Browse.
System: What section do you want? Local, Spain, international, life, culture or television.
User: International
System: International section. There are ten news stories. Block one. First: Left parties victory in regional French elections. Second: Tunisia cancels the summit of the Arab League about political reforms. Third: New leader of Hamas in Gaza describes George Bush as “enemy of God and Islam”. Fourth: Karzai announces a delay until September of Afghanistan elections. Fifth: United States gives the keys of the first of the twenty five Iraqi ministries. Please, choose the news you want, next or back.
User: Fourth
System: Headline: Karzai announces a delay until September of Afghanistan elections. International Section. Summary: He justifies the delay because of ...

Sample Interaction (Query):
System: Welcome to “El Norte de Castilla” newspaper. Remember that you can interrupt the system anytime. You can ask for help whenever you need it. Choose browse or query to access information.
User: Query.
System: What section do you want? Local, Spain, international, life, culture or television.
User: International
System: International section. Please, tell me the query term.
User: Elections
System: I understood elections. Is it correct?
User: Yes
System: International Section. There are three news stories about elections. First: Left parties victory in regional French elections.
Second: Karzai announces a delay until September of Afghanistan elections. Third: At least 46 dead in the elections. Please, choose the news you want.
User: Second
System: Headline: Karzai announces a delay until September of Afghanistan elections. International Section. Summary: He justifies the delay because of ...

Conclusions:
- We have presented a system that allows speech access to a newspaper web site.
- The Interaction Model combines browse and query mechanisms to let the user access the information.
- The Information Model supports that interaction using two data structures: a decision tree and an inverted index.
- All the content used by the system is obtained automatically from the web.
- We used VoiceXML as the language to describe dialogs.

Future Work:
- We are working on the evaluation of system performance and user satisfaction.
- We will study how users respond to the system, which will allow us to validate the adequacy of the models proposed to access the information.

References:
[Ch02] Chang, E. et al.: A System for Spoken Query Information Retrieval on Mobile Devices. IEEE Transactions on Speech and Audio Processing. 10(8). November 2002.
[Cr99] Crestani, F.: Vocal Access to a Newspaper Archive: Design Issues and Preliminary Investigations. In: ACM Digital Libraries. 1999.
[FKL01] Freire, J.; Kumar, B.; Lieuwen, D. F.: WebViews: Accessing Personalized Web Content and Services. In: International World Wide Web Conference. 2001.
[Go00] Goose, S. et al.: Enhancing Web Accessibility Via the Vox Portal and a Web Hosted Dynamic HTML & VoxML Converter. In: International World Wide Web Conference. May 2000.
[HT95] Hemphill, C. T.; Thrift, P. R.: Surfing the Web by Voice. In: ACM International Conference on Multimedia. 1995.
[La97] Lau, R. et al.: WebGalaxy - Integrating Spoken Language And Hypertext Navigation. In: European Conference on Speech Communication and Technology (Eurospeech). 1997.
[La99] Lamel, L. et al.: The Limsi Arise System For Train Travel Information. In: International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 1999.
[PCS03] Polifroni, J.; Chung, G.; Seneff, S.: Towards the Automatic Generation of Mixed-Initiative Dialogue Systems from Web Content. In: European Conference on Speech Communication and Technology (Eurospeech). 2003.
[SWY75] Salton, G.; Wong, A.; Yang, C. S.: A Vector Space Model for Automatic Indexing. Communications of the ACM. 18(11). November 1975.
[Ve03] Vesnicer, B. et al.: A Voice-driven Web Browser for Blind People. In: European Conference on Speech Communication and Technology (Eurospeech). 2003.
[Zu00] Zue, V. et al.: JUPITER: A Telephone-Based Conversational Interface for Weather Information. IEEE Transactions on Speech and Audio Processing. January 2000.
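
Illustration for the Information Model (Query) slides: the inverted index with tf-idf weights can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names (term_weight, build_index, search), the toy stories and the use of the natural logarithm are all hypothetical choices; the slides do not name a log base, and stemming is assumed to have been applied already.

```python
import math
from collections import Counter, defaultdict


def term_weight(tf: int, df: int, n_docs: int) -> float:
    """Weight from the Term weight slide: w = (1 + log(tf)) * log(N / df).

    The natural logarithm is an assumption; the slides give no base.
    """
    return (1.0 + math.log(tf)) * math.log(n_docs / df)


def build_index(stories: dict[str, list[str]], top_k: int = 25) -> dict[str, dict[str, float]]:
    """Build an inverted index: term -> {story id: weight}.

    `stories` maps a story id to its (already stemmed) terms. Only the
    top_k heaviest terms of each story are indexed, mirroring the
    "25 most relevant components" rule on the slides.
    """
    n_docs = len(stories)
    # Document frequency: in how many stories each term appears.
    df = Counter(term for terms in stories.values() for term in set(terms))
    index: dict[str, dict[str, float]] = defaultdict(dict)
    for story_id, terms in stories.items():
        tf = Counter(terms)
        weights = {t: term_weight(c, df[t], n_docs) for t, c in tf.items()}
        # Heaviest first; ties broken alphabetically so the result is stable.
        best = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        for term, weight in best:
            index[term][story_id] = weight
    return dict(index)


def search(index: dict[str, dict[str, float]], term: str) -> list[str]:
    """Return the ids of stories indexed under `term`, heaviest first."""
    postings = index.get(term, {})
    return sorted(postings, key=lambda sid: (-postings[sid], sid))


# Hypothetical toy collection (three stories, terms already stemmed).
stories = {
    "n1": ["election", "france", "left", "election"],
    "n2": ["election", "afghanistan", "delay"],
    "n3": ["summit", "tunisia", "arab"],
}
index = build_index(stories)
print(search(index, "election"))  # ['n1', 'n2']
```

In this toy collection "election" occurs twice in n1 and once in n2, so n1 receives the larger weight and is listed first. A single-term lookup is enough to mirror the sample query dialog; ranking multi-term queries with the vector space model would additionally require combining the weights of several terms.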