aamas02 Tatlises
Added: October 03, 2007

Presentation Transcript

RRL: A Rich Representation Language for the Description of Agent Behaviour in NECA
  Paul Piwek, ITRI, Brighton
  Brigitte Krenn, OFAI, Vienna
  Marc Schröder, DFKI, Saarbrücken
  Martine Grice, IPUS, Saarbrücken
  Stefan Baumann, IPUS, Saarbrücken
  Hannes Pirker, OFAI, Vienna

NECA
  - Duration: 2.5 years
  - Start: October 2001
  - A new generation of mixed multi-user / multi-agent virtual spaces for the internet
  - Populated by affective conversational agents

Affective Conversational Agents
  - Express themselves through emotional speech and synchronised non-verbal expression

Application Scenarios
  The NECA Platform will be evaluated in two concrete application scenarios:
  - Socialite: a multi-user web application in the social domain
  - eShowRoom: a novel approach to the presentation of products in e-commerce applications

Slide 6: Socialite

NECA’s Architecture
  (diagram built up over six slides; labels are listed in the order in which they are added)
  1. Scene Generator, User Input, Scene Description, Affective Reasoner (AR)
  2. Multi-modal Natural Language Generator (M-NLG), Multi-modal Output
  3. Text/Concept to Speech Synthesis (CTS), Phonetic+Prosodic Information, Emotional Speech
  4. Gesture Assignment Module (GA), Animation directives
  5. Animation Control Sequence, Player-Specific Rendering
  6. RRL (four labels)

Requirements for RRL
  - Application Domain
      - Represent combinations of different types of information
      - Expressivity
  - Processing Modules
      - Ease of manipulation/search (incremental/fast)
  - Developers (Maintainability)
      - Predictability
      - Locality
      - Conciseness
      - Intelligibility

Scene Description
  SG  M-NLG  GA  TTS/CTS
  What is a Scene?
  I Theatr. 1 A subdivision of (an act of) a play, in which the time is continuous and the setting fixed, …; the action and dialogue comprised in any one of these subdivisions. (New Shorter Oxford English Dictionary, 1996)

Scene Descriptions in a Nutshell
  - Network representations: flat, uniform
  - Use the Description Logic distinction between T-box and A-box
  - The T-box defines types, subtypes, attributes and constants
  - Can emulate CFGs, so we can include, e.g., semantic representation languages: Discourse Representation Theory (Kamp & Reyle, 1994)
  - Reification of expressions in the network provides useful handles for interleaving different types of information
  - Lends itself well to graphical representation

Scene Descriptions in a Nutshell
  Further Features of (RRL) Scene Descriptions
  - For communication between modules: XML syntax
  - Temporal relations are explicitly represented
  - Meta-conditions used in DRT for WH-questions, Topics and Bridging Anaphora

eShowRoom Example
  (four slides; no text in the transcript)

Multimodal Output
  Multimodal Natural Language Generation (M-NLG) supplies:
  - Information on emotional state
  - Conceptually rich input for Speech Synthesis
  - Initial specification of gestures and facial expressions for later use in Gesture Assignment

NECA’s Speech Synthesis: Emotions
  - Not restricted to prosody (pitch, duration)
  - Several voice databases: diphone inventories for different voice qualities (modal, loud, soft)
  - Emotive interjections
  - Gradual emotional states: shades of emotion / changing over time

NECA’s Speech Synthesis: Concept-to-Speech
  Concept-to-Speech instead of Text-to-Speech approach:
  - Part of Speech tags
  - Syntactic structure
  - Information status (given/new)
  - Information structure (theme/rheme)

CTS specific information
  (built up over five slides: first the sentence text and voice gesture, then part-of-speech tags on each word, syntactic phrases, information status, and information structure; the final build is shown)

    <sentence>
      <text>This car has leather seats.</text>
      <gesture modality="voice" meaning="beautiful"/>
      <infoStruct part="theme">
        <synPhrase category="NP" function="SB">
          <word text="This" pos="PDAT"/>
          <infoStatus type="referent-given">
            <word text="car" pos="NN"/>
          </infoStatus>
        </synPhrase>
        <infoStruct part="rheme">
          <synPhrase phrase="VP" function="PD">
            <word text="has" pos="VAFIN"/>
            <synPhrase phrase="NP" function="OA">
              <word text="leather seats" pos="NN"/>
            </synPhrase>
            <punct text="." pos="$."/>
          </synPhrase>
        </infoStruct>
      </infoStruct>
    </sentence>

Prosodic/Phonetic Information for GA
  - Phonetics: exact timing of speech sounds, pauses and interjections
  - Prosody: boundary locations for
      - syllables
      - words
      - prosodic phrases

Prosodic/Phonetic Information for GA
  Information on:
  - syllables bearing word stress
  - position and type of sentence accents
  - position and type of prosodic boundaries

Animation directives
  Phonetic information (phonemes) used for specifying:
  - visemes
  - breathing

Animation directives
  Prosodic information (stress, accents, phrasing) used for specifying:
  - synchronization of gestures with speech
  - eye-blinking
  - gaze

Conclusions
  - RRL is a representation language for the wide range of expert knowledge required at the interfaces of NECA modules.
  - Scene Descriptions: uniform representation/integration of different types of information (illustrated with integration of DRT); using handles; …
  - Speech Synthesis: conceptually rich input as opposed to text
  - Gesture Assignment: access to exact timing of speech
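
The layered CTS markup in the last "CTS specific information" slide can be read back with any XML parser. The following is a minimal sketch in Python, not part of NECA: the function name `annotate` and the tuple layout are assumptions made for illustration, and the markup is copied as it appears on the slide, including the rheme element nested inside the theme element. Each word or punctuation mark is paired with its part-of-speech tag, the innermost enclosing information-structure part, and its information status if one is given.

```python
import xml.etree.ElementTree as ET

# Final CTS markup, copied from the slide (structure unchanged).
RRL_SENTENCE = """
<sentence>
  <text>This car has leather seats.</text>
  <gesture modality="voice" meaning="beautiful"/>
  <infoStruct part="theme">
    <synPhrase category="NP" function="SB">
      <word text="This" pos="PDAT"/>
      <infoStatus type="referent-given">
        <word text="car" pos="NN"/>
      </infoStatus>
    </synPhrase>
    <infoStruct part="rheme">
      <synPhrase phrase="VP" function="PD">
        <word text="has" pos="VAFIN"/>
        <synPhrase phrase="NP" function="OA">
          <word text="leather seats" pos="NN"/>
        </synPhrase>
        <punct text="." pos="$."/>
      </synPhrase>
    </infoStruct>
  </infoStruct>
</sentence>
"""


def annotate(node, part=None, status=None, out=None):
    """Hypothetical helper: walk the tree depth-first and carry the innermost
    infoStruct part and infoStatus type down to each <word>/<punct> leaf."""
    if out is None:
        out = []
    if node.tag == "infoStruct":
        part = node.get("part")
    elif node.tag == "infoStatus":
        status = node.get("type")
    elif node.tag in ("word", "punct"):
        out.append((node.get("text"), node.get("pos"), part, status))
    for child in node:
        annotate(child, part, status, out)
    return out


tokens = annotate(ET.fromstring(RRL_SENTENCE))
for token in tokens:
    print(token)
```

Because the values are passed down as arguments rather than stored globally, an annotation applies only to the tokens it encloses, so "car" is the only token marked `referent-given`.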
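
The slides state that Gesture Assignment has access to the exact timing of speech and uses prosodic information (stress, accents, phrasing) to synchronize gestures with speech. The slides do not say how this is done, so the following Python sketch only illustrates one plausible policy: each gesture is placed at the onset of the next accented syllable. The class names, the gesture names, the accent label and all timings are invented for the example and are not taken from NECA or RRL.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Syllable:
    text: str
    start_ms: int
    end_ms: int
    accent: Optional[str] = None  # sentence accent type; None if unaccented


@dataclass
class GestureCue:
    gesture: str
    stroke_ms: int  # time at which the gesture should be triggered
    syllable: str


def align_gestures(syllables: List[Syllable], gestures: List[str]) -> List[GestureCue]:
    """Pair each gesture, in order, with the next accented syllable.

    Assumed policy for illustration only; gestures left over when the
    accented syllables run out are dropped.
    """
    accented = [s for s in syllables if s.accent is not None]
    return [GestureCue(g, s.start_ms, s.text) for g, s in zip(gestures, accented)]


# Invented timings for "This car has leather seats."
syllables = [
    Syllable("This", 0, 180),
    Syllable("car", 180, 450, accent="H*"),  # illustrative ToBI-style label
    Syllable("has", 450, 600),
    Syllable("lea", 600, 800, accent="H*"),
    Syllable("ther", 800, 950),
    Syllable("seats", 950, 1300),
]

cues = align_gestures(syllables, ["point_at_car", "appreciative_nod"])
for cue in cues:
    print(cue)
```

A real gesture assignment module would also need the phrase boundaries and pause timings listed in the slides, for example to avoid starting a gesture that cannot finish before the end of a prosodic phrase.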