Added: December 11, 2008. Category: Education. License: All Rights Reserved.

Presentation Transcript

Slide 1: Using Computational Cognitive Models for Better Human-Robot Collaboration
  Alan C. Schultz, J. Gregory Trafton, Nick Cassimatis
  Navy Center for Applied Research in Artificial Intelligence, Naval Research Laboratory

Peer-to-peer collaboration in Human-Robot Teams
  Not interested in a general, unified grand theory of cognition for solving the whole problem
  We already know how to be mobile, avoid collisions, etc.
  Approach: to be informed by cognitive psychology
    Study human-human collaboration
    Determine important high-level cognitive skills
    Build computational cognitive models of these skills (ACT-R, SOAR, EPIC, Polyscheme…)
    Use computational models as the reasoning mechanism on the robot for high-level cognition

Cognitive Science as Enabler: Cognitive Robotics
  Hypothesis: A system using human-like representations and processes will enable better collaboration with people than a computational system that does not
    Similar representations and reasoning mechanisms make it easier for humans to work with the system; more compatible
  For close collaboration, systems should act “naturally”, i.e.
    not do something or say something in a way that detracts from the interaction/collaboration with the human
  Robot should accommodate humans, not the other way around
  Solving tasks from “first principles”
    Humans are good at solving some tasks; let’s leverage humans’ ability

Cognitive Skills
  Appropriate knowledge representations
    Spatial representation for spatial reasoning
    Adapting representation to problem-solving method
  Problem solving
    Navigation routing with constraints (e.g., remaining hidden)
  Learning
    Learning to recognize and anticipate others’ behaviors
    Learning characteristics of others’ capabilities
  Vision
    Object permanence and tracking (Cassimatis et al., 04)
    Recognizing gestures
  Natural language/gestures (Perzanowski et al., 01)

Cognitive Skills
  Perspective-taking
    Spatial (Trafton et al., 2005)
    Social (Breazeal et al., 2006)
  Spatial reasoning
    People use metric information implicitly; they use and think qualitatively much more frequently (Trafton et al., 2006)
    Spatial referencing/language (Skubic et al., 04)
  Temporal reasoning
    Predicting how long something will take
  Anticipation
    What does a person need and why?

Hide and Seek (Trafton & Schultz, 2004, 2006)
  Lots of knowledge about space required
  A “good” hider needs visual, spatial perspective taking to find the good hiding places (large amount of spatial knowledge needed)

Development of Perspective-Taking
  Children start developing (very, very basic) perspective-taking ability around age 3-4
    Huttenlocher & Presson, 1979; Newcombe & Huttenlocher, 1992; Wallace, Alan, & Tribol, 2001
  In general, 3-4 year old children do not have a particularly well-developed sense of perspective taking

Case Study: Hide and Seek, Age 3½
  Elena did not have perspective-taking ability
    Left/right errors
  Plays hide and seek by learning pertinent qualitative features of objects
  Constructs knowledge about hiding that is object-specific

Hide and Seek Cognitive Model
  Created a cognitive model of Elena learning to play hide and seek using ACT-R (Anderson et al., 93, 95, 98, 05)
  Correctly models Elena’s behavior at 3½ years of age
  Learns and refines hiding behavior based on interactions with a “teacher”
    Learns production strength based on success and failure of hiding behavior
    Learns ontological or schematic knowledge about hiding
      It’s bad to hide behind something that’s clear
      It’s good to hide behind something that is big enough
  Knows about relative locations of objects (behind, in front of); adds knowledge about relationships
  Model has only a syntactic notion of spatial relationships

Hybrid Cognitive/Reactive Architecture: Robot Hide and Seek
  Uses the cognitive model of hiding (after learning) to reason about what makes a good hiding place in order to seek
  Computational cognitive model of hiding makes deliberative (high-level cognitive) decisions. Models learning.
  Reactive layer of hybrid model for mobility and sensor processing

How important is perspective taking? (Trafton et al., 2005)
  Analyzed a corpus of NASA training tapes
    Space Station Mission 9A
    Two astronauts working in full suits in a neutral-buoyancy facility; a third, remote person participates
  Standard protocol analysis techniques; transcribed 8 hours of utterances and gestures (~4000 instances)
  Use of spatial language (up, down, forward, in between, my left, etc.) and commands
  Research questions:
    What frames of reference are used?
    How often do people switch frames of reference?
    How often do people take another person’s perspective?

Spatial language in space: Results
  How frequently do people switch their frame of reference?
    45% of the time (consistent with Franklin, Tversky, & Coon, 1992)
  How often do people take other people’s perspective (or force others to take theirs)?
    25% of the time

Perspective Taking and Changing Frames of Reference
  Notice the mixing of perspectives: exocentric (down), object-centered (down under the rail), addressee-centered (right hand), and exocentric again (nadir), all in one instruction!
  Notice the “new” term developed collaboratively: mystery hand rail
  “Bob, if you come straight down from where you are, uh, and uh kind of peek down under the rail on the nadir side, by your right hand, almost straight nadir, you should see the uh…”

Perspective Taking
  Perspective taking is critical for collaboration.
  How do we model it?
    (ACT-R, Polyscheme…)
  I’ll show several demos of our current progress on spatial perspective taking
  But first a scenario: “Please hand me the wrench”

Perspective taking in human interactions
  How do people usually resolve ambiguous references that involve different spatial perspectives? (Clark, 96)
  Principle of least effort (which implies least joint effort)
    All things being equal, agents try to minimize their effort
  Principle of joint salience
    The ideal solution to a coordination problem among two or more agents is the solution that is the most salient, prominent, or conspicuous with respect to their current common ground
  In less simple contexts, agents may have to work harder to resolve ambiguous references

Perspective Taking: A tale of two systems
  ACT-R/S (Schunn & Harrison, 2001)
    Our perspective-taking system using ACT-R/S is described in Hiatt et al. 2003
    Three integrated visuospatial buffers
      Focal: object ID; non-metric geon parts
      Manipulative: grasping/tracking; metric geons
      Configural: navigation; bounding boxes
  Polyscheme (Cassimatis)
    Computational cognitive architecture where:
      Mental simulation is the primitive
      Many AI methods are integrated
    Our perspective-taking system using Polyscheme is described in Trafton et al., 2005

Robot Perspective Taking
  Human can see one cone; robot can sense two cones (Fong et al., 06)

Summary
  Having representations and reasoning similar to or compatible with a human’s facilitates human-robot collaboration
  We’ve developed computational cognitive models of high-level human cognitive skills as reasoning mechanisms for robots
  Open questions:
    Scale up; combining many such skills
    What are the important skills? Which skills are built upon others?
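
The cone scenario on the Robot Perspective Taking slide can be illustrated with a small Python sketch. This is not the ACT-R/S or Polyscheme model described in the talk; it swaps in a plain 2D line-of-sight test. The positions, the single wall, and every name (Point, visible_from, resolve_reference) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def _orient(a, b, c):
    """Signed area test: > 0 if a, b, c turn counter-clockwise, < 0 if clockwise."""
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x)


def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 properly crosses segment q1-q2."""
    return (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
            and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0)


def visible_from(viewer, target, walls):
    """A target is visible if no wall blocks the straight line to it."""
    return not any(segments_cross(viewer, target, a, b) for a, b in walls)


def resolve_reference(speaker, candidates, walls):
    """Take the speaker's perspective: keep only objects the speaker can see."""
    return [name for name, pos in candidates.items()
            if visible_from(speaker, pos, walls)]


# Hypothetical layout: the wall hides cone_b from the human but not from the robot.
human = Point(0, 0)
robot = Point(4, 4)
cones = {"cone_a": Point(4, 0), "cone_b": Point(0, 4)}
walls = [(Point(-1, 2), Point(1, 2))]

robot_view = resolve_reference(robot, cones, walls)  # what the robot senses
human_view = resolve_reference(human, cones, walls)  # what the human can see
```

From its own viewpoint the robot finds the request for "the cone" ambiguous, since it senses two. Filtering the candidates by what the human can see leaves a single referent, which is the kind of least-joint-effort resolution the talk attributes to people.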

Shameless Advertisement
  ACM/IEEE Second International Conference on Human-Robot Interaction
  Washington DC, March 9-11, 2007
  With HRI 2007 Young Researchers Workshop, March 8, 2007
  Single track, highly multi-disciplinary
    Robotics, Cognitive Science, HCI, Human Factors, Cognitive Psychology…
  Submission deadline: August 31, 2006
  www.hri2007.org

A Dynamic Auditory Scene
  Everyday auditory scenes are VERY noisy
    Fans
    Alarms/telephones
    Traffic
    Weather
    People

Auditory Perspective Taking
  Information Kiosk
    Robot uses speech to relay information to an interested human listener.
    Given the auditory scene, can the person understand what the robot is saying?
    If not, what actions can the robot take to improve intelligibility and knowledge transfer?
    Allow a robot to use its knowledge of the environment, both a priori and sensed, to predict what the human can hear and effectively understand.
  Stealth Bot
    Robot uses its awareness of the auditory environment to hide from people and/or machines.
    The robot knows its own acoustic signature.
    Now predict how each action or location will be heard by the listener, and select the best choice.

An Example of Adaptation: Robot Speech Interface
  Adjust word usage depending on noise levels
    Use smaller words with higher recognition rates.
    Ask questions to verify understanding; repeat yourself.
  Change the quality of the speech sounds
    Adapt voice volume and pitch to overcome local noise levels (Lombard speech).
    Emphasize difficult words.
    Don’t talk during loud noises.
  Reposition oneself
    Vary the proximity to the listener.
    Face the listener as much as possible.
    Move to a different location if all else fails.

Information Kiosk
  Overhead microphone array
    Tracks local sound levels
    Localizes interfering sources
    Guides the vision system to new users
  Stereo vision
    Tracks the user’s position in real time
  Actions
    Raise speaking volume relative to the user’s distance and the level of ambient noise.
    Pause during loud sounds or speech interruptions.
    Rotate the robot to face users.
    Reposition the robot if noise levels become too large.

Acoustic Perspective
  Noise maps: combine knowledge of sound sources to build maps
    Measured volume/frequency levels
    Source locations/directionality
    Walls and environmental features
  Multiple maps can be built and combined in real time
  Modifying action based on the noise map
    Seeking noisy hiding places so that the robot can best observe its target without being detected, masking its particular acoustic signature
  After exploring the area inside the square, 3 air vents are localized by the robot

Slide 26: 4 sources are combined together as omnidirectional sources, without environmental reflections.
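
The idea of combining omnidirectional sources into a noise map, without environmental reflections, can be sketched in Python as follows. This is only an illustration under stated assumptions, not the system shown in the talk: each source is given as a hypothetical level in dB at 1 m, levels fall off by free-field spherical spreading (20·log10 of distance), and sources are combined by summing their powers. The vent positions and levels, the grid size, and all function names are made up for the example.

```python
import math


def level_at(source_db_at_1m, distance_m):
    """Free-field level of an omnidirectional source at a given distance.

    Distances under 1 m are clamped to the 1 m reference (an assumption).
    """
    d = max(distance_m, 1.0)
    return source_db_at_1m - 20.0 * math.log10(d)


def combine_levels(levels_db):
    """Combine incoherent sources by summing their powers, then back to dB."""
    return 10.0 * math.log10(sum(10.0 ** (lv / 10.0) for lv in levels_db))


def noise_at(point, sources):
    """Predicted level at a point from all (x, y, dB at 1 m) sources."""
    x, y = point
    return combine_levels(
        [level_at(db, math.hypot(x - sx, y - sy)) for sx, sy, db in sources]
    )


def noise_map(sources, width, height):
    """Grid of predicted levels, indexed as grid[y][x], one cell per metre."""
    return [[noise_at((x, y), sources) for x in range(width)]
            for y in range(height)]


def loudest_cell(grid):
    """Cell with the highest predicted level, e.g. a candidate noisy hiding place."""
    cells = [(x, y) for y in range(len(grid)) for x in range(len(grid[0]))]
    return max(cells, key=lambda c: grid[c[1]][c[0]])


# Hypothetical air vents: (x in m, y in m, level in dB at 1 m).
vents = [(1.0, 1.0, 65.0), (8.0, 2.0, 60.0), (4.0, 7.0, 62.0)]
grid = noise_map(vents, width=10, height=10)
hiding_spot = loudest_cell(grid)
```

Walls, source directionality, and frequency content, which the Acoustic Perspective slide lists as map inputs, are deliberately left out here. The same grid could also inform the kiosk behaviours, for example raising speaking volume when the predicted level at the listener's position is high.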