Category: Education
License: All Rights Reserved
Added: January 07, 2008

Presentation Transcript

Slide 1: Adaptive Intelligent Mobile Robotics
Leslie Pack Kaelbling
Artificial Intelligence Laboratory, MIT

Pyramid:
- Addressing problem at multiple levels

Built-in Behaviors:
- Goal: general-purpose, robust, visually guided local navigation
- optical flow for depth information
- finding the floor
  - optical flow information
  - Horswill’s ground-plane method
- build local occupancy grids
- navigate given the grid
  - reactive methods
  - dynamic programming

Reactive Obstacle Avoidance:
- Standard method in mobile robotics is to use potential fields
  - attractive force toward goal
  - repulsive forces away from obstacles
  - robot moves in direction given by resultant force
- New method for non-holonomic robots: move the center of the robot so that the front point is holonomic

Human Obstacle Avoidance:
- Control law based on visual angle and distance to goal and obstacles
- Parameters set based on experiments with humans in large free-walking VR environment

Humans are Smooth!:
[figure only]

Behavior Learning:
- Typical RL methods require far too much data to be practical in an online setting.
- Address the problem with
  - strong generalization techniques
    - locally weighted regression
    - “skeptical” Q-Learning
  - bootstrapping from human-supplied policy
    - need not be optimal and might be very wrong
    - shows learner “interesting” parts of the space
    - “bad” initial policies might be more effective

Two Learning Phases:
[diagram: Learning System with signals A, R, O; Phase One]

Two Learning Phases:
[diagram: Learning System with signals A, R, O; Phase Two]

New Results:
- Drive to goal, avoiding obstacles in visual field
- Inputs (6 dimensions):
  - heading and distance to goal
  - image coordinates of two obstacles
- Output: steering angle
- Reward: +10 for getting to goal; -5 for running over obstacle
- Training: simple policy that avoids one obstacle

Robot’s View:
[figure only]

Local Navigation:
[figure only]

Map Learning:
- Robot learns high-level structure of environment
  - topological maps appropriate for large-scale structure
  - low-level behaviors induce topology
- based on previous work using sonar
- vision changes problem dramatically
  - no more problems with many states looking the same
  - now same state always looks different!

Sonar-Based Map Learning:
[figure panels: Data; True Model]

Current Issues in Map Learning:
- segmenting space into “rooms”
  - detecting doors and corridor openings
- representation of places
  - stored images
  - gross 3D structure
- features for image and structure matching

Large Simulation Domain:
- Use for learning and large-scale experimentation that is impractical on a real robot
- built using video-game engine
- large multi-story building
- packages to deliver
- battery power management
- other agents (to survey)
- dynamically appearing items to collect
- general Bayes-net specification so it can be used widely as a test bed

Hierarchical MDP Planning:
- Large simulated domain has unspeakably many primitive states
- Use hierarchical representation for planning
  - logarithmic improvement in planning times
  - some loss of optimality of plans
- Existing work on planning and learning given a hierarchy
  - temporal abstraction: macro actions
  - spatial abstraction: aggregated states
- Where does the hierarchy come from?
  - combined spatial and temporal abstraction
  - top-down splitting approach

Region-Based Hierarchies:
- Divide state space into regions
  - each region is a single abstract state at next level
  - policies for moving through regions are abstract actions at next level

Choosing Macros:
- Given a choice of a region, what is a good set of macro actions for traversing it?
  - existing approaches guarantee optimality with a number of macros exponential in the number of exit states
  - our method is approximate, but works well when there are no large rewards inside the region

Point-Source Rewards:
- Compute a value function for each possible exit state, offline
- Given a new valuation of all exit states online
- Quickly combine value functions to determine near-optimal action

Approximation is Good:
[figure only]

How to Use the Hierarchy:
- Off line:
  - Decompose environment into abstract states
  - Compute macro operators
- On line:
  - Given new goal, assign values to exits at highest level
  - Propagate values at each level
  - In current low-level region, choose action

What Makes a Decomposition Good?:
- Trade off
  - decrease in off-line planning time
  - decrease in on-line planning time
  - decrease in value of actions
- We can articulate this criterion formally but… we can’t solve it
- Current research on reasonable approximations

Next Steps:
- Low-level
  - apply JAQL to tune obstacle avoidance behaviors
- Map learning
  - landmark selection and representation
  - visual detection of openings
- Hierarchy
  - algorithm for constructing decomposition
  - test hierarchical planning on huge simulated domain
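
The potential-field method named on the Reactive Obstacle Avoidance slide (attractive force toward the goal, repulsive forces away from obstacles, motion along the resultant) can be illustrated with a minimal sketch. This is not the talk's implementation: the function names, the gains `k_att` and `k_rep`, the `influence` radius, and the particular force formulas are all assumptions chosen for illustration, using one common textbook form of the repulsive term.

```python
import math

def resultant_force(pos, goal, obstacles, k_att=1.0, k_rep=0.5, influence=2.0):
    """Sum an attractive force toward the goal and repulsive forces from obstacles.

    All gains and the influence radius are illustrative assumptions.
    """
    # Attractive force: proportional to the vector from robot to goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        # Repulsion acts only inside the influence radius and grows as d shrinks.
        if 0.0 < d < influence:
            mag = k_rep * (1.0 / d - 1.0 / influence) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy

def heading(pos, goal, obstacles):
    """Direction of travel (radians) given by the resultant force."""
    fx, fy = resultant_force(pos, goal, obstacles)
    return math.atan2(fy, fx)

# With no obstacles the robot heads straight at the goal.
straight = heading((0.0, 0.0), (5.0, 0.0), [])
# An obstacle slightly below the path deflects the heading upward.
deflected = heading((0.0, 0.0), (5.0, 0.0), [(1.0, -0.5)])
```

The sketch treats the robot as a point that can move in any direction; the slide's remark about non-holonomic robots concerns exactly the case where that assumption fails.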
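
The Point-Source Rewards steps (one value function per exit computed offline, then a fast online combination once the exits are valued) can be sketched as follows. The slides do not say how the cached value functions are combined, so this sketch makes its own assumptions: deterministic moves on a small grid region, no rewards inside the region, nonnegative exit values, and combination by taking the maximum over exits. All names (`exit_value_function`, `choose_action`, `GAMMA`) are hypothetical.

```python
from collections import deque

GAMMA = 0.9  # assumed discount factor
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def exit_value_function(cells, exit_cell, gamma=GAMMA):
    """Offline step: value of each cell if a unit reward sits at exit_cell.

    With deterministic moves and no internal rewards this is
    gamma ** (shortest-path distance to the exit), found by breadth-first search.
    """
    dist = {exit_cell: 0}
    queue = deque([exit_cell])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES.values():
            nxt = (r + dr, c + dc)
            if nxt in cells and nxt not in dist:
                dist[nxt] = dist[(r, c)] + 1
                queue.append(nxt)
    return {cell: gamma ** d for cell, d in dist.items()}

def choose_action(state, exit_values, unit_values, cells):
    """Online step: scale each cached function by its exit's value,
    combine by max over exits (an assumption), and act greedily."""
    def combined(cell):
        return max(w * unit_values[e].get(cell, 0.0) for e, w in exit_values.items())

    best_action, best_value = None, float("-inf")
    for name, (dr, dc) in MOVES.items():
        nxt = (state[0] + dr, state[1] + dc)
        if nxt in cells and combined(nxt) > best_value:
            best_action, best_value = name, combined(nxt)
    return best_action

# A 3x3 region with exits at the top-right and bottom-left corners.
cells = {(r, c) for r in range(3) for c in range(3)}
exits = [(0, 2), (2, 0)]
unit_values = {e: exit_value_function(cells, e) for e in exits}  # computed once

# Online: a new goal makes the top-right exit far more valuable.
toward_top_right = choose_action((1, 1), {(0, 2): 10.0, (2, 0): 1.0}, unit_values, cells)
# A different goal reverses the valuation; no replanning inside the region is needed.
toward_bottom_left = choose_action((1, 1), {(0, 2): 1.0, (2, 0): 10.0}, unit_values, cells)
```

Under the stated assumptions the max combination is exact; once rewards appear inside the region it is only an approximation, which matches the caveat on the Choosing Macros slide.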