Category: Education

Presentation Transcript

The Open Agent Architecture™:

Building communities of distributed software agents
Adam Cheyer, David Martin, Douglas Moran
Artificial Intelligence Center, SRI International
333 Ravenswood Avenue, Menlo Park, CA 94025
Outline
- What is an Agent?
- Overview of the OAA
- Implementation
- OAA-based Applications
- Related Work
- Summary

What is an Agent?: 

Agent types (examples in parentheses):
- Mobile Agents: programs that move among computer hosts (Voyager, Aglets, Odyssey)
- Autonomous Agents: based on planning technologies (Robots, Softbots)
- Learning Agents: user preferences, collaborative filtering, ... (FireFly, MIT Media Lab)
- Animated Interface Agents: avatars, chatbots, ... (Microsoft Agent, Julia)
- Simulation-based Entities (ModSAF, RoboCup)
- Cooperative Agents: collaboration among distributed heterogeneous components (OAA, KQML, FIPA)

Overview of the OAA: 

Definition
- OAA: a framework for integrating a community of software agents in a distributed environment
Distributed Computing Through Delegation
- Facilitates flexible, adaptable interactions among distributed components through delegation of tasks, data requests, and triggers
- What, not how or who
User Interface
- Enables natural, mobile, multimodal user interfaces to distributed services

Approaches to Building Applications: 

Approaches
- Monolithic Applications
- Object-Oriented Applications
- Distributed Object Applications
- OAA Applications
OAA’s Objective
- Virtual community of dynamic services
- Adaptable to changing, evolving network resources
- Flexible interactions among components
- Dynamic addition

Adaptable Interfaces: 

Platform-Independent Multimodal User Interfaces

OAA Architecture: 

Diagram labels: Facilitator Agent, Modality Agents, Registry, Interagent Communication Language
Agent Types
- User Interface Agents accept multimodal input and present results
- Natural Language Agents produce requests in ICL
- Facilitator Agents receive ICL requests and coordinate multiagent execution
- App Agents wrap legacy applications
- Meta Agents apply domain knowledge to help coordinate other agents
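The registry-and-routing role of the facilitator can be illustrated with a small Python toy model. This is only a sketch, not the OAA library: the class `Facilitator`, its methods `register` and `solve`, and the phone-book data are hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

class Facilitator:
    """Toy stand-in for a facilitator: keeps a registry and routes requests."""

    def __init__(self) -> None:
        # capability name -> list of (agent name, handler) pairs
        self.registry: Dict[str, List[Tuple[str, Callable[..., object]]]] = {}

    def register(self, agent: str, capability: str,
                 handler: Callable[..., object]) -> None:
        """An agent declares that it can handle the named capability."""
        self.registry.setdefault(capability, []).append((agent, handler))

    def solve(self, capability: str, *args: object) -> List[object]:
        """Ask every agent that declared the capability; collect the answers."""
        providers = self.registry.get(capability, [])
        return [handler(*args) for _agent, handler in providers]

facilitator = Facilitator()

# A hypothetical "app agent" wrapping a legacy lookup table (made-up data).
phone_book = {"Ann": "555-0100"}
facilitator.register("directory", "phone_number",
                     lambda name: phone_book.get(name))

# The requester names the task only; it never names the providing agent.
answers = facilitator.solve("phone_number", "Ann")
```

The requester depends only on the capability name, so a provider can be added or swapped without touching the caller.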

Interagent Communication Language (ICL): 

ICL: unified means of expressing all agent functionalities
Using ICL, agents:
- register capability specifications
- request services of community: perform queries, execute actions, exchange information, set triggers, manipulate data
ICL defines both conversation layer of requests & logic-based content layer
ICL delegation: description of request + advice & constraints
ICL is platform-independent
Support for programming languages: C, C++, Visual Basic, Java, Delphi, Prolog, Lisp

Delegation through ICL: 

Task Management
- oaa_Solve(TaskExpr, ParamList)
- Expressions: logic-based (cf. Prolog)
- Parameters: provide advice & constraints
  - High-level task types: query, action, inform, ...
  - Low-level: solution_limit(N), time_limit(T), parallel_ok(TF), priority(P), address(Agt), reply(Mode), block(TF), collect(Mode), ...
Data & Trigger Management
- oaa_AddData(DataExpr, ParamList)
- oaa_AddTrigger(Typ,Cond,Action,Ps)
Example
- oaa_Solve((manager('John Bear',M), phone_number(M,P)), [query(var(P))])
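The compound query on this slide (find the phone number of John Bear's manager) can be mimicked with a few lines of Python. This is a rough sketch under stated assumptions, not how `oaa_Solve` is implemented: the function `solve_sketch`, the `?`-prefixed variable convention, the dictionary form of the parameters, and all facts are made up for illustration.

```python
from itertools import islice
from typing import Dict, Iterator, List, Optional, Sequence, Tuple

Term = Tuple[str, ...]

# Made-up facts, as if contributed by two different agents.
FACTS: List[Term] = [
    ("manager", "John Bear", "Pat Lee"),
    ("phone_number", "Pat Lee", "555-0199"),
    ("phone_number", "John Bear", "555-0123"),
]

def is_var(term: str) -> bool:
    # Assumption for this sketch: variables are written with a leading "?".
    return term.startswith("?")

def match(goal: Term, fact: Term,
          bindings: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return bindings extended so that goal equals fact, or None."""
    if len(goal) != len(fact):
        return None
    extended = dict(bindings)
    for g, f in zip(goal, fact):
        if is_var(g):
            if extended.setdefault(g, f) != f:
                return None  # variable already bound to something else
        elif g != f:
            return None
    return extended

def solve_all(goals: Sequence[Term],
              bindings: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Solve a conjunction of goals left to right, backtracking over facts."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    for fact in FACTS:
        extended = match(goals[0], fact, bindings)
        if extended is not None:
            yield from solve_all(goals[1:], extended)

def solve_sketch(goals: Sequence[Term], params: Dict[str, object]) -> List[str]:
    """Return values of the queried variable, honoring an optional limit."""
    limit = params.get("solution_limit")  # None means "no limit"
    values = (b[params["query"]] for b in solve_all(goals))
    return list(islice(values, limit))

result = solve_sketch(
    [("manager", "John Bear", "?M"), ("phone_number", "?M", "?P")],
    {"query": "?P", "solution_limit": 1},
)
```

The shared variable `?M` ties the two goals together, which is what lets a single request span data held by different providers.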

Multimodal User Interfaces: 

- Natural language translation to and from ICL
- Multiple NL agents for different qualities (fast, robust) and languages (English, French)
- Multiagent cooperation for ambiguity resolution
  - Pen: gesture or handwriting?
  - Reference resolution: “photo of the hotel”
    - NL Agent: hotel in language context
    - Gesture Agent: hotel being pointed at
    - UI Agent: only one hotel visible
    - Database Agent: “hotel on Smith Street”
    - Discourse Agent: “the other hotel”
    - Human User: if still ambiguous, can clarify
  - Cross-modality ambiguities: Arrow + “scroll map” vs. Arrow + “show hotel”
- User is special member of agent community
- User interfaces to distributed services, using distributed services
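One simple way to picture cooperative reference resolution is as set intersection over the candidates each agent proposes, with the human user as the last resort. The Python sketch below is an assumption about how such a scheme could work, not the strategy OAA actually uses; `resolve_reference`, `ask_user`, and the hotel names are hypothetical.

```python
from typing import Callable, List, Optional, Set

def resolve_reference(candidate_sets: List[Set[str]],
                      ask_user: Callable[[Set[str]], str]) -> Optional[str]:
    """Combine candidate referents proposed by several agents.

    An empty set means the agent has no opinion and is ignored.
    """
    informative = [s for s in candidate_sets if s]
    if not informative:
        return None
    remaining = set.intersection(*informative)
    if len(remaining) == 1:
        return next(iter(remaining))
    if not remaining:
        # Agents disagree: fall back to everything that was proposed.
        remaining = set.union(*informative)
    # Still ambiguous: the human user is asked to clarify.
    return ask_user(remaining)

asked: List[List[str]] = []

def ask_user(options: Set[str]) -> str:
    # Stand-in for a real clarification dialog: record the question
    # and pick the alphabetically first option.
    asked.append(sorted(options))
    return sorted(options)[0]

choice = resolve_reference(
    [{"Hilton", "Ritz"},   # NL agent: hotels mentioned in context
     {"Ritz"},             # gesture agent: hotel being pointed at
     set()],               # discourse agent: no opinion
    ask_user,
)
```

Here the gesture narrows the language context to one hotel, so the user is never interrupted.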

OAA Triggers: 

Purpose
- OAA agents can dynamically register interest in any data change, communication event, or real-world occurrence accessible by any agent.
Adding a Trigger
- oaa_AddTrigger(Type, Cond, Action, Params)
Trigger Types
- comm: on_send, on_receive message
- time: “in ten minutes”, “every day at 5pm”
- data: on_change, on_remove, on_add
- task: “when mail arrives about...”
Actions
- The actions of triggers may be any ICL expression solvable by the community of agents
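The data trigger types (on_add, on_change, on_remove) can be illustrated with a toy Python data store. This is a minimal sketch, not the `oaa_AddTrigger` implementation: the class `DataStore`, its method names, and the sample keys are hypothetical, and actions are plain Python callables rather than ICL expressions.

```python
from typing import Callable, Dict, List, Tuple

Condition = Callable[[str, object], bool]
Action = Callable[[str, object], None]

class DataStore:
    """Toy data store that fires triggers on add, change, and remove."""

    def __init__(self) -> None:
        self.data: Dict[str, object] = {}
        self.triggers: List[Tuple[str, Condition, Action]] = []

    def add_trigger(self, event: str, condition: Condition,
                    action: Action) -> None:
        self.triggers.append((event, condition, action))

    def _fire(self, event: str, key: str, value: object) -> None:
        for registered_event, condition, action in self.triggers:
            if registered_event == event and condition(key, value):
                action(key, value)

    def put(self, key: str, value: object) -> None:
        # Writing an existing key counts as a change, a new key as an add.
        event = "on_change" if key in self.data else "on_add"
        self.data[key] = value
        self._fire(event, key, value)

    def remove(self, key: str) -> None:
        value = self.data.pop(key)
        self._fire("on_remove", key, value)

log: List[Tuple[str, object]] = []
store = DataStore()
store.add_trigger("on_add", lambda k, v: k.startswith("mail:"),
                  lambda k, v: log.append(("new mail", v)))
store.add_trigger("on_remove", lambda k, v: True,
                  lambda k, v: log.append(("removed", k)))

store.put("mail:1", "budget report")     # fires the on_add trigger
store.put("note:1", "lunch")             # on_add, but the condition fails
store.put("mail:1", "budget report v2")  # on_change, no trigger registered
store.remove("note:1")                   # fires the on_remove trigger
```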

A Sample Text-to-Speech Agent in C: 

/* Include libraries */
#include <string.h>       /* strcmp */
#include <libcom_tcp.h>
#include <liboaa.h>

/* List capabilities */
ICLTerm capabilities = icl_TermFromStr("[play(tts, Msg)]");

/* Define capabilities */
ICLTerm oaa_AppDoEvent(ICLTerm Event, ICLTerm Params) {
    if (strcmp(icl_Str(Event), "play") == 0) {
        /* Argument 2 of play(tts, Msg) is the message to speak */
        return playTTS(icl_ArgumentAsStr(Event, 2));
    } else
        return NULL;
}

/* Agent Startup */
main() {
    com_Connect("parent", connectionInfo);
    oaa_Register("parent", "tts", capabilities);
    oaa_MainLoop(True);
}

A Sample Text-to-Speech Agent in Prolog: 

% Include libraries
:- [libcom_tcp].
:- [liboaa].

% List capabilities
capabilities([solvable(play(tts, Msg), [type(procedure), callback(tts_events)], [])]).

% Define capabilities
tts_events(play(tts, Msg), Params) :- tts_api(Msg).

% Agent Startup
start :-
    capabilities(C),
    com_Connect(parent, ConnectionInfo),
    oaa_Register(parent, tts, C),
    oaa_MainLoop(true).

OAA-based Applications: 

1. Automated Office
2. Unified Messaging
3. Multimodal Maps
4. CommandTalk
5. ATIS-Web
6. Spoken Dialog Summarization
7. Agent Development Tools
8. InfoBroker
9. Rental Finder
10. InfoWiz Kiosk
11. Multi-Robot Control
12. MVIEWS Video Tools
13. MARVEL
14. SOLVIT
15. Surgical Training
16. Instant Collaboration
17. Crisis Response
18. WebGrader
19. Speech Translation
20-25+ ...

Automated Office Application: 

Main Points
- Mobile access to distributed services
- Legacy applications interacting with AI technologies
- Flexible interactions among components
- High-level tasking of agents through NL and speech
- Delegated Triggers

Multimodal Maps Application: 

Main Points
- Natural interface to distributed (web) data
- Synergistic combination of handwriting, drawing, speech, direct manipulation
- Parallel cooperation and competition among many agents
- Human & Agent collaboration

Unified Messaging: 

Main Points
- Mobile, adaptable access to distributed services
- Integrated Messaging: web, email, voice, fax
- Flexible interactions among components
- Distributed reference resolution and media format translation
- Delegated Triggers

MVIEWS Application: 

Screen elements: video browser with multimedia timeline, interactive map
Main Points
- Multimodal annotation of video using speech & pen
- Automated detection, tracking, and geolocation of moving objects
- Search and replay of videos indexed by multimodal and auxiliary data
- Applications: multi-sensor surveillance, Predator UAV, Olympic bombing

InfoWiz Application: 

Main Points
- An information kiosk with an animated wizard who answers questions, gives tours, and helps navigate the information space
- OAA integrates SRI’s speech recognition, NL, and knowledge representation with Microsoft Agent graphics and Netscape’s web browser
- Soon in SRI’s lobby

CommandTalk Application: 

A spoken language interface to the LeatherNet military simulation and training system
Main Points
- Spoken language interface adapts to dynamic changes in simulated world
- Advantages of speech:
  - More realistic training
  - Faster, more natural interface
- Supports Army, Navy, Marine Corps, and Air Force versions of ModSAF simulator

Agent Development Tools: 

- Tools are themselves implemented in OAA
- Guide user through process of creating an agent:
  - Definition of capabilities
  - Documentation management (publication on Web)
  - Code generation of agent template
  - Definition of NL vocabulary
  - Update NL & speech recognition systems
- Assembly of multiagent projects
- Runtime tool for launching and monitoring agent communities

Related Work: 

Agent Communication Languages (KQML, FIPA)
  + Asynchronous message-passing communication richer than object model; facilitates parallelism
  +/- Communication acts separate from content (KIF, SL)
  - Interactions primarily hard-coded (peer-to-peer msgs)
Distributed objects (CORBA, DCOM)
  + Object-based integration of heterogeneous components
  + Network services (e.g. security, transactions)
  + Commercial implementations exist (e.g. Iona, Visigenic)
  - Interactions primarily hard-coded (method calls)
OAA focuses on providing delegation services for flexible interactions on tasks, triggers and data mgmt
  + Research applicable to both DOBJ and ACL models
  + Bridges can be built from and to other models
  + OAA concepts could be layered on top of other models

OAA vs. Distributed Objects (CORBA, DCOM): 

Distributed Objects (CORBA, DCOM)
  Distributed, heterogeneous
  Retrieve obj, call obj
  interface: C++-like
  hardcoded interactions
OAA
  Distributed, heterogeneous
  Ask Facilitator to call service
  + interface: declarative specs
  + delegated goal & advice
  parallel, compound goals, backtracking, constraints
  Data & Trigger management

OAA vs. Agent Communication Languages (KQML,FIPA): 

Agent Communication Languages (KQML, FIPA)
  Distributed, heterogeneous
  Ask Agent Name Server or Service Broker for Addr, send msg, handle reply
  hardcoded interactions
  +/- conversation policies
  Logic-based content (KIF, SL)
OAA
  Distributed, heterogeneous
  Ask Facilitator to distribute and coordinate complex requests
  + parallel, compound goals, backtracking, constraints
  + tasks, triggers, data mgmt
  Logic-based content (ICL)

OAA and Scalability: 

Limitations:
- Facilitator is single point of failure
- Facilitator is bottleneck for communication
Solutions?
- Multi-Facilitator topologies
- Distribution of planning & execution functions of Facilitator + peer-to-peer communication
Diagram labels: Registry & Planner Agent E Replicated
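The single-point-of-failure concern can be made concrete with a toy failover loop over replicated facilitators. The slide only poses multi-facilitator topologies as a question, so the Python sketch below is purely an assumption about one possible arrangement; `solve_with_failover`, `FacilitatorDown`, and the two stand-in facilitators are hypothetical.

```python
from typing import Callable, List, Optional

class FacilitatorDown(Exception):
    """Raised by a (simulated) facilitator that cannot be reached."""

def solve_with_failover(facilitators: List[Callable[[str], str]],
                        task: str) -> str:
    """Try each replicated facilitator in order until one answers."""
    last_error: Optional[Exception] = None
    for facilitator in facilitators:
        try:
            return facilitator(task)
        except FacilitatorDown as err:
            last_error = err  # remember the failure and try the next replica
    raise FacilitatorDown("no facilitator reachable") from last_error

def primary(task: str) -> str:
    # Simulated outage of the primary facilitator.
    raise FacilitatorDown("primary offline")

def replica(task: str) -> str:
    return f"replica handled {task}"

outcome = solve_with_failover([primary, replica], "play(tts, hello)")
```

This addresses only the failure case; relieving the communication bottleneck would need the peer-to-peer distribution the slide also mentions.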

OAA Characteristics: 

- Open: agents can be created in many languages and interface with existing systems
- Extensible: agents can be added or replaced dynamically
- Distributed: agents are spread across many computers
- Parallel: parallel execution of subtasks
- Mobile: lightweight interfaces on phone and/or PDA
- High-level: hides software and hardware dependencies
- Multimodal: handwriting, speech, gestures, and direct manipulation can be combined together
