The Open Agent Architecture™: Building communities of distributed software agents
Adam Cheyer
David Martin
Douglas Moran
Artificial Intelligence Center
SRI International
333 Ravenswood Avenue
Menlo Park CA 94025
http://www.ai.sri.com/~oaa
Outline:
What is an Agent?
Overview of the OAA
Implementation
OAA-based Applications
Related Work
Summary
What is an Agent?:
Mobile Agents: programs that move among computer hosts. Examples: Voyager, Aglets, Odyssey
Autonomous Agents: based on planning technologies. Examples: Robots, Softbots
Learning Agents: user preferences, collaborative filtering, ... Examples: FireFly, MIT Media Lab
Animated Interface Agents: avatars, chatbots, ... Examples: Microsoft Agent, Julia
Simulation-based Entities. Examples: ModSAF, RoboCup
Cooperative Agents: collaboration among distributed heterogeneous components. Examples: OAA, KQML, FIPA
Overview of the OAA:
Definition: OAA is a framework for integrating a community of software agents in a distributed environment.
Distributed Computing Through Delegation (what, not how or who): facilitates flexible, adaptable interactions among distributed components through delegation of tasks, data requests & triggers.
User Interface: enables natural, mobile, multimodal user interfaces to distributed services.
Approaches to Building Applications:
Monolithic Applications, Object-Oriented Applications, Distributed Object Applications, OAA Applications
OAA's Objective:
- Virtual community of dynamic services
- Adaptable to changing, evolving network resources
- Flexible interactions among components
Figure label: Dynamic addition
Adaptable Interfaces:
Platform-Independent Multimodal User Interfaces
OAA Architecture:
Figure labels: Facilitator Agent, Registry, Modality Agents, Interagent Communication Language
Agent Types
- User Interface Agents accept multimodal input and present results
- Natural Language Agents produce requests in ICL
- Facilitator Agents receive ICL requests and coordinate multiagent execution
- App Agents wrap legacy applications
- Meta Agents apply domain knowledge to help coordinate other agents
Interagent Communication Language (ICL):
ICL: unified means of expressing all agent functionalities
Using ICL, agents:
- register capability specifications
- request services of community:
  perform queries, execute actions, exchange information, set triggers, manipulate data
ICL defines both a conversation layer of requests and a logic-based content layer
ICL delegation: description of request + advice & constraints
ICL is platform-independent
Support for programming languages: C, C++, Visual Basic, Java, Delphi, Prolog, Lisp
Delegation through ICL:
Task Management
oaa_Solve(TaskExpr, ParamList)
Expressions: logic-based (cf. Prolog)
Parameters: provide advice & constraints
High-level task types: query, action, inform, ...
Low-level: solution_limit(N), time_limit(T), parallel_ok(TF), priority(P), address(Agt), reply(Mode), block(TF), collect(Mode), ...
Data & Trigger Management
oaa_AddData(DataExpr, ParamList)
oaa_AddTrigger(Typ,Cond,Action,Ps)
Example
oaa_Solve((manager('John Bear',M), phone_number(M,P)), [query(var(P))])
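The compound request in the example above can be illustrated with a small Python sketch. This is not the OAA library: `Facilitator`, `Var`, `unify`, the agent names, and the directory data are all hypothetical stand-ins, and the only parameter modeled is a `solution_limit`. It assumes agents register ground facts per predicate and that the facilitator solves a conjunction left to right with backtracking, in the spirit of the Prolog-like expressions the slide describes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    """A logic variable such as M or P (hypothetical helper, not an OAA type)."""
    name: str

def unify(goal, fact, bindings):
    """Match a goal tuple against a ground fact; return extended bindings or None."""
    if len(goal) != len(fact):
        return None
    out = dict(bindings)
    for g, f in zip(goal, fact):
        if isinstance(g, Var):
            g = out.get(g.name, g)      # use the bound value if there is one
        if isinstance(g, Var):
            out[g.name] = f             # still unbound: bind it
        elif g != f:
            return None                 # constants disagree
    return out

class Facilitator:
    """Toy facilitator: routes each subgoal to whichever agents registered it."""
    def __init__(self):
        self.providers = {}             # predicate name -> [(agent name, facts)]

    def register(self, agent, predicate, facts):
        self.providers.setdefault(predicate, []).append((agent, facts))

    def solve(self, goals, solution_limit=None):
        results = []
        self._solve(list(goals), {}, results, solution_limit)
        return results

    def _solve(self, goals, bindings, results, limit):
        if limit is not None and len(results) >= limit:
            return
        if not goals:
            results.append(bindings)
            return
        first, rest = goals[0], goals[1:]
        for _agent, facts in self.providers.get(first[0], []):
            for fact in facts:
                extended = unify(first, fact, bindings)
                if extended is not None:
                    self._solve(rest, extended, results, limit)

# Hypothetical agents and data, for illustration only.
fac = Facilitator()
fac.register("org_chart_agent", "manager",
             [("manager", "John Bear", "Manager A")])
fac.register("phone_agent", "phone_number",
             [("phone_number", "Manager A", "555-0100"),
              ("phone_number", "Someone Else", "555-0199")])

# Analogue of: manager('John Bear', M), phone_number(M, P)
answers = fac.solve([("manager", "John Bear", Var("M")),
                     ("phone_number", Var("M"), Var("P"))])
print([a["P"] for a in answers])
```

The caller names only what it wants; which agent answers each subgoal is decided by the registry, which is the "what, not how or who" idea behind delegation.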
Multimodal User Interfaces:
User interfaces to distributed services, using distributed services
Natural language translation to and from ICL
Multiple NL agents for different qualities (fast, robust) and languages (English, French)
Multiagent cooperation for ambiguity resolution
Pen: gesture or handwriting?
Reference resolution: “photo of the hotel”
- NL Agent: hotel in language context
- Gesture Agent: hotel being pointed at
- UI Agent: only one hotel visible
- Database Agent: “hotel on Smith Street”
- Discourse Agent: “the other hotel”
- Human User: if still ambiguous, can clarify
Cross-modality ambiguities
- Arrow + “scroll map” vs. Arrow + “show hotel”
User is special member of agent community
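One way to picture the reference-resolution steps listed above is the following Python sketch. It is a simplification under stated assumptions: each agent either proposes a set of candidate referents or abstains, opinions are combined sequentially by intersection (the slides describe cooperation among agents but do not specify a combination rule), and the user is asked only when more than one candidate remains. The function name, agent names, and hotel identifiers are hypothetical.

```python
def resolve_reference(candidates, opinions, ask_user):
    """Narrow a set of candidate referents using agent opinions.

    opinions is a list of (agent_name, proposed_set_or_None) pairs;
    ask_user is called with the sorted remaining candidates if ambiguity persists.
    """
    remaining = set(candidates)
    for _agent, proposed in opinions:
        if proposed is None:            # this agent has no opinion
            continue
        narrowed = remaining & set(proposed)
        if narrowed:                    # ignore an opinion that would rule out everything
            remaining = narrowed
        if len(remaining) == 1:
            break
    if len(remaining) == 1:
        return next(iter(remaining))
    return ask_user(sorted(remaining))  # the user acts as the final agent

hotels = {"hotel_a", "hotel_b", "hotel_c"}
pick_first = lambda options: options[0]     # stand-in for a clarification dialog

# Gesture and UI agents together single out one hotel.
chosen = resolve_reference(
    hotels,
    [("nl_agent", None),
     ("gesture_agent", {"hotel_a", "hotel_b"}),
     ("ui_agent", {"hotel_b", "hotel_c"})],
    pick_first)

# Only the gesture agent has an opinion, so the user is consulted.
clarified = resolve_reference(
    hotels, [("gesture_agent", {"hotel_a", "hotel_b"})], pick_first)
print(chosen, clarified)
```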
OAA Triggers:
Purpose
OAA agents can dynamically register interest in any data change, communication event, or real-world occurrence accessible by any agent.
Adding a Trigger
oaa_AddTrigger(Type, Cond, Action, Params)
Trigger Types
comm: on_send, on_receive message
time: “in ten minutes”, “every day at 5pm”
data: on_change, on_remove, on_add
task: “when mail arrives about...”
Actions
The actions of triggers may be any ICL expression solvable by the community of agents.
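The data trigger type above can be sketched in a few lines of Python. This is an illustration, not the OAA implementation: `DataStore`, `add_trigger`, and the key naming are hypothetical, and actions are plain Python callables rather than ICL expressions solved by a community of agents. It models only the three data events named on the slide (on_add, on_change, on_remove).

```python
class DataStore:
    """Toy data store that fires registered triggers on add, change, and remove."""
    def __init__(self):
        self.data = {}
        self.triggers = []              # (event kind, condition, action)

    def add_trigger(self, kind, condition, action):
        self.triggers.append((kind, condition, action))

    def _fire(self, kind, key, value):
        for trig_kind, condition, action in self.triggers:
            if trig_kind == kind and condition(key, value):
                action(key, value)

    def put(self, key, value):
        kind = "on_change" if key in self.data else "on_add"
        self.data[key] = value
        self._fire(kind, key, value)

    def remove(self, key):
        value = self.data.pop(key)
        self._fire("on_remove", key, value)

log = []
store = DataStore()
is_mail = lambda key, value: key.startswith("mail/")
store.add_trigger("on_add", is_mail, lambda key, value: log.append(("new mail", key)))
store.add_trigger("on_remove", is_mail, lambda key, value: log.append(("mail deleted", key)))

store.put("mail/1", "hello")         # on_add, condition holds: action runs
store.put("note/1", "todo")          # on_add, condition fails: nothing happens
store.put("mail/1", "hello again")   # on_change, no trigger registered for it
store.remove("mail/1")               # on_remove, condition holds: action runs
print(log)
```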
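A Sample Text-to-Speech Agent in C:
/* Include libraries (the OAA header names are missing from the source) */
#include
#include
#include <string.h>   /* strcmp */

/* List capabilities: set at startup, because C does not allow a
   function call in a file-scope initializer */
ICLTerm capabilities;

/* Define capabilities: called when the agent is asked to handle an event */
ICLTerm oaa_AppDoEvent(ICLTerm Event, ICLTerm Params) {
    if (strcmp(icl_Str(Event), "play") == 0) {
        /* play(tts, Msg): the message is the second argument */
        return playTTS(icl_ArgumentAsStr(Event, 2));
    }
    return NULL;
}

/* Agent startup: connect to the parent, register, and process events */
int main(void) {
    capabilities = icl_TermFromStr("[play(tts, Msg)]");
    com_Connect("parent", connectionInfo);   /* connectionInfo is not defined on the slide */
    oaa_Register("parent", "tts", capabilities);
    oaa_MainLoop(True);
    return 0;
}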
A Sample Text-to-Speech Agent in Prolog:
% Include libraries
:- [libcom_tcp].
:- [liboaa].

% List capabilities
capabilities([solvable(play(tts, Msg),
    [type(procedure), callback(tts_events)], [])]).

% Define capabilities
tts_events(play(tts, Msg), Params) :-
    tts_api(Msg).

% Agent startup
start :-
    capabilities(C),
    com_Connect(parent, ConnectionInfo),
    oaa_Register(parent, tts, C),
    oaa_MainLoop(true).
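The two samples share one shape: declare a solvable, attach a callback, register with the parent, then wait for events. The Python sketch below mirrors that shape so it can be run without the OAA libraries. Everything in it is a hypothetical stand-in: `Agent`, `Parent`, `solvable`, and `request` are not OAA calls, the parent lives in the same process instead of across a TCP connection, and the "speech" is just appended to a list.

```python
class Agent:
    """Toy agent: maps a capability name to the callback that handles it."""
    def __init__(self, name):
        self.name = name
        self.handlers = {}

    def solvable(self, functor, callback):
        self.handlers[functor] = callback

    def do_event(self, functor, args):
        handler = self.handlers.get(functor)
        if handler is None:
            return None                 # same role as 'return NULL' in the C sample
        return handler(*args)

class Parent:
    """In-process stand-in for the parent that agents register with."""
    def __init__(self):
        self.registry = {}              # functor -> agents that can handle it

    def register(self, agent):
        for functor in agent.handlers:
            self.registry.setdefault(functor, []).append(agent)

    def request(self, functor, *args):
        return [agent.do_event(functor, args)
                for agent in self.registry.get(functor, [])]

spoken = []

def play(device, msg):
    """Stand-in for a text-to-speech call: record the message instead of speaking it."""
    spoken.append(msg)
    return True

tts = Agent("tts")
tts.solvable("play", play)              # analogue of solvable(play(tts, Msg), ...)
parent = Parent()
parent.register(tts)                    # analogue of oaa_Register(parent, tts, C)

replies = parent.request("play", "tts", "Hello")
unknown = parent.request("translate", "Hello")
print(replies, unknown, spoken)
```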
OAA-based Applications:
1. Automated Office
2. Unified Messaging
3. Multimodal Maps
4. CommandTalk
5. ATIS-Web
6. Spoken Dialog Summarization
7. Agent Development Tools
8. InfoBroker
9. Rental Finder
10. InfoWiz Kiosk
11. Multi-Robot Control
12. MVIEWS Video Tools
13. MARVEL
14. SOLVIT
15. Surgical Training
16. Instant Collaboration
17. Crisis Response
18. WebGrader
19. Speech Translation
20-25+ ...
Automated Office Application:
Main Points
- Mobile access to distributed services
- Legacy applications interacting with AI technologies
- Flexible interactions among components
- High-level tasking of agents through NL and speech
- Delegated Triggers
Multimodal Maps Application:
Main Points
- Natural interface to distributed (web) data
- Synergistic combination of handwriting, drawing, speech, direct manipulation
- Parallel cooperation and competition among many agents
- Human & Agent collaboration
Unified Messaging:
Main Points
- Mobile, adaptable access to distributed services
- Integrated Messaging: web, email, voice, fax
- Flexible interactions among components
- Distributed reference resolution and media format translation
- Delegated Triggers
MVIEWS Application:
Figure labels: Video browser with multimedia timeline, Interactive Map
Main Points
- Multimodal annotation of video using speech & pen
- Automated detection, tracking, and geolocation of moving objects
- Search and replay of videos indexed by multimodal and auxiliary data
- Applications: multi-sensor surveillance, Predator UAV, Olympic bombing
InfoWiz Application:
Main Points
- An information kiosk with an animated wizard who answers questions, gives tours, and helps navigate the information space
- OAA integrates SRI’s speech recognition, NL, and knowledge representation with Microsoft Agent graphics and Netscape’s web browser
- Soon in SRI’s lobby
CommandTalk Application:
A spoken language interface to the LeatherNet military simulation and training system
Main Points
- Spoken language interface adapts to dynamic changes in simulated world
- Advantages of speech: more realistic training; faster, more natural interface
- Supports Army, Navy, Marine Corps and Air Force versions of ModSAF simulator
Agent Development Tools:
The tools are themselves implemented in OAA
Guide user through process of creating an agent:
- Definition of capabilities
- Documentation management (publication on Web)
- Code generation of agent template
- Definition of NL vocabulary
- Update NL & speech recognition systems
- Assembly of multiagent projects
Runtime tool for launching and monitoring agent communities
Related Work:
Agent Communication Languages (KQML, FIPA)
+ Asynchronous message-passing communication, richer than the object model; facilitates parallelism
+/- Communication acts separate from content (KIF, SL)
- Interactions primarily hard-coded (peer-to-peer msgs)
Distributed objects (CORBA, DCOM)
+ Object-based integration of heterogeneous components
+ Network services (e.g. security, transactions)
+ Commercial implementations exist (e.g. Iona, Visigenic)
- Interactions primarily hard-coded (method calls)
OAA focuses on providing delegation services for flexible interactions on tasks, triggers and data mgmt
+ Research applicable to both DOBJ and ACL models
+ Bridges can be built from and to other models
+ OAA concepts could be layered on top of other models
OAA vs. Distributed Objects (CORBA, DCOM):
Distributed Objects
- Distributed, heterogeneous
- Retrieve obj, call obj
- interface: C++-like
- hardcoded interactions
OAA
- Distributed, heterogeneous
- Ask Facilitator to call service
+ interface: declarative specs
+ delegated goal & advice: parallel, compound goals, backtracking, constraints
- Data & Trigger management
OAA vs. Agent Communication Languages (KQML, FIPA):
Agent Communication Languages
- Distributed, heterogeneous
- Ask Agent Name Server or Service Broker for Addr, send msg, handle reply
- hardcoded interactions
+/- conversation policies
- Logic-based content (KIF, SL)
OAA
- Distributed, heterogeneous
- Ask Facilitator to distribute and coordinate complex requests
+ parallel, compound goals, backtracking, constraints
+ tasks, triggers, data mgmt
- Logic-based content (ICL)
OAA and Scalability:
Limitations:
- Facilitator is single point of failure
- Facilitator is bottleneck for communication
Solutions?
- Multi-Facilitator topologies
- Distribution of planning & execution functions of Facilitator + peer-to-peer communication
Figure labels: Registry & Planner, Agent E, Replicated
OAA Characteristics:
Open: agents can be created in many languages and interface with existing systems
Extensible: agents can be added or replaced dynamically
Distributed: agents are spread across many computers
Parallel: parallel execution of subtasks
Mobile: lightweight interfaces on phone and/or PDA
High-level: hides software and hardware dependencies
Multimodal: handwriting, speech, gestures, and direct manipulation can be combined together