Building and Using Practical Agent Applications: Building and Using Practical Agent Applications. SRI International. David Martin and Adam Cheyer. PAAM ’98 Tutorial.
Contents: Contents Context: Agents & Distributed Computing
Challenges & Opportunities
Inside the Open Agent Architecture
Example Systems & Useful Techniques
Concluding Remarks
Slide3: Context: Agents & Distributed Computing
Areas of Agent Research
Evolving Paradigms for Distributed Systems
SRI’s Open Agent Architecture
Challenges & Opportunities
Inside the Open Agent Architecture
Example Systems & Useful Techniques
Concluding Remarks
What Is an Agent?: What Is an Agent?
Mobile Agents: programs that move among computer hosts (e.g., Voyager, Aglets, Odyssey)
Autonomous Agents: based on planning technologies (e.g., robots, softbots, BDI)
Learning Agents: user preferences, collaborative filtering, ... (e.g., FireFly, MIT Media Lab)
Animated Interface Agents: avatars, chatbots, ... (e.g., Microsoft Agent, Julia)
Simulation-based Entities (e.g., ModSAF, RoboCup)
Data/Info finding, filtering and merging (e.g., SIMS, InfoSleuth, IR)
Cooperative Agents: cooperation among distributed, heterogeneous programmatic components (e.g., OAA, KQML, FIPA)
Approaches to Building Applications: Approaches to Building Applications
Monolithic Applications
Object-Oriented Applications
Distributed Object Applications
OAA Applications
Dynamic addition
Objective
Suitable for Internet environment
Virtual community of dynamic services
Adaptable to changing, evolving network resources
Flexible interactions among components
Approaches to Distributed Computing: Approaches to Distributed Computing Mobile Objects
Blackboard Architectures
Agent Communication Languages (ACL)
Publish & Subscribe Brokers
Mobile Objects (Agents): Mobile Objects (Agents) Objects move under their own power (e.g., Voyager, Aglets)
Advantages
Reduced network bandwidth use for certain classes of problems
Parallelism - many objects can be spawned
Disadvantages
Programmatically specify where to go and what to do, through a known interface
Little automated support for inter-object cooperation
Programming language specific (non-heterogeneous)
Blackboard Architectures: Blackboard Architectures Knowledge Sources read and write tuples from a common information space (e.g. LINDA, FLiPSiDE)
Advantages
Eliminates explicitly programmed interactions among participants
Disadvantages
KS cannot coordinate interactions
Polling
tuple(abc,1,2,3)
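The blackboard pattern above can be sketched in a few lines. This is a minimal illustration, not LINDA or FLiPSiDE: the `TupleSpace` class, its `write` and `take` methods, and the `poll` helper are hypothetical names chosen for this example. It shows both points from the slide: the reader never names the writer, and the reader has to poll.

```python
import threading
import time


class TupleSpace:
    """A tiny in-memory blackboard: writers add tuples, readers match patterns."""

    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    def write(self, tup):
        with self._lock:
            self._tuples.append(tuple(tup))

    def take(self, pattern):
        """Remove and return the first tuple matching pattern, or None.

        None in the pattern acts as a wildcard for that position.
        """
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if len(tup) == len(pattern) and all(
                    p is None or p == v for p, v in zip(pattern, tup)
                ):
                    return self._tuples.pop(i)
        return None


def poll(space, pattern, interval=0.01, attempts=100):
    """Poll until a matching tuple appears (the cost listed under Disadvantages)."""
    for _ in range(attempts):
        found = space.take(pattern)
        if found is not None:
            return found
        time.sleep(interval)
    return None


space = TupleSpace()
# A writer knowledge source posts a tuple after a short delay.
threading.Timer(0.05, space.write, args=[("abc", 1, 2, 3)]).start()
# A reader knowledge source describes the tuple it wants; it never names the writer.
result = poll(space, ("abc", None, None, None))
print(result)
```

The reader and writer share only the tuple shape, which is what removes explicitly programmed interactions; the price is the polling loop.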
Publish & Subscribe Brokers: Publish & Subscribe Brokers Clients register interest, broker routes/filters msgs
Examples: Talarian SmartSockets, Active Software’s ActiveWeb, ACL Brokers
Advantages
Destination process(es) not explicitly encoded
No polling
Disadvantages
Simple filtering, unitary messages
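A publish and subscribe broker of the kind described above might look like the following sketch. It is not modeled on any of the named products; the `Broker` class, its `subscribe` and `publish` methods, and the `"mail"` topic are assumptions made for illustration.

```python
from collections import defaultdict


class Broker:
    """Routes each published message to the callbacks registered for its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback, predicate=None):
        # predicate is an optional per-subscriber filter on the message body.
        self._subscribers[topic].append((callback, predicate))

    def publish(self, topic, message):
        """Deliver message to every interested subscriber; return the count."""
        delivered = 0
        for callback, predicate in self._subscribers[topic]:
            if predicate is None or predicate(message):
                callback(message)
                delivered += 1
        return delivered


broker = Broker()
inbox_all, inbox_urgent = [], []
broker.subscribe("mail", inbox_all.append)
broker.subscribe("mail", inbox_urgent.append, lambda m: m["priority"] > 5)

# The publisher names a topic, not a destination process.
broker.publish("mail", {"from": "adam", "priority": 9})
broker.publish("mail", {"from": "david", "priority": 1})
```

Messages are pushed to subscribers, so nobody polls. Each message is still handled as a single unit with a simple filter, which is the limitation the slide points out.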
Agent Communication Languages: Agent Communication Languages Communication message types based on speech acts (e.g., ask, tell, deny) + conversational policies
Examples: FIPA ACL, KQML
Advantages
Rich interaction model, peer-to-peer based
Standardized message types, content-agnostic
Disadvantages
Conformance to specs not universal
Explicitly coded interactions among participants
[Diagram labels: ANS, Service Broker, Ask, Reply]
Comparison of Distributed Approaches: Comparison of Distributed Approaches
Overview of the OAA: Overview of the OAA
Definition: OAA is a framework for integrating a community of software agents in a distributed environment
Distributed Computing Through Delegation (what, not how or who): facilitates flexible, adaptable interactions among distributed components through delegation of tasks, data requests & triggers
User Interface: enables natural, mobile, multimodal user interfaces to distributed services
OAA Architecture: OAA Architecture Facilitator Agent Modality Agents Registry Interagent Communication Language
Automated Office Application: Automated Office Application
Main Points
Mobile access to distributed services
Legacy applications interacting with AI technologies
Flexible interactions among components
High-level tasking of agents through NL and speech
Delegated triggers
OAA Characteristics: OAA Characteristics
Open: agents can be created in many languages and interface with existing systems
Extensible: agents can be added or replaced dynamically
Distributed: agents are spread across many computers
Parallel: parallel execution of subtasks
Mobile: lightweight interfaces on phone and/or PDA
High-level: hides software and hardware dependencies
Multimodal: handwriting, speech, gestures, and direct manipulation can be combined
OAA-based Applications: OAA-based Applications 1. Automated Office
2. Unified Messaging
3. Multimodal Maps
4. CommandTalk
5. ATIS-Web
6. Spoken Dialog Summarization
7. Agent Development Tools
8. InfoBroker
9. Rental Finder
10. InfoWiz Kiosk
11. Multi-Robot Control
12. MVIEWS Video Tools
13. MARVEL
14. SOLVIT
15. Surgical Training
16. Instant Collaboration
17. Crisis Response
18. WebGrader
19. Speech Translation
20-25+ ...
Slide17: Context: Agents & Distributed Computing
Challenges & Opportunities
Interoperation
Coordination & Control
Information Management and Sharing
Intelligent User Interfaces
Inside the Open Agent Architecture
Example Systems & Useful Techniques
Concluding Remarks
Interoperation: Interoperation Language, Ontology, Conversational Protocol
Imposing the Right Amount of Structure
Legacy & “Owned-elsewhere” Applications
Multi-platform, Multi-language
Wrappers & Surrogates
Backwards Compatibility With Older Paradigms
Integration with Standards
Opportunities
Support Greater Flexibility & Dynamism in Structuring Communities & Interactions
Provide Economical Means of Coding Interactions
Leverage Our Understanding Of Conversations
Minimize Platform & Language Barriers
Coordination & Control: Coordination & Control No Fully General Solutions Available
Families of C & C Strategies
Knowledge-Sharing
Team Coordination
Economic (Market-Driven)
Evolutionary
Opportunities
Flexibility, Synergy
Advice and Constraints
Temporal Control
Sophisticated Facilitation, Reactive Execution
Alternative Agent Control Strategies: Alternative Agent Control Strategies
Knowledge-Sharing
Agents share knowledge about capabilities and requests.
Agent brokers dynamically match requests to capabilities.
System dynamically adjusts as capabilities are added to and removed from the environment.
Team Coordination
Agents share knowledge about goals, plans, tasks & subtasks, commitments and performance.
Teams cooperate through partially synchronized actions to accomplish individual subtasks and common goals.
Market-Driven Economy
Self-interested agents pursue personal profit.
Behavior is driven by the cost of resources.
Agents are controlled by specifying market rules, rewards and penalties.
Evolutionary Systems
Agent populations evolve over time through “reproduction”, mutation and natural selection.
Agents are controlled by specifying selection criteria and reproduction process.
Applicability of Strategies: Complexity vs. Number of Agents
Coordination and Control Strategies: Knowledge-Sharing, Team Coordination, Market-Driven Economy, Evolutionary Systems
The strategies differ in the complexity and number of agents for which they are suited to control.
[Figure: strategies plotted by complexity of individual agents against number of agents in system (low to high; marks at 10^1 - 10^3 and 10^5 - 10^7). Regions: Knowledge Sharing, Team Coord., Market Driven, Evolutionary System.]
Information Management and Sharing: Information Management and Sharing External Data
Heterogeneous, Dynamic, Unreliable Sources
Operational Data
Maintaining Consistent World-views
Transactions, Snapshots, Roll-back
Sharing Strategies
How Much to Share, Cost of Sharing
Support for Collaboration
Opportunities
Tight Integration With Service-providing & Requesting Mechanisms
Built-in Support for Handling Dynamism
Use Intelligence, Autonomy to Address Reliability
Intelligent User Interfaces: Intelligent User Interfaces Make User Requests Comprehensible to System
Make System Results Comprehensible to User
Help User Understand System Complexity …
Multiple Autonomous Actors
Dynamic Communities
… Or Not Be Required to
Opportunities
Agent-based Approaches to UI Implementation
Integrate Multimodality
User As Privileged Member of Agent Community
Use of Mixed-initiative Interactions
Collaboration
Slide25: Context: Agents & Distributed Computing
Opportunities & Challenges
Inside the Open Agent Architecture
Example Systems & Useful Techniques
Concluding Remarks
OAA Architecture: OAA Architecture Facilitator Agent Modality Agents Registry Interagent Communication Language
Interagent Communication Language: Interagent Communication Language Used by Agents to:
Declare Capabilities
Request Services of Community
Respond to Requests from Other Agents
Manage and Exchange Information
Conversation & Content Layers
Advice/Constraints Can Accompany Requests
Platform- and Language-Independence
Providing Services: Providing Services Declaring Capabilities
solvable(Goal, Parameters, Permissions)
Examples of Parameters
type: {data, procedure}
private: Boolean
utility: [0 .. 10]
solvable(send_message(email, +ToPerson, +Params),
[type(procedure), callback(send_mail)],
[]),
solvable(last_message(email, -MessageId),
[type(data), single_value(true)],
[write(true)])
Requesting Services: Requesting Services
Task Management
oaa_Solve(TaskExpr, ParamList)
Expressions: logic-based (cf. Prolog)
Parameters: provide advice & constraints
High-level task types: query, action, inform, ...
Low-level: solution_limit(N), time_limit(T), parallel_ok(TF), priority(P), address(Agt), reply(Mode), block(TF), collect(Mode), ...
Data & Trigger Management
oaa_AddData(DataExpr, ParamList)
oaa_AddTrigger(Typ,Cond,Action,Ps)
Example
oaa_Solve((manager('John Bear',M),
  phone_number(M,P)), [query(var(P))])
Compound Queries: Compound Queries Address:Goal::Parameters
Address & Parameters Optional
Value-returning Parameters
Composable Using Standard Prolog Operators
Extensions
Parallel Disjunction
oaa_Solve(
  (locate('Adam Cheyer', Where)::[strategy(query)],
   notify(MsgRef, 'Adam Cheyer',
          [at(Where), by(fax)])::[strategy(action)]),
  [])
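To make the idea of a compound query concrete, here is a minimal Python sketch of how a conjunction of goals could be solved across agents, with bindings from one goal feeding the next. It is not the OAA Facilitator: the `Var`, `unify`, and `Facilitator` names are hypothetical, goals are solved sequentially rather than in parallel, parameters such as `strategy(...)` are omitted, and the manager name and phone number are made-up sample data.

```python
class Var:
    """A logic variable, compared by identity."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"?{self.name}"


def walk(term, bindings):
    """Follow variable bindings until reaching a value or an unbound Var."""
    while isinstance(term, Var) and term in bindings:
        term = bindings[term]
    return term


def unify(a, b, bindings):
    """Return extended bindings if a and b unify, else None."""
    a, b = walk(a, bindings), walk(b, bindings)
    if isinstance(a, Var):
        return bindings if a is b else {**bindings, a: b}
    if isinstance(b, Var):
        return {**bindings, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            bindings = unify(x, y, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if a == b else None


class Facilitator:
    """Routes each goal to the agents that declared a matching functor/arity."""

    def __init__(self):
        self._providers = {}  # (functor, arity) -> [(agent_name, fact), ...]

    def register(self, agent_name, facts):
        for fact in facts:
            key = (fact[0], len(fact) - 1)
            self._providers.setdefault(key, []).append((agent_name, fact))

    def solve(self, goals, bindings=None):
        """Yield one bindings dict per solution of the conjunction of goals."""
        bindings = {} if bindings is None else bindings
        if not goals:
            yield bindings
            return
        first, rest = goals[0], goals[1:]
        key = (first[0], len(first) - 1)
        for _agent, fact in self._providers.get(key, []):
            extended = unify(first, fact, bindings)
            if extended is not None:
                yield from self.solve(rest, extended)


fac = Facilitator()
# Two independent agents, each holding part of the answer (sample data).
fac.register("org_chart_agent", [("manager", "John Bear", "Pat Example")])
fac.register("phone_book_agent", [
    ("phone_number", "Pat Example", "555-0100"),
    ("phone_number", "Someone Else", "555-0199"),
])

M, P = Var("M"), Var("P")
query = [("manager", "John Bear", M), ("phone_number", M, P)]
answers = [walk(P, b) for b in fac.solve(query)]
print(answers)
```

The requester states what it wants; which agent answers each subgoal is decided by the routing table, which is the "what, not how or who" point of delegation.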
Facilitation: Facilitation Facilitator
OAA Data Management: OAA Data Management Declaring & Utilizing Data Solvables
Built-in Support
Example Parameters
single_value(t_f), unique_values(t_f)
bookkeeping(t_f), persistent(t_f)
synonym(Synonym, Original)
rules_ok(t_f)
Maintaining Data Solvables
Sharing Data
OAA Triggers: OAA Triggers
Purpose
OAA agents can dynamically register interest in any data change, communication event, or real-world occurrence accessible by any agent.
Adding a Trigger
oaa_AddTrigger(Type, Cond, Action, Params)
Trigger Types
comm: on_send, on_receive message
time: “in ten minutes”, “every day at 5pm”
data: on_change, on_remove, on_add
task: “when mail arrives about...”
Actions
The actions of triggers may be any ICL expression solvable by the community of agents
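The data trigger type can be illustrated with a small standalone sketch. This is an assumption-laden simulation, not the OAA library: `DataStore`, `add_trigger`, `set`, and `remove` are hypothetical names, and the action is a plain Python callable standing in for an ICL expression.

```python
class DataStore:
    """Holds facts and fires registered triggers on add, change, and remove."""

    def __init__(self):
        self._facts = {}
        self._triggers = []  # (event, key, action)

    def add_trigger(self, event, key, action):
        # event is one of "on_add", "on_change", "on_remove".
        self._triggers.append((event, key, action))

    def _fire(self, event, key, value):
        for ev, k, action in self._triggers:
            if ev == event and k == key:
                action(key, value)

    def set(self, key, value):
        if key not in self._facts:
            self._facts[key] = value
            self._fire("on_add", key, value)
        elif self._facts[key] != value:
            self._facts[key] = value
            self._fire("on_change", key, value)
        # Writing an identical value is not a change, so nothing fires.

    def remove(self, key):
        if key in self._facts:
            value = self._facts.pop(key)
            self._fire("on_remove", key, value)


store = DataStore()
log = []
phone_status = ("status", "phone")
store.add_trigger("on_change", phone_status, lambda k, v: log.append(v))

store.set(phone_status, "idle")   # first write: on_add, not on_change
store.set(phone_status, "busy")   # value differs: on_change fires
store.set(phone_status, "busy")   # same value: no trigger
```

The agent that installs the trigger does not need to know which agent later updates the data.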
System-Building Infrastructure: System-Building Infrastructure The Event Loop
Event Types
Built-In
Task-Specific
Hybrid
Libraries
Multiple Languages Supported
Minimal Structure Imposed on Agents
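A Sample Text-to-Speech Agent in C: A Sample Text-to-Speech Agent in C
/* Include libraries (the two OAA header names were lost in extraction) */
#include
#include
#include <string.h>  /* strcmp */

/* List capabilities */
ICLTerm capabilities;

/* Define capabilities: handle a play(tts, Msg) request */
ICLTerm oaa_AppDoEvent(ICLTerm Event, ICLTerm Params) {
  if (strcmp(icl_Str(Event), "play") == 0) {
    return playTTS(icl_ArgumentAsStr(Event, 2));
  }
  else return NULL;
}

/* Agent startup */
int main(void) {
  /* Built here because C does not allow a function call in a file-scope initializer */
  capabilities = icl_TermFromStr("[play(tts, Msg)]");
  com_Connect("parent", connectionInfo);
  oaa_Register("parent", "tts", capabilities);
  oaa_MainLoop(True);
  return 0;
}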
A Sample Text-to-Speech Agent in Prolog: A Sample Text-to-Speech Agent in Prolog
% Include libraries
:- use_module(com).
:- use_module(oaa).
% List capabilities
capabilities([
  solvable(play(tts, Msg),
    [type(procedure), callback(tts_events)],
    [])]).
% Define capabilities
tts_events(play(tts, Msg), Params) :-
  tts_api(Msg).
% Agent startup
start :-
  capabilities(C),
  com_Connect(parent, ConnectionInfo),
  oaa_Register(parent, tts, C),
  oaa_MainLoop(true).
OAA and Scalability: OAA and Scalability
Limitations:
Facilitator is single point of failure
Facilitator is bottleneck for communication
Solutions?
Multi-Facilitator topologies
Distribution of planning & execution functions of Facilitator + peer-to-peer communication
[Diagram labels: Registry & Planner, Agent E, Replicated]
Slide38: Context: Agent Types & Approaches
Challenges & Opportunities
Inside the Open Agent Architecture
Example Systems & Useful Techniques
Agent & Interagent Programming Tips
Dynamic Presentation: Unified Messaging
Reference Resolution: Multimodal Map
Information Management and Collaboration: InfoBroker & Multimodal Map
Incremental System Development & Evaluation: Stimulate
Looking for the Killer App: Other Tries
Concluding Remarks
Agent & Interagent Programming Tips: Agent & Interagent Programming Tips Choosing an Agent Interface
Information Sharing Strategies
Domain-Specific vs. Domain-Independent Agents
Adding Speech & NL to Interfaces
Tips: Choosing an Agent Interface: Tips: Choosing an Agent Interface Natural-language inspired interfaces
Imperative Verb, Direct Object, ParamList, (Result)
Parameter lists hold Adjs, Advs & Prepositions as well as extensible programmatic instruction
Classes tagged by type
inform(phone, ringing, Params)
send_message(MsgRef, Params) :- memberchk(by(fax), Params)
Succeed once with list vs. Multiple success
get(email, message_headers, +Params, -ListOfHeaders)
phone_number(Person, PhoneNum)
Tips: 3 Information Sharing Strategies: Tips: 3 Information Sharing Strategies Example: Phone Dialer Agent
1. Query
When an agent wants to know the status of the phone, it asks the Facilitator who asks the phone agent
pa: oaa_Declare(status(phone, S),[])
?a: oaa_Solve(status(phone, S), [])
Tips: Information Sharing Strategies - Post: Tips: Information Sharing Strategies - Post 2. Post (Blackboard)
The phone agent writes its status to the Facilitator; agents can query the facilitator for status, and install a trigger which proactively monitors changes to status
pa: oaa_AddData(status(phone, busy), [])
ia: oaa_Solve(status(phone, S), []), oaa_AddTrigger(data, status(phone,S), notify(Me, phone(S)), [on(change)])
Tips: Information Sharing Strategies - Inform: Tips: Information Sharing Strategies - Inform 3. Inform
Broadcast time-critical events to interested parties
ia: oaa_Declare(msg(phone, Msg), [])
pa: oaa_Solve(msg(phone, ringing, []), [inform])
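The three strategies (query, post, inform) can be compared side by side in a toy model. This sketch does not use the OAA API: `ToyFacilitator` and its methods are hypothetical, goals are plain strings rather than ICL terms, and the phone states are sample values.

```python
class ToyFacilitator:
    """Minimal stand-in showing query, post, and inform side by side."""

    def __init__(self):
        self.solvers = {}    # goal -> function that answers on demand
        self.data = {}       # goal -> posted value (blackboard style)
        self.listeners = {}  # event -> callbacks to notify

    # 1. Query: forward the question to the agent that declared the goal.
    def declare(self, goal, solver):
        self.solvers[goal] = solver

    def solve(self, goal):
        if goal in self.solvers:
            return self.solvers[goal]()
        # 2. Post: otherwise answer from data an agent wrote earlier.
        return self.data.get(goal)

    def add_data(self, goal, value):
        self.data[goal] = value

    # 3. Inform: push an event to every interested party.
    def listen(self, event, callback):
        self.listeners.setdefault(event, []).append(callback)

    def inform(self, event, payload):
        for callback in self.listeners.get(event, []):
            callback(payload)


# Query: the phone agent is asked each time, so the answer is always current.
phone_state = {"status": "idle"}
query_fac = ToyFacilitator()
query_fac.declare("status(phone)", lambda: phone_state["status"])
phone_state["status"] = "busy"
queried = query_fac.solve("status(phone)")

# Post: the phone agent writes its status; readers never contact it directly.
post_fac = ToyFacilitator()
post_fac.add_data("status(phone)", "busy")
posted = post_fac.solve("status(phone)")

# Inform: a time-critical event is broadcast without anyone asking.
heard = []
inform_fac = ToyFacilitator()
inform_fac.listen("msg(phone)", heard.append)
inform_fac.inform("msg(phone)", "ringing")
```

Query keeps the owner authoritative at the cost of a round trip per request, post trades freshness for decoupling, and inform suits events that lose value if the receiver has to ask.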
Tips: Domain-Specific vs. Domain Independent Agents: Tips: Domain-Specific vs. Domain Independent Agents Move domain-dependent code into separate agent
Employ hooks and parameters to allow domain-specific tailoring of functionality
Always ask: Domain-specific or domain independent?
Phone agent?
Office interface?
Notify agent?
Speech recognition?
Natural language?
Facilitator?
Tips: Adding Speech & NL: Tips: Adding Speech & NL User Interface responsible for:
accepting user input, sending requests, displaying results
controlling interactions of Speech and NL
Complex interpretation processed by external domain agent
Unified Messaging: Problem: Unified Messaging: Problem Universal Access: Access to web, email, voicemail, applications (e.g., calendar, database, scheduler) from multiple interfaces (e.g., web browser, desktop, telephone)
Delegated triggers to monitor information
Message dissemination across various media (e.g., fax, printer, email, phone, pager)
Locating destination target
Plan route according to user preferences & resources
Media translation as necessary
Extensible and distributed! Minimize dependencies among component technologies
Unified Messaging: Components: Unified Messaging: Components
Main Points
Mobile, adaptable access to distributed services
Integrated messaging: web, email, voice, fax
Flexible interactions among components
Distributed reference resolution and media format translation
Delegated triggers
Unified Messaging: Implementation 1/2: Unified Messaging: Implementation 1/2 Universal Access
Every user interface (including phone) must identify user
UIs coordinate among themselves to ensure only one “primary” interface per user, per utterance
Message Dissemination
Media agents: distributed reference resolution and translation
print(Object, Params)
ref(it): oaa_Solve(resolve_reference(the, document, Params, NewObj))
id(Pointer): oaa_Solve(resolve_reference_id_as(id(Pointer), postscript, [], PostScript))
print TextObject or PostScript
Unified Messaging: Implementation 2/2: Unified Messaging: Implementation 2/2 Adaptable Presentation
GenNL agent produces simple or structured text-based response for any ICL query
Reads distributed NL vocabulary definitions in forming simple responses:
Vocabulary: noun('telephone number', phone_number, [])
NL -> ICL: “What is Adam Cheyer’s telephone number?”
ICL: oaa_Solve(phone_number('Adam Cheyer', X),[query(var(X))])
Response: [phone_number('Adam Cheyer', '859-4119')]
GenNL: “The telephone number of Adam Cheyer is 859-4119.”
Structured response: description(list(EltList, AttrList))
title(Title): Title of list, e.g. ‘Schedule’
elt(Elt): Name of individual element in list, e.g. ‘Appointment’
intro(Intro): Introduction to be played at start of list, e.g., ‘Here is today’s schedule for Adam Cheyer’
max_len(Max): if the number of elements is less than Max, display all; otherwise display the first element and iterate
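The two presentation paths described above (a simple sentence built from a vocabulary entry, and a structured list governed by max_len) can be sketched as follows. This is an illustrative guess at the mechanism, not the GenNL agent itself: `VOCABULARY`, `simple_response`, and `present_list` are hypothetical names, and the sentence template is inferred from the single example on the slide.

```python
# Vocabulary: maps an ICL predicate name to its natural-language noun,
# mirroring noun('telephone number', phone_number, []).
VOCABULARY = {"phone_number": "telephone number"}


def simple_response(predicate, subject, value, vocabulary=VOCABULARY):
    """Render a solved two-argument fact as an English sentence."""
    # Fall back to the predicate name itself when no noun is registered.
    noun = vocabulary.get(predicate, predicate.replace("_", " "))
    return f"The {noun} of {subject} is {value}."


def present_list(items, intro, max_len):
    """Apply the max_len rule: show everything if short, else only the first.

    Stepping through the remaining items is left to the caller.
    """
    if len(items) < max_len:
        return [intro] + list(items)
    return [intro, items[0]] if items else [intro]


sentence = simple_response("phone_number", "Adam Cheyer", "859-4119")
print(sentence)
short = present_list(["9am staff", "noon lunch"], "Here is the schedule", 3)
long = present_list(["a", "b", "c"], "Here is the schedule", 3)
```

Because the vocabulary is data, a new agent can make its results speakable by registering a noun rather than by changing the generator.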
Multimodal Maps Application: Multimodal Maps Application
Main Points
Natural interface to distributed (web) data
Synergistic combination of handwriting, drawing, speech, direct manipulation
Parallel cooperation and competition among many agents
Human & agent collaboration
Adaptable displays according to user preferences
Multimodal Interfaces using Parallel Distributed Agents: Multimodal Interfaces using Parallel Distributed Agents Competition and cooperation among agents at many levels
Pen input: gesture recognizer vs. handwriting recognizer
Natural language: multiple NL systems (multilingual, diff. capabilities)
Reference Resolution
Multiple modalities for resolving ambiguities
e.g. arrow + “scroll map” vs. arrow + “photo of this hotel”
Multimodal Reference Resolution: Multimodal Reference Resolution Context by object type: “show photo of the hotel”
Deictic: “Find distance from here to here”, “this one”
Positional context: Write “photo?” on hotel
Visual context: “Photo of the [visible] hotel”
Database queries: “show photo of the hotel in Menlo Park”
Discourse: “No, the other one”
User disambiguation through prompting: “Which hotel?”
Information Broker: Requirements: Information Broker: Requirements Integrate Internet sources with enterprise sources
Heterogeneity handled transparently
Structured and “semi-structured” sources
Flexible access to unreliable information sources
Easily extensible to new domains
User and task models used to guide retrievals
Infrastructure must provide a basis for tools
Information Broker: Functionality: Information Broker: Functionality Mediation
Retrieval Strategies
User & Task Models
Mediation: Mediation Transparent access to heterogeneous sources
WWW structured and semi-structured sources
SQL sources
Knowledge bases
Multimedia repositories
Dynamic source registration & schema update
Query planning across distributed sources
Queries in broker or source schema
Domain knowledge used to increase query range
Built-in normalizations and conversions
Incomplete & inconsistent information
Retrieval Strategies: Retrieval Strategies Identification of relevant sources
Extraction of desired information
Imposing structure on semi-structured Web pages
Local caching of virtual databases
Sensitivity to time constraints
Flexible strategies for web vs. cache retrievals
Dealing with unreliability and change
Cache maintenance
Use of alternate sources
Tracking and rating of sources
User and Task Modeling: User and Task Modeling Representation of salient characteristics of users and tasks
Mapping from situation to information request
What information is needed and when?
User and task models used as constraints
Mapping information retrieval to presentation
What information does the user want to see?
User and task models used as filters
User-friendly knowledge acquisition
Learning user and task models where feasible
Sample Queries: Sample Queries Mediation
“Find all hotels (meeting certain constraints) in San Francisco”
Use of domain knowledge
“Find hotels halfway between S. F. and Portland”
User modeling
“Apply my preferences” (to the same query)
Legacy and Web data source integration
“Show just the hotels for which we get a corporate discount” (Accesses WWW sources and employee db)
“Find the names and extensions of employees in the AI Center who have written about …” (Accesses Harvest index, Bibtex file and employee db)
“Persistent” queries
“Notify me of any ad selling a used color inkjet printer”
Information Broker: Architecture: Information Broker: Architecture
[Diagram: Broker with broker schema and source schemas; sources are a semi-structured source (surrogate), a structured source (surrogate), and an RDB source (wrapped); message labels BQ, BR, SQ, SR]
The Broker: The Broker Agent Interactions Management
Surrogates: Surrogates Cache
Persistent Queries: Persistent Queries
[Diagram labels: Broker, Q, T, Transaction Management, Surrogate, Helper Agent, R]
Useful Features of the Framework: Useful Features of the Framework Tight Integration of Data Capabilities
Standardized, Visible Content Language
Extension of Logic Programming Paradigm
Collaboration-ready Data Management: Collaboration-ready Data Management Store data using OAA Data Management
oaa_DbDeclare(icon(Id, X, Y, PictureType), [shareable, callback(icon_change)])
Separate the code that changes data from the code that displays the results, using the callback feature
NOT:
{ oaa_AddData(icon(hilton, 100, 100, hotel), []), map_Display(icon(hilton, 100, 100, hotel)) }
BUT:
{ oaa_AddData(icon(hilton, 100, 100, hotel), []) }
icon_change(add, icon(Id, X, Y, Picture)) :- map_Display(icon(Id, X, Y, Picture)).
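The reason for routing display through the callback can be shown with a small simulation. The names here are hypothetical (`SharedData`, `displayed`, and the second hotel are not from the slides); only the shape of the `icon_change` callback follows the example above.

```python
class SharedData:
    """Stores facts and notifies a callback after every add."""

    def __init__(self, on_change):
        self._facts = []
        self._on_change = on_change

    def add(self, fact):
        self._facts.append(fact)
        self._on_change("add", fact)


displayed = []


def icon_change(operation, fact):
    # Display logic lives only here, so it runs no matter who added the icon.
    if operation == "add":
        displayed.append(fact)


icons = SharedData(icon_change)
# Added by the local user interface.
icons.add(("icon", "hilton", 100, 100, "hotel"))
# Added by some other participant, e.g. a remote collaborator or a log replay.
icons.add(("icon", "hyatt", 220, 140, "hotel"))
```

If the code that adds the icon also drew it, icons added by other participants would never appear; with the callback, every source of change updates the display the same way.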
Incremental System Development & Evaluation: Incremental System Development & Evaluation Collaborative Multimodal Map application adapted for Wizard Of Oz (WOZ) experiment to elicit data about coordinated use of language and gesture
Subject Screen vs. Wizard Screen: Subject Screen vs. Wizard Screen
Subject Video: Subject Video
Hybrid Wizard Of Oz Experiment: Hybrid Wizard Of Oz Experiment Naive user free to write, draw, or speak without constraints imposed by current technology
Wizard must respond quickly and accurately by using existing means, including pen and voice
Simultaneous evaluation of:
Experienced user manipulating real system
New user, providing data for future extensions
Bootstrap effect: continuous loop from data to theory, to system enhancement
Improvements from data analysis quantifiable
General-purpose approach
Hybrid WOZ: Implementation: Hybrid WOZ: Implementation System logging and playback “for free” using OAA collaboration facilities
“Subject mode”: functional interpretation (mostly) turned off
Addition of simple Wizard Feedback panel (separate agent) for text-to-speech messages (e.g., “Function not available.”)
Looking for Killer Apps: Looking for Killer Apps OAA has been used to implement more than 25 systems and prototypes
Not good for every application, but good for:
integrating numerous components which need to cooperate, often across language boundaries
supporting media translation
distributed reference resolution
tasking through adaptable or multimodal user interfaces
human/agent collaborative systems & incremental development
exploring direct manipulation/task delegation tradeoffs
OAA-based Applications: OAA-based Applications 1. Automated Office
2. Unified Messaging
3. Multimodal Maps
4. CommandTalk
5. ATIS-Web
6. Spoken Dialog Summarization
7. Agent Development Tools
8. InfoBroker
9. Rental Finder
10. InfoWiz Kiosk
11. Multi-Robot Control
12. MVIEWS Video Tools
13. MARVEL
14. SOLVIT
15. Surgical Training
16. Instant Collaboration
17. Crisis Response
18. WebGrader
19. Speech Translation
20-25+ ...
MVIEWS Application: MVIEWS Application
Main Points
Multimodal annotation of video using speech & pen
Automated detection, tracking, and geolocation of moving objects
Search and replay of videos indexed by multimodal and auxiliary data
Applications: multi-sensor surveillance, Predator UAV, Olympic bombing
[Screen labels: Interactive Map; Video browser with multimedia timeline]
MVIEWS Architecture: MVIEWS Architecture
InfoWiz Kiosk: InfoWiz Kiosk
Main Points
An information kiosk with an animated wizard who answers questions, gives tours, and helps navigate the information space
OAA integrates SRI’s speech recognition, NL, dialogue, and knowledge representation with Microsoft Agent graphics and Netscape’s web browser
Soon in SRI’s lobby
InfoWiz Kiosk Architecture: InfoWiz Kiosk Architecture
Multi-Robot Control: Multi-Robot Control
Concept Design
Monitoring: maps, video, status; configurable displays; global or individual views
Tasking: directed camera & robot control; delegated tasking through speech & gesture
Agent Development Tools: Agent Development Tools The tools are themselves implemented in OAA
Guide user through process of creating an agent:
Definition of capabilities
Documentation management (publication on Web)
Code generation of agent template
Definition of NL vocabulary
Update NL & speech recognition systems
Assembly of multiagent projects
Runtime tool for launching and monitoring agent communities
Concluding Remarks: Concluding Remarks Many Varieties of
Agents
Agent-based Systems
Agent Frameworks
Useful Features of Agent Frameworks
Important Design Choices
Strategies for Interoperation & Coordination
Managing and Sharing Data
User Interface Functionality
Framework