2001 Baudisch Dissertation DynamicInformationFi lt


Presentation Transcript

Dynamic information filtering: 

Dynamic information filtering Patrick Baudisch Xerox PARC March 26, 2001

Contents: 

Contents Introduction Requirements and related work The TV Scout …as a retrieval system …and as a filtering system How it works The QuerySet Architecture Building QuerySet filtering systems Manual profile editing Conclusions

Motivation: Information overload: 

Motivation: Information overload Too many research papers books movies web pages … even TV programs! Goal: alleviate information overload

IF, IR, and dynamic filtering: 

IF, IR, and dynamic filtering Analytic information seeking strategies: Retrieval (IR): changing interests, stable database. Filtering (IF): changing sources, stable interests. Many applications fit in: dictionaries => IR, music => IF. Others fit into neither niche: high source and need change rate. Example: stock market. [Oard 96]: “Grand challenge”. Dynamic information filtering

Objective of dynamic filtering: 

Objective of dynamic filtering Adaptation speed is crucial: (user profile = interest) is crucial for filtering accuracy. Interest changes: (profile ≠ interest) => filtering quality drops. Adapt profile as fast as possible. Subject of this thesis: filtering architecture for maximum adaptation speed

Requirements: 

Requirements Requirement 1: Exhaustiveness (arbitrary interests) (King and Sacramento), but not (King and Queen), INFOS [Mock 96] Requirement 2: Output style (single ranking preferred) Boolean output, Info. Lens [Malone 87]; Categories, SIFT [Yan 95] Requirements 3-5: Adapt to interest changes

R3: Learning from relevance feedback: 

R3: Learning from relevance feedback Newt [Sheth and Maes 93], WebMate [Chen and K. Sycara], GroupLens [Konstan et al 97] [Figure: actual interests over time; Jennings 91, p.207]

R4: Limitations of manual profile editing: 

R4: Limitations of manual profile editing Rule-based systems: Information Lens [Malone et al 87], ISCREEN [Pollock 88], INFOSCOPE [Fischer 91]. Problems with gradual changes [Figure label: user interest]

Resulting design guideline: 

Resulting design guideline Build a filtering system that allows learning from relevance feedback (for gradual changes) users to edit their profiles directly (for abrupt changes) and that uses a “meaningful” model for the user profiles, so that users understand how to edit them

Slide13: 

Query Frame Content frame

Q1. select a query: 

Best match Q1. select a query

Q2. read & retain program descriptions: 

program description list program description table Q2. read & retain program descriptions

Q3. suggestions: 

Q3. suggestions suggest queries

P+. Exact match profiles: 

P+. Exact match profiles

Best match profile (QuerySet profile): 

Best match profile (QuerySet profile) QuerySet profile: personal program per single mouse click

Summary: 

Summary TV Scout interface with starting page

Incremental usage: 

Incremental usage S1 U1 T1 system provides user writes

Studies done on the TV Scout so far: 

Studies done on the TV Scout so far Comparison of individual query classes > 13,000 registered users Predefined queries (genres) covered most interests Text search for what genres do not cover Search for actors, series, topics “Opinion leader” recommendation was 5th most popular query Long term study still outstanding

QuerySet profile vs. other user profiles: 

QuerySet profile vs. other user profiles Queries in QSA profile intended to represent different interests != query representation nodes != concepts (or facets) that are part of a query/interest. != IR query that represents a single interest only e.g. news, sports, Comedy shows How does user like news compared to sports…? This is not (necessarily) an inference network

Objective of that decomposition: 

Objective of that decomposition Several interest changes can be handled with minor profile changes. “I am not in the mood for action movies today” => Update only query weight in aggregation function. Benefit: all queries remain unaffected. “My taste in action movies has changed” => Edit only action movies query. Benefit: all other queries remain unaffected
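
The two kinds of edit on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the TV Scout implementation: the class `QuerySetProfile`, its methods, and the toy genre queries are hypothetical names introduced here, and the weighted sum is assumed as the aggregation function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A query is modeled as a function that assigns a rating to an object.
Query = Callable[[dict], float]


@dataclass
class QuerySetProfile:
    """Hypothetical profile: one query per interest plus one weight per query."""
    queries: Dict[str, Query] = field(default_factory=dict)
    weights: Dict[str, float] = field(default_factory=dict)

    def add(self, name: str, query: Query, weight: float = 1.0) -> None:
        self.queries[name] = query
        self.weights[name] = weight

    def set_weight(self, name: str, weight: float) -> None:
        # Temporary variation ("not in the mood"): only the weight changes.
        self.weights[name] = weight

    def replace_query(self, name: str, query: Query) -> None:
        # Lasting change of taste: only this one query is edited.
        self.queries[name] = query

    def rate(self, obj: dict) -> float:
        # Assumed aggregation function: weighted sum of the query ratings.
        return sum(self.weights[n] * q(obj) for n, q in self.queries.items())


def is_action(program: dict) -> float:
    return 1.0 if "action" in program["genres"] else 0.0


def is_comedy(program: dict) -> float:
    return 1.0 if "comedy" in program["genres"] else 0.0


profile = QuerySetProfile()
profile.add("action movies", is_action)
profile.add("comedies", is_comedy)

movie = {"title": "Terminator 2", "genres": {"action"}}
before = profile.rate(movie)
profile.set_weight("action movies", 0.0)  # "not in the mood for action today"
after = profile.rate(movie)
```

Setting the weight back to 1.0 restores the old behavior, and the comedy query is never touched by either edit.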

Make queries correspond to interests: 

Make queries correspond to interests Selection principle: make a query of what will change as a whole. It is interests that change => use queries corresponding to interests. Negative examples: data fusion (e.g. [Fox 94, Lee 97]) => redundancy; automated collaborative filtering => overlap. Positive example: the incremental usage supported by QSA systems: use as query, then bookmark, then use as profile. [Figure labels: queries (one-shot state): S1, U1, T1, user defines, system suggests; S2, U2, T2, system compiles; QSA profile (filtering state): S3, T, system provides, user writes]

How to build QSA systems? Reuse!: 

How to build QSA systems? Reuse! Sybase, FreeWAIS, Print import, <more>

Aggregation subsystem: 

Aggregation subsystem Example: User profile = {action movies, comedies, Tips by Lars}. Aggregation: turn these three rankings into a single ranking. Is a program {0.4 action movie, 0.3 comedy, “excellent” by Lars} better than {0 action movie, 0.8 comedy, “ok” by Lars}? Notion of tradeoffs similar to IR/IF systems on term frequencies: Query = {“information”, “retrieval”}. Is a web page {0.4 information, 0.3 retrieval} better than another web page {0 information, 0.8 retrieval}? => Reuse IR/IF systems: weighted request and indexing retrieval model. Output rating(object) = sum of query ratings. TV Scout: overlap between queries was small enough => this model is sufficient
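
The question on this slide can be worked through with the sum-of-query-ratings model. The sketch below uses the slide's numbers; the numeric values assigned to Lars' verbal tips (`TIP_SCORE`) are an assumption made only for this illustration and do not come from the slides.

```python
# Hypothetical mapping from Lars' verbal tips to numbers (an assumption).
TIP_SCORE = {"excellent": 1.0, "ok": 0.5}


def aggregate(query_ratings: dict) -> float:
    """Output rating(object) = sum of query ratings."""
    return sum(query_ratings.values())


program_a = {
    "action movies": 0.4,
    "comedies": 0.3,
    "Tips by Lars": TIP_SCORE["excellent"],
}
program_b = {
    "action movies": 0.0,
    "comedies": 0.8,
    "Tips by Lars": TIP_SCORE["ok"],
}

# Turn the three per-query ratings of each program into a single ranking.
ranking = sorted(
    [("A", aggregate(program_a)), ("B", aggregate(program_b))],
    key=lambda item: item[1],
    reverse=True,
)
```

Under these assumed tip scores, program A ends up ahead of program B; a different mapping for the verbal tips could reverse the order, which is exactly the tradeoff the slide points at.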

Simple case: “Rate a query”: 

Simple case: “Rate a query” What is the general concept behind profile editors? Rate a query as a whole “How do you like science fiction movies”? => This is fast, because users can take experience with and expectations about query into account But what if the user loves news programs, but wants only a few top-ranked ones? (redundancy between news)

General case: “Rate a set”: 

General case: “Rate a set” Generalization: ask user to rate arbitrary set of objects. Example: “How do you like: {Back to the future, Brazil, Blade runner, 1984…Metropolis}?” User-aggregated relevance feedback: the user mentally assigns a rating to each object; the user aggregates these and tells the system the result. This saves effort for communicating individual ratings. Benefit: “Rate a query” is a special case of “Rate a set”. This makes both compatible with relevance feedback
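
One way to read the claim that "rate a query" is a special case of "rate a set" is sketched below. It assumes, purely for illustration, that the system records the user's aggregated rating as ordinary per-object relevance feedback for every member of the set; the function names `rate_set` and `rate_query` are hypothetical.

```python
from typing import Callable, Dict, Iterable


def rate_set(feedback: Dict[str, float], objects: Iterable[str], rating: float) -> None:
    """Store one user-aggregated rating as feedback for each object in the set."""
    for obj in objects:
        feedback[obj] = rating


def rate_query(
    feedback: Dict[str, float],
    query: Callable[[str], bool],
    catalog: Iterable[str],
    rating: float,
) -> None:
    """Special case: the rated set is whatever the query selects from the catalog."""
    rate_set(feedback, [obj for obj in catalog if query(obj)], rating)


feedback: Dict[str, float] = {}

# "How do you like {Back to the Future, Brazil, Blade Runner}?"
rate_set(feedback, ["Back to the Future", "Brazil", "Blade Runner"], 0.9)

# "How do you like news programs?" expressed as a rated query.
catalog = ["Brazil", "Evening News", "Morning News"]
rate_query(feedback, lambda title: "News" in title, catalog, 0.2)
```

Because both calls end up writing the same kind of per-object feedback, either can be mixed with conventional relevance feedback on single objects.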

Combine both: 

Combine both Goal: find a way as simple and fast as “rate a query”, as flexible as “rate a set”. Solution: use top and bottom ranks of queries (and others). Extensible to arbitrary ranks -> histogram-based interfaces. “How much do you like top-ranked news programs?” “How much do you like bottom-ranked news programs?”
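
A small sketch of how two answers, one for the top rank and one for the bottom rank, could be spread over a whole query result. Linear interpolation between the two answers is an assumption made here for illustration; the slides do not say which interpolation is used, and `rating_for_rank` is a hypothetical name.

```python
def rating_for_rank(rank: int, count: int, top_rating: float, bottom_rating: float) -> float:
    """Interpolate linearly between the user's rating for the top-ranked
    object (rank 0) and for the bottom-ranked object (rank count - 1)."""
    if count < 1 or not 0 <= rank < count:
        raise ValueError("rank must satisfy 0 <= rank < count")
    if count == 1:
        return top_rating
    fraction = rank / (count - 1)
    return top_rating + fraction * (bottom_rating - top_rating)


# A user who loves top-ranked news programs but has no use for bottom-ranked ones.
news_ratings = [
    rating_for_rank(rank, 5, top_rating=1.0, bottom_rating=0.0)
    for rank in range(5)
]
```

With more than two anchor ranks the same idea becomes a piecewise mapping, which is the direction the histogram-based interfaces take.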

Profile editor framework: 

Profile editor framework Skip all 2. Dead Poets Society 1. Bayern-Manchester 2. Amazons on Mars ------------------------------- 2. Le Grand Bleu 1. Sat1 ran Skip

Paintable interfaces: 

Paintable interfaces

Example for multiple select: 

Example for multiple select

Multiple select applied to interest: 

Multiple select applied to interest [Figure labels: Information, Sports, Beverly Hills 90210, Endorsed by Paul, Comedy, “Action AND Comedy”, Action movies, Schwarzenegger, Endorsed by Lars, M.A.S.H., Basketball, Classic music, Theater, Golf, Series]

Multiple select versus painting: 

Multiple select versus painting Immediate visual feedback allows differentiated input

Layout by co-occurrence: 

Layout by co-occurrence [Figure: items laid out by co-occurrence: Danish, Milk, Pancakes, Orange Juice, Bacon, French Toast, English muffin, Hash Browns, Ham, Eggs, Root Beer, Milk Shake, Cookie, Chick Sand, Iced Tea, Fish sand, Fruit Pie, Sundae, Cheese Burger, Ham Burger, French Fries, Cola, Onion Rings, Coffee, TOTAL]

A paintable profile editor: 

A paintable profile editor

Paintable time and channel editors: 

Paintable time and channel editors Interval sliders are split into segments no handles, just paint the addition Intervals labeled as entities to reduce cluttering

QSA vs. requirements: 

QSA vs. requirements Requirement 1: Exhaustiveness: arbitrary interests. Requirement 2: Output style: single ranking. Requirements 3-5: Adapt to interest changes: user-aggregated relevance feedback, relevance feedback, reuse of old queries (weight set to zero)

Achievements of the dissertation: 

Achievements of the dissertation (1) a new generic IF system architecture designed for the efficient handling of highly dynamic interests (the QuerySet Architecture) (2) a new paradigm of high-level access to user profiles (user-aggregated relevance feedback) (3) a framework of new user interface interaction styles providing users with this high-level access (4) a proof of concept implementation (TV Scout)

Future work: 

Future work (1) new application areas (2) new query classes (3) improved aggregation functions (4) new profile editor user interfaces (5) empirical work.

Slide46: 

END

Image processing: 

Image processing [Figure labels: luminance; number of pixels; there are no black pixels; there are no white pixels; only rather dark pixels; white handle assigns 100% luminance; black handle assigns 0% luminance; current state of the image; desired state of the image; gray handle assigns 50% luminance]

Slide rule (Rechenschieber): 

Slide rule (Rechenschieber) Merge histograms “zipper style” [Figure: histograms for action movies and comedies with scale marks 1, ½, 0 and ¾, ¼]

Histogram-based interfaces: 

Histogram-based interfaces [Figure: histogram-based profile editor. Legend: hot!, selected, rejected. Histograms: Martial arts, Comedy shows, Entertainment, Sports. Captions: 14 out of 14 martial arts programs per week selected; 536 out of 536 comedy shows per week selected; 32 out of 333 sports programs per week selected; 512 out of 914 movies per week selected; Overall: 1094 out of 1797 programs per week selected. Program list: Terminator 2, Dead Poets Society, Amazons on Mars, ---, Le Grand Bleu, Back to the Future. Buttons: Save, Undo]

The jelly interface: 

The jelly interface

Slide51: 

STUFF

QSA vs. related work: 

QSA vs. related work

QSA can emulate some of them: 

QSA can emulate some of them SDI systems (Selective Dissemination of Information), rule-based systems, stereotype-based systems, automated collaborative filtering systems

Short break?: 

Short break?

Chapter 4: User interfaces Normalization and interest intensity editors: 

Chapter 4: User interfaces Normalization and interest intensity editors 1. Form-based 2. Histogram-based 3. and Paintable Interfaces

Parameters users know: 

Parameters users know Interest intensities “How important is that query to you” Amounts of objects “How many objects do you want from that query”

Relating histograms to each other: 

Relating histograms to each other Moving arrows

What is in and what is not?: 

What is in and what is not? 2. Dead Poets Society 1. Bayern-Manchester 2. Amazons on Mars ------------------------------- 2. Le Grand Bleu 1. Sat1 ran 2. Back to the Future

Comparison: 

Comparison 2F 1F 0F 2H 1H

Results: 

Results 2D preferred over 1D. Computer experts preferred the more powerful histogram-based editors. Computer novices prefer form-based. [Figure: two bar charts, 2F and 2H, showing number of subjects per rating on a scale from horrible to wonderful]

Chapter 5: TV Scout: 

Chapter 5: TV Scout TV compared to other application areas TV Scout user interface overview Gathering implicit feedback The TV Scout query classes

Properties of TV: 

Properties of TV TV is mainly non-textual TV is broadcast medium no reactions of users, only expectations Annotations date out No incentive for reading new descriptions Broad range of content + interests

TV vs. other application areas: 

TV vs. other application areas

TV Scout User interface: 

TV Scout User interface

Gathering relevance feedback: 

Gathering relevance feedback

Gathering relevance feedback: 

Gathering relevance feedback Benefit of relevance feedback: query suggestion, profile optimization. For first-time users and casual users this is too remote to be an incentive. Use implicit feedback, i.e. monitor user behavior. Allows gathering relevance feedback even if users do not plan to have a profile => system can become active and suggest

Which implicit feedback to use?: 

Which implicit feedback to use? examination feedback is ambiguous, reference requires community, … but retention is very natural for planning TV [Nichols 97, Oard 98]

Implicit retention feedback: 

Implicit retention feedback Monitor retention tools and exact match menu. Assign implicit ratings: Default is “not inspected”. Overwrite displayed program descriptions that are not inspected with negative implicit rating “not retained”. Overwrite rating of retained program descriptions with positive implicit rating “retained”. Enhancement: assign negative rating “avoided” if user queries this date/time/channel segment, but other queries
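
The overwrite order described on this slide can be sketched as follows. This is a simplified illustration with hypothetical names (`ImplicitRating`, `implicit_ratings`); it covers the three basic ratings only and leaves out the “avoided” enhancement.

```python
from enum import Enum
from typing import Dict, Iterable


class ImplicitRating(Enum):
    NOT_INSPECTED = "not inspected"  # default
    NOT_RETAINED = "not retained"    # negative implicit rating
    RETAINED = "retained"            # positive implicit rating


def implicit_ratings(
    all_programs: Iterable[str],
    displayed: Iterable[str],
    retained: Iterable[str],
) -> Dict[str, ImplicitRating]:
    # Step 1: every program starts out as "not inspected".
    ratings = {program: ImplicitRating.NOT_INSPECTED for program in all_programs}
    # Step 2: descriptions that were displayed are overwritten with "not retained".
    for program in displayed:
        ratings[program] = ImplicitRating.NOT_RETAINED
    # Step 3: descriptions the user retained are overwritten with "retained".
    for program in retained:
        ratings[program] = ImplicitRating.RETAINED
    return ratings


ratings = implicit_ratings(
    all_programs=["Le Grand Bleu", "Dead Poets Society", "Sat1 ran"],
    displayed=["Le Grand Bleu", "Dead Poets Society"],
    retained=["Dead Poets Society"],
)
```

The order of the two overwrite steps matters: a retained program was also displayed, so the positive rating has to be applied last.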

Meaning of retention feedback: 

Meaning of retention feedback Implicit retention feedback => all predictions based on that predict what users will retain Retention is not planning to watch Planning to watch is not to watch To watch is not to have liked in the past Retention-based ratings support users only in doing what their task in the TV Scout is—to find the programs that they will want to retain.

Query classes of the TV Scout: 

Query classes of the TV Scout

TV Scout query classes: 

TV Scout query classes Text search FreeWAIS subsystem searches in titles Genres Combined with predicted popularity Extend to automated collaborative filtering Editor’s tips Tips from the professional TV TODAY editors Opinion leaders Users recruited to become “Editors”

Query classes: techniques: 

Query classes: techniques

Query classes: applicability: 

Query classes: applicability

Chapter 6: Conclusions: 

Chapter 6: Conclusions

Thanks: 

Thanks Dieter Boecker Uli Thiel Matthias Hemmje

END: 

END

NOT INCLUDED: 

NOT INCLUDED

Classification of IF systems: 

Classification of IF systems Objects User Rated objects feature extraction matching Profile (= objects) Rated stereotypes stereotype expansion Profile (= stereotypes) Rated attributes Profile (= attributes) feedback attributes

Bar chart ≠ histogram: 

Bar chart ≠ histogram

Email Profile Editor: 

Email Profile Editor

Channel interface toggle look: 

Channel interface toggle look

Banner advertising dialog: 

Banner advertising dialog Daily life Shopping Apparel Food Cosmetics Multimedia Music Games Movies Concerts Books Computer Hardware Software Internet Services Electronics Telecomm. TV Video Hi-fi Mobility Cars Flights Trains Last minute Hotels Money Insurance Stocks Services Contact Jobs Friends Dating Classifieds Sports&Fun Sports Clubs Traveling Infotainment News Magazines Media Competition Free stuff Banking done undo

Toggle tree maps: 

Toggle tree maps

>> Classification of IF systems: 

>> Classification of IF systems Objects User Rated objects feature extraction Profile (= attributes) 1. Feature extraction (Relevance feedback) feedback attributes matching

>> Classification of IF systems: 

>> Classification of IF systems Objects User feedback Rated attributes attributes Profile (= attributes) 2. Attribute-level interaction (Rules etc.) matching

>> Classification of IF systems: 

>> Classification of IF systems Objects User Rated stereotypes stereotype expansion Profile (stereotypes) feedback attributes 3. Stereotype expansion matching

>> User profiles of related work: 

>> User profiles of related work 4. Automated collaborative filtering

>> 3. Stereotype-based: 

>> 3. Stereotype-based e.g. GRUNDY [Rich 79] The personality traits that users list to describe themselves are inherently long-term. => no possibility for the users to directly update the representation of their information needs

>> 4. Collaborative filtering: 

>> 4. Collaborative filtering No aggregation of ratings in user profile Interest change Which ratings have dated out, which are still valid? No way to find this out. => Interest change implies that users have to re-rate large amounts of objects, maybe all This is why ACF is so popular for movies: There are hardly any interest changes

IF model [Belkin & Croft 92]: 

IF model [Belkin & Croft 92] [Figure labels: d. Inner refinement cycle; c. Outer refinement cycle; b. Creation; Producers of Documents; Distributors of Documents; Distribution and Representation; Regular Information Interest; Users/Groups with Long-term goals; Representation; Comparison or Filtering; Modification; Use and/or Evaluation; Profiles; Retrieved Documents; Document Surrogates]

Collaborative filtering: 

Collaborative filtering Record reactions of users to data objects, e.g. documents (annotations) [Goldberg 92]. Aggregate annotations and direct them to appropriate recipients.

Requirement 4: Output styles: 

Requirement 4: Output styles 1 n 1 mn 1 ∞ weakly ordered output single ranked output “multiple” ranked outputs

Requirement 5: Correctness: 

Requirement 5: Correctness “King and Queen” example [Mock 96]: users are not interested in the features “King” and “Queen” but are interested in features “Sacramento” and “King” (the basketball team). Linear combination of all input features makes conditional independence assumption: same values for the word “king”, unable to classify these articles correctly => not able to correctly represent multiple unrelated interests

Interest changes in literature: 

Interest changes in literature Gradual changes [Belkin 92, Baclace 91, Lang 95, ...] Consequence of processes, e.g. as people age Example: Favorite TV series Abrupt changes [Marchionini 95, Lam 96, Frisse 89...] Consequence of events Example: Actor quits series Temporary variations [Allen 90, Loeb 92, Kay 95, …] Mood changes Example: In the mood for an action movie

How to tackle the problem? Learn from interactive computer graphics: 

How to tackle the problem? Learn from interactive computer graphics

Computer graphics vs. Info filtering: 

Computer graphics vs. Info filtering

Interactivity in computer graphics: 

Interactivity in computer graphics IF: Interest changes are not known in advance => CG: interactive animation, e.g. video games

Interactivity in computer graphics: 

Interactivity in computer graphics

Interactivity in computer graphics: 

Interactivity in computer graphics

Requirement: Detail and interactivity: 

Requirement: Detail and interactivity Requirements high interactivity (rapid reaction to input) graphical quality Video games: scene graph and bitmaps Bitmaps for the details Scene graph for the modifiability Application programs: Assimilate characteristics of other approach Drawing programs => texture maps [Foley 90]. Painting programs => layers [Adobe].

Benefit from using layers in CG: 

Benefit from using layers in CG Creating all layers >= painting a single frame. … but, pays off when the scene is animated Represent change in scene graph (translate, fade in or out, or taint a layer, …) Update only selected layers Group into one layer what will change as a whole

Transfer the idea: 

Transfer the idea Transfer the idea from 2D animation to information filtering: n layers => n queries (Query = “a function that assigns ratings to objects”); scene graph => “aggregation function”

What if overlap is substantial?: 

What if overlap is substantial? The WRIR model assumes mutual independence This is not always justified Two queries are used in a data fusion way (=> redundancy) Action, comedy, but user dislikes action comedies (=> implicit interest) => Use model that can learn relation between queries

Link matrices: 

Link matrices

Model 2: Implementation as inference network: 

Model 2: Implementation as inference network r1 rm d2 r3 r2 q1 dj I d1 qn dj-1 … … … QSA profile

Learning inference network: 

Learning inference network [Baclace 91] Simple “agents” represent each query Complex “agents” represent conjunctions of queries Agents learn from relevance feedback what this query match or combination is worth
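
The idea of simple and complex agents can be illustrated with a deliberately simplified sketch. It is not the algorithm of [Baclace 91]: here each agent merely keeps the running mean of the feedback given on objects it matches, and all names (`Agent`, `build_agents`, `give_feedback`) are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable


class Agent:
    """Tracks what a query match (or a conjunction of matches) is worth."""

    def __init__(self, queries: Iterable[str]) -> None:
        self.queries = frozenset(queries)
        self.total = 0.0
        self.count = 0

    def learn(self, feedback: float) -> None:
        self.total += feedback
        self.count += 1

    def worth(self) -> float:
        # Running mean of the feedback seen so far (0.0 before any feedback).
        return self.total / self.count if self.count else 0.0


def build_agents(query_names: Iterable[str]) -> Dict[FrozenSet[str], Agent]:
    names = list(query_names)
    keys = [frozenset([name]) for name in names]              # simple agents
    keys += [frozenset(pair) for pair in combinations(names, 2)]  # complex agents
    return {key: Agent(key) for key in keys}


def give_feedback(
    agents: Dict[FrozenSet[str], Agent],
    matched_queries: Iterable[str],
    feedback: float,
) -> None:
    matched = frozenset(matched_queries)
    for key, agent in agents.items():
        if key <= matched:  # the agent fires if all of its queries matched
            agent.learn(feedback)


agents = build_agents(["action", "comedy"])
give_feedback(agents, {"action"}, 1.0)             # liked an action movie
give_feedback(agents, {"comedy"}, 1.0)             # liked a comedy
give_feedback(agents, {"action", "comedy"}, -1.0)  # disliked an action comedy

conjunction_worth = agents[frozenset({"action", "comedy"})].worth()
action_worth = agents[frozenset({"action"})].worth()
```

In this toy run the conjunction agent ends up with a clearly lower worth than either simple agent, which is the kind of relation between queries that a plain sum of query ratings cannot express.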

Normalization: Fitting: 

Normalization: Fitting

Normalization in image processing: 

Normalization in image processing

Demo levels dialog: 

Demo levels dialog

Results of user study: 

Results of user study What confuses users is the surface property: “What does the height of these boxes mean?” They had recognized bar charts, not histograms => better give up bar chart look. Which real-world object has the right properties? Deformable… …but not compressible (constant volume). Preserves its shape when deformed

Histograms help combining knowledge: 

Histograms help combining knowledge

Inserting queries in QSA profile: 

Inserting queries in QSA profile

TV Scout: 

TV Scout TV Scout interface with starting page

Some design possibilities: 

Some design possibilities

Painting (instead of multiple select): 

Painting (instead of multiple select) Use different colors to express different degrees of like or dislike

Semantic space layout: 

Semantic space layout Layout according to geographic location of TV stations

3D and 4D paintable interfaces: 

3D and 4D paintable interfaces Domains with natural n-dimensional structure Display in n-d Explosion displays keep 2-d painting applicable

Slide119: 

program descriptions Date feedback QSA filtering QSA profile Video labels Laundry list ad hoc query

Query-executing subsystems: 

Query-executing subsystems Use everything that returns (object, rating) pairs. Can use retrieval systems, but also others. TV Scout: genres, hand-made function in Sybase database; text searches run in FreeWAIS; editor’s recommendations imported from print magazine; user tips done by users. Plug in more query-executing subsystems at any time
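
The plug-in contract on this slide, anything that returns (object, rating) pairs, can be sketched as a small registry. The class `QueryExecutor` and the two stand-in subsystems are hypothetical; they are plain Python functions and do not talk to Sybase or FreeWAIS.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

RatedObjects = Iterable[Tuple[str, float]]


class QueryExecutor:
    """Collects ratings from any subsystem that yields (object, rating) pairs."""

    def __init__(self) -> None:
        self._subsystems: Dict[str, Callable[[], RatedObjects]] = {}

    def plug_in(self, name: str, subsystem: Callable[[], RatedObjects]) -> None:
        # Subsystems can be added at any time; nothing else has to change.
        self._subsystems[name] = subsystem

    def run(self) -> Dict[str, Dict[str, float]]:
        ratings: Dict[str, Dict[str, float]] = defaultdict(dict)
        for name, subsystem in self._subsystems.items():
            for obj, rating in subsystem():
                ratings[obj][name] = rating
        return dict(ratings)


def genre_sports() -> RatedObjects:
    # Stand-in for a genre query.
    return [("Bayern-Manchester", 1.0)]


def editors_tips() -> RatedObjects:
    # Stand-in for imported editor's recommendations.
    return [("Dead Poets Society", 0.8), ("Bayern-Manchester", 0.6)]


executor = QueryExecutor()
executor.plug_in("genre: sports", genre_sports)
first_run = executor.run()

executor.plug_in("editor's tips", editors_tips)  # plugged in later
second_run = executor.run()
```

The per-object dictionaries produced by `run` are what an aggregation function would then combine into a single ranking.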