User Interfaces for Information Access: Marti Hearst, IS202, Fall 2006
Outline: What do people search for? Why is supporting search difficult? What works in search interfaces? When does search result grouping work? What about social tagging and search?
What Do People Search For? (And How?)
A of Information Needs: What is the typical height of a giraffe? What are some good ideas for landscaping my client’s yard? What are some promising untried treatments for Raynaud’s disease?
Questions and Answers: What is the height of a typical giraffe? The result can be a simple answer, extracted from existing web pages. Can specify with keywords or a natural language query. However, most search engines are not set up to handle questions properly. Get different results using a question vs. keywords.
Classifying Queries: Query logs only indirectly indicate a user’s needs. One set of keywords can mean various different things: 'barcelona', 'dog pregnancy', 'taxes'. Idea: pair up query logs with which search result the user clicked on.
For example, 'taxes' followed by a click on tax forms. Study performed on AltaVista logs. Author noted afterwards that Yahoo logs appear to have a different query balance. Rose & Levinson, Understanding User Goals in Web Search, Proceedings of WWW’04.
Classifying Web Queries: Rose & Levinson, Understanding User Goals in Web Search, Proceedings of WWW’04.
What are people looking for? Check out Google Answers.
Why is Supporting Search Difficult?: Everything is fair game. Abstractions are difficult to represent. The vocabulary disconnect. Users’ lack of understanding of the technology.
Everything is Fair Game: The scope of what people search for is all of human knowledge and experience. Other interfaces are more constrained (word processing, formulas, etc.). Interfaces must accommodate human differences in: knowledge / life experience; cultural background and expectations; reading / scanning ability and style; methods of looking for things (pilers vs. filers).
Abstractions Are Hard to Represent: Text describes abstract concepts. Difficult to show the contents of text in a visual or compact manner. Exercise: How would you show the preamble of the US Constitution visually? How would you show the contents of Joyce’s Ulysses visually? How would you distinguish it from Homer’s The Odyssey or McCourt’s Angela’s Ashes? The point: it is difficult to show text without using text.
Vocabulary Disconnect: If you ask a set of people to describe a set of things there is little overlap in the results.
Lack of Technical Understanding: Most people don’t understand the underlying methods by which search engines work.
People Don’t Understand Search Technology: A study of 100 randomly-chosen people found: 14% never type a URL directly into the address bar. Several tried to use the address bar, but did it wrong: put spaces between words, or used combinations of dots and spaces ('nursing spectrum.com', 'consumer reports.com'). Several use the search form with no spaces ('plumber’slocal9', 'capitalhealthsystem'). People do not understand the use of quotes: only 16% use quotes, and of these, some use them incorrectly, e.g. around all of the words, making results too restrictive, or as in 'lactose intolerance –recipies', where the – excludes the recipes. People don’t make use of 'advanced' features: only 1 used 'find in page'; only 2 used Google cache. Hargittai, Classifying and Coding Online Actions, Social Science Computer Review 22(2), 2004, 210-227.
People Don’t Understand Search Technology: Without appropriate explanations, most of 14 people had strong misconceptions about: ANDing vs. ORing of search terms (some assumed the ANDing search engine indexed a smaller collection; most had no explanation at all). Empty results for the query 'to be or not to be' (9 of 14 could not explain in a way that remotely resembled stop word removal). Term order variation, 'boat fire' vs. 'fire boat' (only 5 out of 14 expected different results). Understanding was vague, e.g.: 'Lycos separates the two words and searches for the meaning, instead of what’re your looking for. Google understands the meaning of the phrase.' Muramatsu & Pratt, Transparent Queries: Investigating Users’ Mental Models of Search Engines, SIGIR 2001.
What Works in Search Interfaces?
What Works for Search Interfaces?:
Query term highlighting in results listings and in retrieved documents. Sorting of search results according to important criteria (date, author). Grouping of results according to well-organized category labels (see Flamenco). DWIM only if highly accurate: spelling correction/suggestions; simple relevance feedback (more-like-this); certain types of term expansion. So far: not really visualization. Hearst et al., Finding the Flow in Web Site Search, CACM 45(9), 2002.
Highlighting Query Terms: Boldface or color. Adjacency of terms with relevant context is a useful cue.
Highlighted query term hits using Google toolbar: US Blackout PGA Microsoft
Small Details Matter: UIs for search especially require great care in small details, in part due to the text-heavy nature of search. A tension between more information and introducing clutter. How and where to place things is important. People tend to scan or skim. Only a small percentage reads instructions.
Small Details Matter: UIs for search especially require endless tiny adjustments, in part due to the text-heavy nature of search. Example: In an earlier version of the Google Spellchecker, people didn’t always see the suggested correction. Used a long sentence at the top of the page: 'If you didn’t find what you were looking for …' People complained they got results, but not the right results. In reality, the spellchecker had suggested an appropriate correction.
Interview with Marissa Mayer by Mark Hurst: http://www.goodexperience.com/columns/02/1015google.html
Small Details Matter: The fix: Analyzed logs, saw people didn’t see the correction: they clicked on the first search result, didn’t find what they were looking for (came right back to the search page), scrolled to the bottom of the page, did not find anything, and then complained directly to Google. Solution was to repeat the spelling suggestion at the bottom of the page. More adjustments: The message is shorter, and different on the top vs. the bottom. Interview with Marissa Mayer by Mark Hurst: http://www.goodexperience.com/columns/02/1015google.html
Using DWIM: DWIM – Do What I Mean. Refers to systems that try to be 'smart' by guessing users’ unstated intentions or desires. Examples: automatically augment my query with related terms; automatically suggest spelling corrections; automatically load web pages that might be relevant to the one I’m looking at; automatically file my incoming email into folders; pop up a paperclip that tells me what kind of help I need. THE CRITICAL POINT: Users love DWIM when it really works. Users DESPISE it when it doesn’t.
DWIM that Works: Amazon’s 'customers who bought X also bought Y', and many other recommendation-related features.
DWIM Example: Spelling Correction/Suggestion: Google’s spelling suggestions are highly accurate. But this wasn’t always the case. Google introduced a version that wasn’t very accurate. People hated it. They pulled it. (According to a talk by Marissa Mayer of Google.) Later they introduced a version that worked well. People love it. But don’t get too pushy.
For a while, if the user got very few results, the page was automatically replaced with the results of the spelling correction. This was removed, presumably due to negative responses. Information from a talk by Marissa Mayer of Google.
Query Reformulation: After receiving unsuccessful results, users modify their initial queries and submit new ones intended to more accurately reflect their information needs. Web search logs show that searchers often reformulate their queries. A study of 985 Web user search sessions found 33% went beyond the first query. Of these, ~35% retained the same number of terms while 19% had 1 more term and 16% had 1 fewer. Spink, Jansen & Ozmutlu, Use of query reformulation and relevance feedback by Excite users, Internet Research 10(4), 2001.
Query Reformulation: Many studies show that if users engage in relevance feedback (RF), the results are much better. In one study, participants did 17-34% better with RF. They also did better if they could see the RF terms than if the system did it automatically (DWIM). But the effort required for doing so is usually a roadblock. Koenemann & Belkin, A Case for Interaction: A Study of Interactive Information Retrieval Behavior and Effectiveness, CHI’96.
Query Reformulation: What happens when the web search engine suggests new terms? Web log analysis study using the Prisma term suggestion system: Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03.
Query Reformulation Study: Feedback terms were displayed to 15,133 user sessions.
Of these, 14% used at least one feedback term. For all sessions, 56% involved some degree of query refinement. Within this subset, use of the feedback terms was 25%. By user id, ~16% of users applied feedback terms at least once on any given day. Looking at a 2-week session of feedback users: of the 2,318 users who used it once, 47% used it again in the same 2-week window. Comparison was also done to a baseline group that was not offered feedback terms. Both groups ended up making a page-selection click at the same rate. Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03.
Query Reformulation Study: Other observations: Users prefer refinements that contain the initial query terms. Presentation order does have an influence on term uptake. Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03.
Query Reformulation Study: Types of refinements. Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03.
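The reformulation figures above (same number of terms, one more term, one fewer; refinements that keep the initial query terms) come from comparing successive queries in a session. A minimal Python sketch of that kind of tally follows. The function names and the whitespace tokenization are assumptions made for illustration; they are not taken from the cited studies.

```python
from collections import Counter


def term_delta(prev_query: str, new_query: str) -> int:
    """Change in the number of whitespace-separated terms between two queries."""
    return len(new_query.split()) - len(prev_query.split())


def retains_initial_terms(prev_query: str, new_query: str) -> bool:
    """True if every term of the earlier query also appears in the later one."""
    return set(prev_query.lower().split()) <= set(new_query.lower().split())


def summarize_session(queries: list[str]) -> Counter:
    """Count reformulations in one session, keyed by change in term count."""
    counts = Counter()
    for prev, new in zip(queries, queries[1:]):
        counts[term_delta(prev, new)] += 1
    return counts


# Hypothetical session: one term added, terms reordered, one term dropped.
session = ["dog", "dog pregnancy", "pregnancy dog", "dog"]
summary = summarize_session(session)
```

Real log analysis would also need session segmentation and query normalization, which this sketch leaves out.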
Prognosis: Query Reformulation: Researchers have always known it can be helpful, but the methods proposed for user interaction were too cumbersome: had to select many documents and then do feedback; had to select many terms; was based on statistical ranking methods which are hard for people to understand. Indirect relevance feedback can improve general ranking (see section on social search).
Usability of Grouping Search Results
The Need to Group: Interviews with lay users often reveal a desire for better organization of retrieval results. Useful for suggesting where to look next. People prefer links over generating search terms*, but only when the links are for what they want. *Ojakaar and Spool, Users Continue After Category Links, UIETips Newsletter, http://world.std.com/~uieweb/Articles/, 2001.
Conundrum: Everyone complains about disorganized search results. There are lots of ideas about how to organize them. Why don’t the major search engines do so? What works; what doesn’t?
Different Types of Grouping: Clusters (document similarity based) (polythetic): Scatter/Gather, Grouper. Keyword Sharing (any doc with keyword in group) (monothetic): Findex, DisCover. Single Category: Swish, Dynacat. Multiple (Faceted) Categories: Flamenco, Phlat/Stuff I’ve Seen. Monothetic vs. polythetic after Kummamuru et al., 2004.
Clusters: Fully automated. Potential benefits: Find the main themes in a set of documents (potentially useful if the user wants a summary of the main themes in the subcollection; potentially harmful if the user is interested in less dominant themes). More flexible than pre-defined categories (there may be important themes that have not been anticipated). Disambiguate ambiguous terms (e.g., ACL). Clustering retrieved documents tends to group those relevant to a complex query together. Hearst, Pedersen, Revisiting the Cluster Hypothesis, SIGIR’96.
Categories: Human-created, but often automatically assigned to items. Arranged in hierarchy, network, or facets. Can assign multiple categories to items, or place items within categories. Usually restricted to a fixed set, so help reduce the space of concepts. Intended to be readily understandable to those who know the underlying domain. Provide a novice with a conceptual structure. There are many already made up!
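The monothetic "keyword sharing" idea above (any document containing the keyword belongs to that keyword's group) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm of any of the named systems: the stop-word list, the tokenizer, and the choice to rank keywords by document frequency are all hypothetical.

```python
from collections import Counter

# Tiny illustrative stop list (an assumption, not from the slides).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and strip surrounding punctuation from each word."""
    return [w.strip(".,!?()'\"").lower() for w in text.split()]


def keyword_groups(snippets: list[str], max_groups: int = 3) -> dict[str, list[int]]:
    """Group result snippets under their most widely shared keywords.

    A snippet joins a group if it contains that group's keyword, so one
    snippet may appear in several groups.
    """
    doc_terms = [
        {t for t in tokenize(s) if t and t not in STOP_WORDS} for s in snippets
    ]
    doc_freq = Counter(term for terms in doc_terms for term in terms)
    # Rank by document frequency, breaking ties alphabetically.
    ranked = sorted(doc_freq, key=lambda t: (-doc_freq[t], t))
    return {
        kw: [i for i, terms in enumerate(doc_terms) if kw in terms]
        for kw in ranked[:max_groups]
    }


snippets = [
    "Star Wars film review",
    "Star formation in galaxies",
    "Film star wins award",
    "Galaxies and star clusters",
]
groups = keyword_groups(snippets)
```

Because membership depends on a single visible term, a user can see why a document landed in a group. In this toy data the query term itself ("star") becomes a group containing every result, so a practical version would probably exclude query terms.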
Cluster-based Grouping: Document Self-similarity (Polythetic)
Scatter/Gather Clustering: Developed at PARC in the late 80’s/early 90’s. Top-down approach: start with k seeds (documents) to represent k clusters; each document is assigned to the cluster with the most similar seeds. To choose the seeds: cluster in a bottom-up manner (hierarchical agglomerative clustering). Can recluster a cluster to produce a hierarchy of clusters. Pedersen, Cutting, Karger, Tukey, Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections, SIGIR 1992.
The Scatter/Gather Interface
Two Queries: Two Clusterings: AUTO, CAR, ELECTRIC vs. AUTO, CAR, SAFETY. The main differences are the clusters that are central to the query. Clusters (size, top terms): 8 control drive accident …; 25 battery california technology …; 48 import j. rate honda toyota …; 16 export international unit japan; 3 service employee automatic …; 6 control inventory integrate …; 10 investigation washington …; 12 study fuel death bag air …; 61 sale domestic truck import …; 11 japan export defect unite …
Scatter/Gather Evaluations: Can be slower to find answers than linear search! Difficult to understand the clusters. There is no consistency in results. However, the clusters do group relevant documents together. Participants noted that clusters were useful for eliminating irrelevant groups.
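The assignment step described for Scatter/Gather above (k seed documents, each remaining document placed with its most similar seed) can be sketched as follows. This is a simplified illustration, not the published algorithm: the seeds are passed in by hand rather than chosen by agglomerative clustering, and the bag-of-words cosine similarity is an assumed stand-in for whatever similarity measure the original system used.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def assign_to_seeds(docs: list[str], seed_indices: list[int]) -> dict[int, list[int]]:
    """Assign every document to the seed document it is most similar to."""
    vectors = [vectorize(d) for d in docs]
    clusters = {s: [] for s in seed_indices}
    for i, vec in enumerate(vectors):
        best = max(seed_indices, key=lambda s: cosine(vec, vectors[s]))
        clusters[best].append(i)
    return clusters


docs = [
    "electric car battery technology",
    "car safety air bag study",
    "battery technology california electric",
    "air bag death study safety",
]
clusters = assign_to_seeds(docs, seed_indices=[0, 1])
```

Reclustering a single cluster, as the slide mentions, would amount to calling the same function again on that cluster's documents with new seeds.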
S/G Example: query on 'star': Encyclopedia text. 14 sports; 8 symbols; 47 film, tv; 68 film, tv (p); 7 music; 97 astrophysics; 67 astronomy (p); 12 stellar phenomena; 10 flora/fauna; 49 galaxies, stars; 29 constellations; 7 miscellaneous. Clustering and re-clustering is entirely automated.
S/G Example: query on 'star': Newspaper/Magazine text. 22 products / business; 41 software / computers; 35 hollywood; 58 restaurants / food (reviews); 54 astronomers/movies; 98 movies / tv (reviews); 9 film mini-reviews; 31 wall street / finance. Topics quite different from encyclopedia text!
Visualizing Clustering Results: Use clustering to map the entire huge multidimensional document space into a huge number of small clusters. Use dimension reduction and then project these onto a 2D/3D graphical representation.
Clustering Visualizations (image from Wise et al. 95)
Kohonen Feature Maps (Lin 92, Chen et al. 97)
Are visual clusters useful?: Four clustering visualization usability studies. Conclusions: Huge 2D maps may be an inappropriate focus for information retrieval: cannot see what the documents are about; space is difficult to browse for IR purposes (tough to visualize abstract concepts). Perhaps more suited for pattern discovery and gist-like overviews.
Clustering for Search Study 1: This study compared a system with 2D graphical clusters, a system with 3D graphical clusters, and a system that shows textual clusters. Novice users. Only textual clusters were helpful (and they were difficult to use well). Kleiboemer, Lazear, and Pedersen. Tailoring a retrieval system for naive users.
SDAIR’96.
Clustering Study 2: Kohonen Feature Maps, Chen et al.: Comparison: Kohonen Map and Yahoo. Task: 'Window shop' for interesting home page; repeat with other interface. Results: Starting with map, could repeat in Yahoo (8/11). Starting with Yahoo, unable to repeat in map (2/14). Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7): 582-603 (1998).
Kohonen Feature Maps (Lin 92, Chen et al. 97)
Study 2 (cont.), Chen et al.: Participants liked: correspondence of region size to # documents; overview (but also wanted zoom); ease of jumping from one topic to another; multiple routes to topics; use of category and subcategory labels. Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7): 582-603 (1998).
Study 2 (cont.), Chen et al.: Participants wanted: hierarchical organization; other ordering of concepts (alphabetical); integration of browsing and search; correspondence of color to meaning; more meaningful labels; labels at same level of abstraction; fit more labels in the given space; combined keyword and category search; multiple category assignment (sports+entertain). (These can all be addressed with faceted categories.) Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7): 582-603 (1998).
Clustering Study 3: Sebrechts et al.: Each rectangle is a cluster. Larger clusters closer to the 'pole'. Similar clusters near one another. Opening a cluster causes a projection that shows the titles.
Study 3, Sebrechts et al.:
This study compared: 3D graphical clusters, 2D graphical clusters, textual clusters. 15 participants, between-subject design. Tasks: locate a particular document; locate and mark a particular document; locate a previously marked document; locate all clusters that discuss some topic; list more frequently represented topics. Sebrechts, Cugini, Laskowski, Vasilakis and Miller, Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces, SIGIR ’99.
Study 3, Sebrechts et al.: Results (time to locate targets): text clusters fastest, 2D next, 3D last. With practice (6 sessions) 2D neared text results; 3D still slower. Computer experts were just as fast with 3D. Certain tasks equally fast with 2D & text: find particular cluster; find an already-marked document. But anything involving text (e.g., find title) much faster with text. Spatial location rotated, so users lost context. Helpful viz features: color coding (helped text too); relative vertical locations.
Clustering Study 4: Compared several factors. Findings: Topic effects dominate (this is a common finding). Strong difference in results based on spatial ability. No difference between librarians and other people. No evidence of usefulness for the cluster visualization. Swan & Allan, Aspect windows, 3-D visualizations, and indirect comparisons of information retrieval systems, SIGIR 1998.
Summary: Visualizing for Search Using Clusters: Huge 2D maps may be an inappropriate focus for information retrieval: cannot see what the documents are about; space is difficult to browse for IR purposes (tough to visualize abstract concepts). Perhaps more suited for pattern discovery and gist-like overviews.
Clustering Algorithm Problems: Doesn’t work well if data is too homogeneous or too heterogeneous. Often is difficult to interpret quickly. Automatically generated labels are unintuitive and occur at different levels of description. Often the top level can be OK, but the subsequent levels are very poor. Need a better way to handle items that fall into more than one cluster.
Term-based Grouping: Single Term from Document Characterizes the Group (Monothetic)
Findex, Kaki & Aula: Two innovations. First, used a very simple method to create the groupings, so that it is not opaque to users: based on frequent keywords; doc is in category if it contains the keyword; allows docs to appear in multiple categories. Second, did a naturalistic, longitudinal study of use and analyzed the results in interesting ways. Kaki and Aula, Findex: Search Result Categories Help Users when Document Ranking Fails, CHI ’05.
Study Design: 16 academics (8F, 8M; no CS; frequent searchers). 2 months of use. Special log: 3099 queries issued, 3232 results accessed. Two questionnaires (at start and end). Google as search engine; rank order retained.
[Figures: After 1 Week; After 2 Months]
Kaki & Aula Key Findings (all significant): Category use takes almost 2 times longer than linear: first doc selected in 24.4 sec vs. 13.7 sec. No difference in average number of docs opened per search (1.05 vs.
1.04). However, when categories are used, users select >1 doc in 28.6% of the queries (vs. 13.6%). The number of searches with 0 result selections is lower when the categories are used. Median position of selected doc when using categories: 22 (sd=38); with just ranking: 2 (sd=8.6).
Kaki & Aula Key Findings: Category selections: 1915 category selections in 817 searches. Used in 26.4% of the searches. During the last 4 weeks of use, the proportion of searches using categories stayed above the average (27-39%). When categories used, selected 2.3 cats on average. Labels of selected cats used 1.9 words on average (average in general was 1.4 words). Out of 15 cats (default): first quartile at 2nd cat, median at 5th, third quartile at 9th.
Kaki & Aula Survey Results: Subjective opinions improved over time. Realization that categories are useful only some of the time. Freeform responses indicate that categories are useful when queries are vague, broad or ambiguous. Second survey indicated that people felt that their search habits began to change: consider query formulation less than before (27%); use less precise search terms (45%); use less time to evaluate results (36%); use categories for evaluating results (82%).
Conclusions from Kaki Study: Simplicity of category assignment made groupings understandable (my view, not stated by them). Keyword-based categories: are beneficial when result ranking fails; find results lower in the ranking; reduce empty results; may make it easier to access multiple results. Availability changed user querying behavior.
Highlight, Wu et al.: Select terms from document summaries, organize into a subsumption hierarchy. Highlight the terms in the retrieved documents. Wu, Shankar, Chen, Finding More Useful Information Faster from Web Search Results, CIKM ’03.
Highlight, Wu et al.:
First study: 19 undergraduates; used the system for their own queries; significant preference for the grouping interface. Second study: 6 participants; their own queries; accesses were sequential in linear interface; accesses went deeper in grouping interface; participants saved more documents per query.
Category-based Grouping: General Categories; Domain-Specific Categories.
SWISH, Chen & Dumais: 18 participants, 30 tasks, within subjects. Significant (and large, 50%) timing differences in favor of categories. For queries where the results are in the first page, the differences are much smaller. Strong subjective preferences. BUT: the baseline was quite poor and the queries were very cooked. Very small category set (13 categories). Subhierarchy wasn’t used. Chen, Dumais, Bringing Order to the Web: Automatically Categorizing Search Results, CHI 2000.
Test queries, Chen & Dumais: Chen, Dumais, Bringing Order to the Web: Automatically Categorizing Search Results, CHI 2000.
Dumais, Cutrell, Chen, Bringing Order to the Web, Optimizing Search by Showing Results in Context, CHI 2001.
Revisiting the Study, Dumais, Cutrell, Chen: This followup study reveals that the baseline had been unfairly weakened. The speedup isn’t so much from the category labels as the grouping of similar documents. For queries where the answer is in the first page, the category effects are not very strong.
DynaCat, Pratt, Hearst, and Fagan: Medical domain. Decide on important question types in advance: What are the adverse effects of drug D? What is the prognosis for treatment T?
Make use of MeSH categories. Retain only those types of categories known to be useful for this type of query. Pratt, W., Hearst, M., and Fagan, L., A Knowledge-Based Approach to Organizing Retrieved Documents, AAAI-99.
DynaCat Study, Pratt, Hearst & Fagan: Design: three queries; 24 cancer patients; compared three interfaces (ranked list, clusters, categories). Results: participants strongly preferred categories; participants found more answers using categories; participants took same amount of time with all three interfaces. Pratt, W., Hearst, M., and Fagan, L., A Knowledge-Based Approach to Organizing Retrieved Documents, AAAI-99.
Faceted Category Grouping: Multiple Categories per Document
Search Usability Design Goals: Strive for consistency. Provide shortcuts. Offer informative feedback. Design for closure. Provide simple error handling. Permit easy reversal of actions. Support user control. Reduce short-term memory load. From Shneiderman, Byrd, & Croft, Clarifying Search, DLIB Magazine, Jan 1997. www.dlib.org
How to Structure Information for Search and Browsing?: Hierarchy is too rigid. KL-One is too complex. Hierarchical faceted metadata: a useful middle ground.
The Problem with Hierarchy: Inflexible: forces the user to start with a particular category. What if I don’t know the animal’s diet, but the interface makes me start with that category?
Wasteful: have to repeat combinations of categories; makes for extra clicking and extra coding. Difficult to modify: to add a new category type, must duplicate it everywhere or change things everywhere.
The Idea of Facets: Facets are a way of labeling data. A kind of metadata (data about data). Can be thought of as properties of items. Facets vs. categories: items are placed INTO a category system; multiple facet labels are ASSIGNED TO items.
The Idea of Facets: Create INDEPENDENT categories (facets). Each facet has labels (sometimes arranged in a hierarchy). Assign labels from the facets to every item. Example: recipe collection. Course: Main Course. Cooking Method: Stir-fry. Cuisine: Thai. Ingredient: Bell Pepper, Curry, Chicken.
The Idea of Facets: Break out all the important concepts into their own facets. Sometimes the facets are hierarchical; assign labels to items from any level of the hierarchy. Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze. Desserts: Cakes, Cookies, Dairy (Ice Cream, Sorbet, Flan). Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple.
Using Facets: Now there are multiple ways to get to each item. Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze. Desserts: Cakes, Cookies, Dairy (Ice Cream, Sherbet, Flan). Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple. For example: Fruit > Pineapple, Dessert > Cake, Preparation > Bake; or Dessert > Dairy > Sherbet, Fruit > Berries > Strawberries, Preparation > Freeze.
Using Facets: The system only shows the labels that correspond to the current set of items. Start with all items and all facets. The user then selects a label within a facet. This reduces the set of items (only those that have been assigned to the subcategory label are displayed). This also eliminates some subcategories from the view.
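The facet interaction described above (select a label, shrink the item set, show only the labels that still apply) can be sketched in Python. This is a hedged illustration and not the Flamenco implementation: the item names, the `ITEMS` structure, and the function names are hypothetical, and the facet hierarchies are flattened to single labels to keep the example short.

```python
from collections import Counter

# Hypothetical mini-collection; each item carries labels from independent facets.
ITEMS = {
    "pineapple cake": {
        "Fruit": {"Pineapple"}, "Dessert": {"Cake"}, "Preparation": {"Bake"},
    },
    "strawberry sherbet": {
        "Fruit": {"Strawberries"}, "Dessert": {"Sherbet"}, "Preparation": {"Freeze"},
    },
    "blueberry cake": {
        "Fruit": {"Blueberries"}, "Dessert": {"Cake"}, "Preparation": {"Bake"},
    },
}


def refine(items: dict, selections: list[tuple[str, str]]) -> dict:
    """Keep only the items that carry every selected (facet, label) pair."""
    return {
        name: facets
        for name, facets in items.items()
        if all(label in facets.get(facet, set()) for facet, label in selections)
    }


def visible_labels(items: dict) -> dict[str, Counter]:
    """For each facet, count the labels still present in the current item set."""
    counts: dict[str, Counter] = {}
    for facets in items.values():
        for facet, labels in facets.items():
            counts.setdefault(facet, Counter()).update(labels)
    return counts


baked = refine(ITEMS, [("Preparation", "Bake")])
labels = visible_labels(baked)
```

Because the label counts are always computed from the current item set, every label offered leads to at least one item, which is why facet navigation alone cannot produce an empty result set.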
Flamenco Usability Studies: Usability studies done on 3 collections: Recipes (13,000 items), Architecture Images (40,000 items), Fine Arts Images (35,000 items). Conclusions: Users like and are successful with the dynamic faceted hierarchical metadata, especially for browsing tasks. Very positive results, in contrast with studies on earlier iterations. Yee, K-P., Swearingen, K., Li, K., and Hearst, M., Faceted Metadata for Image Search and Browsing, in CHI 2003.
Flamenco Study Post-Interface Assessments: All significant at p<.05 except 'simple' and 'overwhelming'. Yee, K-P., Swearingen, K., Li, K., and Hearst, M., Faceted Metadata for Image Search and Browsing, in CHI 2003.
Flamenco Study Post-Test Comparison: Faceted vs. Baseline. Overall assessment: more useful for your tasks; easiest to use; most flexible; more likely to result in dead ends; helped you learn more; overall preference. Which interface preferable for: find images of roses; find all works from a given period; find pictures by 2 artists in same media. Yee, K-P., Swearingen, K., Li, K., and Hearst, M., Faceted Metadata for Image Search and Browsing, in CHI 2003.
The Advantages of Facets: Lets the user decide how to start, and how to explore and group. After refinement, categories that are not relevant to the current results disappear. Seamlessly integrates keyword search with the organizational structure. Very easy to expand out (loosen constraints). Very easy to build up complex queries. Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, P., Finding the Flow in Web Site Search, Communications of the ACM, 45 (9), September 2002, pp.42-49.
Advantages of Facets: Can’t end up with empty results sets (except with keyword search). Helps avoid feelings of being lost. Easier to explore the collection.
Helps users infer what kinds of things are in the collection. Evokes a feeling of 'browsing the shelves' Is preferred over standard search for collection browsing in usability studies. (Interface must be designed properly) Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, P., Finding the Flow in Web Site Search, Communications of the ACM, 45 (9), September 2002, pp.42-49 Advantages of Facets: Advantages of Facets Seamless to add new facets and subcategories Seamless to add new items. Helps with 'categorization wars' Don’t have to agree exactly where to place something Interaction can be implemented using a standard relational database. May be easier for automatic categorization Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, P., Finding the Flow in Web Site Search, Communications of the ACM, 45 (9), September 2002, pp.42-49 Summary: Evaluation Good Ideas : Summary: Evaluation Good Ideas Longitudinal studies of real use Match the participants to the content of the collection and the tasks Test against a strong baseline Summary: Evaluation Problems: Summary: Evaluation Problems Bias participants towards a system 'Try our interface' versus linear view Tailor tasks unrealistically to benefit the target interface Impoverish the baseline relative to the test condition Conflate test conditions Summary: Grouping Search Results: Summary: Grouping Search Results Grouping search results seems beneficial in two circumstances: General web search, using transparent labeling (monothetic terms) or category labels rather than cluster centroids. Effects: Works primarily on ambiguous queries, (so used a fraction of the time) Promotes relevant results up from below the first page of hits So important to group the related items together visually Users tend to select more documents than with linear search May work even better with meta-search Positive subjective responses (small studies) Visualization does not work. 
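The "Advantages of Facets" slides above note that the interaction can be implemented using a standard relational database. As a rough sketch of what that might look like, the example below stores one row per (item, facet, label) assignment in an in-memory SQLite database and uses a self-join to count the labels that remain after one selection. The table name `item_label`, the function `label_counts`, and the sample rows are assumptions made for illustration; the slides do not describe Flamenco's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per (item, facet, label) assignment.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item_label (item TEXT, facet TEXT, label TEXT);
    INSERT INTO item_label VALUES
        ('pineapple cake', 'Dessert', 'Cake'),
        ('pineapple cake', 'Fruit', 'Pineapple'),
        ('pineapple cake', 'Preparation', 'Bake'),
        ('strawberry sherbet', 'Dessert', 'Sherbet'),
        ('strawberry sherbet', 'Fruit', 'Strawberries'),
        ('strawberry sherbet', 'Preparation', 'Freeze'),
        ('banana ice cream', 'Dessert', 'Ice Cream'),
        ('banana ice cream', 'Fruit', 'Bananas'),
        ('banana ice cream', 'Preparation', 'Freeze');
    """
)


def label_counts(conn, facet, label):
    """Return (facet, label, item count) rows for items carrying the selection."""
    return conn.execute(
        """
        SELECT other.facet, other.label, COUNT(DISTINCT other.item)
        FROM item_label AS chosen
        JOIN item_label AS other ON other.item = chosen.item
        WHERE chosen.facet = ? AND chosen.label = ?
        GROUP BY other.facet, other.label
        ORDER BY other.facet, other.label
        """,
        (facet, label),
    ).fetchall()


for row in label_counts(conn, "Preparation", "Freeze"):
    print(row)
```

Only labels attached to at least one matching item come back from the query, so the interface never offers a refinement that would produce an empty result set.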
Summary: Grouping Search Results: Grouping search results seems beneficial in two circumstances. Second: collection navigation with faceted categories. Multiple angles better than single categories. 'Searchers' turn into 'browsers'. Becoming commonplace in e-commerce, digital libraries, and other kinds of collections. Extends naturally to tags. Positive subjective responses (small studies).
Social Tagging and Search
Slide 127: [Diagram: Search; Topical Metadata; Structured, Flexible Navigation]
Problem with Metadata-Oriented Approaches: Getting the metadata!
Slide 129: [Diagram: Search; Topical Metadata; Social question answering; Recorded Human Interaction; Click-through ranking; Inferred recommendations]
Human Real-time Question Answering: More popular in Korea than algorithmic search. Maybe fewer good web pages? Maybe more social society? Several examples in US: Yahoo Answers (recently released and successful), wondir.com, answerbag.com.
Yahoo Answers (also answerbag.com, wondir.com, etc)
Yahoo Answers appearing in search results
answerbag.com
Using User Behavior as Implicit Preferences: Search click-through experimentally shown to boost search rankings for top results (Joachims et al. '05, Agichtein et al. '06). Works ok even if non-relevant documents examined. Best in combination with sophisticated search algorithms. Doesn't work well for ambiguous queries. Aggregates of movie and book selections comprise implicit recommendations.
Slide 135: [Diagram: Search; Topical Metadata; Recorded Human Interaction; Social Tagging (photos, bookmarks); Game-based tagging]
Social Tagging: Metadata assignment without all the bother. Spontaneous, easy, and tends towards single terms.
Issues with Photo and Web link Tagging: There is a strong personal component: marking for my own reminders, marking for my circle of friends. There is also a strong social component: try to promote certain tags to make them more popular, or post to popular tags to see your influence rise.
Tagging Games: Assigning metadata is fun! (ESP game, von Ahn) No need for reputation system, etc. Pay people to do it: MyCroft (iSchool student project). Drawback: least common denominator labels. Experts already label their own data or that about which they have expertise, e.g., protein function, Wikipedia.
Slide 139: [Diagram: Search; Topical Metadata; Social question answering; Recorded Human Interaction; Social Tagging (photos, bookmarks); Click-through ranking; Inferred recommendations; Game-based tagging; ????]
Expert-Oriented Tagging in Search: Already happening at Google co-op. Shows up in certain types of search results.
Expert-Oriented Tagging: Already happening at Google co-op. Shows up in certain types of search results.
Promoting Expertise-Oriented Tagging: Research area: user interfaces. To make rapid-feedback suggestions of pre-established tags (like type-ahead queries). To incentivize labeling and make it fun. To allow the personal aspects to shine through.
Promoting Expertise-Oriented Tagging: Research area: NLP algorithms. (We have an algorithm to build facets from text.) To convert tags into facet hierarchies. To capture implicit labeling information.
Promoting Expertise-Oriented Tagging: Research area: digital infrastructure. Extending tagging games. Build an architecture that channels specialized subproblems to appropriate experts. Example: we now know there is a green plant in an office; direct this to the botany > houseplants experts.
Promoting Expertise-Oriented Tagging: Research area: economics and sociology. What are the right incentive structures?
Using Implicit Preferences: Extend implicit recommendation technology to online catalog use.
Final Words: User interfaces for search remain a fascinating and challenging field. Search has taken a primary role in the web and internet business. Thus, we can expect fascinating developments, and maybe some breakthroughs, in the next few years!
People Don’t Understand Search Technology: People Don’t Understand Search Technology A study of 100 randomly-chosen people found: 14% never type a url directly into the address bar Several tried to use the address bar, but did it wrong Put spaces between words Combinations of dots and spaces 'nursing spectrum.com' 'consumer reports.com' Several use search form with no spaces 'plumber’slocal9' 'capitalhealthsystem' People do not understand the use of quotes Only 16% use quotes Of these, some use them incorrectly Around all of the words, making results too restrictive 'lactose intolerance –recipies' Here the – excludes the recipes People don’t make use of 'advanced' features Only 1 used 'find in page' Only 2 used Google cache Hargattai, Classifying and Coding Online Actions, Social Science Computer Review 22(2), 2004 210-227. People Don’t Understand Search Technology: People Don’t Understand Search Technology Without appropriate explanations, most of 14 people had strong misconceptions about: ANDing vs ORing of search terms Some assumed ANDing search engine indexed a smaller collection; most had no explanation at all For empty results for query 'to be or not to be' 9 of 14 could not explain in a method that remotely resembled stop word removal For term order variation 'boat fire' vs. 'fire boat' Only 5 out of 14 expected different results Understanding was vague, e.g.: 'Lycos separates the two words and searches for the meaning, instead of what’re your looking for. Google understands the meaning of the phrase.' Muramatsu andamp; Pratt, 'Transparent Queries: Investigating Users’ Mental Models of Search Engines, SIGIR 2001. What Works in Search Interfaces?: What Works in Search Interfaces? What Works for Search Interfaces?: What Works for Search Interfaces? 
Query term highlighting in results listings in retrieved documents Sorting of search results according to important criteria (date, author) Grouping of results according to well-organized category labels (see Flamenco) DWIM only if highly accurate: Spelling correction/suggestions Simple relevance feedback (more-like-this) Certain types of term expansion So far: not really visualization Hearst et al: Finding the Flow in Web Site Search, CACM 45(9), 2002. Highlighting Query Terms: Highlighting Query Terms Boldface or color Adjacency of terms with relevant context is a useful cue. Slide27: Slide28: Highlighted query term hits using Google toolbar: Highlighted query term hits using Google toolbar US Blackout PGA Microsoft Microso Small Details Matter: Small Details Matter UIs for search especially require great care in small details In part due to the text-heavy nature of search A tension between more information and introducing clutter How and where to place things important People tend to scan or skim Only a small percentage reads instructions Small Details Matter: Small Details Matter UIs for search especially require endless tiny adjustments In part due to the text-heavy nature of search Example: In an earlier version of the Google Spellchecker, people didn’t always see the suggested correction Used a long sentence at the top of the page: 'If you didn’t find what you were looking for …' People complained they got results, but not the right results. In reality, the spellchecker had suggested an appropriate correction. 
Interview with Marissa Mayer by Mark Hurst: http://www.goodexperience.com/columns/02/1015google.html Small Details Matter: Small Details Matter The fix: Analyzed logs, saw people didn’t see the correction: clicked on first search result, didn’t find what they were looking for (came right back to the search page scrolled to the bottom of the page, did not find anything and then complained directly to Google Solution was to repeat the spelling suggestion at the bottom of the page. More adjustments: The message is shorter, and different on the top vs. the bottom Interview with Marissa Mayer by Mark Hurst: http://www.goodexperience.com/columns/02/1015google.html Slide33: Using DWIM: Using DWIM DWIM – Do What I Mean Refers to systems that try to be 'smart' by guessing users’ unstated intentions or desires Examples: Automatically augment my query with related terms Automatically suggest spelling corrections Automatically load web pages that might be relevant to the one I’m looking at Automatically file my incoming email into folders Pop up a paperclip that tells me what kind of help I need. THE CRITICAL POINT: Users love DWIM when it really works Users DESPISE it when it doesn’t DWIM that Works: DWIM that Works Amazon’s 'customers who bought X also bought Y' And many other recommendation-related features DWIM Example: Spelling Correction/Suggestion: DWIM Example: Spelling Correction/Suggestion Google’s spelling suggestions are highly accurate But this wasn’t always the case. Google introduced a version that wasn’t very accurate. People hated it. They pulled it. (According to a talk by Marissa Mayer of Google.) Later they introduced a version that worked well. People love it. But don’t get too pushy. 
For a while if the user got very few results, the page was automatically replaced with the results of the spelling correction This was removed, presumably due to negative responses Information from a talk by Marissa Mayer of Google Query Reformulation: Query Reformulation Query reformulation: After receiving unsuccessful results, users modify their initial queries and submit new ones intended to more accurately reflect their information needs. Web search logs show that searchers often reformulate their queries A study of 985 Web user search sessions found 33% went beyond the first query Of these, ~35% retained the same number of terms while 19% had 1 more term and 16% had 1 fewer Use of query reformulation and relevance feedback by Excite users, Spink, Janson andamp; Ozmultu, Internet Research 10(4), 2001 Query Reformulation: Query Reformulation Many studies show that if users engage in relevance feedback, the results are much better. In one study, participants did 17-34% better with RF They also did better if they could see the RF terms than if the system did it automatically (DWIM) But the effort required for doing so is usually a roadblock. Koenemann andamp; Belkin, A Case for Interaction: A Study of Interactive Information Retrieval Behavior and Effectiveness, CHI’96 Query Reformulation: Query Reformulation What happens when the web search engines suggests new terms? Web log analysis study using the Prisma term suggestion system: Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03. Query Reformulation Study: Query Reformulation Study Feedback terms were displayed to 15,133 user sessions. 
Of these, 14% used at least one feedback term For all sessions, 56% involved some degree of query refinement Within this subset, use of the feedback terms was 25% By user id, ~16% of users applied feedback terms at least once on any given day Looking at a 2-week session of feedback users: Of the 2,318 users who used it once, 47% used it again in the same 2-week window. Comparison was also done to a baseline group that was not offered feedback terms. Both groups ended up making a page-selection click at the same rate. Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03. Query Reformulation Study: Query Reformulation Study Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03. Query Reformulation Study: Query Reformulation Study Other observations Users prefer refinements that contain the initial query terms Presentation order does have an influence on term uptake Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03. Query Reformulation Study: Query Reformulation Study Types of refinements Anick, Using Terminological Feedback for Web Search Refinement – A Log-based Study, SIGIR’03. 
Prognosis: Query Reformulation: Prognosis: Query Reformulation Researchers have always known it can be helpful, but the methods proposed for user interaction were too cumbersome Had to select many documents and then do feedback Had to select many terms Was based on statistical ranking methods which are hard for people to understand Indirect Relevance Feedback can improve general ranking (see section on social search) Usability of Grouping Search Results: Usability of Grouping Search Results The Need to Group: The Need to Group Interviews with lay users often reveal a desire for better organization of retrieval results Useful for suggesting where to look next People prefer links over generating search terms* But only when the links are for what they want *Ojakaar and Spool, Users Continue After Category Links, UIETips Newsletter, http://world.std.com/~uieweb/Articles/, 2001 Slide47: Slide48: Slide49: Slide50: Conundrum: Conundrum Everyone complains about disorganized search results. There are lots of ideas about how to organize them. Why don’t the major search engines do so? What works; what doesn’t? 
Different Types of Grouping: Different Types of Grouping Clusters (Document similarity based) (polythetic) Scatter/Gather Grouper Keyword Sharing (any doc with keyword in group) (monothetic) Findex DisCover Single Category Swish Dynacat Multiple (Faceted) Categories Flamenco Phlat/Stuff I’ve seen Monothetic vs Polythetic After Kummamuru et al, 2004 Clusters: Clusters Fully automated Potential benefits: Find the main themes in a set of documents Potentially useful if the user wants a summary of the main themes in the subcollection Potentially harmful if the user is interested in less dominant themes More flexible than pre-defined categories There may be important themes that have not been anticipated Disambiguate ambiguous terms ACL Clustering retrieved documents tends to group those relevant to a complex query together Hearst, Pedersen, Revisiting the Cluster Hypothesis, SIGIR’96 Categories: Categories Human-created But often automatically assigned to items Arranged in hierarchy, network, or facets Can assign multiple categories to items Or place items within categories Usually restricted to a fixed set So help reduce the space of concepts Intended to be readily understandable To those who know the underlying domain Provide a novice with a conceptual structure There are many already made up! 
Cluster-based Grouping: Cluster-based Grouping Document Self-similarity (Polythetic) Scatter/Gather Clustering: Scatter/Gather Clustering Developed at PARC in the late 80’s/early 90’s Top-down approach Start with k seeds (documents) to represent k clusters Each document assigned to the cluster with the most similar seeds To choose the seeds: Cluster in a bottom-up manner Hierarchical agglomerative clustering Can recluster a cluster to produce a hierarchy of clusters Pedersen, Cutting, Karger, Tukey, Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections, SIGIR 1992 The Scatter/Gather Interface: The Scatter/Gather Interface Two Queries: Two Clusterings: Two Queries: Two Clusterings AUTO, CAR, ELECTRIC AUTO, CAR, SAFETY The main differences are the clusters that are central to the query 8 control drive accident … 25 battery california technology … 48 import j. rate honda toyota … 16 export international unit japan 3 service employee automatic … 6 control inventory integrate … 10 investigation washington … 12 study fuel death bag air … 61 sale domestic truck import … 11 japan export defect unite … Scatter/Gather Evaluations: Scatter/Gather Evaluations Can be slower to find answers than linear search! Difficult to understand the clusters. There is no consistence in results. However, the clusters do group relevant documents together. Participants noted that useful for eliminating irrelevant groups. 
S/G Example: query on “star”: S/G Example: query on 'star' Encyclopedia text 14 sports 8 symbols 47 film, tv 68 film, tv (p) 7 music 97 astrophysics 67 astronomy(p) 12 steller phenomena 10 flora/fauna 49 galaxies, stars 29 constellations 7 miscelleneous Clustering and re-clustering is entirely automated Slide61: Slide62: S/G Example: query on “star”: S/G Example: query on 'star' Newspaper/Magazine text 22 products / business 41 software / computers 35 hollywood 58 restaurants / food (reviews) 54 astronomers/movies 98 movies / tv (reviews) 9 film mini-reviews 31 wall street / finance Topics quite different from encyclopedia text! Visualizing Clustering Results: Visualizing Clustering Results Use clustering to map the entire huge multidimensional document space into a huge number of small clusters. User dimension reduction and then project these onto a 2D/3D graphical representation Clustering Visualizationsimage from Wise et al 95: Clustering Visualizations image from Wise et al 95 Clustering Visualizations(image from Wise et al 95): Clustering Visualizations (image from Wise et al 95) Slide67: Kohonen Feature Maps (Lin 92, Chen et al. 97) Are visual clusters useful?: Are visual clusters useful? Four Clustering Visualization Usability Studies Conclusions: Huge 2D maps may be inappropriate focus for information retrieval cannot see what the documents are about space is difficult to browse for IR purposes (tough to visualize abstract concepts) Perhaps more suited for pattern discovery and gist-like overviews. Clustering for Search Study 1: Clustering for Search Study 1 This study compared a system with 2D graphical clusters a system with 3D graphical clusters a system that shows textual clusters Novice users Only textual clusters were helpful (and they were difficult to use well) Kleiboemer, Lazear, and Pedersen. Tailoring a retrieval system for naive users. 
SDAIR’96 Clustering Study 2: Kohonen Feature Maps, Chen et al.: Clustering Study 2: Kohonen Feature Maps, Chen et al. Comparison: Kohonen Map and Yahoo Task: 'Window shop' for interesting home page Repeat with other interface Results: Starting with map could repeat in Yahoo (8/11) Starting with Yahoo unable to repeat in map (2/14) Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7): 582-603 (1998) Slide71: Kohonen Feature Maps (Lin 92, Chen et al. 97) Study 2 (cont.), Chen et al.: Study 2 (cont.), Chen et al. Participants liked: Correspondence of region size to # documents Overview (but also wanted zoom) Ease of jumping from one topic to another Multiple routes to topics Use of category and subcategory labels Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7): 582-603 (1998) Study 2 (cont.), Chen et al.: Study 2 (cont.), Chen et al. Participants wanted: hierarchical organization other ordering of concepts (alphabetical) integration of browsing and search correspondence of color to meaning more meaningful labels labels at same level of abstraction fit more labels in the given space combined keyword and category search multiple category assignment (sports+entertain) (These can all be addressed with faceted categories) Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7): 582-603 (1998) Clustering Study 3: Sebrechts et al.: Clustering Study 3: Sebrechts et al. Each rectangle is a cluster. Larger clusters closer to the 'pole'. Similar clusters near one another. Opening a cluster causes a projection that shows the titles. Study 3, Sebrechts et al.: Study 3, Sebrechts et al. 
This study compared: 3D graphical clusters 2D graphical clusters textual clusters 15 participants, between-subject design Tasks Locate a particular document Locate and mark a particular document Locate a previously marked document Locate all clusters that discuss some topic List more frequently represented topics Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces Sebrechts, Cugini, Laskowski, Vasilakis and Miller, SIGIR ‘99. Study 3, Sebrechts et al.: Study 3, Sebrechts et al. Results (time to locate targets) Text clusters fastest 2D next 3D last With practice (6 sessions) 2D neared text results; 3D still slower Computer experts were just as fast with 3D Certain tasks equally fast with 2D andamp; text Find particular cluster Find an already-marked document But anything involving text (e.g., find title) much faster with text. Spatial location rotated, so users lost context Helpful viz features Color coding (helped text too) Relative vertical locations Clustering Study 4: Clustering Study 4 Compared several factors Findings: Topic effects dominate (this is a common finding) Strong difference in results based on spatial ability No difference between librarians and other people No evidence of usefulness for the cluster visualization Aspect windows, 3-D visualizations, and indirect comparisons of information retrieval systems, Swan, andamp;Allan, SIGIR 1998. Summary:Visualizing for Search Using Clusters: Summary: Visualizing for Search Using Clusters Huge 2D maps may be inappropriate focus for information retrieval cannot see what the documents are about space is difficult to browse for IR purposes (tough to visualize abstract concepts) Perhaps more suited for pattern discovery and gist-like overviews. 
Clustering Algorithm Problems: Clustering Algorithm Problems Doesn’t work well if data is too homogenous or too heterogeneous Often is difficult to interpret quickly Automatically generated labels are unintuitive and occur at different levels of description Often the top-level can be ok, but the subsequent levels are very poor Need a better way to handle items that fall into more than one cluster Term-based Grouping: Term-based Grouping Single Term from Document Characterizes the Group (Monothetic) Findex, Kaki & Aula: Findex, Kaki andamp; Aula Two innovations: Used very simple method to create the groupings, so that it is not opaque to users Based on frequent keywords Doc is in category if it contains the keyword Allows docs to appear in multiple categories Did a naturalistic, longitudinal study of use Analyzed the results in interesting ways Kaki and Aula: 'Findex: Search Result Categories Help Users when Document Ranking Fails', CHI ‘05 Slide82: Study Design: Study Design 16 academics 8F, 8M No CS Frequent searchers 2 months of use Special Log 3099 queries issued 3232 results accessed Two questionnaires (at start and end) Google as search engine; rank order retained Slide84: After 1 Week After 2 Months Kaki & Aula Key Findings (all significant): Kaki andamp; Aula Key Findings (all significant) Category use takes almost 2 times longer than linear First doc selected in 24.4 sec vs 13.7 sec No difference in average number of docs opened per search (1.05 vs. 
1.04) However, when categories used, users select andgt;1 doc in 28.6% of the queries (vs 13.6%) Num of searches without 0 result selections is lower when the categories are used Median position of selected doc when: Using categories: 22 (sd=38) Just ranking: 2 (sd=8.6) Kaki & Aula Key Findings: Kaki andamp; Aula Key Findings Category Selections 1915 categories selections in 817 searches Used in 26.4% of the searches During the last 4 weeks of use, the proportion of searches using categories stayed above the average (27-39%) When categories used, selected 2.3 cats on average Labels of selected cats used 1.9 words on average (average in general was 1.4 words) Out of 15 cats (default): First quartile at 2nd cat Median at 5th Third quartile at 9th Kaki & Aula Survey Results: Kaki andamp; Aula Survey Results Subjective opinions improved over time Realization that categories useful only some of the time Freeform responses indicate that categories useful when queries vague, broad or ambiguous Second survey indicated that people felt that their search habits began to change Consider query formulation less than before (27%) Use less precise search terms (45%) Use less time to evaluate results (36%) Use categories for evaluating results (82%) Conclusions from Kaki Study: Conclusions from Kaki Study Simplicity of category assignment made groupings understandable (my view, not stated by them) Keyword-based Categories: Are beneficial when result ranking fails Find results lower in the ranking Reduce empty results May make it easier to access multiple results Availability changed user querying behavior Highlight, Wu et al.: Highlight, Wu et al. Select terms from document summaries, organize into a subsumption hierarchy. Highlight the terms in the retrieved documents. Wu, Shankar, Chen, Finding More Useful Information Faster from Web Search Results CICM ‘03 Slide90: Slide91: Slide92: Highlight, Wu et al.: Highlight, Wu et al. 
First study: 19 undergraduates Used the system for their own queries Significant preference for the grouping interface Second study: 6 participants Their own queries Accesses were sequential in linear interface Accesses went deeper in grouping interface Participants saved more documents per query Category-based Grouping: Category-based Grouping General Categories Domain-Specific Categories Slide95: SWISH, Chen & Dumais: SWISH, Chen andamp; Dumais 18 participants, 30 tasks, within subjects Significant (and large, 50%) timing differences in favor of categories For queries where the results are in the first page, the differences are much smaller. Strong subjective preferences. BUT: the baseline was quite poor and the queries were very cooked. Very small category set (13 categories) Subhierarchy wasn’t used. Chen, Dumais, Bringing Order to the Web: Automatically Categorizing Search Results CHI 2000 Test queries, Chen & Dumais : Test queries, Chen andamp; Dumais Chen, Dumais, Bringing Order to the Web, Automatically Categorizing Search Results. CHI 2000 Slide98: Dumais, Cutrell, Chen, Bringing Order to the Web, Optimizing Search by Showing Results in Context, CHI 2001 Slide99: Revisiting the Study, Dumais, Cutrell, Chen Slide100: Revisiting the Study, Dumais, Cutrell, Chen Slide101: Revisiting the Study, Dumais, Cutrell, Chen Revisiting the Study, Dumais, Cutrell, Chen: This followup study reveals that the baseline had been unfairly weakened. The speedup isn’t so much from the category labels as the grouping of similar documents. For queries where the answer is in the first page, the category effects are not very strong. Revisiting the Study, Dumais, Cutrell, Chen DynaCat, Pratt, Hearst, and Fagan.: DynaCat, Pratt, Hearst, and Fagan. Medical Domain Decide on important question types in an advance What are the adverse effects of drug D? What is the prognosis for treatment T? 
Make use of MeSH categories. Retain only those types of categories known to be useful for this type of query. Pratt, W., Hearst, M., and Fagan, L., A Knowledge-Based Approach to Organizing Retrieved Documents, AAAI-99.
DynaCat, Pratt, Hearst, & Fagan: Pratt, W., Hearst, M., and Fagan, L., A Knowledge-Based Approach to Organizing Retrieved Documents, AAAI-99.
DynaCat Study, Pratt, Hearst & Fagan: Design: three queries; 24 cancer patients; compared three interfaces (ranked list, clusters, categories). Results: participants strongly preferred categories; participants found more answers using categories; participants took same amount of time with all three interfaces. Pratt, W., Hearst, M., and Fagan, L., A Knowledge-Based Approach to Organizing Retrieved Documents, AAAI-99.
DynaCat study, Pratt et al.:
Faceted Category Grouping: Multiple Categories per Document.
Search Usability Design Goals: Strive for Consistency; Provide Shortcuts; Offer Informative Feedback; Design for Closure; Provide Simple Error Handling; Permit Easy Reversal of Actions; Support User Control; Reduce Short-term Memory Load. From Shneiderman, Byrd, & Croft, Clarifying Search, DLIB Magazine, Jan 1997. www.dlib.org
How to Structure Information for Search and Browsing?: Hierarchy is too rigid. KL-One is too complex. Hierarchical faceted metadata: a useful middle ground.
The Problem with Hierarchy: Inflexible: force the user to start with a particular category. What if I don’t know the animal’s diet, but the interface makes me start with that category?
Wasteful: have to repeat combinations of categories; makes for extra clicking and extra coding. Difficult to modify: to add a new category type, must duplicate it everywhere or change things everywhere.
The Idea of Facets: Facets are a way of labeling data. A kind of metadata (data about data). Can be thought of as properties of items. Facets vs. categories: items are placed INTO a category system; multiple facet labels are ASSIGNED TO items.
The Idea of Facets: Create INDEPENDENT categories (facets). Each facet has labels (sometimes arranged in a hierarchy). Assign labels from the facets to every item. Example: recipe collection. Course: Main Course. Cooking Method: Stir-fry. Cuisine: Thai. Ingredient: Bell Pepper, Curry, Chicken.
The Idea of Facets: Break out all the important concepts into their own facets. Sometimes the facets are hierarchical. Assign labels to items from any level of the hierarchy. Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze. Desserts: Cakes, Cookies, Dairy (Ice Cream, Sorbet, Flan). Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple.
Using Facets: Now there are multiple ways to get to each item. Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze. Desserts: Cakes, Cookies, Dairy (Ice Cream, Sherbet, Flan). Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple. Fruit > Pineapple; Dessert > Cake; Preparation > Bake. Dessert > Dairy > Sherbet; Fruit > Berries > Strawberries; Preparation > Freeze.
Using Facets: The system only shows the labels that correspond to the current set of items. Start with all items and all facets. The user then selects a label within a facet. This reduces the set of items (only those that have been assigned to the subcategory label are displayed). This also eliminates some subcategories from the view.
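The select-and-narrow loop described on the Using Facets slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the Flamenco implementation: the three recipe items, the tuple encoding of hierarchical labels, and the function names `matches`, `refine`, and `visible_labels` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical recipe items. Each facet maps to a list of label paths;
# a path such as ("Berries", "Strawberries") is a label inside a hierarchy.
ITEMS = {
    "pineapple upside-down cake": {
        "Fruit": [("Pineapple",)],
        "Dessert": [("Cake",)],
        "Preparation": [("Bake",)],
    },
    "strawberry sherbet": {
        "Fruit": [("Berries", "Strawberries")],
        "Dessert": [("Dairy", "Sherbet")],
        "Preparation": [("Freeze",)],
    },
    "blueberry cookies": {
        "Fruit": [("Berries", "Blueberries")],
        "Dessert": [("Cookies",)],
        "Preparation": [("Bake",)],
    },
}


def matches(item_facets, facet, path):
    """True if the item carries a label at or below `path` within `facet`."""
    return any(label[:len(path)] == path for label in item_facets.get(facet, []))


def refine(items, selections):
    """Keep only the items that satisfy every (facet, path) selection."""
    return {
        name: facets
        for name, facets in items.items()
        if all(matches(facets, facet, path) for facet, path in selections)
    }


def visible_labels(items):
    """Per facet, count the items under each label at every hierarchy level.

    Labels that no remaining item carries never appear in the result,
    which is how subcategories drop out of the view after a selection.
    """
    counts = {}
    for facets in items.values():
        for facet, labels in facets.items():
            seen = set()
            for label in labels:
                for depth in range(1, len(label) + 1):
                    seen.add(label[:depth])
            counts.setdefault(facet, Counter()).update(seen)
    return counts
```

Calling `refine(ITEMS, [("Preparation", ("Bake",))])` keeps the cake and the cookies, and `visible_labels` on that result no longer lists `("Dairy",)` under Dessert, since no remaining item carries it. Selections are combined with AND across facets, which is one common design choice; other combinations are possible.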
Flamenco Usability Studies: Usability studies done on 3 collections: Recipes: 13,000 items; Architecture Images: 40,000 items; Fine Arts Images: 35,000 items. Conclusions: Users like and are successful with the dynamic faceted hierarchical metadata, especially for browsing tasks. Very positive results, in contrast with studies on earlier iterations. Yee, K-P., Swearingen, K., Li, K., and Hearst, M., Faceted Metadata for Image Search and Browsing, in CHI 2003.
Flamenco Study Post-Interface Assessments: All significant at p<.05 except 'simple' and 'overwhelming'. Yee, K-P., Swearingen, K., Li, K., and Hearst, M., Faceted Metadata for Image Search and Browsing, in CHI 2003.
Flamenco Study Post-Test Comparison: Faceted vs. Baseline. Overall Assessment: More useful for your tasks; Easiest to use; Most flexible; More likely to result in dead ends; Helped you learn more; Overall preference. Which Interface Preferable For: Find images of roses; Find all works from a given period; Find pictures by 2 artists in same media. Yee, K-P., Swearingen, K., Li, K., and Hearst, M., Faceted Metadata for Image Search and Browsing, in CHI 2003.
The Advantages of Facets: Lets the user decide how to start, and how to explore and group. After refinement, categories that are not relevant to the current results disappear. Seamlessly integrates keyword search with the organizational structure. Very easy to expand out (loosen constraints). Very easy to build up complex queries. Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, P., Finding the Flow in Web Site Search, Communications of the ACM, 45 (9), September 2002, pp. 42-49.
Advantages of Facets: Can’t end up with empty results sets (except with keyword search). Helps avoid feelings of being lost. Easier to explore the collection.
Helps users infer what kinds of things are in the collection. Evokes a feeling of 'browsing the shelves'. Is preferred over standard search for collection browsing in usability studies. (Interface must be designed properly.) Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, P., Finding the Flow in Web Site Search, Communications of the ACM, 45 (9), September 2002, pp. 42-49.
Advantages of Facets: Seamless to add new facets and subcategories. Seamless to add new items. Helps with 'categorization wars': don’t have to agree exactly where to place something. Interaction can be implemented using a standard relational database. May be easier for automatic categorization. Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, P., Finding the Flow in Web Site Search, Communications of the ACM, 45 (9), September 2002, pp. 42-49.
Summary: Evaluation Good Ideas: Longitudinal studies of real use. Match the participants to the content of the collection and the tasks. Test against a strong baseline.
Summary: Evaluation Problems: Bias participants towards a system ('Try our interface' versus linear view). Tailor tasks unrealistically to benefit the target interface. Impoverish the baseline relative to the test condition. Conflate test conditions.
Summary: Grouping Search Results: Grouping search results seems beneficial in two circumstances: General web search, using transparent labeling (monothetic terms) or category labels rather than cluster centroids. Effects: works primarily on ambiguous queries (so used a fraction of the time); promotes relevant results up from below the first page of hits, so important to group the related items together visually; users tend to select more documents than with linear search; may work even better with meta-search; positive subjective responses (small studies). Visualization does not work.
Summary: Grouping Search Results: Grouping search results seems beneficial in two circumstances: Collection navigation with faceted categories. Multiple angles better than single categories. 'Searchers' turn into 'browsers'. Becoming commonplace in e-commerce, digital libraries, and other kinds of collections. Extends naturally to tags. Positive subjective responses (small studies).
Social Tagging and Search:
Slide127: Search Topical Metadata Structured, Flexible Navigation
Problem with Metadata-Oriented Approaches: Getting the metadata!
Slide129: Search Topical Metadata Social question answering Recorded Human Interaction Click-through ranking Inferred recommendations
Human Real-time Question Answering: More popular in Korea than algorithmic search. Maybe fewer good web pages? Maybe more social society? Several examples in US: Yahoo Answers recently released and successful; wondir.com; answerbag.com.
Yahoo Answers (also answerbag.com, wondir.com, etc):
Yahoo Answers appearing in search results:
answerbag.com:
Using User Behavior as Implicit Preferences: Search click-through experimentally shown to boost search rankings for top results. Joachims et al. ’05, Agichtein et al.
’06. Works ok even if non-relevant documents examined. Best in combination with sophisticated search algorithms. Doesn’t work well for ambiguous queries. Aggregates of movie and book selections comprise implicit recommendations.
Slide135: Search Topical Metadata Recorded Human Interaction Social Tagging (photos, bookmarks) Game-based tagging
Social Tagging: Metadata assignment without all the bother. Spontaneous, easy, and tends towards single terms.
Issues with Photo and Web link Tagging: There is a strong personal component: marking for my own reminders; marking for my circle of friends. There is also a strong social component: try to promote certain tags to make them more popular, or post to popular tags to see your influence rise.
Tagging Games: Assigning metadata is fun! (ESP game, von Ahn.) No need for reputation system, etc. Pay people to do it. MyCroft (iSchool student project). Drawback: least common denominator labels. Experts already label their own data or that about which they have expertise. E.g., protein function, Wikipedia.
Slide139: Search Topical Metadata Social question answering Recorded Human Interaction Social Tagging (photos, bookmarks) Click-through ranking Inferred recommendations Game-based tagging ????
Expert-Oriented Tagging in Search: Already happening at Google co-op. Shows up in certain types of search results.
Expert-Oriented Tagging: Already happening at Google co-op. Shows up in certain types of search results.
Promoting Expertise-Oriented Tagging: Research area: User Interfaces. To make rapid-feedback suggestions of pre-established tags (like type-ahead queries). To incentivize labeling and make it fun. To allow the personal aspects to shine through.
Promoting Expertise-Oriented Tagging: Research area: NLP Algorithms (we have an algorithm to build facets from text). To convert tags into facet hierarchies. To capture implicit labeling information.
Promoting Expertise-Oriented Tagging: Research area: Digital infrastructure. Extending tagging games. Build an architecture that channels specialized subproblems to appropriate experts. We now know there is a green plant in an office; direct this to the botany > houseplants experts.
Promoting Expertise-Oriented Tagging: Research area: economics and sociology. What are the right incentive structures?
Using Implicit Preferences: Extend implicit recommendation technology to online catalog use.
Final Words: User interfaces for search remain a fascinating and challenging field. Search has taken a primary role in the web and internet business. Thus, we can expect fascinating developments, and maybe some breakthroughs, in the next few years!