Recommenders HCC01 Justine
Added: December 23, 2007
Presentation Transcript
Comparing Human Recommenders to Online Systems: Rashmi Sinha & Kirsten Swearingen, SIMS, UC Berkeley
Slide2: Quantitative Methods Class @ SIMS. Methods for Design: Card Sorting; Pre Design Surveys; Quantitative Observational Methods. Methods for Testing: Formal Usability Experiments; Informal Usability Evaluations. Students are looking for industry projects!
Slide3: Method Philosophy: Testing & Analysis as part of the Iterative Design Process. Design, Evaluate, Analyze (slide adapted from James Landay). Use both quantitative & qualitative methods. Generate Design Recommendations.
Recommender Systems are a technological proxy for a social process: Recommendations from friends; Recommendations from Online Systems.
I know what you’ll read next summer (Amazon, Barnes&Noble): what movies you should watch… (Reel, RatingZone, Amazon); what music you should listen to… (CDNow, Mubu, Gigabeat); what websites you should visit (Alexa); what jokes you will like (Jester); & who you should date (Yenta).
Taking a closer look at the Recommendation Process
Slide7: Amazon’s Recommendation Process
Input: One artist/author name
Slide8: Output: List of Recommendations; Explore / Refine Recommendations; Search using Recommendations.
Slide9: Book Recommendation Site: Sleeper. Input: Ratings of 10 books for all users; use of continuous Rating Bar (system designed by Ken Goldberg).
Slide10: Sleeper: Output. List of items with brief information about each item; degree of confidence in prediction.
What convinces a user to sample the recommendation: Judging recommendations: What is a good recommendation from the user’s perspective? Trust in a Recommender System: What factors lead to trust in a system? System Transparency: Do users need to know why an item was recommended?
Study of RS has focused mostly on Collaborative Filtering Algorithms: [Diagram: Input from user; Collaborative Filtering Algorithms; Output (Recommendations)]
Beyond “Algorithms Only”: An HCI Perspective on Recommender Systems: Comparing the Social Recommendation Process to Online Recommender Systems. Understanding the factors that go into an effective recommendation (by studying users’ interaction with 6 online RS).
The Human vs. Recommenders Death Match
Book Systems: Amazon Books; Sleeper; Rating Zone.
Movie Systems: Amazon Movies; Reel; Movie Critic.
Method: For each of 3 online systems: registered at site; rated items; reviewed and evaluated recommendation set; completed questionnaire. Also reviewed and evaluated sets of recommendations from 3 friends each. 19 participants, age: 18 to 34 years.
Results
Defining Types of Recommendations: Good Recs. (Precision): % items user felt interested in. Useful Recs.: Subset of Good Recs.
User felt interested in and had not read / viewed yet. [Diagram label: GOOD: User likes]
Slide20: Comparing Human Recommenders to RS: “Good” and “Useful” Recommendations. [Chart: % Good Recommendations and % Useful Recommendations on a 0 to 100% axis. Books: Amazon (15), Sleeper (10), Rating Zone (8), Friends (9). Movies: Amazon (15), Reel (5-10), Movie Critic (20), Friends (9). Legend: RS Average; Ave. Std. Error; (x) = No. of Recommendations.]
However users like online RS: This result was supported by post test interviews.
Why systems over friends?: “Suggested a number of things I hadn’t heard of, interesting matches.” “It was like going to Cody’s, looking at that table up front for new and interesting books.” “Systems can pull from a large database, no one person knows about all the movies I might like.”
Items users had “Heard of” before: Friends recommended mostly “old” previously experienced items. [Chart panels: Movies, Books]
What systems did users prefer?: Sleeper and Amazon books average highest ratings. Split opinions on Reel, MovieCritic. [Chart: No / Yes; panels: Movies, Books]
Why did some systems…: Provide useful recommendations but leave users unsatisfied? RatingZone, MovieCritic & Reel.
Possible Reasons: Previously Enjoyed Items are important: We term these Trust-Generating Items. Adequate Item Description & Ease of Use are important. Missing from List: Time to Receive Recommendations & No. of Items to Rate not important!
All correlations are significant at .05.
A Question of Trust…: [Diagram labels: TRUST-GENERATING: Previously read/viewed; GOOD: User likes] Post Test Interviews showed that users “trust” systems if they have already sampled some recommendations. Positive Experiences lead to “trust”. Negative Experiences with Recommended Items lead to mistrust of system.
Slide28: A Question of Trust…: Difference between Amazon and Sleeper highlights the fact that there are different kinds of good Recommender Systems. [Chart panels: Books, Movies]
Adequate Item Description: The RatingZone Story: 0% of Version 1 and 60% of Version 2 users found item description adequate. An adequate item description, with links to other sources about the item, was a crucial factor in users being convinced by a recommendation.
System Transparency: Why was this item recommended? Do users understand why an item was recommended? Users mentioned this factor in post test interviews.
Discussion & Design Recommendations
Design Recommendations: Justification: Justify your Recommendations. Adequate Item Information: Provide enough detail about the item for the user to make a choice. System Transparency: Generate (at least some) recommendations which are clearly linked to the rated items. Explanation: Provide an explanation of why the item was recommended. Community Ratings: Provide a link to ratings / reviews by other users. If possible, present a numerical summary of ratings.
Design Recommendations: Accuracy vs. Less Input: Don’t sacrifice accuracy for the sake of generating quick recommendations. Users don’t mind rating more items to receive quality recommendations. A possible way to achieve this: have multilevel recommendations.
Users can initially use the system by providing one rating, and are offered subsequent opportunities to refine recommendations. One needs a happy medium between too little input (leading to low accuracy) and too much input (leading to user impatience).
Design Recommendations: New Unexpected Items: Users like Rec. Systems as they provide information about new, unexpected items. The list of recommended items should include new items which the user might not find out about in any other way. The list could also include some unexpected items (e.g., from other topics / genres) which the user might not have thought of themselves.
Slide35: Design Recommendations: Trust Generating Items. Users (especially first time users) need to develop trust in the system. Trust in the system is enhanced by the presence of items that the user has already enjoyed. Generating some very popular items (which have probably been experienced previously) in the initial recommendation set might be one way to achieve this.
Slide36: Design Recommendations: Mix of Items. Systems need to provide a mix of different kinds of items to cater to different users. Trust Generating Items: A few very popular ones, which the system has high confidence in. Unexpected Items: Some unexpected items, whose purpose is to allow users to broaden horizons. Transparent Items: At least some items for which the user can see the clear link between the items he / she rated and the recommendation. New Items: Some items which are new. Question: Should these be presented as a sorted list / unsorted list / different categories of recommendations?
Slide37: Design Recommendations: Continuous Scales for Input. Allow users to provide ratings on a continuous scale. One of the reasons users liked Sleeper was because it allowed them to rate on a continuous scale. Users did not like binary scales.
Limitations of Study: Simulated first-time visit, did not allow system to learn user preferences over time. Source of recommendations known to subjects, which might have biased them towards friends. Fairly homogeneous group of subjects, no novice users.
Future Plans: Second Generation Music Recommender Systems: Have evolved beyond previous systems. Use a variety of sophisticated algorithms to map users’ preferences over the music domain. Require a lot more input from the user. Users can sample recommendations during the study!
Slide40: MusicBudha (Mubu.com): Exploring Genres
Slide41: Mubu.com: Exploring Jazz Styles
Slide42: Mubu.com: Rating Samples
Slide43: Mubu.com: Recommendations as Audio Samples
Slide44: The Turing Test for Music Recommender Systems. Compare systems, friends and experts. Anonymize the source of recommendation.
Study Design: Goal: Compare recommendations by online RS, Experts (who have same information as RS) & Friends.
Slide46: So far we have heard what this study tells us about Recommender Systems. But what (if anything) does it have to say about human nature? In conclusion…
Slide47: Recommender Systems tantalize us with the idea that we are not as unique and unpredictable as we think we are. Study results show that Recommender Systems do not know us better than our friends! But …
Slide48: 11 out of 19 users preferred Recommender Systems over Friends’ Recommendations! Ultimately, we all want to be tables in a database!
Slide49: Email: sinha@sims.berkeley.edu Web address: http://sims.berkeley.edu/~sinha
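Editorial sketch: the "Good", "Useful" and "Trust-Generating" categories defined in the Results slides can be written out as simple proportions. The Python below is only an illustration under stated assumptions, not the authors' analysis code: the `Judgment` record, its field names, and the function `recommendation_shares` are hypothetical, the sample data is made up, and "felt interested in" is used as a stand-in for "previously enjoyed" when counting trust-generating items.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One participant's reaction to one recommended item (hypothetical layout)."""
    interested: bool        # user felt interested in the item ("Good")
    already_sampled: bool   # user had already read / viewed the item


def recommendation_shares(judgments):
    """Return (good, useful, trust_generating) as fractions of all recommended items.

    good             = items the user felt interested in (precision)
    useful           = good items the user had not read / viewed yet
    trust_generating = good items the user had already read / viewed (assumption)
    """
    total = len(judgments)
    if total == 0:
        return 0.0, 0.0, 0.0
    good = sum(j.interested for j in judgments)
    useful = sum(j.interested and not j.already_sampled for j in judgments)
    trust = sum(j.interested and j.already_sampled for j in judgments)
    return good / total, useful / total, trust / total


# Made-up example: 10 recommended items judged by one participant.
example = (
    [Judgment(interested=True, already_sampled=False)] * 4    # good and new
    + [Judgment(interested=True, already_sampled=True)] * 2   # good, already known
    + [Judgment(interested=False, already_sampled=False)] * 3
    + [Judgment(interested=False, already_sampled=True)] * 1
)
good_share, useful_share, trust_share = recommendation_shares(example)
```

With this layout, useful and trust-generating items always sum to the good items, which matches the slides' description of Useful Recs. as a subset of Good Recs.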