trt 11 Lassie
Category: Education. License: All Rights Reserved. Added: February 20, 2008.

Presentation Transcript

Text Retrieval and Mining #12: Recommendation System, Recommender System
  Lecture by Young Hwan CHO, Ph.D. (Youngcho@gmail.com)

Recommendation Systems:
  Definition
    A system that selects and suggests suitable items from among those the user has not yet experienced.
    Input: the user's past experience, other users' choices, and information about the items.
  Applications
    E-commerce: Amazon.com, CDNOW, Media Unbound
    Information services: Pandora (Music Genome Project)
  Methods
    User-usage side: starting from other users' preferences, find the user or user group whose tastes are closest to the current user's, and suggest items the current user has not yet chosen.
    Content side: analyze the characteristics of the content the user prefers, group the content, and suggest items from those groups that the user has not chosen before.
  General problems
    Lack of information: there are many users and items but few experience records (sparse matrix).
    New items: no user has experienced them yet.
    Privacy: there is concern about leakage of personal information.

Methods:
  Use item-to-item similarity: content-based
  Use item-to-item similarity: association
  [Figure: items A, B, C; the user likes an item, and items with similar contents are recommended]

Methods:
  Use people-to-people similarity: demographic
  Use people-to-people similarity: collaborative

What do RSs achieve?:
  Help people make decisions
    Where to spend attention
    Where to spend money
  Help maintain awareness
    New products
    New information
  [Figure: demographic features, item features, sales history, and the customer's purchase history are the inputs; recommended items are the output]

Sample Applications:
  E-commerce: product recommendations (Amazon)
  Corporate intranets: recommendation, finding domain experts, …
  Digital libraries: finding pages/books people will like
  Medical applications: matching patients to doctors, clinical trials, …
  Customer relationship management: matching customer problems to internal experts

Well-known recommender systems: Amazon and Netflix

Corporate intranets - document recommendation

Corporate intranets - "expert" finding

Recommender System Structure:
  Axis 1: content analysis required / content analysis not required
  Axis 2: other users' information required / other users' information not required
  Methods placed on this grid: collaborative, demographic, association, content-based

Recommender System =:= Clustering Model:
  Content clustering
    Documents: vectors of keywords and their weights
    Products: vectors of values for category, maker, price, features, reviews, etc.
  People clustering
    Behavior: preferences for documents or products
    Profile: age, address, occupation, marital status, hobbies, etc.
  Problem unique to recommender systems
    There are too many products, and each user buys only one or two of them.
    The [user × product] matrix is extremely sparse.

Short History:
  [Timeline figure. Focus: item information (content), from IR (Information Retrieval) and IF (Information Filtering), versus user information. Labels: (pure) content-based recommendation, 1980; (pure) collaborative recommendation, 1990; hybrid systems; years 1992 and 1999. Systems shown: GroupLens (Minnesota), Fab (Stanford), Agent, NetPerceptions, Firefly]

Collaborative filtering (CF):
  A promising recommender system technology, used in many of the most successful recommender systems on the web.

Simplest Algorithm: Naïve k Nearest Neighbors:
  U viewed d1, d2, d5.
  Look at who else viewed d1, d2 or d5.
  Recommend to U the doc(s) most "popular" among these users.
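
The three steps on the naïve k-nearest-neighbors slide can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's implementation: only U's history (d1, d2, d5) comes from the slide, while the other users' view sets, the extra documents d3, d4, d6, and the function name `recommend` are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical view log: user -> set of documents viewed.
# U's history (d1, d2, d5) is from the slide; everything else is invented.
views = {
    "U": {"d1", "d2", "d5"},
    "V": {"d1", "d3", "d4"},
    "W": {"d2", "d3", "d5"},
    "X": {"d4", "d6"},
}


def recommend(target, views, top_n=1):
    """Recommend the documents most popular among users who share a view with target."""
    seen = views[target]
    counts = Counter()
    for user, docs in views.items():
        if user == target or not (docs & seen):
            continue  # skip the target and users with no overlapping views
        counts.update(docs - seen)  # count only documents the target has not viewed
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc for doc, _ in ranked[:top_n]]


print(recommend("U", views))
```

Here X shares no document with U and is ignored, so d6 is never suggested. Every overlapping user counts equally, which is what makes the method "naïve": no similarity weight is applied.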
  [Figure: users U, V, W linked to documents d1, d2, d5]

Single evidence CF

The Auction (옥션) case: service personalization

Recommender System Techniques:
  Collaborative Filtering (CF)
  Singular Value Decomposition (SVD)
  An SVD-CF Approach in the Recommender Systems Domain

The RS Space:
  User-user links: links derived from similar attributes, explicit connections

Link types:
  User attribute-based recommendation: "Male, 18-35: recommend The Matrix"
  Content similarity: "You liked The Matrix: recommend The Matrix Reloaded"
  Collaborative filtering: "People with interests like yours also liked Kill Bill"

Collaborative Method:
  Advantages
    No need for content analysis: items whose content is difficult to analyze (e.g. movies, music) can be recommended
    No need for user information
    High precision
  Method
    Find similar users
    Predict preferences based on the similar users' preferences

Collaborative Method: Computing similarity
  Pearson correlation coefficient
    $w_{a,b} = \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2}\,\sqrt{\sum_i (r_{b,i} - \bar{r}_b)^2}}$
    $r_{a,i}$: rating of user a for item i
    $\bar{r}_a$: average rating of user a
  Example
    User a: (1, 9, 10)
    User b: (2, 10, 9)
    User c: (10, 1, 2)
    User a is more similar to user b than to user c.

Collaborative Method: Prediction of preferences
  Weighted sum of similar users' preferences
    $w_{a,u}$: similarity between users a and u
  Example
    Average rating of user a: 5
    User b: (2, 8, 8), $w_{a,b} = 0.5$
    User c: (4, 4, 7), $w_{a,c} = 0.1$
    Preferences of user a = (5, 5, 5) + (-3, 3, 3)*0.5 + (-1, -1, 2)*0.1 = (3.4, 6.4, 6.7)

Data Sparseness Problem:
  Example data

Data Sparseness Problem:
  Available data are usually very sparse: a user buys 2~3 items among thousands of items
  Cosine similarity cannot be computed
  Reduce dimension

Dimensionality Reduction: Using category information
  Represent the user preference vector with item categories
    Monsters, Inc., Lion King, Pocahontas: animation
    Halloween, Scream: horror

Dimensionality Reduction: Singular Value Decomposition (SVD)
  Decompose the user-item
  matrix $A_{m \times n}$: $A_{m \times n} = U_{m \times m} S_{m \times n} (V_{n \times n})^T$
    S: diagonal matrix that contains the singular values of A in descending order
    U, V: orthogonal matrices

Dimensionality Reduction: SVD example

Dimensionality Reduction: Approximation of A
  Select the largest k singular values: $A'_{m \times n} = U_{m \times k} S_{k \times k} (V_{n \times k})^T$
  Computing user similarity: $AA^T = USV^T (USV^T)^T = USV^T V S^T U^T = (US)(US)^T$
  Projection of A into k dimensions: $A'_{m \times n} V_{n \times k} = U_{m \times k} S_{k \times k}$

An Example: User-item matrix

An Example: Reduction, k = 2

An Example: User-user similarity

What is the SVD doing:
  [Figure: users and items grouped into Type 1, Type 2, …, Type k; atypical users; samples]

An Example: User vectors in 2-D space
  [Figure: points u1 to u6 plotted in two dimensions]

Experiments:
  Dataset: MovieLens
    943 users, 1628 movies, ratings 1~5, 6.4% rated
    Change ratings to 0/1: 3.6% rated
  Experiments
    Compare the performance of plain collaborative (CF) and reduced-dimension (SVD) recommendation
      CF: 60 neighbors
      SVD: rank 20
    Change sparseness to 2.0%, 1.0%, 0.5%

Experiments: Metric
  Hit ratio
    Remove 1 rating from each user: this is the test data
    Recommend 10 items for each user
    If the test data is among the recommended items, count a hit
    Hit ratio = (total # of hits) / (total # of test data)
  Result
    Sparseness 3.6%: SVD improves hit ratio by x %
    Sparseness 0.5%: SVD improves hit ratio by x %

Experiments: Results

Results from SIGIR 2004 Paper:
  Predicts top movies much better
  The cost is that it tends to predict blockbuster movies often
  A serendipity/trust trade-off

Case-Based Reasoning (CBR):
  Use people-to-people similarity
    Find customers (cases) with similar attributes and recommend the products those similar customers bought
  Automatic, ephemeral
  [Figure: customers A, B, C; like; same feature]

Slide39: Moviegoer Survey

Slide40:
  [Figure: surveyed moviegoers plotted by age (axis 25 to 50) and source, labeled by the movie they saw: Independence Day, Courage Under Fire, Birdcage, Nutty Professor. Three unknown points marked "?" are each matched to their nearest neighbor]
  All the data points closest to this point saw "Independence Day".
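
The nearest-neighbor idea in the moviegoer figure can be sketched as a vote among the surveyed people closest in age. This is only an illustration under stated assumptions: the movie titles are from the slide, but the ages in `survey`, the choice of k, and the function name `predict_movie` are invented, since the actual survey data is not in the transcript.

```python
from collections import Counter

# Hypothetical survey rows (age, movie seen). The movie titles are from the slide;
# the ages are invented for illustration only.
survey = [
    (26, "Independence Day"),
    (28, "Independence Day"),
    (31, "Independence Day"),
    (38, "Courage Under Fire"),
    (41, "Birdcage"),
    (47, "Nutty Professor"),
]


def predict_movie(age, survey, k=3):
    """Majority vote among the k surveyed moviegoers closest in age."""
    nearest = sorted(survey, key=lambda row: abs(row[0] - age))[:k]
    votes = Counter(movie for _, movie in nearest)
    return votes.most_common(1)[0][0]


print(predict_movie(29, survey))
```

With this made-up data, a 29-year-old's three nearest neighbors all saw "Independence Day", mirroring the caption on the slide. Using a single attribute keeps the sketch short; more attributes need a combined distance.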

Case-Based Reasoning (CBR): Example
  Customers with sales history

Case-Based Reasoning (CBR): Distance to neighbors
  d_gender(A, B) = |A - B|
  d_age(A, B) and d_salary(A, B) are each computed as |A - B| / (max difference)
  d_sum = d_gender + d_age + d_salary
  Prediction based on a distance-weighted sum

Decision Trees:
  Use a decision tree to classify customers
    Build a decision tree that classifies customers from the purchase data of many past customers, and use it to classify new customers
  [Figure: a tree that splits on customer attributes into Class A, Class C, Class B]

Decision Trees: Constructing a tree
  Easy way: one path for each example
  Better way: make it as simple as possible
  Example
    (a=0, b=0): Class A
    (a=0, b=1): Class B
    (a=1, b=0): Class A
  [Figure: a tree that tests a and then b, with leaves Class A, Class A, Class B, versus a tree that tests only b, with leaves Class A and Class B]

Neural Networks:
  Use a neural network to classify customers
    Train a neural network that classifies customers from the purchase data of many past customers, and use it to classify new customers
  [Figure: a network taking customer attributes as input and producing Class A, Class C, Class B]

Conclusion: Future issues
  Integrating various methods
    Attributes of people (demographic info.)
    Attributes of products
    Purchase data
    People's ratings
  Fully automatic recommendation
    Implicit negative data?
  Producing marketing information
    Grouping customers
    Sales prediction
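
As a closing illustration, the two trees from the "Constructing a tree" slide can be written as plain functions and checked against the three training examples given there. The function names are hypothetical, and the behavior of the first tree on the unseen input (a=1, b=1) is an assumption, because the slide gives no example for it.

```python
# Training examples from the "Constructing a tree" slide: (a, b) -> class.
examples = {(0, 0): "A", (0, 1): "B", (1, 0): "A"}


def tree_one_path_per_example(a, b):
    """'Easy way': test a, then b, with one leaf per training example."""
    if a == 0:
        return "A" if b == 0 else "B"
    # Only (a=1, b=0) was observed; (a=1, b=1) is unseen, so returning None
    # for it is an assumption of this sketch.
    return "A" if b == 0 else None


def tree_simple(a, b):
    """'Better way': a single test on b explains all three examples."""
    return "A" if b == 0 else "B"


# Both trees reproduce every training example.
for (a, b), label in examples.items():
    assert tree_one_path_per_example(a, b) == label
    assert tree_simple(a, b) == label

# The simpler tree also gives an answer for the unseen case (a=1, b=1).
print(tree_simple(1, 1))
```

Both trees fit the training data equally well. The simpler one uses a single test and still returns a class for the unseen combination, which is one way to read the slide's advice to keep the tree as simple as possible.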