logging in or signing up TopicSpecificPageRank Goldie Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 18 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: November 20, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript CS345Data Mining: CS345 Data Mining Page Rank VariantsReview Page Rank: Review Page Rank Web graph encoded by matrix M N£N matrix (N = number of web pages) Mij = 1/|O(j)| iff there is a link from j to i Mij = 0 otherwise O(j) = set of pages node i links to Define matrix A as follows Aij = Mij + (1-)/N, where 0<<1 1- is the “tax” discussed in prior lecture Page rank r is first eigenvector of A Ar = rRandom walk interpretation: Random walk interpretation At time 0, pick a page on the web uniformly at random to start the walk Suppose at time t, we are at page j At time t+1 With probability , pick a page uniformly at random from O(j) and walk to it With probability 1-, pick a page on the web uniformly at random and teleport into it Page rank of page p = “steady state” probability that at any given time, the random walker is at page p Many random walkers: Many random walkers Alternative, equivalent model Imagine a large number M of independent, identical random walkers (MÀN) At any point in time, let M(p) be the number of random walkers at page p The page rank of p is the fraction of random walkers that are expected to be at page p i.e., E[M(p)]/M.Problems with page rank: Problems with page rank Measures generic popularity of a page Biased against topic-specific authorities Ambiguous queries e.g., jaguar This lecture Link spam Creating artificial link topographies in order to boost page rank Next lectureTopic-Specific Page Rank: Topic-Specific Page Rank Instead of generic popularity, can we measure popularity within a topic? E.g., computer science, health Bias the random walk When the random walker teleports, he picks a page from a set S of web pages S contains only pages that are relevant to the topic E.g., Open Directory (DMOZ) pages for a given topic (www.dmoz.org) Correspong to each teleport set S, we get a different rank vector rS Matrix formulation: Matrix formulation Aij = Mij + (1-)/|S| if i 2 S Aij = Mij otherwise Show that A is stochastic We have weighted all pages in the teleport set S equally Could also assign different weights to them Example: Example 1 2 3 4 Suppose S = {1}, = 0.8 Note how we initialize the page rank vector differently from the unbiased page rank case. How well does TSPR work?: How well does TSPR work? Experimental results [Haveliwala 2000] Picked 16 topics Teleport sets determined using DMOZ E.g., arts, business, sports,… “Blind study” using volunteers 35 test queries Results ranked using Page Rank and TSPR of most closely related topic E.g., bicycling using Sports ranking In most cases volunteers preferred TSPR ranking Which topic ranking to use?: Which topic ranking to use? User can pick from a menu Can use the context of the query E.g., query is launched from a web page talking about a known topic E.g., use Bayesian classification schemes to classify query into a topic (forthcoming lecture) History of queries e.g., “basketball” followed by “jordan” User context e.g., user’s My Yahoo settings, bookmarks, … Scaling with topics and users: Scaling with topics and users Suppose we wanted to cover 1000’s of topics Need to compute 1000’s of different rank vectors Need to store and retrieve them efficiently at query time For good performance vectors must fit in memory Even harder when we consider personalization Each user has their own teleport vector One page rank vector per user!Tricks: Tricks Determine a set of basis vectors so that any rank vector is a linear combination of basis vectors Encode basis vectors compactly as partial vectors and a hubs skeleton At runtime perform a small amount of computation to derive desired rank vector elements Linearity Theorem : Linearity Theorem Let S be a teleport set and rS be the corresponding rank vector For page i2S, let ri be the rank vector corresponding to the teleport set {i} ri is a vector with N entries rS = (1/|S|) åi2S ri Why is linearity important? Instead of 2N biased page rank vectors we need to store N vectors Linearity example : Linearity example Let us compute r{1,2} for = 0.8 Node Iteration 0 1 2… stable 1 0.1 0.1 0.164 0.300 2 0.1 0.14 0.172 0.323 3 0 0.04 0.04 0.120 4 0 0.04 0.056 0.130 5 0 0.04 0.056 0.130 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.1 0.1Linearity example: Linearity example 0.407 0.239 0.163 0.096 0.096 0.192 0.407 0.077 0.163 0.163 0.300 0.323 0.120 0.130 0.130 0.300 0.323 0.120 0.130 0.130 r{1,2} r1 r2 (r1+r2)/2Intuition behind proof: Intuition behind proof Let’s use the many-random-walkers model with M random walkers Let us color a random walker with color i if his most recent teleport was to page i At time t, we expect M/|S| of the random walkers to be colored i At any page j, we would therefore expect to find (M/|S|)ri(j) random walkers colored i So total number of random walkers at page j = (M/|S|)åi2Sri(j) Basis Vectors: Basis Vectors Suppose T = union of all teleport sets of interest Call it the teleport universe We can compute the rank vector corresponding to any teleport set SµT as a linear combination of the vectors ri for i2T We call these vectors the basis vectors for T We can also compute rank vectors where we assign different weights to teleport pagesDecomposition: Decomposition Still too many basis vectors E.g., |T| might be in the thousands N|T| values Decompose basis vectors into partial vectors and hubs skeletonTours: Tours Consider a random walker with teleport set {i} Suppose walker is currently at node j The random walker’s tour is the sequence of nodes on the walker’s path since the last teleport E.g., i,a,b,c,a,j Nodes can repeat in tours – why? Interior nodes of the tour = {a,b,c,j} Start node = {i}, end node = {j} A page can be both start node and interior node, etc Tour splitting: Tour splitting Consider random walker with teleport set {i}, biased rank vector ri ri(j) = probability random walker reaches j by following some tour with start node i and end node j Consider node k Can have i = k or j = k Tour splitting: Tour splitting Let rik(j) be the probability that random surfer reaches page j through a tour that includes page k as an interior node or end node Let ri~k(j) be the probability that random surfer reaches page j through a tour that does not include k as an interior node or end node ri(j) = rik(j) + ri~k(j) Example: Example 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2 Let us compute r1~2 for = 0.8 Node Iteration 0 1 2… stable 1 0.2 0.2 0.264 0.294 2 0 0 0 0 3 0 0.08 0.08 0.118 4 0 0 0 0 5 0 0 0 0 Note that many entries are zeros 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2Example: Example 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2 Let us compute r2~2 for = 0.8 Node Iteration 0 1 2… stable 1 0 0 0.064 0.094 2 0.2 0.2 0.2 0.2 3 0 0 0 0.038 4 0 0.08 0.08 0.08 5 0 0.08 0.08 0.08Rank composition: Rank composition Notice: r12(3) = r1(3) – r1~2(3) = 0.163 - 0.118 = 0.045 r1(2) * r2~2(3) = 0.239 * 0.038 = 0.009 = 0.2 * 0.045 = (1-)*r12(3) r12(3) = r1(2) r2~2(3)/ (1-) Rank composition: Rank composition i j k ri(k) rk~k(j) rik(j) = ri(k)rk~k(j)/(1-) Hubs: Hubs Instead of a single page k, we can use a set H of “hub” pages Define ri~H(j) as set of tours from i to j that do not include any node from H as interior nodes or end node Hubs example: Hubs example 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2 H = {1,2} = 0.8 r2~H Node Iteration 0 1 stable 1 0 0 0 2 0.2 0.2 0.2 3 0 0 0 4 0 0.08 0.08 5 0 0.08 0.08 r1~H Node Iteration 0 1 stable 1 0.2 0 0.2 2 0 0 0 3 0 0.08 0.08 4 0 0 0 5 0 0 0Rank composition with hubs: Rank composition with hubs i j wi(h) rh~H(j) H h ri(j) = ri~H(j) + riH(j) riH(j) = åh2Hwi(h)rh~H(j)/(1-b) wi(h) = ri(h) if i = h wi(h) = ri(h) - (1-b) if i = h ri~H(j)Hubs rule example: Hubs rule example r2(3) = r2~H(3) + r2H(3) = 0 + r2H(3) = [r2(1)r1~H(3)]/0.2+[(r2(2)-0.2)r2~H(3)]/0.2 = [0.192*0.08]/0.2+[(0.407-0.2)*0]/0.2 = 0.077 H = {1,2} = 0.8 HHubs: Hubs Start with H = T, the teleport universe Add nodes to H such that given any pair of nodes i and j, there is a high probability that H separates i and j i.e., ri~H(j) is zero for most i,j pairs Observation: high page rank nodes are good separators and hence good hub nodes Hubs skeleton: Hubs skeleton To compute ri(j) we need: ri~H(j) for all i2H, j2V called the partial vector Sparse ri(h) for all h2H called the hubs skeleton i j ri(h) rh~H(j) ri~H(j) HStorage reduction: Storage reduction Say |T| = 1000, |H|=2000, N = 1 billion Store all basis vectors 1000*1 billion = 1 trillion nonzero values Use partial vectors and hubs skeleton Suppose each partial vector has N/200 nonzero entries Partial vectors = 2000*N/200 = 10 billion nonzero values Hubs skeleton = 2000*2000 = 4 million values Total = approx 10 billion nonzero values Approximately 100x compression You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
TopicSpecificPageRank Goldie Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 18 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: November 20, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript CS345Data Mining: CS345 Data Mining Page Rank VariantsReview Page Rank: Review Page Rank Web graph encoded by matrix M N£N matrix (N = number of web pages) Mij = 1/|O(j)| iff there is a link from j to i Mij = 0 otherwise O(j) = set of pages node i links to Define matrix A as follows Aij = Mij + (1-)/N, where 0<<1 1- is the “tax” discussed in prior lecture Page rank r is first eigenvector of A Ar = rRandom walk interpretation: Random walk interpretation At time 0, pick a page on the web uniformly at random to start the walk Suppose at time t, we are at page j At time t+1 With probability , pick a page uniformly at random from O(j) and walk to it With probability 1-, pick a page on the web uniformly at random and teleport into it Page rank of page p = “steady state” probability that at any given time, the random walker is at page p Many random walkers: Many random walkers Alternative, equivalent model Imagine a large number M of independent, identical random walkers (MÀN) At any point in time, let M(p) be the number of random walkers at page p The page rank of p is the fraction of random walkers that are expected to be at page p i.e., E[M(p)]/M.Problems with page rank: Problems with page rank Measures generic popularity of a page Biased against topic-specific authorities Ambiguous queries e.g., jaguar This lecture Link spam Creating artificial link topographies in order to boost page rank Next lectureTopic-Specific Page Rank: Topic-Specific Page Rank Instead of generic popularity, can we measure popularity within a topic? E.g., computer science, health Bias the random walk When the random walker teleports, he picks a page from a set S of web pages S contains only pages that are relevant to the topic E.g., Open Directory (DMOZ) pages for a given topic (www.dmoz.org) Correspong to each teleport set S, we get a different rank vector rS Matrix formulation: Matrix formulation Aij = Mij + (1-)/|S| if i 2 S Aij = Mij otherwise Show that A is stochastic We have weighted all pages in the teleport set S equally Could also assign different weights to them Example: Example 1 2 3 4 Suppose S = {1}, = 0.8 Note how we initialize the page rank vector differently from the unbiased page rank case. How well does TSPR work?: How well does TSPR work? Experimental results [Haveliwala 2000] Picked 16 topics Teleport sets determined using DMOZ E.g., arts, business, sports,… “Blind study” using volunteers 35 test queries Results ranked using Page Rank and TSPR of most closely related topic E.g., bicycling using Sports ranking In most cases volunteers preferred TSPR ranking Which topic ranking to use?: Which topic ranking to use? User can pick from a menu Can use the context of the query E.g., query is launched from a web page talking about a known topic E.g., use Bayesian classification schemes to classify query into a topic (forthcoming lecture) History of queries e.g., “basketball” followed by “jordan” User context e.g., user’s My Yahoo settings, bookmarks, … Scaling with topics and users: Scaling with topics and users Suppose we wanted to cover 1000’s of topics Need to compute 1000’s of different rank vectors Need to store and retrieve them efficiently at query time For good performance vectors must fit in memory Even harder when we consider personalization Each user has their own teleport vector One page rank vector per user!Tricks: Tricks Determine a set of basis vectors so that any rank vector is a linear combination of basis vectors Encode basis vectors compactly as partial vectors and a hubs skeleton At runtime perform a small amount of computation to derive desired rank vector elements Linearity Theorem : Linearity Theorem Let S be a teleport set and rS be the corresponding rank vector For page i2S, let ri be the rank vector corresponding to the teleport set {i} ri is a vector with N entries rS = (1/|S|) åi2S ri Why is linearity important? Instead of 2N biased page rank vectors we need to store N vectors Linearity example : Linearity example Let us compute r{1,2} for = 0.8 Node Iteration 0 1 2… stable 1 0.1 0.1 0.164 0.300 2 0.1 0.14 0.172 0.323 3 0 0.04 0.04 0.120 4 0 0.04 0.056 0.130 5 0 0.04 0.056 0.130 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.1 0.1Linearity example: Linearity example 0.407 0.239 0.163 0.096 0.096 0.192 0.407 0.077 0.163 0.163 0.300 0.323 0.120 0.130 0.130 0.300 0.323 0.120 0.130 0.130 r{1,2} r1 r2 (r1+r2)/2Intuition behind proof: Intuition behind proof Let’s use the many-random-walkers model with M random walkers Let us color a random walker with color i if his most recent teleport was to page i At time t, we expect M/|S| of the random walkers to be colored i At any page j, we would therefore expect to find (M/|S|)ri(j) random walkers colored i So total number of random walkers at page j = (M/|S|)åi2Sri(j) Basis Vectors: Basis Vectors Suppose T = union of all teleport sets of interest Call it the teleport universe We can compute the rank vector corresponding to any teleport set SµT as a linear combination of the vectors ri for i2T We call these vectors the basis vectors for T We can also compute rank vectors where we assign different weights to teleport pagesDecomposition: Decomposition Still too many basis vectors E.g., |T| might be in the thousands N|T| values Decompose basis vectors into partial vectors and hubs skeletonTours: Tours Consider a random walker with teleport set {i} Suppose walker is currently at node j The random walker’s tour is the sequence of nodes on the walker’s path since the last teleport E.g., i,a,b,c,a,j Nodes can repeat in tours – why? Interior nodes of the tour = {a,b,c,j} Start node = {i}, end node = {j} A page can be both start node and interior node, etc Tour splitting: Tour splitting Consider random walker with teleport set {i}, biased rank vector ri ri(j) = probability random walker reaches j by following some tour with start node i and end node j Consider node k Can have i = k or j = k Tour splitting: Tour splitting Let rik(j) be the probability that random surfer reaches page j through a tour that includes page k as an interior node or end node Let ri~k(j) be the probability that random surfer reaches page j through a tour that does not include k as an interior node or end node ri(j) = rik(j) + ri~k(j) Example: Example 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2 Let us compute r1~2 for = 0.8 Node Iteration 0 1 2… stable 1 0.2 0.2 0.264 0.294 2 0 0 0 0 3 0 0.08 0.08 0.118 4 0 0 0 0 5 0 0 0 0 Note that many entries are zeros 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2Example: Example 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2 Let us compute r2~2 for = 0.8 Node Iteration 0 1 2… stable 1 0 0 0.064 0.094 2 0.2 0.2 0.2 0.2 3 0 0 0 0.038 4 0 0.08 0.08 0.08 5 0 0.08 0.08 0.08Rank composition: Rank composition Notice: r12(3) = r1(3) – r1~2(3) = 0.163 - 0.118 = 0.045 r1(2) * r2~2(3) = 0.239 * 0.038 = 0.009 = 0.2 * 0.045 = (1-)*r12(3) r12(3) = r1(2) r2~2(3)/ (1-) Rank composition: Rank composition i j k ri(k) rk~k(j) rik(j) = ri(k)rk~k(j)/(1-) Hubs: Hubs Instead of a single page k, we can use a set H of “hub” pages Define ri~H(j) as set of tours from i to j that do not include any node from H as interior nodes or end node Hubs example: Hubs example 1 2 3 4 5 0.4 0.4 0.8 0.8 0.4 0.4 0.8 0.2 H = {1,2} = 0.8 r2~H Node Iteration 0 1 stable 1 0 0 0 2 0.2 0.2 0.2 3 0 0 0 4 0 0.08 0.08 5 0 0.08 0.08 r1~H Node Iteration 0 1 stable 1 0.2 0 0.2 2 0 0 0 3 0 0.08 0.08 4 0 0 0 5 0 0 0Rank composition with hubs: Rank composition with hubs i j wi(h) rh~H(j) H h ri(j) = ri~H(j) + riH(j) riH(j) = åh2Hwi(h)rh~H(j)/(1-b) wi(h) = ri(h) if i = h wi(h) = ri(h) - (1-b) if i = h ri~H(j)Hubs rule example: Hubs rule example r2(3) = r2~H(3) + r2H(3) = 0 + r2H(3) = [r2(1)r1~H(3)]/0.2+[(r2(2)-0.2)r2~H(3)]/0.2 = [0.192*0.08]/0.2+[(0.407-0.2)*0]/0.2 = 0.077 H = {1,2} = 0.8 HHubs: Hubs Start with H = T, the teleport universe Add nodes to H such that given any pair of nodes i and j, there is a high probability that H separates i and j i.e., ri~H(j) is zero for most i,j pairs Observation: high page rank nodes are good separators and hence good hub nodes Hubs skeleton: Hubs skeleton To compute ri(j) we need: ri~H(j) for all i2H, j2V called the partial vector Sparse ri(h) for all h2H called the hubs skeleton i j ri(h) rh~H(j) ri~H(j) HStorage reduction: Storage reduction Say |T| = 1000, |H|=2000, N = 1 billion Store all basis vectors 1000*1 billion = 1 trillion nonzero values Use partial vectors and hubs skeleton Suppose each partial vector has N/200 nonzero entries Partial vectors = 2000*N/200 = 10 billion nonzero values Hubs skeleton = 2000*2000 = 4 million values Total = approx 10 billion nonzero values Approximately 100x compression