Latent Semantic Indexing

Category: Education
     
 

Presentation Transcript

Slide 1: 

Keywords www.lsikeywords.com Latent Semantic Indexing

Slide 2: 

LSI stands for Latent Semantic Indexing. It is an algorithm that identifies the semantic relationship between keywords and their context in an article, especially in online content.

Slide 3: 

What are LSI keywords? They are keywords that help search engines identify relevant content and give it appropriate weight.

Slide 4: 

For example: A search for "keywords" can bring back pages that include "keywords", "commands", "AdWords keywords", and so on.

Slide 5: 

If you enter a generic product name, LSI also searches for popular brand names. For example, if you search for "search engine", it will return popular search engines like Yahoo and Google.

Slide 6: 

How to write website content based on LSI keywords:

Slide 7: 

Create Great Usable Content: If your content is relevant to the target audience, it will have a better chance of being favored by an LSI-based search engine.

Slide 8: 

Write naturally: When you write naturally, you rarely use the same word over and over again. Good article writers tend to vary their language, and that variation is what LSI aims to identify.

Slide 9: 

Widen Your Focus: Do not keep repeating the same keyword. Instead, use words that are likely to appear in the same context. Also, mention popular brand names related to your keywords.

Slide 10: 

How Does Latent Semantic Indexing Really Work?

Slide 11: 

LSI, or Latent Semantic Indexing, looks at patterns of word distribution (specifically, word co-occurrence) across a set of documents.

Slide 12: 

Natural language is full of redundancies, and not every word that appears in a document carries semantic meaning. In fact, the most frequently used words in English are words that don't carry content at all:

Slide 13: 

functional words, conjunctions, prepositions, auxiliary verbs, and others

Slide 14: 

Quick recipe for generating a list of content words from a document:

Slide 15: 

Make a complete list of all the words that appear anywhere in the collection
Discard pronouns
Discard common verbs (know, see, do, be)
Discard articles, prepositions, and conjunctions

Slide 16: 

Discard any words that appear in only one document
Discard any words that appear in every document
Discard frilly words (therefore, thus, however, albeit, etc.)
Discard common adjectives (big, late, high)
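The recipe above can be sketched in a few lines of Python. This is only an illustration: the `STOP_WORDS` set is a tiny hypothetical stand-in for a real stop list, the tokenizer simply grabs runs of letters, and the function name `content_words` is invented for this example.

```python
import re

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {
    "the", "a", "an", "of", "in", "on", "and", "or", "but",  # articles, prepositions, conjunctions
    "know", "see", "do", "be", "is", "are",                  # common verbs
    "i", "you", "he", "she", "it", "we", "they",             # pronouns
    "big", "late", "high",                                   # common adjectives
    "therefore", "thus", "however", "albeit",                # frilly words
}

def content_words(documents):
    """Return the sorted content words for a list of document strings."""
    # Reduce each document to the set of lowercase words it contains.
    doc_words = [set(re.findall(r"[a-z]+", doc.lower())) for doc in documents]
    # Complete word list for the collection, minus the stop list.
    vocabulary = set().union(*doc_words) - STOP_WORDS
    kept = []
    for word in sorted(vocabulary):
        doc_count = sum(word in words for words in doc_words)
        # Discard words that appear in only one document or in every document.
        if 1 < doc_count < len(documents):
            kept.append(word)
    return kept

docs = [
    "The cat chased the mouse in the garden",
    "A mouse is a small animal",
    "The garden is big and the cat sleeps",
]
print(content_words(docs))  # ['cat', 'garden', 'mouse']
```

With only three toy documents, most words appear just once and are dropped, so the surviving list is short. On a real collection the same filters leave a much larger vocabulary.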

Slide 17: 

Want to know more? Thinking Inside the Grid

Slide 18: 

Using our list of content words and documents, we can now generate a term-document matrix. This is a fancy name for a very large grid, with documents listed along the horizontal axis, and content words along the vertical axis. For each content word in our list, we go across the appropriate row and put an 'X' in the column for any document where that word appears. If the word does not appear, we leave that column blank.

Slide 19: 

Doing this for every word and document in our collection gives us a mostly empty grid with a sparse scattering of X-es. This grid displays everything that we know about our document collection. We can list all the content words in any given document by looking for X-es in the appropriate column, or we can find all the documents containing a certain content word by looking across the appropriate row.

Slide 20: 

Notice that our arrangement is binary - a square in our grid either contains an X, or it doesn't. This big grid is the visual equivalent of a generic keyword search, which looks for exact matches between documents and keywords. If we replace blanks and X-es with zeroes and ones, we get a numerical matrix containing the same information.
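As a rough illustration of the grid described above, the following Python sketch builds a binary term-document matrix with one row per content word and one column per document. The function name, the sample terms, and the sample documents are all made up for this example, and words are split on whitespace only.

```python
def term_document_matrix(terms, documents):
    """Rows are terms, columns are documents; 1 marks an occurrence, 0 a blank."""
    doc_words = [set(doc.lower().split()) for doc in documents]
    return [[1 if term in words else 0 for words in doc_words] for term in terms]

terms = ["cat", "garden", "mouse"]
docs = [
    "The cat chased the mouse in the garden",
    "A mouse is a small animal",
    "The garden is big and the cat sleeps",
]

matrix = term_document_matrix(terms, docs)
for term, row in zip(terms, matrix):
    print(f"{term:>8}", row)

# Looking across a row finds every document that contains a given word.
mouse_row = matrix[terms.index("mouse")]
print([i for i, cell in enumerate(mouse_row) if cell])  # [0, 1]

# Looking down a column lists the content words of a given document.
print([t for t, row in zip(terms, matrix) if row[2]])  # ['cat', 'garden']
```

Storing the grid as a list of rows keeps the sketch short. Because real term-document matrices are mostly zeroes, practical systems usually use a sparse representation instead.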

Slide 21: 

The key step in LSI is decomposing this matrix using a technique called singular value decomposition (SVD). The mathematics of this transformation is beyond the scope of this article, but we can get an intuitive grasp of what SVD does by thinking of the process spatially. An analogy will help.

Slide 22: 

Breakfast in Hyperspace

Slide 23: 

Suppose you survey a day's breakfast orders, counting how often three keywords (bacon, eggs, and coffee) appear in each one. You can graph the results of your survey by setting up a chart with three orthogonal axes - one for each keyword. The choice of direction is arbitrary - perhaps a bacon axis in the x direction, an eggs axis in the y direction, and the all-important coffee axis in the z direction. To plot a particular breakfast order, you count the occurrences of each keyword, and then take the appropriate number of steps along the axis for that word. When you are finished, you get a cloud of points in three-dimensional space, representing all of that day's breakfast orders.

Slide 25: 

If you draw a line from the origin of the graph to each of these points, you obtain a set of vectors in 'bacon-eggs-and-coffee' space. The size and direction of each vector tell you how many of the three key items were in any particular order, and the set of all the vectors taken together tells you something about the kind of breakfast people favor on a Saturday morning.
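To make the 'bacon-eggs-and-coffee' space concrete, here is a small Python sketch that turns breakfast orders into three-dimensional vectors and measures their length. The sample orders are invented for illustration, and the helper names `order_vector` and `length` are not from the original material.

```python
import math

KEYWORDS = ("bacon", "eggs", "coffee")  # one axis per keyword: x, y, z

def order_vector(order):
    """Count each keyword in an order to get its coordinates."""
    words = order.lower().split()
    return tuple(words.count(keyword) for keyword in KEYWORDS)

def length(vector):
    """Euclidean distance from the origin to the plotted point."""
    return math.sqrt(sum(component * component for component in vector))

# Hypothetical orders, written as space-separated keywords.
orders = [
    "bacon eggs coffee",
    "coffee coffee",
    "eggs eggs bacon coffee",
]

vectors = [order_vector(order) for order in orders]
for order, vector in zip(orders, vectors):
    print(order, "->", vector, "length", round(length(vector), 3))
```

Each tuple is one point in the cloud: the first order lands at (1, 1, 1), the second sits two steps along the coffee axis, and the third at (1, 2, 1).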

Slide 26: 

Singular Value Decomposition

Slide 27: 

Imagine you keep tropical fish, and are proud of your prize aquarium - so proud that you want to submit a picture of it to Modern Aquaria magazine, for fame and profit. To get the best possible picture, you will want to choose a good angle from which to take the photo. You want to make sure that as many of the fish as possible are visible in your picture, without being hidden by other fish in the foreground.
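One common way to read the "good angle" in this analogy is as the direction along which the data is most spread out, which is what the largest singular value and its vectors describe. The sketch below approximates only that leading direction using power iteration in plain Python. It is an assumption-laden illustration, not a full SVD and not necessarily how any LSI system computes it; the function name and the small sample matrix are invented here.

```python
import math

def leading_singular_triplet(matrix, iterations=200):
    """Approximate the largest singular value and its left/right vectors.

    Repeatedly applies A^T A to a starting vector (power iteration).
    Assumes the all-ones start is not orthogonal to the leading direction,
    which holds for non-negative, non-zero matrices such as word counts.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols
    for _ in range(iterations):
        # u = A v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = A^T u
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The singular value is the length of A v once v has settled.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return sigma, u, v

# Hypothetical 3-term by 3-document binary matrix.
A = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 0],
]
sigma, u, v = leading_singular_triplet(A)
print("leading singular value:", round(sigma, 4))
print("document direction:", [round(x, 4) for x in v])
```

A complete decomposition yields one such triplet per dimension. LSI keeps only the few largest, which is where the reduction in dimensions comes from.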

Slide 28: 

Check it at: www.lsikeywords.com