text mining

Presentation Description

A general overview of why text mining is useful

Presentation Transcript

Why Text Mining?:

Welcome to the age of too much information. We can now easily retrieve far more relevant information than is humanly possible to read.
© 2012 by VP Institute, All rights reserved

What does Text Mining give us?:

Since text mining allows us to use computers to “read” the information, we can digest far more information than we could before.

When does Text Mining Work?:

When what you seek is a pattern, not a specific document.
    There is a distinct difference between “search and retrieval” and “text mining”.
When your information is available in electronic, machine-readable form.
    PDF images introduce an added layer of complexity due to OCR issues.
When your electronic information is accessible via bulk download.
    If you have to download one or a few records at a time, adding a “bot” to do the downloads is possible but usually raises additional technical and licensing issues.

What are Patterns in Text?:

Patterns in text are the relationships between words or phrases that repeat across many different documents. For example, if one document mentions “sodium chloride” and “salt”, and then another document mentions both, and then another and another, you begin to assume that “sodium chloride” and “salt” are related.

Patterns have Meaning:

Patterns that we find represent higher-order abstractions within the large text collection. In our salt example, we can infer that sodium chloride is a salt.

How do we find a pattern?:

Use co-word bibliometrics / co-occurrence statistics to find relationships between pairs of words (Word 1 and Word 2).
Count the number of times the words appear together in a set of documents.
The higher the co-occurrence, the stronger the potential relationship.
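The counting step described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any particular text-mining product: the function name `cooccurrence_counts`, the sample documents, and the term list are all made up for this example, and terms are matched by simple case-insensitive substring search (an assumption that would be too crude for real collections).

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, terms):
    """For each pair of terms, count the documents that mention both.

    Hypothetical helper for illustration: matching is a plain
    case-insensitive substring test, with no tokenization or stemming.
    """
    lowered_terms = [t.lower() for t in terms]
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        # Terms present in this document, sorted so each pair has one fixed order.
        present = sorted(t for t in lowered_terms if t in text)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Made-up sample documents, echoing the salt example.
documents = [
    "Sodium chloride, commonly known as salt, dissolves in water.",
    "Table salt is mostly sodium chloride.",
    "Salt (sodium chloride) is used to preserve food.",
    "Water boils at 100 degrees Celsius at sea level.",
]
terms = ["sodium chloride", "salt", "water"]

counts = cooccurrence_counts(documents, terms)
for pair, n in counts.most_common():
    print(pair, n)
```

In this toy collection, “salt” and “sodium chloride” appear together in three of the four documents, while each of them appears with “water” only once, so the first pair is the strongest candidate relationship.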

What kind of questions can we answer with text-mining software?:

Who? Where? What? When?
