Presentation Transcript (added November 07, 2007; license: All Rights Reserved)
Learning the Semantic Meaning of a Concept from the Web: Yang Yu, Master’s Thesis Defense, August 03, 2006
The Problem: Manually preparing training data for text-classification-based ontology mapping is expensive.
The Thesis: Solution: automatically collecting training data for the concepts defined in an ontology. Contribution: reduce the amount of human work; fully automated ontology mapping.
Overview: Background (the Semantic Web and ontology; ontology mapping). Proposal. System. Experimental results (WEAPONS ontology; LIVING_THINGS ontology). Discussion and conclusion.
Semantic Web and Ontology: What is it?
“an extension of the current web.” An example.
Ontology Mapping: Definition: r = f(Ci, Cj), where i = 1, …, n and j = 1, …, m, and r ∈ {equivalent, subClassOf, superClassOf, complement, overlapped, other}. Interoperability problem: independently developed ontologies for the same or overlapping domains.
Approaches to Ontology Mapping: Manual mapping. String matching. Text classification: the semantic meaning of a concept is reflected in the training data that use the concept; probabilistic feature model; classification results depend heavily on the training data.
Motivation: Preparing exemplars manually is costly. Billions of documents are available on the web. Search engines.
The Proposal: Use the concept defined in an ontology as a query and process the search results to obtain exemplars. Verification: build a prototype system; check ontology mapping results.
System overview – Part I: Search engine
The parser (Query expansion): FOOD+FRUIT+APPLE
The retriever
The processor
Naïve Bayes text classifier: Bow toolkit (McCallum, Andrew Kachites, Bow: A toolkit for statistical language modeling, text retrieval, classification and clustering, http://www.cs.cmu.edu/~mccallum/bow, 1996). Commands: rainbow -d model --index dir/* and rainbow -d model --query. Bayes rule. Naïve Bayes text classifier.
Bayes Rule: P(A | B) = P(B | A) P(A) / P(B)
Naïve Bayes classifier: A text classification problem: “What’s the most probable classification of the new instance given the training data?” vj: category j.
(a1, a2, …, an): attributes of a new document. So naïve. (Mitchell, Tom, Machine Learning, McGraw Hill, 1997)
System overview – Part II
The model builder: Mutually exclusive and exhaustive. Leaf classes. C+ and C-.
The calculator: The Naïve Bayes text classifier tends to give extreme values (1/0). Tasks: feed exemplars to the classifier one by one; keep records of classification results; take averages and generate a report.
An Example of the Calculator: Diagram: 200 APC exemplars are fed to the classifier, whose classes are TANK-VEHICLE, AIR-DEFENSE-GUN, and SAUDI-NAVAL-MISSILE-CRAFT. P(TANK-VEHICLE | APC) = 170/200 = 0.85; P(AIR-DEFENSE-GUN | APC) = 0.10; P(SAUDI-NAVAL-MISSILE-CRAFT | APC) = 0.05.
Experiments with WEAPONS ontology: Information Interpretation and Integration Conference (http://www.atl.lmco.com/projects/ontology/i3con.html). WeaponsA.n3 and WeaponsB.n3. Both define over 80 classes. More than 60 classes are leaf classes. Similar structure.
WeaponsA.n3: Part of WeaponsA.n3: TANK-VEHICLE, MODERN-NAVAL-SHIP, WEAPON, CONVENTIONAL-WEAPON, WARPLANE, ARMORED-COMBAT-VEHICLE, PATROL-CRAFT, AIRCRAFT-CARRIER, SUPER-ETENDARD
WeaponsB.n3: Part of WeaponsB.n3
Expected Results: Part of WeaponsB.n3: TANK-VEHICLE, SUPER-ETENDARD, LIGHT-TANK, APC, PATROL-WARTER-CRAFT, AIRCRAFT-CARRIER, LIGHT-AIRCRAFT-CARRIER, PATROL-BOAT-RIVER, PATROL-BOAT, FIGHTER-PLANE, FIGHTER-ATTACK-PLANE, SUPER-ETENDARD-FIGHTER, PATROL-CRAFT
A Typical Report: P(APC | Ci), where i = 1, …, 63
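The calculator's three tasks (feed exemplars one by one, record the results, take averages) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function name `estimate_conditional_probabilities`, the stand-in classifier `toy_classify`, and the synthetic exemplar strings are all hypothetical, and the sketch assumes the classifier returns a single hard label per exemplar, which is consistent with the remark that Naïve Bayes tends to give extreme values (1/0).

```python
from collections import Counter
from typing import Callable, Iterable


def estimate_conditional_probabilities(
    exemplars: Iterable[str],
    classify: Callable[[str], str],
) -> dict[str, float]:
    """Classify exemplars one by one and return, for each class,
    the fraction of exemplars that were assigned to it."""
    counts = Counter(classify(text) for text in exemplars)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def toy_classify(text: str) -> str:
    """Hypothetical stand-in for a trained classifier: the label is
    simply the prefix before the first colon of the exemplar text."""
    return text.split(":", 1)[0]


# Synthetic data mirroring the slide's example: 200 APC exemplars,
# of which 170 / 20 / 10 land in the three candidate classes.
apc_exemplars = (
    ["TANK-VEHICLE: exemplar text"] * 170
    + ["AIR-DEFENSE-GUN: exemplar text"] * 20
    + ["SAUDI-NAVAL-MISSILE-CRAFT: exemplar text"] * 10
)

report = estimate_conditional_probabilities(apc_exemplars, toy_classify)
for label, probability in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"P({label} | APC) = {probability:.2f}")
```

In a real pipeline `toy_classify` would be replaced by a call to the trained text classifier; the averaging step itself stays the same.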
Classes with highest conditional probability
Different numbers of exemplars (whole)
Different numbers of exemplars (sentence)
Comparison of mapping accuracy of different groups of experiments: Higher conditional probability
Experiment with LIVING_THINGS ontology: P(MAN | HUMAN); P(WOMAN | HUMAN). Find a mapping for GIRL.
Actual Experiment Results: L-1: Results of experiment (1)
Actual Experiment Results: L-2: With clustering on exemplars. Without clustering on exemplars, with additional classes.
Actual Experiment Results: Different Queries: Queries augmented with class properties
Actual Experiment Results: L-4: Results of experiment (1) with new queries. Results of experiment (2) with new queries.
Limitation 1: An exemplar is not a sample of a concept: An exemplar is a combination of strings that represents some usage of a concept. An exemplar is not an instance of a concept. The way we calculate conditional probability is an estimate.
Limitation 2: Popularity does not equal relevancy: Limited by a search engine’s algorithm (PageRank™). Popularity does not equal relevancy. Weights cannot be specified for words in a search query.
Limitation 3: Relevancy does not equal similarity: Search results for concept A: text related to concept A; text against concept A; text for concept A, i.e.
desired exemplars; text for related concept B.
Related Research: UMBC OntoMapper: Sushama Prasad, Peng Yun and Finin Tim, A Tool for Mapping between Two Ontologies Using Explicit Information, AAMAS 2002 Workshop on Ontologies and Agent Systems, 2002. CAIMEN: Lacher S. Martin and Groh Georg, Facilitating the Exchange of Explicit Knowledge through Ontology Mappings, Proc. of the Fourteenth International FLAIRS Conference, 2001. GLUE: Doan Anhai, Madhavan Jayant, Dhamankar Robin, Domingos Pedro, and Halevy Alon, Learning to Match Ontologies on the Semantic Web, WWW2002, May 2002. Google Conditional Probability: P(HUMAN | MAN) = 1.77 billion / 2.29 billion = 0.77; P(HUMAN | WOMAN) = 0.6 billion / 2.29 billion = 0.26. Wyatt D., Philipose M., and Choudhury T., Unsupervised Activity Recognition Using Automatically Mined Common Sense, Proceedings of AAAI-05, pp. 21-27.
Conclusion and Future Work: Text retrieved from the web can be used as exemplars for text-classification-based ontology mapping. Many parameters affect the quality of the exemplars. There is noise in the processed documents. Future work: clustering.
Questions
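The "Google Conditional Probability" figures under Related Research are ratios of search-engine hit counts. The short Python sketch below reproduces that arithmetic only. The function name `hit_count_ratio` is hypothetical, and what each hit count measures is an assumption here: the slide gives only the numbers, so the sketch treats the numerator as the count for a query covering both terms and the denominator as the count for the conditioning query.

```python
def hit_count_ratio(joint_hits: float, condition_hits: float) -> float:
    """Estimate a conditional probability as the ratio of two hit counts.

    Assumption: joint_hits counts pages matching both terms and
    condition_hits counts pages matching the conditioning term.
    """
    if condition_hits <= 0:
        raise ValueError("condition_hits must be positive")
    return joint_hits / condition_hits


# Hit counts as quoted on the slide (in pages).
p_human_given_man = hit_count_ratio(1.77e9, 2.29e9)
p_human_given_woman = hit_count_ratio(0.6e9, 2.29e9)

print(f"P(HUMAN | MAN)   = {p_human_given_man:.2f}")
print(f"P(HUMAN | WOMAN) = {p_human_given_woman:.2f}")
```

Unlike the exemplar-based calculator, this estimate needs no classifier, but it inherits the search-engine limitations listed above, since hit counts reflect popularity rather than relevancy.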