EACL06 OLTutorial


Slide1 : EACL-2006   11th Conference of the European Chapter of the Association for Computational Linguistics   Tutorial Notes Ontology Learning from Text   Paul Buitelaar DFKI GmbH, Saarbrücken Philipp Cimiano AIFB, Univ. of Karlsruhe April 2006 Trento, Italy


Aims of the Tutorial : Aims of the Tutorial Present an Overview of Ontology Learning Methods in the Context of NLP Systems Analyze Ontology Learning (from Text) as a Layered Set of Sub-Tasks Discuss Methods, Evaluation and Available Tools for each Layer Provide Pointers to Relevant Literature


Structure of the Tutorial : Structure of the Tutorial Introduction Ontologies: Origin and Purpose Ontologies for NLP Applications: IR, IE, MT, QA Ontologies and Lexical Semantics Layers in Ontology Learning from Text Terms (Multilingual) Synonyms Concept Formation – Intension & Extension Relations Concept & Relation Hierarchies Axioms Schemata & General Axioms - Rules Wrap Up What have we Learned in the Tutorial? Where are we Today? Where are we Heading?


Part I : Part I Ontologies: Origin and Purpose


Ontologies in Philosophy : Ontologies in Philosophy A Branch of Philosophy that Deals with the Nature and Organization of Reality Science of Being (Aristotle, Metaphysics) What Characterizes Being? Eventually, what is Being?


Ontologies in Computer Science : Ontologies in Computer Science Ontology refers to an engineering artifact a specific vocabulary used to describe a certain reality a set of explicit assumptions regarding the intended meaning of the vocabulary An Ontology is an explicit specification of a conceptualization [Gruber 93] a shared understanding of a domain of interest [Uschold and Gruninger 96]


Why Develop an Ontology? : Why Develop an Ontology? Make domain assumptions explicit Easier to exchange domain assumptions Easier to understand and update legacy data Separate domain knowledge from operational knowledge Re-use domain and operational knowledge separately A community reference for applications Shared understanding of what information means


Applications of Ontologies : Applications of Ontologies NLP Information Extraction, e.g. [Buitelaar et al. 06], [Stevenson et al. 05], [Mädche et al. 02] Information Retrieval (Semantic Search), e.g. WebKB [Martin et al. 00], SHOE [Hendler et al. 00], OntoSeek [Guarino et al. 99] Question Answering, e.g. [Sinha and Narayanan 05], [Schlobach et al. 04], Aqualog [Lopez and Motta 04], [Pasca and Harabagiu 01] Machine Translation, e.g. [Nirenburg et al. 04], [Beale et al. 95], [Hovy and Nirenburg 92], [Knight 93] Other Business Process Modeling, e.g. [Uschold et al. 98] Information Integration, e.g. [Kashyap 99], [Wiederhold 92] Knowledge Management (incl. Semantic Web), e.g. [Fensel 01], [Mulholland et al. 2001], [Staab and Schnurr 00], [Sure et al. 00], [Abecker et al. 97] Software Agents, e.g. [Gluschko et al. 99], [Smith and Poulter 99] User Interfaces, e.g. [Kesseler 96]


Types of Ontologies [Guarino 98] : Types of Ontologies [Guarino 98] Top-level ontologies describe very general concepts like space, time, event, which are independent of a particular problem or domain. Domain ontologies describe the vocabulary related to a generic domain by specializing the concepts introduced in the top-level ontology. Task ontologies describe the vocabulary related to a generic task or activity by specializing the top-level ontologies. Application ontologies: concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity.


Ontologies and Their Relatives : Ontologies and Their Relatives


Ontologies and Their Relatives (continued) : Ontologies and Their Relatives (continued)


Thesauri - Examples : Thesauri - Examples


Semantic Networks - Examples : Semantic Networks - Examples


Ontologies – Example (Geographical) : Ontologies – Example (Geographical) capital city Neckar Zugspitze GE Natural GE Inhabited GE country river mountain instance_of Germany Berlin Stuttgart is-a flow_through located_in capital_of flow_through flow_through located_in capital_of 367 length (km) 2962 height (m)


A Mathematical Definition [Stumme et al. 2003] : A Mathematical Definition [Stumme et al. 2003] Structure: C: set of concept identifiers R: set of relation identifiers

Part II : Part II Ontologies for NLP


Overview : Overview Ontologies in NLP Applications Information Retrieval Query Expansion Information Extraction Template Definition, Semantic Integration Question Answering Question Analysis, Answer Selection Machine Translation Interlingua Summarization Semantic Graphs Lexical Semantic / Ontological Knowledge Inference and Reasoning for Semantic Interpretation Compound Analysis, Coercion, Bridging, WSD, Semantic Roles Lexical Semantic Theory Qualia Structure, Meaning-Text Theory, Case Grammar, … A Lexicon Model for Ontologies LingInfo


Information Retrieval : Information Retrieval Query Expansion Works for Short Queries [Voorhees 94] Conceptual Indexing Indexing with Respect to WordNet can Improve Retrieval - Even at Considerable WSD Noise Levels [Gonzalo 98] Combination of Traditional Indexing with Semantic Features Improves Results in Cross-Lingual IR (e.g. [Volk et al. 02], [Vossen et al. 2006]) Document Clustering/Classification Extended Bag-of-Words Model with WordNet or MeSH leads to 2-7% Improvement [Bloehdorn et al. 05]


Information Extraction : Information Extraction Class-based Template Definition Allows for Reasoning over Extracted Templates with Respect to the Ontology (see e.g. [Nedellec and Nazarenko 05] for discussion) Rule Induction Discovering Semantically Similar Patterns (e.g. Unsupervised Approach w.r.t. WordNet [Stevenson and Greenwood 05]) Discourse Analysis Event Co-Reference Resolution (e.g. LaSIE [Gaizauskas et al. 95]) Semantic Integration (“Template Merging”) Extraction from Heterogeneous Sources (Text, Tables and other Semi-Structured Data, Image Captions) – SmartWeb [Buitelaar et al. 06a/b] Multi-Document Information Extraction – ArtEquAKT [Alani et al. 03]


Question Answering : Question Answering Question Analysis Ontology/WordNet-based Semantic Question Interpretation e.g. [Pasca and Harabagiu 01] Answer Selection Ontology/WordNet-based Reasoning for Answer Type-Checking Ontology of Events [Sinha and Narayanan 05] Geographical Ontology, WordNet [Schlobach et al. 04] WordNet [Pasca and Harabagiu 01] Ontology-based Question Answering Derive Answers from a Knowledge Base e.g. Aqualog [Lopez and Motta 04]


Machine Translation & Summarization : Machine Translation & Summarization Conceptual Model for Interlingua in MT Not much current work, but see e.g. [Hovy and Nirenburg 92], [Knight 93] Background Knowledge from Relevant Ontologies in Concept-based Summarization Not common, but see e.g. [Lee et al. 04], [Lenci et al. 02] Multi-Document Concept-based Summarization e.g. ArtEquAKT [Alani et al. 03]


Semantic Interpretation : Semantic Interpretation Ontology-based Inference and Reasoning for Compound Analysis headache medicine (medicine cure headache) Metonymy and Coercion The Boston office called (office > person, person part_of office) I began the book (book > event, read telic book) Bridging and Discourse Peter bought a car. The engine runs well (engine part_of car) Word Sense Disambiguation in the corner (> location) / before the corner is taken (> event) Beckham kicked the ball (kick > shot) / the referee (kick > foul)


Sense Disambiguation / Assignment : Sense Disambiguation / Assignment … with Wordnets Domain independent, with high ambiguity rate Sense Disambiguation based on semantic distance measures that exploit taxonomic and non-hierarchical structure, e.g. [Navigli and Velardi 04], [Resnik 98], [Agirre and Rigau 96] … with Domain Ontology Domain-specific, with low ambiguity rate Primarily non-ambiguous Sense Assignment In case of ambiguity, Sense Disambiguation based on: Semantic Distance (Statistical) – as above Formal Inference (Reasoning) and Hybrid Approaches – like Sense Resolution work of 70s/80s/early 90s (also: Qualia Structure), but with large scale domain-specific ontologies/corpora


Semantic Role Assignment : Semantic Role Assignment … with FrameNet and Similar Domain independent High ambiguity rate Classifier Induction on the basis of training data e.g. CoNLL Task on Semantic Role Labeling [Carreras and Marquez 04], [Baldewein et al. 04], [Gildea and Jurafsky 02] … with Domain Ontology Domain-specific Less ambiguity No availability of training data Relation extraction/discovery


Lexical Semantics : Lexical Semantics Lexical Semantic Theory Link Morphological/Syntactic Structure to Semantic Structure Theories, e.g. Generative Lexicon / Qualia Structure [Pustejovsky 95] Lexical Functions / Meaning-Text Theory [Mel‘cuk and Polguere 87] Case Grammar / FrameNet [Fillmore 68] Ontology-Driven (Lexical) Semantic Interpretation Link Ontological Knowledge - Classes, Relations, Properties - to Lexical/Terminological Realizations Semantics is in the Ontology, not in the Lexicon! LingInfo - A Model for Integrating Linguistic (Lexical) Information in Ontologies [Buitelaar, Sintek and Kiesel 05]


Ontology Meta-Classes for LingInfo : Ontology Meta-Classes for LingInfo


LingInfo Model : LingInfo Model


Example Instance: “Fußballspielers” (of the football player) : Example Instance: “Fußballspielers” (of the football player) Fußballspielers term morphSynDecomp de lang inst0 : LingInfo wordForm … singular number Fußballspielers ortographicForm Noun partOfSpeech male gender genitive case inst1 : InflectedWordForm isComposedOf singular number Fußballspieler ortographicForm Noun partOfSpeech male gender nominative case inst2 : Stem root Fußball orthographicForm modifier function isComposedOf semantics ... 1 analysisIndex inst3 : Stem … Spieler orthographicForm root 2 analysisIndex inst8 : Stem Spieler orthographicForm … inst1 : Root inst7 : Stem (Ball) inst5 : Stem (Fuß) inst4 : Root (Ball) inst6 : Root (Fuß) o:BallObject


Part III : Part III Methods in Ontology Learning from Text


Motivation for Ontology Learning from Text : Motivation for Ontology Learning from Text Problem: Knowledge Acquisition Bottleneck Possible solution: Data-driven Knowledge Acquisition As text is massively available on the Web, ontology learning from text is an attractive option


OL from Text as Reverse Engineering : OL from Text as Reverse Engineering


OL from Text - Some “pre-History” : OL from Text - Some “pre-History” AI - Knowledge Acquisition Since 60s/70s: Semantic Network Extraction and similar for Story Understanding e.g. MARGIE (Schank et al. 73), LUNAR (Woods 73) NLP - Lexical Knowledge Extraction 70s/80s/early 90s: Extraction of Lexical Semantic Representations from Machine Readable Dictionaries e.g. ACQUILEX LKB (Copestake et al. 92) 80s/90s: Extraction of Semantic Lexicons from Corpora for Information Extraction Systems e.g. AutoSlog (Riloff 93), CRYSTAL (Soderland et al. 95) IR - Thesaurus Extraction Since 60s: Extraction of Keywords, Thesauri and Controlled Vocabularies e.g. (Sparck-Jones 66/86, 71), Sextant (Grefenstette 92), DR-Link (Liddy 94)


Some Current Work on OL from Text : Some Current Work on OL from Text Terms, Synonyms & Classes Statistical Analysis Patterns (Shallow) Linguistic Parsing Term Disambiguation & Compositional Interpretation Taxonomies Statistical Analysis & Clustering (e.g. FCA) Patterns (Shallow) Linguistic Parsing WordNet Relations Anonymous Relations (e.g. with Association Rules) Named Relations (Linguistic Parsing) (Linguistic) Compound Analysis Web Mining, Social Network Analysis Definitions (Linguistic) Compound Analysis (incl. WordNet) Overview of Current Work: Paul Buitelaar, Philipp Cimiano, Bernardo Magnini Ontology Learning from Text: Methods, Evaluation and Applications Frontiers in Artificial Intelligence and Applications Series, Vol. 123, IOS Press, July 2005.


Slide34 : Ontology Learning Layer Cake


Evaluation : Evaluation Gold Standard Human Evaluation Task-based Other


Tools : Tools


Slide37 : Ontology Learning Layer Cake Terms (Multilingual) Synonyms Concept Formation Concept Hierarchy Relations Axiom Schemata General Axioms Relation Hierarchy


Terms : Terms Terms are at the basis of the ontology learning process Terms express more or less complex semantic units But what is a term? Huge Selection of Top Brand Computer Terminals Available for Immediate Delivery Because Vecmar carries such a large inventory of high-quality computer terminals, including: ADDS terminals, Boundless terminals, DEC terminals, HP terminals, IBM terminals, LINK terminals, NCR terminals and Wyse terminals, your order can often ship same day. Every computer terminal shipped to you is protected with careful packing, including thick boxes. All of our shipping options - including international - are available through major carriers. Extracted term candidates (phrases) computer terminal computer terminal ? high-quality computer terminal ? top brand computer terminal ? HP terminal, DEC terminal, …


Term Extraction : Term Extraction Determine most relevant phrases as terms Linguistic Methods Rules over linguistically analyzed text Linguistic analysis – Part-of-Speech Tagging, Morphological Analysis, … Extract patterns – Adjective-Noun, Noun-Noun, Adj-Noun-Noun, … Ignore Names (DEC, HP, …), Certain Adjectives (quality, top, …), etc. Statistical Methods Co-occurrence (collocation) analysis for term extraction within the corpus Comparison of frequencies between domain and general corpora Computer Terminal will be specific to the Computer domain Dining Table will be less specific to the Computer domain Hybrid Methods Linguistic rules to extract term candidates Statistical (pre- or post-) filtering
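
The comparison of frequencies between a domain corpus and a general corpus can be sketched as a ratio of relative frequencies. This is only an illustration: the function name, the add-one smoothing and all counts below are invented for the example and do not come from the tutorial.

```python
from collections import Counter

def relative_freq_ratio(term, domain_counts, general_counts):
    """Relative frequency in the domain corpus divided by that in the general corpus."""
    p_domain = domain_counts[term] / sum(domain_counts.values())
    # Add-one smoothing (an assumption) so unseen terms do not divide by zero.
    p_general = (general_counts[term] + 1) / (sum(general_counts.values()) + 1)
    return p_domain / p_general

# Invented toy counts for a computer-domain corpus and a general corpus.
domain = Counter({"computer terminal": 40, "dining table": 1, "order": 59})
general = Counter({"computer terminal": 2, "dining table": 30, "order": 968})

ratio_terminal = relative_freq_ratio("computer terminal", domain, general)
ratio_table = relative_freq_ratio("dining table", domain, general)
```

A ratio well above 1 suggests a domain-specific term; a ratio below 1 suggests the phrase is more typical of general language.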


Statistical Analysis : Statistical Analysis Scores used in Term Extraction: MI (Mutual Information) – Cooccurrence Analysis TFIDF – Term Weighting χ² (Chi-square) – Cooccurrence Analysis & Term Weighting Other c-value/nc-value (Frantzi & Ananiadou, 1999) Considers length (c-value) and context (nc-value) of terms Domain Relevance & Domain Consensus (Navigli and Velardi, 2004) Considers term distribution within (DC) and between (DR) corpora


TFIDF : TFIDF most popular weighting schema (normalized word frequency) tf(w) term frequency (number of word occurrences in a document) df(w) document frequency (number of documents containing the word) N number of all documents tfIdf(w) relative importance of the word in the document The word is more important if it appears several times in a target document The word is more important if it appears in fewer documents
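
The weighting formula itself did not survive in this transcript. A minimal sketch, assuming the common variant tfIdf(w) = tf(w) * log(N / df(w)) built from the quantities defined on the slide; the function name and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: weight} dict per tokenized document."""
    n_docs = len(docs)  # N: number of all documents
    # df(w): number of documents that contain w at least once
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # tf(w): occurrences of w in this document
        weights.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return weights

docs = [
    ["computer", "terminal", "computer", "order"],
    ["dining", "table", "order"],
    ["computer", "terminal", "shipping", "order"],
]
scores = tf_idf(docs)
```

In this toy collection "order" occurs in every document and therefore gets weight zero, while "computer" is weighted above "terminal" in the first document because it occurs there twice.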


C- / NC-value : C- / NC-value Combination of: C-value (indicator for termhood) NC-value (contextual indicators for termhood) C-value (frequency-based method sensitive to multi-word terms)


C- / NC-value : C- / NC-value NC-value (incorporation of information from context words indicating termhood) C-/NC-value
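
The C-value formula is missing from the transcript. The sketch below assumes the commonly cited definition from Frantzi and Ananiadou: C-value(a) = log2|a| * f(a) if a is not nested in a longer candidate, and log2|a| * (f(a) - mean frequency of the longer candidates containing a) otherwise. The helper names and the frequencies are made up for illustration, and the NC-value context weighting is not covered.

```python
import math

def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence inside `longer`."""
    n, m = len(longer), len(shorter)
    return n > m and any(longer[i:i + m] == shorter for i in range(n - m + 1))

def c_value(freq):
    """freq maps candidate terms (tuples of words) to corpus frequencies."""
    scores = {}
    for term, f in freq.items():
        # Frequencies of longer candidates in which this term is nested.
        nesting = [f_b for b, f_b in freq.items() if contains(b, term)]
        if nesting:
            f = f - sum(nesting) / len(nesting)
        scores[term] = math.log2(len(term)) * f
    return scores

# Invented frequencies for the candidates from the "Terms" slide.
freq = {
    ("computer", "terminal"): 10,
    ("high-quality", "computer", "terminal"): 3,
    ("top", "brand", "computer", "terminal"): 1,
}
scores = c_value(freq)
```

Note that with this definition single-word candidates always score zero (log2 of 1), which reflects the measure's focus on multi-word terms.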


Terms - Evaluation : Terms - Evaluation Gold Standard handcrafted term lists (e.g. [Frantzi and Ananiadou 1999]) domain-specific ontology (vocabulary overlap) (e.g. [Mädche 2002]) Human Evaluation assessment of relevance of terms a posteriori Task-based Indirect evaluation of coverage in IE, IR, ...


Terms – Tools : Terms – Tools


(Multilingual) Synonyms : (Multilingual) Synonyms Next step in ontology learning is to identify terms that share (some) semantics, i.e., potentially refer to the same concept Synonyms (Within Languages) ‘100% synonyms’ don’t exist – only term pairs with similar meanings Examples from http://thesaurus.com terminal – video display – input device graphics terminal - video display unit - screen Translations (Between Languages) ‘100% translations’ don’t exist - only multilingual term pairs with similar meanings Examples from http://dict.leo.org input device (English) – Eingabegerät (German) Back to English: input device, input unit, signal conditioning device video display unit (English) – Videosichtgerät (German)


Extraction of Synonyms : Extraction of Synonyms Term Classification and Clustering Classification Classifying terms to existing class systems, e.g., by extending WordNet (with SynSets corresponding to classes) Clustering Clusters according to similar distributions, e.g., by measuring co-occurrence between terms


Extraction of Translations : Extraction of Translations Multilingual Term Classification and Clustering - see e.g. [Grefenstette, 1998] Similar as with monolingual terms, but depending on translated contexts (i.e., document collections): Parallel Corpora: Pairs of translated documents Comparable Corpora: Pairs of documents in different languages on the same topic In both cases ‘need to cross the language barrier’ Parallel Corpora: Term alignment according to document structure (layout, linguistic, semantic) Comparable Corpora: Term alignment according to similar contexts, e.g. by translating context words (dictionary lookup)


Synonyms - Evaluation : Synonyms - Evaluation Gold Standard TOEFL (Landauer – LSA: 64.45%, Turney – PMI-IR: 48-74%) WordNet (problematic due to domain-independence, e.g. [Pantel and Lin 03]) WordNet „tuning“, e.g. [Cucchiarelli and Velardi 98], [Turcato 00], [Buitelaar and Sacaleanu 01] Human Evaluation Task-based (Cross-lingual) IR/QA - e.g. Query Expansion Other Artificial Evaluation (see [Grefenstette 94]) e.g. transform cell -> CELL in some contexts


Synonyms – Tools : Synonyms – Tools


The Semiotic Triangle : The Semiotic Triangle Ogden & Richards, 1923 based on Structural Linguistics studies (de Saussure, 1916) adopted in Knowledge Representation (e.g. Sowa, 1984)


Concepts: Intension, Extension, Lexicon : Concepts: Intension, Extension, Lexicon A term may indicate a concept, if we can define its Intension (in)formal definition of the set of objects that this concept describes a disease is an impairment of health or a condition of abnormal functioning Extension a set of objects (instances) that the definition of this concept describes influenza, cancer, heart disease, … Discussion: what is an instance? - ‘heart disease’ or ‘my uncle’s heart disease’ Lexical Realizations the term itself and its multilingual synonyms disease, illness, Krankheit, maladie, … Discussion: synonyms vs. instances – ‘disease’, ‘heart disease’, ‘cancer’, …


Concepts – Intension : Concepts – Intension Extraction of a Definition for a Concept from Text Informal Definition e.g., a gloss for the concept as used in WordNet OntoLearn (Navigli and Velardi 04; Velardi et al. 05) uses natural language generation to compositionally build up a WordNet gloss for automatically extracted concepts ‘Integration Strategy’ : “strategy for the integration of …” Formal Definition e.g., a logical form that defines all formal constraints on class membership Inductive Logic Programming, Formal Concept Analysis, …


Concepts – Extension : Concepts – Extension Extraction of Instances for a Concept from Text Commonly referred to as Ontology Population Relates to Knowledge Markup (Semantic Metadata) Uses Named-Entity Recognition and Information Extraction Instances can be: Names for objects, e.g. Person, Organization, Country, City, … Event instances (with participant and property instances), e.g. Football Match (with Teams, Players, Officials, ...) Disease (with Patient-Name, Symptoms, Date, …)


Concepts – Lexicon: LingInfo : Concepts – Lexicon: LingInfo


Concept Formation - Evaluation : Concept Formation - Evaluation Concept Extension Gold Standard overlap on clusters, e.g. OntoBasis overlap on set of instances w.r.t. KB (difficult) Human Evaluation (e.g. OntoBasis) Task Based QA from KBs Concept Intension (in/formal definitions) Gold Standard (e.g. WordNet glosses, WikiPedia) Human Evaluation (e.g. WordNet glosses [Velardi et al. 05]) Task Based Ontology Engineering Understanding Consistency


Concept Formation – Tools : Concept Formation – Tools


Taxonomy Extraction - Overview : Taxonomy Extraction - Overview Lexico-syntactic patterns Distributional Similarity & Clustering Linguistic Approaches Taxonomy Extension/Refinement Combination of Methods Evaluation Tools Matrix


Hearst Patterns [Hearst 1992] : Hearst Patterns [Hearst 1992] Patterns to extract a relation of interest fulfilling the following requirements: They should occur frequently and in many text genres. They should accurately indicate the relation of interest. They should be recognizable with little or no pre-encoded knowledge.


Acquiring Hearst Patterns : Acquiring Hearst Patterns Hearst also suggests a procedure for acquiring such patterns from a corpus: (1) Decide on a lexical relation R of interest, e.g. hyponymy/hypernymy. (2) Gather a list of terms for which this relation is known to hold, e.g. hyponym(car, vehicle). This list can be found automatically using the Hearst patterns or by bootstrapping from an existing lexicon or knowledge base. (3) Find places in the corpus where these expressions occur syntactically near one another. (4) Find the commonalities and generalize the expressions found in step 3 to yield patterns that indicate the relation of interest. (5) Once a new pattern has been identified, gather more instances of the target relation and go to step 3.


Hearst Patterns - Examples : Hearst Patterns - Examples Examples for hyponymy patterns: Vehicles such as cars, trucks and bikes Such fruits as oranges, nectarines or apples Swimming, running and other activities Publications, especially papers and books A seabass is a fish.


Hearst Patterns (Continued) : Hearst Patterns (Continued) Use regular expressions defined over syntactic categories: NP such as NP, NP, ... and NP; Such NP as NP, NP, ... or NP; NP, NP, ... and other NP; NP, especially NP, NP, ... and NP; NP is a NP; ... Precision w.r.t. WordNet: 55.46% (66/119) on the basis of a New York Times corpus. [Cederberg and Widdows 03] report lower results: 40%
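
Three of the patterns above can be sketched with plain regular expressions. This is a simplification, not the original implementation: an "NP" is assumed to be a single lowercase word, whereas a real system would match over noun-phrase chunks from a shallow parser. All names are hypothetical.

```python
import re

# Simplifying assumption: an NP is one lowercase word.
NP = r"[a-z]+"
NP_LIST = rf"{NP}(?:, {NP})*(?:,? (?:and|or) {NP})?"

PATTERNS = [
    # NP such as NP, NP, ... and NP
    (re.compile(rf"\b({NP}) such as ({NP_LIST})"), "hypernym_first"),
    # such NP as NP, NP, ... or NP
    (re.compile(rf"\bsuch ({NP}) as ({NP_LIST})"), "hypernym_first"),
    # NP, NP, ... and other NP
    (re.compile(rf"\b({NP}(?:, {NP})*),? (?:and|or) other ({NP})"), "hypernym_last"),
]

def extract_hyponyms(sentence):
    """Return (hyponym, hypernym) pairs matched by the patterns above."""
    text = sentence.lower()
    pairs = []
    for regex, order in PATTERNS:
        for match in regex.finditer(text):
            first, second = match.group(1), match.group(2)
            if order == "hypernym_first":
                hypernym, hyponyms = first, second
            else:
                hypernym, hyponyms = second, first
            # Split the enumeration on commas and on "and" / "or".
            for hyponym in re.split(r",? (?:and|or) |, ", hyponyms):
                pairs.append((hyponym, hypernym))
    return pairs

pairs = extract_hyponyms("Vehicles such as cars, trucks and bikes")
```

The sketch does no lemmatization, so plural forms are returned as they appear in the text.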


Extensions of Hearst’s approach : Extensions of Hearst’s approach Using Hearst Patterns for Anaphora Resolution Poesio et al. 02 / Markert et al. 03 Additional Patterns [Iwanska et al. 00] Using Questions [Sundblad 02] Application to collateral texts [Ahmad et al. 03] Matching patterns on the Web KnowItAll [Etzioni et al. 04-05], PANKOW [Cimiano et al. 04-05] Improving Accuracy (LSA) & Coverage (Conjunctions) [Cederberg and Widdows 03] Learning Patterns Snowball [Agichtein et al. 00], [Downey et al. 04], [Ravichandran and Hovy 02], [Snow et al. 04]


Improving Precision and Recall of Hearst patterns [Cederberg and Widdows 03] : Improving Precision and Recall of Hearst patterns [Cederberg and Widdows 03] Main Idea: Improve precision by filtering hyponym pairs using their similarity in WordSpace (error reduction by 30%, P=58%) Improve recall by using coordination information, i.e. A < B and coordinated(A,C) -> C < B This yields a five-fold increase in recall while maintaining precision at P=54% using the WordSpace filtering technique.
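
The coordination rule (A < B and coordinated(A,C) -> C < B) can be sketched as a single expansion pass over known hyponym pairs. The function name and the data are hypothetical, and the WordSpace filtering step is not shown.

```python
def expand_by_coordination(hyponym_pairs, coordinated):
    """Propose new (C, B) pairs for every known (A, B) where A is coordinated with C."""
    partners = {}
    for x, y in coordinated:
        # Coordination is symmetric: "cars and trucks" links both ways.
        partners.setdefault(x, set()).add(y)
        partners.setdefault(y, set()).add(x)
    known = set(hyponym_pairs)
    return {
        (c, b)
        for a, b in known
        for c in partners.get(a, ())
        if (c, b) not in known
    }

known_pairs = [("cars", "vehicles")]
coordinations = [("cars", "trucks"), ("trucks", "bikes")]
new_pairs = expand_by_coordination(known_pairs, coordinations)
```

Only one pass is applied here, so "bikes" is not yet linked to "vehicles"; repeating the pass would propagate the relation further at the cost of more noise.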


Generalizing Patterns : Generalizing Patterns Pantel, Ravichandran, Hovy using edit distance as a basis to generalize patterns Snowball [Agichtein et al. 00] patterns as tuples (left, arg1, middle, arg2, right), where the left, middle and right contexts are bags-of-words represented as vectors use dot product to calculate similarity calculating centroid as a generalization of the pattern Other [Downey et al. 04] [Ravichandran and Hovy 02] [Snow et al. 04]


Distributional Hypothesis & Vector Space Model : Distributional Hypothesis & Vector Space Model Harris, 1986 „Words are (semantically) similar to the extent to which they share similar words“ Firth, 1957 „You shall know a word by the company it keeps“ Idea: collect context information and represent it as a vector: compute similarity among vectors wrt. a measure
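
As a rough illustration of this idea, the sketch below collects word-window contexts as count vectors and compares them with the cosine measure. The window size, the function names and the three toy sentences are assumptions made for the example.

```python
import math
from collections import Counter

def context_vector(target, sentences, window=2):
    """Count the words occurring within `window` positions of `target`."""
    counts = Counter()
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token == target:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts

def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0

sentences = [
    ["we", "drive", "the", "car", "to", "town"],
    ["we", "drive", "the", "bus", "to", "town"],
    ["we", "enjoy", "the", "trip", "very", "much"],
]
car, bus, trip = (context_vector(w, sentences) for w in ("car", "bus", "trip"))
sim_car_bus = cosine(car, bus)
sim_car_trip = cosine(car, trip)
```

Here "car" and "bus" share all their context words, while "car" and "trip" share only "the", so the first pair comes out as far more similar.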


Context Features : Context Features Four-grams [Schuetze 93] Word-windows [Grefenstette 92] Predicate-Argument relations (SUBJ/OBJ/COMPLEMENT) Modifier Relations (fast car, the hood of the car) [Grefenstette 92, Cimiano 04b, Gasperin et al. 03] Appositions (Ferrari, the fastest car in the world) [Caraballo 99] Coordination (ladies and gentlemen) [Caraballo 99, Dorow and Widdows 03]


Extracting contextual features : Extracting contextual features The museum houses an impressive collection of medieval and modern art. The building combines geometric abstraction with classical references that allude to the Roman influence on the region. house_subj(museum) house_obj(collection) combine_subj(building) combine_obj(abstraction) combine_with(reference) allude_to(influence)


Clustering Concept Hierarchies from Text : Clustering Concept Hierarchies from Text Similarity-based Set-theoretical Soft clustering


Similarity-based Clustering : Similarity-based Clustering Similarity Measures: Binary (Jaccard, Dice) Geometric (Cosine, Euclidean/Manhattan distance) Information-theoretic (Relative Entropy, Mutual Information) (…) Linkage Strategies: Complete linkage Average linkage Single linkage (…) Methods: Hierarchical agglomerative clustering Hierarchical top-down clustering, e.g. Bi-Section KMeans (…)


Hierarchical Agglomerative Clustering : Hierarchical Agglomerative Clustering car bus trip excursion
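
The diagram for this slide is lost; only the four example words remain. A minimal sketch of agglomerative clustering over these words, assuming cosine similarity with average linkage. The feature vectors are invented for illustration.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def average_linkage(c1, c2, vectors):
    """Mean pairwise similarity between the members of two clusters."""
    total = sum(cosine(vectors[a], vectors[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def agglomerate(vectors):
    """Repeatedly merge the two most similar clusters; return the merge order."""
    clusters = [(word,) for word in vectors]
    merges = []
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], vectors),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges

# Invented context-feature counts for the four words on the slide.
vectors = {
    "car": [3, 2, 0, 0],
    "bus": [3, 2, 0, 1],
    "trip": [0, 1, 3, 2],
    "excursion": [0, 0, 2, 3],
}
merges = agglomerate(vectors)
```

With these vectors, "car" and "bus" are merged first, then "trip" and "excursion", and finally the two resulting clusters, which gives a binary tree over the four words.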


Bi-Section-KMeans : Bi-Section-KMeans


Clustering Concept Hierarchies : Clustering Concept Hierarchies Similarity-based Set Theoretical Soft clustering


Formal Concept Analysis [Ganter, Wille 1999] : Formal Concept Analysis [Ganter, Wille 1999] finds ‚closed‘ sets of attributes and objects (Formal Concepts) yields a hierarchy with a formal interpretation in terms of subsumption of attributes
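
To make the notion of a 'closed' set concrete, the sketch below enumerates all formal concepts of a tiny object/attribute table by brute force. The table is a toy example and the enumeration is exponential in the number of objects; practical FCA implementations use dedicated algorithms instead.

```python
from itertools import combinations

def formal_concepts(context):
    """context maps each object to its set of attributes.

    Returns all (extent, intent) pairs, each as a pair of frozensets.
    """
    objects = list(context)
    all_attributes = set().union(*context.values())
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            # Intent: attributes shared by every object in the subset.
            intent = set(all_attributes)
            for obj in subset:
                intent &= context[obj]
            # Extent: every object that has all attributes of the intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts

# Toy context: nouns as objects, verb-derived attributes (illustrative only).
context = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "bike": {"bookable", "rentable", "driveable", "rideable"},
    "excursion": {"bookable", "joinable"},
    "trip": {"bookable", "joinable"},
}
concepts = formal_concepts(context)
```

Ordering the resulting concepts by inclusion of their extents yields the hierarchy: for instance, the concept with extent {car, bike} lies below the one with extent {apartment, car, bike}, because its intent adds "driveable".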


Clustering – Comparison [Cimiano 04] : Clustering – Comparison [Cimiano 04]


Clustering Concept Hierarchies from Text : Clustering Concept Hierarchies from Text Similarity-based Set-theoretical & Probabilistic Soft clustering


What About Multiple Word Meanings? : What About Multiple Word Meanings? bank: financial institute or natural object? At least two clusters! So we need soft clustering algorithms: Clustering By Committee (CBC) [Lin et al. 2002] Gaussian Mixtures (EM) PoBOC (Pole-Based Overlapping Clustering) FCA (...) Challenge: recognize multiple word meanings!


Soft clustering algorithms : Soft clustering algorithms Principle underlying PoBOC and CBC: First construct ‘poles’ or ‘committees’ corresponding to very homogeneous groups of words, e.g. monosemous words. In a second step, assign words which do not form poles or committees to one or more committees; these are the ambiguous words. Additional trick in CBC: once you assign a word to a committee, remove the overlapping features, i.e. subtract the ‘meaning of the committee’


Approach by [Widdows and Dorow 2002] : Approach by [Widdows and Dorow 2002] Extract shallow grammatical relations for words -> build a context vector. Apply LSA/LSI to reduce dimension of co-occurrence matrix. Calculate similarity as the cosine of the angle between the corresponding vectors. Senses of a word = disjoint subgraphs


Scalability : Scalability Problem with clustering algorithms: Compute at least pairwise similarity between words, i.e. O(n²k) Idea of [Ravichandran, Pantel and Hovy] Apply locality sensitive hash functions i.e. approximate cosine measure by a randomized procedure


Randomly approximating the cosine measure : Randomly approximating the cosine measure where d is the number of random vectors!
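
The formula for this slide is missing from the transcript. The sketch below assumes the usual random-hyperplane scheme: each vector gets a d-bit signature from the signs of its dot products with d random vectors, and the cosine is estimated as cos(pi * h / d), where h is the Hamming distance between two signatures. The names, the seed and the example vectors are illustrative choices.

```python
import math
import random

def signature(vector, random_vectors):
    """One bit per random vector: 1 if the dot product is non-negative."""
    return [
        1 if sum(r_i * x_i for r_i, x_i in zip(r, vector)) >= 0 else 0
        for r in random_vectors
    ]

def approx_cosine(sig_u, sig_v):
    """Estimate the cosine from the Hamming distance between two signatures."""
    hamming = sum(a != b for a, b in zip(sig_u, sig_v))
    return math.cos(math.pi * hamming / len(sig_u))

def exact_cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
dim, d = 4, 2000        # d: number of random vectors
random_vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(d)]

u = [3.0, 2.0, 0.0, 1.0]
v = [2.0, 3.0, 1.0, 0.0]
sig_u, sig_v = signature(u, random_vectors), signature(v, random_vectors)
estimate = approx_cosine(sig_u, sig_v)
```

The estimate approaches the exact cosine as d grows; the signatures are short bit strings, so comparing them is much cheaper than comparing the original high-dimensional vectors.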


Linguistic Approaches : Linguistic Approaches Modifiers: Modifiers (adjectives/nouns) typically restrict or narrow down the meaning of the modified noun, e.g. isa(international credit card, credit card) Yields a very accurate heuristic for learning taxonomic relations, e.g. OntoLearn [Velardi & Navigli], OntoLT [Buitelaar et al., 2004], TextToOnto [Cimiano et al.], [Sanchez et al., 2005] Compositional interpretation of compounds [OntoLearn] e.g. long-term debt Disambiguate long-term and debt with respect to WordNet Generate a gloss out of the glosses of the respective synsets: long-term debt := „a kind of debt, the state of owing something (especially money), relating to or extending over a relatively long time“
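
The modifier heuristic can be sketched as follows, assuming English head-final terms given as whitespace-separated strings. The function name and the term list are hypothetical; none of the cited tools is claimed to work exactly this way.

```python
def head_modifier_isa(terms):
    """Propose isa(term, shorter term) by stripping leading modifiers."""
    vocabulary = set(terms)
    pairs = []
    for term in terms:
        words = term.split()
        # Drop modifiers from the left until a known term remains.
        for i in range(1, len(words)):
            parent = " ".join(words[i:])
            if parent in vocabulary:
                pairs.append((term, parent))
                break
    return pairs

terms = ["international credit card", "credit card", "card", "long-term debt", "debt"]
isa_pairs = head_modifier_isa(terms)
```

Each term is attached to its longest known suffix, so "international credit card" is placed under "credit card" rather than directly under "card".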


General Problem : General Problem


Hearst & Schuetze 1993 : Hearst & Schuetze 1993 For each word w in WordSpace: collect the 20 nearest neighbors in space using the cosine measure, compute the score s_i of category i for w as the number of nearest neighbors that are in i, and assign w to the highest scoring category.
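
The procedure above amounts to a nearest-neighbor vote. A minimal sketch, with a small k and invented two-dimensional vectors so that the example stays readable; the slide uses 20 neighbors in WordSpace.

```python
import math
from collections import Counter

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def assign_category(word_vector, labeled, k=20):
    """labeled: list of (vector, category) pairs for already classified words."""
    neighbors = sorted(
        labeled, key=lambda item: cosine(word_vector, item[0]), reverse=True
    )[:k]
    # s_i: number of nearest neighbors that belong to category i
    scores = Counter(category for _, category in neighbors)
    return scores.most_common(1)[0][0]

# Invented vectors and categories, for illustration only.
labeled = [
    ([1.0, 0.0], "vehicle"),
    ([0.9, 0.1], "vehicle"),
    ([0.8, 0.3], "vehicle"),
    ([0.0, 1.0], "event"),
    ([0.1, 0.9], "event"),
]
category_a = assign_category([1.0, 0.2], labeled, k=3)
category_b = assign_category([0.2, 1.0], labeled, k=3)
```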


Widdows 2003 : Widdows 2003 For a target word w, find words from the corpus whose distributions are similar to that of w. Consider these corpus-derived neighbors N(w). Map the target word w to the place in the taxonomy where the neighbors N(w) are most concentrated. Crucial question: What does most concentrated mean?


Determine where they are `most concentrated´ : Determine where they are `most concentrated´ Maximization problem:


Taxonomy Extension/Refinement : Taxonomy Extension/Refinement Conclusions: difficult problem approaches not comparable (datasets, measures, ontologies, number of concepts,...)


Initial Blueprints for Combination : Initial Blueprints for Combination Ontology learning is error-prone, combination of techniques can be expected to make results more accurate: [Caraballo 99] Label tree produced with hierarchical agglomerative clustering using lexico-syntactic patterns [Cimiano 05b/c] Guided Clustering Integrate a hypernym oracle with agglomerative clustering Classification-based approach use features derived from several learning paradigms [Cederberg & Widdows 03] Increase accuracy and coverage of lexico-syntactic patterns by using LSA and coordination patterns


Hierarchical Agglomerative Clustering with Postprocessing : Hierarchical Agglomerative Clustering with Postprocessing Caraballo’s Method [Caraballo 1999]: Agglomerative Clustering Labeling Clusters with hypernyms derived from Hearst patterns Removing unlabeled concepts thus compacting the hierarchy Evaluation: select 20 nouns with at least 20 hypernyms and present them to human judges with the 3 best hypernyms for each Results: Best Hypernym: 33% (Majority) / 39% (Any) Any Hypernym: 47.5% (Majority) / 60.5% (Any)


Classification-based approach [Cimiano et al. 2005b] : Classification-based approach [Cimiano et al. 2005b] isa(t1,t2)=p isaWN(t1,t2) isaHearst(t1,t2) isaWWW(t1,t2) isahead(t1,t2) Idea: Use as input features derived by applying different techniques, resources, etc. and find optimal combination in a supervised manner!


Results for Combination : Results for Combination


Concept Hierarchy - Evaluation : Concept Hierarchy - Evaluation Taxonomy Induction Gold Standard - comparison with hand-crafted taxonomy (e.g. [Mädche 01], [Cimiano 05a]) Human Evaluation of is-a triples (e.g. [Hearst 92] [Caraballo 99], [Cimiano 05b], [Cimiano 05c]) Taxonomy Extension/Refinement Gold Standard – leave-one-out method (e.g. [Mädche, Pekar and Staab 02]) Human Evaluation – a posteriori (e.g. [Hearst and Schütze 93]) Task-based WSD (e.g. [Agirre and Rigau 96]) IE (e.g. [Stevenson and Greenwood 05]) Text classification / clustering (e.g. [Bloehdorn et al. 05])


Concept Hierarchy – Tools : Concept Hierarchy – Tools


Slide101 : Ontology Learning Layer Cake Terms (Multilingual) Synonyms Concept Formation Concept Hierarchy Relations Axiom Schemata General Axioms Relation Hierarchy


Specific Relations / Attributes : Specific Relations / Attributes Part-of [Charniak et al. 98] X consists of Y Qualia [Yamada et al. 04, Cimiano & Wenderoth 05] Formal: such X as Y Purpose: X is used for Y Agentive: a ADV Xed Y Causation [Girju 02], [Sanchez 04] X leads to Y Attributes [Poesio and Almuhareb 05]


Attributes [Poesio et al. 2005] : Attributes [Poesio et al. 2005] Distinguish: Qualities (e.g. color of a car) Parts (e.g. hood of a car) Related-Objects (e.g. the track of the deer) Activities (e.g. the repairing of the car) Related-Agents (e.g. the driver of the car) Non-Attributes (e.g. the majority of the deer) Train classifier with the following features: Morphological Information Clustering Attributes on the basis of their attributes Issuing question patterns to Google (What is the color of X? vs. *When is the color of X?) Attributive Use (the size of the X vs. the X of the size) Results: 2-Way Classifier: 89.2% (Attribute), 55.1% (Non-Attribute) 5-Way Classifier: 79.9 – 93% (Attributes), 60.2% (Non-Attribute)


Qualia Structures : Qualia Structures Match patterns on the web to discover qualia relations [Cimiano and Wenderoth,2005] Formal: Y such as X Telic: X is used for Y Constitutive: X is made of Y Evaluation: judge assigns credits from 0 (wrong) to 3 (totally correct)
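
The three patterns above can be sketched as regular expressions over plain text. This is a deliberately naive illustration: it matches single-word arguments in a local string, whereas the approach on the slide matches patterns on the web. The pattern table, the function name, and the example text are assumptions.

```python
import re

# One regex per qualia role; x is the term, y the qualia element.
PATTERNS = {
    "formal": re.compile(r"\b(?P<y>\w+) such as (?:an? )?(?P<x>\w+)"),
    "telic": re.compile(r"\b(?:an? |the )?(?P<x>\w+) is used for (?P<y>\w+)"),
    "constitutive": re.compile(r"\b(?:an? |the )?(?P<x>\w+) is made of (?P<y>\w+)"),
}

def extract_qualia(text):
    """Return (role, x, y) triples for every pattern match, sentence by sentence."""
    found = []
    for sentence in re.split(r"[.!?]\s*", text.lower()):
        for role, pattern in PATTERNS.items():
            for match in pattern.finditer(sentence):
                found.append((role, match.group("x"), match.group("y")))
    return found

text = ("A knife is used for cutting. A knife is made of steel. "
        "Instruments such as a knife are sharp.")
for triple in extract_qualia(text):
    print(triple)
```

A realistic version would need noun-phrase chunking for multi-word terms and some frequency or association weighting to filter noisy matches.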


General Relations: Exploiting Linguistic Structure : General Relations: Exploiting Linguistic Structure OntoLT: SubjToClass_PredToSlot_DObjToRange Heuristic Maps a linguistic subject to a class, its predicate to a corresponding slot for this class and the direct object to the range of the slot TextToOnto: Acquisition of Subcategorization Frames love(man,woman) love(kid,mother) love(kid,grandfather) Problem related to acquisition of subcategorization frames and selectional restrictions in Natural Language Processing e.g. [Resnik 97], [Ribas 95], [Clark and Weir 02] love(person,person)
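
The step from love(man,woman), love(kid,mother), love(kid,grandfather) to love(person,person) can be illustrated with a least-common-subsumer lookup in a concept hierarchy. This is only a sketch: the toy hierarchy and function names are assumptions, and always taking the least common subsumer tends to over-generalize, which is why the cited work on selectional restrictions uses statistical criteria instead.

```python
# Toy concept hierarchy as child -> parent links (hypothetical).
PARENT = {
    "person": "entity",
    "man": "person", "woman": "person", "kid": "person",
    "mother": "woman", "grandfather": "man",
}

def path_to_root(concept):
    """Return the concept followed by all of its ancestors, most specific first."""
    path = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        path.append(concept)
    return path

def least_common_subsumer(concepts):
    """Most specific concept that subsumes every concept in the list."""
    common = set(path_to_root(concepts[0]))
    for concept in concepts[1:]:
        common &= set(path_to_root(concept))
    for node in path_to_root(concepts[0]):
        if node in common:
            return node
    return None

def generalize(instances):
    """Generalize (subject, object) pairs of one verb to a single signature."""
    subjects = [s for s, _ in instances]
    objects = [o for _, o in instances]
    return least_common_subsumer(subjects), least_common_subsumer(objects)

love = [("man", "woman"), ("kid", "mother"), ("kid", "grandfather")]
print(generalize(love))
```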


Which Relations are Actually the Same? : Which Relations are Actually the Same? Clustering of verbs semantically according to their alternation behavior [Schulte im Walde 00] Use EM algorithm Examples: {advise, teach, instruct} {fly, move, roll} {start, finish, stop, begin} {fight, play} {meet, play} {need, like, want, desire}


Finding the Right Level of Abstraction : Finding the Right Level of Abstraction [Ciaramita et al. 05] Genia Corpus + Genia Ontology Verb-based relations X activates B Use the χ² test to decide whether or not to generalize (significance level) Results: 83.3% of relations correct according to human evaluation 53.1% correctly generalized
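
A minimal sketch of how a χ² test can drive the generalization decision: compare how often two sibling classes occur as an argument of a relation, and generalize to their parent only if the difference is not significant. This illustrates the general idea and is not necessarily the exact procedure of the cited paper. The counts are invented, the function names are assumptions, and the sketch assumes every row and column total is non-zero.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_DF1_P05 = 3.841

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def should_generalize(table, critical=CRITICAL_DF1_P05):
    """Generalize to the parent class if the sibling classes do not differ significantly."""
    return chi_square(table) < critical

# Rows: two sibling classes; columns: [occurrences with the verb, occurrences without].
similar = [[30, 70], [28, 72]]
different = [[30, 70], [5, 95]]
print(should_generalize(similar), should_generalize(different))
```

With these made-up counts the first pair of siblings behaves alike, so the relation would be lifted to the parent, while the second pair differs significantly and would stay at the more specific level.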


Relations - Evaluation : Relations - Evaluation Gold Standard e.g. [Cimiano et al. 06], [Schutz and Buitelaar 05], [Mädche and Staab 00] Human Evaluation A posteriori (e.g. [Schutz and Buitelaar 05]) Task-based evaluation QA, e.g. `Who killed JFK?´ maps to KILL (X:person, Y:person) -> answer type is person


Relations – Tools : Relations – Tools


Slide110 : Ontology Learning Layer Cake Terms (Multilingual) Synonyms Concept Formation Concept Hierarchy Relations Axiom Schemata General Axioms Relation Hierarchy


Axiom Schemata & General Axioms : Axiom Schemata & General Axioms DIRT (Discovery of Inference Rules from Text [Lin et al. 01]) calculate significant collocations on dependency paths Examples: „X solves Y“: Y is solved by X, X resolves Y, X finds a solution to Y, X tries to solve Y, Y deals with X, Y is resolved by X, X addresses Y, X seeks a solution to Y, X do something about Y, ... AEON [Völker et al. 05]: Rigidity, Identity, Unity, Dependence [Haase and Völker 05] Disjointness Axioms on the basis of coordination: e.g. disjoint(man,woman)


Axioms & Rules - Evaluation : Axioms & Rules - Evaluation Gold Standard Human-defined axioms ([Völker et al. 05]) Human Evaluation A posteriori Task-based evaluation Consistency of Ontologies


Tools - Axioms : Tools - Axioms


Part IV : Part IV WrapUp


Overview : Overview What have we learned in the tutorial? Role of Ontologies in NLP and Vice Versa Definition of Tasks in Ontology Learning Where are we today? Variety of (incomparable) Methods Orientation Towards Comparison, Evaluation and Integration Where are we heading? Combinations of Methods Integration of (Combinations of ) Methods into Ontology Life-Cycle Formal Criteria for evaluation


What have we learned? : What have we learned? Ontologies and NLP: a crucial symbiosis Top-Down: Ontologies provide domain knowledge that can be employed in disambiguation, interpretation, reasoning, etc. Bottom-Up: NLP provides methods for data-driven ontology development Variety of tasks and techniques OL reuses techniques from NLP and ML Evaluation Lots of different types of evaluation (gold standard, human, other) Results often incomparable (datasets, measures) Task-based evaluation is important


Where are we today? : Where are we today? A lot of methods, little combination, quite spurious results Similarity-based techniques lead to spurious results Pattern-based approaches lead to low recall Currently only initial blueprints for combination Applications No real application of automatically learned ontologies OL in Ontology Engineering How can ontology learning techniques be integrated into the process of ontology engineering? How can users be involved in semi-automatic quality assurance of OL (results)?


Where are we heading? : Where are we heading? Large scale data sets Web-based methods to reduce data sparseness Combination of methods Improve quality of results by compensating for drawbacks of different methods Comparison of methods Need for shared tasks, gold standards, and evaluation measures to move the field forward Applications Demonstrate benefit of automatically learned ontologies


Thanks for your attention! : Thanks for your attention! Any questions?


References : References [Abecker et al. 1997] - A. Abecker, S. Decker, K. Hinkelmann, U. Reimer. In: Proceedings of the International Workshop on Knowledge-Based Systems for Knowledge Management in Enterprises at the German AI Conference (KI-97), 1997. [Agichtein and Gravano, 2000] - E. Agichtein, L. Gravano. Snowball: Extracting Relations from Large Plain-Text Collections. In: Proceedings of the 5th ACM International Conference on Digital Libraries (ACM DL), pp. 85-94, 2000. [Agirre and Rigau 1996] - E. Agirre, G. Rigau. Word sense disambiguation using conceptual density. In: Proceedings of the International Conference on Computational Linguistics (COLING’96), pp. 16-22, 1996. [Ahmad et al. 2003] - K. Ahmad, M. Tariq, B. Vrusias, C. Handy. Corpus-Based Thesaurus Construction for Image Retrieval in Specialist Domains. In: Proceedings of the 25th European Conference on Advances in Information Retrieval (ECIR), pp. 502-510, 2003. [Alani et al. 2003] - H. Alani, S. Kim, D.E. Millard, M.J. Weal, W. Hall, P.H. Lewis, N. R. Shadbolt. Automatic Ontology-Based Knowledge Extraction from Web Documents. IEEE Intelligent Systems, 18(1), pp. 14-21, 2003.


References : References [Alfonseca and Manandhar, 2002] - E. Alfonseca, S. Manandhar. Extending a Lexical Ontology by a Combination of Distributional Semantics Signatures. In: Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2002), pp. 1-7, 2002. [Baldewein et al. 2004] – U. Baldewein, K. Erk, S. Pado, D. Prescher. Semantic Role Labelling with Similarity-Based Generalization Using EM-based Clustering. In Proceedings of Senseval, 2004. [Beale et al.1995] - S. Beale, S. Nirenburg, K. Mahesh. Semantic Analysis in the Mikrokosmos Machine Translation Project. In: Proceedings of the 2nd Symposium on Natural Language Processing, pp. 297-307, 1995. [Bloehdorn et al. 2005] – S. Bloehdorn, P. Cimiano, A. Hotho, Learning Ontologies to Improve Text Clustering and Classification, In From Data and Information Analysis to Knowledge Engineering: Proceedings of the 29th Annual Conference of the German Classification Society (GfKl), 2005. [Bisson et al. 2000] - G. Bisson, C. Nedellec, L. Canamero. Designing clustering methods for ontology building - The Mo’K workbench. In: Proceedings of the ECAI Ontology Learning Workshop, pp. 13-19, 2000.


References : References [Buitelaar, Sintek 2004] – P. Buitelaar, M. Sintek. OntoLT Version 1.0: Middleware for Ontology Extraction from Text. In: Proceedings of the Demo Session at the International Semantic Web Conference (ISWC), 2004. [Buitelaar et al. 2004b] – P. Buitelaar, D. Olejnik, M. Hutanu, A. Schutz, T. Declerck, M. Sintek. Towards Ontology Engineering Based on Linguistic Analysis. In: Proceedings of LREC, 2004. [Buitelaar et al. 2004c] - P. Buitelaar, D. Olejnik, M. Sintek. A Protégé Plug-In for Ontology Extraction from Text Based on Linguistic Analysis. In: Proceedings of the 1st European Semantic Web Symposium (ESWS), 2004. [Buitelaar et al. 2005] – P. Buitelaar, M. Sintek and M. Kiesel. Feature Representation for Cross-Lingual Cross-Media Semantic Web Applications. In: Proc. of the ISWC Workshop on Knowledge Markup and Semantic Annotation (SemAnnot2005), 2005. [Buitelaar et al., 2006a] – P. Buitelaar, T. Eigner, G. Gulrajani, A. Schutz, M. Siegel, N. Weber, P. Cimiano, G. Ladwig, M. Mantel and H. Zhu, Generating and Visualizing a Soccer Knowledge Base, Demo Proceedings of EACL, 2006. [Buitelaar et al., 2006b] – P. Buitelaar, P. Cimiano, S. Racioppa, M. Siegel, Ontology-based information extraction with SOBA, Proceedings of LREC 2006, to appear.


References : References [Caraballo 1999] – S.A. Caraballo. Automatic construction of a hypernym-labeled noun hierarchy from text. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pp. 120-126, 1999. [Carreras and Márquez 2004] – X. Carreras, L. Márquez. Introduction to the CoNLL-2004 Shared Task: Semantic Role Labelling. In: Proceedings of CoNLL, 2004. [Cederberg and Widdows 2003] – S. Cederberg, D. Widdows. Using LSA and Noun Coordination Information to Improve the Precision and Recall of Automatic Hyponymy Extraction. In: Proceedings of the Conference on Natural Language Learning (CoNLL), 2003. [Charniak, Berland 1999] - E. Charniak, M. Berland. Finding parts in very large corpora. In: Proceedings of the 37th Annual Meeting of the ACL, pp. 57-64, 1999. [Ciaramita et al. 2005] - M. Ciaramita, A. Gangemi, E. Ratsch, J. Saric, I. Rojas. Unsupervised Learning of Semantic Relations between Concepts of a Molecular Biology Ontology. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005.


References : References [Ciaramita et al. 2003] - M. Ciaramita, T. Hofmann, M. Johnson. Hierarchical Semantic Classification: Word Sense Disambiguation with World Knowledge. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI), 2003. [Cimiano et al. 2004] - P. Cimiano, S. Handschuh, S. Staab. Towards the Self-Annotating Web. In: Proceedings of the 13th World Wide Web Conference, pp. 462-471, 2004. [Cimiano et al. 2004b] – P. Cimiano, A. Hotho, S. Staab. Comparing Conceptual, Partitional and Agglomerative Clustering for Learning Taxonomies from Text. In: Proceedings of the European Conference on Artificial Intelligence (ECAI’04), pp. 435-439. IOS Press, 2004. [Cimiano and Staab 2004] - P. Cimiano, S. Staab. Learning by Googling, SIGKDD Explorations, 6(2), 2004. [Cimiano et al. 2005] - P. Cimiano, G. Ladwig, S. Staab. Gimme, The Context: Context-driven automatic semantic annotation with C-PANKOW. In: Proceedings of the 14th World Wide Web Conference, 2005. [Cimiano et al. 2005b] - P. Cimiano, L. Schmidt-Thieme, A. Pivk, S. Staab, Learning Taxonomic Relations from Heterogeneous Evidence, Ontology Learning from Text: Methods, Applications and Evaluation, IOS Press, pp. 59-73, 2005.


References : References [Cimiano et al. 2005c] – P. Cimiano and S. Staab, Learning Concept Hierarchies from Text with a Guided Agglomerative Clustering Algorithm. In: Proceedings of the ICML 2005 Workshop on Learning and Extending Lexical Ontologies with Machine Learning Methods. 2005. [Cimiano and Wenderoth 2005] - P. Cimiano, J. Wenderoth, Automatically Learning Qualia Structures from the Web. In: Proceedings of the ACL Workshop on Deep Lexical Acquisition, pp. 28-37, 2005. [Cimiano and Hartung 2005] - P. Cimiano, M. Hartung, Automatically Learning Qualia Structures from the Web. In: Proceedings of the International Lexical Resources and Evaluation Conference (LREC), 2006, to appear. [Clark and Weir 2002] - S. Clark, D.J. Weir. Class-Based Probability Estimation Using a Semantic Hierarchy. Computational Linguistics, 28(2), pp. 187-206, 2002. [Cleuziou et al. 2004] - G. Cleuziou, L. Martin, C. Vrain. PoBOC: An Overlapping Clustering Algorithm, Application to Rule-Based Classification and Textual Data. In: Proceedings of the European Conference on Artificial Intelligence (ECAI), pp. 440-444, 2004. [Copestake et al. 1992] - Copestake, A., B. Jones, A. Sanfilippo, H. Rodriguez, P. Vossen, S. Montemagni, E. Marinai. Multilingual Lexical Representation. ESPRIT BRA-3030 ACQUILEX - WP No. 043, 1992.


References : References [Cucchiarelli and Velardi 1998] – A. Cucchiarelli and P. Velardi 1998. Finding a domain-appropriate sense inventory for semantically tagging a corpus. Natural Language Engineering, 4(4):325–344. [Ding et al. 2004] – L. Ding, T. Finin, A. Joshi and R. Pan, R.S. Cost, Y. Peng, P. Reddivari, V. Doshi, J. Sachs. Swoogle: A search and metadata engine for the semantic web. In: Proceedings 13th ACM Conference on Information and Knowledge Management, pp. 652–659, 2004. [Dorow and Widdows 2003] – B. Dorow, D. Widdows. Discovering Corpus-Specific Word Senses. In: Proceedings of EACL, pp. 79-82, 2003. [Downey et al. 2004] - D. Downey, O. Etzioni, S. Soderland, D. Weld. Learning Text Patterns for Web Information Extraction and Assessment. In: Proceedings of the AAAI Workshop on Adaptive Text Extraction and Mining, 2004. [Etzioni et al. 2004] - O. Etzioni, M. Cafarella, D. Downey, S. Kok, A.-M. Popescu, T. Shaked, S. Soderland, D.S. Weld, A. Yates, Web-Scale Information Extraction in KnowItAll (Preliminary Results), In: Proceedings of the 13th World Wide Web Conference, pp. 100-109, 2004. [Etzioni et al. 2005] - O. Etzioni, M. Cafarella, D. Downey, A-M. Popescu, T. Shaked, S. Soderland, D.S. Weld, A. Yates, Unsupervised Named-Entity Extraction from the Web: An Experimental Study. Artificial Intelligence, 165(1), pp. 91-134, 2005. [Faure and Nedellec, 1998] – D. Faure, C. Nedellec. A corpus-based conceptual clustering method for verb frames and ontology acquisition. In: Proceedings of LREC Workshop on Adapting Lexical and Corpus Resources to Sublanguages and Applications, 1998.


References : References [Fensel 2001] - D. Fensel, Ontologies: Silver bullet for knowledge management and electronic commerce, Springer, 2001. [Fillmore 1968] - C.J. Fillmore. The Case for Case. In: Bach, E., and Harms, R. (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart, and Winston, 1968. [Firth 1957] - J. Firth, A synopsis of linguistic theory 1930-1955, Longman, Studies in Linguistic Analysis, Philological Society, 1957. [Frantzi and Ananiadou, 1999] – K.T. Frantzi, S. Ananiadou. The C-Value/NC-Value domain independent method for multi-word term extraction. Journal of Natural Language Processing, 6(3):145-179, 1999. [Ganter and Wille 1999] – B. Ganter, R. Wille. Formal Concept Analysis – Mathematical Foundations, Springer Verlag, 1999. [Gasperin et al. 2001] - C. Gasperin, P. Gamallo, A. Agustini, G. Lopes and V. de Lima, Using Syntactic Contexts for Measuring Word Similarity. In: Proceedings of the ESSLLI Workshop on Semantic Knowledge Acquisition and Categorization, 2001. [Gaizauskas et al. 1995] - R. Gaizauskas, T. Wakao, K. Humphreys, H. Cunningham, Y. Wilks. Description of the LaSIE system as used for MUC-6. In: Proceedings of the Sixth Message Understanding Conference (MUC-6). Morgan Kaufmann, California, 1995. [Gildea and Jurafsky 2002] - D. Gildea, D. Jurafsky. Automatic Labeling of Semantic Roles. Computational Linguistics, 2002.


References : References [Girju et al. 2002] - R. Girju, D. Moldovan, Text Mining for Causal Relations, In: Proceedings of the FLAIRS Conference, pp. 360-364, 2002. [Gluschko et al. 1999] - R. J. Gluschko, J. M. Tenenbaum and B. Meltzer. An XML Framework for Agent-based E-Commerce. In: Communications of the ACM 42(3):106-114, 1999. [Gonzalo et al. 1998] - J. Gonzalo, F. Verdejo, I. Chugur, J. Cigarran, Indexing with WordNet synsets can improve Text Retrieval, In: Proceedings of the COLING/ACL '98 Workshop on Usage of WordNet for NLP, pp. 38-44, 1998. [Grefenstette, 1992] - G. Grefenstette. Sextant: Exploring unexplored contexts for semantic extraction from syntactic analysis. In: Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, Newark, Delaware, 28 June - 2 July 1992. [Grefenstette 1992] – G. Grefenstette. Evaluation techniques for automatic semantic extraction: Comparing syntactic and window-based approaches. In: Proceedings of the Workshop on Acquisition of Lexical Knowledge from Text, 1992. [Grefenstette 1994] – G. Grefenstette. Explorations in Automatic Thesaurus Discovery, Kluwer Academic Publishers, 1994. [Grefenstette 1998] – G. Grefenstette. Cross-Language Information Retrieval, Kluwer Academic Publishing, 1998. [Gruber 1993] - T.R. Gruber, Toward Principles for the Design of Ontologies Used for Knowledge Sharing, Formal Ontology in Conceptual Analysis and Knowledge Representation, Kluwer, 1993.


References : References [Guarino et al. 1999] - N. Guarino, C. Masolo, G. Vetere. OntoSeek: Content-Based Access to the Web. In: IEEE Intelligent Systems, 14(3), 70--80, 1999. [Haase and Völker, 2005] - P. Haase, J. Völker, Ontology Learning and Reasoning -- Dealing with Uncertainty and Inconsistency. In: Proceedings of the Workshop on Uncertainty Reasoning for the Semantic Web (URSW), 2005. [Hearst 1992] - M.A. Hearst, Automatic Acquisition of Hyponyms from Large Text Corpora. In: Proceedings of the 14th International Conference on Computational Linguistics, pp. 539-545, 1992. [Hearst and Schütze 1993] – M.A. Hearst, H. Schütze. Customizing a lexicon to better suit a computational task. In: Proceedings of the ACL SIGLEX Workshop on Acquisition of Lexical Knowledge from Text, 1993. [Hendler 2000] - J. Heflin, J. Hendler. Searching the Web with SHOE, In: Papers from the AAAI Workshop on Artificial Intelligence for Web Search, pp. 35-40, 2000. [Hovy and Nirenburg 1992] – E. Hovy, S. Nirenburg. Approximating an interlingua in a principled way. In: Proceedings of the Workshop on Speech and Natural Language, 1992. [Iwanska et al., 2000] - L.M. Iwanska, N. Mata, K. Kruger. Fully Automatic Acquisition of Taxonomic Knowledge from Large Corpora of Texts. Natural Language Processing and Knowledge Processing, 335--345, MIT/AAAI Press, 2000. [Kashyap 1999] - V. Kashyap. Design and Creation of Ontologies for Environmental Information Retrieval. Proceedings of the 11th European Workshop on Knowledge Acquisition, Modeling, and Management (EKAW), 1999.


References : References [Kavalec and Svatek, 2005] – M. Kavalec, V. Svatek. A Study on Automated Relation Labelling in Ontology Learning. In: P. Buitelaar, P. Cimiano, B. Magnini (eds.), Ontology Learning and Population from Text: Methods, Evaluation and Applications, IOS Press, 2005. [Kesseler 1996] - M. Kesseler. A Schema Based Approach to HTML Authoring. In: World Wide Web Journal 96(1), O’Reilly, 1996. [Knight 1993] – K. Knight. Building a Large Ontology for Machine Translation. In: Proceedings of the DARPA Human Language Conference, 1993. [Lee 1999] – L. Lee. Measures of Distributional Similarity. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pp. 25-32, 1999. [Lee et al. 2004] – C.S. Lee, S. M. Guo and Z. W. Jian, Weighted Fuzzy Ontology for Chinese e-News Summarization. In: Proceedings of the IEEE International Conference on Fuzzy Systems, 2004. [Liddy, 1994] – E.D. Liddy, W. Pail, E.S. Yu, M. McKenna. Document Retrieval Using Linguistic Knowledge. In: Proceedings of RIAO 94, pp. 106-114, 1994. [Lin and Pantel 2001] - D. Lin, P. Pantel, DIRT - Discovery of Inference Rules from Text. In: Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 323--328, 2001. [Lin and Pantel 2001] - D. Lin, P. Pantel, Discovery of Inference rules for Question Answering. Natural Language Engineering, 7(4), pp. 343-360, 2001.


References : References [Lenci et al. 2002] - A. Lenci, A. Agua, R. Bartolini, S. Busemann, N. Calzolari, E. Cartier, K. Chevreau, and J. Coch. Multilingual summarization by integrating linguistic resources in the MLIS-MUSI project. In Proceedings of LREC, 2002. [Lopez and Motta 2004] – V. Lopez, E.Motta. Ontology-Driven