Word Sense Disambiguation and Information Retrieval
  By Guitao Gao, Qing Ma
  Prof: Jian-Yun Nie
  Added: December 12, 2007

Outline
  Introduction
  WSD Approaches
  Conclusion

Introduction
  Task of Information Retrieval
  Content Representation
  Indexing
  Bag of words indexing
  Problems:
    Synonymy: query expansion
    Polysemy: Word Sense Disambiguation

WSD Approaches
  Disambiguation based on manually created rules
  Disambiguation using machine readable dictionaries
  Disambiguation using thesauri
  Disambiguation based on unsupervised machine learning with corpora

Disambiguation based on manually created rules
  Weiss’ approach [Lesk 1988]: set of rules to disambiguate five words
    Context rule: within 5 words
    Template rule: specific location
    Accuracy: 90%
    IR improvement: 1%
  Small & Rieger’s approach [Small 1982]: expert system

Disambiguation using machine readable dictionaries
  Lesk’s approach [Lesk 1988]:
    Senses are represented by different definitions
    Look up the definitions of the context words
    Find co-occurring words
    Select the most similar sense
    Accuracy: 50% - 70%.
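The definition-overlap procedure listed for Lesk’s approach can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the `DICTIONARY` entries, their glosses, the sense labels, and the stopword list are all hypothetical toy data made up for this example.

```python
import re

# Hypothetical toy dictionary: word -> {sense label: definition (gloss)}.
# A real system would load these from a machine readable dictionary.
DICTIONARY = {
    "bank": {
        "finance": "an institution that accepts deposits of money and makes loans",
        "river": "sloping land beside a body of water such as a river",
    },
    "deposit": {"sum": "a sum of money placed in an account at an institution"},
    "money": {"currency": "coins and notes accepted as payment"},
    "river": {"stream": "a large natural stream of water flowing over land"},
}

# Assumed stopword list, only large enough for the toy glosses above.
STOPWORDS = {"a", "an", "and", "as", "at", "in", "of", "such", "that", "the"}


def tokens(text):
    """Return the set of lowercase words in `text`, without stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def lesk(word, context, dictionary=DICTIONARY):
    """Pick the sense of `word` whose definition shares the most words
    with the definitions of the other words in `context`."""
    context_words = tokens(context) - {word}
    # Look up the definitions of the context words and pool their words.
    context_gloss_words = set()
    for w in context_words:
        for gloss in dictionary.get(w, {}).values():
            context_gloss_words |= tokens(gloss)
    # Count co-occurring words between each sense definition and that pool.
    overlaps = {
        sense: len(tokens(gloss) & context_gloss_words)
        for sense, gloss in dictionary[word].items()
    }
    best = max(overlaps, key=overlaps.get)  # ties resolve to the first sense
    return best, overlaps


if __name__ == "__main__":
    print(lesk("bank", "he made a deposit of money at the bank"))
    print(lesk("bank", "they fished from the bank of the river"))
```

With glosses this short, many sense/context pairs share no words at all, in which case every sense scores zero and the choice is arbitrary.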
    Problem: not enough overlapping words between definitions

Disambiguation using machine readable dictionaries
  Wilks’ approach [Wilks 1990]:
    Attempt to solve Lesk’s problem
    Expand the dictionary definitions
    Use the Longman Dictionary of Contemporary English (LDOCE)
    More word co-occurrence evidence collected
    Accuracy: between 53% and 85%.

Wilks’ approach [Wilks 1990]
  Commonly co-occurring words in LDOCE [Wilks 1990]

Disambiguation using machine readable dictionaries
  Luk’s approach [Luk 1995]:
    Statistical sense disambiguation
    Use definitions from LDOCE
    Co-occurrence data collected from the Brown corpus
    Defining concepts: 1792 words used to write the definitions of LDOCE
    LDOCE pre-processed: conceptual expansion

Luk’s approach [Luk 1995]
  Noun “sentence” and its conceptual expansion [Luk 1995]

Luk’s approach [Luk 1995] cont.
  Collect co-occurrence data of defining concepts by constructing a two-dimensional Concept Co-occurrence Data Table (CCDT)
    Brown corpus divided into sentences
    Collect conceptual co-occurrence data for each defining concept which occurs in the sentence
    Insert the collected data in the Concept Co-occurrence Data Table.

Luk’s approach [Luk 1995] cont.
  Score each sense S with respect to context C [Luk 1995]

Luk’s approach [Luk 1995] cont.
  Select the sense with the highest score
  Accuracy: 77%
  Human accuracy: 71%

Approaches using Roget's Thesaurus [Yarowsky 1992]
  Resources used:
    Roget's Thesaurus
    Grolier Multimedia Encyclopedia
  Senses of a word: categories in Roget's Thesaurus
  1042 broad categories covering areas like tools/machinery or animals/insects

Approaches using Roget's Thesaurus [Yarowsky 1992] cont.
  tool, implement, appliance, contraption, apparatus, utensil, device, gadget, craft, machine, engine, motor, dynamo, generator, mill, lathe, equipment, gear, tackle, tackling, rigging, harness, trappings, fittings, accoutrements, paraphernalia, equipage, outfit, appointments, furniture, material, plant, appurtenances, a wheel, jack, clockwork, wheel-work, spring, screw,
  Some words placed into the tools/machinery category [Yarowsky 1992]

Approaches using Roget's Thesaurus [Yarowsky 1992] cont.
  Collect context for each category:
    From the Grolier Encyclopedia
    For each occurrence of each member of the category, extract 100 surrounding words
  Sample occurrence of words in the tools/machinery category [Yarowsky 1992]

Approaches using Roget's Thesaurus [Yarowsky 1992] cont.
  Identify and weight salient words:
    Sample salient words for Roget categories 348 and 414 [Yarowsky 1992]
  To disambiguate a word: sum up the weights of all salient words appearing in context
  Accuracy: 92% disambiguating 12 words

Introduction to WordNet (1)
  Online thesaurus system
  Synsets: synonymous words
  Hierarchical relationship

Introduction to WordNet (2)
  [Sanderson 2000]

Voorhees’ Disambiguation Experiment
  Calculation of semantic distance: synset and context words
  Word’s sense: synset closest to the context words
  Retrieval result: worse than without disambiguation

Gonzalo’s IR experiment (1)
  Two questions
    Can WordNet really offer any potential for text retrieval?
    How is text retrieval performance affected by the disambiguation errors?

Gonzalo’s IR experiment (2)
  Text collection: summary and document
  Experiments
    1. Standard SMART run
    2. Indexed in terms of word sense
    3. Indexed in terms of synset
    4. Introduction of disambiguation error

Gonzalo’s IR experiment (3)
  Experiment                          % correct document retrieved
  Indexing by synsets                 62.0
  Indexing by word senses             53.2
  Indexing by words                   48.0
  Indexing by synsets (5% errors)     62.0
  Id. with 10% errors                 60.8
  Id. with 20% errors                 56.1
  Id. with 30% errors                 54.4
  Id. with all possible               52.6
  Id. with 60% errors                 49.1

Gonzalo’s IR experiment (4)
  Disambiguation with WordNet can improve text retrieval
  Solution lies in a reliable automatic WSD technique

Disambiguation With Unsupervised Learning
  Yarowsky’s Unsupervised Method
    One Sense Per Collocation, e.g. plant (manufacturing/life)
    One Sense Per Discourse, e.g. defense (war/sports)

Yarowsky’s Unsupervised Method cont.
  Algorithm Details
    Step 1: Store the word and its contexts as lines, e.g. ....zonal distribution of plant life.....
    Step 2: Identify a few words that represent each word sense, e.g. plant (manufacturing/life)
    Step 3a: Get rules from the training set
      plant + X => A, weight
      plant + Y => B, weight
    Step 3b: Use the rules created in 3a to classify all occurrences of plant in the sample set.

Yarowsky’s Unsupervised Method cont.
    Step 3c: Use the one-sense-per-discourse rule to filter or augment this addition
    Step 3d: Repeat Steps 3a-b-c iteratively.
    Step 4: The training converges on a stable residual set.
    Step 5: The result will be a set of rules. Those rules will be used to disambiguate the word “plant”, e.g.
      plant + growth => life
      plant + car => manufacturing

Yarowsky’s Unsupervised Method cont.
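Steps 1 to 5 of the algorithm can be sketched as a small bootstrapping loop. This is a simplified illustration under several assumptions, not Yarowsky’s implementation: the toy contexts are invented, the rule weight is taken to be a smoothed log-likelihood ratio (the slides only say "weight"), the `threshold` and `smoothing` values are arbitrary, and the one-sense-per-discourse filter of Step 3c is left out.

```python
import math
from collections import Counter, defaultdict

SENSES = ("life", "manufacturing")


def learn_rules(contexts, labels, smoothing=0.1):
    """Step 3a: build rules (weight, collocate, sense) from labeled contexts.
    The weight is a smoothed log-likelihood ratio (an assumption of this sketch)."""
    counts = defaultdict(Counter)  # collocate -> how often it occurs with each sense
    for words, sense in zip(contexts, labels):
        if sense is not None:
            for w in set(words):
                counts[w][sense] += 1
    rules = []
    for w, c in counts.items():
        a = c[SENSES[0]] + smoothing
        b = c[SENSES[1]] + smoothing
        rules.append((abs(math.log(a / b)), w, SENSES[0] if a > b else SENSES[1]))
    rules.sort(reverse=True)  # strongest evidence first
    return rules


def classify(words, rules, threshold):
    """Step 3b: apply the strongest matching rule whose weight passes the threshold."""
    for weight, w, sense in rules:
        if weight < threshold:
            break
        if w in words:
            return sense
    return None


def bootstrap(contexts, seeds, threshold=1.0, max_iters=10):
    """Steps 2 to 5: start from seed collocations, then alternate rule
    learning and classification until the labeling stops changing."""
    labels = [next((s for w, s in seeds.items() if w in words), None)
              for words in contexts]
    rules = []
    for _ in range(max_iters):
        rules = learn_rules(contexts, labels)
        new_labels = [classify(words, rules, threshold) or old
                      for words, old in zip(contexts, labels)]
        if new_labels == labels:  # Step 4: converged
            break
        labels = new_labels
    return rules, labels


# Step 1: hypothetical contexts of "plant" (target word and stopwords removed).
CONTEXTS = [
    ["zonal", "distribution", "life"],
    ["life", "growth", "species"],
    ["manufacturing", "car", "workers"],
    ["manufacturing", "car", "assembly"],
    ["growth", "species", "animal"],
    ["car", "assembly", "equipment"],
    ["animal", "soil"],
    ["equipment", "union"],
]
SEEDS = {"life": "life", "manufacturing": "manufacturing"}  # Step 2

if __name__ == "__main__":
    rules, labels = bootstrap(CONTEXTS, SEEDS)
    print(labels)
    for weight, collocate, sense in rules[:5]:
        print(f"plant + {collocate} => {sense} ({weight:.2f})")
```

On this toy data the last two contexts contain no seed word and no collocate of a seeded context; they are only labeled in the second pass, through rules learned from contexts that were themselves labeled in the first pass.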
  Advantages of this method:
    Better accuracy compared to other unsupervised methods
    No need for costly hand-tagged training sets (supervised method)

Schütze and Pedersen’s approach [Schütze 1995]
  Source of word sense definitions
    Not using a dictionary or thesaurus
    Using only the corpus to be disambiguated (Category B TREC-1 collection)
  Thesaurus construction
    Collect a (symmetric) term-term matrix C
    Entry c_ij: number of times that words i and j co-occur in a symmetric window of total size k
    Use SVD to reduce the dimensionality

Schütze and Pedersen’s approach [Schütze 1995] cont.
  Thesaurus vector: columns
  Semantic similarity: cosine between columns
  Thesaurus: associate each word with its nearest neighbors
  Context vector: sum of the thesaurus vectors of the context words

Schütze and Pedersen’s approach [Schütze 1995] cont.
  Disambiguation algorithm
    Identify context vectors corresponding to all occurrences of a particular word
    Partition them into regions of high density
    Tag a sense for each such region
  Disambiguating a word:
    Compute the context vector of its occurrence
    Find the closest centroid of a region
    Assign the occurrence the sense of that centroid

Schütze and Pedersen’s approach [Schütze 1995] cont.
  Accuracy: 90%
  Application to IR
    Replacing the words by word senses: sense-based retrieval’s average precision for 11 points of recall increased 4% with respect to word-based retrieval.
    Combining the ranking for each document: average precision increased 11%
    Each occurrence assigned n (2, 3, 4, 5) senses: average precision increased 14% for n=3

Conclusion
  How much can WSD help improve IR effectiveness?
    Open question
      Weiss: 1%; Voorhees’ method: negative
      Krovetz and Croft, Sanderson: only useful for short queries
      Schütze and Pedersen’s approaches and Gonzalo’s experiment: positive result
  WSD must be accurate to be useful for IR
  Schütze and Pedersen’s, Yarowsky’s algorithm: promising for IR
  Luk’s approach: robust to sparse data, suitable for a small corpus.

References
  [Krovetz 92] R. Krovetz & W.B. Croft (1992). “Lexical Ambiguity and Information Retrieval”, in ACM Transactions on Information Systems, 10(1).
  [Gonzalo 1998] J. Gonzalo, F. Verdejo, I. Chugur and J. Cigarran, “Indexing with WordNet synsets can improve Text Retrieval”, Proceedings of the COLING/ACL ’98 Workshop on Usage of WordNet for NLP, Montreal, 1998.
  [Lesk 1988] M. Lesk, “They said true things, but called them by wrong names” – vocabulary problems in retrieval systems, in Proc. 4th Annual Conference of the University of Waterloo Centre for the New OED, 1988.
  [Luk 1995] A.K. Luk, “Statistical sense disambiguation with relatively small corpora using dictionary definitions”, in Proceedings of the 33rd Annual Meeting of the ACL, Columbus, Ohio, June 1995. Association for Computational Linguistics.
  [Salton 83] G. Salton & M.J. McGill (1983). Introduction to Modern Information Retrieval. The SMART and SIRE experimental retrieval systems. New York: McGraw-Hill.
  [Sanderson 1997] M. Sanderson, Word Sense Disambiguation and Information Retrieval, PhD Thesis, Technical Report (TR-1997-7) of the Department of Computing Science at the University of Glasgow, Glasgow G12 8QQ, UK.
  [Sanderson 2000] M. Sanderson, “Retrieving with Good Sense”, http://citeseer.nj.nec.com/sanderson00retrieving.html, 2000.
  [Schütze 1995] H. Schütze & J.O. Pedersen, “Information retrieval based on word senses”, in Proceedings of the Symposium on Document Analysis and Information Retrieval, 4: 161-175.
  [Small 1982] S. Small & C. Rieger, “Parsing and comprehending with word experts (a theory and its realisation)”, in Strategies for Natural Language Processing, W.G. Lehnert & M.H. Ringle, Eds., LEA: 89-148, 1982.
  [Voorhees 1993] E.M. Voorhees, “Using WordNet™ to disambiguate word sense for text retrieval”, in Proceedings of ACM SIGIR Conference, (16): 171-180, 1993.
  [Weiss 73] S.F. Weiss (1973). “Learning to disambiguate”, in Information Storage and Retrieval, 9: 33-41.
  [Wilks 1990] Y. Wilks, D. Fass, C. Guo, J.E. McDonald, T. Plate, B.M. Slator (1990). “Providing Machine Tractable Dictionary Tools”, in Machine Translation, 5: 99-154.
  [Yarowsky 1992] D. Yarowsky, “Word sense disambiguation using statistical models of Roget’s categories trained on large corpora”, in Proceedings of COLING Conference: 454-460, 1992.
  [Yarowsky 1994] D. Yarowsky, “Decision lists for lexical ambiguity resolution: Application to Accent Restoration in Spanish and French”, in Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, NM, 1994.
  [Yarowsky 1995] D. Yarowsky, “Unsupervised word sense disambiguation rivaling supervised methods”, in Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, pages 189-196, Cambridge, MA, 1995.