APAN multilingual exchange 2005 Dennison
Added: December 05, 2007

Presentation Transcript

Multilingual Information Exchange:
APAN, Bangkok, 27 January 2005
Margherita.sini@fao.org

The general problem:
Searching for multilingual resources is not easy:
- on the web
- in metadata catalogues / bibliographical databases
- in full-text documents
Results are generally in the language used in the search query.
=> We need a multilingual approach and multilingual tools (thesauri, ontologies, etc.)

What we can achieve (1): Multilingual concept resolution:
With a multilingual thesaurus or ontology we can find resources in any language, because we can achieve ... multilingual concept resolution!

What we can achieve (2): Brokering:
With a multilingual thesaurus or ontology we can find resources from several sources even if we do not know the terminology and the language used in those sources.
[Terms from the slide diagram: vessels, crafts, fishing vessels, ships, navio, navire, 船舶, bateau de pêche, fishing boat, fishing vessel]
Results in multiple languages from multiple databases.

How to build a multilingual Thesaurus / Ontology:
Lexicalizations of concepts in multiple languages:
  { … fishing boat; bateau de pêche; 捕捞渔船 … }
For every language we can have synonyms:
  { … fishing vessel, fishing boat, fishing craft … }
  { … bateau de pêche, navire de pêche, … }
  { … 捕捞渔船, … }

FAO activities (ongoing):
- Food safety ontology (English, Spanish, French)
- Fishery ontology (English, Chinese)
- Food and Nutrition ontology-based portal (English, Spanish, French)
- Extensive work with AGROVOC
- RDFS / OWL version
- Semantic refinements
- Expand multilingual coverage
- Expand subject coverage

The multilingual vocabulary...:
- Must cover all concepts of interest to the users in the various languages, ... at a minimum all domain concepts lexicalized in any of the participating languages
- Must accommodate hierarchical structures suggested by different languages
(Dr. Soergel)

Problems (1):
- Translation of an English thesaurus into German does not make a German thesaurus
  => whenever possible we need to consider the concept in its entirety (many languages, definitions, “surrounding context”, etc.)
- Equivalence of terms holds only in some contexts
- Non-specialized terms are more difficult to translate
(Dr. Soergel)

Problems (2):
Two terms mean almost the same thing but differ slightly in meaning or connotation:
- English: alcoholism / French: alcoolisme
- English: vegetable (includes potatoes) / German: Gemüse (does not include potatoes)
If the difference is big enough, one needs to introduce two separate concepts under a broader term; otherwise a scope note needs to clearly instruct indexers in all languages how the term is to be used, so that the indexing stays, as far as possible, free from cultural bias or reflects multiple biases by assigning several descriptors.
(Dr. Soergel)

Available resources: example:
- SuperThes, ...
- SWAD-Europe initiative: thesaurus activities
- RDF encoding of multilingual thesaurus
- Multilingual labelling approach (mirroring relations for every language)
- Interlingual mapping approach (different structures to be mapped)

SWAD-Europe: Inter-Thesaurus Mapping:
SKOS mapping: Exact, Inexact, Major, Minor, Partial, Broad, Narrow, AND, OR, NOT

Inter-Thesaurus Mapping: example:
  <ag:Concept>
    <descriptor xml:lang="fr">Academie</descriptor>
    <map:exactMatch>
      <map:AND>
        <map:memberList rdf:parseType="Collection">
          <aat:Concept>
            <descriptor xml:lang="en">Academy</descriptor>
          </aat:Concept>
          <aat:Concept>
            <descriptor xml:lang="en">Buildings</descriptor>
          </aat:Concept>
        </map:memberList>
      </map:AND>
    </map:exactMatch>
  </ag:Concept>

Available resources: another possibility:
- Use OWL
- Define concepts
- Define terms
- Define strings
- Define relationships between these 3 elements: <similarTo>, <equivalentTo>, (+ SKOS suggestions) <hasSynonym>, <hasAntonym>, <hasCognate>, <hasSpellingVariant>, <hasTranslation>

Available resources: other techniques NLP:
- Knowledge discovery: helps with the creation of ontologies in a specific language
- Used to create good IS
- Concept extraction
- Multilingual search engine
- …

Conclusion:
- We need multilingual tools
- Ontologies are better than traditional thesauri
- The task is not easy
- Subject experts are essential
- NLP could help
- We need tools: to help experts carry out the mapping; to do annotations; …

Live demo:
http://www.fao.org

Slide 17:
Thank you.
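
The concept resolution and brokering slides describe looking up a concept from a term in any language and then searching with all of that concept's terms. A minimal Python sketch of that idea follows, using the fishing-vessel terms from the slides. The `Concept` class, the concept identifier, the language codes, and the function names are assumptions made for illustration; the slides do not describe how FAO's tools implement this.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A language-independent concept with its terms in each language."""
    concept_id: str  # hypothetical identifier, not taken from the slides
    labels: Dict[str, List[str]] = field(default_factory=dict)  # language -> synonyms


# Terms come from the slides; the id and language codes are assumptions.
FISHING_VESSEL = Concept(
    concept_id="c_fishing_vessel",
    labels={
        "en": ["fishing vessel", "fishing boat", "fishing craft"],
        "fr": ["bateau de pêche", "navire de pêche"],
        "zh": ["捕捞渔船"],
    },
)


def build_index(concepts: List[Concept]) -> Dict[str, Concept]:
    """Map every case-folded term to the concept it lexicalizes."""
    index: Dict[str, Concept] = {}
    for concept in concepts:
        for terms in concept.labels.values():
            for term in terms:
                index[term.casefold()] = concept
    return index


def resolve(term: str, index: Dict[str, Concept]) -> Optional[Concept]:
    """Concept resolution: a term in any language -> its concept, or None."""
    return index.get(term.casefold())


def expand_query(term: str, index: Dict[str, Concept],
                 languages: Optional[List[str]] = None) -> List[str]:
    """Brokering: all known terms for the concept behind `term`."""
    concept = resolve(term, index)
    if concept is None:
        return [term]  # unknown term: fall back to the original query
    langs = languages or sorted(concept.labels)
    return [t for lang in langs for t in concept.labels.get(lang, [])]


index = build_index([FISHING_VESSEL])
print(expand_query("bateau de pêche", index))
```

A query in French then yields the English and Chinese terms as well, which could be sent to databases indexed in those languages. A real thesaurus would also need to handle terms that belong to more than one concept, which this sketch ignores.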
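
The inter-thesaurus mapping example in the transcript can be read programmatically once its namespace prefixes are declared. The sketch below uses Python's standard `xml.etree.ElementTree`. The slide omits namespace declarations, so the `ag`, `aat`, and `map` URIs here are placeholders invented for this example; only the `rdf` URI is the standard one. The `read_mapping` function is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions); the slide does not give them.
NS = {
    "ag": "http://example.org/agrovoc#",
    "aat": "http://example.org/aat#",
    "map": "http://example.org/mapping#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

# The slide's example, with the namespace declarations added on the root.
MAPPING_XML = """
<ag:Concept xmlns:ag="http://example.org/agrovoc#"
            xmlns:aat="http://example.org/aat#"
            xmlns:map="http://example.org/mapping#"
            xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <descriptor xml:lang="fr">Academie</descriptor>
  <map:exactMatch>
    <map:AND>
      <map:memberList rdf:parseType="Collection">
        <aat:Concept><descriptor xml:lang="en">Academy</descriptor></aat:Concept>
        <aat:Concept><descriptor xml:lang="en">Buildings</descriptor></aat:Concept>
      </map:memberList>
    </map:AND>
  </map:exactMatch>
</ag:Concept>
"""

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_mapping(xml_text):
    """Extract the source descriptor and the AND-combined exact-match targets."""
    root = ET.fromstring(xml_text.strip())
    source = root.find("descriptor")
    members = root.findall(
        "map:exactMatch/map:AND/map:memberList/aat:Concept/descriptor", NS
    )
    return {
        "source": (source.get(XML_LANG), source.text),
        "match": "exact",
        "operator": "AND",
        "targets": [(d.get(XML_LANG), d.text) for d in members],
    }


result = read_mapping(MAPPING_XML)
print(result["source"], "=", " AND ".join(text for _, text in result["targets"]))
```

Read this way, the example states that the French descriptor is an exact match for the combination of the two English descriptors, rather than for either one alone.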