Presentation Transcript

Slide 1: EVALITA 2007: The Named Entity Recognition Task
  Manuela Speranza, FBK-irst
  EVALITA 2007 Workshop, Rome, September 10, 2007

Outline:
  Named Entity Recognition at EVALITA 2007
  Introduction to the task
  Participants
  Evaluation
  Dataset
  Metrics
  Results
  Ranking
  Discussion
  Conclusion

Introduction to the NER Task:
  Task: recognize Named Entities in Italian newspaper articles
  Four types of Named Entities:
    Geo-Political Entities (GPE), e.g. Italy
    Location Entities (LOC), e.g. Tevere
    Organization Entities (ORG), e.g. FIAT
    Person Entities (PER), e.g. Napolitano
  Based on the ACE Entity Recognition and Normalization Task
  Adaptations from ACE:
    limit the task to the recognition of Named Entities
    adapt it to Italian

Participants:
  In the NER Task we had six participants:
    FBK-irst, Trento (FBKirst_Zanoli_NER)
    LDC, University of Pennsylvania (LDC_Walker_NER)
    University of Alicante (UniAli_Kozareva_NER)
    University of Dortmund (UniDort_Jungermann_NER)
    University of Duisburg-Essen (UniDuE_Roessler_NER)
    Yahoo, Barcelona (Yahoo_Ciaramita_NER)
  Only one Italian institution, compared with two from Spain and two from Germany
  One participant from the USA

Evaluation Dataset: I-CAB (i):
  525 news stories from the Italian local newspaper “L’Adige”
  4 days, in two sections: 7-8 September 2004 and 7-8 October 2004
  5 categories: News Stories, Cultural News, Economic News, Sports News, Local News
  Number of words = 182,500
  Average number of words per file = 348
  Split: training (335 news stories), test (190 news stories)

Evaluation Dataset: I-CAB (ii):
  (no slide text in the transcript)

Evaluation of Results:
  Scorer: CoNLL Shared Task 2002
  Metrics: Precision (Pr.), Recall (Re.), and F-Measure (FB1)
  Official ranking is based on FB1

Official Ranking:
  (two slides, no slide text in the transcript)

Discussion:
  (four slides, no slide text in the transcript)

Conclusions:
  Good interest from the community:
    14 initial registrations
    6 participants (though only one Italian institution)
    Relatively high rate of abandonment (8/14, 60%)
  Good performance:
    best system at CoNLL: 88.8% for English, 72.4% for German
    best system at EVALITA: 82.1%

Slide 15: Thanks to all who participated

References:
  ACE. http://www.nist.gov/speech/tests/ace/index.htm
  CoNLL. http://www.cnts.ua.ac.be/conll2002/ner/
  L’Adige. http://www.ladige.it/
  Linguistic Data Consortium (LDC). Automatic Content Extraction English Annotation Guidelines for Entities, version 5.6.1, 2005.05.23. http://projects.ldc.upenn.edu/ace/docs/English-Entities-Guidelines_v5.6.1.pdf
  Magnini, Cappelli, Pianta, Speranza, Bartalesi Lenzi, Sprugnoli, Romano, Girardi, Negri. Annotazione di contenuti concettuali in un corpus italiano: I-CAB. In Proceedings of SILFI 2006, X Congresso Internazionale della Società di Linguistica e Filologia Italiana, Firenze, 14-17 giugno 2006.
  Magnini, Pianta, Speranza, Bartalesi Lenzi, Sprugnoli. Italian Content Annotation Bank (I-CAB): Named Entities. Technical report, ITC-irst, 2007. http://evalita.itc.it/tasks/I-CAB-Report-Named-Entities.pdf
  ONTOTEXT. http://ontotext.itc.it/
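
The "Evaluation of Results" slide names Precision, Recall, and F-Measure (FB1) as the metrics but does not spell them out. The following is a minimal sketch of how such scores are typically computed, assuming (as in CoNLL-style scoring) that a predicted entity counts as correct only when both its span and its type exactly match a gold entity. The `Entity` tuple layout, the `score` function, and the toy data are hypothetical illustrations; they are not the official scorer or the I-CAB file format.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical entity representation: (document id, first token, last token, type).
# This layout is an assumption for illustration, not the I-CAB or CoNLL format.
Entity = Tuple[str, int, int, str]


class Scores(NamedTuple):
    precision: float
    recall: float
    fb1: float


def score(gold: Set[Entity], predicted: Set[Entity]) -> Scores:
    """Exact-match precision, recall, and balanced F-measure (FB1)."""
    # An entity is correct only if span AND type both match a gold entity.
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    # FB1 is the harmonic mean of precision and recall.
    fb1 = 2 * precision * recall / total if total else 0.0
    return Scores(precision, recall, fb1)


# Toy data (invented): four gold entities, three predictions.
gold = {
    ("doc1", 0, 0, "PER"),
    ("doc1", 3, 3, "GPE"),
    ("doc1", 7, 8, "ORG"),
    ("doc1", 10, 10, "LOC"),
}
predicted = {
    ("doc1", 0, 0, "PER"),
    ("doc1", 3, 3, "LOC"),  # right span, wrong type: counted as an error
    ("doc1", 7, 8, "ORG"),
}

result = score(gold, predicted)
print(f"Pr. = {result.precision:.4f}  Re. = {result.recall:.4f}  FB1 = {result.fb1:.4f}")
```

In this toy case two of the three predictions are correct against four gold entities, so precision is 2/3, recall is 1/2, and FB1 is 4/7. Note that the mistyped entity is penalized twice under exact matching: it lowers precision as a spurious prediction and lowers recall as a missed gold entity.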