Information Access I
Multilingual Text Summarization
GSLT, Göteborg, October 2003
Barbara Gawronska, Högskolan i Skövde

Types of summaries (Spärck Jones 1999, Hovy & Lin 1999):
With respect to content:
· Indicative: provide an idea of what the text is about, but do not render the content
· Informative: shortened versions of the text
With respect to the way of creating:
· Extracts: reused portions of the text
· Abstracts: re-generated text reflecting the important content
· Compressed texts (Knight & Marcu 2000): compressing syntactic parse trees in order to get a shorter text

Text compression (Knight & Marcu 2000, Lin 2003):
“Given the original sentence t, find the best short sentence s generated from t, i.e. maximize P(s|t).”
Original sentence (Lin 2003): In Louisiana, the hurricane landed with wind speeds of about 120 miles per hour and caused severe damage in small coastal centres such as Morgan City, Franklin and New Iberia.

Text compression (2) (fragments of Fig. 1 in Lin 2003)

Slide5: Different genres and tasks require different summaries (informative summaries are not so good for detective stories), and different texts require different summarization techniques.

Slide6: A special case: dialogue summarization: selecting successful ‘dialog transactions’ – the game-theoretical approach (Verbmobil: Wahlster, Alexandersson)

Multilingual summarization:
· Extracting/compressing + MT
or
· Abstracting + multilingual generation

Slide8: A possible combination system including multilingual summarization of news reports

Slide9: The main objectives of the Newspeak project:
· Evaluation of different methods of semantic classification in the lexicon
· Development of a summarization module that would be well-suited for the news domain
· A comparison between ‘traditional’ machine translation (MT) on the one side, and information extraction (IE) combined with reading comprehension (RC) and multilingual text generation (MTG) on the other side
· Exploration of the interplay between textual structure, syntax, and prosodic markers

Slide10: GUERILLA FIGHTS IN LEBANON
Israeli warplanes and artillery attacked suspected guerrilla hideouts Friday following a series of clashes in south Lebanon. Four guerrillas were reportedly killed. Guerrillas of the Syrian-backed Amal group attacked Israeli and allied militia positions in the Israeli-occupied zone at daybreak, Lebanese security officials said. Three guerrillas were killed in the assaults, said an Israeli army spokesman in Jerusalem. Amal said none of its fighters was killed.
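The sample report above attributes each statement to a different source, so the statements cannot simply be merged into a single list of facts. A minimal sketch of one way to keep every claim attached to its source is given below. The `Claim` record, the `group_by_source` helper and the speech-act labels are hypothetical names chosen for illustration; they are not taken from the Newspeak system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One reported statement, kept together with whoever is said to have made it."""
    source: str      # reported speaker; "unspecified" when the text only says "reportedly"
    speech_act: str  # surface marker in the report, e.g. "said" or "reportedly"
    content: str     # the proposition attributed to the source


def group_by_source(claims):
    """Group claim contents by source, preserving text order within each group."""
    spaces = defaultdict(list)
    for claim in claims:
        spaces[claim.source].append(claim.content)
    return dict(spaces)


# Claims read off the sample report by hand (an illustration, not system output).
SAMPLE_CLAIMS = [
    Claim("unspecified", "reportedly", "four guerrillas were killed"),
    Claim("Lebanese security officials", "said",
          "Amal guerrillas attacked Israeli and allied militia positions"),
    Claim("Israeli army spokesman", "said",
          "three guerrillas were killed in the assaults"),
    Claim("Amal", "said", "none of Amal's fighters was killed"),
]

if __name__ == "__main__":
    for source, contents in group_by_source(SAMPLE_CLAIMS).items():
        print(f"{source}: {contents}")
```

Keeping the casualty figures in separate groups leaves the disagreement between the Israeli spokesman and Amal visible instead of forcing a choice between them.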
One of the main problems with media texts: no possibility of stating what is a true fact (hence, some criticism could be raised against TREC factoid questions...)

Slide11: The Theory of Mental Spaces (Fauconnier 1985, Fauconnier and Sweetser 1996)

Slide12: The notion of ‘mental spaces’ (Fauconnier 1985, Sweetser & Fauconnier 1996, Sanders & Redeker 1996)

Slide13: GUERILLA FIGHTS IN LEBANON (same sample text and remark as on Slide10)

Slide14: ‘Mental Spaces’ in sample text 1

Slide15: Sample text 2
BEIT JALA, West Bank
Israeli troops pulled out of Beit Jala before dawn on Thursday, leaving the Palestinian town quiet amid reports of fresh violence in other West Bank towns. The Palestinians said the Israel Defence Forces had staged incursions into Hebron, killing one and injuring 16 others, and Tulkarem, killing one and injuring 10. The Israel Defence Forces (IDF) had no immediate comment on the accusation that troops had entered Tulkarem, and strongly denied there was an incursion at Hebron.

Slide16: ‘Mental Spaces’ in sample text 2

Slide17: Newspeak – the extraction and generation modules

Slide18: WordNet Classification: Exploding objects: missile, bomb

Slide19: WordNet vs. Newspeak noun classification

Slide20: WordNet vs. Newspeak noun classification (2)

Slide21: The outline of the summarization process

Slide22: Named Entity Recognition and Classification

Slide23: Sample text 3
Iraqi President Saddam Hussein is striking a defiant tone a day after U.S. President George Bush's State of the Union address, saying his nation is ready to "destroy and defeat" any American attack. In a televised meeting with his military commanders on Wednesday, Saddam said the U.S. had no right to attack his country, and every American soldier is coming "as an aggressor." "If they have illusions, by God, America will be harmed," the Iraqi leader said. "[It is] not in the American people's interest that such harm come to it, its reputation and economy." In a powerful address Tuesday evening, Bush braced Americans and the rest of the world for a possible war with Iraq, warning that America was determined in its resolve to see Saddam disarmed.

Slide24:
[source(semcat(Iraqi President Saddam Hussein,[propername,human([]),human([high_status])])),semcat(tone,[[],speech_act(manner)]),circ([semcat(is,[[],cop([])]),semcat(striking,[[],[]]),semcat(a,[[],det([])]),semcat(defiant,[[],[]])]),said([semcat(a,[[],det([])]),semcat(day,[[],time_period([])]),semcat(after,[[],prep([])]),semcat(U.S.
President George Bush_s State,[propername,place([country]),group_of_people([]),human([high_status]),human([]),place([d23,convent_borders])]),semcat(of,[[],prep([])]),semcat(the,[[],det([])]),semcat(Union,[propername,explosion([]),group_of_people([]),place([country])]),semcat(address,[[],speech_act([neutral]),place([d2])]),semcat(saying,[[],say_verb([neutral])]),semcat(his,[[],poss([])]),semcat(nation,[[],place([country]),group_of_people([])]),semcat(is,[[],cop([])]),semcat(ready,[[],[]]),semcat(to,[[],prep([])]),semcat(",[[],[]]),semcat(destroy,[[],[]]),semcat(and,[[],konj([])]),semcat(defeat,[[],[]]),semcat(",[[],[]]),semcat(any,[[],det([])]),semcat(American,[propername,human([])]),semcat(attack,[[],military_operation([])]),semcat(.,[[],[]])]),[]]
[source(semcat(Saddam,[propername,[]])),semcat(said,[[],say_verb([neutral])])…
Sample output from SemCat + speaker and speech act identification

Slide25: Sample output from SemCat + speaker and speech act identification (2): coreference checked
[source(semcat(Iraqi President Saddam Hussein,[propername,human([]),human([high_status])])),semcat(tone,[[],speech_act(manner)]),circ([semcat(is,[[],cop([])]),semcat(striking,[[],[]]),semcat(a,[[],det([])]),semcat(defiant,[[],[]])]),said([semcat(a,[[],det([])]),semcat(day,[[],time_period([])]),semcat(after,[[],prep([])]),semcat(U.S. President George Bush_s State,[propername,place([country]),group_of_people([]),human([high_status]),human([]),place([d23,convent_borders])]),semcat(of,[[],prep([])]),semcat(the,[[],det([])]),semcat(Union,[propername,explosion([]),group_of_people([]),place([country])]),semcat(address,[[],speech_act([neutral]),place([d2])]),semcat(saying,[[],say_verb([neutral])]),semcat(his,[[],poss([])]),semcat(nation,[[],place([country]),group_of_people([])]),semcat(is,[[],cop([])]),semcat(ready,[[],[]]),semcat(to,[[],prep([])]),semcat(",[[],[]]),semcat(destroy,[[],[]]),semcat(and,[[],konj([])]),semcat(defeat,[[],[]]),semcat(",[[],[]]),semcat(any,[[],det([])]),semcat(American,[propername,human([])]),semcat(attack,[[],military_operation([])]),semcat(.,[[],[]])]),[]]
[source(semcat(Iraqi President Saddam Hussein,[propername,human([]),human([high_status])])),semcat(said,[[],say_verb([])]),…

Slide26: Searle’s classification of illocutionary acts

Slide27: The classification of speech act phrases in the Newspeak lexicon (1)

Slide28: The classification of speech act phrases in the system lexicon (2)

Slide29: Some principles for selection of claims to be rendered:
1) Informatives:
· Neutral, the sender is not marked for high status: officials said, the news agency reported, reportedly…
  A claim p introduced by a neutral informative is rendered in the summary; the source is omitted if there are no denials or confirmations of p in the text and if the source is not marked for high status, like ‘President’.
· Neutral, the sender marked for high status, and ‘declarations’: the President said… the government condemned…
  The source is rendered if it is marked for high status.
· Affirmative; confirmations of explicit claims: Israeli sources confirmed that…
  Confirmations of previous explicit claims are omitted in the summary.
· Affirmative; confirmations of claims that are not explicitly mentioned:
  Both the information source and the claim,
  including the type of the speech act phrase, are rendered in the summary if the speech act is a confirmation of a claim not present in the news report.

Slide30: Some principles for selection of claims to be rendered:
1) Informatives:
· Negative, or neutral followed by denied claims: The president denied, The Israeli source said that it is not true…
  Both the initial claim and its denial are rendered in the summary together with the information about the senders.

Slide31: 2) Utterance refusal, negated speech act phrases, hypotheses, commissives, interpretations: The Israeli sources neither denied nor confirmed, the minister did not say, if…, the defense secretary declined to say…, the government had no immediate comments…
· Utterance refusals or negated speech act phrases related to an explicit claim are omitted.
· If a source refuses to confirm/deny a claim that has not been explicitly mentioned in the previous part of the text, the whole speech act is rendered, including the type of the speech act.
· Hypotheses and commissives are rendered together with their sources and marked for unsure epistemic status.

Slide32: Some principles for selection of claims to be rendered:
3) Epistemic spaces: e.g. no one knows if the device was planted deliberately or if it was left over from New Year’s Eve
If two claims would exclude each other in the same mental space, and if no source in the text takes responsibility for any of these claims, both claims are to be rendered as hypotheses.

Slide33: Sample input text
RAMALLAH, West Bank -- Palestinian leader Yasser Arafat said Thursday that elections as part of a reform of the Palestinian Authority will be held this winter, whether or not Israeli forces withdraw from the Palestinian territories. That represented a change of course from Arafat, who said last week that no elections would be held until the Israelis pulled back. Shortly after Arafat's announcement, a committee he had appointed to set up elections resigned, according to Israel Radio, because Arafat would not agree to a specific date for the elections. Other Palestinian leaders said the resignations were a procedural matter. Arafat also condemned Wednesday's suicide bombing in the Israeli town of Rishon Letzion. Two Israelis were killed and at least 37 others wounded when the bomber detonated explosives in the center of a crowded pedestrian district. The terror attack marked the second time in two weeks a suicide bombing directed at civilians has rocked Rishon Letzion, a town about 15 miles southeast of Tel Aviv. On May 8, a suicide attack at a pool hall killed 15 people and wounded dozens of others. "Suddenly there was an explosion," 16-year-old Shmuel Voller told The Associated Press on Wednesday. The bombing occurred on Rothschild Street in the heart of the town around 9:15 p.m. (2:15 p.m. ET).

Slide34: Generation: sample summary
RAMALLAH, West Bank -- Palestinian leader Yasser Arafat said Thursday that elections as part of a reform of the Palestinian Authority will take place this winter, whether or not Israeli forces withdraw from the Palestinian territories. On Wednesday, a suicide bombing took place in the Israeli town of Rishon Letzion, on Rothschild Street in the center of a crowded pedestrian district, around 9:15 p.m. (2:15 p.m. ET). Two Israelis were killed and at least 37 others wounded. Arafat condemned the attack.

TL vocabulary more restricted than SL vocabulary; TL patterns fit textual/semantic relations:
Swedish: Israeliska trupper tågade ut ur Beit Jala
  Israeli+pl troops marched out of/left Beit Jala
  (tågade ut ur instead of *drog ut av)
Polish: Wojska izraelskie wycofały się z Beit Jala
  Troops Israeli backed out from Beit Jala
  (wycofały się instead of *wyciągnęły or *wyciągały)
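The principles for selecting claims (informatives, refusals, hypotheses and commissives) read like a small decision procedure, so a rough sketch may help. The code below is only an illustration under stated assumptions: the `SpeechAct` record, its flags, the `render` and `summarize` functions and the English output templates are all hypothetical, the `contested` and `claim_explicit` flags are set by hand rather than computed from a text, and the principle about mutually exclusive claims in one epistemic space is not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeechAct:
    source: str
    act: str    # "neutral", "confirm", "deny", "refuse", "hypothesis" or "commissive"
    claim: str
    high_status: bool = False     # source marked for high status, e.g. 'President'
    claim_explicit: bool = False  # confirm/refuse: the claim is stated elsewhere in the text
    contested: bool = False       # neutral: the claim is confirmed or denied elsewhere


def render(sa: SpeechAct) -> Optional[str]:
    """Return the summary wording for one speech act, or None if it is omitted."""
    if sa.act == "neutral":
        # The source is kept only for high-status senders or contested claims.
        if sa.high_status or sa.contested:
            return f"{sa.source} said that {sa.claim}"
        return sa.claim
    if sa.act == "confirm":
        # Confirmations of claims already stated in the text are dropped.
        if sa.claim_explicit:
            return None
        return f"{sa.source} confirmed that {sa.claim}"
    if sa.act == "deny":
        # A denial is always rendered together with its sender.
        return f"{sa.source} denied that {sa.claim}"
    if sa.act == "refuse":
        # Refusals about an explicit claim are dropped; otherwise keep the whole act.
        if sa.claim_explicit:
            return None
        return f"{sa.source} declined to confirm or deny that {sa.claim}"
    if sa.act in ("hypothesis", "commissive"):
        # Rendered with the source and flagged as epistemically unsure.
        return f"{sa.source}: {sa.claim} (epistemic status: unsure)"
    raise ValueError(f"unknown speech act type: {sa.act}")


def summarize(acts):
    """Apply render to each speech act in text order and drop the omitted ones."""
    return [line for line in map(render, acts) if line is not None]


# Speech acts read off sample text 2 (Beit Jala) by hand.
SAMPLE_TEXT_2 = [
    SpeechAct("The Palestinians", "neutral",
              "Israeli forces had staged an incursion into Hebron", contested=True),
    SpeechAct("The IDF", "deny", "there was an incursion at Hebron"),
    SpeechAct("The IDF", "refuse", "troops had entered Tulkarem", claim_explicit=True),
]

if __name__ == "__main__":
    for line in summarize(SAMPLE_TEXT_2):
        print(line)
```

On the Beit Jala report this keeps the Palestinian claim about Hebron together with the IDF denial, and drops the "no immediate comment" on Tulkarem because that refusal concerns a claim already stated in the text.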
Slide36: Generation
E: A bomb exploded in Bilbao, Spain, early Friday morning.
S: En bomb exploderade i den spanska staden Bilbao tidigt på fredagsmorgonen
   a bomb explode-past in def Spanish city Bilbao early on Friday-morning-def
E: There were no injuries.
S: Inga personskador rapporterades
   no person-injuries report-past-passive
E: ETA is suspected of being responsible for the attack.
S: Förmodligen ligger ETA bakom bombdådet.
   Presumably lie-pres ETA behind bomb-outrage-def

Slide37: The grammatical and semantic characteristics of Polish nouns

Slide39: Chłopcy sta-li
Boy+PL stand+PAST+PL+MALE+HUMAN
‘The boys were standing (there)’

Slide40: Extracting ‘superanimate’ nouns (1)
Pojawili się więc Algierczycy, Jemeńczycy, obywatele Bangladeszu, Uzbecy, Kirgizi i Tadżycy.
‘There arrived Algerians, Yemenis, citizens of Bangladesh, Uzbeks, Kirgizis, and Tadjiks’
Stop-list and a suffix list with declension numbers:
Word          Tag               Declension no.
Algierczycy   n hum ma pl nom   35
Jemeńczycy    n hum ma pl nom   35
obywatele     n hum ma pl nom   14
Uzbecy        n hum ma pl nom   36
Kirgizi       n hum ma pl nom   38
Tadżycy       n hum ma pl nom   35
Pojawili      v hum ma pl

Slide41: Extracting ‘superanimate’ nouns (2)
Postverbal subjects: We wtorek w stolicy Kataru zebrali się na nieformalnej konferencji ministrowie 22 państw Ligi Arabskiej.
‘The ministers of 22 Arab League countries gathered together at an informal conference in the capital of Qatar on Tuesday.’
Preverbal subjects: W przyjętej w Dausze wspólnej deklaracji Arabowie zdecydowanie potępili terroryzm we wszelkich formach.
‘In the joint declaration the Arab leaders have strongly condemned all forms of terrorism.’
Antecedents of the relative pronoun ‘którzy’: Komórka składała się z wielu dziesiątek osób, w tym dwóch pilotów, którzy kształcili się w tych samych szkołach amerykańskich, co Mohammed Atta.
‘The cell consisted of dozens of people, including two pilots, who had completed their education at the same American schools that Mohammed Atta attended.’

Slide42: The decrease of unknown superanimate noun forms during the training phase (training on 4 files, ca 11 000 words each) – normalized data

Slide43: Correctly classified nouns

Slide44: The results of post-editing after the training phase. The lexical coverage of different text domains

Slide45: The general procedure for extracting and classifying different word classes in Polish

Slide46: GUI for linking WN synsets and Polish words

Slide47: Further development
· Domain extension
· Further work on the target lexicon
· Feedback from the generation module into the source lexicon
· Continued study of relations between textual structure and prosody
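The slides on extracting ‘superanimate’ nouns pair a verb form marked for masculine-human plural (sta-li, Pojawili) with a stop-list and a suffix list. A minimal sketch of that idea is given below, assuming that a past-tense verb in -li signals a masculine-human plural subject in the same sentence. The particular stop-list entries, the noun suffixes and the function names are hypothetical choices for illustration, and the declension numbers from the slide are left out.

```python
import re

# Hypothetical stop-list: frequent words ending in "-li" that are not verbs.
STOP_LIST = {"czyli", "jeśli", "jeżeli"}

# Hypothetical suffix list for nominative plural masculine-human nouns.
NOUN_SUFFIXES = ("owie", "cy", "zi", "ele")


def tokenize(sentence):
    # In Python 3 the \w class matches Unicode letters such as ń and ż.
    return re.findall(r"\w+", sentence)


def virile_verbs(tokens):
    """Tokens that look like past-tense masculine-human plural verb forms (-li)."""
    return [t for t in tokens
            if t.lower().endswith("li") and t.lower() not in STOP_LIST]


def superanimate_candidates(sentence):
    """Return suffix-matched noun candidates, but only if a -li verb is present."""
    tokens = tokenize(sentence)
    if not virile_verbs(tokens):
        return []
    return [t for t in tokens if t.lower().endswith(NOUN_SUFFIXES)]


if __name__ == "__main__":
    sentence = ("Pojawili się więc Algierczycy, Jemeńczycy, obywatele "
                "Bangladeszu, Uzbecy, Kirgizi i Tadżycy.")
    print(superanimate_candidates(sentence))
```

A suffix match alone would over-generate on running text, which is presumably why the slides also mention post-editing after the training phase; the sketch makes no attempt to model that step.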