
Text Summarization: 

Text Summarization. Manabu Okumura, Precision & Intelligence Laboratory, Tokyo Institute of Technology

Table of Contents: 

Table of Contents
1. What is text summarization?
2. Sentence extraction as a summarization method
3. Current trends in summarization methods
4. Evaluation of summaries

Headline news — informing: 

Headline news — informing From the tutorial of Hovy & Marcu in COLING/ACL’98

Abstracts of papers — time saving: 

Abstracts of papers — time saving From the tutorial of Hovy & Marcu in COLING/ACL’98

Text summarization = reducing the length or complexity of the original text without losing the main content:

Definition: text summarization is the process of reducing the length or complexity of the original text, without losing its main content.

Input and Output: 

Input and Output
- Input: source text; compression rate (or summary length), where rate = summary length / source length
- Output: summary text

Current Applications: 

Current Applications
- Search engines: summarize the information in the hit lists retrieved by search engines
- Meeting summarization: find out what happened at the conference I missed
- Hand-held devices: create a screen-sized summary of a book
- Aids for the handicapped: compact the text and read it out for blind users
- ...

Types of Summary: 

Types of Summary
- Indicative vs. informative: used for quick categorization vs. content processing.
- Extract vs. abstract: lists fragments of text vs. re-phrases content coherently.
- Generic vs. query-biased: provides the author's view vs. reflects the user's interest.
- Single-document vs. multi-document source: based on one text vs. fuses together many texts.

Summary Function: 

Summary Function
- Indicative summaries provide a reference function for selecting documents for more in-depth reading.
- Informative summaries cover all the important information in the source at some level of detail.

Extract vs. Abstract: 

Extract vs. Abstract
- An extract is a summary consisting entirely of material copied from the source.
- An abstract is a summary at least some of whose material is not present in the source.

Aspects that describe summaries: 

Aspects that describe summaries [Sparck Jones, 97]
- Input: subject type (domain); genre (newspaper articles, editorials, letters, reports, ...); source size (single doc; multiple docs, few or many)
- Purpose: situation (embedded in a larger system such as MT or IR, or not); audience (focused or general); usage (IR, sorting, skimming, ...)
- Output: format (paragraph, table, etc.); style (informative, indicative, ...)

A Summarization Machine: 

A Summarization Machine
[Diagram: a summarization machine takes a document (DOC), and optionally a QUERY, and produces extracts or abstracts: indicative or informative, generic or query-oriented, over single or multiple documents, at compression rates such as 10%, 50%, or 100%.]
Modified from the tutorial of Hovy & Marcu in COLING/ACL'98

Necessary NLP techniques: 

Necessary NLP techniques
- Morphological analyzer (tokenizer, POS tagger)
- Parser (syntactic/rhetorical)
- Discourse interpretation (coreference resolution)
- Language generation

Related Technologies: 

Related Technologies
- Information Retrieval (IR): query-biased summarization
- Information Extraction (IE): key information is known beforehand (as a template)
- Question Answering (QA)
- Text Mining
- Text Classification
- Text Clustering

Slide15: 

Example of Information Extraction (Domain: Turnover)
(Input) Sam Schwartz retired as executive vice president of the famous hot dog manufacturer, Hupplewhite Inc. He will be succeeded by Harry Himmelfarb.
(Output)
EVENT: leave job | PERSON: Sam Schwartz | POSITION: executive vice president | COMPANY: Hupplewhite Inc.
EVENT: start job | PERSON: Harry Himmelfarb | POSITION: executive vice president | COMPANY: Hupplewhite Inc.

Example of QA: 

Example of QA
Q: What is the fastest car in the world?
Correct answer: "..., the Jaguar XJ220 is the dearest (Pounds 415,000), fastest (217 mph / 350 kmh) and most sought-after car in the world."
Wrong answer: "... will stretch Volkswagen's lead in the world's fastest-growing vehicle market. Demand is expected to soar in the next few years as more Chinese are able to afford their own cars."

A Generic QA Architecture: 

A Generic QA Architecture
[Diagram: question Q → Question Processing → queries → Document Retrieval (IR) over document resources → relevant documents → Answer Extraction & Formulation → answers]

Slide18: 

Text Summarization Methods (1/2)
Ideal method:
1. Understand the text,
2. Transform the analytical result of the text into the internal representation of a summary (i.e. find the important parts of the analytical result),
3. Generate a summary from the internal representation.
→ Current method:
1. Extract the important parts (sentences) of the text,
2. Simply arrange them in their original order in the text.

Text Summarization Methods(2/2): 

Text Summarization Methods (2/2)
- Sentence extraction
- Sentence simplification (compression): try to shorten the text by removing unimportant parts of the sentences

Information in text useful for sentence extraction: 

Information in text useful for sentence extraction [Paice, 90]:
1. word frequency in the text
2. title of the text
3. position of sentences
4. cue phrases in the text
5. discourse structure of the text
6. cohesion in the text
...

1. Word frequency in text:

1. Word frequency in text
Important sentences contain content words that occur frequently in the text; content words that occur frequently in a text tend to indicate its topic. The importance of a word is calculated from its frequency (tf), and the importance of a sentence is calculated from the importance of the words it contains. [Luhn, 58], [Zechner, 96]
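As a concrete illustration, here is a minimal sketch of frequency-based sentence scoring in Python (not Luhn's exact algorithm; the tokenizer and stopword list are simplified stand-ins):

```python
# Rank sentences by the average tf of their content words.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "are", "that", "it"}

def content_words(sentence):
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]

def rank_by_frequency(sentences):
    # term frequency over the whole text
    tf = Counter(w for s in sentences for w in content_words(s))
    def importance(s):
        words = content_words(s)
        return sum(tf[w] for w in words) / len(words) if words else 0.0
    return sorted(sentences, key=importance, reverse=True)

doc = ["Summarization reduces text length.",
       "Word frequency indicates the topic of the text.",
       "The weather is nice."]
print(rank_by_frequency(doc)[0])  # the sentence whose words recur most in the text
```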

Example 2:

Example 2 (translated from Japanese)
In science, words cut into and carve up things and phenomena like a sharp scalpel. Having carved them apart, they then recombine those things; that is the role of words in science. The words of science therefore detach themselves from the scientist's self and attach to the things themselves. That science excludes subjectivity and presents only objective facts is due to this detachment of words from the self. In contrast, words in poetry never leave the self, even when they express things. Even when the thing itself seems to be speaking, the words are held in the poet's own hands. So however objective a poem's description may appear, the poet's own words live within it. (From 石田春夫 (Ishida Haruo), 「学生のための自分学」)
Frequent content words: 言葉 'words': 7, 科学 'science': 6, 詩 'poetry': 4, 自己 'self': 5

Slide23: 

Genre-dependent text structure
Texts tend to have a structure that depends on their genre.
- Technical papers: introduction, main parts, conclusion
- Newspapers: headlines, subheads, body

2. Title of text:

2. Title of text
Titles and headings can be regarded as concise summaries of a text. Sentences that contain content words from titles and headings are important.
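A minimal sketch of the title method, reusing the content_words helper from the frequency sketch above:

```python
# Score a sentence by how many of the title's content words it contains.
def title_score(sentence, title):
    title_words = set(content_words(title))
    return sum(1 for w in content_words(sentence) if w in title_words)

print(title_score("Frequency indicates the topic.", "Word frequency in text"))  # 1
```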

3. Position of sentence:

3. Position of sentence
- Technical papers: topical sentences tend to occur at the initial position of paragraphs; important sentences tend to occur at the beginning or end of the text.
- Newspapers: extracting the first few sentences of the body is a good method (the Lead method). The Lead method is said to be quite effective for newspaper summarization, with more than 90% acceptability. [Brandow, 95], [Wasson, 98]
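The Lead method itself is nearly a one-liner; a sketch:

```python
# Take the first sentences of the body, up to the target compression rate.
def lead_extract(sentences, rate=0.1):
    n = max(1, int(len(sentences) * rate))
    return sentences[:n]
```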

Optimum Position Policy (OPP): 

Optimum Position Policy (OPP)
Claim: important sentences are located at positions that are genre-dependent; these positions can be determined automatically through training (Lin and Hovy, 97).
Corpus: 13,000 newspaper articles (ZIFF corpus).
Step 1: for each article, determine the overlap between sentences and the index terms for the article.
Step 2: determine a partial ordering over the locations where sentences containing important words occur: the Optimum Position Policy (OPP).
From the tutorial of Hovy & Marcu in COLING/ACL'98

OPP (cont.):

OPP (cont.)
OPP for the ZIFF corpus: (T) > (P2,S1) > (P3,S1) > (P2,S2) > {(P4,S1),(P5,S1),(P3,S2)} > ... (T = title; P = paragraph; S = sentence)
OPP for the Wall Street Journal: (T) > (P1,S1) > ...
Results on a test corpus of 2,900 articles: Recall = 35%, Precision = 38%; 10% extracts cover 91% of the salient words.
From the tutorial of Hovy & Marcu in COLING/ACL'98

4. Cue phrases in text:

4. Cue phrases in text
There are cue phrases that are positively or negatively correlated with important sentences.
- Positive: 'In this paper', 'In conclusion', 'our work', ... (in technical papers)
- Negative: conjunctives that indicate illustration, such as 'for example'
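A minimal sketch of cue-phrase scoring; the phrase lists and weights below are illustrative placeholders, not a published lexicon:

```python
# Sum the weights of bonus and stigma phrases found in a sentence.
BONUS = {"in this paper": 2.0, "in conclusion": 2.0, "our work": 1.0}
STIGMA = {"for example": -1.0}

def cue_score(sentence):
    s = sentence.lower()
    return (sum(w for p, w in BONUS.items() if p in s)
            + sum(w for p, w in STIGMA.items() if p in s))

print(cue_score("In conclusion, our work shows..."))  # 3.0
```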

5. Discourse structure of text:

5. Discourse structure of text
The discourse structure of a text can be constructed from surface information in the text, such as discourse markers (conjunctives); the 'centrality' of a sentence within the structure reflects its importance. [Miike, 94], [Marcu, 97]

Rhetorical parsing (Marcu,97): 

Rhetorical parsing (Marcu,97) [With its distant orbit {– 50 percent farther from the sun than Earth –} and slim atmospheric blanket,1] [Mars experiences frigid weather conditions.2] [Surface temperatures typically average about –60 degrees Celsius (–76 degrees Fahrenheit) at the equator and can dip to –123 degrees C near the poles.3] [Only the midday sun at tropical latitudes is warm enough to thaw ice on occasion,4] [but any liquid water formed that way would evaporate almost instantly5] [because of the low atmospheric pressure.6] [Although the atmosphere holds a small amount of water, and water-ice clouds sometimes develop,7] [most Martian weather involves blowing dust or carbon dioxide.8] [Each winter, for example, a blizzard of frozen carbon dioxide rages over one pole, and a few meters of this dry-ice snow accumulate as previously frozen carbon dioxide evaporates from the opposite polar cap.9] [Yet even on the summer pole, {where the sun remains in the sky all day long,} temperatures never warm enough to melt frozen water.10] From the tutorial of Hovy & Marcu in COLING/ACL’98

Rhetorical parsing (2): 

Rhetorical parsing (2)
[Discourse tree over text units 1-10, built from relations such as Evidence, Cause, Contrast, Elaboration, Background, Justification, Concession, Antithesis, and Example.]
Summarization = selection of the most important units: 2 > 8 > 3, 10 > 1, 4, 5, 7, 9 > 6
From the tutorial of Hovy & Marcu in COLING/ACL'98

Slide32: 

Merits of the method
- Summaries of various lengths can be obtained, by cutting the tree (structure) at any depth.
- A more coherent summary might be obtained, since summarization is based on the discourse structure.

6. Cohesion in text:

6. Cohesion in text
Important sentences are the ones that are connected with many other sentences.
a) Use lexical cohesion to determine the connectedness between sentences [Skorokhod'ko, 72]
b) Use similarity between sentences to determine the connectedness between them [Salton, 96]
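A minimal sketch of connectivity-based scoring in the spirit of [Salton, 96]: link sentence pairs whose word overlap exceeds a threshold, then rank sentences by their degree in the resulting graph (the similarity function is a simple stand-in):

```python
# Jaccard word overlap as a crude sentence similarity.
def overlap_sim(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def rank_by_degree(sentences, threshold=0.1):
    # degree = number of other sentences this one is connected to
    def degree(i):
        return sum(1 for j in range(len(sentences))
                   if j != i and overlap_sim(sentences[i], sentences[j]) > threshold)
    return sorted(range(len(sentences)), key=degree, reverse=True)  # ranked indices
```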

Slide34: 

Lexical cohesion
Semantic relationships between sentences are indicated by the use of related words in them. [Halliday & Hasan, 76]

Slide36: 

[Table 2: Measuring lexical cohesion in text unit pairs]

Sentence extraction by combining multiple information: 

Sentence extraction by combining multiple information
How to combine multiple sources of information? The importance of each sentence is the weighted sum of the importance scores from the individual sources (linear combination).
How to weight each source?
- Human tuning [Edmundson, 69]
- Automatic weighting using a set of summaries as a training corpus, e.g. multiple regression analysis [Watanabe, 96]
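A minimal sketch of the linear combination; the weights are placeholders to be hand-tuned or fit by regression:

```python
# Weighted sum over the individual scorers defined in the sketches above.
def combined_score(sentence, scorers, weights):
    return sum(w * f(sentence) for f, w in zip(scorers, weights))

# e.g. combined_score(s, [cue_score, lambda s: title_score(s, doc_title)], [0.3, 0.7])
```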

[Edmundson, 69]: 

[Edmundson, 69]
- Cue method: stigma words ('hardly', 'impossible'); bonus words ('significant')
- Key method: similar to [Luhn, 58]
- Title method: title + headings
- Location method: sentences under headings; sentences near the beginning or end of the document and/or its paragraphs
From the tutorial of Radev in SIGIR'04

[Edmundson, 69] (2): 

[Edmundson, 69] (2)
Linear combination of the four features: score(s) = α·Cue(s) + β·Key(s) + γ·Title(s) + δ·Location(s), with weights tuned on a manually labelled training corpus. Finding: the Key feature turned out not to be important.
From the tutorial of Radev in SIGIR'04

Machine learning methods for sentence extraction(1/2): 

Machine learning methods for sentence extraction (1/2)
Sentence extraction can be considered a classification task: classifying sentences into two classes (important/unimportant).

Slide41: 

Machine learning methods for sentence extraction (2/2)
- Probabilistic learning: learn the conditional probability, given multiple features, that a sentence belongs to the summary [Kupiec, 95], [Jang, 97], [Teufel, 97]
- Decision tree learning: learn rules over multiple features for classifying sentences [Nomoto, 97], [Aone, 97]

Slide42: 

Framework for ML-based Sentence Extraction From the tutorial of Maybury & Mani in ACL’01

[Kupiec et al., 95]: 

[Kupiec et al., 95]
Extracts of roughly 20% of the original text. Feature set:
- Sentence length: |S| > 5
- Fixed phrases: 26 manually chosen
- Paragraph: sentence position in the paragraph
- Thematic words: binary (presence of frequent content words)
- Uppercase words: not common acronyms
Label: whether the sentence is included in the manual extract.
Corpus: 188 document + summary pairs from scientific journals.
From the tutorial of Radev in SIGIR'04

[Kupiec et al., 95] (2): 

[Kupiec et al., 95] (2)
Uses a Bayesian classifier:
P(s ∈ S | F1, ..., Fk) = P(F1, ..., Fk | s ∈ S) · P(s ∈ S) / P(F1, ..., Fk)
Assuming statistical independence of the features:
P(s ∈ S | F1, ..., Fk) ≈ [∏j P(Fj | s ∈ S)] · P(s ∈ S) / ∏j P(Fj)
(S: the set of sentences in the summary; Fj: the j-th feature of sentence s)
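A minimal sketch of such a classifier over binary features, with add-one smoothing; the training pairs are hypothetical:

```python
# data: list of (feature vector of 0/1 ints, in-summary label).
def train(data):
    n, k = len(data), len(data[0][0])
    pos = [f for f, y in data if y]
    p_s = len(pos) / n
    # add-one smoothing avoids zero probabilities
    p_f_given_s = [(sum(f[j] for f in pos) + 1) / (len(pos) + 2) for j in range(k)]
    p_f = [(sum(f[j] for f, _ in data) + 1) / (n + 2) for j in range(k)]
    return p_s, p_f_given_s, p_f

def score(features, model):
    p_s, p_f_given_s, p_f = model
    s = p_s
    for j, fj in enumerate(features):
        num = p_f_given_s[j] if fj else 1 - p_f_given_s[j]
        den = p_f[j] if fj else 1 - p_f[j]
        s *= num / den
    return s  # proportional to P(s in S | F1..Fk)

data = [((1, 1), True), ((1, 0), True), ((0, 0), False), ((0, 1), False)]
model = train(data)
print(score((1, 1), model) > score((0, 0), model))  # True
```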

[Kupiec et al., 95] (3): 

[Kupiec et al., 95] (3)
Performance:
- For 25% summaries: 84% precision
- For smaller summaries: 74% improvement over the Lead method

Problems of the sentence extraction methods: 

Problems of the sentence extraction methods
1. Performance might degrade if the text consists of multiple topics.
2. Anaphors in the extracted sentences might not have an antecedent in the summary.
3. The summary might be incoherent, since the sentences are extracted from various parts of the text.

Some solutions: 

Some solutions
- Segment the text, and extract sentences from each segment.
- Add to the summary the sentences that precede a sentence containing an anaphor.
- Remove unnecessary conjunctives and adverbs.
- Use sentence extraction that takes into account the relationships between sentences, or the discourse structure of the text.

New topics in text summarization: 

New topics in text summarization
- Abstraction vs. extraction: try to re-phrase content
- Query-biased vs. generic summarization: try to reflect the user's interest
- Multi-document vs. single-document summarization: try to fuse together the content of many texts
- Sentence simplification vs. sentence extraction: try to shorten the text by removing unimportant parts of the sentences

Summarization by Abstraction, Paraphrase: 

Summarization by Abstraction, Paraphrase
An abstract represents the content of the original text by generalization or paraphrasing.
Abstract generation = extraction (of the important concepts) + concept fusion + generation

John bought some vegetables, fruit, bread, and milk. → John bought some groceries. : 

Concept fusion with a conceptual hierarchy [Hovy, 97]:
John bought some vegetables, fruit, bread, and milk. → John bought some groceries.

Concept fusion with script knowledge [Kondo,97]: 

Concept fusion with script knowledge [Kondo, 97]
Script knowledge can be derived from the definition sentences in a dictionary:
- 説得する (persuade): よく話して (say) 納得させる (convince)
- 納得する (be convinced): 物事を理解して (understand) 承認する (agree)
- 承認する (agree): 相手の言い分を聞き入れる (accept)

Example:

私は彼女に事情を話した。 (I told her the situation.)
彼女は私の言うことを理解し、聞き入れてくれた。 (She understood me and accepted what I said.)
→ 私は彼女を説得した。 (I persuaded her.)

Query-biased vs. Generic: 

Query-biased vs. Generic
Summaries can be constructed solely from the content of the original text (static summaries). → Summaries should instead be dynamic, reflecting the user's interest.

Query-biased summaries for IR: 

Query-biased summaries for IR
When summaries are used for judging the relevance of texts in IR, they should reflect the query that the user entered: more importance is given to the sentences that contain words from the query. [Tombros, 98]

Multi-document summarization: 

Multi-document summarization
- Extract the important parts of each text.
- Identify the duplicated and the differing parts across texts.

Maximal Marginal Relevance(1/2): 

Maximal Marginal Relevance (1/2)
- For both single-document and multi-document summarization
- Can also be used for IR
- Reduces redundancy in single-document summarization

Maximal Marginal Relevance(2/2): 

Maximal Marginal Relevance (2/2)
Takes into account both relevance to the query (importance of the sentence) and difference from the sentences already selected:
MMR = argmax_{Di ∈ R\S} [ λ · Sim1(Di, Q) − (1 − λ) · max_{Dj ∈ S} Sim2(Di, Dj) ]
Q: query; R: set of sentences ranked beforehand; S: subset of R already selected
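A minimal sketch of greedy MMR selection implementing the formula above, with a simple word-overlap similarity standing in for Sim1 and Sim2:

```python
# Jaccard word overlap as a crude stand-in for Sim1 and Sim2.
def sim(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def mmr_select(query, ranked, k=2, lam=0.7):
    selected, remaining = [], list(ranked)
    while remaining and len(selected) < k:
        def mmr(s):
            # relevance to the query minus redundancy with already-picked sentences
            redundancy = max((sim(s, t) for t in selected), default=0.0)
            return lam * sim(s, query) - (1 - lam) * redundancy
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected
```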

Slide58: 

Multi-article Summarization Systems on the WWW:
- Newsblaster (http://www.cs.columbia.edu/nlp/newsblaster)
- NewsInEssence (http://www.newsinessence.com)

Newsblaster [McKeown et al., 02]:

Newsblaster [McKeown et al., 02] From the tutorial of Radev in SIGIR'04

Sentence simplification as summarization: 

Sentence simplification as summarization
When the unit of extraction is a sentence, much information might be lost by removing a whole unit. → Instead, remove the unimportant fragments within each sentence and shorten the text.

Generation of captions in broadcasting: 

Generation of captions in broadcasting Apply the transformation rules to each sentence repeatedly, and transform the sentence into a shorter one.

Slide62: 

- Remove the verbal part of the 'sahen' verb at the end of the sentence (「7月中に解散します」→「7月中に解散へ」; roughly 'will be dissolved within July' → 'toward dissolution within July')
- Remove the politeness markers at the end of the sentence (「余震が相次ぎました」→「余震が相次いだ」; 'aftershocks occurred one after another', polite past → plain past)
- ... [Wakao, 97], [Kato, 98]
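A minimal sketch of the rule application loop: rewrite rules are applied to a sentence repeatedly until none fires. The rules below are hypothetical English stand-ins for the Japanese caption rules above:

```python
import re

# (pattern, replacement) rewrite rules; each shortens the sentence.
RULES = [
    (re.compile(r"\bfor example,?\s*", re.I), ""),        # drop illustrative cue
    (re.compile(r"\bit is reported that\s+", re.I), ""),  # drop hedging frame
]

def compress(sentence):
    changed = True
    while changed:  # keep applying rules until a fixed point is reached
        changed = False
        for pattern, repl in RULES:
            new = pattern.sub(repl, sentence)
            if new != sentence:
                sentence, changed = new, True
    return sentence.strip()

print(compress("For example, aftershocks continued in the region."))
# "aftershocks continued in the region."  (capitalization repair omitted)
```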

Sentence simplification for text skimming: 

Sentence simplification for text skimming
Parse a sentence and extract its skeletal structure from the parse tree. Relative clauses, embedded sentences, and adjuncts are considered unimportant and can be removed. [Grefenstette, 98]
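A minimal sketch of this idea using spaCy (an assumption; [Grefenstette, 98] used a different parser), where the set of prunable dependency relations is illustrative:

```python
import spacy

# Dependency relations treated as prunable clauses/adjuncts (illustrative set).
PRUNE = {"relcl", "acl", "appos", "advcl", "advmod"}

def skim(text, nlp):
    doc = nlp(text)
    dropped = set()
    for token in doc:
        if token.dep_ in PRUNE:
            dropped.update(t.i for t in token.subtree)  # drop the whole subtree
    return "".join(t.text_with_ws for t in doc if t.i not in dropped).strip()

nlp = spacy.load("en_core_web_sm")
print(skim("The mayor, who opposed the plan, quickly rejected it.", nlp))
# roughly: "The mayor, rejected it." (punctuation cleanup omitted)
```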

Slide64: 

From the tutorial of Maybury & Mani in ACL’01 [Grefenstette, 98]

Summarization using LM: 

Summarization using LM
Source 'language': full document. Target 'language': summary.
From the tutorial of Radev in SIGIR'04

Language modeling(1/2): 

Language modeling (1/2)
Source/target language, coding process:
[Diagram: the noisy-channel model; a source string e passes through a noisy channel to produce f, and recovery finds the most likely e*.]

Language modeling(2/2): 

Language modeling (2/2)
Recovery picks the most likely source string: e* = argmax_e P(e | f) = argmax_e P(e) · P(f | e)
With a bigram model: P(e) = ∏i P(wi | wi−1)
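A minimal sketch of an add-one-smoothed bigram model for P(e); real systems use far larger corpora and better smoothing:

```python
from collections import Counter

def train_bigram(corpus):
    histories, bigrams = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent.lower().split()
        histories.update(toks[:-1])              # counts of bigram histories
        bigrams.update(zip(toks, toks[1:]))
    vocab = len(set(w for s in corpus for w in s.lower().split())) + 1
    def p(w, prev):
        # add-one smoothed P(w | prev)
        return (bigrams[(prev, w)] + 1) / (histories[prev] + vocab)
    return p

def sentence_prob(sentence, p):
    toks = ["<s>"] + sentence.lower().split()
    prob = 1.0
    for prev, w in zip(toks, toks[1:]):
        prob *= p(w, prev)
    return prob

corpus = ["police killed the gunmen", "the gunmen fled"]
p = train_bigram(corpus)
print(sentence_prob("the gunmen fled", p) > sentence_prob("fled gunmen the", p))  # True
```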

[Knight & Marcu, 00]: 

[Knight & Marcu, 00]
Uses structured (syntactic) information. Two approaches: noisy-channel and decision-based.

[Berger & Mittal, 00]: 

[Berger & Mittal, 00]
Gisting (OCELOT): content selection (preserve frequencies); word ordering (single words, consecutive positions).

Framework of Text Summarization System: 

Framework of Text Summarization System
1. Sentence Extraction
2. Duplicate Reduction
3. Sentence Simplification
4. Revision and/or Generation
Step 2 is optional in single-document summarization; step 4 has not been fully studied.

Evaluation Types [Sparck Jones & Galliers, 96] : 

Evaluation Types [Sparck Jones & Galliers, 96]
- Intrinsic measures (glass-box): how good is the summary as a summary? Compare against an ideal output or the source text; criteria: quality, informativeness, etc.
- Extrinsic measures (black-box): how well does the summary help a user with a task? Measure time to perform the task and task accuracy. Examples: reading comprehension tests [Morris et al., 92], IR, text categorization [SUMMAC, 98]

Quality Evaluation: 

Quality Evaluation
Subjective grading: the 12 Quality Questions of the DUC series. Criteria: grammar, coherence, ...

Examples of Quality Questions: 

Examples of Quality Questions
- About how many of the sentences are missing important components (e.g. the subject, main verb, direct object, modifier), causing the sentence to be ungrammatical, unclear, or misleading?
- About how many pronouns are there whose antecedents are incorrect, unclear, missing, or come only later?
- About how many dangling conjunctions are there ("and", "however", ...)?

Informativeness Evaluation: 

Informativeness Evaluation
Comparison against an ideal output (reference summary):
- Precision, Recall, F-measure for sentence extraction
- ROUGE for any summaries

Precision and Recall(1/2): 

Precision and Recall (1/2)
For sentence extraction, let E be the set of extracted sentences and R the set of sentences in the reference extract:
Precision = |E ∩ R| / |E|
Recall = |E ∩ R| / |R|

Precision and Recall(2/2): 

Precision and Recall (2/2)
The F-measure combines the two: F = 2 · Precision · Recall / (Precision + Recall)

ROUGE: 

ROUGE
ROUGE-N: n-gram co-occurrence statistics between the system output and a reference summary, computed as the fraction of reference n-grams that also appear in the system output.
Reference: Police killed the gunmen.
System1: Police kill the gunmen.
System2: The gunmen kill police.
Both systems match three of the four reference unigrams ('police', 'the', 'gunmen'), so ROUGE-1 scores them identically (0.75), even though System2 reverses the meaning.
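A minimal sketch of ROUGE-N against a single reference (recall over reference n-grams), enough to reproduce the example above:

```python
import re
from collections import Counter

def ngrams(text, n):
    toks = re.findall(r"[a-z]+", text.lower())
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def rouge_n(system, reference, n=1):
    ref, sys = ngrams(reference, n), ngrams(system, n)
    # clipped count of reference n-grams also found in the system output
    matched = sum(min(c, sys[g]) for g, c in ref.items())
    return matched / sum(ref.values()) if ref else 0.0

ref = "Police killed the gunmen."
print(rouge_n("Police kill the gunmen.", ref))  # 0.75
print(rouge_n("The gunmen kill police.", ref))  # 0.75, despite reversed meaning
```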

Extrinsic test: Text Classification: 

Extrinsic test: Text Classification
Can you perform some task faster? Example: text classification. Measures: time and effectiveness.
TIPSTER/SUMMAC evaluation: February 1998 [SUMMAC, 98]. Two tests:
1. Categorization
2. Ad hoc (query-sensitive)
Two summaries per system: fixed-length (10%) and best. 16 systems (universities, companies; 3 international).
From the tutorial of Hovy and Marcu in COLING/ACL'98

SUMMAC Ad Hoc (Query-Based) Test: 

SUMMAC Ad Hoc (Query-Based) Test
Procedure [SUMMAC, 98]:
1. 1000 newspaper articles from each of 5 categories.
2. Systems summarize each text (query-based summary).
3. Humans decide whether each summary is relevant to the query.
4. Testers measure Recall and Precision: how relevant are the summaries to their queries? (many other measures as well)
Results: 3 levels of performance.
From the tutorial of Hovy and Marcu in COLING/ACL'98

Reading Comprehension Tests: 

Reading Comprehension Tests
Humans read either the full text or a summary, and then answer a test. The percentage of correct answers indicates the usefulness of the summary: if reading a summary lets a human answer the questions as accurately as reading the source would, the summary is highly informative.

Resources (1): 

Resources (1)
Books:
- Inderjeet Mani. Automatic Summarization, John Benjamins Publishing Company, 2001.
- Inderjeet Mani and Mark T. Maybury (eds.). Advances in Automatic Text Summarization, MIT Press, 1999.
Online bibliographies:
- http://www.cs.columbia.edu/~radev/summarization/
- http://www.cs.columbia.edu/~jing/summarization.html
- http://www.dcs.shef.ac.uk/~gael/alphalist.html

Resources (2): 

Resources (2)
Online tutorial slides:
- Dragomir R. Radev. Text Summarization Tutorial, ACM SIGIR, 2004. http://www.summarization.com/sigirtutorial2004.ppt
- Eduard Hovy and Daniel Marcu. Automatic Text Summarization Tutorial, COLING/ACL, 1998. http://www.isi.edu/~marcu/acl-tutorial.ppt
- Horacio Saggion. Automatic text summarization: past, present, and future, IBERAMIA, 2004. http://www.dcs.shef.ac.uk/~saggion/saggion04.PDF

Resources (3):

Resources (3)
Multi-document summarization system software:
- http://www.summarization.com/mead/
- http://www.clsp.jhu.edu/ws2001/groups/asmd/
