Phrase EM NAACL Presentation

Added: September 25, 2007
Presentation Transcript

Why Generative Models Underperform Surface Heuristics
UC Berkeley Natural Language Processing
John DeNero, Dan Gillick, James Zhang, and Dan Klein


Overview: Learning Phrases: Sentence-aligned corpus; phrase-level generative model


Outline:
I) Generative phrase-based alignment
   Motivation
   Model structure and training
   Performance results
II) Error analysis
   Properties of the learned phrase table
   Contributions to increased error rate
III) Proposed improvements


Motivation for Learning Phrases:
French: J ’ ai un chat .
English: I have a spade .


Motivation for Learning Phrases: … appelle un chat un chat …


A Phrase Alignment Model Compatible with Pharaoh: les chats aiment le poisson frais .


Training Regimen That Respects Word Alignment:
French: les chats aiment le poisson frais .
English: cats like fresh fish .


Performance Results


Outline:
I) Generative phrase-based alignment
   Model structure and training
   Performance results
II) Error analysis
   Properties of the learned phrase table
   Contributions to increased error rate
III) Proposed improvements


Example: Maximizing Likelihood with Competing Segmentations
Training corpus:
French: carte sur la table
English: map on the table
French: carte sur la table
English: notice on the chart
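
The effect of competing segmentations on this two-sentence corpus can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes monotone phrase alignment, ignores segmentation and distortion probabilities, and conditions English phrases on French phrases. Both phrase tables are hypothetical. The second shows one way that choosing a different segmentation for each sentence pair lets every French phrase translate deterministically.

```python
from itertools import combinations

def segmentations(tokens, k):
    """Yield every split of tokens into k contiguous, non-empty phrases."""
    n = len(tokens)
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        yield [" ".join(tokens[a:b]) for a, b in zip(bounds, bounds[1:])]

def sentence_likelihood(french, english, table):
    """Sum, over monotone phrase segmentations, of the product of
    phrase translation probabilities table[french_phrase][english_phrase].
    Assumption: segmentation and distortion terms are ignored."""
    f, e = french.split(), english.split()
    total = 0.0
    for k in range(1, min(len(f), len(e)) + 1):
        for f_phrases in segmentations(f, k):
            for e_phrases in segmentations(e, k):
                p = 1.0
                for fp, ep in zip(f_phrases, e_phrases):
                    p *= table.get(fp, {}).get(ep, 0.0)
                total += p
    return total

def corpus_likelihood(corpus, table):
    """Product of sentence likelihoods over the corpus."""
    result = 1.0
    for french, english in corpus:
        result *= sentence_likelihood(french, english, table)
    return result

CORPUS = [
    ("carte sur la table", "map on the table"),
    ("carte sur la table", "notice on the chart"),
]

# Hypothetical word-level table that keeps the ambiguity of "carte" and "table".
WORD_TABLE = {
    "carte": {"map": 0.5, "notice": 0.5},
    "sur": {"on": 1.0},
    "la": {"the": 1.0},
    "table": {"table": 0.5, "chart": 0.5},
}

# Hypothetical table in which each sentence pair uses a different segmentation,
# so every French phrase has exactly one translation.
DETERMINIZED_TABLE = {
    "carte": {"map": 1.0},
    "sur la table": {"on the table": 1.0},
    "carte sur": {"notice on": 1.0},
    "la table": {"the chart": 1.0},
}

print(corpus_likelihood(CORPUS, WORD_TABLE))
print(corpus_likelihood(CORPUS, DETERMINIZED_TABLE))
```

Under these assumptions the word-level table scores 0.0625 on the corpus and the determinized table scores 1.0, so a maximum-likelihood learner prefers the determinized table even though it has discarded the real ambiguity of "carte" and "table".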


EM Training Significantly Decreases Entropy of the Phrase Table: French phrase entropy: 10% of French phrases have deterministic distributions
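
As a rough illustration of how such a statistic could be measured, the sketch below computes the Shannon entropy of each French phrase's translation distribution and the share of phrases whose distribution is deterministic. The table layout (French phrase mapped to a dictionary of English phrases and probabilities), the tolerance `tol`, and the toy entries are all assumptions made for the example, not data from the talk.

```python
import math

def entropy(dist):
    """Shannon entropy, in bits, of a {translation: probability} distribution."""
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)

def fraction_deterministic(table, tol=1e-9):
    """Share of source phrases whose translation distribution has
    (numerically) zero entropy, i.e. a single translation with probability 1."""
    if not table:
        return 0.0
    deterministic = sum(1 for dist in table.values() if entropy(dist) <= tol)
    return deterministic / len(table)

# Hypothetical toy table: French phrase -> {English phrase: probability}.
toy_table = {
    "carte": {"map": 0.5, "notice": 0.5},
    "sur": {"on": 1.0},
    "la table": {"the table": 1.0},
    "table": {"table": 0.25, "chart": 0.75},
}

print(entropy(toy_table["carte"]))
print(fraction_deterministic(toy_table))
```

In this toy table two of the four French phrases are deterministic, so the function returns 0.5.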


Effect 1: Useful Phrase Pairs Are Lost Due to Critically Small Probabilities: In 10k translated sentences, no phrases with weight less than 10^-5 were used by the decoder.
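
A small sketch of how one might list the phrase pairs that fall under this cutoff follows. Only the 10^-5 threshold comes from the slide; the function name, the table layout, and the probabilities in the toy table are invented for illustration.

```python
def pairs_below_threshold(table, threshold=1e-5):
    """Return (source, target, probability) triples whose probability is
    below the threshold, i.e. pairs the decoder would effectively never use."""
    return [
        (src, tgt, p)
        for src, dist in table.items()
        for tgt, p in dist.items()
        if p < threshold
    ]

# Hypothetical learned table: almost all mass sits on one translation,
# so the alternative is pushed below the usable range.
learned_table = {
    "degree": {"caractérise": 0.999995, "degré": 0.000005},
    "the": {"le": 0.6, "la": 0.4},
}

for src, tgt, p in pairs_below_threshold(learned_table):
    print(src, tgt, p)
```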


Effect 2: Determinized Phrases Override Better Candidates During Decoding
Source: the situation varies to an enormous degree
Heuristic: the situation varie d ' une immense degré
Learned: the situation varie d ' une immense caractérise


Effect 3: Ambiguous Foreign Phrases Become Active During Decoding: Translations for the French apostrophe


Outline:
I) Generative phrase-based alignment
   Model structure and training
   Performance results
II) Error analysis
   Properties of the learned phrase table
   Contributions to increased error rate
III) Proposed improvements


Motivation for Reintroducing Entropy to the Phrase Table:
Useful phrase pairs are lost due to critically small probabilities.
Determinized phrases override better candidates.
Ambiguous foreign phrases become active during decoding.


Reintroducing Lost Phrases: Interpolation yields up to 1.0 BLEU improvement
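
The transcript does not preserve the interpolation formula, so the sketch below assumes plain linear interpolation between a learned table and a heuristic table, with a hypothetical weight `lam`. The function name, the table layout, and the toy probabilities are likewise assumptions.

```python
def interpolate(learned, heuristic, lam=0.5):
    """Linearly mix two phrase tables: lam * learned + (1 - lam) * heuristic.

    Each table maps a source phrase to {target phrase: probability}.
    Note: a source phrase present in only one table gets a mixture that
    sums to lam or (1 - lam) rather than 1, and would need renormalizing."""
    mixed = {}
    for src in set(learned) | set(heuristic):
        l_dist = learned.get(src, {})
        h_dist = heuristic.get(src, {})
        mixed[src] = {
            tgt: lam * l_dist.get(tgt, 0.0) + (1 - lam) * h_dist.get(tgt, 0.0)
            for tgt in set(l_dist) | set(h_dist)
        }
    return mixed

# Hypothetical tables: the learned table is deterministic, while the
# heuristic table still gives weight to the alternative translation.
learned = {"degree": {"caractérise": 1.0}}
heuristic = {"degree": {"degré": 0.6, "caractérise": 0.4}}

mixed = interpolate(learned, heuristic, lam=0.5)
print(mixed["degree"])
```

With `lam=0.5` the translation "degré", which has zero probability in the learned table, regains probability 0.3 in the mixture.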


Smoothing Phrase Probabilities


Conclusion:
Generative phrase models determinize the phrase table via the latent segmentation variable.
A determinized phrase table introduces errors at decoding time.
Modest improvement can be realized by reintroducing phrase table entropy.


Questions?