phrase EM NAACL presentation

Presentation Transcript

Why Generative Models Underperform Surface Heuristics: 

Why Generative Models Underperform Surface Heuristics
John DeNero, Dan Gillick, James Zhang, and Dan Klein
UC Berkeley Natural Language Processing

Overview: Learning Phrases: 

Overview: Learning Phrases Sentence-aligned corpus Phrase-level generative model

Outline: 

Outline
I) Generative phrase-based alignment: motivation; model structure and training; performance results
II) Error analysis: properties of the learned phrase table; contributions to increased error rate
III) Proposed improvements

Motivation for Learning Phrases: 

Motivation for Learning Phrases J ’ ai un chat . I have a spade .

Motivation for Learning Phrases: 

Motivation for Learning Phrases … appelle un chat un chat …

A Phrase Alignment Model Compatible with Pharaoh: 

A Phrase Alignment Model Compatible with Pharaoh les chats aiment le poisson frais .

Training Regimen That Respects Word Alignment: 

Training Regimen That Respects Word Alignment
les chats aiment le poisson frais .
cats like fresh fish .

Performance Results: 

Performance Results

Outline: 

Outline
I) Generative phrase-based alignment: model structure and training; performance results
II) Error analysis: properties of the learned phrase table; contributions to increased error rate
III) Proposed improvements

Example: Maximizing Likelihood with Competing Segmentations: 

Example: Maximizing Likelihood with Competing Segmentations
Training Corpus
French: carte sur la table / English: map on the table
French: carte sur la table / English: notice on the chart
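
The slide's point can be illustrated with a small calculation. The sketch below is an assumption-laden toy, not the authors' model: it fixes one segmentation per sentence pair, ignores any segmentation or distortion terms, and the two phrase tables (`shared_table`, `split_table`) are hypothetical values chosen for illustration. It compares the corpus likelihood when both sentence pairs share the segmentation "carte | sur la table" against the likelihood when the second pair is segmented as "carte sur | la table".

```python
from math import prod


def sentence_likelihood(phrase_table, aligned_phrases):
    """Product of phi(e | f) over the phrase pairs of one fixed segmentation."""
    return prod(phrase_table[f][e] for f, e in aligned_phrases)


def corpus_likelihood(phrase_table, corpus):
    """Product of the sentence likelihoods over the whole toy corpus."""
    return prod(sentence_likelihood(phrase_table, pairs) for pairs in corpus)


# Hypothesis A (assumed): both pairs use the segmentation "carte | sur la table",
# so each French phrase must split its probability over two translations.
shared_table = {
    "carte": {"map": 0.5, "notice": 0.5},
    "sur la table": {"on the table": 0.5, "on the chart": 0.5},
}
shared_corpus = [
    [("carte", "map"), ("sur la table", "on the table")],
    [("carte", "notice"), ("sur la table", "on the chart")],
]

# Hypothesis B (assumed): the second pair uses "carte sur | la table",
# so every French phrase gets a single, deterministic translation.
split_table = {
    "carte": {"map": 1.0},
    "sur la table": {"on the table": 1.0},
    "carte sur": {"notice on": 1.0},
    "la table": {"the chart": 1.0},
}
split_corpus = [
    [("carte", "map"), ("sur la table", "on the table")],
    [("carte sur", "notice on"), ("la table", "the chart")],
]

shared_likelihood = corpus_likelihood(shared_table, shared_corpus)
split_likelihood = corpus_likelihood(split_table, split_corpus)
print(shared_likelihood, split_likelihood)
```

Under these assumptions the competing segmentations give the higher likelihood, which is the kind of solution a maximum-likelihood trainer prefers even though "carte" is genuinely ambiguous in the corpus.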

EM Training Significantly Decreases Entropy of the Phrase Table: 

EM Training Significantly Decreases Entropy of the Phrase Table French phrase entropy: 10% of French phrases have deterministic distributions
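
As a rough illustration of the quantity on this slide, the sketch below computes the entropy of each French phrase's translation distribution and the fraction of phrases whose distribution is deterministic (entropy zero). The table layout (a dict from French phrase to a dict of translation probabilities), the function names, and the toy values are all assumptions made for the example; the slides do not show how the figure was computed.

```python
from math import log2


def translation_entropy(distribution):
    """Entropy in bits of one phrase's translation distribution."""
    return sum(-p * log2(p) for p in distribution.values() if p > 0)


def deterministic_fraction(phrase_table, tolerance=1e-12):
    """Fraction of source phrases whose translation entropy is (numerically) zero."""
    zero_entropy = sum(
        1 for dist in phrase_table.values()
        if abs(translation_entropy(dist)) <= tolerance
    )
    return zero_entropy / len(phrase_table)


# Hypothetical toy table: one ambiguous phrase, one deterministic phrase.
toy_table = {
    "carte": {"map": 0.5, "notice": 0.5},
    "chat": {"cat": 1.0},
}

carte_entropy = translation_entropy(toy_table["carte"])
chat_entropy = translation_entropy(toy_table["chat"])
toy_fraction = deterministic_fraction(toy_table)
print(carte_entropy, chat_entropy, toy_fraction)
```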

Effect 1: Useful Phrase Pairs Are Lost Due to Critically Small Probabilities: 

Effect 1: Useful Phrase Pairs Are Lost Due to Critically Small Probabilities In 10k translated sentences, no phrases with weight less than 10^-5 were used by the decoder.

Effect 2: Determinized Phrases Override Better Candidates During Decoding: 

Effect 2: Determinized Phrases Override Better Candidates During Decoding
Heuristic
the situation varies to an enormous degree
the situation varie d ' une immense degré
Learned
the situation varies to an enormous degree
the situation varie d ' une immense caractérise

Effect 3: Ambiguous Foreign Phrases Become Active During Decoding: 

Effect 3: Ambiguous Foreign Phrases Become Active During Decoding Translations for the French apostrophe

Outline: 

Outline
I) Generative phrase-based alignment: model structure and training; performance results
II) Error analysis: properties of the learned phrase table; contributions to increased error rate
III) Proposed improvements

Motivation for Reintroducing Entropy to the Phrase Table: 

Motivation for Reintroducing Entropy to the Phrase Table Useful phrase pairs are lost due to critically small probabilities. Determinized phrases override better candidates. Ambiguous foreign phrases become active during decoding.

Reintroducing Lost Phrases: 

Reintroducing Lost Phrases Interpolation yields up to 1.0 BLEU improvement
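
The slide does not say exactly what is interpolated. One plausible reading, assumed here, is a linear mixture of the learned phrase table with the heuristic one, so that phrase pairs the learned table has driven to zero regain some probability. The function name `interpolate`, the `weight` parameter, and the toy tables are hypothetical.

```python
def interpolate(learned, heuristic, weight):
    """Mix two phrase tables: weight * learned + (1 - weight) * heuristic.

    Each table maps a French phrase to a dict of translation probabilities.
    Phrases present in both tables keep a normalized distribution; a phrase
    present in only one table would need renormalizing, which this sketch
    does not do.
    """
    mixed = {}
    for french in set(learned) | set(heuristic):
        learned_dist = learned.get(french, {})
        heuristic_dist = heuristic.get(french, {})
        mixed[french] = {
            english: weight * learned_dist.get(english, 0.0)
            + (1.0 - weight) * heuristic_dist.get(english, 0.0)
            for english in set(learned_dist) | set(heuristic_dist)
        }
    return mixed


# Hypothetical toy tables: the learned table has determinized "carte".
learned_table = {"carte": {"map": 1.0}}
heuristic_table = {"carte": {"map": 0.5, "notice": 0.5}}

mixed_table = interpolate(learned_table, heuristic_table, weight=0.5)
print(mixed_table["carte"])
```

With an equal weighting, the lost translation "notice" returns with nonzero probability while "map" remains the preferred candidate.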

Smoothing Phrase Probabilities: 

Smoothing Phrase Probabilities

Conclusion: 

Conclusion Generative phrase models determinize the phrase table via the latent segmentation variable. A determinized phrase table introduces errors at decoding time. Modest improvement can be realized by reintroducing phrase table entropy.

Questions?: 

Questions?