Scoring and Optimization

From MT Talks
Revision as of 14:59, 24 August 2015 by Tamchyna (talk | contribs)
Jump to navigation Jump to search
Lecture 13: Scoring and Optimization
Lecture video: web TODO
Youtube

{{#ev:youtube|https://www.youtube.com/watch?v=rDkZOINdPhw&index=11&list=PLpiLOsNLsfmbeH-b865BwfH15W0sat02V%7C800%7Ccenter}}

Features of MT Models

Phrase Translation Probabilities

Phrase translation probabilities are calculated from occurrences of phrase pairs extracted from the parallel training data. Usually, MT systems work with the following two conditional probabilities:

These probabilities are estimated by simply counting how many times (for the first formula) we saw aligned to and how many times we saw in total. For example:

Lexical Weights

Lexical weights are a method for smoothing the phrase table. Infrequent phrases have unreliable probability estimates; for instance many long phrases occur together only once in the corpus, resulting in . Several methods exist for computing lexical weights. The most common one is based on word alignment inside the phrase. The probability of each foreign word is estimated as the average of lexical translation probabilities over the English words aligned to it. Thus for the phrase with the set of alignment points , the lexical weight is:

Language Model

Word and Phrase Penalty

Distortion Penalty

Decoding

Phrase-Based Search

Decoding in SCFG

Optimization of Feature Weights