Phrase-based Model
Revision as of 14:36, 7 April 2015
Lecture video:
{{#ev:youtube|https://www.youtube.com/watch?v=aA4jFayPNeQ|800|center}}
Phrase-based machine translation (PBMT) is probably the most widely used approach to MT today. It is relatively simple and easy to adapt to new languages.
Phrase Extraction
PBMT uses phrases as the basic unit of translation. Phrases are simply sequences of words that have been observed in the training data; they do not correspond to any linguistic notion of phrases.
In order to obtain a phrase table (a probabilistic dictionary of phrases), we need word-aligned parallel data. A heuristic is used to
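As an illustration of how a phrase table could be built from word-aligned data, here is a minimal Python sketch. It assumes one commonly used heuristic: extract every phrase pair that is consistent with the word alignment, meaning that no word inside the pair is aligned to a word outside it. This choice is an assumption for the sake of the example, not necessarily the heuristic intended here. The function names `extract_phrases` and `phrase_table`, the `max_len` limit, and the toy sentence pair are all hypothetical. The sketch skips the usual extension over unaligned target words and scores pairs by plain relative frequency.

```python
from collections import Counter


def extract_phrases(src, tgt, alignment, max_len=4):
    """Return phrase pairs consistent with the word alignment.

    src, tgt: lists of tokens.
    alignment: set of (i, j) pairs linking src[i] to tgt[j].
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            tgt_pos = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt_pos:
                continue  # a span with no alignment links yields no pair
            j1, j2 = min(tgt_pos), max(tgt_pos)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency check: no target word in [j1, j2] may be
            # aligned to a source word outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2)
                   for (i, j) in alignment):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


def phrase_table(corpus, max_len=4):
    """Estimate P(target phrase | source phrase) by relative frequency.

    corpus: iterable of (src_tokens, tgt_tokens, alignment) triples.
    """
    pair_counts = Counter()
    src_counts = Counter()
    for src, tgt, alignment in corpus:
        for s, t in extract_phrases(src, tgt, alignment, max_len):
            pair_counts[(s, t)] += 1
            src_counts[s] += 1
    return {(s, t): c / src_counts[s] for (s, t), c in pair_counts.items()}


# Toy example with a reordering: "b" aligns to "z", "c" aligns to "y".
src = ["a", "b", "c"]
tgt = ["x", "y", "z"]
alignment = {(0, 0), (1, 2), (2, 1)}

pairs = extract_phrases(src, tgt, alignment)
table = phrase_table([(src, tgt, alignment)])
```

In this toy example the pair ("b c", "y z") is extracted, while the source span "a b" yields nothing: its aligned target words span "x y z", but "y" is aligned to "c", which lies outside the span.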
See Also
- [www.statmt.org/book/slides/05-phrase-based-models.pdf Philipp Koehn's slides on PBMT]