Rich Vocabulary: Difference between revisions

From MT Talks
Line 13:
While German has some degree of inflection, it is the Germans' fondness for complex word compounds that causes the large vocabulary problem for MT. Consider the following compound:

- [[File:rindfleish-prezi.png|300px]]
+ [[File:rindfleish-prezi.png|500px]]

=== Finnish -- agglutination ===

- [[File:finnish-prezi.png|300px]]
+ [[File:finnish-prezi.png|500px]]

=== Czech -- fusional inflection ===

- [[File:czech-inflection-prezi.png|300px]]
+ [[File:czech-inflection-prezi.png|500px]]

== Large Vocabulary Sizes in MT Pipeline ==

Revision as of 13:23, 12 August 2015

Lecture 12: Rich Vocabulary
Lecture video: web TODO
Youtube

{{#ev:youtube|https://www.youtube.com/watch?v=eSIbNT-yjdg|800|center}}

Examples of Languages with a Rich Vocabulary

German -- compounding

While German has some degree of inflection, it is the Germans' fondness for complex word compounds that causes the large vocabulary problem for MT. Consider the following compound:

[[File:rindfleish-prezi.png|500px]]

Finnish -- agglutination

[[File:finnish-prezi.png|500px]]

Czech -- fusional inflection

[[File:czech-inflection-prezi.png|500px]]

Large Vocabulary Sizes in MT Pipeline

Word Alignment

Phrase Extraction

Decoding

Evaluation

Possible Solutions