Deep Syntax

From MT Talks

Revision as of 13:41, 7 October 2015

Lecture 14: Deep Syntax
Lecture video: web TODO
Youtube

{{#ev:youtube|https://www.youtube.com/watch?v=lJwCW2mFk2M&index=11&list=PLpiLOsNLsfmbeH-b865BwfH15W0sat02V|800|center}}

Functional Generative Description

The Functional Generative Description (FGD) is a linguistic theory developed by Petr Sgall in Prague in the 1960s. It describes language formally as a system of layers, ranging from the most basic (phonology) to the most abstract (deep syntax and semantics, the tectogrammatical layer). The theory was designed so that language could be captured computationally, and much of it has indeed been implemented as computer programs. The system of layers was gradually simplified, however, and only four layers are used today; here we follow the annotation scheme of the Prague Dependency Treebank. An example of the layered description is shown in the following image (taken from the PDT 2.0 documentation):

The lowest layer (the w-layer) contains the sentence "as is", without any annotation. The m-layer provides a morphological analysis of each word and also corrects typing errors. The a-layer is a dependency tree that describes the surface syntax of the sentence. Finally, the t-layer is a more abstract dependency tree that describes its deep syntax.
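To make the four layers concrete, the following is a minimal Python sketch of how one sentence might be represented at each of them. Everything in it is an illustrative assumption: the English sentence is hand-made (not taken from the PDT), the class names `MNode` and `TreeNode` are hypothetical, the morphological tags are simplified placeholders, and the dependency labels only imitate the style of PDT labels. It is not the PDT data format.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class MNode:
    """Hypothetical m-layer entry: corrected form, lemma, simplified tag."""
    form: str
    lemma: str
    tag: str


@dataclass
class TreeNode:
    """Hypothetical dependency-tree node shared by the a-layer and t-layer."""
    label: str      # word form (a-layer) or lemma (t-layer)
    relation: str   # dependency label, in the style of PDT labels
    children: list[TreeNode] = field(default_factory=list)


def size(node: TreeNode) -> int:
    """Count the nodes in the subtree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


# Lowest (word) layer: the sentence "as is", including the typo "Dgos".
w_layer = ["Dgos", "have", "barked", "."]

# m-layer: one analysis per token; the typo is corrected here.
m_layer = [
    MNode("Dogs", "dog", "NOUN"),
    MNode("have", "have", "AUX"),
    MNode("barked", "bark", "VERB"),
    MNode(".", ".", "PUNCT"),
]

# a-layer: surface dependency tree with one node per token (simplified).
a_layer = TreeNode("barked", "Pred", [
    TreeNode("Dogs", "Sb"),
    TreeNode("have", "AuxV"),
    TreeNode(".", "AuxK"),
])

# t-layer: deep-syntactic tree; in this sketch only content words remain.
t_layer = TreeNode("bark", "PRED", [
    TreeNode("dog", "ACT"),
])

print(size(a_layer), size(t_layer))
```

In this sketch the a-layer tree keeps a node for every token, while the t-layer tree keeps only the content words, so the auxiliary verb and the punctuation have no node of their own there.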

Prague Dependency Treebank

The Prague Dependency Treebank (PDT) is a corpus of Czech sentences manually annotated according to the FGD.

VALLEX

MT Using Deep Syntax: TectoMT