Constituency Trees: Difference between revisions
Revision as of 09:44, 6 August 2015
Lecture video: https://www.youtube.com/watch?v=y_9SEdG1u3U
Context Free Grammar
Grammars, generally, are a way of describing a potentially infinite set of strings (sentences) using a finite set of production rules. In a context-free grammar, production rules take the following form:
V → w
Here V is a single nonterminal symbol (for natural languages, nonterminals usually correspond to phrases, such as NP for noun phrases) and w is a non-empty string of terminals (words) and nonterminals. CFGs are known to be unable to fully describe natural languages, but for machine translation (MT) they serve as a very useful simplification.
One nonterminal serves as the top-level (start) symbol, where generation begins (or where analysis ends); for natural languages, we usually use the symbol S (sentence). For example:
 S -> NP VP
 NP -> dogs
 VP -> sleep
 NP -> Det Adj N
 Det -> the
 Adj -> black
 N -> cat
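As a minimal sketch, the toy grammar above can be written as a Python dictionary and expanded exhaustively from S. The names `GRAMMAR`, `expand` and `derivations` are illustrative assumptions, not part of the lecture material, and the exhaustive enumeration only terminates because this particular grammar has no recursive rules.

```python
from itertools import product

# Toy grammar from the text: each nonterminal maps to its right-hand sides.
# (Hypothetical encoding chosen for this sketch.)
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["dogs"], ["Det", "Adj", "N"]],
    "VP": [["sleep"]],
    "Det": [["the"]],
    "Adj": [["black"]],
    "N": [["cat"]],
}


def expand(symbol):
    """Yield every (words, bracketed_tree) pair derivable from `symbol`."""
    if symbol not in GRAMMAR:
        # Anything without a rule is a terminal, i.e. a word.
        yield [symbol], symbol
        return
    for rhs in GRAMMAR[symbol]:
        # Combine every possible expansion of each right-hand-side symbol.
        for parts in product(*(list(expand(s)) for s in rhs)):
            words = [w for part_words, _ in parts for w in part_words]
            subtrees = " ".join(tree for _, tree in parts)
            yield words, f"({symbol} {subtrees})"


# Start generation from the top-level nonterminal S.
derivations = [(" ".join(words), tree) for words, tree in expand("S")]
for sentence, tree in derivations:
    print(sentence, "=>", tree)
```

Each bracketed string is the constituency tree of the corresponding sentence. The grammar licenses exactly two sentences, "dogs sleep" and "the black cat sleep"; the second shows that such a small CFG does not model number agreement.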