6.1.3 Overview: Grammar Definition
The development of language models for speech recognition
can be traced directly to Chomsky's formal language
theory. This theory specifies a hierarchy of grammars (loosely defined
as rules for a language) and automata (language models) that
can recognize sentences in that language. To explain
the hierarchy, we must first formally define a grammar, G, as the four-tuple:
G = (V, T, P, S)
where:
V contains the set of all non-terminal symbols.
T contains the set of all terminal symbols.
P is a set of production or rewrite rules.
S is a special symbol called the start symbol.
As an example, each word in the sentence "Julie loves speech"
is a terminal symbol contained in T. A set of production rules,
P, for a grammar that can generate the sentence is shown below:
S -> NP VP
VP -> VERB NP
NP -> NOUN
NP -> NAME
NOUN -> speech
NAME -> Julie | Ethan
VERB -> loves | chases
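The rules above can be sketched as a small data structure. The following is a minimal Python sketch, not a definitive implementation: it assumes that the VP rule expands to VERB followed by NP, and that NAME and VERB each offer two alternatives (Julie or Ethan, loves or chases). The names NONTERMINALS, TERMINALS, PRODUCTIONS, START, and leftmost_derivation are hypothetical and chosen only for illustration.

```python
# Hypothetical encoding of G = (V, T, P, S); all names are illustrative.
NONTERMINALS = {"S", "NP", "VP", "NOUN", "NAME", "VERB"}      # V
TERMINALS = {"Julie", "Ethan", "loves", "chases", "speech"}   # T
START = "S"                                                   # S

# P: each non-terminal maps to a list of alternative right-hand sides.
# Assumption: VP expands to VERB NP, and NAME and VERB have two alternatives.
PRODUCTIONS = {
    "S": [["NP", "VP"]],
    "VP": [["VERB", "NP"]],
    "NP": [["NOUN"], ["NAME"]],
    "NOUN": [["speech"]],
    "NAME": [["Julie"], ["Ethan"]],
    "VERB": [["loves"], ["chases"]],
}


def leftmost_derivation(choices):
    """Rewrite the leftmost non-terminal once per entry in `choices`.

    Each entry is the index of the alternative to apply for that
    non-terminal. Returns every sentential form, starting with [START].
    """
    form = [START]
    steps = [list(form)]
    for choice in choices:
        # Locate the leftmost symbol that can still be rewritten.
        idx = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        form[idx:idx + 1] = PRODUCTIONS[form[idx]][choice]
        steps.append(list(form))
    return steps


if __name__ == "__main__":
    # Derive "Julie loves speech" one rewrite at a time.
    for step in leftmost_derivation([0, 1, 0, 0, 0, 0, 0]):
        print(" ".join(step))
```

Each printed line is one sentential form in the derivation, beginning with the start symbol and ending with a string made up only of terminal symbols.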
The set of non-terminal symbols, V, includes S, NP, VP, NOUN, NAME,
and VERB. Finally, a language consists of all possible strings
of terminal symbols that can be generated by the production rules of
the grammar. Other possible strings in this language include
"Julie chases Ethan" and "Ethan loves speech." The grammar also
generates the string "speech loves Julie," which is unlikely to occur in
normal conversation. This shows that the example grammar is too simple
and that greater complexity is needed to represent natural
spoken language. Continue to 6.1.4 for further description of
Chomsky's grammar hierarchy and how it can be used to adequately
model the complexity of spoken language.
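Because the example grammar has no recursive rules, its language is finite and can be listed exhaustively. The sketch below is one way to do that in Python, under the same assumptions as the example rules (VP expands to VERB NP; NAME is Julie or Ethan; VERB is loves or chases). The names PRODUCTIONS, expand, and LANGUAGE are hypothetical, and the approach would not terminate for a grammar with recursive rules.

```python
from itertools import product

# Hypothetical encoding of the production rules P; names are illustrative.
PRODUCTIONS = {
    "S": [["NP", "VP"]],
    "VP": [["VERB", "NP"]],
    "NP": [["NOUN"], ["NAME"]],
    "NOUN": [["speech"]],
    "NAME": [["Julie"], ["Ethan"]],
    "VERB": [["loves"], ["chases"]],
}


def expand(symbol):
    """Return every terminal string derivable from `symbol` as a tuple of words.

    Only safe for non-recursive grammars such as this example.
    """
    if symbol not in PRODUCTIONS:          # terminal symbol: nothing to rewrite
        return [(symbol,)]
    results = []
    for rhs in PRODUCTIONS[symbol]:
        # Combine every expansion of each right-hand-side symbol, in order.
        for parts in product(*(expand(s) for s in rhs)):
            results.append(tuple(word for part in parts for word in part))
    return results


# The language: all strings of terminals derivable from the start symbol.
LANGUAGE = {" ".join(words) for words in expand("S")}

if __name__ == "__main__":
    for sentence in sorted(LANGUAGE):
        print(sentence)
```

Under these assumptions the language contains 18 sentences (three noun phrases, two verbs, three noun phrases), including both "Julie loves speech" and the implausible "speech loves Julie", since the rules encode word order but not meaning.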