6.1.5 Overview: Grammars for Speech Recognition
The three lower levels of Chomsky's hierarchy are of interest for
speech recognition. Let's begin at the bottom:
Regular grammars require that every production rule contain at least one
terminal symbol on the right-hand side (RHS), for example, A -> wB or
A -> w. This constraint makes them the least powerful of the three, but
it also means they can be recognized by a finite state machine, which
makes them the simplest to implement. Network decoding is an example of
a language model at this level of the hierarchy.
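The point can be seen in a minimal sketch (the grammar below is a made-up
two-word example, not part of any ISIP tool): because every rule leaves at
most one non-terminal, the current non-terminal alone serves as the state
of a finite state machine.

```python
# Minimal sketch: a regular grammar recognized by a finite state
# machine. Hypothetical grammar: S -> "no" S | "yes"
# (i.e., zero or more "no" words followed by a final "yes").
def accepts(words):
    state = "S"                      # the current non-terminal IS the FSM state
    for w in words:
        if state == "S" and w == "no":
            state = "S"              # rule S -> no S
        elif state == "S" and w == "yes":
            state = "ACCEPT"         # rule S -> yes
        else:
            return False             # no applicable rule
    return state == "ACCEPT"

print(accepts(["no", "no", "yes"]))  # True
print(accepts(["yes", "no"]))        # False
```

No stack or auxiliary memory is needed; a single state variable suffices,
which is exactly why this level is the easiest to program.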
Context-free grammars (CFGs) allow production rules whose RHS contains
only non-terminal symbols, for example, A -> B.
This increases their power and flexibility beyond regular grammars,
but recognizing strings requires a pushdown automaton, which can store
information encoded in the non-terminals on a stack. JSGF is an
example of a language model format for a CFG.
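The extra power is easy to demonstrate with the textbook context-free
language a^n b^n, which no finite state machine can accept because it
must count unboundedly. The sketch below (not from any ISIP tool) uses
recursive descent; the recursion depth plays the role of the pushdown
automaton's stack.

```python
# Minimal sketch: recognizing a^n b^n with the grammar
#   S -> "a" S "b" | empty
# The call stack stands in for the pushdown automaton's stack.
def accepts(s):
    def parse_S(i):
        if i < len(s) and s[i] == "a":
            j = parse_S(i + 1)           # apply S -> a S b
            if j is not None and j < len(s) and s[j] == "b":
                return j + 1
            return None                  # matching "b" missing
        return i                         # apply S -> empty
    return parse_S(0) == len(s)

print(accepts("aabb"))   # True
print(accepts("abb"))    # False
```

Each recursive call remembers one pending "b", which is information a
finite state machine, having no memory beyond its state, cannot hold.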
Context-sensitive grammars (CSGs) allow production rules that have
terminal symbols on both the left-hand side (LHS) and the RHS, for
example, aAb -> aBb. This makes it possible to represent the context
of a word more specifically than at the lower levels, but it requires
a more powerful automaton (a linear bounded automaton) to recognize
sentences in the language, and therefore more complex techniques to
parse them. N-grams are classified at this level.
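The textbook context-sensitive language is a^n b^n c^n, where three
counts must agree; no CFG can generate it, because a pushdown automaton's
single stack can only track one count. The membership check itself is
simple to write directly (a sketch, not part of any ISIP tool), even
though parsing with the CSG rules would need a linear bounded automaton.

```python
# Minimal sketch: a^n b^n c^n (n >= 1), the classic language that is
# context-sensitive but not context-free.
def accepts(s):
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n

print(accepts("aabbcc"))  # True
print(accepts("aabbc"))   # False
```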
In practice, the context-sensitive constraints needed for speech
recognition can usually be approximated by CFGs. In addition, CFGs offer
a good compromise between representational power and parsing
efficiency. Therefore, most speech recognition systems use
CFG-based language models.
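As a concrete illustration of such a CFG-based model, here is a small
grammar in the JSGF format mentioned above (the grammar name and the
vocabulary are hypothetical, chosen only for the example):

```
#JSGF V1.0;
grammar commands;

public <command> = <action> <object>;
<action> = open | close;
<object> = (the) door | (the) window;
```

Each angle-bracketed rule is a non-terminal; the alternatives after `=`
form its RHS, so the file is a direct transcription of CFG productions.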
See
SRSTW'02
and
Speech Course Notes
for more theoretical details.
Continue to Section 6.2.1 to learn how to
create network language models using ISIP software.