Overview
This research focuses on identifying methods for increasing the
flexibility of an interaction with a spoken language dialog system, while
balancing the critical need for response efficiency in a vehicular
environment. This has entailed developing a demonstration vehicular dialog
system and using it as a testbed for exploring the fundamental research
issues. The demonstration system integrates the ISIP public domain speech
recognition system as a component in the DARPA Communicator architecture.
The dialog system uses a DARPA Communicator Hub-compliant architecture,
composed of a number of servers that interact with each other through the
DARPA Communicator Hub. Our system is composed of the Hub and five primary
servers:
- Audio Server receives speech signals from users through microphones
and sends them to the automatic speech recognizer. It also plays
back to users the synthesized speech it receives from the Speech
Synthesizer.
- Speech Recognizer takes signals from the Audio Server and produces
a word lattice.
- Semantic Parser takes word lattices from the Speech Recognizer,
parses them, and produces the best interpretations in the form
of semantic frames.
- Dialog Manager retrieves the semantic frame from the Parser;
requests clarification from the user if required; uses clarification
and conversational context to resolve ambiguities; translates the
semantic frame into a database query; retrieves the information and
responds to the user.
- MySQL Database Application receives SQL queries from the
Dialog Manager, retrieves data from a MySQL database, and accesses
the web to retrieve additional information if necessary.
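The Dialog Manager's step of translating a semantic frame into a database query can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual code: the slot names, the `places` table, its columns, and the `frame_to_query` helper are all hypothetical. The `%s` placeholders follow the parameter style commonly used by Python MySQL drivers, so that values are passed separately from the SQL text.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from semantic-frame slots to table columns.
SLOT_TO_COLUMN = {"category": "category", "city": "city"}


def frame_to_query(frame: Dict[str, str], table: str = "places") -> Tuple[str, List[str]]:
    """Build a parameterized SELECT statement from a semantic frame.

    Slots that have no known column are ignored. The table name is an
    assumption made for this sketch.
    """
    conditions: List[str] = []
    params: List[str] = []
    # Sort the slots so the generated SQL is deterministic.
    for slot, value in sorted(frame.items()):
        column = SLOT_TO_COLUMN.get(slot)
        if column is None:
            continue
        conditions.append(f"{column} = %s")
        params.append(value)
    sql = f"SELECT name, address FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


# A frame such as the parser might produce for a restaurant request.
sql, params = frame_to_query({"category": "restaurant", "city": "Starkville"})
```

Keeping the values in a separate parameter list, rather than pasting them into the SQL string, is a design choice that avoids quoting problems when recognized words contain unexpected characters.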
These five servers communicate via the hub to accomplish the task of
understanding and responding to a spoken request. The DARPA hub
communication interface uses scripts to direct the flow of information
among the servers. The In-Vehicle Dialog System currently contains
information about the Mississippi State University campus and surrounding
Starkville area, but is designed in a modular fashion to easily support the
addition of other city information as well as additional servers.
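The hub-and-script arrangement described above can be sketched as a toy model. This is not the DARPA Communicator API: the `Hub` class, the script format (a plain list of server names), and all five stub servers are hypothetical stand-ins. The real flow is also not strictly linear, since the Dialog Manager may go back to the user for clarification; the sketch shows a single simplified pass.

```python
from typing import Any, Callable, Dict, List


class Hub:
    """Toy hub: routes a message through servers in the order a script names."""

    def __init__(self, script: List[str]) -> None:
        self.script = script
        self.servers: Dict[str, Callable[[Any], Any]] = {}

    def register(self, name: str, handler: Callable[[Any], Any]) -> None:
        self.servers[name] = handler

    def run(self, message: Any) -> Any:
        # Each server's output becomes the next server's input.
        for name in self.script:
            message = self.servers[name](message)
        return message


FAKE_DB = {"restaurant": ["Example Diner"]}  # placeholder data, not real


def audio_server(signal: str) -> str:
    return signal.strip()  # stand-in for audio capture


def speech_recognizer(signal: str) -> List[str]:
    return signal.lower().split()  # stand-in for a word lattice


def semantic_parser(words: List[str]) -> Dict[str, str]:
    matches = [w for w in words if w in FAKE_DB]
    return {"category": matches[0]} if matches else {}


def dialog_manager(frame: Dict[str, str]) -> str:
    return frame.get("category", "")  # stand-in for a database query


def database(key: str) -> List[str]:
    return FAKE_DB.get(key, [])


hub = Hub(["audio", "recognizer", "parser", "dialog", "database"])
for name, handler in [("audio", audio_server), ("recognizer", speech_recognizer),
                      ("parser", semantic_parser), ("dialog", dialog_manager),
                      ("database", database)]:
    hub.register(name, handler)

result = hub.run("  Find a restaurant ")
```

Because the order of processing lives in the script rather than in the servers, adding a server in this model means registering one more handler and editing the script, which mirrors the modularity the text describes.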
The Semantic Parser and Dialog Manager code bases were originally
written by the Center for Speech and Language Research at the University of
Colorado for a travel reservation application.
Through data collection and analysis, a new semantic grammar and a new
dialog manager code base were derived to support an in-vehicle application.
The current semantic grammar contains approximately 500 rules and over 2000
words.