The first SWITCHBOARD database was collected at Texas Instruments in
1992 [1]. Dr. Godfrey stated that large structured collections of
speech and text are essential to progress in speech and speaker
recognition research, an accepted truth in the speech community. At the
time of this writing, the Linguistic Data Consortium (LDC) is in the
process of collecting SWITCHBOARD-II (SWB-II) phase 3.
To test the versatility of our data collection system, we have
implemented the LDC's SWITCHBOARD-II phase 3 protocol. The application
parameter file was created using lk_appbuilder.
The modularly designed Oracle database access routines used in the SWB
implementation also provide useful examples of how the system may be
expanded to interface with other software.
This project was sponsored by the Linguistic Data Consortium.