The first SWITCHBOARD database was collected at Texas Instruments in 1992 [1]. Dr. Godfrey stated that large structured collections of speech and text are essential to progress in speech and speaker recognition research, a view widely accepted in the speech community. At the time of this writing, the Linguistic Data Consortium (LDC) is collecting SWITCHBOARD-II (SWB-II) phase 3.

To test the versatility of our data collection system, we have implemented the LDC's SWITCHBOARD-II phase 3 protocol. The application parameter file was created using lk_appbuilder. The modularly designed Oracle database access routines used in the SWB implementation also provide useful examples of how the system may be expanded to interface with other software.

This project was sponsored by the Linguistic Data Consortium.