The Institute for Signal and Information Processing (ISIP) was created in 1994
to develop public domain speech recognition software. One of the primary goals
of our program is to educate students and researchers who are new to speech
recognition, in addition to providing tools for those already established in
the field. Our speech site is a dedicated resource for anyone interested
in speech research. The site consists of several sub-sites, including software,
databases, and support. In this tutorial I will introduce some of the more
popular sections of our website.
Let's begin with the software downloads page, which lists both previous and
current releases of our software. There you can download the latest version of
the Production System, a research environment that includes a generalized
hierarchical Viterbi search-based decoder and a network trainer. It is
recommended for serious speech and signal processing researchers. To reach the
downloads, click on the software link on the left navigation bar from any page
within the speech site. Then, at the bottom of the left column, click on the
link to the download archives. Once you have downloaded the software, you will
want to view our documentation pages, described in the next section.
To better understand our software environment, take a look at our documentation.
Within this section of the site you will find extensive documentation for
the ISIP foundation classes (IFCs). The IFCs and software environment are
designed to provide everything from complex data structures to an abstract
file I/O interface. This software environment begins at the lowest level,
the operating system (/system), and culminates in a state-of-the-art public
domain large vocabulary speech recognition system. This section
can be accessed from any ASR page by clicking on the Docs link on the
left of the page.
After you have downloaded the software and browsed through the
documentation, you will want to learn how to use the software more effectively.
In our tutorials
we provide comprehensive guides for many aspects of our software
and related research activities. Our most popular tutorial focuses on our
production software. In this tutorial, we provide detailed examples
that you can download and run to verify that your installation is performing
correctly. These examples include the necessary data and configuration files
to replicate each step in a typical system development cycle. At the end of
each section, you will find extensive examples that provide a more integrated
view of the topic presented in that section. Finally, the last section in the
tutorial is devoted to examples of how to build some of the most popular
recognition applications in use today. To navigate to the production
tutorial, click the tutorials link on the left of any page within the
ASR site.
Java Applets and Demos
Once you become an expert at using our software, you may want to peruse our
Java applications. These applications demonstrate the capabilities of our
software. Included are demos for
Convolution, Dynamic Time-Warping, Filter Design, Pattern
Recognition, Pole/Zero Analysis of Linear Systems, and a Spectrum
Analysis tool. As an example, the convolution demo provides an interactive animation
of two signals as they convolve over time. The user can program ranges
and even create custom signals. From any page within the speech site, click
on the Demos link located on the left navigation bar. Then, follow the link
labeled Java Applications.
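
To give a feel for what the convolution demo animates, here is a minimal
sketch in Python. It is not taken from our Java applications; the function
names convolve_steps and convolve, and the two short test signals, are
assumptions chosen for illustration. Each step corresponds to one shift of the
second signal across the first, which is roughly what one frame of such an
animation would show.

```python
def convolve_steps(x, h):
    """Yield (n, y[n]) one output sample at a time as h slides across x.

    Implements the discrete convolution y[n] = sum_k x[k] * h[n - k].
    Hypothetical helper for illustration only.
    """
    if not x or not h:
        return
    for n in range(len(x) + len(h) - 1):
        # Only indices k where both x[k] and h[n - k] exist contribute.
        k_lo = max(0, n - len(h) + 1)
        k_hi = min(n, len(x) - 1)
        yield n, sum(x[k] * h[n - k] for k in range(k_lo, k_hi + 1))


def convolve(x, h):
    """Collect every step into the full convolution result."""
    return [y for _, y in convolve_steps(x, h)]


if __name__ == "__main__":
    ramp = [1, 2, 3]   # custom signal, made up for this example
    box = [1, 1]       # two-sample box, made up for this example
    for n, y in convolve_steps(ramp, box):
        print(f"shift {n}: y = {y}")
```

Writing the steps as a generator keeps each shift separate, so you can inspect
one output sample at a time instead of only the final result.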
We have also included a collection of audio-enhanced web pages that
demonstrate various phenomena related to speech production and perception.
Included are demos to illustrate speech recognition units as well as various
transcription challenges.
The phonetic units demo allows you to listen to an utterance transcribed
using different linguistic units. Common recognition units such as
context-dependent phones are included. From any page within the speech
site, click on the Demos link located on the left navigation bar.
Then, follow the link labeled Audio Demos.
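
As a rough illustration of what a context-dependent phone is, the sketch below
expands a phone sequence into units that carry their left and right
neighbors. It is not part of the demos. The function name to_triphones, the
"left-center+right" label format, and the phone symbols used for the word
"speech" are all assumptions made for this example, and our software may use a
different notation.

```python
def to_triphones(phones):
    """Expand a phone sequence into context-dependent units.

    Each unit is written left-center+right (an assumed label format).
    Phones at the edges of the sequence simply omit the missing context.
    """
    units = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        units.append(left + phone + right)
    return units


if __name__ == "__main__":
    # Illustrative phone symbols for "speech"; not taken from our lexicon.
    print(to_triphones(["s", "p", "iy", "ch"]))
```

The same phone appears as a different unit depending on its neighbors, which
is why an utterance looks different when transcribed with these units.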