[Project Reporting] ANNUAL REPORT FOR AWARD # 9809300

Joseph Picone ; Mississippi State Univ
CARE: Internet-Accessible Speech Recognition Technology

Participant Individuals:
Graduate student(s) : Aravind Ganapathiraju; Xinping Zhang; Yufeng Wu;
Vishwanath Mantha; Richard Duncan
Undergraduate student(s) : Robert Brown; Lorena Rogers; Clayton Graff
High school student(s) : Tao Weilundemo
Senior personnel(s) : William C Chapman
Graduate student(s) : Jonathan Hamaker; Shivali Srivastava
Undergraduate student(s) : Issac Alphonso; Cedric D George; Joey D Foote;
Mahnas J Mohammadi-Aragh; Antonio M Robinson; Rick P King; David B Kay;
David J Laundre; Jennifer P Vogel; Jason Rogers
Graduate student(s) : Naveen Parihar; Ram Sundaram; Kaihua Haung
Undergraduate student(s) : Robert Blackwood; Erich Deitenbeck; Joseph
Langley; Katy Muir; Kenneth Poteete; Jason Wallace; Byron Williams; Troy
Lindsey

Partner Organizations:
Department of Defense: Collaborative Research

The Department of Defense has been a consistent consumer of the
technology developed in this project, and regularly gives us feedback
on design issues. We support them as an alpha test site since they are
typically the first to use our system. Though we have separate funding
from DoD, their impact on the production system has been important.

Other collaborators:

We maintain several web-based resources geared towards collaboration.
This includes a mailing list for project-related announcements that
has grown to over 170 participants. These users provide feedback on
design questions in addition to bug reports and other such support
requests that arise from use of the software.

Our annual workshops also train approximately 36 sites per year on
our system and promote interactions and collaborations.

Activities and findings:

Research and Education Activities:

======================================================================

08/15/00 - 08/14/01: RESEARCH AND EDUCATIONAL ACTIVITIES
In the third year of this project, we focused our efforts in two core
areas:
·       Java Applets: overhauled our Java interfaces to use servlet
technology, and launched two new applications: feature extraction and
recognition.
·       Production System Release: completed two alpha releases of a new
version of the production system that greatly enhance its flexibility
and functionality. Extended the interface to support more diverse file
formats (both input and output). Integrated our front end application
building software.
We also continued activities in the areas central to the overall
research program:
·       Foundation classes: added many algorithms at both the math and signal
processing layers of the system. Introduced classes to handle general
statistical pattern recognition. Revamped many of the underlying
classes to make better use of templates and templatized functions.
·       Workshops: hosted a software design review in January 2001, and two
one-week training workshops held in May ('00 and '01). An impressive
collection of on-line resources related to these workshops is publicly
available.
·       Software engineering: upgraded our software development process to use
a new problem-tracking tool that was written specifically to deal with
bug life-cycle issues. Streamlined our support activities and improved
the ease of use of our distribution and verification procedures.
The workshops continue to be extremely successful as demand has far
surpassed our original estimates for enrollment (and taxed our
facilities). We have seen a dramatic increase in the number of
commercial users of the recognition system. In fact, some of our most
active users are now commercial users. On-line support has improved,
but still remains a challenge given the wide range of experience
levels among our users.

======================================================================

08/15/99 - 08/14/00: RESEARCH AND EDUCATIONAL ACTIVITIES

In the second year of this project, we focused our efforts in five
major areas:
·       Production System Release: the first release of the production
speech recognition system, based on our modular libraries, is
scheduled for July 1.
·       Hosted two workshops: a software design review held in January 2000,
and a one-week training workshop held in May 2000.
·       Software engineering: upgraded our distribution to use the autoconf
facility, added an automated report tracking system to our on-line
support, and created a multi-platform support facility.
·       Foundation classes: added algorithms and other signal processing
building blocks, introduced classes to handle acoustic models, search
algorithms, and knowledge sources, and released a front-end that
allows arbitrary algorithms to be implemented using a graphical user
interface.
·       Java Applets: enhanced our pattern recognition applet with several
important new features, including generation of arbitrary data sets,
clustering, and visualization of decision surfaces.
The workshops appear to be extremely successful as demand has far
surpassed our original estimates for enrollment (and taxed our
facilities). The number of serious users of the recognition system is
continually growing. It is becoming a challenge to provide same-day
response to most support requests, particularly given the wide range
of experience levels from the users.

See the attached pdf file for the entire report.

======================================================================

08/15/98 - 08/14/99: RESEARCH AND EDUCATIONAL ACTIVITIES

In the first year of this project, we focused our efforts in three
major areas:
·       Core Technology: extensions of the speech recognition system
required to enhance its appeal to our customer base (driven by
customer feedback);
·       Foundation Classes: building blocks such as vectors, matrices, and
data structures that simplify and standardize the development of
higher-level classes;
·       Web-Based Information: a comprehensive and informative web site that
constitutes a central point of contact for everything related to the
project.
We have seen interest in the project grow, as evidenced by the fact that
our mailing list has grown to 150 participants, and we have received
several serious inquiries about collaborations based on our system
(one of which resulted in participation in a joint NSF/EU proposal
[1]). Major milestones for the first year of the project included the
release of a fully functional speech recognition system (including
feature extraction and training), and the development of a remote job
submission capability that lets users submit jobs to our system over
the Internet.

See the attached pdf file for the entire report.

Findings:

======================================================================

08/15/00 - 08/14/01: MAJOR FINDINGS
In the third year of this project, we continued conducting our annual
workshops. In January 2001, we hosted 14 visitors for a software
design review. In May 2001, we hosted 24 participants for our one-week
training workshop. Several collaborations resulted from our previous
workshops, and it appears the same will be true this year as well. A
promising trend this year has been a significantly increased level of
interest by commercial users in the software and technology.
A new problem-tracking tool was introduced that has greatly impacted
our software design process. This tool, called Varmint, is publicly
available and is part of our public domain software distribution.
Varmint allows us to track the life-cycle of a bug, and is
particularly useful in a multi-programmer environment in which several
people may touch a bug during its life-cycle. Varmint is styled after
several commercial tools, and was designed based on several common
models of bug tracking used by information technology professionals.
The most significant benefit of using this tool is that programmers
are explicitly conscious of the bugs for which they are responsible,
and have a clear understanding of the priority of these bugs with
respect to the current software release schedule. This creates the
proper atmosphere of accountability required to make sure releases are
clean before they are made.
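The life-cycle tracking described above can be illustrated with a small C++ sketch. It is only an illustration under stated assumptions: the state names, the BugRecord class, and its fields are hypothetical and are not taken from Varmint itself. The sketch shows a bug that may only move along permitted transitions, and that records each programmer who handles it so that responsibility is always explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical life-cycle states; Varmint's actual states may differ.
enum class BugState { Open, Assigned, Fixed, Verified, Closed };

// Returns true if a bug may move directly from one state to another.
inline bool canTransition(BugState from, BugState to) {
  switch (from) {
    case BugState::Open:
      return to == BugState::Assigned;
    case BugState::Assigned:
      return to == BugState::Fixed || to == BugState::Open;
    case BugState::Fixed:
      return to == BugState::Verified || to == BugState::Assigned;
    case BugState::Verified:
      return to == BugState::Closed;
    case BugState::Closed:
      return false;
  }
  return false;
}

// A bug record that remembers every programmer who handled it.
class BugRecord {
 public:
  // Ids are assumed to be positive in this sketch.
  BugRecord(int id, int priority) : id_(id), priority_(priority) {
    assert(id_ > 0);
  }

  // Attempts a transition on behalf of `owner`; rejects illegal moves.
  bool advance(BugState next, const std::string& owner) {
    if (!canTransition(state_, next)) return false;
    state_ = next;
    history_.emplace_back(next, owner);
    return true;
  }

  BugState state() const { return state_; }
  int id() const { return id_; }
  int priority() const { return priority_; }

  // The programmer currently responsible, or empty if nobody has acted yet.
  std::string owner() const {
    return history_.empty() ? std::string() : history_.back().second;
  }

  // Number of accepted transitions, i.e. how often the bug was touched.
  std::size_t touches() const { return history_.size(); }

 private:
  int id_;
  int priority_;
  BugState state_ = BugState::Open;
  std::vector<std::pair<BugState, std::string>> history_;
};
```

In this sketch a release check could simply scan all records and refuse to proceed while any high-priority record is not yet in the Closed state.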
We also released a version of our job submission applet that uses Java
servlets. While the servlet programming environment is still rapidly
evolving, servlets have been instrumental in allowing our interfaces
to be sufficiently powerful. For example, users can now browse our
filesystems or their own filesystems using the same interface, and can
collect a diverse set of files for processing. These features allow
users to benchmark their local implementations against our reference
implementations. One windfall from this has been a reduction in
support requests on trivial issues such as file formats because users
can now debug this themselves using the applet.
We have continued to release a large amount of supporting materials on
our web site that are relevant to our mission of providing
comprehensive conversational speech recognition tools. We now deliver
high-quality transcriptions for 500 hours of Switchboard
conversational speech data, and offer acoustic models trained on this
data. ISIP resources are frequently referenced at major speech
recognition forums such as the Department of Defense's Speech
Transcription Workshop (also known as the Hub 5E workshop). This year,
we added phonetically transcribed data originally developed at the
International Computer Science Institute at the University of
California, Berkeley. This data, along with our word-level
transcriptions, is provided in a format that makes it easy to use for
investigations into phonetic-level performance of speech recognition
systems. This data is being used by several students in their
dissertation research.
Our on-line support and tutorial materials continue to grow as we
collect more feedback from our users. Our support line averages 5
support requests per day, about the same level of activity we
experienced in the previous year of this project. Typically one of
these requests will be nontrivial and require a more measured
response. We estimate that we spend approximately 20 hours per week
performing direct support of the software through the on-line help
service. The user base for our software appears to have stabilized at
about 175 users.
We have also in the past year begun to graduate our first
thesis-option students who contributed to this project. Thesis topics
have ranged from advanced pattern recognition techniques (Support
Vector Machines) to experimental topics such as the influence of
transcription errors on performance. These theses are the first work
to make extensive use of the ISIP tools.

======================================================================

08/15/99 - 08/14/00: MAJOR FINDINGS
In the second year of this project, we introduced two pivotal
activities in the project: annual workshops and a rigorous software
distribution process. The workshop activities are proceeding smoothly.
In January 2000, we hosted nine visitors from several foreign
countries (China, Finland), government agencies (FBI, DoD), and
industrial sites (IBM, MITRE, Lincoln Labs). We reviewed the goals of
the research program, the architecture of the system, provided several
demonstrations, and collected feedback on features and capabilities
needed in future versions of the system. Several collaborations
resulted from this meeting, including an audio indexing opportunity
with Georgia Tech, and invited talks at IBM. More collaborations are
planned. At the time of this writing, we are completing plans
for the May 2000 workshop, which will include 25 participants, of
which 20 are graduate students. Demand for the workshop was strong -
we tripled the number of participants over what was originally
budgeted. We turned away approximately 10 potential participants due
to space and resource limitations.
One surprise in the second year of the project has been the growing
importance of supporting the Linux operating system, and the
difficulties in doing so. Through experience, we have learned that
developing on a different flavor of Unix (Sun Solaris x86), even with
the same compiler and build tools (GNU gcc, make, etc.), does not
guarantee robust releases for Linux users. Hence, we spent some time enhancing
our ability to achieve platform independence across a wide range of
Unix platforms. We now routinely run our releases through a suite of
systems before actually making the release. We also have encountered a
large demand for Windows-based ports of our system. Currently, we do
this through the use of a Unix-like shell available under Windows
(Cygnus' cygwin tools). This has also increased the support overhead
in making releases of our system. Despite the demand for a native
Windows port, we are not assigning this a high priority until the core
system is stable. This is primarily because there is a lack of
standardization of C++ compilers and development environments,
therefore making it hard to support both environments simultaneously
when the software base is changing rapidly.
With the addition of a full-time staff person, it has been possible to
expand our on-line support activities. We now handle approximately 5
serious support requests per day. Many support requests involve
hand-holding inexperienced users on basic computing issues - we are
still struggling with how to deal with these in a timely manner. The
remainder involve unexpected program crashes that require extensive
diagnosis. We have implemented an automated problem tracking system to
make sure such requests are properly tracked. We also provide an
ability for users to upload their data to us, so that we can replicate
their problems. The majority of serious problems seem to relate to
compiler-dependent problems (for example, a bug which did not show up
on one version of an operating system, but is fatal on a different
architecture with a different compiler). This also exposed the
critical need for an in-house multi-platform evaluation facility.
Last but not least, we have made extensive progress enhancing basic
functionality of the system. We have developed a generalized
hierarchical search engine that is the first such system of its type.
We provide an ability to decode speech using either networks or
N-gram language models. Users can supply either type of model, or
both, at any level of the system. For example, it is possible to
constrain the search using N-grams of parts of speech, as well as
N-grams of words. We have also developed a front-end that allows users
to configure the signal processing portions of the system without
writing any code - algorithms can be specified using a GUI-oriented
tool that automatically schedules the required operations.
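To make the idea of constraining a search with N-grams at more than one level concrete, the following C++ sketch scores a hypothesis with bigrams over words and bigrams over parts of speech. It is a simplified illustration, not the search engine itself: the BigramTable type, the TaggedWord struct, the fixed floor for unseen pairs, and the simple summing of log probabilities are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A bigram table maps (previous symbol, current symbol) to a log probability.
using BigramTable = std::map<std::pair<std::string, std::string>, double>;

// Looks up a bigram, falling back to a fixed floor for unseen pairs.
inline double bigramLogProb(const BigramTable& table, const std::string& prev,
                            const std::string& cur, double floor) {
  auto it = table.find({prev, cur});
  return it == table.end() ? floor : it->second;
}

// One word of a hypothesis, tagged with its part of speech.
struct TaggedWord {
  std::string word;
  std::string pos;
};

// Scores a hypothesis with bigrams at two levels: words and parts of speech.
// Either table may be empty, in which case that level contributes nothing.
inline double scoreHypothesis(const std::vector<TaggedWord>& hyp,
                              const BigramTable& wordLm,
                              const BigramTable& posLm, double floor) {
  double total = 0.0;
  for (std::size_t i = 1; i < hyp.size(); ++i) {
    if (!wordLm.empty()) {
      total += bigramLogProb(wordLm, hyp[i - 1].word, hyp[i].word, floor);
    }
    if (!posLm.empty()) {
      total += bigramLogProb(posLm, hyp[i - 1].pos, hyp[i].pos, floor);
    }
  }
  return total;
}
```

Passing an empty table for one level corresponds to supplying only one type of model; passing both applies both constraints to the same hypothesis.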

======================================================================

08/15/98 - 08/14/99: MAJOR FINDINGS
Since the main component of this project is the development and
dissemination of speech recognition technology, we did not expect to
generate a significant list of technology-related research
accomplishments in the first year of the project. Nevertheless, we
have begun some interesting research as peripheral activities. These
research topics include the use of Support Vector Machines for
improved acoustic modeling, the study of the influence of
context-sensitive word duration models on conversational speech
recognition performance (a step in the direction of introducing
prosodics into the speech recognition problem), and implementation of
a new segmental Baum-Welch training algorithm (preliminary results for
these approaches look promising; detailed results should be available
by December 1999). The fact that such research can be easily performed
with our system supports our contention that the system is
extensible.
With respect to the core technology component of the program, we
believe we have delivered a decoder that is extremely efficient for
conversational speech recognition, and is competitive with
state-of-the-art. Decoding time and memory requirements are within the
reach of standard PC-class computers. This is important in the context
of this program because it will increase access to this technology by
allowing smaller research labs to be able to use the system with
fairly modest computing environments. To move to larger domains than
conversational speech, such as broadcast news, we have developed a
dynamic language modeling capability that caches large language models
to decrease physical memory requirements. We have also demonstrated
that porting of the system to any gcc-compliant platform is fairly
easy. The only outstanding issue is wide character support (Unicode)
under Linux. Once Linux compilers catch up (expected in Fall '99), our
cross-platform support problems should be minimal.
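One common way to cache a large language model so that only part of it occupies physical memory is a least-recently-used (LRU) cache in front of a slower backing store. The C++ sketch below illustrates that general technique only; it is not a description of the project's actual implementation, and the LanguageModelCache class, its Loader callback, and the string keys are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Keeps at most `capacity` language model entries in memory and
// reloads evicted entries from a slower backing store on demand.
class LanguageModelCache {
 public:
  // Hypothetical callback that fetches a log probability from disk.
  using Loader = std::function<double(const std::string&)>;

  LanguageModelCache(std::size_t capacity, Loader loader)
      : capacity_(capacity), loader_(std::move(loader)) {
    assert(capacity_ > 0);
  }

  // Returns the log probability for an N-gram key such as "the cat".
  double logProb(const std::string& ngram) {
    auto hit = index_.find(ngram);
    if (hit != index_.end()) {
      // Move the entry to the front to mark it most recently used.
      entries_.splice(entries_.begin(), entries_, hit->second);
      return hit->second->second;
    }
    ++misses_;
    if (entries_.size() == capacity_) {
      // Evict the least recently used entry from the back.
      index_.erase(entries_.back().first);
      entries_.pop_back();
    }
    entries_.emplace_front(ngram, loader_(ngram));
    index_[ngram] = entries_.begin();
    return entries_.front().second;
  }

  std::size_t size() const { return entries_.size(); }
  std::size_t misses() const { return misses_; }

 private:
  using Entry = std::pair<std::string, double>;
  std::size_t capacity_;
  Loader loader_;
  std::list<Entry> entries_;
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
  std::size_t misses_ = 0;
};
```

The capacity bounds memory use regardless of the size of the full model; the cost is an extra load from the backing store whenever an evicted entry is needed again.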
Foundation class development has proceeded using a model similar to
Java, but adapted to the demands of speech research. We have found it
extremely useful to abstract the user from the details of the
operating system through the use of our system classes. These handle
all low-level interactions with the operating system, and centralize
many tasks such as memory management, file management and I/O. The
next level above the system classes, the math classes, provide the
user with basic data type building blocks. Here we have followed an
STL model, and have demonstrated that a mixture of templates and fixed
classes is an optimal way to compromise between the needs of
low-level programmers to see physical data types (such as short
integers) and the needs of high-level programmers to be able to build
generic math objects (for example, a matrix of signals). Templates
have only become practical with recent releases of C++ compilers.
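The mixture of templates and fixed classes can be sketched as follows. This is a minimal illustration assuming modern C++ (the fixed-width types from <cstdint> postdate the compilers discussed in this report); the class and alias names Vector, VectorShort, VectorLong, VectorFloat, and SignalBank are hypothetical and are not the names used in the foundation classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Generic building block: the element type is a template parameter.
template <typename T>
class Vector {
 public:
  explicit Vector(std::size_t n, T fill = T()) : data_(n, fill) {}

  std::size_t length() const { return data_.size(); }

  // Bounds-checked element access.
  T& operator()(std::size_t i) {
    assert(i < data_.size());
    return data_[i];
  }
  const T& operator()(std::size_t i) const {
    assert(i < data_.size());
    return data_[i];
  }

  // Sum of all elements, accumulated in the element type.
  T sum() const {
    T s = T();
    for (const T& v : data_) s += v;
    return s;
  }

 private:
  std::vector<T> data_;
};

// Fixed classes: names that pin down the physical data type, so a
// low-level programmer knows exactly how many bytes each element uses.
using VectorShort = Vector<std::int16_t>;
using VectorLong = Vector<std::int32_t>;
using VectorFloat = Vector<float>;

// A generic higher-level object built from the same template:
// a vector whose elements are themselves signals.
using SignalBank = Vector<VectorFloat>;
```

A low-level programmer would work with VectorShort and rely on its 16-bit elements, while a high-level programmer could build a SignalBank without caring about the element layout.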
Web-based dissemination of project information has proven to be a
mixed bag. Unfortunately, a significant percentage of people
interested in our technology and resources appear to still have
limited Internet bandwidth and access. Hence, the demand for small
distributions that can be downloaded via slow modems still exists.
This severely limits what we are able to accomplish in the way of
on-line documentation, interactive applets, and distribution of
toolkits including enough data to run a reasonable experiment. Our
anonymous CVS server has been very useful in that it allows users to
download only the pieces of the code that have changed - thereby reducing
the amount of data one needs to download to remain current.
The remote job submission facility, though unique and
impressive, is not receiving the initial traffic we had expected.
Users still seem to prefer to download the package and build the demos
on their local machines. We hope to improve the visibility of this
facility by enhancing and streamlining the user interface in the next
year of this project.

Training and Development:

================================================================
08/15/00 to 08/14/01:

 - Workshop:

   Students attending our one-week summer workshop receive 5
   half-day lectures on theory and implementation details as
   described below. Participants in our two-day workshop held
   annually in January receive a half-day training in which they
   learn how to run large-scale systems.

 - Software:

   Both undergraduates and graduate students learn state-of-the-art
   skills in large-scale software engineering. For example, they
   learn to develop software using a configuration management tool,
   and to prioritize problem solving using a problem-tracking tool.
   Such real world experience makes them invaluable to leading edge
   software companies.

================================================================
08/15/99 to 08/14/00:

 - Workshop:

   Students attending our one-week summer workshop receive 5
   half-day lectures on theory and implementation details, and
   then spend afternoons in the lab acquiring hands-on skills.
   Students are led through some lab exercises by a senior graduate
   student, and then supervised by lab instructors on more
   open-ended assignments. They work from individual workstations,
   and can see the lab instructor's work on a projection system at
   the front of the class. This format has been very successful for
   training students on our software.

================================================================
08/15/98 to 08/14/99:

Students working on this project develop skills
in four major areas:

 - core technology: speech recognition

   In the first year of this project, students have been exposed
   to all major components of a speech recognition system:
   search algorithms, acoustic modeling, training, and lexicon
   development.

 - software engineering: concurrent software development

   Our students are trained to use a state-of-the-art concurrent
   development package (CVS). We make heavier use of this than most,
   and students quickly learn to grapple with the realities of
   merging code from several developers.

   Students are also exposed to a strict procedure for software
   design and development that includes design reviews, code reviews,
   diagnostic testing, memory and format checking, multiple platform
   portability testing, documentation, and release.

 - programming languages: C++, perl, and Tk/Tcl

   Since our primary programming language is C++, students are
   quickly trained to be experts in C++. While we emphasize that code
   be clean, simple, and easy to read, students nevertheless learn
   about subtle issues of the language, including memory management
   and compiler optimization.

   Students also perform some of their work in perl and Tk/Tcl.
   Tk/Tcl is particularly useful for developing high performance
   GUI-oriented applications. Perl is used primarily to massage data
   into our standard formats.

 - web programming languages: HTML and Java

   All students must deliver documentation and presentations in HTML,
   and hence quickly become proficient web programmers. A significant
   portion of our project involves the development of Java applets.
   Our undergraduates, in particular, get a heavy exposure to Java.

Outreach Activities:

================================================================
08/15/00 to 08/14/01:

In addition to the workshops and our active recruitment of
undergraduates from underrepresented groups described below, we also
participate in recruitment efforts for local area students. We
recently participated in several meetings with local area high school
students intended to increase their interest in engineering
undergraduate education. Our public domain software project attracted
significant interest.

We also maintain an active presence on the web and have advised
numerous students about speech technology-related projects.
One of the most impressive included some discussions with
a sixth grader interested in doing a science fair project
on speech recognition.

================================================================
08/15/99 to 08/14/00:

 - Workshops:

   As mentioned elsewhere, the two workshops we host annually are a
   major component of our outreach activities. Based on our attendance
   lists, we seem to be achieving our goal of attracting people new to
   the details of speech recognition technology.

 - Undergraduates:

   We have been successful at recruiting more underrepresented groups
   to work in entry-level capacities as undergraduate hourly workers.

================================================================
08/15/98 to 08/14/99:

We normally invite high school students to participate in our
research as part of a special program created with a local math and
sciences school in our area. In the first year of this project, we
had one high school student spend a semester programming a Java
application. He was an outstanding student who actually produced
useful code during his tenure in our group - one of our better
experiences with high school students.

Journal Publications:
N. Deshmukh, A. Ganapathiraju, and J. Picone, "Hierarchical Search for Large
Vocabulary Conversational Speech Recognition", IEEE Signal Processing Magazine,
vol. 16, (1999), p. 84. Published
J. Picone, J. Hamaker, R. Brown, R.A. Cole and J.H.L. Hansen, "Modern DSP: The
Story of Three Greek Philosophers", IEEE Signal Processing Magazine, vol. 16,
(1999), p. 48. Published
R. Duncan, "Requirements Engineering in Extreme Programming", Crosstalk: The
Journal of Defense Software Engineering, vol. 1, (2001), p. 1. Published

Book(s) or other one-time publication(s):
Vishwanath Mantha, "A New Look at the SWITCHBOARD Corpus" , bibl. Mississippi State
University, (2000). Thesis Submitted
Aravind Ganapathiraju, "Support Vector Machines for Speech Recognition" , bibl.
Mississippi State University, (). Thesis Submitted
J. Picone, C. Atkeson and I. Alphonso, "Harnessing High Bandwidth: Applications in
Speech Recognition" , bibl. presented at the Spring 2000, Internet2 Member Meeting,
Washington, DC, USA, March 2000, (2000). Conference Presentation Published
P.J. Price and J. Picone, "Automatic Speech Recognition: Better Than Text?" , bibl.
presented at the AAAS Annual Meeting and Science Innovation Exposition, Washington,
D.C., USA, February 2000, (2000). Conference presentation Published
G. Doddington, A. Ganapathiraju, J. Picone and Y. Wu, "Adding Word Duration
Information to Bigram Language Models" , bibl. presented at IEEE Automatic Speech
Recognition and Understanding Workshop, Keystone, Colorado, USA, December 1999,
(1999). Conference publication Published
N. Deshmukh, A. Ganapathiraju, J. Hamaker, J. Picone and M. Ordowski, "A Public
Domain Speech-to-Text System" , bibl. Proceedings of the 6th European Conference on
Speech Communication and Technology, vol. 5, pp. 2127-2130, Budapest, Hungary,
September 1999., (1999). conference paper Published
A. Ganapathiraju, J. Hamaker and J. Picone, "A Hybrid ASR System Using Support
Vector Machines" , bibl. Proceedings of the International Conference
of Spoken Language Processing, (2000). Published
J. Picone, R. Duncan, and J. Hamaker, "Internet-Accessible Speech Recognition
Technology" , bibl. submitted to the O'Reilly Open Source Convention,
San Diego, California, USA, July 2001, (2001). Conference Accepted

Other Specific Products:

Software (or netware)

Object-oriented speech recognition software built from a hierarchy
of general purpose modules including math, data structures, and signal
processing. Makes extensive use of C++ and templates.

The software has been placed in the public domain and can be
downloaded from our web site.

Teaching aids

Java applets that teach fundamentals of signal processing
and pattern recognition.

The software has been placed in the public domain and can be
run or downloaded from our web site.

Internet Dissemination:

http://www.isip.msstate.edu/projects/speech

This highly interactive web site has been developed as part
of the project to disseminate all information about the project.
It contains software, publications, Java applets, and even
a remote job submission testbed.

Contributions:

Contributions within Discipline:

 ================================================================
08/15/00 to 08/14/01:

Our contributions in the third year of this project primarily
impact the fields of speech recognition, human language
technology, and digital signal processing. Our major
accomplishments are as follows:

 - Java Applets:
    -> enhanced our job submission applet to use new Java servlet
       technology. Added applications that perform feature analysis
       and speech recognition decoding.

 - Foundation Class Enhancements:
    -> released a major overhaul of the system that makes
       more extensive use of templates and templatized functions;
       significantly impacted code size and run-time efficiency.

 - Production System Release:
    -> two alpha releases of a new version of the production speech
       recognition system based on our modular libraries.

 - Hosted two workshops:
    -> a software design review held in January 2001 that
       included a one-day training session
    -> a one-week training workshop held in May 2001
       that included 24 participants representing 12 universities
       and 10 companies. 12 participants were graduate students.

 - Software engineering:
    -> overhauled our software engineering process to make better use
       of problem-tracking and configuration management.

 - Foundation classes:
    -> released a new version of our application building tool
       that makes building acoustic front ends very simple and
       intuitive.

 - On-Line Support:
    -> we are averaging approximately 5 serious support
       requests per day, and support a user group that has
       grown to over 170 participants. Recent enhancements to our
       process have allowed us to significantly streamline the
       software support process.

================================================================
08/15/99 to 08/14/00:

Our contributions in the second year of this project primarily
impact the fields of speech recognition, human language
technology, and digital signal processing. Our major
accomplishments are as follows:

 - Production System Release:
    -> first release of the production speech recognition system
       based on our modular libraries is scheduled for July 1.

 - Hosted two workshops:
    -> a software design review held in January 2000 that
       included a one-day training session
    -> a one-week training workshop to be held in May 2000
       that will include 25 participants representing 6
       countries, 17 universities, one government agency and
       one company. Twenty of the participants are graduate
       students.

 - Software engineering:
    -> added an automated report tracking system to our on-line
       support
    -> upgraded our distribution to use the autoconf facility:
       a standard way of distributing Unix software in which
       the distribution automatically configures itself.
    -> enhanced our ability to do multi-platform testing (we
       now benchmark our releases on Sun Sparc, Sun Solaris x86,
       Linux, and Windows before making a release), and bug
       detection (we make extensive use of professional
       strength debugging tools).

 - Foundation classes:
    -> refined the existing core mathematics classes
    -> added data structures, algorithms, and other signal
       processing building blocks.
    -> introduced classes to handle acoustic models, search
       algorithms, and knowledge sources.
    -> released a production front-end that allows arbitrary
       algorithms to be implemented using a graphical user
       interface

 - Java Applets:
    -> enhanced our pattern recognition applet with several
       important new features, including generation of
       arbitrary data sets, clustering, and visualization of
       decision surfaces.

 - On-Line Support:
    -> we are averaging approximately 5 serious support
       requests per day, and support a user group that has
       grown to over 170 participants. Support is definitely
       becoming a time-consuming issue.

================================================================
08/15/98 to 08/14/99:

Our contributions in the first year of this project primarily
impact the fields of speech recognition, human language
technology, and digital signal processing. Our major
accomplishments are as follows:

 - Development of ISIP's Foundation Classes (IFCs)
 - Creation of a Comprehensive Web Site
 - Java Applets
 - Remote Job Submission Facility
 - Human Resources and Outreach

These are described in detail in various sections below.

 - Development of ISIP's Foundation Classes (IFCs)

   The foundation classes include general mathematics
   (scalars, vectors, matrices), data structures (linked
   lists, binary trees) and other useful abstractions
   (command line parsing, database management).  We have
   completed implementation of the math classes.

   The abstractions we use for these build upon ideas
   promoted in the ANSI C++ standard template library, but
   also add important features required for speech
   recognition research and technology development, such as
   explicit control of the data size of an integral type.

   Several interesting software engineering practices were
   implemented, including internal diagnostics that
   automatically test a class. For example, by simply typing
   'make diagnose', a test program is generated for a class,
   which can be run, debugged, checked for memory leaks,
   etc. This is proving to be an invaluable tool for
   guaranteeing the quality of the code.
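
   As a rough illustration of the two ideas above (an integer
   type with an explicit size, and a class that can test
   itself), here is a minimal sketch in Python rather than
   the project's own C++. The class name Int32 and the
   diagnose() method are hypothetical and are not taken from
   the IFCs; the actual 'make diagnose' mechanism generates
   a separate test program.

```python
import ctypes


class Int32:
    """Hypothetical 32-bit signed integer wrapper (not an IFC class)."""

    BITS = 32

    def __init__(self, value=0):
        # ctypes.c_int32 always holds exactly 32 bits and does no
        # overflow checking, so out-of-range values wrap around.
        self._raw = ctypes.c_int32(value)

    @property
    def value(self):
        return self._raw.value

    def __add__(self, other):
        return Int32(self.value + other.value)

    @classmethod
    def diagnose(cls):
        """Built-in self-test; returns True if every check passes."""
        assert ctypes.sizeof(ctypes.c_int32) * 8 == cls.BITS
        assert (cls(2) + cls(3)).value == 5
        # Overflow wraps as two's complement at the 32-bit boundary.
        assert (cls(2**31 - 1) + cls(1)).value == -(2**31)
        return True
```

   Calling Int32.diagnose() exercises the class in one step,
   which is the convenience the diagnostic facility aims
   for; a real test program would cover far more cases.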

 - Creation of a Comprehensive Web Site

   The entire project can be viewed from a web site created
   to support this project. The URL is:

    http://www.isip.msstate.edu/projects/speech

   This site includes a place to download software,
   educational information such as tutorials and technical
   reports, application toolkits, Java applets
   demonstrating core concepts, and a remote job submission
   facility described below.

   We have implemented a facility to manage and distribute
   our software using a package called Concurrent Versions
   System (CVS). This allows users to download our production
   code via an anonymous CVS server (similar to ftp) that
   automatically updates their code as revisions are made.
   CVS is generally considered to be state-of-the-art in
   software management.

   We have also implemented web pages that maintain an
   archive of our mailing lists used for the project. These
   archives are located at

    http://www.isip.msstate.edu/data/mailing_lists

   and are automatically updated daily.

Contributions to Other Disciplines:

 Our new Java servlet technology places our job submission applet
on the leading edge of Java technology, and provides an extremely
powerful paradigm for remote job submission applications.

The software being developed within the foundation classes is
intended to be a general purpose testbed for signal processing
applications beyond speech recognition and human language
engineering. The Java applets are of general educational use for
undergraduate engineering.

Contributions to Education and Human Resources:

================================================================
08/15/00 to 08/14/01:

   Our workshops continue to be an excellent example of the outreach
   activities in this project. This year, we had 12 participants from
   companies interested in making commercial use of this technology.
   Some of these companies are relatively new to this field and
   greatly appreciated the open access such a workshop provides.

   We also continue to promote participation of undergraduate
   students in our research project. Undergraduates have made
   significant software contributions to this project, and have
   leveraged this experience into significant job opportunities with
   leading companies.

================================================================
08/15/99 to 08/14/00:

   Our workshops are an excellent example of the outreach
   activities in this project. Most of the participants
   represent schools not prominent in the field of speech
   recognition. We have at least one underrepresented
   university participating in our May 2000 workshop. Demand
   for the workshop was so great that we increased
   the size from 8 participants (originally budgeted) to 25
   participants in the first year of the summer workshop.

   We also significantly improved participation of undergraduate
   students from underrepresented groups in our research project
   in the second year of the program.

================================================================
08/15/98 to 08/14/99:

   This grant has directly supported the equivalent of four
   full-time graduate students and several
   undergraduates. Undergraduate students have made major
   contributions in web site development, Java programming,
   and speech system tool development.

   We have also interacted with one high school student in
   this program.  This student developed the first version of
   a Swing-based Java applet that is an enhancement of an
   existing applet. He graduated in Spring '99 and is pursuing
   a degree in computer science.

Contributions to Resources for Science and Technology:

================================================================
08/15/00 to 08/14/01:

 - Workshops:

   We continue to develop extensive on-line documentation for the
   workshops we host. All presentation materials are available from
   the web. This year we added self-contained laboratory modules that
   are downloadable from the web.

 - Tutorials:

   Now that our software base has been stabilizing, we have
   begun developing extensive on-line educational material
   related to the system. We are overhauling our web site to reflect
   a more contemporary look and feel for on-line training
   (using a so-called trailhead approach).

================================================================
08/15/99 to 08/14/00:

 - Workshops:

   We have developed extensive on-line documentation for the
   two workshops we host. All presentation materials are
   available from the web. Related coursework (such as an
   updated set of notes for a speech recognition course) is
   also available from the web.

 - Tutorials:

   Now that our software base has been stabilizing, we have
   begun developing extensive turnkey scripts that run canned
   experiments on important applications (such as conversational
   speech recognition). These scripts are extremely important
   in that they show users how to implement subtle details
   of the technology.

================================================================
08/15/98 to 08/14/99:

 - Java Applets

   We have upgraded our existing set of Java signal
   processing applets to a new interface available in Java
   called Swing. This is the latest attempt by Java
   developers to provide a standard high-level interface to
   applications programmers. The previous interface we used
   is being phased out, so this step was necessary.

   We also introduced two new applets. The first applet
   teaches users about digital filter design. This applet
   also served as our initial testbed for Swing. The second
   applet demonstrates pattern classification. Users can
   create data sets, and classify them using a host of
   classifiers. This is still under development.
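
   To make the pattern classification idea concrete, the
   sketch below trains and applies a nearest-centroid
   classifier on a toy two-class data set. It is written in
   Python for brevity, not in the applet's Java. The report
   does not list which classifiers the applet offers, so the
   choice of nearest-centroid, the function names, and the
   data are all assumptions made for illustration.

```python
from math import dist


def train_centroids(samples):
    """Compute one mean vector per class label.

    samples: list of (feature_tuple, label) pairs.
    """
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in acc)
            for label, acc in sums.items()}


def classify(centroids, point):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Two hand-made 2-D classes (toy data, not produced by the applet).
data = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"),
        ((4.0, 4.0), "b"), ((5.0, 4.0), "b")]
centroids = train_centroids(data)
```

   With these centroids, classify(centroids, (1.0, 1.0))
   selects class "a" because that point lies closer to the
   mean of the first group than to the mean of the second.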

 - Remote Job Submission Facility

   One development we are most proud of, and somewhat ahead
   of schedule on, is an applet that allows users to submit
   speech recognition experiments over the Internet. This
   applet is central to our vision of Internet-based
   educational resources. Users can choose an experiment,
   configure various parameters related to the experiment,
   and submit the job.  The job is distributed to our bank of
   servers, and results are returned via the web and/or
   email. Users can supply their own audio data to the
   recognizer.

   This applet is still in the preliminary stages of
   development, but appears to be quite promising. A major
   application of this applet will be to allow users to run
   the system in debug mode and obtain results which they can
   use to benchmark their own algorithms.
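
   The submission flow described above (choose an
   experiment, set its parameters, submit, and have the job
   sent to one of several servers) might be sketched as
   follows. This is Python, not the applet's Java, and it
   is only an assumption-laden outline: the Job fields, the
   server names, the parameter name, and the round-robin
   policy are hypothetical, since the report does not say
   how jobs are scheduled.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Job:
    """A hypothetical experiment request; field names are illustrative."""
    experiment: str
    parameters: dict = field(default_factory=dict)
    notify_email: str = ""


def distribute(jobs, servers):
    """Assign jobs to servers in round-robin order.

    Returns a dict mapping each server name to its list of jobs.
    """
    if not servers:
        raise ValueError("at least one server is required")
    assignment = {server: [] for server in servers}
    for job, server in zip(jobs, cycle(servers)):
        assignment[server].append(job)
    return assignment


# Three made-up jobs spread across two made-up servers.
jobs = [Job("digits", {"beam_width": 100}),
        Job("digits", {"beam_width": 200}),
        Job("alphabet")]
plan = distribute(jobs, ["server_a", "server_b"])
```

   Round-robin is simply the easiest policy to show; a
   production facility would more likely weigh server load
   and job size.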

Contributions Beyond Science and Engineering:

All technology developed in this project is available via the web
and is in the public domain. Both industry and academia are free to
use this technology with no restrictions. We currently have several
industrial partners using the code, and have developed one
supporting toolkit that is in production use at a company.

Special Requirements for Annual Project Report:

Unobligated funds: less than 20 percent of current funds

Categories for which nothing is reported:
Special Reporting Requirements
Animal, Human Subjects, Biohazards

