[Project Reporting]
ANNUAL REPORT FOR AWARD # 9809300
Joseph Picone; Mississippi State Univ
CARE: Internet-Accessible Speech Recognition Technology

Participant Individuals:
Graduate student(s): Aravind Ganapathiraju; Xinping Zhang; Yufeng Wu; Vishwanath Mantha; Richard Duncan
Undergraduate student(s): Robert Brown; Lorena Rogers; Clayton Graff
High school student(s): Tao Weilundemo
Senior personnel(s): William C Chapman
Graduate student(s): Jonathan Hamaker; Shivali Srivastava
Undergraduate student(s): Issac Alphonso; Cedric D George; Joey D Foote; Mahnas J Mohammadi-Aragh; Antonio M Robinson; Rick P King; David B Kay; David J Laundre; Jennifer P Vogel; Jason Rogers
Graduate student(s): Naveen Parihar; Ram Sundaram; Kaihua Haung
Undergraduate student(s): Robert Blackwood; Erich Deitenbeck; Joseph Langley; Katy Muir; Kenneth Poteete; Jason Wallace; Byron Williams; Troy Lindsey

Partner Organizations:
Department of Defense: Collaborative Research
The Department of Defense has been a consistent consumer of the technology developed in this project, and regularly gives us feedback on design issues. We support them as an alpha test site since they are typically the first to use our system. Though we have separate funding from DoD, their impact on the production system has been important.

Other collaborators:
We maintain several web-based resources geared towards collaboration. These include a mailing list for project-related announcements that has grown to over 170 participants. These users provide feedback on design questions in addition to bug reports and other support requests that arise from use of the software. Our annual workshops also train approximately 36 sites per year on our system and promote interactions and collaborations.
Activities and findings:

Research and Education Activities:
======================================================================
08/15/00 - 08/14/01: RESEARCH AND EDUCATIONAL ACTIVITIES

In the third year of this project, we focused our efforts in two core areas:
· Java Applets: overhauled our Java interfaces to use servlet technology, and launched two new applications: feature extraction and recognition.
· Production System Release: completed two alpha releases of a new version of the production system that greatly enhance its flexibility and functionality. Extended the interface to support more diverse file formats (both input and output). Integrated our front end application building software.

We also continued activities in the areas central to the overall research program:
· Foundation classes: added many algorithms at both the math and signal processing layers of the system. Introduced classes to handle general statistical pattern recognition. Revamped many of the underlying classes to make better use of templates and templatized functions.
· Workshops: hosted a software design review in January 2001, and two one-week training workshops held in May ('00 and '01). An extensive collection of on-line resources related to these workshops is publicly available.
· Software engineering: upgraded our software development process to use a new problem-tracking tool that was written specifically to deal with bug life-cycle issues. Streamlined our support activities and improved the ease of use of our distribution and verification procedures.

The workshops continue to be extremely successful: demand has far surpassed our original estimates for enrollment (and taxed our facilities). We have seen a dramatic increase in the number of commercial users of the recognition system. In fact, some of our most active users are now commercial users. On-line support has improved, but remains a challenge given the wide range of experience levels among our users.
======================================================================
08/15/99 - 08/14/00: RESEARCH AND EDUCATIONAL ACTIVITIES

In the second year of this project, we focused our efforts in five major areas:
· Production System Release: the first release of the production speech recognition system, based on our modular libraries, is scheduled for July 1.
· Hosted two workshops: a software design review held in January 2000, and a one-week training workshop held in May 2000.
· Software engineering: upgraded our distribution to use the autoconf facility, added an automated report tracking system to our on-line support, created a multi-platform support facility.
· Foundation classes: added algorithms and other signal processing building blocks, introduced classes to handle acoustic models, search algorithms, and knowledge sources, and released a front-end that allows arbitrary algorithms to be implemented using a graphical user interface.
· Java Applets: enhanced our pattern recognition applet with several important new features, including generation of arbitrary data sets, clustering, and visualization of decision surfaces.

The workshops appear to be extremely successful as demand has far surpassed our original estimates for enrollment (and taxed our facilities). The number of serious users of the recognition system is continually growing. It is becoming a challenge to provide same-day response to most support requests, particularly given the wide range of experience levels from the users.

See the attached pdf file for the entire report.
======================================================================
08/15/98 - 08/14/99: RESEARCH AND EDUCATIONAL ACTIVITIES

In the first year of this project, we focused our efforts in three major areas:
· Core Technology: extensions of the speech recognition system required to enhance its appeal to our customer base (driven by customer feedback);
· Foundation Classes: building blocks such as vectors, matrices, and data structures that simplify and standardize the development of higher-level classes;
· Web-Based Information: a comprehensive and informative web site that constitutes a central point of contact for everything related to the project.

We have seen interest in the project grow, as evidenced by the fact that our mailing list has grown to 150 participants, and we have received several serious inquiries about collaborations based on our system (one of which resulted in participation in a joint NSF/EU proposal [1]). Major milestones for the first year of the project included the release of a fully functional speech recognition system (including feature extraction and training), and the development of a remote job submission capability that lets users submit jobs to our system over the Internet.

See the attached pdf file for the entire report.

Findings:
======================================================================
08/15/00 - 08/14/01: MAJOR FINDINGS

In the third year of this project, we continued conducting our annual workshops. In January 2001, we hosted 14 visitors for a software design review. In May 2001, we hosted 24 participants for our one-week training workshop. Several collaborations resulted from our previous workshops, and it appears the same will be true this year as well. A promising trend this year has been a significantly increased level of interest by commercial users in the software and technology.

A new problem-tracking tool was introduced that has greatly impacted our software design process.
This tool, called Varmint, is publicly available and is part of our public domain software distribution. Varmint allows us to track the life-cycle of a bug, and is particularly useful in a multi-programmer environment in which several people may touch a bug during its life-cycle. Varmint is styled after several commercial tools, and was designed based on several common models of bug tracking used by information technology professionals. The most significant benefit of using this tool is that programmers are explicitly conscious of the bugs for which they are responsible, and have a clear understanding of the priority of these bugs with respect to the current software release schedule. This creates the proper atmosphere of accountability required to make sure releases are clean before they are made. We also released a version of our job submission applet that uses Java servlets. While the servlet programming environment is still rapidly evolving, servlets have been instrumental in allowing our interfaces to be sufficiently powerful. For example, users can now browse our filesystems or their own filesystems using the same interface, and can collect a diverse set of files for processing. These features allow users to benchmark their local implementations against our reference implementations. One windfall from this has been a reduction in support requests on trivial issues such as file formats because users can now debug this themselves using the applet. We have continued to release a large amount of supporting materials on our web site that are relevant to our mission of providing comprehensive conversational speech recognition tools. We now deliver high-quality transcriptions for 500 hours of Switchboard conversational speech data, and offer acoustic models trained on this data. ISIP resources are frequently referenced at major speech recognition forums such as the Department of Defense's Speech Transcription Workshop (also known as the Hub 5E workshop). 
This year, we added phonetically transcribed data originally developed at the International Computer Science Institute at the University of California, Berkeley. This data, along with our word-level transcriptions, is provided in a format that makes it easy to use for investigations into phonetic-level performance of speech recognition systems. This data is being used by several students in their dissertation research.

Our on-line support and tutorial materials continue to grow as we collect more feedback from our users. Our support line averages 5 support requests per day, about the same level of activity we experienced in the previous year of this project. Typically one of these requests will be nontrivial and require a more measured response. We estimate that we spend approximately 20 hours per week performing direct support of the software through the on-line help service. The user base for our software appears to have stabilized at about 175 users.

In the past year we have also begun to graduate our first thesis-option students who contributed to this project. Thesis topics have ranged from advanced pattern recognition techniques (Support Vector Machines) to experimental topics such as the influence of transcription errors on performance. These theses are the first work to make extensive use of the ISIP tools.

======================================================================
08/15/99 - 08/14/00: MAJOR FINDINGS

In the second year of this project, we introduced two pivotal activities in the project: annual workshops and a rigorous software distribution process.

The workshop activities are proceeding smoothly. In January 2000, we hosted nine visitors from several foreign countries (China, Finland), government agencies (FBI, DoD), and industrial sites (IBM, MITRE, Lincoln Labs).
We reviewed the goals of the research program and the architecture of the system, provided several demonstrations, and collected feedback on features and capabilities needed in future versions of the system. Several collaborations resulted from this meeting, including an audio indexing opportunity with Georgia Tech, and invited talks at IBM. More collaborations are planned.

As of this writing, we are completing plans for the May 2000 workshop, which will include 25 participants, of which 20 are graduate students. Demand for the workshop was strong: we tripled the number of participants over what was originally budgeted, and still turned away approximately 10 potential participants due to space and resource limitations.

One surprise in the second year of the project has been the growing importance of supporting the Linux operating system, and the difficulties in doing so. Through experience, we have learned that developing with the same compiler and tools (GNU gcc, make, etc.) on a different flavor of Unix (Sun Solaris x86) is not enough to guarantee robust releases for Linux users. Hence, we spent some time enhancing our ability to achieve platform independence across a wide range of Unix platforms. We now routinely run our releases through a suite of systems before actually making the release.

We have also encountered a large demand for Windows-based ports of our system. Currently, we do this through the use of a Unix-like shell available under Windows (Cygnus' cygwin tools). This has also increased the support overhead in making releases of our system. Despite the demand for a native Windows port, we are not assigning this a high priority until the core system is stable. The primary reason is the lack of standardization of C++ compilers and development environments, which makes it hard to support both environments simultaneously while the software base is changing rapidly.
With the addition of a full-time staff person, it has been possible to expand our on-line support activities. We now handle approximately 5 serious support requests per day. Many support requests involve guiding inexperienced users through basic computing issues, and we are still struggling with how to deal with these in a timely manner. The remainder involve unexpected program crashes that require extensive diagnosis. We have implemented an automated problem tracking system to make sure such requests are properly tracked. We also provide an ability for users to upload their data to us, so that we can replicate their problems. The majority of serious problems seem to be compiler-dependent (for example, a bug that did not show up on one version of an operating system, but is fatal on a different architecture with a different compiler). This also exposed the critical need for an in-house multi-platform evaluation facility.

Last but not least, we have made extensive progress enhancing the basic functionality of the system. We have developed a generalized hierarchical search engine that is the first system of its type. We provide an ability to decode speech using either networks or N-gram language models. Users can supply either type of model, or both, at any level of the system. For example, it is possible to constrain the search using N-grams of parts of speech, as well as N-grams of words. We have also developed a front-end that allows users to configure the signal processing portions of the system without writing any code: algorithms can be specified using a GUI-oriented tool that automatically schedules the required operations.
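
The idea of constraining a hypothesis with N-grams at two levels (words and parts of speech) can be illustrated with a minimal C++ sketch. This is not the ISIP decoder's code or API: the names BigramTable, bigramLogProb, and combinedLogProb, the floor probability used for unseen bigrams, and the weighted-sum combination are all assumptions made purely for illustration, and only bigrams (N = 2) are shown.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bigram table: (previous token, current token) -> probability.
using BigramTable = std::map<std::pair<std::string, std::string>, double>;

// Sum of log bigram probabilities over a token sequence.
// Unseen bigrams get a small floor probability; this is an assumption
// that stands in for a real smoothing or back-off scheme.
double bigramLogProb(const std::vector<std::string>& tokens,
                     const BigramTable& table,
                     double floor_prob = 1e-6) {
    double log_prob = 0.0;
    for (std::size_t i = 1; i < tokens.size(); ++i) {
        auto it = table.find(std::make_pair(tokens[i - 1], tokens[i]));
        double p = (it != table.end()) ? it->second : floor_prob;
        log_prob += std::log(p);
    }
    return log_prob;
}

// Score a hypothesis at two levels: the word sequence and its
// part-of-speech tag sequence. The tag-level score is scaled by
// tag_weight (an illustrative parameter) and added to the word-level score.
double combinedLogProb(const std::vector<std::string>& words,
                       const BigramTable& word_table,
                       const std::vector<std::string>& tags,
                       const BigramTable& tag_table,
                       double tag_weight = 1.0) {
    return bigramLogProb(words, word_table) +
           tag_weight * bigramLogProb(tags, tag_table);
}
```

A caller would build one table per level, score each candidate word sequence together with its tag sequence, and prefer the candidate with the higher combined log probability. A real decoder applies such scores incrementally during search rather than to complete sequences.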
======================================================================
08/15/98 - 08/14/99: MAJOR FINDINGS

Since the main component of this project is the development and dissemination of speech recognition technology, we did not expect to generate a significant list of technology-related research accomplishments in the first year of the project. Nevertheless, we have begun some interesting research as peripheral activities. These research topics include the use of Support Vector Machines for improved acoustic modeling, the study of the influence of context-sensitive word duration models on conversational speech recognition performance (a step in the direction of introducing prosodics into the speech recognition problem), and implementation of a new segmental Baum-Welch training algorithm (preliminary results for these approaches look promising; detailed results should be available by December 1999). The fact that such research can be easily performed with our system supports our contention that the system is extensible.

With respect to the core technology component of the program, we believe we have delivered a decoder that is extremely efficient for conversational speech recognition, and is competitive with the state of the art. Decoding time and memory requirements are within the reach of standard PC-class computers. This is important in the context of this program because it will increase access to this technology by allowing smaller research labs to use the system with fairly modest computing environments. To move to larger domains than conversational speech, such as broadcast news, we have developed a dynamic language modeling capability that caches large language models to decrease physical memory requirements.

We have also demonstrated that porting of the system to any gcc-compliant platform is fairly easy. The only outstanding issue is wide character support (Unicode) under Linux.
Once Linux compilers catch up (expected in Fall'99), our cross-platform support problems should be minimal.

Foundation class development has proceeded using a model similar to Java, but adapted to the demands of speech research. We have found it extremely useful to abstract the user from the details of the operating system through the use of our system classes. These handle all low-level interactions with the operating system, and centralize many tasks such as memory management, file management and I/O. The next level above the system classes, the math classes, provides the user with basic data type building blocks. Here we have followed an STL model, and have demonstrated that a mixture of templates and fixed classes is an effective compromise between the needs of low-level programmers to see physical data types (such as short integers) and the needs of high-level programmers to be able to build generic math objects (for example, a matrix of signals). Templates have only become practical with recent releases of C++ compilers.

Web-based dissemination of project information has proven to be a mixed bag. Unfortunately, a significant percentage of people interested in our technology and resources appear to still have limited Internet bandwidth and access. Hence, the demand for small distributions that can be downloaded via slow modems still exists. This severely limits what we are able to accomplish in the way of on-line documentation, interactive applets, and distribution of toolkits including enough data to run a reasonable experiment. Our anonymous CVS server has been very useful in that it allows users to download only the pieces of the code that have changed, thereby reducing the amount of data one needs to download to remain current.

The remote job submission facility, though unique and impressive, is not receiving the traffic we had initially expected. Users still seem to prefer to download the package and build the demos on their local machines.
We hope to improve the visibility of this facility by enhancing and streamlining the user interface in the next year of this project.

Training and Development:
================================================================
08/15/00 to 08/14/01:
- Workshop: Students attending our one-week summer workshop receive 5 half-day lectures on theory and implementation details as described below. Participants in our two-day workshop held annually in January receive a half-day of training in which they learn how to run large-scale systems.
- Software: Both undergraduates and graduate students learn state-of-the-art skills in large-scale software engineering. For example, they learn to develop software using a configuration management tool, and to prioritize problem solving using a problem-tracking tool. Such real-world experience makes them invaluable to leading edge software companies.

================================================================
08/15/99 to 08/14/00:
- Workshop: Students attending our one-week summer workshop receive 5 half-day lectures on theory and implementation details, and then spend afternoons in the lab acquiring hands-on skills. Students are led through some lab exercises by a senior graduate student, and then supervised by lab instructors on more open-ended assignments. They work from individual workstations, and can see the lab instructor's work on a projection system at the front of the class. This format has been very successful for training students on our software.

================================================================
08/15/98 to 08/14/99:
Students working on this project develop skills in four major areas:
- core technology: speech recognition
  In the first year of this project, students have been exposed to all major components of a speech recognition system: search algorithms, acoustic modeling, training, and lexicon development.
- software engineering: concurrent software development
  Our students are trained to use a state-of-the-art concurrent development package (CVS). We make heavier use of this than most, and students quickly learn to grapple with the realities of merging code from several developers. Students are also exposed to a strict procedure for software design and development that includes design reviews, code reviews, diagnostic testing, memory and format checking, multiple platform portability testing, documentation, and release.
- programming languages: C++, perl, and Tk/Tcl
  Since our primary programming language is C++, students are quickly trained to be experts in C++. While we emphasize that code be clean, simple, and easy to read, students nevertheless learn about subtle issues of the language, including memory management and compiler optimization. Students also perform some of their work in perl and Tk/Tcl. Tk/Tcl is particularly useful for developing high performance GUI-oriented applications. Perl is used primarily to massage data into our standard formats.
- web programming languages: HTML and Java
  All students must deliver documentation and presentations in HTML, and hence quickly become proficient web programmers. A significant portion of our project involves the development of Java applets. Our undergraduates, in particular, get a heavy exposure to Java.

Outreach Activities:
================================================================
08/15/00 to 08/14/01:
In addition to the workshops and our active recruitment of undergraduates from underrepresented groups described below, we also participate in recruitment efforts for local area students. We recently participated in several meetings with local area high school students intended to increase their interest in engineering undergraduate education. Our public domain software project attracted significant interest.
We also maintain an active presence on the web and have advised numerous students about speech technology-related projects. One of the most memorable involved discussions with a sixth grader interested in doing a science fair project on speech recognition.

================================================================
08/15/99 to 08/14/00:
- Workshops: As mentioned elsewhere, the two workshops we host annually are a major component of our outreach activities. Based on our attendance lists, we seem to be achieving our goal of attracting people new to the details of speech recognition technology.
- Undergraduates: We have been successful at recruiting more students from underrepresented groups to work in entry-level capacities as undergraduate hourly workers.

================================================================
08/15/98 to 08/14/99:
We normally invite high school students to participate in our research as part of a special program created with a local math and sciences school in our area. In the first year of this project, we had one high school student spend a semester programming a Java application. He was an outstanding student who produced useful code during his tenure in our group, one of our better experiences with high school students.

Journal Publications:
N. Deshmukh, A. Ganapathiraju, and J. Picone, "Hierarchical Search for Large Vocabulary Conversational Speech Recognition", IEEE Signal Processing Magazine, vol. 16, (1999), p. 84.
Published
J. Picone, J. Hamaker, R. Brown, R.A. Cole and J.H.L. Hansen, "Modern DSP: The Story of Three Greek Philosophers", IEEE Signal Processing Magazine, vol. 16, (1999), p. 48.
Published
R. Duncan, "Requirements Engineering in Extreme Programming", Crosstalk: The Journal of Defense Software Engineering, vol. 1, (2001), p. 1.
Published

Book(s) or other one-time publication(s):
Vishwanath Mantha, "A New Look at the SWITCHBOARD Corpus", Mississippi State University, (2000).
Thesis Submitted
Aravind Ganapathiraju, "Support Vector Machines for Speech Recognition", Mississippi State University.
Thesis Submitted
J. Picone, C. Atkeson and I. Alphonso, "Harnessing High Bandwidth: Applications in Speech Recognition", presented at the Spring 2000 Internet2 Member Meeting, Washington, DC, USA, March 2000.
Conference Presentation Published
P.J. Price and J. Picone, "Automatic Speech Recognition: Better Than Text?", presented at the AAAS Annual Meeting and Science Innovation Exposition, Washington, D.C., USA, February 2000.
Conference Presentation Published
G. Doddington, A. Ganapathiraju, J. Picone and Y. Wu, "Adding Word Duration Information to Bigram Language Models", presented at the IEEE Automatic Speech Recognition and Understanding Workshop, Keystone, Colorado, USA, December 1999.
Conference Publication Published
N. Deshmukh, A. Ganapathiraju, J. Hamaker, J. Picone and M. Ordowski, "A Public Domain Speech-to-Text System", Proceedings of the 6th European Conference on Speech Communication and Technology, vol. 5, pp. 2127-2130, Budapest, Hungary, September 1999.
Conference Paper Published
A. Ganapathiraju, J. Hamaker and J. Picone, "A Hybrid ASR System Using Support Vector Machines", Proceedings of the International Conference on Spoken Language Processing, (2000).
Published; Collection: "International Conference on Spoken Language Processing"
J. Picone, R. Duncan, and J. Hamaker, "Internet-Accessible Speech Recognition Technology", submitted to the O'Reilly Open Source Convention, San Diego, California, USA, July 2001.
Conference Accepted; Collection: "O'Reilly Open Source Convention"

Other Specific Products:
Software (or netware)
Object-oriented speech recognition software built from a hierarchy of general purpose modules including math, data structures, and signal processing. Makes extensive use of C++ and templates. The software has been placed in the public domain and can be downloaded from our web site.
Teaching aids
Java applets that teach fundamentals of signal processing and pattern recognition. The software has been placed in the public domain and can be run or downloaded from our web site.

Internet Dissemination:
http://www.isip.msstate.edu/projects/speech
This highly interactive web site has been developed as part of the project to disseminate all information about the project. It contains software, publications, Java applets, and even a remote job submission testbed.

Contributions:
Contributions within Discipline:
================================================================
08/15/00 to 08/14/01:
Our contributions in the third year of this project primarily impact the fields of speech recognition, human language technology, and digital signal processing. Our major accomplishments are as follows:
- Java Applets:
  -> enhanced our job submission applet to use new Java servlet technology. Added applications that perform feature analysis and speech recognition decoding.
- Foundation Class Enhancements:
  -> released a major overhaul of the system that makes more extensive use of templates and templatized functions; significantly impacted code size and run-time efficiency.
- Production System Release:
  -> two alpha releases of a new version of the production speech recognition system based on our modular libraries.
- Hosted two workshops:
  -> a software design review held in January 2001 that included a one-day training session
  -> a one-week training workshop held in May 2001 that included 24 participants representing 12 universities and 10 companies.
     Twelve of the participants were graduate students.
- Software engineering:
  -> overhauled our software engineering process to make better use of problem-tracking and configuration management.
- Foundation classes:
  -> released a new version of our application building tool that makes building acoustic front ends very simple and intuitive.
- On-Line Support:
  -> we are averaging approximately 5 serious support requests per day, and support a user group that has grown to over 170 participants. Recent enhancements to our process have allowed us to significantly streamline the software support process.

================================================================
08/15/99 to 08/14/00:
Our contributions in the second year of this project primarily impact the fields of speech recognition, human language technology, and digital signal processing. Our major accomplishments are as follows:
- Production System Release:
  -> first release of the production speech recognition system based on our modular libraries is scheduled for July 1.
- Hosted two workshops:
  -> a software design review held in January 2000 that included a one-day training session
  -> a one-week training workshop to be held in May 2000 that will include 25 participants representing 6 countries, 17 universities, one government agency and one company. Twenty of the participants are graduate students.
- Software engineering:
  -> added an automated report tracking system to our on-line support
  -> upgraded our distribution to use the autoconf facility: a standard way of distributing Unix software in which the distribution automatically configures itself.
  -> enhanced our ability to do multi-platform testing (we now benchmark our releases on Sun Sparc, Sun Solaris x86, Linux, and Windows before making a release), and bug detection (we make extensive use of professional-strength debugging tools).
- Foundation classes:
  -> refined the existing core mathematics classes
  -> added data structures, algorithms, and other signal processing building blocks.
  -> introduced classes to handle acoustic models, search algorithms, and knowledge sources.
  -> released a production front-end that allows arbitrary algorithms to be implemented using a graphical user interface
- Java Applets:
  -> enhanced our pattern recognition applet with several important new features, including generation of arbitrary data sets, clustering, and visualization of decision surfaces.
- On-Line Support:
  -> we are averaging approximately 5 serious support requests per day, and support a user group that has grown to over 170 participants. Support is definitely becoming a time-consuming issue.

================================================================
08/15/98 to 08/14/99:
Our contributions in the first year of this project primarily impact the fields of speech recognition, human language technology, and digital signal processing. Our major accomplishments are as follows:
- Development of ISIP's Foundation Classes (IFCs)
- Creation of a Comprehensive Web Site
- Java Applets
- Remote Job Submission Facility
- Human Resources and Outreach
These are described in detail in various sections below.

- Development of ISIP's Foundation Classes (IFCs)
The foundation classes include general mathematics (scalars, vectors, matrices), data structures (linked lists, binary trees) and other useful abstractions (command line parsing, database management). We have completed implementation of the math classes. The abstractions we use for these build upon ideas promoted in the ANSI C++ standard template library, but also add important features required for speech recognition research and technology development, such as explicit control of the data size of an integral type. Several interesting software engineering practices were implemented, including internal diagnostics that automatically test a class.
For example, by simply typing 'make diagnose', a test program is generated for a class, which can be run, debugged, checked for memory leaks, etc. This is proving to be an invaluable tool for guaranteeing the quality of the code.

- Creation of a Comprehensive Web Site

The entire project can be viewed from a web site created to support this project. The URL is:

http://www.isip.msstate.edu/projects/speech

This site includes a place to download software, educational information such as tutorials, technical reports, and application toolkits, some Java applets demonstrating core concepts, and a remote job submission facility described below.

We have implemented a facility to manage and distribute our software using a package called Concurrent Versions System (CVS). This allows users to download our production code via an anonymous CVS server (similar to ftp) and keep their copy up to date as revisions are made. CVS is generally considered to be state-of-the-art in software management.

We have also implemented web pages that maintain an archive of the mailing lists used for the project. These archives are located at http://www.isip.msstate.edu/data/mailing_lists and are automatically updated daily.

Contributions to Other Disciplines:

Our new Java servlet technology places our job submission applet on the leading edge of Java technology, and provides an extremely powerful paradigm for remote job submission applications.

The software being developed within the foundation classes is intended to be a general purpose testbed for signal processing applications beyond speech recognition and human language engineering. The Java applets are of general educational use for undergraduate engineering.

Contributions to Education and Human Resources:

================================================================
08/15/00 to 08/14/01:

Our workshops continue to be an excellent example of the outreach activities in this project.
This year, we had 12 participants from companies interested in making commercial use of this technology. Some of these companies are relatively new to this field and greatly appreciated the open access such a workshop provides.

We also continue to promote participation of undergraduate students in our research project. Undergraduates have made significant software contributions to this project, and have leveraged this experience into significant job opportunities with leading companies.

================================================================
08/15/99 to 08/14/00:

Our workshops are an excellent example of the outreach activities in this project. Most of the participants represent schools not prominent in the field of speech recognition. We have at least one underrepresented university participating in our May 2000 workshop. Demand for the workshop was so great that we increased the size from 8 participants (originally budgeted) to 25 participants in the first year of the summer workshop.

We also significantly improved participation of undergraduate students from underrepresented groups in our research project in the second year of the program.

================================================================
08/15/98 to 08/14/99:

This grant has directly supported the equivalent of four full-time graduate students and several undergraduates. Undergraduate students have made major contributions in web site development, Java programming, and speech system tool development.

We have also interacted with one high school student in this program. This student developed the first version of a Swing-based Java applet that is an enhancement of an existing applet. He graduated in Spring '99 and is pursuing a degree in computer science.

Contributions to Resources for Science and Technology:

================================================================
08/15/00 to 08/14/01:

- Workshops:
We continue to develop extensive on-line documentation for the workshops we host.
All presentation materials are available from the web. This year we added self-contained laboratory modules that are downloadable from the web.

- Tutorials:
Now that our software base is stabilizing, we have begun developing extensive on-line educational material related to the system. We are overhauling our web site to reflect a more contemporary look and feel for on-line training (using a so-called trailhead approach).

================================================================
08/15/99 to 08/14/00:

- Workshops:
We have developed extensive on-line documentation for the two workshops we host. All presentation materials are available from the web. Related coursework (such as an updated set of notes for a speech recognition course) is also available from the web.

- Tutorials:
Now that our software base is stabilizing, we have begun developing extensive turnkey scripts that run canned experiments on important applications (such as conversational speech recognition). These scripts are extremely important in that they show users how to implement subtle details of the technology.

================================================================
08/15/98 to 08/14/99:

- Java Applets

We have upgraded our existing set of Java signal processing applets to a new interface available in Java called Swing. This is the latest attempt by Java developers to provide a standard high-level interface to applications programmers. The interface we used previously is being phased out, so this step was necessary.

We also introduced two new applets. The first applet teaches users about digital filter design. This applet also served as our initial testbed for Swing. The second applet demonstrates pattern classification. Users can create data sets and classify them using a host of classifiers. This applet is still under development.
- Remote Job Submission Facility

One development we are most proud of, and somewhat ahead of schedule on, is an applet that allows users to submit speech recognition experiments over the Internet. This applet is central to our vision of Internet-based educational resources. Users can choose an experiment, configure various parameters related to the experiment, and submit the job. The job is distributed to our bank of servers, and results are returned via the web and/or email. Users can supply their own audio data to the recognizer.

This applet is still in the preliminary stages of development, but appears to be quite promising. A major application of this applet will be to allow users to run the system in debug mode and obtain results that they can use to benchmark their own algorithms.

Contributions Beyond Science and Engineering:

All technology developed in this project is available via the web and is in the public domain. Industry as well as academia are free to use this technology with no restrictions. We currently have several industrial partners using the code, and have developed one supporting toolkit that is in production use at a company.

Special Requirements for Annual Project Report:

Unobligated funds: less than 20 percent of current funds

Categories for which nothing is reported:
Special Reporting Requirements
Animal, Human Subjects, Biohazards