1. INTRODUCTION
   One page of general intro
   1.1 Motivation - why was SWB collected?
   1.2 Technical Motivation - why do we need transcribed data? Why is this important in the recognition process?
   1.3 Summary of the first release of SWB:
       1.3.1 Experimental Conditions
       1.3.2 Speaker statistics
       1.3.3 CD distribution by LDC
       1.3.4 Changes to the original distributions
       1.3.5 Audio format conversion (include a note saying that we cannot distribute audio files)
       1.3.6 Echo Cancellation
       1.3.7 Extraction of transcription files
   1.4 Technology summary (Recognition performance history)
   1.5 Organization of thesis

2. SEGMENTATION AND TRANSCRIPTION (written from the point o
   2.1 Original Guidelines
   2.2 Changes in Guidelines (this will include handling of non-speech events)
       *** need to explain why - consult the FAQ ***
   2.3 Workflow process
   2.4 Segmenter Tool
   2.5 SWB FAQ - review all the interesting problems

3. HUMAN TRANSCRIPTION PERFORMANCE (write this chapter from the point of view of documenting how well humans do on this task)
   3.1 The QC process flow
   3.2 QC scripts (this will be a large section consisting of a description of all the scripts we use)
   3.3 Cross validation
   3.4 Data distribution via the Internet
   3.5 Lexicon Development (will have subsections covering alternate pronunciations, laughter, partial words, and a comparison with M-W)

4. EXPERIMENTS AND RESULTS
   4.1 N-gram statistical analysis
   4.2 Switched speaker statistics
   4.3 OOV analysis
   4.4 Disfluency effect statistics (non-speech tags, relationships between channels 0 and 1)

5. SUMMARY AND FUTURE WORK
   5.1 Conclusions
   5.2 What we learned from this experience
   5.3 Future work