What is in the training data? | |
---|---|
Unique Words | 15,127 |
Number of Word Tokens | 659,713 |
Number of Monosyllabic Words * | 529 |
Training tokens covered by top 200 Monosyllabic Words | 95% |
Word tokens covered by top 529 Monosyllabic Words | 75% |
Word tokens covered by top 200 Monosyllabic Words | 71% |
* Dependent on the alignment and lexicon
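
The coverage rows above can be reproduced from a tokenized corpus with a simple frequency count. The following is a minimal sketch, not the original procedure: the function name `top_k_coverage`, the toy sentence, and the toy monosyllabic set are all hypothetical, and in practice the set of monosyllabic words would come from the alignment and lexicon mentioned in the footnote.

```python
from collections import Counter


def top_k_coverage(tokens, candidates, k):
    """Fraction of all tokens accounted for by the k most frequent candidate words."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Rank only the candidate words (e.g. monosyllabic words) by token frequency;
    # ties are broken alphabetically so the result is deterministic.
    ranked = sorted(
        (w for w in counts if w in candidates),
        key=lambda w: (-counts[w], w),
    )
    covered = sum(counts[w] for w in ranked[:k])
    return covered / len(tokens)


# Toy data (hypothetical): 11 word tokens, one of which is not monosyllabic.
tokens = "the cat sat on the mat and the other cat ran".split()
monosyllabic = {"the", "cat", "sat", "on", "mat", "and", "ran"}

print(f"top 2:  {top_k_coverage(tokens, monosyllabic, 2):.2%}")
print(f"top 7:  {top_k_coverage(tokens, monosyllabic, 7):.2%}")
```

Here the denominator is the total number of word tokens, which matches the "Word tokens covered by ..." rows. Changing the denominator to the number of monosyllabic tokens would give a coverage figure relative to the monosyllabic subset only; the table does not state which denominator the 95% row uses, although 71% / 75% ≈ 0.95 would be consistent with that reading.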