| What is in the training data? | |
|---|---|
| Unique Words | 15,127 |
| Number of Word Tokens | 659,713 |
| Number of Monosyllabic Words * | 529 |
| Training tokens covered by top 200 Monosyllabic Words | 95% |
| Word tokens covered by top 529 Monosyllabic Words | 75% |
| Word tokens covered by top 200 Monosyllabic Words | 71% |

* The count depends on the alignment and lexicon used.
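
The coverage rows in the table are ratios of the form "tokens belonging to the top N words, divided by all tokens". A minimal sketch of that arithmetic follows, assuming the corpus is available as a flat list of word tokens and the monosyllabic words as a set. The function name `coverage_by_top_words` and the toy data are hypothetical and are not taken from the original corpus, lexicon, or alignment.

```python
from collections import Counter


def coverage_by_top_words(tokens, candidate_words, top_n):
    """Return the fraction of all tokens covered by the top_n most
    frequent words that also appear in candidate_words."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Keep only candidate words (e.g. monosyllabic ones), most frequent first.
    ranked = sorted(
        (word for word in counts if word in candidate_words),
        key=lambda word: counts[word],
        reverse=True,
    )
    covered = sum(counts[word] for word in ranked[:top_n])
    return covered / len(tokens)


# Toy data for illustration only (not the real training set).
tokens = "the cat sat on the mat the other day".split()
monosyllabic = {"the", "cat", "sat", "on", "mat", "day"}

top2 = coverage_by_top_words(tokens, monosyllabic, 2)
all_mono = coverage_by_top_words(tokens, monosyllabic, len(monosyllabic))
print(f"top 2: {top2:.2%}, all monosyllabic: {all_mono:.2%}")
```

With real data, `top_n=200` and `top_n=529` would correspond to the two "Word tokens covered by" rows, provided the token list and the monosyllabic set are built from the same alignment and lexicon as the table.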