EVALUATION PARADIGM FOR SURNAME GENERATION PROBLEM
- 18,494 names and 25,648 manually transcribed pronunciations in the database, written in Worldbet symbols
- Divide the database into train and test sets (3 cuts of train-test pairs)
- Context length is used as a feature to describe the sound
- Build different models using different context lengths
- Evaluate these models on the test data and compare their performance against the reference (manually transcribed pronunciations)
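
The steps above can be sketched as a small experiment loop. This is only an illustration under stated assumptions, not the system evaluated here: the lexicon is a toy with placeholder symbols (not real Worldbet transcriptions), each letter is assumed to align with exactly one symbol, the "model" is a simple most-frequent-symbol lookup per context window, and all function names (`make_cuts`, `windows`, `train`, `predict`, `accuracy`, `run_experiment`) are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Toy lexicon: name -> list of reference pronunciations (placeholder symbols).
# A name may have more than one reference, as in the real database.
TOY_LEXICON = {
    "ana": [["A", "N", "A"]],
    "nan": [["N", "A", "N"]],
    "ann": [["A", "N", "N"], ["E", "N", "N"]],
    "nana": [["N", "A", "N", "A"]],
    "anna": [["A", "N", "N", "A"]],
}

def make_cuts(names, n_cuts=3, test_fraction=0.2, seed=0):
    """Return n_cuts (train, test) pairs, each from a fresh shuffle."""
    rng = random.Random(seed)
    cuts = []
    for _ in range(n_cuts):
        shuffled = list(names)
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        cuts.append((shuffled[n_test:], shuffled[:n_test]))
    return cuts

def windows(name, k):
    """Yield each letter together with k letters of context on each side."""
    padded = "#" * k + name + "#" * k
    for i in range(len(name)):
        yield padded[i:i + 2 * k + 1]

def train(train_names, lexicon, k):
    """Map each context window to its most frequent symbol in training.
    Assumes one symbol per letter (a simplification for this sketch)."""
    counts = defaultdict(Counter)
    for name in train_names:
        for pron in lexicon[name]:
            for window, symbol in zip(windows(name, k), pron):
                counts[window][symbol] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def predict(model, name, k, unknown="?"):
    """Predict one symbol per letter; unseen windows give `unknown`."""
    return [model.get(w, unknown) for w in windows(name, k)]

def accuracy(model, test_names, lexicon, k):
    """Fraction of test names whose prediction matches any reference."""
    hits = sum(predict(model, n, k) in lexicon[n] for n in test_names)
    return hits / len(test_names)

def run_experiment(lexicon, context_lengths=(0, 1, 2)):
    """Mean accuracy over the 3 cuts for each context length."""
    cuts = make_cuts(sorted(lexicon))
    results = {}
    for k in context_lengths:
        scores = [accuracy(train(tr, lexicon, k), te, lexicon, k)
                  for tr, te in cuts]
        results[k] = sum(scores) / len(scores)
    return results

for k, score in run_experiment(TOY_LEXICON).items():
    print(f"context length {k}: mean accuracy {score:.2f}")
```

Scoring a name as correct when the prediction matches any of its reference transcriptions is one possible choice, suggested by the database having more pronunciations than names; a per-symbol error rate would be an equally reasonable alternative.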