AUTOMATED SCORING USING STRING EDITS AND DYNAMIC PROGRAMMING
To automatically score a hypothesis, we must first align
it with the reference text, and then count word errors
(substitutions, deletions, and insertions).
The desired output is shown below:
Input REF: CUT TALL SPRUCE TREES
Input HYP: HAUL MOOSE FOR FREE
Align REF: CUT TALL SPRUCE *** TREES
Align HYP: *** HAUL MOOSE  FOR FREE
< 3 Sub | 1 Ins | 1 Del | 0 Cor | 4 Ref Words >
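Once an alignment is available, the error counts can be read off column by column. The following is a minimal sketch, not a reference scorer: the function name count_errors and the use of "***" as the gap marker are assumptions taken from the display above. A gap in the reference row is counted as an insertion, a gap in the hypothesis row as a deletion, and any other non-matching pair as a substitution.

```python
def count_errors(ref_aligned, hyp_aligned, gap="***"):
    """Count word errors from two aligned, whitespace-separated rows.

    Hypothetical helper: assumes both rows have the same number of
    tokens and that `gap` marks an empty alignment slot.
    """
    ref_tokens = ref_aligned.split()
    hyp_tokens = hyp_aligned.split()
    if len(ref_tokens) != len(hyp_tokens):
        raise ValueError("aligned rows must have the same number of tokens")

    counts = {"sub": 0, "ins": 0, "del": 0, "cor": 0}
    for ref_word, hyp_word in zip(ref_tokens, hyp_tokens):
        if ref_word == gap:
            counts["ins"] += 1   # hypothesis has a word the reference lacks
        elif hyp_word == gap:
            counts["del"] += 1   # reference word is missing from the hypothesis
        elif ref_word == hyp_word:
            counts["cor"] += 1   # words agree
        else:
            counts["sub"] += 1   # words differ

    # Reference length excludes the gap markers.
    counts["ref_words"] = sum(1 for word in ref_tokens if word != gap)
    return counts


result = count_errors("CUT TALL SPRUCE *** TREES", "*** HAUL MOOSE FOR FREE")
print(result)
```

For the alignment shown above, this yields 3 substitutions, 1 insertion, 1 deletion, 0 correct words, and 4 reference words.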
This problem can be solved using dynamic programming
with a Levenshtein distance metric
(each non-matching pair adds one to the accumulated distance).
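The accumulated distance can be sketched in a few lines of code. This is an illustrative word-level implementation under stated assumptions: the name word_edit_distance is hypothetical, words are split on whitespace, and substitutions, deletions, and insertions each cost one. A real scoring tool may use different costs or tie-breaking rules.

```python
def word_edit_distance(ref, hyp):
    """Return (distance, grid) for a unit-cost word-level Levenshtein alignment.

    grid[i][j] holds the minimum number of edits needed to turn the
    first i reference words into the first j hypothesis words.
    """
    ref_words = ref.split()
    hyp_words = hyp.split()
    rows, cols = len(ref_words) + 1, len(hyp_words) + 1
    grid = [[0] * cols for _ in range(rows)]

    # Border cells: only deletions (first column) or insertions (first row).
    for i in range(rows):
        grid[i][0] = i
    for j in range(cols):
        grid[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            mismatch = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            grid[i][j] = min(
                grid[i - 1][j - 1] + mismatch,  # match or substitution
                grid[i - 1][j] + 1,             # deletion
                grid[i][j - 1] + 1,             # insertion
            )
    return grid[-1][-1], grid


distance, grid = word_edit_distance("CUT TALL SPRUCE TREES", "HAUL MOOSE FOR FREE")
print(distance)
for row in grid:
    print(row)
```

Under these unit costs the example pair has a distance of 4 (four substitutions), so a backtrace through this grid will not necessarily reproduce the five-edit alignment shown in the desired output; which alignment is reported depends on the costs and tie-breaking a scorer uses.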
We can demonstrate this using a DP grid: