BravoBrava
Mississippi State University
• Information extraction is the analysis of natural language to collect information about specified types of entities
• As the focus shifts to providing enhanced annotations, WER may not be the most appropriate measure of performance (content-based scoring).
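For reference, WER is the word-level edit distance between a hypothesis and the reference transcript, normalized by reference length. A minimal sketch (the example sentences are hypothetical, not from the Hub-4 data):

```python
def wer(ref, hyp):
    """WER = (substitutions + insertions + deletions) / reference length."""
    r, h = ref.split(), hyp.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

# One deletion ("a") and one substitution ("suit" -> "suits") over 6 reference words.
print(wer("mr sears bought a new suit", "mr sears bought new suits"))  # 2/6 = 0.333...
```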
[Figure: F-Measure (0%–100%) vs. Word Error Rate (Hub-4 Eval'98)]
Evaluation Metrics Beyond WER: Named Entity
Recall = (# slots correctly filled) / (# slots filled in key)

Precision = (# slots correctly filled) / (# slots filled by system)

F-Measure = (2 × recall × precision) / (recall + precision)
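The slot-based scoring above can be sketched directly; the counts here are hypothetical, chosen only to illustrate the arithmetic:

```python
def f_measure(correct, in_key, by_system):
    """Compute recall, precision, and F-measure from slot counts."""
    recall = correct / in_key        # # slots correctly filled / # slots filled in key
    precision = correct / by_system  # # slots correctly filled / # slots filled by system
    f = 2 * recall * precision / (recall + precision)
    return recall, precision, f

# Hypothetical run: 80 slots correct, 100 in the key, 90 filled by the system.
r, p, f = f_measure(correct=80, in_key=100, by_system=90)
print(f"recall={r:.2f} precision={p:.2f} F={f:.3f}")
```

F-measure is the harmonic mean of recall and precision, so it rewards systems that balance the two rather than maximizing one at the other's expense.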
• An example of named entity annotation:

Mr. <en type="person">Sears</en> bought a new suit at <en type="org">Sears</en> in <en type="location">Washington</en> <time type="date">yesterday</time>
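A sketch of pulling typed spans out of markup like the example above, using a regular expression over the `<en>`/`<time>` tags shown (this pattern assumes simple, non-nested spans):

```python
import re

# The annotated sentence from the example above.
text = ('Mr. <en type="person">Sears</en> bought a new suit at '
        '<en type="org">Sears</en> in <en type="location">Washington</en> '
        '<time type="date">yesterday</time>')

# Capture the tag name, its type attribute, and the enclosed text;
# the backreference \1 requires the closing tag to match the opening one.
pattern = re.compile(r'<(en|time) type="([^"]+)">([^<]+)</\1>')
entities = [(m.group(2), m.group(3)) for m in pattern.finditer(text)]
print(entities)
# [('person', 'Sears'), ('org', 'Sears'), ('location', 'Washington'), ('date', 'yesterday')]
```

Note that the two occurrences of "Sears" receive different types (person vs. org), which is exactly the kind of distinction WER cannot measure.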
Evaluation Metrics: