WORD CLASSES: A STATISTICAL APPROACH
- Word Classes: Group words into classes of similar words
based on their usage in real text (clustering). Classes can be
derived automatically using statistical parsers.
- Typically more refined than POS tags (all words in a class
share the same POS tag). Classes are based on semantics (meaning).
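- Word classes are used extensively to smooth language model
probabilities: counts are pooled over a whole class, so rare or
unseen word sequences still receive usable estimates.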
- Examples:
  - {Monday, Tuesday, ..., weekends}
  - {great, big, vast, ..., gigantic}
  - {down, up, left, right, ..., sideways}
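
One common way word classes are used for smoothing is a class-based bigram model, which approximates P(w | w_prev) by P(c | c_prev) * P(w | c), where c is the class of w. The sketch below is only an illustration under stated assumptions: the class map, the toy corpus, and the function names are hypothetical, and a real system would derive the classes by clustering rather than by hand.

```python
from collections import Counter

# Hypothetical hand-made classes; real classes would come from clustering.
WORD_CLASS = {
    "Monday": "DAY", "Tuesday": "DAY", "Friday": "DAY",
    "on": "PREP", "meet": "VERB", "call": "VERB",
}

# Toy corpus (an assumption, for illustration only).
CORPUS = [
    ["meet", "on", "Monday"],
    ["call", "on", "Tuesday"],
    ["meet", "Friday"],
]


def train(corpus, word_class):
    """Collect the counts needed for P(c | c_prev) and P(w | c)."""
    class_bigrams = Counter()  # count(c_prev, c)
    history_counts = Counter() # count(c_prev) as a bigram history
    class_counts = Counter()   # count(c) over all tokens
    word_counts = Counter()    # count(w) over all tokens
    for sentence in corpus:
        classes = [word_class[w] for w in sentence]
        word_counts.update(sentence)
        class_counts.update(classes)
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            history_counts[prev] += 1
    return class_bigrams, history_counts, class_counts, word_counts


def class_bigram_prob(prev_word, word, model, word_class):
    """Estimate P(word | prev_word) as P(c | c_prev) * P(word | c)."""
    class_bigrams, history_counts, class_counts, word_counts = model
    c_prev, c = word_class[prev_word], word_class[word]
    if history_counts[c_prev] == 0 or class_counts[c] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c)] / history_counts[c_prev]
    p_word = word_counts[word] / class_counts[c]
    return p_class * p_word


model = train(CORPUS, WORD_CLASS)

# The word pair ("call", "Friday") never occurs in the corpus, so a plain
# maximum-likelihood word bigram would give it probability zero.
seen_as_words = any(
    (a, b) == ("call", "Friday")
    for s in CORPUS
    for a, b in zip(s, s[1:])
)
p = class_bigram_prob("call", "Friday", model, WORD_CLASS)
print(seen_as_words, p)
```

Because "meet Friday" shows that a VERB can be followed by a DAY, the unseen pair "call Friday" still gets a nonzero estimate. This is the pooling effect the smoothing bullet describes; the sketch does not handle words missing from the class map.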