LEXICAL PART OF SPEECH:
- Lexicon: an alphabetical arrangement of words and their definitions.
In speech recognition, the term often describes the list of
words the system is allowed to recognize.
- Lexical Part of Speech: A restricted inventory of word-type
categories which capture generalizations of word forms and
distributions ("dog" and "cat" are nouns and animals).
- Part of Speech (POS): the grammatical category of a word: noun, verb,
adjective, adverb, interjection, conjunction, determiner, preposition,
or pronoun.
- Proper Noun: names such as "Velcro" or "Spandex". Proper nouns pose
a very challenging problem for speech recognition because they do not
follow regular pronunciation rules (e.g., "Nguyen", "Sorbet").
- Open POS Categories:
  Tag    | Description  | Function           | Example
  N      | Noun         | Named entity       | cat
  V      | Verb         | Event or condition | forget
  Adj    | Adjective    | Descriptive        | yellow
  Adv    | Adverb       | Manner of action   | quickly
  Interj | Interjection | Reaction           | Oh!
- Closed POS Categories: categories on which there is some level of
universal agreement (e.g., conjunction, determiner, preposition).
- Penn Treebank: the LDC's Penn Treebank is one of the most ambitious
projects to date, in which large amounts of text have been annotated
with part-of-speech tags and syntactic structure.
- WordNet: Princeton's WordNet is another very important and ambitious
project, which aims to develop an online lexical reference system.
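
The relationship between a lexicon and the open and closed POS categories above can be illustrated with a small sketch. This is a toy example under stated assumptions, not how any particular recognizer stores its lexicon: the word list is invented for illustration, the closed-category tags (Conj, Det, Prep) and the names LEXICON, lookup, and tag_sentence are hypothetical, and each word is given exactly one tag even though real words are often ambiguous.

```python
# Toy lexicon: each allowable word maps to a single POS tag.
# Open-category tags follow the table above; the closed-category
# tags (Conj, Det, Prep) are hypothetical names chosen for this sketch.
OPEN_TAGS = {"N", "V", "Adj", "Adv", "Interj"}
CLOSED_TAGS = {"Conj", "Det", "Prep"}

# Illustrative entries only; a real lexicon would be far larger.
LEXICON = {
    "cat": "N",
    "dog": "N",
    "forget": "V",
    "yellow": "Adj",
    "quickly": "Adv",
    "oh": "Interj",
    "and": "Conj",
    "the": "Det",
    "on": "Prep",
}


def lookup(word):
    """Return (tag, "open" or "closed") for a word, or None if the
    word is not in the lexicon (out of vocabulary)."""
    tag = LEXICON.get(word.lower())
    if tag is None:
        return None
    category = "open" if tag in OPEN_TAGS else "closed"
    return (tag, category)


def tag_sentence(words):
    """Pair every word with its lookup result, keeping word order."""
    return [(word, lookup(word)) for word in words]


if __name__ == "__main__":
    for word, result in tag_sentence(["The", "yellow", "cat", "Nguyen"]):
        print(word, result)
```

In this sketch a proper noun such as "Nguyen" comes back as None because it is missing from the word list, which mirrors the out-of-vocabulary problem that proper nouns create for a fixed recognition lexicon.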