INFORMATION RETRIEVAL
Goals and Issues of IR
- Given a query, find documents that help answer it
- IR is not question answering
- Helps summarize documents
- Links related documents
- Multi- and cross-lingual capabilities desirable
- Representation of knowledge (text or media?)
- Queries (type of query language)
- Evaluation methods (e.g., TREC SDR, the Spoken Document Retrieval track)
Components of an IR System
- Tokenization into words
- Removal of function words
- Phrase identification (noun phrases, names, etc.)
- Feature weighting to indicate importance in text
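
The components above can be sketched as a small pipeline. This is a minimal illustration, not a reference implementation: the stop list `FUNCTION_WORDS`, the helper names, and the choice of tf-idf (term frequency times log of inverse document frequency) as the weighting scheme are all assumptions made for the example. Phrase identification is left out because it usually needs a tagger or parser.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (an assumption; real systems use longer lists).
FUNCTION_WORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def content_words(text):
    """Tokenize, then drop function words."""
    return [t for t in tokenize(text) if t not in FUNCTION_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    token_lists = [content_words(d) for d in docs]
    n_docs = len(token_lists)

    # Document frequency: the number of documents containing each term.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))

    weights = []
    for tokens in token_lists:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


docs = [
    "The cat sat on the mat",
    "The dog sat on the log",
    "Retrieval of documents is the goal",
]
weights = tfidf(docs)
```

In this toy collection, "cat" occurs in one document and "sat" in two, so "cat" gets the higher weight in the first document. That is the sense in which feature weighting indicates importance in the text.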