The data for the final exam can be found at:

 http://www.isip.msstate.edu/publications/courses/ece_8990_pr/exams/1999/data

The following files are provided:

train/train_1.data		=> training data for set 1
train/train_2.data		=> training data for set 2

test/test_1.data		=> development testing data for set 1
test/test_1.class		=> answers for testing data for set 1
test/test_2.data		=> development testing data for set 2
test/test_2.class		=> answers for testing data for set 2

eval/eval_1.data		=> blind evaluation data for set 1
eval/eval_2.data		=> blind evaluation data for set 2

There is also a scoring program, score.cc, and a binary for
Sun SPARC machines (score.exe). It is run as follows:

 score.exe test_1.class test_1.example

and outputs the following:

 error on token no. 19: ref = 11, hyp = 10
 error on token no. 375: ref = 9, hyp = 8
 (377 correct out of 379 tokens; percent error =    0.53%)
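
For those not on a Sun, the scoring step can be approximated with a short
script. This is only a sketch and not a port of score.cc: it assumes the
.class and hypothesis files hold one integer class label per token,
separated by whitespace, and it assumes tokens are numbered from 0. Neither
assumption is confirmed above. The names read_labels and score are
illustrative.

```python
def read_labels(path):
    """Read integer class labels from a whitespace-separated text file.

    Assumed format (not confirmed by the course notes): one label per token.
    """
    with open(path) as f:
        return [int(token) for token in f.read().split()]


def score(ref_labels, hyp_labels):
    """Compare reference and hypothesis labels token by token.

    Returns (correct, total, percent_error, report_lines).
    """
    if len(ref_labels) != len(hyp_labels):
        raise ValueError("reference and hypothesis have different lengths")
    report = []
    correct = 0
    # Token numbering from 0 is an assumption; score.cc may count from 1.
    for i, (ref, hyp) in enumerate(zip(ref_labels, hyp_labels)):
        if ref == hyp:
            correct += 1
        else:
            report.append(f" error on token no. {i}: ref = {ref}, hyp = {hyp}")
    total = len(ref_labels)
    percent_error = 100.0 * (total - correct) / total if total else 0.0
    report.append(
        f" ({correct} correct out of {total} tokens; "
        f"percent error = {percent_error:7.2f}%)"
    )
    return correct, total, percent_error, report
```

As a check on the arithmetic in the sample output: 2 errors out of 379
tokens gives 100 * 2 / 379 = 0.53% (rounded to two places).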
 
Each line of a data file consists of a class tag, followed by a vector of data:

 isip02_[2]: m test_1.data
 11: -3.189 2.620 -0.833 0.450 0.009 0.444 -0.348 0.126 -0.614 0.081
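
A line in this format can be read with a few lines of code. The sketch
below assumes every line looks like the sample above (an integer tag, a
colon, then whitespace-separated floats); it will raise an error on a line
that has no tag. The names parse_line and load_data are illustrative.

```python
def parse_line(line):
    """Split 'tag: v1 v2 ...' into (int tag, list of float values)."""
    tag, _, rest = line.partition(":")
    return int(tag), [float(value) for value in rest.split()]


def load_data(path):
    """Load a data file as a list of (tag, vector) pairs, skipping blank lines."""
    with open(path) as f:
        return [parse_line(line) for line in f if line.strip()]
```

For example, parse_line applied to the sample line returns the tag 11 and a
vector of 10 values, which agrees with the dimension listed for set 1.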

Students must email me their hypotheses for the two evaluation sets,
and we will score them using the answers (not available to the students).

Development of an algorithm should proceed as follows:

 - train on train_1.data
 - evaluate the models on test_1.data (using score and test_1.class)
 - for the final evaluation, train on both train_1.data and test_1.data,
   and evaluate eval_1.data

This should be done for set 1 and set 2.
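
The procedure above can be sketched as follows. The classifier here is a
nearest-class-mean model, chosen only as a placeholder so the sketch runs;
it is not a recommended or required method. Samples are assumed to be
(label, vector) pairs already loaded into memory, and all function names
are illustrative.

```python
from collections import defaultdict


def train(samples):
    """Fit a placeholder nearest-class-mean model: one mean vector per label."""
    sums = {}
    counts = defaultdict(int)
    for label, vec in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(model, vec):
    """Return the label whose mean is closest in squared Euclidean distance."""
    def dist2(mean):
        return sum((a - b) ** 2 for a, b in zip(mean, vec))
    return min(model, key=lambda label: dist2(model[label]))


def error_rate(model, samples):
    """Percent of labeled samples the model classifies incorrectly."""
    wrong = sum(1 for label, vec in samples if classify(model, vec) != label)
    return 100.0 * wrong / len(samples)


def run_protocol(train_set, dev_set, eval_vectors):
    """Follow the three steps: train, check on dev, retrain on both, label eval."""
    # Steps 1 and 2: train on the training set, measure error on the dev set.
    dev_model = train(train_set)
    dev_error = error_rate(dev_model, dev_set)
    # Step 3: retrain on training + dev data, then label the blind eval set.
    final_model = train(train_set + dev_set)
    hypotheses = [classify(final_model, vec) for vec in eval_vectors]
    return dev_error, hypotheses
```

The development error is only a guide for tuning; the hypotheses returned
for the evaluation set are what would be submitted for scoring.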

Here is a brief description of the data.

 Data set 1: static classification
 
 10		dimension of vectors
 11		classes
 83		eval set vectors
 379		development set vectors
 528		training set vectors

 Data set 2: temporal modeling
 
 39		dimension of vectors 
 5		classes
 225		eval set vectors (sets of 5 vectors for each class)
 350		development set vectors (sets of 5 vectors for each class)
 925		training set vectors (sets of 5 vectors for each class)

 On set no. 2, the data can be assumed to be sequences of 5 vectors
 (every group of 5 vectors has the same class assignment and can be
 thought of as having occurred sequentially in time).
 The motivated student might try to do some temporal modeling of the data.
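
 As a starting point for temporal modeling, the vectors can be regrouped
 into sequences. The sketch below assumes that each sequence occupies 5
 consecutive lines of the file and that the samples are already available
 as (label, vector) pairs; the function name group_sequences is illustrative.

```python
def group_sequences(samples, length=5):
    """Group consecutive (label, vector) pairs into labeled sequences.

    Assumes each sequence is `length` consecutive samples sharing one label.
    Returns a list of (label, [vector, ...]) pairs.
    """
    if len(samples) % length != 0:
        raise ValueError("number of samples is not a multiple of the sequence length")
    sequences = []
    for start in range(0, len(samples), length):
        chunk = samples[start:start + length]
        labels = {label for label, _ in chunk}
        # Every vector in a group should carry the same class assignment.
        if len(labels) != 1:
            raise ValueError(f"mixed labels in group starting at sample {start}")
        sequences.append((chunk[0][0], [vec for _, vec in chunk]))
    return sequences
```

 With the counts listed above, this yields 925 / 5 = 185 training sequences,
 350 / 5 = 70 development sequences, and 225 / 5 = 45 evaluation sequences.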

Let the games begin. If you have any questions, or would like your
data evaluated, send mail to help@isip.msstate.edu.

Regards,

Joe Picone