This data was generated by computing a 256x256 DCT on the annotated patches in v4.0.0 of the TUH DPATH Breast data. The most significant coefficients (32x32) were retained for each of R, G and B.

The csv file contains a header row followed by data rows. The first entry in each data row is a numeric label. The legend for this label is as follows:

Legend:
 0 = 'norm'
 1 = 'artf'
 2 = 'nneo'
 3 = 'infl'
 4 = 'susp'
 5 = 'dcis'
 6 = 'indc'
 7 = 'null'
 8 = 'bckg'

For scoring, these labels were collapsed as shown below. Each entry gives the label count, followed by [ percent of all labels / cumulative percent when the labels are sorted by ascending count ].

Ignore:
 (1) artf : 1116 [  5.02% /  17.33% ]
 (4) susp :  115 [  0.52% /   0.52% ]
 (7) null :  618 [  2.78% /   3.30% ]

Map to 'non-cancer':
Map to 'interesting':
 (0) norm : 4797 [ 21.57% /  45.57% ]
 (2) nneo : 7162 [ 32.20% / 100.00% ]
 (3) infl :  958 [  4.31% /   7.60% ]
 (5) dcis : 1048 [  4.71% /  12.32% ]
 (6) indc : 1484 [  6.67% /  24.01% ]
 (8) bckg : 4943 [ 22.22% /  67.80% ]

6-way choice (weight by occurrence):
 (0) norm
 (2) nneo
 (3) infl
 (5) dcis
 (6) indc
 (8) bckg

Scoring computes the average of the error rates for the first five classes (norm, nneo, infl, dcis, indc), and averages that with the score on bckg. The scoring script is in /scripts.
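
The label handling and scoring rule described above can be sketched in Python. This is only an illustration, not the scoring script in /scripts, which remains the reference. The function names (read_rows, class_error_rates, score) are hypothetical. The sketch assumes that every column after the label is a numeric coefficient, that the "score on bckg" is the bckg error rate, and that each of the six scored classes occurs at least once in the reference labels.

```python
import csv
import io

# Label groups taken from the legend in this README.
IGNORED = {1, 4, 7}           # artf, susp, null
FIRST_FIVE = (0, 2, 3, 5, 6)  # norm, nneo, infl, dcis, indc
BCKG = 8


def read_rows(handle):
    """Yield (label, features) pairs, skipping the header row and ignored labels.

    Assumption: every column after the first is a numeric coefficient.
    """
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    for row in reader:
        if not row:
            continue
        label = int(float(row[0]))
        if label in IGNORED:
            continue
        yield label, [float(v) for v in row[1:]]


def class_error_rates(ref, hyp):
    """Return {class: fraction of reference items of that class that were misclassified}."""
    totals, errors = {}, {}
    for r, h in zip(ref, hyp):
        if r in IGNORED:
            continue
        totals[r] = totals.get(r, 0) + 1
        if h != r:
            errors[r] = errors.get(r, 0) + 1
    return {c: errors.get(c, 0) / n for c, n in totals.items()}


def score(ref, hyp):
    """Average the first-five mean error rate with the bckg error rate.

    Assumption: all six scored classes appear in ref (otherwise KeyError).
    """
    rates = class_error_rates(ref, hyp)
    first_five = sum(rates[c] for c in FIRST_FIVE) / len(FIRST_FIVE)
    return 0.5 * (first_five + rates[BCKG])


if __name__ == "__main__":
    # Tiny made-up csv, only to show the expected shape of the input.
    demo = io.StringIO("label,c0,c1\n0,1.5,2.0\n1,0.0,0.0\n8,3.0,4.0\n")
    for label, features in read_rows(demo):
        print(label, features)
```

With a real file, read_rows would be called on an open file handle for the csv, and score on the reference labels and a classifier's predicted labels in the same order. Lower values are better under this reading, since both terms are error rates.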