File: AAREADME.txt
Database: TUH EEG Slowing Corpus
Version: 2.0.1

-------------------------------------------------------------------------------
Change Log:

v2.0.1 (20240207): Headers were modified. No change to the signal data.

-------------------------------------------------------------------------------

This file contains some basic statistics about the TUH EEG Slowing Corpus, a corpus developed to aid in the development of machine learning algorithms that can differentiate between seizure and slowing events. This corpus is a subset of the TUH EEG Corpus and contains sessions that are known to contain seizure events, slowing events, and complex background events.

When you use this specific corpus in your research or technology development, we ask that you reference the corpus using this publication:

 von Weltin, E., Ahsan, T., Shah, V., Jamshed, D., Golmohammadi, M., Obeid, I., & Picone, J. (2017). Electroencephalographic Slowing: A Source of Error in Automatic Seizure Detection. In J. Picone & I. Obeid (Eds.), Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (pp. 1–5). Philadelphia, Pennsylvania, USA: IEEE.

This publication can be retrieved from:

 https://www.isip.piconepress.com/publications/unpublished/conferences/2017/ieee_spmb/slowing/

Our preferred reference for the TUH EEG Corpus, from which this slowing corpus was derived, is:

 Obeid, I., & Picone, J. (2016). The Temple University Hospital EEG Data Corpus. Frontiers in Neuroscience, Section Neural Technology, 10, 196. http://doi.org/10.3389/fnins.2016.00196

v2.0.0 of the TUH EEG Slowing Corpus was based on v2.0.0 of the TUH EEG Corpus. Please see the documentation for TUH EEG v2.0.0 to understand how the data is structured.

There are 6 types of files in this release:

 *.edf: the EEG sampled data in European Data Format (EDF)

 *.tse: term-based annotation files containing a single 10-second sample per file.
 These files, as well as .lbl files, use an additional field in the file name to indicate the specific event in that file.

 *.tse_agg: aggregated term-based annotations. All samples from a single file are contained in the tse_agg file.

 *.lbl: event-based annotations in the same filename convention as .tse

 *.lbl_agg: aggregated event-based annotations in the same filename convention as .tse_agg

All annotations are 10 seconds long. Slowing and complex background samples were collected manually. Seizure samples were collected automatically using the existing TUH EEG Seizure Corpus.

All annotations are term-based. Term-based annotations use one label that applies to all channels. They are most useful for machine learning research in which we tend to worry more about the overall classification of a segment and are not concerned with individual channels.

Time-synchronous event (TSE) files use a simple format that looks like this:

 252.0000 262.0000 seiz 1.0000

The fields are: start time in secs, stop time in secs, label, and probability (by default, set to 1.0).

Label files (lbl) are more complicated and essentially describe a graph that can represent hierarchical annotations, though that feature is not utilized in this release, as all annotations are term-based. They contain the start and stop times, a channel index, a level index, and a probability for each symbol.

Clinical EEGs use a variety of channel configurations. In the larger TUH EEG Corpus, there are over 40 different channel configurations. In this subset, there are two types of EEGs: averaged reference (AR) and linked ears reference (LE). Fortunately, all files in this subset contain the standard channels you would expect from a 10/20 configuration, and all files can be converted to a TCP montage (which is what we use internally for our processing).

The channel numbers in the annotation files (*.lbl*) refer to the channels defined using a standard ACNS TCP montage. This is our preferred way of viewing seizure data.
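The four-field TSE event line described above can be read with a few lines of Python. This is only a minimal sketch and not part of the corpus tools: the names TseEvent and parse_tse are hypothetical, and the choice to skip any line that does not have exactly four fields (blank lines, possible header lines) is an assumption about the files, not documented behavior.

```python
from typing import List, NamedTuple


class TseEvent(NamedTuple):
    """One term-based annotation: times in seconds, a label, a probability."""
    start: float
    stop: float
    label: str
    prob: float


def parse_tse(text: str) -> List[TseEvent]:
    """Parse TSE-style text into a list of events (hypothetical helper)."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: anything that is not a four-field line is skipped.
        if len(fields) != 4:
            continue
        try:
            start, stop, prob = float(fields[0]), float(fields[1]), float(fields[3])
        except ValueError:
            # Assumption: four-field lines with non-numeric times are not events.
            continue
        events.append(TseEvent(start, stop, fields[2], prob))
    return events


# The example line quoted in this README.
sample = "252.0000 262.0000 seiz 1.0000\n"
events = parse_tse(sample)
print(events[0].label, events[0].stop - events[0].start)  # seiz 10.0
```

For the example line, the parsed event spans 10 seconds, which matches the annotation length stated above.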
The TCP-REF montage is defined as follows:

 montage =  0, FP1-F7: EEG FP1-REF -- EEG F7-REF
 montage =  1, F7-T3:  EEG F7-REF  -- EEG T3-REF
 montage =  2, T3-T5:  EEG T3-REF  -- EEG T5-REF
 montage =  3, T5-O1:  EEG T5-REF  -- EEG O1-REF
 montage =  4, FP2-F8: EEG FP2-REF -- EEG F8-REF
 montage =  5, F8-T4:  EEG F8-REF  -- EEG T4-REF
 montage =  6, T4-T6:  EEG T4-REF  -- EEG T6-REF
 montage =  7, T6-O2:  EEG T6-REF  -- EEG O2-REF
 montage =  8, A1-T3:  EEG A1-REF  -- EEG T3-REF
 montage =  9, T3-C3:  EEG T3-REF  -- EEG C3-REF
 montage = 10, C3-CZ:  EEG C3-REF  -- EEG CZ-REF
 montage = 11, CZ-C4:  EEG CZ-REF  -- EEG C4-REF
 montage = 12, C4-T4:  EEG C4-REF  -- EEG T4-REF
 montage = 13, T4-A2:  EEG T4-REF  -- EEG A2-REF
 montage = 14, FP1-F3: EEG FP1-REF -- EEG F3-REF
 montage = 15, F3-C3:  EEG F3-REF  -- EEG C3-REF
 montage = 16, C3-P3:  EEG C3-REF  -- EEG P3-REF
 montage = 17, P3-O1:  EEG P3-REF  -- EEG O1-REF
 montage = 18, FP2-F4: EEG FP2-REF -- EEG F4-REF
 montage = 19, F4-C4:  EEG F4-REF  -- EEG C4-REF
 montage = 20, C4-P4:  EEG C4-REF  -- EEG P4-REF
 montage = 21, P4-O2:  EEG P4-REF  -- EEG O2-REF

For example, channel 1 is the difference between electrodes F7 and T3. It is computed as the arithmetic difference (F7-REF)-(T3-REF) of two channels contained in the EDF file.

Finally, here are some basic descriptive statistics about the data:

 TOTAL:
  files: 300
  aggregate files: 112
  sessions: 75
  patients: 38

 COMPLEX BACKGROUND:
  files: 100
  aggregate files: 50
  sessions: 40
  patients: 26

 SEIZURE:
  files: 100
  aggregate files: 61
  sessions: 43
  patients: 30

 SLOWING:
  files: 100
  aggregate files: 45
  sessions: 33
  patients: 20

18 of the files containing seizures also contain complex background samples. 26 of the files containing slowing samples also contain complex background samples. 6 files contain only complex background samples.

There are 1,000 seconds each of complex background, seizure, and slowing annotations, both in the aggregated files and in the non-aggregated files.
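As an illustration of the montage arithmetic defined in the TCP-REF table above, the following sketch derives one bipolar channel from two referential channels. It is not the code used internally for processing. The function name montage_channel and the toy sample values are hypothetical, and the channel labels are assumed to appear exactly as written in the table (labels in LE files may differ).

```python
from typing import Dict, List, Sequence, Tuple

# (label, minuend channel, subtrahend channel), in montage index order 0..21.
TCP_MONTAGE: List[Tuple[str, str, str]] = [
    ("FP1-F7", "EEG FP1-REF", "EEG F7-REF"),
    ("F7-T3", "EEG F7-REF", "EEG T3-REF"),
    ("T3-T5", "EEG T3-REF", "EEG T5-REF"),
    ("T5-O1", "EEG T5-REF", "EEG O1-REF"),
    ("FP2-F8", "EEG FP2-REF", "EEG F8-REF"),
    ("F8-T4", "EEG F8-REF", "EEG T4-REF"),
    ("T4-T6", "EEG T4-REF", "EEG T6-REF"),
    ("T6-O2", "EEG T6-REF", "EEG O2-REF"),
    ("A1-T3", "EEG A1-REF", "EEG T3-REF"),
    ("T3-C3", "EEG T3-REF", "EEG C3-REF"),
    ("C3-CZ", "EEG C3-REF", "EEG CZ-REF"),
    ("CZ-C4", "EEG CZ-REF", "EEG C4-REF"),
    ("C4-T4", "EEG C4-REF", "EEG T4-REF"),
    ("T4-A2", "EEG T4-REF", "EEG A2-REF"),
    ("FP1-F3", "EEG FP1-REF", "EEG F3-REF"),
    ("F3-C3", "EEG F3-REF", "EEG C3-REF"),
    ("C3-P3", "EEG C3-REF", "EEG P3-REF"),
    ("P3-O1", "EEG P3-REF", "EEG O1-REF"),
    ("FP2-F4", "EEG FP2-REF", "EEG F4-REF"),
    ("F4-C4", "EEG F4-REF", "EEG C4-REF"),
    ("C4-P4", "EEG C4-REF", "EEG P4-REF"),
    ("P4-O2", "EEG P4-REF", "EEG O2-REF"),
]


def montage_channel(
    signals: Dict[str, Sequence[float]], index: int
) -> Tuple[str, List[float]]:
    """Return (label, samples) for one montage channel (hypothetical helper).

    `signals` maps EDF channel labels to equally long sample sequences.
    """
    label, first, second = TCP_MONTAGE[index]
    diff = [a - b for a, b in zip(signals[first], signals[second])]
    return label, diff


# Toy data, made up for illustration; real samples would come from an EDF file.
signals = {
    "EEG F7-REF": [5.0, 7.0, 9.0],
    "EEG T3-REF": [1.0, 2.0, 3.0],
}
label, diff = montage_channel(signals, 1)
print(label, diff)  # F7-T3 [4.0, 5.0, 6.0]
```

Because both inputs share the same reference, the reference term cancels in the subtraction, which is why the result can be read as F7 minus T3.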
---

If you have any additional comments or questions about this data, please direct them to help@nedcdata.org.

Best regards,

Joseph Picone
NEDC Data Resources Development Manager