Electroencephalography (EEG) Resources

 

Corpora: TUEG | TUAB | TUAR | TUEP | TUEV | TUSZ | TUSL
Software: EGAS | ERDR | EEGR | EVAL | PYPR | MEDF | EDFB | PYED
Documentation: ELEC | ANNO
Instructions: BROW | RSYN | DISK

To request access to the TUH EEG Corpus, please fill out this form and email a signed copy to help@nedcdata.org. Please include "Download The TUH EEG Corpus" in the subject line or click on this link.

Note that this is an Adobe Acrobat form, and it is best filled out using Adobe Acrobat or a compatible tool. We suggest you download a copy to your desktop and fill it out using a local app rather than attempting to complete the form from within a browser.

The form must be filled out correctly or it will be returned to you. Please follow the instructions on the form very carefully, including completing the address information accurately. This is very important and we cannot accept forms with incorrect addresses.

Once your form is accepted, you will receive the username and password to our resources in a separate email, and be added to our listserv. This usually takes about 24 hours. The TUH EEG Corpus is freely available. The only reason we require registration is that we need to track who downloads the data. We also want to be able to inform you of any updates to the releases.

Once you have obtained the username and password, you can selectively download portions of the corpus using your browser. You can also use the rsync interface described below.


Corpora

  • The TUH EEG Corpus (TUEG): A rich archive of 26,846 clinical EEG recordings collected at Temple University Hospital (TUH) from 2002 to 2017. Read this journal paper for a more complete description of the corpus.

  • The TUH Abnormal EEG Corpus (TUAB): A corpus of EEGs that have been annotated as normal or abnormal. Read Silvia Lopez's MS thesis for a description of the corpus.

  • The TUH EEG Artifact Corpus (TUAR): This is a subset of TUEG that contains annotations of 5 different artifacts: (1) eye movement (EYEM), (2) chewing (CHEW), (3) shivering (SHIV), (4) electrode pop, electrode static, and lead artifacts (ELPP), and (5) muscle artifacts (MUSC).

  • The TUH EEG Epilepsy Corpus (TUEP): This is a subset of TUEG that contains 100 subjects with epilepsy and 100 subjects without epilepsy, as determined by a certified neurologist. The data was developed in collaboration with a number of partners including NIH.

  • The TUH EEG Events Corpus (TUEV): This corpus is a subset of TUEG that contains annotations of EEG segments as one of six classes: (1) spike and sharp wave (SPSW), (2) generalized periodic epileptiform discharges (GPED), (3) periodic lateralized epileptiform discharges (PLED), (4) eye movement (EYEM), (5) artifact (ARTF) and (6) background (BCKG).

  • The TUH EEG Seizure Corpus (TUSZ): This corpus contains EEG signals that have been manually annotated for seizure events (start time, stop time, channel, and seizure type). For more information about this corpus, please refer to our book section. The electrode configurations and annotation guidelines are described here.

  • The TUH EEG Slowing Corpus (TUSL): This is another subset of TUEG that contains annotations of slowing events. This corpus has been used to study common error modalities in automated seizure detection.
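
As a quick reference, the label codes listed above for TUAR and TUEV can be kept in lookup tables. The following is a minimal sketch in Python. The codes and descriptions are taken from the corpus descriptions above; the helper name `describe_label` is hypothetical, and how these codes actually appear in the annotation files is defined by the corpus documentation, not by this example.

```python
# Label codes exactly as listed in the TUAR and TUEV descriptions.
TUAR_ARTIFACTS = {
    "EYEM": "eye movement",
    "CHEW": "chewing",
    "SHIV": "shivering",
    "ELPP": "electrode pop, electrode static, and lead artifacts",
    "MUSC": "muscle artifacts",
}

TUEV_CLASSES = {
    "SPSW": "spike and sharp wave",
    "GPED": "generalized periodic epileptiform discharges",
    "PLED": "periodic lateralized epileptiform discharges",
    "EYEM": "eye movement",
    "ARTF": "artifact",
    "BCKG": "background",
}


def describe_label(code, table):
    """Return the description for a label code (hypothetical helper).

    The lookup ignores case and surrounding whitespace and raises
    KeyError for a code that is not in the given table.
    """
    key = code.strip().upper()
    if key not in table:
        raise KeyError(f"unknown label code: {code!r}")
    return table[key]
```

For example, `describe_label("bckg", TUEV_CLASSES)` looks up the background class. Note that EYEM appears in both tables.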



Software

  • NEDC EEG Annotation System (EAS): A tool that allows rapid annotation of EEG signals. The tool includes spectrogram and energy plots, and is capable of transcribing data in real time. Learn more about this tool from our IEEE SPMB 2018 paper.

  • NEDC ResNet Decoder Real-Time (ERDR): A real-time EEG seizure detection system based on a ResNet-18 neural network and transfer learning. This package contains a real-time decoder that is described in this publication. This is also part of our real-time demonstration system.

  • NEDC ResNet Seizure Detection System (EEGR): A research version of our real-time seizure detection system that includes a trainer and a decoder. This package is used to develop and evaluate models. This package contains a decoder that is described in this publication.

  • NEDC Eval EEG (EVAL): A Python-based scoring package that implements a variety of standard evaluation metrics. A complete description of the software can be found here.

  • NEDC PyPrint EDF (PYPR): A Python-based tool that decodes the header and signal data in an EDF file. This is not as simple as it might seem because channels can be permuted and must be decoded using channel labels.

  • MATLAB EDF (MEDF): MATLAB code that loads EEG signal data from an EDF file.

  • EDF Browser (EDFB): An open-source program for viewing time-series data files such as EEG, EMG, and ECG recordings. It is available for Windows and Linux.

  • Python-based EDF (PYED): A Python interface to EDFLib that lets you read and write EDF files (the distribution format for TUH EEG).
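
The PYPR entry above notes that channels can be permuted and must be decoded using channel labels. The following Python sketch illustrates that idea only; it is not the PYPR implementation. It assumes you have already read the label list and the per-channel signals from a file with a tool of your choice. The function names and the example labels are hypothetical; the actual labels are described in the Electrodes (ELEC) document.

```python
def channel_index_map(header_labels, wanted_labels):
    """Return, for each wanted label, its position in the file header.

    Labels are compared after stripping whitespace and upper-casing.
    If a label occurs more than once in the header, the first
    occurrence is used. Raises KeyError if a wanted label is missing.
    """
    positions = {}
    for index, label in enumerate(header_labels):
        positions.setdefault(label.strip().upper(), index)

    missing = [w for w in wanted_labels if w.strip().upper() not in positions]
    if missing:
        raise KeyError(f"labels not found in header: {missing}")
    return [positions[w.strip().upper()] for w in wanted_labels]


def reorder_channels(signals, header_labels, wanted_labels):
    """Select and order per-channel signals by label, not by position."""
    return [signals[i] for i in channel_index_map(header_labels, wanted_labels)]


# Usage with made-up data: the file stores FP2 before FP1.
header = ["EEG FP2-REF", "EEG FP1-REF", "EEG F3-REF"]
signals = [[2.0, 2.1], [1.0, 1.1], [3.0, 3.1]]
ordered = reorder_channels(signals, header, ["EEG FP1-REF", "EEG FP2-REF"])
```

Selecting by label rather than by position keeps downstream code independent of the order in which a particular file happens to store its channels.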



Documentation

  • Electrodes (ELEC): A document that describes how EEG signals are stored in a multichannel signal file format. This document also includes a description of the channel labels, which are required to properly decode the data.

  • Annotations (ANNO): A document that describes how we annotate seizures and store the annotations in various file formats.



Instructions

All of our released corpora are available in the following ways:

  1. From the web at:

          https://www.isip.piconepress.com/projects/nedc/data/

    You can directly browse the directories and explore the data. This is convenient if you want to sample the data and explore formats, content, etc.

    Use the username and password you received when you registered. If you do not have the username and password, register by filling out this form and we will contact you with registration information by email.

  2. Rsync, which is available on Linux and Mac platforms, is our preferred way of downloading data. It allows you to easily keep your copy of the data in sync with ours.

    Windows users can get access to rsync by installing MobaXterm. Some tips on how to install and use MobaXterm are here.

    Before you attempt to download an entire corpus, you should test your ability to download data by executing this command:

          rsync -auxvL --delete nedc-tuh-eeg@www.isip.piconepress.com:data/tuh_eeg/TEST .

    If for some reason this fails, change "-auxvL" to "-auxvvvL". This produces much more detailed output, which you can save to a log file so that your IT support team can diagnose the problems with your downloads.

    Once this command works correctly, go here to select the corpus you want to download. A typical rsync command to download a specific release (e.g., vx.x.x) of a specific corpus (AAAA) is:

          rsync -auxvL --delete nedc-tuh-eeg@www.isip.piconepress.com:data/tuh_eeg/AAAA/vx.x.x/ .

    You can also download all versions of a corpus that are available on the server by dropping the version number:

          rsync -auxvL --delete nedc-tuh-eeg@www.isip.piconepress.com:data/tuh_eeg/AAAA/ .

    Note that the "." at the end of this command is very important since it denotes the destination directory. Without a destination directory specification, the command will not transfer any data.

    The username and password are the same ones you use to access the web-based version of these resources.

    Note that the "-L" option in rsync instructs it to follow links. All of our corpora are linked back to TUEG. It is best to always use the "-L" option.

  3. If Internet connectivity is a problem, you can send us an 8 TB USB drive. We will copy the data to this drive and send it back to you. You must arrange for postage as described below, which includes providing a UPS or FedEx account number for return shipping.

    Please send us a conventional USB-mounted disk drive. Any standard USB-powered, USB 2.0 compatible 8 TB drive, such as a Western Digital or Seagate, will work fine. Because of the time it takes to copy the data, we need a drive that can maintain a stable connection, and other types of media such as thumb drives have proven to be unreliable.

    Mail the drive to:

          Joseph Picone
          1610 Rhawn Street
          Philadelphia, PA 19111
          Tel: 708-848-2846

    Please email us for details before shipping the drive. If you ship us a drive directly from a reseller such as Amazon, please make sure that the shipment contains information that we can use to identify you. This information should include a point of contact (POC), the name of your institution, and contact information (name, surface mail address and telephone number for the POC).

    Please note that disk drives sent to international destinations will often get caught in Customs for weeks. Rsync is a much better option than shipping a drive through Customs.

If you are having trouble deciding what to do, email us and describe the specific resources in which you are interested. We will be happy to guide you through the process.
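
If you script your downloads, the rsync commands shown under option 2 can be assembled programmatically. The following Python sketch rebuilds those same commands as argument lists. The host, remote path, and options are copied from the commands above; the function names and parameters are hypothetical, and "AAAA" and "vx.x.x" remain placeholders for the corpus and release you select.

```python
import shlex

# Values copied from the rsync commands in the instructions above.
NEDC_RSYNC_HOST = "nedc-tuh-eeg@www.isip.piconepress.com"
NEDC_RSYNC_ROOT = "data/tuh_eeg"


def build_rsync_command(corpus, version=None, dest=".",
                        verbosity=1, trailing_slash=True):
    """Build the rsync argument list for a corpus (hypothetical helper).

    corpus         -- corpus directory name (placeholder "AAAA" above)
    version        -- release such as "vx.x.x"; None requests all versions
    dest           -- destination directory; "." is the current directory
    verbosity      -- number of "v" flags; 3 gives the "-auxvvvL" variant
    trailing_slash -- the corpus commands end the source path with "/",
                      while the TEST command above does not
    """
    if verbosity < 1:
        raise ValueError("verbosity must be at least 1")
    flags = "-aux" + "v" * verbosity + "L"  # "-L" follows links back to TUEG
    remote_path = f"{NEDC_RSYNC_ROOT}/{corpus}"
    if version is not None:
        remote_path += f"/{version}"
    if trailing_slash:
        remote_path += "/"
    return ["rsync", flags, "--delete",
            f"{NEDC_RSYNC_HOST}:{remote_path}", dest]


def format_command(args):
    """Render an argument list as a shell command line for display."""
    return shlex.join(args)


# The connectivity test from the instructions, rebuilt as a string.
test_command = format_command(
    build_rsync_command("TEST", trailing_slash=False))
```

The resulting list can be passed to `subprocess.run(args, check=True)`, in which case rsync will prompt for the password as usual. Building a list rather than a single string avoids shell quoting problems if the destination path contains spaces.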