NLP Challenges for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE1.0)
Hosted by the University of Massachusetts Lowell, Worcester, and Amherst
Adverse drug events (ADEs) are common and occur in approximately 2-5% of hospitalized adult patients. Each ADE is estimated to increase healthcare cost by more than $3,200. Severe ADEs rank among the top 5 or 6 leading causes of death in the United States. Prevention, early detection and mitigation of ADEs could save both lives and dollars. Applying natural language processing (NLP) techniques to electronic health records (EHRs) provides an effective way to perform real-time pharmacovigilance and drug safety surveillance.
We’ve annotated 1092 EHR notes with medications, as well as relations to their corresponding attributes, indications and adverse events. This corpus is a valuable resource for developing NLP systems that automatically detect these clinically important entities. We are therefore happy to announce a public NLP challenge, MADE1.0, which aims to promote innovation in related research tasks and to bring researchers and professionals together to exchange research ideas and share expertise. The ultimate goal is to further advance ADE detection techniques to improve patient safety and health care quality.
- Registration: August 1st, 2017 to February 5th, 2018.
- Training data release: November 15th, 2017 (changed from November 1st, 2017, due to a change in the Data Recipient Agreement).
- Evaluation script available: Feb 5, 2018
- System submission: March 5-6, 2018
- Special panel session at AMIA Summit 2018: March 14, 2018
The entire dataset contains 1092 de-identified EHR notes from 21 cancer patients. Each EHR note was annotated with medication information (medication name, dosage, route, frequency, duration), ADEs, indications, other signs and symptoms, and relations among those entities. We split the data into a training set consisting of ~900 notes and a test set consisting of ~180 notes. Both will be released in BioC format.
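As a rough illustration of what working with BioC-formatted annotations can look like, the sketch below reads entity mentions from a small BioC-style XML string using only the Python standard library. The element names (`document`, `passage`, `annotation`, `infon`, `location`) follow the general BioC XML layout; the sample note, the document ID and the type labels (`Drug`, `Dosage`, `ADE`) are hypothetical and are not taken from the released MADE files, whose exact labels and layout may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical BioC-style sample; not an excerpt from the MADE corpus.
SAMPLE_BIOC = """<collection>
  <source>hypothetical</source>
  <document>
    <id>note_1</id>
    <passage>
      <offset>0</offset>
      <text>Started prednisone 10 mg daily; developed nausea.</text>
      <annotation id="T1">
        <infon key="type">Drug</infon>
        <location offset="8" length="10"/>
        <text>prednisone</text>
      </annotation>
      <annotation id="T2">
        <infon key="type">Dosage</infon>
        <location offset="19" length="5"/>
        <text>10 mg</text>
      </annotation>
      <annotation id="T3">
        <infon key="type">ADE</infon>
        <location offset="42" length="6"/>
        <text>nausea</text>
      </annotation>
    </passage>
  </document>
</collection>"""


def read_mentions(xml_text):
    """Return (doc_id, offset, length, type, text) tuples for every annotation."""
    root = ET.fromstring(xml_text)
    mentions = []
    for doc in root.iter("document"):
        doc_id = doc.findtext("id")
        for ann in doc.iter("annotation"):
            loc = ann.find("location")
            # The entity type is assumed to be stored in <infon key="type">.
            etype = next(
                (i.text for i in ann.findall("infon") if i.get("key") == "type"),
                None,
            )
            mentions.append(
                (
                    doc_id,
                    int(loc.get("offset")),
                    int(loc.get("length")),
                    etype,
                    ann.findtext("text"),
                )
            )
    return mentions


mentions = read_mentions(SAMPLE_BIOC)
for m in mentions:
    print(m)
```

Keeping each mention as an (offset, length, type) record makes it straightforward to compare system output against gold annotations later on.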
The MADE1.0 challenge consists of three tasks, defined as follows.
- Named entity recognition (NER): develop systems to automatically detect mentions of medication names and their attributes (dosage, frequency, route, duration), as well as mentions of ADEs, indications, and other signs & symptoms.
- Relation identification (RI): given the ground-truth entity annotations, build systems to identify relations between medication name entities and their attribute entities, as well as relations between medication name entities and ADEs, indications, and other signs & symptoms.
- Integrated task (IT): design and develop an integrated system that performs the above two tasks together.
For each task, we evaluate two configurations: Standard evaluation, using only MADE resources (up to 2 runs), and Extended evaluation, using any customized resources available (up to 2 runs). The best score in each setting will be used for team ranking. MADE resources refer only to the released training data plus the word embeddings trained on Wikipedia, de-identified Pittsburgh EHR notes and PubMed articles, which can be downloaded here: http://126.96.36.199/word_embed/. Please cite the NAACL 2016 paper below when using the word embeddings.
The evaluation measures are:
| Task | Evaluation measures |
|---|---|
| Task 1 | Precision/Recall/F1-measure on mention-level annotations, using standard and extended resources. Primary metric: micro-averaged F1. |
| Task 2 | Precision/Recall/F1-measure on relations, using standard and extended resources. Primary metric: micro-averaged F1. |
| Task 3 | The above-mentioned measures of Task 1 and Task 2. |
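For readers unfamiliar with micro-averaging, the following is a minimal sketch of how micro-averaged precision, recall and F1 can be computed over mention-level annotations. It is not the official evaluation script: it assumes exact matching on (offset, length, type), and the function name `micro_prf`, the document IDs and the type labels are hypothetical.

```python
def micro_prf(gold_by_doc, pred_by_doc):
    """Micro-averaged precision, recall and F1 under exact matching.

    Both arguments map a document ID to a collection of hashable items,
    here (offset, length, type) tuples. Counts are pooled over all
    documents before the ratios are computed, which is what makes the
    average "micro" rather than "macro".
    """
    tp = fp = fn = 0
    for doc_id in set(gold_by_doc) | set(pred_by_doc):
        gold = set(gold_by_doc.get(doc_id, ()))
        pred = set(pred_by_doc.get(doc_id, ()))
        tp += len(gold & pred)   # predicted and correct
        fp += len(pred - gold)   # predicted but not in gold
        fn += len(gold - pred)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical gold and predicted mentions for two notes.
gold = {
    "note_1": {(8, 10, "Drug"), (19, 5, "Dosage"), (42, 6, "ADE")},
    "note_2": {(0, 7, "Drug")},
}
pred = {
    # Correct span but wrong type for the third mention, so it counts
    # as one false positive and one false negative.
    "note_1": {(8, 10, "Drug"), (19, 5, "Dosage"), (42, 6, "Indication")},
}

precision, recall, f1 = micro_prf(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

The same function can score relations if each item is a hashable description of a relation, for example a tuple of the two entity spans and the relation type.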
Below are the publications of two baseline systems. In this competition, we used a different approach to partition the training and testing data. Therefore, the performance of the two baseline systems can be used as an approximation only.
1. Structured prediction models for RNN based sequence labeling in clinical text. (Abhyuday Jagannatha, Hong Yu; EMNLP 2016)
2. Bidirectional Recurrent Neural Networks for Medical Event Detection in Electronic Health Records. (Abhyuday Jagannatha, Hong Yu; NAACL HLT 2016)
Submission and Timeline
1) The test data for Task 1 (Entity--NER) and Task 3 (Entity+Relation--IT) will be released to teams on March 5, 2018 at 10:00am Eastern Time, and each team is requested to submit results within 24 hours.
2) The test data with ground-truth entity labels for Task 2 (Relation only--RI) will be released on March 6, 2018 at 11:00am, and each team is requested to submit Task 2 results within 24 hours.
3) We will release the evaluation script on Feb 5, 2018 (changed from Feb 1, 2018) so that groups can validate their output format and test the script in advance. Submissions that do not conform to the provided BioC format will be rejected without consideration or notification.
4) Right after evaluation, we require the top-performing systems (based on the exact-matching score using MADE resources only) to send us the software or set up an online server by March 8, 2018 so that we can validate the results on our site. System performance that can’t be validated will be rejected.
5) A special scientific panel session for this challenge will be held at AMIA Summit 2018 on March 14, 2018 (1:30pm – 3:00pm). Six teams will be invited to give panel presentations, but all teams are welcome to attend the session at the AMIA Summit (visa applications need to be prepared in advance).
6) We will work on a journal special issue for this challenge after the AMIA Summit, and another workshop is also under consideration.
The test data will be released on March 5, 2018 at 10:00am Eastern Time as scheduled. The notification will be sent out through the Google group email (firstname.lastname@example.org). Please notify us if you don't get it.
MADE1.0 Online Workshop
Date: 5/4/2018. Time: 10am-2pm (U.S. Eastern Time)
Total number of registrations allowed: 100
Feifan Liu, University of Massachusetts Medical School
Hong Yu, University of Massachusetts Lowell
Abhyuday Jagannatha, University of Massachusetts Amherst
Weisong Liu, University of Massachusetts Lowell