• MADE Data Use Agreement is required for obtaining the MADE Dataset: The NLP Challenges for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE1.0) has ended. To request the de-identified MADE1.0 dataset, please submit a Data Use Agreement request through the UMass Medical School OCR portal if you are part of the UMass system. Otherwise, please fill out the Contact us form so that we can submit a data use agreement request on your behalf.

    Please note that our previous message stated that we could not transmit any data until we received your IRB approval letter. However, we understand that IRBs in many regions treat activities with de-identified data as non-human-subjects research. If this applies to you, please provide a statement in PDF format confirming that you have verified with your institution that your IRB does not require approval to use our de-identified data. The statement should include the date, a signature from your team organizer, and your institution's official letterhead. We are only allowed to transmit data after receiving either your IRB approval or your statement that no IRB approval is required.


  • MIMIC_SDOH_Annotation_Dataset: The data we annotated was taken from the ICU patient discharge summaries provided in the MIMIC III dataset. For our purposes, we needed to focus only on the specific section of each note relating to the patient's social history. We extracted the social history section from each note using MedSpaCy. Two individuals annotated mentions of social determinants of health (SDOHs). A physician was consulted to validate our understanding of the language used in the notes. The annotation can be downloaded from this GitHub link.
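
    To illustrate what extracting the social history section from a note involves, here is a minimal sketch in plain Python. It is not the MedSpaCy pipeline used for the dataset: the regular expression, the function name extract_section, and the sample note are all hypothetical, and the sketch assumes each section starts with a "Title:" header at the beginning of a line, which real discharge summaries do not always follow.

```python
import re
from typing import Optional

# Assumed header shape: letters, spaces, or slashes at the start of a line,
# followed by a colon. This is an illustrative pattern, not MedSpaCy's rules.
SECTION_HEADER = re.compile(r"^(?P<title>[A-Za-z][A-Za-z /]*):[ \t]*", re.MULTILINE)


def extract_section(note: str, title: str = "social history") -> Optional[str]:
    """Return the text under the first header equal to `title`, or None."""
    headers = list(SECTION_HEADER.finditer(note))
    for i, match in enumerate(headers):
        if match.group("title").strip().lower() == title:
            # The section runs until the next header, or to the end of the note.
            end = headers[i + 1].start() if i + 1 < len(headers) else len(note)
            return note[match.end():end].strip()
    return None


# Synthetic example note (not taken from MIMIC III).
example_note = (
    "Chief Complaint: chest pain\n"
    "Social History: Lives alone. Denies tobacco use.\n"
    "Family History: Non-contributory\n"
)

social_history = extract_section(example_note)
```

    A dedicated clinical sectionizer would likely handle header variants and free-form layouts more robustly than this single pattern; the sketch only shows the shape of the step.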