Resources


Model Credentialed Access

Characterization of Stigmatizing Language in Medical Records

Keith Harrigian, Ayah Zirikly, Brant Chee, Alya Ahmad, Anne Links, Somnath Saha, Mary Catherine Beach, Mark Dredze

A suite of classifiers for detecting three types of stigmatizing language in electronic medical records. Trained on MIMIC-IV discharge notes.

mimic clinical natural language processing large language models domain transfer bias stigmatizing language

Published: Nov. 6, 2023. Version: 1.0.0


Model Credentialed Access

Characterization of Stigmatizing Language in Medical Records

Keith Harrigian, Ayah Zirikly, Brant Chee, Alya Ahmad, Anne Links, Somnath Saha, Mary Catherine Beach, Mark Dredze

A suite of classifiers for detecting three types of stigmatizing language in electronic medical records. Trained on MIMIC-IV discharge notes.

mimic clinical natural language processing large language models domain transfer bias stigmatizing language

Published: Nov. 6, 2023. Version: 1.0.0


Database Credentialed Access

MedNLI - A Natural Language Inference Dataset For The Clinical Domain

Chaitanya Shivade

This is a resource for training machine learning models for language inference in the medical domain.

natural language inference recognizing textual entailment

Published: Oct. 1, 2019. Version: 1.0.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, Majid Afshar

This is the data storage for BioNLP Workshop Shared Task 1A: Problem List Summarization.

bionlp clinical natural language processing electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0


Database Credentialed Access

Medication Extraction Labels for MIMIC-IV-Note Clinical Database

Akshay Goel, Almog Gueta, Omry Gilon, Sofia Erell, Amir Feder

Medication extraction NLP labels for 600 discharge summaries in MIMIC-IV-Note dataset.

Published: Dec. 12, 2023. Version: 1.0.0


Database Credentialed Access

RuMedNLI: A Russian Natural Language Inference Dataset For The Clinical Domain

Pavel Blinov, Aleksandr Nesterov, Galina Zubkova, Arina Reshetnikova, Vladimir Kokh, Chaitanya Shivade

RuMedNLI is the full counterpart dataset of MedNLI in Russian language.

natural language inference recognizing textual entailment russian language

Published: April 1, 2022. Version: 1.0.0


Database Restricted Access

Gout Emergency Department Chief Complaint Corpora

John David Osborne, Tobias O'Leary, Amy Mudano, James Booth, Giovanna Rosas, Gurusai Sujitha Peramsetty, Anthony Knighton, Jeff Foster, Ken Saag, Maria Ioana Danila

A corpus of chief complaints tagged with predicted gout flare status and chart reviewed gout flare status. Ideal for input to masked language model training to supplement lengthy clinical text notes.

gout emergency department nlp

Published: Oct. 19, 2020. Version: 1.0


Database Open Access

PADS - Parkinsons Disease Smartwatch dataset

Julian Varghese, Alexander Brenner, Lucas Plagwitz, Catharina van Alen, Michael Fujarski, Tobias Warnecke

The PADS dataset contains smartwatch-based records from interactive neurological assessments of Parkinsons disease patients, differential diagnoses and healthy controls. The data is complemented with non-motor symptoms and medical history information

wearables movement disorders parkinsons disease

Published: March 25, 2024. Version: 1.0.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, Majid Afshar

This is the data storage for BioNLP Workshop Shared Task 1A: Problem List Summarization.

bionlp clinical natural language processing electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0