Yuval Tabach
Hebrew University of Jerusalem
Idit Bloch
Hebrew University of Jerusalem
Elad Sharon
Hebrew University of Jerusalem
Yossi Matias
Google (United States)
Marinka Žitnik
Harvard University
Avinatan Hassidim
Bar-Ilan University
Or Zuk
Hebrew University of Jerusalem
Eyal Oren
San Diego State University
Dana Sherill-Rofe
Hebrew University of Jerusalem
Lior Nissim
Hebrew University of Jerusalem
@article{Tsaban_2021,
  doi = {10.1093/nargab/lqab024},
  url = {https://doi.org/10.1093%2Fnargab%2Flqab024},
  year = 2021,
  month = {apr},
  publisher = {Oxford University Press ({OUP})},
  volume = {3},
  number = {2},
  author = {Tomer Tsaban and Doron Stupp and Dana Sherill-Rofe and Idit Bloch and Elad Sharon and Ora Schueler-Furman and Reuven Wiener and Yuval Tabach},
  title = {{CladeOScope}: functional interactions through the prism of clade-wise co-evolution},
  journal = {{NAR} Genomics and Bioinformatics}
}
@article{Stupp2022.04.13.22273438,
  author = {Stupp, Doron and Barequet, Ronnie and Lee, I-Ching and Oren, Eyal and Feder, Amir and Benjamini, Ayelet and Hassidim, Avinatan and Matias, Yossi and Ofek, Eran and Rajkomar, Alvin},
  title = {Structured Understanding of Assessment and Plans in Clinical Documentation},
  year = {2022},
  doi = {10.1101/2022.04.13.22273438},
  publisher = {Cold Spring Harbor Laboratory Press},
  url = {https://www.medrxiv.org/content/early/2022/04/17/2022.04.13.22273438},
  journal = {medRxiv}
}
The dataset presented here contains annotations of the assessment and plan sections of notes from the publicly available, de-identified MIMIC-III dataset, marking the active problems, their assessment descriptions, and plan action items. Action items are additionally marked as one of 8 categories (listed below). The dataset contains over 30,000 annotations of 579 notes from distinct patients, annotated by 6 medical residents and students.

The dataset is divided into 4 partitions: a training set (481 notes), a validation set (50 notes), a test set (48 notes), and an inter-rater set. The inter-rater set contains the annotations of each of the raters over the test set. Rater 1 in the inter-rater set should be regarded as an intra-rater comparison (details in the paper). The labels underwent automatic normalization to capture entire word boundaries and remove flanking non-alphanumeric characters.

Code for transforming labels into TensorFlow examples and training models as described in the paper will be made available on GitHub: https://github.com/google-research/google-research/tree/master/assessment_plan_modeling

To use these annotations, the user additionally needs to obtain the text of the notes, which is found in the NOTE_EVENTS table of MIMIC-III. Access to MIMIC-III must be acquired independently (https://mimic.mit.edu/).

Annotations are given as character spans in a CSV file with the following schema:

| Field | Type | Semantics |
|---|---|---|
| partition | categorical (one of [train, val, test, interrater]) | The set of ratings the span belongs to. |
| rater_id | int | Unique id for each of the raters. |
| note_id | int | The note’s unique note_id, links to the MIMIC-III notes table (as ROW-ID). |
| span_type | categorical (one of [PROBLEM_TITLE,
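
The join between the annotation CSV and the note text might look like the following minimal Python sketch. The schema shown above is cut off before the character-offset fields, so the column names `char_start` and `char_end` are hypothetical placeholders, as is the assumption that the end offset is exclusive. The sample rows and note strings are synthetic and are not taken from MIMIC-III.

```python
import csv
import io
from collections import defaultdict


def load_spans(csv_file, partition):
    """Group the annotation rows of one partition by note_id."""
    spans_by_note = defaultdict(list)
    for row in csv.DictReader(csv_file):
        if row["partition"] == partition:
            spans_by_note[int(row["note_id"])].append(row)
    return dict(spans_by_note)


def span_text(note_text, row, start_col="char_start", end_col="char_end"):
    """Slice the annotated characters out of the note text.

    The offset column names are assumptions (not confirmed by the schema),
    and the end offset is assumed to be exclusive.
    """
    return note_text[int(row[start_col]):int(row[end_col])]


# Synthetic stand-ins for the annotation CSV and for note text that would
# normally be read from the MIMIC-III notes table, keyed by note_id.
SAMPLE_CSV = """partition,rater_id,note_id,span_type,char_start,char_end
train,1,42,PROBLEM_TITLE,0,3
test,2,43,PROBLEM_TITLE,0,4
"""
SAMPLE_NOTES = {42: "CHF: continue diuresis", 43: "COPD: nebs"}

train = load_spans(io.StringIO(SAMPLE_CSV), "train")
for note_id, rows in train.items():
    for row in rows:
        print(note_id, row["span_type"], span_text(SAMPLE_NOTES[note_id], row))
```

With the real files, `SAMPLE_CSV` would be replaced by the opened annotation file and `SAMPLE_NOTES` by a mapping from note id to note text built from the independently obtained MIMIC-III notes table.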