Abstract

Background: Big data has the potential to revolutionize echocardiography by enabling novel research and rigorous, scalable quality improvement. Text reports are a critical part of such analyses, and ontology is a key strategy for promoting interoperability of heterogeneous data through consistent tagging. Currently, echocardiogram reports include both structured and free text and vary across institutions, hampering attempts to mine text for useful insights. Natural language processing (NLP) can help and includes both non-deep-learning and deep-learning (e.g., large language model, or LLM) based techniques. Challenges to date in using echo text with LLMs include small corpus size, domain-specific language, and a high need for accuracy and clinical meaning in model results.

Methods: We tested whether we could map echocardiography text to a structured, three-level hierarchical ontology using NLP. We used two methods: statistical machine learning (EchoMap) and one-shot inference using the Generative Pre-trained Transformer (GPT) large language model. We tested against eight datasets from 24 different institutions and compared both methods against clinician-scored ground truth.

Results: Despite all adhering to clinical guidelines, there were notable differences by institution in what information was included in data dictionaries for structured reporting. EchoMap performed best in mapping test-set sentences to the ontology, with validation accuracy of 98% for the first level of the ontology, 93% for the first and second levels, and 79% for the first, second, and third levels. EchoMap retained good performance across external test datasets and displayed the ability to extrapolate to examples not initially included in training. EchoMap's accuracy was comparable to one-shot GPT at the first level of the ontology and outperformed GPT at the second and third levels.
Conclusions: We show that statistical machine learning can achieve good performance on text-mapping tasks and may be especially useful for small, specialized text datasets. Furthermore, this work highlights the utility of a high-resolution, standardized cardiac ontology to harmonize reports across institutions.
Background: Big data has the potential to revolutionize echocardiography by enabling novel research and rigorous, scalable quality improvement. Text reports are a key part of such analyses. Currently, echocardiogram reports include both structured and free text and vary across institutions, hampering attempts to mine text for useful insights. Natural language processing (NLP) can help and includes both non-deep-learning and deep-learning (e.g., large language model, or LLM) based techniques. Challenges to date in using echo text with LLMs include small corpus size, domain-specific language, and a high need for accuracy and clinical meaning in model results.

Hypotheses: We tested whether we could map echocardiography text to a structured ontology using NLP.

Methods: We developed a three-tier ontology for echocardiographic anatomic structures, functional elements, and descriptive characteristics in an adult transthoracic echocardiogram using 919 sentences from UCSF's structured echocardiogram report text. We tested LLM fine-tuning as well as non-LLM techniques to map echocardiography sentences to this ontology. Two hundred twenty-eight UCSF sentences served as an internal test set. Additional test datasets included free text from UCSF reports; structured text sentences from two other hospitals; and sentences from reports representing 17 additional hospitals.

Results: Despite all adhering to clinical guidelines for reporting, there were notable differences by institution in what structural and functional information was included in structured reporting. A non-LLM hierarchical model performed best in mapping sentences to the ontology, with internal test accuracy of 96% for the first level of the ontology, 91% for the second level, and 77% for the third level. EchoMap retained good performance across diverse datasets and displayed the ability to extrapolate to ontological terms not initially included in training.
Conclusions: We show that non-LLM NLP methods can achieve good performance and may be especially useful for small, specialized text datasets where clinical meaning is important. These results highlight the utility of a high-resolution, standardized cardiac ontology to harmonize reports across institutions.
Abstract

Big data can revolutionize research and quality improvement for cardiac ultrasound. Text reports are a critical part of such analyses. Cardiac ultrasound reports include structured and free text and vary across institutions, hampering attempts to mine text for useful insights. Natural language processing (NLP) can help and includes both statistical and large language model (LLM)-based techniques. We tested whether we could use NLP to map cardiac ultrasound text to a three-level hierarchical ontology. We used statistical machine learning (EchoMap) and zero-shot inference using GPT. We tested eight datasets from 24 different institutions and compared both methods against clinician-scored ground truth. Despite all adhering to clinical guidelines, institutions differed in their structured reporting. EchoMap performed best, with validation accuracy of 98% for the first ontology level, 93% for the first and second levels, and 79% for all three. EchoMap retained performance across external test datasets and could extrapolate to examples not included in training. EchoMap's accuracy was comparable to zero-shot GPT at the first level of the ontology and outperformed GPT at the second and third levels. We show that statistical machine learning can map text to a structured ontology and may be especially useful for small, specialized text datasets.
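To make the mapping task concrete, the following is a minimal, hypothetical sketch of assigning a report sentence to a three-level ontology label (anatomic structure → functional element → descriptor) by lexical similarity. The example sentences, labels, and nearest-neighbor rule are invented for illustration and are not the actual EchoMap implementation.

```python
# Toy sentence-to-ontology mapper: nearest neighbor by token overlap.
# NOTE: all training sentences and ontology labels below are hypothetical
# illustrations, not the paper's real data or method.

TRAIN = [
    ("the left ventricle is mildly dilated",
     ("left ventricle", "size", "mildly dilated")),
    ("left ventricular systolic function is normal",
     ("left ventricle", "systolic function", "normal")),
    ("the aortic valve is trileaflet",
     ("aortic valve", "morphology", "trileaflet")),
    ("severe aortic stenosis is present",
     ("aortic valve", "stenosis", "severe")),
]

def tokens(sentence):
    """Lowercase bag-of-words for a report sentence."""
    return set(sentence.lower().replace(".", "").split())

def map_sentence(sentence, train=TRAIN):
    """Return the three-level label of the most lexically similar example."""
    t = tokens(sentence)
    best = max(train, key=lambda ex: len(t & tokens(ex[0])))
    return best[1]
```

A real system would replace token overlap with a trained statistical classifier per ontology level, but the input/output contract is the same: a free-text sentence in, a structured three-level tag out.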