When & Where
Fri, May 14, 2021 - 1:00 PM to 2:00 PM
Remote (via Zoom)

Computational Text Analysis Working Group (CTAWG)

Title: [UCSF Bakar] Enhancing physicians’ prognoses using deep learning: an ergonomic UI to find similar patient groups and medical trends

Abstract: Our Capstone project applies Natural Language Processing and Deep Learning to mine de-identified clinical record data. Our goal was to transform the 250,000 hospital notes from the MIMIC-III and n2c2 NLP Research Data Sets into aggregated results that physicians can easily interpret. After converting the clinical notes into lists of medical concepts with Apache cTAKES, we analyzed these concepts with NLP methods and connected each one to the medical concepts that neighbor it in semantic context. We then created word embeddings and document embeddings and clustered them to group patients by shared features and similarity.
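To make the pipeline above concrete, here is a minimal sketch of grouping patients by the medical concepts found in their notes. It is an illustration, not the team's implementation: the patient IDs and concept lists are made up, a simple bag-of-concepts count vector stands in for the learned document embeddings, and a greedy cosine-similarity grouping stands in for whichever clustering method the project used. The names `embed`, `cosine`, and `cluster` are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy input: each note is already reduced to a list of medical
# concepts, standing in for the output of a concept extractor such as cTAKES.
notes = {
    "patient_a": ["chest pain", "hypertension", "aspirin"],
    "patient_b": ["chest pain", "hypertension", "statin"],
    "patient_c": ["fracture", "cast", "ibuprofen"],
    "patient_d": ["fracture", "cast", "physiotherapy"],
}


def embed(concepts, vocabulary):
    """Bag-of-concepts vector: one count per vocabulary term (an assumption,
    used here in place of a learned document embedding)."""
    counts = Counter(concepts)
    return [float(counts[term]) for term in vocabulary]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.5):
    """Greedy single-pass grouping: a patient joins the first cluster whose
    seed vector is at least `threshold` similar, else starts a new cluster."""
    clusters = []  # list of (seed_vector, member_ids)
    for patient_id, vector in vectors.items():
        for seed, members in clusters:
            if cosine(seed, vector) >= threshold:
                members.append(patient_id)
                break
        else:
            clusters.append((vector, [patient_id]))
    return [members for _, members in clusters]


vocabulary = sorted({c for concepts in notes.values() for c in concepts})
vectors = {pid: embed(concepts, vocabulary) for pid, concepts in notes.items()}
groups = cluster(vectors)
print(groups)
```

With this toy data, the two patients who share cardiac concepts end up in one group and the two who share fracture concepts in another. A real system would likely replace the count vectors with trained embeddings and the greedy pass with an established clustering algorithm.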

We developed a web-based user interface that displays our patient clusters and supports analyzing them at a glance. Using our platform, physicians, clinical practitioners, and researchers can search for the nearest neighbors of a new patient in terms of symptoms, laboratory results, diseases, or treatments. Doctors can thereby gain more insight into patient characteristics and optimize treatment plans based on the trends observed in similar medical profiles. We hope to contribute to progress on what has become one of the most significant challenges of Health 2.0: precision medicine, the science of delivering the right treatment to the right person at the right moment.
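The nearest-neighbor search described above could look roughly like the following sketch. It assumes every patient is already represented by a document embedding and ranks stored patients by cosine similarity to a new one. The embedding values, the patient IDs, and the function name `nearest_neighbors` are invented for illustration; the talk does not specify the similarity measure or index the platform uses.

```python
from math import sqrt

# Hypothetical document embeddings (made-up 3-dimensional values).
embeddings = {
    "patient_a": [0.9, 0.1, 0.0],
    "patient_b": [0.8, 0.2, 0.1],
    "patient_c": [0.0, 0.9, 0.4],
    "patient_d": [0.1, 0.8, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(query, embeddings, k=2):
    """Return the IDs of the k stored patients most similar to `query`."""
    scored = [(cosine(query, vector), pid) for pid, vector in embeddings.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pid for _, pid in scored[:k]]


# Embedding of a new patient (also made up for this example).
new_patient = [0.85, 0.15, 0.05]
neighbors = nearest_neighbors(new_patient, embeddings, k=2)
print(neighbors)
```

A brute-force scan like this is fine for a handful of patients. For a corpus on the order of hundreds of thousands of notes, an approximate nearest-neighbor index would probably be a better fit.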

Team Members: Bo Zhou, Samuel Harreschou, Sixtine Lauron, James Corbitt (Master of Engineering Students at UC Berkeley)


Gabriel Gomes, Ph.D., Mechanical Engineering Professor at UC Berkeley

Gundolf Schenk, Ph.D., Senior Biomedical Data Scientist at UCSF Bakar Institute of Computational Sciences


Training Keywords: Computational Text Analysis
Training Learner Level: Mixed Learning Levels