Author: Jane S. Moon, MD
International Anesthesia Research Society
The Daily Dose April 2023
The electronic health record (EHR) has become ubiquitous in our professional lives, and utilizing electronic data to build clinical decision support systems has the potential to improve patient care. However, what are the challenges involved with applying these machine learning models to clinical practice? This question was explored in depth in “Putting Electronic Data to Work at the Bedside: Clinical Decision Support for the 21st Century Clinician,” a dynamic session moderated by Jeanine Wiener-Kronish, MD, Distinguished Henry Isaiah Dorr Professor of Research and Teaching at Massachusetts General Hospital, on Saturday, April 15, at the IARS 2023 Annual Meeting.
Vesela Kovacheva, MD, PhD, Attending Anesthesiologist at Brigham and Women’s Hospital and Assistant Professor of Anaesthesia at Harvard Medical School, spoke on “Opportunities to Integrate Big Data into Clinical Practice.” According to Dr. Kovacheva, although big data is everywhere in healthcare today, barriers to data access can limit research. Examples of such obstacles include a lack of programming expertise, outdated infrastructures, and the highly regulated nature of healthcare environments. Other challenges to the rapid and efficient use of AI technologies include the heterogeneity of EHR data, the lack of medical device interoperability, and the scarcity of data science resources.
As an antidote to such issues, Dr. Kovacheva introduced a novel artificial intelligence (AI) system called the Medical Record Longitudinal AI System (MERLIN), which has tremendous potential for clinical implementation. Unlike current AI technologies, MERLIN is a sophisticated platform that acquires clinical data from multimodal sources and stores it in its simplest (“atomized”) form so that it is always “clean” and available for downstream projects. The stored data can then be processed at lightning speed to generate datasets that are accessible to a variety of researchers for the creation of a wide range of machine-learning models (Kovacheva 2023).
As a specific example, Dr. Kovacheva described her team’s successful application of MERLIN in the prediction of preeclampsia, a leading cause of maternal morbidity and mortality (Kovacheva 2023). Because risk assessment for preeclampsia is based primarily on pre-pregnancy clinical factors, a large percentage of the pregnancies that go on to develop preeclampsia are missed. Dr. Kovacheva successfully used MERLIN to incorporate not only clinical risk factors, but also a hypertension genetic (i.e., polygenic) risk score from the Massachusetts General Brigham Biobank, to accurately predict individual risk for preeclampsia in both early and late pregnancy.
Interestingly, while evaluating the predictive power of their machine-learning models, Dr. Kovacheva’s team discovered and rectified a dataset problem that brought to light the issue of healthcare inequity. They initially found the predictive power of their models to be significantly lower in Black patients than in White patients. Upon investigation, they discovered that more Black patients were lost to clinical follow-up and thus had incomplete data. When they later excluded the incomplete datasets from their predictive power calculations, they found that the numbers equalized. Reflecting upon this phenomenon, Dr. Kovacheva emphasized the importance not only of complete data, but also of paying attention to the social determinants of health when creating machine-learning models.
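The kind of check described here, comparing model performance across patient groups and then asking whether missing follow-up explains the gap, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are made up, the group labels "A" and "B" are placeholders, and simple accuracy stands in for whatever performance measure the team actually used. It is not the team's method or data.

```python
from collections import defaultdict

# Hypothetical, made-up records:
# (group, has_follow_up, recorded_outcome, predicted_outcome)
# Assumption: when follow-up is missing, the recorded outcome may be wrong
# simply because a later diagnosis was never captured.
RECORDS = [
    ("A", True, 1, 1),
    ("A", True, 0, 0),
    ("A", True, 1, 1),
    ("A", False, 0, 1),
    ("B", True, 1, 1),
    ("B", False, 0, 1),
    ("B", False, 0, 1),
    ("B", True, 0, 0),
]


def accuracy_by_group(rows):
    """Fraction of predictions that match the recorded outcome, per group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, _has_follow_up, recorded, predicted in rows:
        total[group] += 1
        correct[group] += int(recorded == predicted)
    return {group: correct[group] / total[group] for group in total}


def missing_follow_up_rate(rows):
    """Fraction of records without follow-up, per group."""
    missing = defaultdict(int)
    total = defaultdict(int)
    for group, has_follow_up, _recorded, _predicted in rows:
        total[group] += 1
        missing[group] += int(not has_follow_up)
    return {group: missing[group] / total[group] for group in total}


accuracy_all = accuracy_by_group(RECORDS)
accuracy_complete = accuracy_by_group([r for r in RECORDS if r[1]])
missing_rate = missing_follow_up_rate(RECORDS)

print("Accuracy, all records:      ", accuracy_all)
print("Missing follow-up rate:     ", missing_rate)
print("Accuracy, complete records: ", accuracy_complete)
```

In this toy data, the apparent gap between groups disappears once records without follow-up are set aside, and the missing follow-up rate shows where the incomplete data is concentrated.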
Michael Mathis, MD, Assistant Professor of Anesthesiology at the University of Michigan and Research Director of the Multicenter Perioperative Outcomes Group, then spoke on “Predictive Algorithms in the Cardiac ICU.” Dr. Mathis presented his team’s 2022 study (Mathis 2022) on the prediction of postoperative deterioration in cardiac surgery patients using EHR and physiologic waveform data as an example to highlight broader challenges in the translation of prediction algorithms to clinical practice.
The study authors found that prediction models of clinical deterioration that used both EHR and processed signal waveform data (from EKG, pulse plethysmography, and arterial catheter tracing) performed better than those that used either modality alone. However, this high performance was still primarily driven by waveform data. For the purposes of his talk, Dr. Mathis focused on the limitations of his own study to elucidate some of the major obstacles to clinical application of prediction models.
First, he discussed the phenomenon of “dataset shift,” which he defined as an “evolving mismatch between testing and training data.” In his study, for example, the best performing prediction model showed decreased performance in the 2017 to 2020 test set compared to the 2013 to 2017 training set. This could be explained by time-sensitive factors like IT software updates, changes in EHR documentation, evolving clinical practice behaviors, and shifting patient demographics. To mitigate the unintended consequences of dataset shift, he emphasized the need for ongoing vigilance by frontline clinicians and continuous oversight by a clinical data governance team (Finlayson 2021).
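One way to picture the monitoring that dataset shift calls for is to score the same model on an earlier and a later period and flag a drop in discrimination. The sketch below is only an illustration under stated assumptions: the labels and risk scores are made up, the area under the ROC curve (AUROC) is chosen here as the performance measure, and the 0.05 tolerance and the function names are hypothetical. It does not reproduce the study's data or analysis.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def shift_alert(reference_auroc: float, recent_auroc: float,
                tolerance: float = 0.05) -> bool:
    """Return True when performance has dropped by more than the tolerance.
    The tolerance is an arbitrary, illustrative threshold."""
    return (reference_auroc - recent_auroc) > tolerance


# Made-up labels (1 = deterioration) and model risk scores for two periods.
earlier_labels = [0, 0, 0, 1, 1, 1]
earlier_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
later_labels = [0, 0, 0, 1, 1, 1]
later_scores = [0.2, 0.6, 0.4, 0.5, 0.3, 0.9]

earlier_auroc = auroc(earlier_labels, earlier_scores)
later_auroc = auroc(later_labels, later_scores)

print(f"Earlier period AUROC: {earlier_auroc:.2f}")
print(f"Later period AUROC:   {later_auroc:.2f}")
print("Review needed:", shift_alert(earlier_auroc, later_auroc))
```

A flag like this only signals that something has changed. Working out whether the cause is a software update, a documentation change, or a shift in practice or patient population still falls to the clinicians and governance team.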
Next, Dr. Mathis addressed the issue of “algorithmic bias”: the notion that the behaviors of clinicians participating in algorithm formation may be amplified in the prediction models. For example, in his study, the outcome measure of clinical deterioration was partially defined by clinician actions (e.g., initiation or escalation of a vasoactive agent), potentially leading to bias.
He also discussed the importance of fostering clinician trust in prediction models by striving to make them as accurate, transparent, credible, and actionable as possible. He emphasized the value of continuous algorithm governance, the promotion of AI literacy, and transparency about how specific machine learning models work. “A magical black box risk score doesn’t inspire trust,” he said. Utilizing variables that correlate with expert opinion can also increase credibility, and incorporating risk factors that can be modified in real time (e.g., blood pressure, pulse oximeter readings, hematocrit) can add clinical value.
Kyan Safavi, MD, MBA, Assistant Professor at Harvard Medical School and Director of the Massachusetts General Hospital (MGH) Electronic Safety Net, then shared lessons from the Electronic Safety Net (ESN) program he implemented at MGH in his presentation, “Translating Digital to the Bedside: The Electronic Safety Net.” He described the goal of the MGH ESN as guiding various teams to create custom-built algorithms from the common data ecosystem to address local safety challenges. Specifically, the MGH ESN has thus far been used to monitor patients for acute physiologic deterioration (Safavi 2021) and to track deviations from expected or evidence-based care plans, as in sepsis or lung rescue.
Dr. Safavi described the many challenges involved with translating data from the “pipeline to bedside” within healthcare environments that are by nature complicated, busy, and resource-limited. He identified some key components of successful clinical application of prediction/detection models (Marwaha 2022).
First, both the clinical and financial value of the programs must be clearly defined and accepted by institutions and clinicians alike. Second, it is important to build an effective team composed of experts in key subject matters: clinical operations, business analysis, information systems, and biomedical engineering. Third, the project must be novel but also “fit seamlessly” into existing workflows (Leon 2022). Fourth, careful attention must be paid to the modes and frequency of alarm delivery, with precautions taken against alarm fatigue and data overload. Finally, continual effort is needed to evaluate clinical efficacy and cost effectiveness.
To conclude the session, Dr. Wiener-Kronish moderated a robust Q&A discussion that covered a wide range of topics: ensuring data accuracy; implicit bias and its effect on algorithm creation; the role of clinicians in making ultimate practice decisions; translatability of local AI models to outside institutions; and the role of good leadership in promoting culture change.