
Heart failure is characterized by a weakening or damaged heart muscle, leading to a gradual buildup of fluid in the patient's lungs, legs, feet, and other parts of the body. The condition is chronic and incurable, and often leads to arrhythmia and sudden cardiac arrest. For centuries, it was famously treated with bloodletting and leeches by European barber-surgeons, at a time when physicians rarely operated on patients.
In the 21st century, the management of heart failure has become decidedly less medieval. Today, patients receive a combination of healthy lifestyle changes, prescribed medications, and in some cases, pacemakers. However, heart failure remains one of the leading causes of morbidity and mortality, placing a significant burden on healthcare systems worldwide.
“About half of people diagnosed with heart failure die within five years,” says Teya Bergamaschi, an MIT doctoral student in the lab of Collin Stultz, the Nina T. and Robert H. Rubin Professor in Medical Engineering and Science, and co-lead author of a new paper that introduces a deep learning model for heart failure prediction. “Understanding what happens to patients after admission is critical to allocating limited resources.”
The paper, published in eClinicalMedicine, a Lancet journal, by a team of researchers from MIT, Mass General Brigham, and Harvard Medical School, shares the results of the development and testing of PULSE-HF. The name stands, roughly, for “predicting changes in left ventricular systolic function from the ECG in heart failure patients.” The project was conducted in Stultz’s lab, which is affiliated with the MIT Abdul Latif Jameel Clinic for Machine Learning in Health. Developed and retrospectively tested on three different patient cohorts — drawn from Massachusetts General Hospital, Brigham and Women’s Hospital, and the public MIMIC-IV dataset — the deep learning model accurately predicts changes in left ventricular ejection fraction (LVEF), the percentage of blood pumped out of the heart’s left ventricle with each contraction.
A healthy human heart pumps approximately 50 to 70 percent of the blood in its left ventricle with each beat; anything below that range is considered a sign of a potential problem. “The model takes [an electrocardiogram] and outputs a prediction of whether the ejection fraction will fall below 40 percent within the next year,” says Tiffany Yau, an MIT doctoral student in Stultz’s lab and co-author of the PULSE-HF paper. “That is the most severe subgroup of heart failure.”
If PULSE-HF predicts that a patient’s ejection fraction is likely to worsen within a year, clinicians can prioritize that patient for follow-up. For low-risk patients, clinicians can reduce the number of office visits, sparing those patients time spent having 10 electrodes affixed to their bodies for a 12-lead ECG. The model could also be deployed in lower-resource clinical settings, such as rural clinics that do not typically have cardiac sonographers on staff to perform routine echocardiograms.
“The biggest thing that differentiates [PULSE-HF] from other ECG-based heart failure methods is that it predicts rather than detects,” Yau says. The paper notes that, to date, no other method exists to predict future LVEF decline in heart failure patients.
During testing and validation, the researchers measured PULSE-HF’s performance with a metric known as the area under the receiver operating characteristic curve (AUROC). AUROC measures a model’s ability to distinguish between classes on a scale from 0 to 1, where 0.5 corresponds to random guessing and 1 to perfect discrimination. PULSE-HF achieved AUROCs ranging from 0.87 to 0.91 across the three patient cohorts.
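To make the metric concrete, here is a minimal pure-Python sketch (not the paper’s code) using the Mann-Whitney formulation of AUROC: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative one, with ties counted as half-credit.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs the model ranks correctly,
    with tied scores counting as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# One of four positive/negative pairs is ranked incorrectly:
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because AUROC depends only on the ranking of scores, not their absolute values, it is a natural fit for a risk-stratification tool like PULSE-HF, where the goal is to decide which patients to see first.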
Of note, the researchers also built a version of PULSE-HF for single-lead ECG, which requires only one electrode to be placed on the body. Although the 12-lead ECG is generally considered superior because it is more comprehensive and accurate, the single-lead version of PULSE-HF performed as strongly as the 12-lead version.
Despite the elegant simplicity of the idea behind PULSE-HF, its execution, like most clinical AI research, was painstaking. “It took many years [to complete this project],” Bergamaschi recalls. “I iterated over and over again.”
One of the team’s biggest challenges was collecting, processing, and cleaning the ECG and echocardiogram datasets. Although the model aims to predict a patient’s ejection fraction, labels for the training data were not always readily available. Just as students learn from textbooks with answer keys, labels are important for helping a machine learning model correctly identify patterns in data.
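The article does not specify how the team constructed its labels, but a labeling rule of the kind it describes can be sketched as follows. Everything here is a hypothetical illustration: the function name, the data layout (per-patient echo dates with LVEF values), and the `None`-for-unlabelable convention are invented; only the one-year horizon and the 40 percent threshold come from the article.

```python
from datetime import date, timedelta

def label_from_echos(ecg_date, echos, horizon_days=365, ef_threshold=40.0):
    """Hypothetical labeling rule for one ECG: 1 if any follow-up echo
    within the horizon shows LVEF below the threshold, 0 if follow-up
    echos stay at or above it, and None if no echo falls inside the
    window (the example cannot be labeled and must be dropped)."""
    in_window = [ef for d, ef in echos
                 if ecg_date < d <= ecg_date + timedelta(days=horizon_days)]
    if not in_window:
        return None
    return 1 if min(in_window) < ef_threshold else 0

# An echo five months later showing LVEF 35% makes this a positive case:
print(label_from_echos(date(2022, 1, 1), [(date(2022, 6, 1), 35.0)]))  # 1
```

A rule like this makes the missing-label problem visible in code: every ECG without a follow-up echocardiogram in the window yields `None` and is lost to training, which is one reason assembling labeled clinical datasets takes so long.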
Models typically train best on clean, linear text in plain-text (TXT) format. Echocardiogram reports, however, usually arrive as PDFs, and when a PDF is converted to a TXT file, the text, broken up by line breaks and formatting, becomes difficult for the model to parse. The unpredictable nature of real-world data collection, such as restless patients and loose leads, also took a toll on the data. “There are a lot of signal artifacts that need to be removed,” Bergamaschi says. “It’s like a never-ending rabbit hole.”
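One common artifact-removal step, sketched here in plain Python purely as an illustration rather than as the paper’s actual pipeline, is correcting baseline wander, the slow drift in an ECG trace caused by breathing or electrode movement, by estimating the drift with a moving average and subtracting it.

```python
def remove_baseline_wander(signal, window=101):
    """Estimate slow baseline drift with a centered moving average
    and subtract it, leaving the faster cardiac waveform. Real
    pipelines typically use high-pass or median filters tuned to
    the ECG sampling rate; this is a minimal stand-in."""
    half = window // 2
    cleaned = []
    for i in range(len(signal)):
        # Average over a window centered on sample i, clipped at the edges.
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        baseline = sum(signal[lo:hi]) / (hi - lo)
        cleaned.append(signal[i] - baseline)
    return cleaned
```

The window length is the tuning knob: too short and the filter eats the ECG waveform itself; too long and the drift survives, which hints at why Bergamaschi describes artifact removal as a never-ending rabbit hole.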
Bergamaschi and Yau acknowledge that more sophisticated methods could filter the data for a cleaner signal, but there are limits to the usefulness of such approaches. “At what point do you stop?” Yau asks. “You have to think about your use case. Is it easiest to use a model that can handle slightly messy data? Probably so.”
The researchers anticipate that the next step for PULSE-HF will be testing the model in a prospective study with real patients whose future ejection fractions are unknown.
Despite the inherent challenges of getting a clinical AI tool like PULSE-HF to the finish line, including the risk of extending their PhDs by an additional year, the students feel their years of hard work have been worth it.
“I think what’s rewarding about it is exactly what makes it challenging,” Bergamaschi says. “A friend once said to me, ‘If you’re really lucky, you’ll find your calling within a year of graduating.’ … The way we are evaluated as researchers in [machine learning and health] is different from other researchers in the ML field. Everyone in this community understands the unique challenges that exist here.”
“There’s too much suffering in the world,” says Yau, who joined Stultz’s lab after a health incident made him realize the importance of machine learning in medicine. “Anything that seeks to alleviate suffering is something I consider a valuable use of my time.”
