Authors: Zako J et al.
Journal of Cardiothoracic and Vascular Anesthesia, 2026, 10.1053/j.jvca.2026.02.030
This systematic review examined how well artificial intelligence–assisted point-of-care ultrasound (POCUS) estimates left ventricular ejection fraction (LVEF) in real time across bedside clinical settings. The authors focused specifically on prospective observational studies in which AI was used during acquisition and/or interpretation of ultrasound images at the point of care, rather than retrospective loop analysis. Their goal was to determine whether AI-enhanced handheld or portable ultrasound can help non-cardiologists and non-radiologists assess systolic function when comprehensive echocardiography is not immediately available.
The authors searched PubMed/MEDLINE, Embase, and Cochrane from inception through June 11, 2025. Twelve prospective studies met inclusion criteria. These studies spanned intensive care units, emergency departments, perioperative settings, hospital wards, cardiology wards, COVID wards, and even home visits. Sample sizes ranged from 30 to 424 patients. Most studies evaluated proprietary platforms such as GE Vscan/LVivo, EchoNous KOSMOS, Mindray AI, Exo AI, and Us2.ai.
Overall, AI-assisted POCUS agreed reasonably well with reference standards when LVEF was assessed as a continuous value, but precision varied between studies. Correlation coefficients ranged from 0.56 to 0.92, and intraclass correlation coefficients ranged from 0.84 to 0.94. In several studies, Bland-Altman analyses showed wide limits of agreement, in some cases wide enough to make exact numeric LVEF estimation unreliable for high-stakes decisions. A recurring pattern was a tendency for AI to underestimate LVEF relative to the reference standard.
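For readers less familiar with Bland-Altman analysis, the sketch below shows how a bias and 95% limits of agreement are typically computed from paired LVEF readings. The paired values are synthetic and chosen only for illustration; they are not data from the review or from any included study, and the function name `bland_altman` is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(ai_values, ref_values):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (AI - reference); the 95% limits of agreement
    are bias +/- 1.96 * SD of the paired differences.
    """
    if len(ai_values) != len(ref_values) or len(ai_values) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - r for a, r in zip(ai_values, ref_values)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Synthetic LVEF values in percent (illustrative assumption, not study data).
ai_lvef = [52, 38, 61, 45, 30, 57, 48, 25]
ref_lvef = [55, 42, 60, 50, 35, 62, 47, 30]

bias, lower, upper = bland_altman(ai_lvef, ref_lvef)
print(f"bias: {bias:.2f} points")
print(f"95% limits of agreement: {lower:.2f} to {upper:.2f} points")
```

A negative bias in this convention means the AI reading is lower than the reference on average, which is the direction of error the review describes. The width of the interval between the two limits, rather than the bias alone, is what determines whether a single numeric estimate can be trusted.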
For categorical classification, especially using a 50% threshold to distinguish normal from reduced systolic function, performance was generally stronger. Sensitivity ranged from 70% to 93%, specificity from 89% to 100%, and reported area under the receiver operating characteristic curve ranged from 0.85 to 0.98 in studies that measured it. Weighted kappa values ranged from 0.49 to 0.83. These findings suggest that AI-assisted POCUS performs better as a screening or triage tool for reduced LVEF than as a substitute for precise quantitative echocardiography.
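The categorical metrics above can be illustrated with a short sketch that dichotomizes paired readings at the 50% threshold and counts agreement with the reference. The values are synthetic, the function name `sensitivity_specificity` is hypothetical, and treating LVEF strictly below 50% as "reduced" is an assumption made here; individual studies may have defined the boundary differently.

```python
def sensitivity_specificity(ai_values, ref_values, threshold=50.0):
    """Sensitivity and specificity of AI for detecting reduced LVEF.

    A reading is treated as 'reduced' when it is strictly below the
    threshold (an assumption of this sketch). The reference reading
    defines the true class.
    """
    tp = fn = tn = fp = 0
    for ai, ref in zip(ai_values, ref_values):
        ai_reduced = ai < threshold
        ref_reduced = ref < threshold
        if ref_reduced and ai_reduced:
            tp += 1
        elif ref_reduced and not ai_reduced:
            fn += 1
        elif not ref_reduced and ai_reduced:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reduced and one normal reference case")
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic LVEF values in percent (illustrative assumption, not study data).
ai_lvef = [52, 38, 61, 45, 30, 57, 48, 25]
ref_lvef = [55, 42, 60, 50, 35, 62, 47, 30]

sens, spec = sensitivity_specificity(ai_lvef, ref_lvef)
print(f"sensitivity: {sens:.2f}, specificity: {spec:.2f}")
```

In this toy data the one misclassification comes from a patient whose reference LVEF sits exactly at the threshold while the AI reading falls a few points lower. This shows how a small numeric error can flip the category for borderline values even when classification is otherwise accurate.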
A key practical point from the review is that AI may help novice or less-experienced users achieve clinically useful bedside assessments. Several studies showed that residents, medical students, nurses, or novice ultrasound operators were able to obtain AI-supported LVEF estimates that compared reasonably well with expert reference standards. This supports the idea that AI can reduce, though not eliminate, operator dependence.
At the same time, the review emphasizes major limitations in the evidence base. No included study was rated at low risk of bias across all QUADAS-2 domains. A frequent problem was patient flow and timing, including delays between the index and reference tests and exclusion of patients with failed image acquisition or unsuccessful AI analysis. Many studies excluded poor-quality scans from final analysis, which likely makes the technology look better than it would in real-world practice. In some studies, failure rates due to image quality were substantial.
Another important limitation is heterogeneity. Reference standards varied widely, ranging from expert Simpson biplane echocardiography to expert visual estimation. Operator skill levels also varied substantially. Clinical context mattered as well. For example, performance appeared less robust in the cardiac operating room, where image quality is often degraded by positioning, ventilation, and perioperative constraints. This means results cannot be generalized uniformly across all settings or all devices.
The review’s overall conclusion is appropriately cautious. AI-assisted POCUS can approximate reference LVEF and can reasonably identify reduced LVEF at the bedside, making it useful as a screening and triage tool when formal echocardiography is not immediately available. However, it should not yet be viewed as a replacement for comprehensive echocardiography, especially when precise LVEF measurement would change major management decisions. The authors call for future studies using intention-to-diagnose designs, standardized Simpson biplane reference standards, shorter delays between tests, clearer reporting of technical failures, and more consistent use of image quality feedback tools.
What You Should Know
This is an important review because it addresses a real bedside problem: many clinicians need rapid information about systolic function but do not always have immediate access to a full echocardiogram or an expert sonographer.
The technology appears most useful for triage rather than final diagnosis. In other words, AI-assisted POCUS may help answer whether LVEF is probably reduced, but it is less dependable for exact numeric quantification.
The evidence also suggests that AI can help non-expert users, which may expand access to bedside cardiac assessment in ICUs, emergency departments, perioperative areas, rural sites, and even home-based care.
Still, image quality remains a major bottleneck. AI does not remove the need for adequate views, and many of the best-looking performance numbers likely reflect studies that excluded technically difficult patients.
Clinically, this means AI-assisted LVEF assessment can be useful for screening and rapid decision support, but borderline values or management-changing results still require confirmation with formal echocardiography.
Key Points
Twelve prospective observational studies were included across ICU, ED, perioperative, ward, and community settings.
Agreement between AI-derived and reference LVEF was generally moderate to strong, but limits of agreement were often wide and AI tended to underestimate LVEF.
For classifying reduced LVEF, sensitivity ranged from 70% to 93% and specificity from 89% to 100%.
AI-assisted POCUS appears more reliable as a screening and triage tool than as a replacement for comprehensive echocardiographic quantification.
Many studies had important bias concerns, especially exclusion of failed scans and variability in timing, operators, and reference standards.
Image quality remains a major determinant of performance, even with AI assistance.
Thank you to the Journal of Cardiothoracic and Vascular Anesthesia for allowing us to summarize this article.