With 2.5 quintillion bytes of data created every day, the business of mining, storing and analyzing increasingly large and varied data sets is booming and yielding impressive results, but there are pitfalls.
According to Julian M. Goldman, MD, an anesthesiologist at Massachusetts General Hospital and Harvard Medical School, and medical director of Partners HealthCare Systems Biomedical Engineering, all in Boston, the medical industry in particular stands to benefit from this technology as it matures. Self-taught artificial intelligence has already demonstrated better sensitivity and specificity for predicting heart attacks when compared with physicians, he reported, but this is just the beginning.
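For readers less familiar with the two metrics mentioned above, the following is a minimal sketch of how sensitivity and specificity are computed from paired outcomes and predictions. The function name and the ten example labels are hypothetical and invented for illustration; they are not data from the work Dr. Goldman described.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    sensitivity = true positives / all actual positives
    specificity = true negatives / all actual negatives
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # event occurred, predicted
    fn = sum(1 for a, p in pairs if a and not p)      # event occurred, missed
    tn = sum(1 for a, p in pairs if not a and not p)  # no event, none predicted
    fp = sum(1 for a, p in pairs if not a and p)      # no event, false alarm
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels: True means a heart attack occurred (actual) or was predicted.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, True, False]

sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A predictor can score well on one metric and poorly on the other, which is why comparisons such as the one cited above report both.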
At the American Society of Anesthesiologists’ INSIGHTS + INNOVATIONS 2017 Conference, Dr. Goldman noted several examples of sources of health care data that stand to benefit from Big Data and analytics:
- Claims: HIPAA promotes national standards for electronic data interchange between health care providers and insurance companies. Claim transactions include ICD diagnostic codes, medications, dates, provider IDs and cost.
- Electronic health record (EHR): diagnosis, treatment, prescriptions, lab tests and radiology.
- Pharmaceutical R&D: clinical trials data, genomic data.
- Patient behavior and sentiment data: “There is a lot of effort now to capture patient experience,” Dr. Goldman said.
- Medical device data: patient sensor data from the home or hospital. While some of these data go into the EHR, most data that medical devices collect are currently discarded.
“The more that we can do to improve data quality at the point where it’s collected, the more likely we’ll have data for Big Data analysis,” Dr. Goldman said. He also is a member of the editorial advisory board of Anesthesiology News.
Big Data or Bad Data?
In 2008, Google Flu Trends, a web service operated by Google, provided estimates of influenza activity in more than 25 countries by monitoring millions of users’ health tracking behaviors online. Data were computed for approximately 50 million Google queries entered weekly within the United States from 2003 to 2008. Searches were divided by state and compared with data from the CDC. The data set was then used to develop an algorithm to predict flu outbreaks based on searches for related symptoms. What worked remarkably well with historical data, however, ended up significantly overestimating influenza when used to predict the future.
“You may think you have a model that can predict things reliably and run into the same trap with Big Data analysis,” Dr. Goldman said. “Of course, these things aren’t going to work all the time, but when there’s a large public failure, it sets everything back, so we have to temper our expectations.”
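The trap described above, a model that fits historical data well but overestimates later, can be illustrated with a small sketch. This is not Google's algorithm: it is a plain least-squares line fitted to invented numbers, and the assumed cause of the drift (searches rising faster than illness in the later period) is a hypothetical chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_bias(slope, intercept, xs, ys):
    """Average of (predicted - actual); positive means overestimation."""
    return sum(slope * x + intercept - y for x, y in zip(xs, ys)) / len(xs)


# Invented weekly figures: flu-related search volume and reported cases.
hist_searches = [100, 150, 200, 250, 300]
hist_cases = [10, 15, 20, 25, 30]

# Assumed later period: search volume climbs, reported cases do not keep pace.
later_searches = [400, 500, 600]
later_cases = [25, 30, 35]

slope, intercept = fit_line(hist_searches, hist_cases)
hist_bias = mean_bias(slope, intercept, hist_searches, hist_cases)
later_bias = mean_bias(slope, intercept, later_searches, later_cases)

print(f"bias on historical data: {hist_bias:.1f} cases per week")
print(f"bias on later data:      {later_bias:.1f} cases per week")
```

With these made-up numbers the line matches the historical weeks essentially exactly, yet overestimates the later weeks by about 20 cases per week, because the relationship it learned between searches and illness no longer holds.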
As the Google Flu Trends story demonstrates, using Big Data effectively is not easy, and public failures tend to slow progress. Advanced technologies require massive data sets for development and validation, Dr. Goldman noted, so providers should expect iterative refinements in data quality and sources, driven by useful outcomes. Although Big Data ultimately will provide quality insights for the field, Dr. Goldman also advocated common sense when interpreting research.
“If you think the results sound suspicious, you probably should be suspicious,” he said. “Big Data analysis may be revealing something that you couldn’t see any other way, but if it flies in the face of foundational principles, you need to be sure you understand the data and how the analysis is being done.”
One surprising barrier to Big Data, Dr. Goldman added, is malware. Concerns about computer malware are likely to inhibit both data sharing and the building of smarter systems in the future. “It’s actually getting harder to pull data from systems in some instances because of malicious software,” he said. “People are very concerned, and the more concerned they become, the more manufacturers will lock down the harder-to-get data that we need. Software that’s acting as an evil actor with nefarious intents is a huge problem because it’s a complete change in the way we treat technology.”
Data Ownership Controversial
Randy Clark, MD, associate professor of anesthesiology at the University of Colorado, in Denver, expressed concerns about data sharing and the protection of private health information. “At the University of Colorado, we work for the School of Medicine, which is a separate economic community from the hospitals that own the Epic systems, but it’s the physicians who are inputting a great deal of information into those systems. Do you foresee conflicts in the future over ownership of the value that can be derived from these new systems?”
“Someone has to pay for the initial system to be built, and then, when the data are incomplete, someone has to pay to modify the system with Epic or another EHR,” Dr. Goldman said. “The economic issue is, who is making the investment and who is deriving benefit from the data? The first part is a business decision, and that is a local decision. The second part is a national and international issue about open data sharing in health care. Will we be a society that wants to find ways to do that for the betterment of our planet?”
“Given its economic value, data is becoming a currency,” added Maxime Cannesson, MD, PhD, professor of anesthesiology at the University of California, Los Angeles. “Outside parties want to buy it, and hospitals are going to sell it, preferring to go to the highest bidder. This issue is very polarizing. The future will be to share data, but the value is in the intelligence, which involves the veracity of the data and how it’s analyzed.”
“The applications built on top of the data and the algorithms used to modify that data are where the revenue stream is,” Dr. Goldman said.