To the Editor:
We read with interest the article by Arina et al. To our knowledge, it is the most comprehensive and complete review on this topic.
As the authors underline, the subject is very heterogeneous. For this reason, and because data processing is not yet standardized, literature searches should be optimized by running multiple queries with more specific terms, outcomes, and algorithms. During our research on infectious complications in urolithiasis, we found that increasing the specificity of the search terms surfaced additional papers that a more general search had missed. As an example, combining terms such as [((deep neural network) AND (urolithiasis)) AND (sepsis)] or [((machine learning) AND (urolithiasis)) AND (sepsis)] retrieved two additional studies on sepsis risk in specific settings. One explanation is that PubMed’s indexing structure places “neural network” under “artificial intelligence” but not under “machine learning.” The different results for the same topic reflect the key findings of Bernardi et al. In their integrative review, they state that one of the main causes of low-quality data is the variability of, and lack of consensus on, data quality domains and metrics. Standardizing the terminology, syntax, and schema of databases will have a positive impact on the pre-collection and research phases. Although the main scope of the article was to assess the overall quality of studies in this area, at this early stage we suggest searching for specific outcomes, settings, and machine learning algorithms when evaluating the performance of a particular model. This approach may also reduce selection bias, thanks to a more homogeneous studied population.
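To make the comparison above reproducible, the query variants can be run programmatically against PubMed. The following is a minimal sketch using Biopython’s Entrez interface; the e-mail address is a placeholder (NCBI asks users to identify themselves), and the hit counts will naturally vary with the date of the search.

```python
# Minimal sketch: compare PubMed hit counts for a broader query and the
# two more specific queries quoted above, via Biopython's Entrez module.
from Bio import Entrez

Entrez.email = "your.name@example.org"  # placeholder, required by NCBI policy

queries = [
    "(machine learning) AND (urolithiasis)",                    # broader search
    "((machine learning) AND (urolithiasis)) AND (sepsis)",     # more specific
    "((deep neural network) AND (urolithiasis)) AND (sepsis)",  # more specific
]

for query in queries:
    handle = Entrez.esearch(db="pubmed", term=query, retmax=0)
    record = Entrez.read(handle)  # parsed XML result, including total count
    handle.close()
    print(f"{query}: {record['Count']} records")
```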
The data quality concerns raised by Arina et al. are therefore real. As stated above, there is no standard definition of data quality in the literature, and no unique guidelines are available. The World Health Organization (Geneva, Switzerland) has defined data attributes (accuracy and validity, reliability, completeness, readability, timeliness and punctuality, accessibility, meaning or usefulness, confidentiality, security), and general recommendations divide data quality into four dimensions (quality of data sources, raw data, semantic conversion, and linking process); yet there are no data quality guidelines specific to healthcare settings. We find an interesting preliminary approach to the creation of such guidelines in the paper by Syed et al., where the authors created a Digital Health Data Quality–Dimension and Outcome framework consisting of (1) six dimensions of data quality (similar to those established by the World Health Organization); (2) the relationships between the dimensions of digital health data quality, with consistency being the most influential dimension, affecting all the others; (3) five digital health data quality outcomes: clinical, clinician, research, business process, and organizational; and (4) the relationships between data quality dimensions and data quality outcomes, with the consistency and accessibility dimensions influencing all outcomes. However, despite the growing concern over healthcare data quality, it seems that none of the articles included in this review shared a complete a priori data quality checking protocol. Data pipelines are becoming increasingly complex; at every stage, from storage to transformation and analysis, there must be clear methods to guarantee and verify the quality of the data. A lack of clarity over this process could lead to a black-box effect, in which clinicians “trust” the algorithm without fully understanding the clinical support tool and without being able to assess the quality of its output.
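As an illustration of what one step of such an a priori checking protocol could look like, the sketch below scores three of the dimensions discussed above (completeness, validity, and consistency) on a toy tabular dataset. The column names and plausibility ranges are hypothetical, not drawn from any of the cited studies.

```python
# Illustrative sketch of an a priori data quality check for a tabular
# clinical dataset, covering completeness, validity, and consistency.
# Column names and plausibility ranges are hypothetical examples.
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    report = {}
    # Completeness: share of non-missing values per column.
    report["completeness"] = (1 - df.isna().mean()).to_dict()
    # Validity: share of values inside a predefined plausibility range.
    report["validity_age"] = float(df["age"].between(0, 120).mean())
    # Consistency: admission date must not postdate discharge date.
    consistent = pd.to_datetime(df["admission"]) <= pd.to_datetime(df["discharge"])
    report["consistency_dates"] = float(consistent.mean())
    return report

df = pd.DataFrame({
    "age": [54, 130, None],  # 130 is implausible; None is missing
    "admission": ["2023-01-02", "2023-01-05", "2023-01-07"],
    "discharge": ["2023-01-10", "2023-01-04", "2023-01-09"],
})
print(quality_report(df))  # flags the missing age, the implausible age,
                           # and the discharge that precedes admission
```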
Several strategies for data management are emerging with the aim of improving data quality. Blockchain implementation is a promising approach for two reasons: (1) it strengthens data quality, acquisition, extraction (which is also important in research contexts), and security by providing data integrity, access control, data registration, nonrepudiation, and data versioning; and (2) it improves all the dimensions of data quality mentioned above by ensuring consistency and accessibility.
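The integrity and versioning properties mentioned in point (1) can be illustrated with a deliberately simplified hash chain: each record stores the hash of its predecessor, so any retroactive edit breaks the chain and is detected on verification. This toy example omits consensus, access control, and everything else a real blockchain deployment would require.

```python
# Toy hash chain illustrating the tamper-evidence a blockchain provides;
# not a full blockchain (no consensus, no access control).
import hashlib
import json

def block_hash(block: dict) -> str:
    # Deterministic hash of a block's canonical JSON serialization.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain: list, payload: dict) -> None:
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "payload": payload})

def verify(chain: list) -> bool:
    # Each stored prev_hash must match the recomputed hash of the prior block.
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain: list = []
append_block(chain, {"patient": "A", "creatinine": 1.1})
append_block(chain, {"patient": "A", "creatinine": 1.4})
assert verify(chain)
chain[0]["payload"]["creatinine"] = 0.9  # retroactive edit...
assert not verify(chain)                 # ...is detected
```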
In conclusion, although more and more articles are published on artificial intelligence data-driven tools, the lack of standardization of database terminology and of data quality leads to results that are difficult to evaluate, interpret, and compare. A more rigorous methodology is needed to reduce uncertainty about the validity of the results and thus create tools applicable in real practice. More effort is put into creating new machine learning models than into ensuring their reliability, and authors focus more on the performance of the model itself than on its validity and transparency. This article is an important milestone that clearly points out the need to focus on data quality, the sine qua non condition for reliable and secure clinical support tools for our patients.