Data quality in healthcare and the elusive digital twin

Explore the challenges of data quality in healthcare and how to navigate unstructured data for more accurate "digital twins."

INSIGHT BRIEF

Bridging reality and medical records: Data quality and the elusive digital twin

In the digital era, where every click, search, and transaction is meticulously captured and analyzed, the healthcare industry faces a paradox. The goal of creating a “digital twin” for every patient—a comprehensive, up-to-date digital representation of their health—seems more attainable than ever. Yet, despite the vast troves of data at our disposal, from detailed genomic information to routine lab results, crafting an accurate digital counterpart remains an elusive endeavor. The reality is that the data underpinning these digital reflections is often riddled with gaps, unstructured formats, and inconsistencies.

Our latest insight brief, Bridging reality and medical records: Data quality and the elusive digital twin, explores the challenges that complicate data utility in healthcare. From the siloing of information within electronic health records (EHRs) to the variability in how data is captured and structured, each issue makes it harder to achieve a digital twin that truly reflects a patient’s health status.

Ready for a clearer picture?

Only have time for an excerpt? Continue reading to learn why unstructured data continues to be a barrier to data quality.

Unstructured data and varied data accessibility

Patient data is recorded in different formats depending on the specific condition, test, or specialty. For instance, diagnosis data is often documented with ICD-10-CM codes and surgeries with CPT® codes. While this type of information is readily incorporated into the patient record, nearly 80% of clinical data is unstructured, including PDFs and free text in clinical notes. Without tools to extract this unstructured data and integrate it into the EHR, its value remains untapped, which can leave gaps in the patient narrative.

Here, genomics provides a helpful example. Healthcare is still in the early days of gathering and using genomics data. While this data can help guide more precise and proactive care, it is often contained in PDFs, a file format that is not readily searchable in the EHR. As a result, highly personalized genomics information is essentially “trapped” in PDFs, where its insights cannot inform a robust and detailed digital twin.

For more on the challenges of creating high-quality clinical data, download the full insight brief, Bridging reality and medical records: Data quality and the elusive digital twin.

CPT is a registered trademark of the American Medical Association. All rights reserved.
