At the 2024 HLTH conference, Andrei Naeymi-Rad, VP of Corporate Strategy at IMO Health, delivered an insightful talk on the pitfalls of relying on open-source data standards like the Unified Medical Language System (UMLS) and Observational Medical Outcomes Partnership (OMOP) for artificial intelligence (AI) in healthcare. His overarching message was clear: “Good enough is not good in healthcare.”
Watch Naeymi-Rad’s full presentation here:
In a rush? Keep scrolling for four key takeaways that every healthcare leader should consider when addressing data quality challenges.
1. Open-source standards struggle with clinical nuance
Open-source terminologies like UMLS often rely on synonymy architecture and basic mappings, both of which fail to capture the nuances of clinical language.
“If I use an acronym like MI, that probably means Myocardial Infarction, right? But if I’m a pediatrician, MI can also mean Mitral Incompetence,” Naeymi-Rad said. “If I’m just using an open-source reference library operation to be the backbone of my data operations, this is a good example of where acronyms, eponyms, abbreviations will break that architecture and break it quite often.”
The inability of an AI model to understand such context can lead to errors in clinical documentation, billing, and overall model performance.
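To make the acronym problem concrete, here is a minimal Python sketch of why a flat acronym-to-term lookup breaks without clinical context. This is our own illustration, not IMO Health's implementation; the lookup tables and the resolve() helper are hypothetical.

```python
# Hypothetical illustration: a flat acronym map forces one "best" expansion,
# while a context-aware map keeps clinically distinct meanings separate.

FLAT_ACRONYM_MAP = {
    "MI": "Myocardial infarction",  # a single mapping silently discards alternatives
}

CONTEXT_AWARE_MAP = {
    "MI": {
        "cardiology": "Myocardial infarction",
        "pediatric cardiology": "Mitral incompetence",
    },
}

def resolve(acronym: str, specialty: str | None = None) -> str:
    """Resolve an acronym, using specialty context when it is available."""
    if specialty and acronym in CONTEXT_AWARE_MAP:
        by_specialty = CONTEXT_AWARE_MAP[acronym]
        if specialty in by_specialty:
            return by_specialty[specialty]
    return FLAT_ACRONYM_MAP.get(acronym, acronym)

print(resolve("MI"))                          # Myocardial infarction
print(resolve("MI", "pediatric cardiology"))  # Mitral incompetence
```

Without the specialty signal, both patients land on the same concept, and every downstream use of that data inherits the error.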
2. Coding inaccuracies impact billing and patient care
When standardized code systems like ICD-10-CM or SNOMED® are used on their own, they often miss the specificity needed to accurately capture complex patient conditions. For example, a statement describing type 2 diabetes mellitus with stage 3b chronic kidney disease and long-term (current) use of insulin cannot be fully captured by any single ICD-10-CM code.
“The only way to truly understand the full statement at scale is to make sure you’re taking the entirety of the statement as an understanding,” Naeymi-Rad said. “This is really, really important when you get into ambient documentation or understanding complex clinical conditions in multiple different areas of a statement.”
Misrepresented data not only affects reimbursement but also compromises care prioritization and population health insights.
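The sketch below shows what "taking the entirety of the statement" looks like in practice: one problem-list statement needs several linked ICD-10-CM codes, and coding fragments of the note in isolation drops both specificity and the relationships between conditions. The codes shown are for illustration only; verify any mapping against the current ICD-10-CM code set.

```python
# Illustrative only: representing one complex statement as linked codes
# versus coding its fragments independently.

statement = (
    "Type 2 diabetes mellitus with stage 3b chronic kidney disease, "
    "long-term (current) use of insulin"
)

# Coding the whole statement preserves the link between the diabetes,
# its renal manifestation, and the insulin use.
full_statement_codes = [
    ("E11.22", "Type 2 diabetes mellitus with diabetic chronic kidney disease"),
    ("N18.32", "Chronic kidney disease, stage 3b"),
    ("Z79.4", "Long term (current) use of insulin"),
]

# Coding fragments from separate parts of the note can yield unlinked,
# less specific codes and lose the causal relationship entirely.
fragment_codes = [
    ("E11.9", "Type 2 diabetes mellitus without complications"),
    ("N18.30", "Chronic kidney disease, stage 3 unspecified"),
]

print("Statement:", statement)
print("Coded as a whole:", [code for code, _ in full_statement_codes])
print("Coded in fragments:", [code for code, _ in fragment_codes])
```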
3. Crosswalks perpetuate coding errors, jeopardize patient safety
Crosswalks, which connect disparate terminology systems, result in miscoded and under-coded representations of patient populations. Even when paired with a capable large language model (LLM), crosswalks fail to capture the level of specificity required for an accurate diagnosis statement.
“If you’re not coding data appropriately at the front end, and you’re using crosswalks, and you’re under coding and miscoding your patient populations, you’re actually not providing the correct treatment protocols for your patients directly,” Naeymi-Rad said. “That can have an impact on decision support triggers.”
Ultimately, crosswalks can create a data governance and data fidelity problem for large institutions and organizations that are trying to understand the criticality of their patient populations better.
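Here is a hypothetical sketch of how a simple crosswalk table flattens specificity. The concept labels and mappings are invented for illustration and are not drawn from an actual SNOMED CT-to-ICD-10-CM map.

```python
# Hypothetical crosswalk: several distinct source concepts collapse onto
# one generic target code, so stage detail is lost on the way through.

hypothetical_crosswalk = {
    "source-concept-A (stage 3a CKD)": "N18.3",
    "source-concept-B (stage 3b CKD)": "N18.3",
    "source-concept-C (stage 3 CKD, unspecified)": "N18.3",
}

def crosswalk(concept: str) -> str:
    """Map a source concept to its target code, or flag it as unmapped."""
    return hypothetical_crosswalk.get(concept, "UNMAPPED")

# After the crosswalk, a stage 3b patient is indistinguishable from a
# stage 3a patient, so any decision-support rule keyed on stage 3b
# will never fire for them.
for concept in hypothetical_crosswalk:
    print(f"{concept!r} -> {crosswalk(concept)}")
```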
4. CDI and population groups can’t remedy underlying data quality issues
While Clinical Documentation Improvement (CDI) specialists and population groups can help mitigate data issues, they are not an effective solution to poor foundational data quality.
“We all know these are challenges—but still we lean back on the open-source terminology services that are out there to fill in these gaps because we feel like it’s good enough,” Naeymi-Rad said.
Relying on open-source data may seem cost-effective, but the financial and clinical impacts over time are far too great. At the end of the day, open-source terminologies lack specificity and context, resulting in workflow inefficiencies, loss of revenue, subpar patient care, and more.
Addressing data quality at the source is essential. See how IMO Health fits into the equation by contacting us at sales@imohealth.com or 847-272-1242 to chat with a team member today.
For a complimentary data quality assessment, click here.
SNOMED and SNOMED CT® are registered trademarks of SNOMED International.