Properly codified clinical concepts form the bedrock of artificial intelligence (AI), including natural language processing (NLP) and large language models (LLMs). Without them, AI solutions cannot reach their full potential of taking data from valuable to actionable. Plus, models built on robust clinical terminology that is correctly mapped to standardized codes are significantly more accurate overall.
In our recent webinar, Elevating Health Tech: How quality data drives effective AI, produced in collaboration with HLTH, industry leaders explore why superior data quality is essential to fueling effective NLP and AI solutions in healthcare.
Learn how accurate clinical coding, semantic interoperability, and comprehensive data regulation can bolster AI-driven insights, enhance patient care, and accelerate health tech innovation.
Short on time? Continue scrolling for a few key insights from this session.
Navigating the evolving landscape of data quality in healthcare AI
Panelists emphasized that as AI technology advances, data quality needs are shifting from standard structure and consistency concerns to more serious questions about “contextual completeness,” as Gigi Yuen-Reed, PhD, Chief Data and AI Officer at Cohere Health, put it.
Calum Yacoubian, MD, Director of Healthcare Strategy, Applied AI Science at IQVIA, noted that while highly specialized stakeholders mainly handled AI tasks in the beginning, the democratization of AI tools has allowed a broader range of individuals to engage with the technology, bringing both opportunities and new challenges.
“I think the questions and the pitfalls are the same in many ways as they were before, but now there’s just a much larger cohort of people who are experiencing them,” Yacoubian said.
An emerging concern is ensuring not only the quality of data going into AI models (“garbage in, garbage out”) but also the quality and reliability of AI-generated outputs, especially in clinical settings.
“As we get closer and closer to being a fundamental part of patient care, really understanding the traceability and auditability of the decision that was made through the AI [is key],” Ivana Naeymi-Rad, Chief Operating Officer at IMO Health, stated.
Enhancing AI data training in healthcare
When it comes to training AI models, ensuring high-quality data is paramount, the speakers said, especially when AI-generated data is reused in model training.
“It’s this virtuous cycle and delineating, is the data generated by humans versus AI or augmented AI?” Yuen-Reed said. “It’s a key part of what we have to do at Cohere… delineating the noise from the signal and the signal from the noise is quite a different technique depending on how the data is being created.”
Gayathri Narayan, Vice President and General Manager of AI Scribe at ModMed, highlighted the need for providers to interact with patients in a structured way to support AI training. She noted that, early on, one of their solutions captured mostly casual conversation rather than clinical details, which prompted ModMed to educate providers on communicating with the AI in mind, as if speaking to someone who cannot see.
For us at IMO Health, the challenge lies primarily in the inconsistency and incompleteness of clinical notes used to train LLMs, Naeymi-Rad said. She explained that models trained on publicly accessible data lack comprehensive clinical context, which can lead to issues in clinical decision support and billing.
“Similarly, around reimbursement, we’re seeing that a lot of out-of-the-box models will identify simple ICD-10 terms or codes, but there’s so much complexity in how a clinician documents,” Naeymi-Rad said. “You’ll start seeing things like a diabetes mellitus type 2 getting identified, but they’re not putting together the long-term insulin dependency and the chronic kidney disease, so you end up having three separate codes where you could have had a single bill…”
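To make that documentation gap concrete, here is a deliberately simplified Python sketch. It is not IMO Health's approach and not a substitute for actual ICD-10-CM coding guidelines (which, for example, also expect the CKD stage to be coded alongside the combination code); it simply contrasts term-by-term code lookup with logic that links the related concepts Naeymi-Rad describes.

```python
# Illustrative only: real ICD-10-CM mapping relies on full terminology tooling
# and coding guidelines, not a hard-coded dictionary.

# Naive, term-by-term lookup: each documented finding is coded in isolation.
NAIVE_LOOKUP = {
    "type 2 diabetes mellitus": "E11.9",  # T2DM without complications
    "chronic kidney disease": "N18.9",    # CKD, unspecified
    "long-term insulin use": "Z79.4",     # long-term (current) use of insulin
}

def naive_codes(findings: list[str]) -> list[str]:
    """Code each finding independently, producing a fragmented picture."""
    return [NAIVE_LOOKUP[f] for f in findings if f in NAIVE_LOOKUP]

def linked_codes(findings: list[str]) -> list[str]:
    """Recognize that the CKD is a diabetic complication and link the concepts."""
    present = set(findings)
    if {"type 2 diabetes mellitus", "chronic kidney disease"} <= present:
        codes = ["E11.22"]  # T2DM *with* diabetic chronic kidney disease
        if "long-term insulin use" in present:
            codes.append("Z79.4")
        return codes
    return naive_codes(findings)

findings = ["type 2 diabetes mellitus", "chronic kidney disease", "long-term insulin use"]
print(naive_codes(findings))   # ['E11.9', 'N18.9', 'Z79.4'] -- three disconnected codes
print(linked_codes(findings))  # ['E11.22', 'Z79.4'] -- the complication relationship is captured
```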
Harnessing AI for stronger data quality in healthcare
Speakers discussed recent advancements in applying AI for clinical data quality and efficiency. Yuen-Reed brought up one of Cohere’s key solutions, which involves extracting vital information from clinical notes to support fast, accurate prior authorization decisions. She explained that while AI helps streamline this process, human clinicians still play a role in determining the “ground truth,” or the definitive interpretation of notes, which can vary between clinicians and over time.
“One thing that has really given us a lot of mileage is trying to balance human labeling with on-the-ground learning,” Yuen-Reed said. “We observe how the clinician interacts with the notes in a subtle way… it really helps augment the ability to collect high quality data without too much human energy.”
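As a rough illustration of that idea only (this is not Cohere's actual pipeline, and every field name and weight below is hypothetical), the sketch combines a small set of explicit clinician labels with weaker labels inferred from how reviewers interact with a note, down-weighting the noisier implicit signal.

```python
from dataclasses import dataclass

@dataclass
class LabeledSpan:
    note_id: str
    text: str
    label: str      # e.g., "supports_approval" (hypothetical label set)
    weight: float   # confidence applied during training

def explicit_labels(annotations):
    """Clinician-provided labels: scarce and expensive, but high confidence."""
    return [LabeledSpan(a["note_id"], a["text"], a["label"], weight=1.0)
            for a in annotations]

def implicit_labels(interaction_events):
    """Weak labels inferred from reviewer behavior, e.g., the passage a clinician
    dwelled on before a decision; cheap but noisier, so down-weighted."""
    return [LabeledSpan(e["note_id"], e["focused_text"], e["outcome"], weight=0.3)
            for e in interaction_events if e.get("dwell_seconds", 0) > 10]

def training_examples(annotations, interaction_events):
    # A weighted mix lets the scarce human labels anchor the noisier implicit signal.
    return explicit_labels(annotations) + implicit_labels(interaction_events)
```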
Naeymi-Rad underscored IMO Health’s progress in entity extraction for clinical notes, enabling various clients—from ambient vendors to provider organizations—to better interpret clinical documentation. She also shared her excitement about AI’s potential in life sciences, where LLMs are helping analyze vast volumes of literature for insights.
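For readers newer to the space, entity extraction means pulling clinical concepts out of free text and normalizing them so downstream systems can reason over them. The toy sketch below uses a hand-written pattern list purely for illustration; a production system relies on trained clinical NLP models and full terminologies rather than regular expressions, and the "denies" qualifier in the sample note hints at why context and negation handling matter so much.

```python
import re

# Toy concept patterns for illustration; a real system uses trained models
# plus a clinical terminology, not regular expressions.
CONCEPT_PATTERNS = {
    "hypertension": r"\b(hypertension|high blood pressure)\b",
    "type 2 diabetes mellitus": r"\b(type 2 diabetes|t2dm|diabetes mellitus type 2)\b",
    "chronic kidney disease": r"\b(chronic kidney disease|ckd)\b",
}

def extract_entities(note: str):
    """Return (concept, matched_text, character_span) hits found in a note."""
    note_lower = note.lower()
    hits = []
    for concept, pattern in CONCEPT_PATTERNS.items():
        for match in re.finditer(pattern, note_lower):
            hits.append((concept, match.group(0), match.span()))
    return hits

note = "Pt with T2DM and CKD, denies high blood pressure."
for concept, text, span in extract_entities(note):
    print(f"{concept!r} matched {text!r} at {span}")
# Note: "denies high blood pressure" still matches the hypertension pattern,
# which is exactly why negation and context handling are part of the real problem.
```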
“I think if leveraged in the right way, AI will be a significant game changer for healthcare in this country, but I think we all need to band together and work collaboratively to solve it,” Naeymi-Rad said.