With the growing adoption of artificial intelligence (AI) in healthcare, there’s a keen interest in its ability to interpret complex biomedical data, including medical literature, trial protocols, and patient records. Additionally, organizations are finding that AI’s potential for enhancing drug discovery, clinical trials, disease phenotyping, and precision medicine hinges on its understanding of clinical and biomedical text. Without training on this kind of text, machines cannot accurately understand important, complex concepts.
This is where specialized natural language processing (NLP) models and biomedical domain expertise become crucial.
Our latest white paper, NLP and generative AI in life sciences and precision medicine, explores the nuances of four key applications, demonstrating how sophisticated NLP models and deep biomedical knowledge are essential for extracting meaningful insights.
Only have time for an excerpt? Continue reading below.
Clinical trial optimization
Clinical trials are the gold standard for evaluating the effects of vaccines, drugs, medical devices, and treatments on human health outcomes, assessing their benefits and harms relative to standard treatments. Unstructured, narrative text is at the heart of several key trial steps, including protocol definition, protocol parsing, and patient recruitment.
Recruiting a representative and clinically meaningful population is a crucial step and one of the biggest barriers to the successful implementation of clinical trials. Suboptimal criteria selection can lead to low accrual, which in turn can leave a trial incomplete. Overly rigid criteria restrict patient access and may reduce the trial's relevance for patients who could otherwise benefit from the intervention. Challenges to effective and efficient recruiting include:
- Extracting eligibility criteria from lengthy clinical trial protocol documents and creating clear, succinct criteria. This task is critical, yet time-consuming and inefficient when done manually.
- Eligibility screening, which is a major bottleneck in recruitment, requiring clinical research staff to manually review patient medical history and clinical conditions, and match them to trial eligibility criteria.
Generative AI and NLP for clinical trials
Integrating NLP and generative AI into clinical trial design and recruitment reduces the time required to initiate and conduct clinical trials, enhances the representativeness of the participant pool, and supports stronger trial results, ultimately benefiting both investigators and patients.
To standardize eligibility definitions, NLP techniques can automatically extract criteria from clinical trial protocol documents, identifying disease cohort characteristics, summarizing them into variables, and extracting endpoint measures. When NLP is optimized for medical domains and across disease areas, domain experts can fine-tune the approach using prompts that specify the entities, demographics, lab tests, and biomarkers unique to the trial. This lets them capture the specific relevant attributes, along with their values, modifiers, and related conditions, for both inclusion and exclusion criteria.
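To make the idea of "attributes with values and modifiers" concrete, here is a minimal Python sketch of what a structured criterion might look like. It is an illustration only: the `Criterion` fields, the `parse_criteria` helper, and the regular expression are all hypothetical, and the simple pattern matching stands in for a trained NLP model, which would handle far more varied language.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Criterion:
    """One structured eligibility criterion (hypothetical schema)."""
    kind: str            # "inclusion" or "exclusion"
    entity: str          # e.g. "age", "hba1c"
    operator: str        # one of >=, <=, >, <, =
    value: float
    unit: Optional[str]  # e.g. "years", "%"; None when absent


# Assumed input shape: "<entity> <operator> <number> [unit]".
# Real protocol text is much messier; this is a toy stand-in for an NLP model.
_PATTERN = re.compile(
    r"^\s*(?P<entity>[A-Za-z][A-Za-z0-9 ]*?)\s*"
    r"(?P<op>>=|<=|>|<|=)\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[A-Za-z%/]+)?\s*$"
)


def parse_criteria(text: str) -> List[Criterion]:
    """Turn a block of criteria text into structured Criterion records."""
    kind = None
    criteria = []
    for line in text.splitlines():
        stripped = line.strip().lstrip("-").strip()
        if not stripped:
            continue
        header = stripped.lower().rstrip(":")
        if header in ("inclusion criteria", "exclusion criteria"):
            kind = header.split()[0]  # "inclusion" or "exclusion"
            continue
        match = _PATTERN.match(stripped)
        if kind and match:
            criteria.append(
                Criterion(
                    kind=kind,
                    entity=match["entity"].strip().lower(),
                    operator=match["op"],
                    value=float(match["value"]),
                    unit=match["unit"],
                )
            )
    return criteria
```

Lines that do not fit the assumed pattern are skipped rather than guessed at, which mirrors the need to hand ambiguous criteria to a domain expert.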
Further, by structuring clinical trial information into a knowledge base, trial designers can simulate alternatives and optimize criteria. With links to real-world data, trial designers can more readily determine how many patients in a database are eligible. NLP also creates efficiencies in patient recruitment. By generating criteria text and executing queries against medical records data in the electronic health record (EHR), NLP enables an electronic eligibility process to prescreen and validate patients.
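As a rough sketch of electronic prescreening, the Python example below checks patient records against structured criteria and returns the patients who pass. Everything here is assumed for illustration: the `Criterion` fields, the dictionary-based patient records, and the rule that a missing value fails an inclusion criterion. A production system would query the EHR directly and would likely route patients with missing data to manual review.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List

# Map comparison symbols to the matching Python comparison functions.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}


@dataclass(frozen=True)
class Criterion:
    """One structured eligibility criterion (hypothetical schema)."""
    kind: str     # "inclusion" or "exclusion"
    entity: str   # key in the patient record, e.g. "age"
    op: str       # key in OPS
    value: float


def meets(criterion: Criterion, patient: Dict[str, float]) -> bool:
    """Return True when the patient's recorded value satisfies the criterion.

    Assumption: a missing value never satisfies a criterion, so it fails
    inclusion checks and does not trigger exclusion checks.
    """
    observed = patient.get(criterion.entity)
    if observed is None:
        return False
    return OPS[criterion.op](observed, criterion.value)


def is_eligible(patient: Dict[str, float], criteria: List[Criterion]) -> bool:
    """Eligible = every inclusion criterion met and no exclusion criterion met."""
    inclusions = [c for c in criteria if c.kind == "inclusion"]
    exclusions = [c for c in criteria if c.kind == "exclusion"]
    return all(meets(c, patient) for c in inclusions) and not any(
        meets(c, patient) for c in exclusions
    )


def prescreen(
    patients: Dict[str, Dict[str, float]], criteria: List[Criterion]
) -> List[str]:
    """Return the IDs of patients who pass the prescreen, in input order."""
    return [pid for pid, record in patients.items() if is_eligible(record, criteria)]
```

Calling `len(prescreen(patients, criteria))` with alternative criteria sets is one simple way to compare how different eligibility definitions would change the size of the candidate pool.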
An IMO Health example
Advancing clinical trial study with AI-powered eligibility criteria extraction
Using clinical trials data acquired from ClinicalTrials.gov, spanning oncologic, neurodegenerative, autoimmune, endocrine, and circulatory system disorders, IMO Health (Melax Tech) scientists developed a system composed of pre-processing, knowledge ingestion, GPT-based prompt modeling, post-processing, and interim evaluation modules. The system was evaluated on 180 manually annotated trials covering nine distinct diseases. It performed well in criteria entity identification, showed consistent proficiency, and handled the intricate contextual aspects of criteria effectively, achieving an accuracy of 78.95% across a wide range of diseases.
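The modules named above could be wired together roughly as in the Python sketch below. This is not the IMO Health system, whose internals are not described here. The function names, the prompt wording, the JSON reply format, and the exact-match definition of accuracy are all assumptions, and the language model is passed in as a plain callable so that no specific vendor API is implied.

```python
import json
import re
from typing import Callable, Dict, List


def preprocess(raw: str) -> List[str]:
    """Pre-processing: split raw criteria text into one cleaned line per item."""
    lines = (re.sub(r"\s+", " ", line).strip(" -*") for line in raw.splitlines())
    return [line for line in lines if line]


def build_prompt(criterion: str, knowledge: Dict[str, str]) -> str:
    """Knowledge ingestion + prompt modeling: add hints for terms in the text."""
    hints = [
        f"Hint: {term}: {definition}\n"
        for term, definition in knowledge.items()
        if term.lower() in criterion.lower()
    ]
    return (
        "Extract the clinical entities from the eligibility criterion.\n"
        "Reply with a JSON list of strings.\n"
        + "".join(hints)
        + f"Criterion: {criterion}"
    )


def postprocess(reply: str) -> List[str]:
    """Post-processing: parse the reply, dropping anything that is not a JSON list."""
    try:
        parsed = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    return sorted({str(item).strip().lower() for item in parsed})


def extract(
    raw: str, knowledge: Dict[str, str], model: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Run every criterion through prompt -> model -> post-processing."""
    return {
        criterion: postprocess(model(build_prompt(criterion, knowledge)))
        for criterion in preprocess(raw)
    }


def accuracy(predicted: Dict[str, List[str]], gold: Dict[str, List[str]]) -> float:
    """Evaluation: share of annotated criteria whose entity set matches exactly.

    Assumption: the source does not define its accuracy metric; exact match
    per criterion is used here purely for illustration.
    """
    if not gold:
        return 0.0
    correct = sum(
        1
        for criterion, entities in gold.items()
        if predicted.get(criterion) == sorted({e.strip().lower() for e in entities})
    )
    return correct / len(gold)
```

Keeping the model behind a callable means the same pipeline can be exercised with a stub during testing and with a real model in production, and the evaluation step can be rerun whenever prompts or knowledge hints change.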