As healthcare organizations increasingly turn to artificial intelligence (AI) for innovative solutions, one area that stands to benefit is drug discovery and repurposing. AI’s ability to sift through vast amounts of biomedical data, including clinical trials and patient records, holds immense promise for accelerating the identification of new therapeutic uses for existing drugs. However, the effectiveness of AI in this domain depends heavily on its capacity to comprehend and analyze complex biomedical texts. Without the right training, AI tools may struggle to deliver accurate insights.
In our white paper, NLP and generative AI in life sciences and precision medicine, we delve into how natural language processing (NLP) models, when trained on robust clinical terminology with deep biomedical expertise, are key to unlocking AI’s potential in drug discovery and repurposing.
Drug discovery and drug repurposing
A critical step in drug discovery and drug repurposing is identifying evidence in the massive and rapidly growing biomedical literature to help generate hypotheses. Systematic literature review and data mining support this work by building a knowledge base, assisting research gap analysis, synthesizing evidence, and directing research. An effective literature review also records the details needed to meet traceability requirements for FDA regulatory submission.
The process of literature review poses many challenges. It is:
- Labor intensive due to the volume of articles in sources such as PubMed Central® (PMC), MEDLINE®, and Online Mendelian Inheritance in Man (OMIM)
- Prone to errors as a highly manual process
- Difficult to keep current, because sources grow and evolve rapidly
Generative AI and NLP for drug discovery and drug repurposing
Generative AI can significantly improve literature analysis for drug discovery and drug repurposing. Used together, NLP and generative AI support each step of the process: study protocol setting, literature retrieval, abstract screening, full-text screening, data element extraction from full-text articles, results summary, and data visualization. Unique NLP tasks predict articles’ relevance based on their title, abstract, and other metadata. Named entity recognition parses full-length articles, extracts data elements from both text and tables, and highlights supporting information. With AI-automated literature review and mining, one can specify a protocol in natural language, speed up the process of defining fine-tuning tasks, and create a “living” system that continuously updates the relevant literature.
While the steps in each literature review are the same and can be made more efficient with AI, the NLP requirements vary with the business goal, disease, and compound. Even work in the same disease space requires unique knowledge to produce reliable results. Supported by domain expertise, data scientists can specify a protocol and fine-tune models that extract and summarize the information unique to each study. Ultimately, scientists can dedicate more time to ensuring data quality and synthesizing evidence while staying current.
Unique NLP tasks predict articles’ relevance based on their title, abstract, and other metadata. [1]
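To make the relevance-prediction idea concrete, here is a minimal Python sketch of abstract screening. It is not the method used in the cited work: it swaps a trained NLP model for a simple keyword score, and the function names, the term lists, the `title_weight` value, and the three toy articles are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance_score(article, include_terms, exclude_terms, title_weight=2.0):
    """Score one article against a screening protocol.

    Title hits count title_weight times as much as abstract hits (an
    illustrative assumption); each exclusion-term hit subtracts one point.
    """
    title_counts = Counter(tokenize(article["title"]))
    abstract_counts = Counter(tokenize(article["abstract"]))
    score = 0.0
    for term in include_terms:
        score += title_weight * title_counts[term] + abstract_counts[term]
    for term in exclude_terms:
        score -= title_counts[term] + abstract_counts[term]
    return score


def screen(articles, include_terms, exclude_terms, threshold=1.0):
    """Return (id, score) pairs at or above threshold, highest score first."""
    scored = [
        (a["id"], relevance_score(a, include_terms, exclude_terms))
        for a in articles
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy placeholder records, not real publications.
ARTICLES = [
    {"id": "a1", "title": "Repurposing drug A for disease X",
     "abstract": "We assess drug A as a repurposing candidate in a disease X cohort."},
    {"id": "a2", "title": "Hospital staffing survey",
     "abstract": "A survey of staffing levels in rural clinics."},
    {"id": "a3", "title": "Drug A pharmacokinetics in mice",
     "abstract": "An animal study of drug A clearance."},
]
INCLUDE = ["repurposing", "cohort"]  # hypothetical protocol inclusion terms
EXCLUDE = ["mice", "animal"]         # hypothetical protocol exclusion terms

if __name__ == "__main__":
    print(screen(ARTICLES, INCLUDE, EXCLUDE))
```

In practice the keyword score would be replaced by a classifier fine-tuned on screening decisions from earlier reviews, but the surrounding workflow (score each record, apply a threshold, rank what remains for human review) keeps the same shape.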
An IMO Health example
Accelerating drug repurposing with an AI-driven framework
A recent study developed a framework for applying generative AI to drug repurposing studies. IMO Health scientists used NLP to extract biomedical entities and relations from 35 million PubMed abstracts. Using deep learning-based models, they built a knowledge graph of 20,000 entities (drugs, diseases, genes, etc.) and 10 million relations (“inhibits,” “treats,” “stimulates,” etc.), along with scoring systems to predict the “treats” relation for each drug-disease pair. The evaluation module applied link prediction to 15 successful pairs of drugs and their new indications and found that all were ranked in the top 0.5% across all diseases. [2]
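The "top 0.5% across all diseases" check can be illustrated with a short Python sketch. It assumes each candidate disease already has a predicted "treats" score for a given drug. The study's actual scoring systems are not described here, so the synthetic scores, the function names, and the tie-handling rule below are all hypothetical.

```python
def rank_of_target(scores, target):
    """Return (rank, percent) for target among all scored candidates.

    rank is 1 for the best score; percent is rank / number of candidates * 100.
    Ties are resolved optimistically: only strictly higher scores worsen the rank.
    """
    target_score = scores[target]
    rank = 1 + sum(1 for value in scores.values() if value > target_score)
    percent = 100.0 * rank / len(scores)
    return rank, percent


def in_top_percent(scores, target, cutoff_percent=0.5):
    """True if target's rank falls within the top cutoff_percent of candidates."""
    _, percent = rank_of_target(scores, target)
    return percent <= cutoff_percent


# Synthetic placeholder scores for one hypothetical drug against 1,000 diseases.
# A real system would take these from a trained link-prediction model.
TOY_SCORES = {f"disease_{i}": i / 1000 for i in range(1000)}

if __name__ == "__main__":
    for disease in ("disease_997", "disease_990"):
        rank, percent = rank_of_target(TOY_SCORES, disease)
        print(disease, rank, round(percent, 2), in_top_percent(TOY_SCORES, disease))
```

Here `disease_997` is outscored by two candidates, so it ranks third of 1,000 and passes the 0.5% cutoff, while `disease_990` ranks tenth and does not. Running the same check over every known drug and new-indication pair yields the kind of summary reported above.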
The white paper also covers the applications of NLP and generative AI in clinical trial optimization, disease phenotyping and precision medicine, and adverse drug reactions.
[1] Soysal E, Warner J, Wang J, et al. Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP. Stud Health Technol Inform. Aug 2019. Accessed via: https://pubmed.ncbi.nlm.nih.gov/31438083/
[2] Huang LC, Li Y, Lee K, et al. Knowledgesphere: An Automated and Integrative Framework for Drug Repurposing Empowered By Knowledge Graph and AI. Value in Health, Volume 26, Issue 6, S2. June 2023. Accessed via: https://www.ispor.org/heor-resources/presentations-database/presentation/intl2023-3668/127231