The data normalization journey begins with pulling data from multiple source systems and then integrating it into a centralized data repository, or data warehouse. However, data can only be said to be truly normalized when the multi-sourced data is accurately captured and attributed using a standardized, mutually understandable, and interoperable clinical terminology. Unfortunately, implementing an in-house normalization platform with these capabilities may be prohibitively expensive. While third-party normalization systems have upfront costs, they ultimately save money and effort by avoiding the implementation and maintenance expenditures of an in-house build.
Why is data normalization necessary?
Patient records are often stored in multiple locations, such as electronic health records (EHRs), pharmacy information systems, and lab information systems. These data systems were created by numerous vendors at different points in time, many before interoperability standards were widely adopted. As a result, integrating data from these disparate sources has been – and remains – a challenge.
Additionally, there is no consensus on how to apply standard codes to essential data elements such as diagnoses, medications, and labs. As a result, the same information is often represented by multiple different code systems. There is therefore a need to seamlessly capture and digest clinical information from disparate systems and, ultimately, provide a common language for exchanging information. This process is called data normalization.
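To make the idea concrete, here is a minimal, hypothetical sketch in Python: two source systems code the same glucose lab result differently, and a small mapping table resolves both to one shared standard code. The source names, local codes, and mapping table are illustrative assumptions, not a real terminology service or any vendor's actual implementation.

```python
# Hypothetical sketch: mapping locally coded lab results from two source
# systems to one shared standard code. Codes and table are illustrative only.

# Local code -> standard (LOINC-style) code, one map per source system
CODE_MAPS = {
    "hospital_a_lis": {"GLU": "2345-7"},       # serum glucose
    "clinic_b_ehr":   {"GLUC_SER": "2345-7"},  # same concept, different local code
}

def normalize(record: dict) -> dict:
    """Attach a standard code to a locally coded lab result."""
    local_map = CODE_MAPS.get(record["source"], {})
    standard = local_map.get(record["local_code"])  # None if unmapped
    return {**record, "standard_code": standard}

if __name__ == "__main__":
    records = [
        {"source": "hospital_a_lis", "local_code": "GLU", "value": 98, "unit": "mg/dL"},
        {"source": "clinic_b_ehr", "local_code": "GLUC_SER", "value": 101, "unit": "mg/dL"},
    ]
    for r in records:
        print(normalize(r))  # both resolve to the same standard code
```

Once every record carries the shared code, downstream systems can aggregate and exchange the data without knowing which source system produced it.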
Challenges to in-house data normalization
The process of implementing an in-house data normalization system can be both costly and time-intensive. Two of the main reasons are outlined below:
- Skilled resources (human and technical): An in-house data normalization system requires a knowledgeable team of clinicians, informaticists, and software engineers, all of whom need to be paid for their input and expertise. In addition, setting up an effective normalization infrastructure can be expensive. On the other hand, attempting to cut costs with a bare-minimum resourcing strategy can create bottlenecks in daily operations.
- Data warehousing: Data engineers may have to build extensive pre-processing into their pipelines. Low-quality data ingested into the pipeline also forces overuse of extract, transform, load (ETL) processes, leading to additional unnecessary infrastructure expenses (a sketch of this extra transform work follows this list). Storing different variations of clinical data without a standard coding schema (often called late binding) eases import but makes data cleansing, processing, and reporting much more complicated and expensive.
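As a rough illustration of the transform work that un-normalized data pushes into an ETL pipeline, the hedged Python sketch below reconciles two per-source quirks (a unit difference and a date format difference) before load. The field names, sources, and rules are assumptions for illustration; every additional source tends to add more rules like these.

```python
# Hypothetical sketch of per-source cleanup an ETL "transform" step must do
# when inputs are not normalized. Fields and rules are illustrative assumptions.

def transform(raw: dict) -> dict:
    """Reconcile per-source quirks into one loadable shape."""
    row = dict(raw)
    # Source A reports glucose in mmol/L; convert to mg/dL before loading
    if row.get("unit") == "mmol/L":
        row["value"] = round(row["value"] * 18.0182, 1)
        row["unit"] = "mg/dL"
    # Source B sends dates as MM/DD/YYYY; standardize to ISO 8601
    if "/" in str(row.get("collected_on", "")):
        month, day, year = row["collected_on"].split("/")
        row["collected_on"] = f"{year}-{month:0>2}-{day:0>2}"
    return row

print(transform({"value": 5.4, "unit": "mmol/L", "collected_on": "3/9/2023"}))
```

Each new source system multiplies this kind of bespoke logic, which is the infrastructure and maintenance cost the section above describes.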
What is the solution?
A third-party normalization platform can minimize the costs associated with developing an in-house system. Organizations such as health information exchanges, integrated delivery networks, healthcare payers, life sciences and pharmaceutical companies, clinical data and vaccine registries, and those involved in consumer health and population health management can add a distributed data normalization engine to their data platform, enabling disparate local terminologies to be mapped to a shared, interoperable terminology. Different participants across the enterprise then have access to fully normalized, centralized data at rest in the warehouse, sparing each organization the difficult work of interpreting disparate data on its own. In short, third-party vendors can offer fully automated, easy-to-implement, and reliable normalization platforms, easing both cost and burden.
To learn about IMO’s unique approach to data normalization, click here.