
Advancing science through data normalization and collaboration


Client: National government health agency

Challenge: Scientific databases are siloed, limiting integration and collaborative data sharing

Solution: A model to standardize metadata and compare findings across multiple datasets

Challenge:

Normalizing and comparing data in multiple databases

Collaboration in the scientific research community relies on the ability to transfer knowledge and share key findings. Unfortunately, research data is stored in a variety of Laboratory Information Management Systems (LIMS) and Laboratory Information Systems (LIS), each with its own structure and data labels. Because the structures differ and each platform uses different keywords and search criteria, it is difficult to normalize and compare data held in multiple databases. As a result, human analysts have had to sift through each database manually to select cohorts of interest.

Our client, a national government health agency, was tasked with compiling data across these systems so that U.S. agencies and research organizations could conduct better scientific analyses.

Solution:

Applying normalized structure and metadata labelling to databases

In collaboration with a team of subject matter experts in epidemiology and other disciplines, Infinia ML used the latest advancements in Pretrained Learning Models (PLM) to develop a normalized structure and metadata labelling scheme and apply it to various research databases. The resulting centralized data portal enables users to quickly access restricted datasets for study planning, execution, and analysis.

Value:

Decreasing parsing time and backend data structuring

With a central index of various conditions and variables, data from multiple sources can be integrated quickly and cost-effectively. This decreases the time and resources dedicated to parsing and structuring backend data, and enhances integrative analyses that lead to important insights and discoveries.
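To make the idea of a central index concrete, here is a minimal sketch of how differing field labels from two source systems could be mapped onto one shared set of variable names. It is an illustration only, not the model used in this project: the index contents, field names, and helper functions are all hypothetical, and a hand-written lookup table stands in for the learned metadata labelling described above.

```python
# Hypothetical central index: canonical variable name -> labels seen in source systems.
# These names are illustrative assumptions, not the agency's actual schema.
CENTRAL_INDEX = {
    "patient_age": {"age", "age_years", "pt_age"},
    "diagnosis": {"dx", "diagnosis", "condition"},
}


def build_lookup(index):
    """Invert the index so each source label points to its canonical name."""
    return {
        label.lower(): canonical
        for canonical, labels in index.items()
        for label in labels
    }


def normalize_record(record, lookup):
    """Rename known fields to canonical names; keep unknown fields unchanged."""
    return {
        lookup.get(key.strip().lower(), key): value
        for key, value in record.items()
    }


lookup = build_lookup(CENTRAL_INDEX)

# Two records describing the same kind of data with different labels.
lims_row = {"Pt_Age": 54, "Dx": "asthma"}
lis_row = {"age_years": 61, "Condition": "asthma", "site": "A"}

normalized = [normalize_record(row, lookup) for row in (lims_row, lis_row)]
```

Once both records share the same keys, they can be filtered and compared with a single query instead of one search per source system.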

Applying these techniques elsewhere

  • Normalize job titles and pricing models
  • Harmonize database searching methods
