How Infinia ML gave a healthcare disruptor the power to scale

Client: SignalPath

Challenge: Understand complex medical language and extract relevant data for processing

Solution: Data pipelines and machine learning models leveraging OCR and NLP to pull content from large PDF documents containing paragraphs, images, tables, and footnotes


Understanding complex medical language

During clinical trials, drug companies provide hundred-page PDF protocol documents containing all the information required to conduct the trial safely and accurately. SignalPath created a SaaS platform to standardize the execution of the rules contained in these lengthy, complex protocol documents, boosting trial efficiency, quality, and profitability.

For end users to access a trial's protocol data, SignalPath needs to digitize important aspects of the underlying PDFs. This process previously required a team of trained human digitizers to read, understand, extract, and input complex medical language, which hindered SignalPath’s ability to scale.


Digesting unstructured data like a trained human

Infinia ML developed machine learning models to extract text data from non-standardized documents that contain charts, paragraphs and images. Once data is extracted, other models process blocks of text and data from charts and automatically route the information based on the context, the same way that a human would.

Extracted information is presented to SignalPath employees, who verify the accuracy of the data and perform additional processing that can only be accomplished by a trained human eye.
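The extract, route, and verify flow described above can be illustrated with a small sketch. This is not Infinia ML's actual implementation: a simple keyword count stands in for the trained models, and the route names, keyword lists, the `Block` type, and the `min_hits` threshold are all hypothetical, chosen only to show how extracted blocks might be routed by context and flagged for human review.

```python
from dataclasses import dataclass

# Hypothetical routing categories and keywords (illustrative only,
# not taken from the real system, which uses trained ML models).
ROUTES = {
    "eligibility": ("inclusion", "exclusion", "eligible"),
    "dosing": ("dose", "dosage", "mg", "administer"),
    "schedule": ("visit", "week", "day", "screening"),
}


@dataclass
class Block:
    kind: str  # e.g. "paragraph", "table", or "footnote"
    text: str  # text already extracted from the PDF (OCR step assumed done)


def route_block(block, min_hits=2):
    """Return (route, needs_review) for one extracted block.

    The route is the category with the most keyword matches. Blocks with
    no matches are left unrouted, and blocks with fewer than `min_hits`
    matches are flagged so a human can verify them.
    """
    words = [w.strip(".,;:()") for w in block.text.lower().split()]
    scores = {
        route: sum(word in keywords for word in words)
        for route, keywords in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    hits = scores[best]
    if hits == 0:
        return "unrouted", True
    return best, hits < min_hits


if __name__ == "__main__":
    blocks = [
        Block("paragraph", "Inclusion criteria: patients eligible for the study"),
        Block("table", "Administer 50 mg dose at each visit"),
        Block("footnote", "See appendix B"),
    ]
    for b in blocks:
        print(b.kind, route_block(b))
```

In a real pipeline the keyword scorer would be replaced by a trained classifier, but the shape stays the same: each block gets a destination plus a flag telling reviewers where their attention is needed.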


Freeing experts to focus on high-level decisions

Infinia ML used its technical expertise in machine learning and healthcare to train algorithms to find and extract all of the relevant information from these large trial documents. SignalPath’s workforce is now able to focus on high-level decisions about nuances in the data, saving their teams hundreds of hours a year and allowing their business process to scale to more trials.

Applying these techniques elsewhere

This same machine learning technology can be applied to a number of other use cases to automatically find context and relevance in unstructured data:

  • Read complex legal and financial documents to find the important clauses
  • Identify, extract, and redact sensitive information from documents and records
  • Automatically read and process intake forms, loan applications, invoices, and more
