Foundation AI helps Stockwell Harris streamline its document intake using Extract Filer

Stockwell Harris, a 60-attorney workers' compensation law firm, employed 8 to 12 temporary workers each day to sort, separate, scan, and electronically file roughly 10,000 pages of incoming documents per day. These included insurance claim files to be set up as new files, as well as mail and fax items such as correspondence, pleadings, medicals, med-legals, bills and liens, panel requests, and notices.

The manual process of searching the system to find the correct matter for each document, then titling the document properly, was cumbersome and time-consuming. It often took several days for documents to make their way into the practice management system, and many documents were filed incorrectly. These errors and delays meant that attorneys couldn’t be as responsive as they needed to be, and they sometimes missed important documents that staff had misfiled.

With a two-week integration of Foundation Extract, Stockwell Harris saw:


Goals

  • To automatically ingest document packages directly from Stockwell's batch scanner and fax system and split them into their separate documents.
  • To determine which file in their Content Management System (CMS) each document belongs to (using AI to replicate the same lookup process that Stockwell's clerks used to perform manually).
  • To extract the requisite metadata (e.g., dates and names) and enter it into their CMS.
  • To index the document directly into the correct file folder (matter) in their CMS.

Approach

  • Uses OCR and object detection models to convert each PDF into a machine-readable format.
  • Employs boosting tree-based classification models and CNNs to separate the consolidated file into its constituent documents and classify each document by type.
  • Applies multiple NLP models, including BERT and graph-based convolutional networks (GCNs), to extract relevant information from each document depending on its type.
  • Fuzzy matches each document, based on the extracted information, to the correct file (matter) in their CMS.
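
The separation step in the list above can be illustrated with a minimal sketch. It assumes a classifier has already labeled each page as either starting a new document or continuing the previous one; the function name `group_pages` and the boolean-flag input are hypothetical and are not taken from Extract Filer itself.

```python
from typing import List


def group_pages(is_start: List[bool]) -> List[List[int]]:
    """Group page indices into documents.

    `is_start[i]` is True when page i was predicted to be the first page
    of a new document. The first page always opens a document, even if
    the (hypothetical) classifier did not flag it.
    """
    documents: List[List[int]] = []
    for page_index, starts_document in enumerate(is_start):
        if starts_document or not documents:
            documents.append([page_index])          # open a new document
        else:
            documents[-1].append(page_index)        # continue the current one
    return documents


# Six scanned pages, with new documents predicted at pages 0, 3, and 4.
print(group_pages([True, False, False, True, True, False]))
```

Each inner list holds the page indices of one document, which could then be passed on to a per-document classifier.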

Results

  • Increased Ingestion Speed: 80%
  • Increased Accuracy: up to 98%
  • Cost Savings: Over $400,000 / year
  • Operations staff can now work remotely

Solution

Foundation AI configured Extract Filer to:

  • Automatically ingest document packages directly from Stockwell's batch scanner and fax system and split them into their separate documents

  • Determine which file in the CMS each document belongs to (using AI to replicate the same lookup process that Stockwell's clerks used to perform manually)

  • Extract the requisite metadata (e.g., dates and names) and enter it into Stockwell Harris’ CMS

  • Index the document directly into the correct file folder in the CMS

Data Used

Stockwell Harris’ data consists of over 20 different document types, both structured and unstructured. These include Orders, Subpoenas, Emails, Correspondence, and AME and PQME documents. Because Workers’ Compensation defense touches the healthcare and insurance verticals in addition to the legal one, our system needs to be able to process documents from all three of these domains. Out of the box, Extract Filer is trained to recognize and process every document type that Workers’ Compensation firms encounter.

Methodology

When Stockwell Harris receives mail, staff unpack it, batch it, and load it into the scanner. The scanner places the scanned files into a directory, which is automatically synced into the Extract Filer application. New files received via email can be dropped into this folder or uploaded directly to the application.
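
As a rough illustration of the scan-folder handoff described above, the sketch below polls a directory and reports PDFs it has not seen before. The function name `find_new_pdfs` and the in-memory `seen` set are assumptions made for this example; the source does not describe how Extract Filer's sync is actually implemented.

```python
from pathlib import Path
from typing import List, Set


def find_new_pdfs(inbox: Path, seen: Set[str]) -> List[Path]:
    """Return PDFs in `inbox` that have not been reported yet.

    File names are recorded in `seen` so that each scan is picked up once.
    The suffix check is case-insensitive because scanners often write
    upper-case ".PDF" extensions.
    """
    new_files: List[Path] = []
    for path in sorted(inbox.iterdir()):
        is_pdf = path.is_file() and path.suffix.lower() == ".pdf"
        if is_pdf and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Calling the function repeatedly with the same `seen` set (for example, once a minute) returns only the files that arrived since the previous call. A production sync would also need to persist that state and wait for the scanner to finish writing each file.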

As the first processing step, Extract Filer splits each file (a PDF that may contain more than one document) by page. It converts each page into a high-resolution image and applies image preprocessing techniques, including denoising and binarization. This preprocessing helps the OCR engine recognize text even where it is obscured by a watermark or an overlapping image. The preprocessed images are then run through multiple OCR models to extract their text.

Extract Filer then processes the OCRed text to identify the constituent documents in the file. It uses boosting tree-based classification models, together with computer vision techniques built on Convolutional Neural Networks (CNNs), to identify the starting page of each document inside the consolidated file. Once the file has been divided into its constituent documents, the system classifies each document by type using boosting tree algorithms, based on the document's contents.

Multiple NLP models, including BERT and graph-based convolutional networks (GCNs), are then applied to understand each document's content and extract relevant information. The extracted information, such as the adjudication number and claim number, is used to fuzzy match the document to the correct Case Name and Matter ID in Stockwell Harris' downstream CMS.
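
The fuzzy matching step can be sketched with Python's standard `difflib` module. This is only an illustration of the general idea under stated assumptions: the source does not say which matching algorithm Extract Filer uses, and the identifier formats, matter IDs, `normalize` helper, and `cutoff` value below are made up for the example.

```python
import re
from difflib import get_close_matches
from typing import Dict, Optional


def normalize(identifier: str) -> str:
    """Upper-case an identifier and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", identifier.upper())


def match_matter(extracted_id: str,
                 matters: Dict[str, str],
                 cutoff: float = 0.85) -> Optional[str]:
    """Return the matter ID whose known identifier best matches `extracted_id`.

    `matters` maps known identifiers (e.g. adjudication numbers) to matter IDs.
    Returns None when no candidate reaches the similarity cutoff, so the
    document can be routed to a person for review instead of being misfiled.
    """
    by_normalized = {normalize(key): matter_id for key, matter_id in matters.items()}
    hits = get_close_matches(normalize(extracted_id), list(by_normalized),
                             n=1, cutoff=cutoff)
    return by_normalized[hits[0]] if hits else None


# Hypothetical CMS lookup table and an OCR result with one misread character.
known = {"ADJ1234567": "M-001", "ADJ7654321": "M-002"}
print(match_matter("adj 12345G7", known))
```

Returning `None` below the cutoff, rather than always taking the closest candidate, reflects the review step described in this section, where users can correct the extracted data before filing.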

Users have the option to update or correct the data extracted, to ensure that each document is being filed into the correct matter. These changes are then used as a feedback loop to improve our models’ classification and extraction accuracy.

Once the user has confirmed that the document has been split correctly and that the extracted information is correct, Extract WC automatically renames each document based on its document type and the information extracted. For example, for a case belonging to John Doe, an AME document provided by Dr. Jane in December 2020 would be renamed JohnDoe_AME_Dr.Jane_202012.PDF. The system then automatically indexes each file into Stockwell Harris’ Case Management system based on each document’s type and extracted data. What used to be a manual process involving multiple applications is now done in a single tool and is largely automated.
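
The renaming convention in the example above can be reproduced with a short sketch. The function name `build_filename` and its parameters are hypothetical; only the output pattern (case name, document type, provider, year and month) comes from the example in the text.

```python
def build_filename(case_name: str, doc_type: str, provider: str,
                   year: int, month: int) -> str:
    """Build a file name of the form CaseName_Type_Provider_YYYYMM.PDF.

    Whitespace is removed from the case name and provider so that
    "John Doe" becomes "JohnDoe" and "Dr. Jane" becomes "Dr.Jane".
    """
    compact_case = "".join(case_name.split())
    compact_provider = "".join(provider.split())
    return f"{compact_case}_{doc_type}_{compact_provider}_{year:04d}{month:02d}.PDF"


print(build_filename("John Doe", "AME", "Dr. Jane", 2020, 12))
# JohnDoe_AME_Dr.Jane_202012.PDF
```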

Results

Within two weeks of deploying Extract, we were processing documents twice as fast and saving $8000 per week.

George Woolverton,
Managing Partner

Extract Filer increases the speed and accuracy of document processing and information archiving, enabling Stockwell Harris to save time and money:

  • Document Volume: 10,000 pages a day

  • Decreased Lead Time: 50%

  • Increased Ingestion Speed: 80%

  • Increased Accuracy: up to 98%

  • Cost Savings: Over $400,000 / year
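
As a rough consistency check, the weekly savings quoted by the Managing Partner can be annualized. This assumes the $8,000 weekly figure holds for all 52 weeks of a year, which the source does not state explicitly.

```python
WEEKLY_SAVINGS_USD = 8_000   # from the Managing Partner's quote
WEEKS_PER_YEAR = 52          # assumption: the savings rate holds all year

annual_savings_usd = WEEKLY_SAVINGS_USD * WEEKS_PER_YEAR
print(f"${annual_savings_usd:,} per year")  # $416,000 per year
```

Under that assumption, the annualized amount is in line with the reported savings of over $400,000 per year.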

The best part is that I know exactly how many documents are getting processed every day, and I can monitor the staff no matter where we are.

Rosanna Renteria,
Office Manager

Because of the COVID-19 pandemic, Stockwell Harris was forced to transition to remote operations with very little notice. Extract Filer enabled its operations staff to seamlessly transition to a remote working environment. So long as staff members have access to a scanning device, they can perform all necessary actions remotely through Extract’s secure web-based interface.

Transformative Document Processing
© Foundation AI