Intelligent Document Management System

The Challenge

Intelligent document processing turns scanned and unstructured documents into a structured representation. Traditional methods have low accuracy and require detailed rule specification.

The challenge is to automate manual document processing at large scale while reducing OCR processing time.

What we do

Innovation

  • Ensemble intelligence to analyze scanned documents, allowing higher accuracy than individual algorithms
  • Automated structure extraction from heterogeneous documents without detailed rule specification
  • Intelligent confidence computation from multiple signals
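
As a rough illustration of the ensemble and confidence ideas above, the sketch below combines readings of one field from several OCR engines by confidence-weighted voting. The function name `ensemble_vote`, the sample readings, and the scoring rule are assumptions made for this example, not the system's actual method.

```python
from collections import defaultdict


def ensemble_vote(candidates):
    """Combine (text, confidence) readings of one field from several engines.

    Returns the winning text and a combined confidence in [0, 1].
    Hypothetical helper: the real system's scoring rule is not described.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    scores = defaultdict(float)
    for text, confidence in candidates:
        # Readings that agree pool their confidence.
        scores[text.strip()] += confidence
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    # Share of the total confidence mass held by the winning reading.
    combined = scores[best] / total if total > 0 else 0.0
    return best, combined


# Made-up readings of one invoice number from three engines.
readings = [("INV-1042", 0.9), ("INV-1042", 0.7), ("INV-1O42", 0.4)]
text, confidence = ensemble_vote(readings)
```

Two engines that agree outvote a single dissenting engine here, which is one simple way an ensemble can beat any individual algorithm.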

Approach

An AI solution powered by deep learning that analyzes scanned documents with an ensemble of models, extracts information, and structures it in a defined format. It has the following capabilities:

  • Image processing to remove noise and correct tilt
  • Conversion of scanned documents to text using multiple technologies, including custom Deep Learning models
  • Automated layout and format detection (tables, paragraphs)
  • Automated detection of constructs (addresses, invoice items, custom constructs)
  • Entity recognition
  • Knowledge graph creation
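
To make the last step more concrete, here is a minimal sketch of how extracted fields could be turned into a small knowledge graph of (subject, predicate, object) triples. The field names, the document ID, and both helper functions are hypothetical, and the actual graph schema is not described in this case study.

```python
from collections import defaultdict


def record_to_triples(doc_id, record):
    """Turn one extracted record (field -> value) into triples about the document."""
    return [(doc_id, field, value) for field, value in record.items()]


def build_graph(triples):
    """Index (subject, predicate, object) triples by subject."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


# Hypothetical output of the extraction stages for one invoice.
record = {"vendor": "Acme Corp", "invoice_number": "INV-1042", "total": "1250.00"}
graph = build_graph(record_to_triples("doc-001", record))
```

Looking up `graph["doc-001"]` then returns every fact extracted for that document as a list of (field, value) pairs.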

Result

Previously, 15-20 people worked manually on data extraction, handling 1,000 to 4,000 documents per day. We automated this task and reduced processing time by using advanced OCR that scans 100 document images per minute.

  • 70% data extraction accuracy
  • Elasticsearch indexing completed in under 4 minutes
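
For a sense of scale, the short calculation below converts the stated scan rate into OCR time for the daily volumes mentioned above. It assumes one image per document, which the case study does not state, so multi-page documents would take proportionally longer.

```python
IMAGES_PER_MINUTE = 100  # scan rate stated in the text


def minutes_to_process(documents, images_per_document=1):
    """OCR scan time in minutes; images_per_document=1 is an assumption."""
    return documents * images_per_document / IMAGES_PER_MINUTE


low = minutes_to_process(1000)   # lower end of the daily volume
high = minutes_to_process(4000)  # upper end of the daily volume
```

Under that assumption, a full day's documents would be scanned in roughly 10 to 40 minutes.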