

The need to digitalize documents is overwhelming. Invoices and other documents arrive in many different formats, which makes digitalizing them extremely laborious. Digitalization is no longer optional for companies today; it is the only way to keep control over vast amounts of data. Invoices are often PDFs or scanned paper documents, and entering them into a digital format for future use, for authorities, or for audits can take a huge amount of time and effort. We developed InvoiceReader to address this by transforming unstructured data into a structured digital format. InvoiceReader extracts specific fields from different types of documents and maps them to the right fields in the digital format, freeing up precious time for data handling teams and significantly increasing their productivity.

InvoiceReader product

How it works

Document scanning

InvoiceReader scans the entire document and "understands" its contents.

Document to digital

The second engine of the AI model converts the scanned documents to digital form.

Digitally embedded

The digital data is then passed through an embedding layer and then through graph neural networks.
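
To make the graph step more concrete, here is a minimal sketch of one common way to turn recognized text into a graph that a graph neural network could consume: each word becomes a node, and each node is linked to its nearest neighbours on the page. This is an illustration under assumptions, not InvoiceReader's actual implementation. The `Token` class, the `build_knn_edges` function, and the nearest-neighbour rule are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Token:
    """Hypothetical OCR token: recognized text plus the centre of its box."""
    text: str
    x: float
    y: float


def build_knn_edges(tokens, k=2):
    """Link each token to its k nearest neighbours by centre distance.

    Assumption: spatial proximity is used as the graph structure. The
    source does not say how InvoiceReader builds its graph.
    """
    edges = set()
    for i, a in enumerate(tokens):
        # Rank every other token by Euclidean distance to token i.
        others = sorted(
            (j for j in range(len(tokens)) if j != i),
            key=lambda j: hypot(a.x - tokens[j].x, a.y - tokens[j].y),
        )
        for j in others[:k]:
            # Store edges as undirected pairs, smaller index first.
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Made-up token positions for a tiny two-line invoice.
tokens = [
    Token("Invoice", 10, 10),
    Token("No.", 30, 10),
    Token("12345", 50, 10),
    Token("Total", 10, 100),
    Token("99.00", 50, 100),
]
edges = build_knn_edges(tokens, k=1)
```

With these positions, words on the same line end up connected to each other, which is the kind of layout signal a graph model can use to tell a label from its value.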

Data extracted from neural network

The graph neural networks interpret the data and extract specific fields from the documents. The data is now digitalized for current and future use.


InvoiceReader features -1

The AI model uses OCR (Optical Character Recognition) to extract the details.

InvoiceReader features -2

A combination of Graph Neural Networks and BERT (Bidirectional Encoder Representations from Transformers) is used to extract specific details from the input data.

InvoiceReader features -3

A probability score is attached to the outputs so that similar-looking data (for example, from and to addresses) can be flagged for a human check.
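
As a rough illustration of how such a score could drive the human check, the sketch below flags every extracted field whose score falls under a cut-off. The `ExtractedField` class, the `needs_review` function, the 0.8 threshold, and the sample values are all assumptions made for this example. They are not InvoiceReader's actual output format.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical extracted field with its model probability."""
    name: str
    value: str
    score: float  # assumed to lie in [0, 1]


def needs_review(fields, threshold=0.8):
    """Return the names of fields scoring below the (assumed) threshold."""
    return [f.name for f in fields if f.score < threshold]


# Made-up output: the two addresses look alike, so their scores are low.
fields = [
    ExtractedField("invoice_number", "INV-001", 0.98),
    ExtractedField("from_address", "1 Example Street", 0.55),
    ExtractedField("to_address", "2 Sample Road", 0.52),
]
flagged = needs_review(fields)
```

Here only the two address fields would be sent to a reviewer, while the invoice number passes through unattended.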

InvoiceReader features -4

The model can be easily trained to digitalize other types of documents as well.


The AI model can be used to accelerate or augment the digitalization of many routine tasks, such as manual data capture from different documents: PDF, Word, Excel, scanned, or even printed documents. Examples include digitalization of:

Inter- and intra-company invoices, with all relevant fields populated automatically within an organization

Invoices in different formats from different channel partners (suppliers, distributors) for claims and payments

Documents such as mandatory licenses required by law for the company and for distributors

Any historical documents that may currently be available only in non-digital formats

Regulatory licenses that are often in printed or PDF formats

Client quotations, to create an accurate database of all submitted quotations and their revisions

Investment portfolios from PDF files

Patient reports, including demographics, that are generally in PDF or Word format

The applications of InvoiceReader are immense, and it helps you improve accuracy by minimizing the human errors of manual digitalization. The model needs to be trained on each type of document before it is deployed for digitalization. This is easily achieved and is undertaken by MCG.
