5 Features That Make Datamolino Unique

Most invoice capture tools will pull a supplier name, a total and a VAT figure off a PDF without much trouble. That part has been a solved problem for years. The interesting question, once you're processing invoices at volume, is what happens after the header. Below are the five things Datamolino does that most of the competition either doesn't do at all, or only half does.

1. Line-item extraction

Datamolino reads every line on every invoice. Description, quantity, unit price, tax code, ledger account. All of it flows through to Xero, QuickBooks or FreeAgent with the fields already mapped, so nobody has to open the PDF and retype lines into the accounting system.

The usual alternative is header-only capture, which looks fine until a supplier sends an itemised invoice with fifteen lines coded across four different cost centres. Then someone has to sit down with Acrobat open on one screen and the accounting system on the other. At a few hundred invoices a month that’s annoying. At 1,500 a week it’s a full-time job.

“Dext does not handle line item exports — that’s exactly why we moved to Datamolino.” — Peter Walsh, Facilities Management (UK, ~1,500 invoices/week)

For clients who code lines to projects, cost centres or tracking categories, there’s more detail on the line-item data extraction page.

2. Automatic PDF splitting

43% of PDFs uploaded to Datamolino contain more than one invoice. Sometimes it’s a batch scan of a week’s post, and sometimes it’s a supplier statement with five invoices behind it. Sometimes it’s just that Outlook merged the attachments on the way through.

Datamolino splits these by reading the content of each page rather than assuming page breaks line up with invoice breaks. They usually don’t. Invoices run over onto a second page roughly as often as they fit on one. The system identifies where each invoice starts and ends and creates a separate transaction for each. Batches of up to 50 pages are supported.
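
To make the idea of content-based splitting concrete, here is a minimal Python sketch. It is not Datamolino's actual algorithm, which isn't described here. It assumes each page has already been read and may carry an invoice number (the Page class and its invoice_number field are hypothetical), and it starts a new invoice whenever that number changes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    index: int                     # position in the uploaded PDF, 0-based
    invoice_number: Optional[str]  # hypothetical: number read off the page, if any


def split_into_invoices(pages: list[Page]) -> list[list[int]]:
    """Group consecutive page indexes into invoices.

    A page starts a new invoice when it carries an invoice number that
    differs from the one currently being collected. Pages with no number
    are treated as continuation pages of the current invoice.
    """
    invoices: list[list[int]] = []
    current_number: Optional[str] = None
    for page in pages:
        starts_new = not invoices or (
            page.invoice_number is not None
            and page.invoice_number != current_number
        )
        if starts_new:
            invoices.append([page.index])
            current_number = page.invoice_number
        else:
            invoices[-1].append(page.index)
    return invoices


# A five-page scan: a two-page invoice, a one-pager, then another two-pager.
batch = [
    Page(0, "INV-1"),
    Page(1, None),
    Page(2, "INV-2"),
    Page(3, "INV-3"),
    Page(4, "INV-3"),
]
groups = split_into_invoices(batch)
```

Here groups comes out as three invoices covering pages 0 and 1, page 2, and pages 3 and 4. A real splitter would need more signals than an invoice number, but the shape is the same: decide from content, not from page count.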


In practice this means you can drop a week’s post onto a scanner’s feeder, walk away, and come back to individual invoices ready for coding. No splitting in Acrobat first.



3. Rule-based coding


Coding rules in Datamolino are something you write, not something the system guesses at. There are two levels. Folder-level defaults set the tax code, ledger account and tracking category for everything uploaded to that folder, such as “UK purchases” or “subcontractors.” Supplier-level overrides can then change any of those for a specific supplier: British Gas always codes to Utilities, for instance, while Amazon lines are coded by description keyword.
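
The two levels can be pictured as a simple lookup: start from the folder default, then let the supplier rule replace whatever it sets. The Python sketch below illustrates that precedence only. It is not Datamolino's rule engine, and the class names, the keyword_accounts field and the account names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass
class Coding:
    tax_code: Optional[str] = None
    ledger_account: Optional[str] = None
    tracking_category: Optional[str] = None


@dataclass
class SupplierRule:
    # Fields left as None fall back to the folder default.
    override: Coding = field(default_factory=Coding)
    # Hypothetical: lowercase description keyword -> ledger account.
    keyword_accounts: dict[str, str] = field(default_factory=dict)


def code_line(description: str, supplier: str, folder_default: Coding,
              supplier_rules: dict[str, SupplierRule]) -> Coding:
    """Apply the folder default, then any supplier-level override."""
    result = replace(folder_default)  # copy, so the default is never mutated
    rule = supplier_rules.get(supplier)
    if rule is None:
        return result
    for name in ("tax_code", "ledger_account", "tracking_category"):
        value = getattr(rule.override, name)
        if value is not None:
            setattr(result, name, value)
    # The first matching keyword decides the ledger account for this line.
    lowered = description.lower()
    for keyword, account in rule.keyword_accounts.items():
        if keyword in lowered:
            result.ledger_account = account
            break
    return result


uk_purchases = Coding("20% VAT", "General Purchases", "UK")
rules = {
    "British Gas": SupplierRule(override=Coding(ledger_account="Utilities")),
    "Amazon": SupplierRule(keyword_accounts={"toner": "Office Supplies",
                                             "laptop": "IT Equipment"}),
}

gas = code_line("Gas supply March", "British Gas", uk_purchases, rules)
toner = code_line("HP Toner cartridge", "Amazon", uk_purchases, rules)
other = code_line("Site visit", "Unknown Ltd", uk_purchases, rules)
```

The British Gas line lands on Utilities, the Amazon toner line on Office Supplies, and the unknown supplier keeps the folder default. Every outcome traces back to a rule someone wrote, which is the point.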


When a supplier changes something and the coding needs to change with it, you edit one rule. Nobody is retraining a model or hoping the next upload behaves.


Teams switching over from tools that code things “automatically” usually arrive with a mental list of corrections they’ve been making manually every month. Those corrections become rules on day one, and the manual corrections stop.



4. Export Guard (the checksum)


Before anything exports to your accounting system, Datamolino adds up the line items and compares the total to the invoice total. If the two numbers don’t match, Datamolino blocks the export and flags the invoice for review.


“The system validates invoice accuracy by comparing the sum of line items with the total invoice amount using a checksum to prevent export errors.” — From Archie Todd meeting (UK, 100 invoices/month)


This catches a surprising mix of problems: a line Datamolino missed, a quantity that got read wrong, occasionally a supplier whose own invoice doesn’t add up (this happens more than you’d think, especially with handwritten or hybrid paper-and-digital invoices). Because the check runs on every export, the kind of bug where a client’s Xero reconciliation is off by a few pounds and nobody notices until month-end doesn’t really happen.
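
The check itself is plain arithmetic. A minimal Python sketch, assuming amounts arrive as decimal strings and that an exact match is required (the function name and the tolerance parameter are hypothetical, and whether Datamolino allows any rounding tolerance isn't stated here):

```python
from decimal import Decimal


def export_allowed(line_amounts: list[str], invoice_total: str,
                   tolerance: str = "0.00") -> bool:
    """Return True when the line items add up to the invoice total.

    Decimal is used instead of float so that amounts like 0.10 + 0.20
    compare exactly against 0.30.
    """
    line_sum = sum((Decimal(a) for a in line_amounts), Decimal("0"))
    return abs(line_sum - Decimal(invoice_total)) <= Decimal(tolerance)


matches = export_allowed(["100.00", "20.00", "4.50"], "124.50")
missed_line = export_allowed(["100.00", "20.00"], "124.50")
```

The first invoice adds up and would go through. The second is missing a 4.50 line, so the export would be held for review.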



5. Duplicate detection


Duplicates get caught in two places. When you upload a file Datamolino has already seen (identical hash, same PDF), it’s blocked before processing and you aren’t charged for it. Near-duplicates are more interesting: same supplier, same invoice number, same date, but a different file. That’s usually a supplier who resent the email, or a PDF that got re-exported with a new timestamp. Occasionally it’s a genuine duplicate invoice the supplier has billed twice, and catching it before export saves a real overpayment. Either way, Datamolino flags it before export rather than blocking outright, and there’s a manual override for the false positives.
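
The two checks can be sketched in a few lines of Python. This is an illustration under assumptions, not Datamolino's implementation: SHA-256 stands in for whatever file hash is actually used, the Invoice and DuplicateChecker names are hypothetical, and the sketch runs both checks in one call even though the exact-file check really happens before any data is extracted.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    file_bytes: bytes
    supplier: str
    invoice_number: str
    invoice_date: str  # ISO date, e.g. "2024-03-01"


class DuplicateChecker:
    def __init__(self) -> None:
        self._seen_hashes: set[str] = set()
        self._seen_keys: set[tuple[str, str, str]] = set()

    def check(self, invoice: Invoice) -> str:
        """Return "blocked", "flagged" or "ok" for an uploaded invoice."""
        # Exact duplicate: the very same file has been uploaded before.
        digest = hashlib.sha256(invoice.file_bytes).hexdigest()
        if digest in self._seen_hashes:
            return "blocked"
        self._seen_hashes.add(digest)
        # Near-duplicate: different file, same supplier, number and date.
        key = (invoice.supplier.strip().lower(),
               invoice.invoice_number.strip(),
               invoice.invoice_date)
        if key in self._seen_keys:
            return "flagged"  # left for a person to confirm or override
        self._seen_keys.add(key)
        return "ok"


checker = DuplicateChecker()
original = Invoice(b"pdf-bytes-1", "Acme Ltd", "INV-7", "2024-03-01")
resent = Invoice(b"pdf-bytes-1-reexported", "acme ltd ", "INV-7", "2024-03-01")

first = checker.check(original)   # new file, new invoice
again = checker.check(original)   # identical file
third = checker.check(resent)     # different file, same invoice details
```

The first upload passes, the identical file is blocked, and the re-exported copy is flagged rather than blocked, which leaves room for the manual override.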


“Duplicate detection prevents reprocessing of identical files while allowing manual override if needed.” — Molly Leith, AP Automation (UK, 600 invoices/month)