Every invoice you capture gets reduced to data. The real question is how much data.
Some tools pull the header: supplier name, invoice date, total, maybe VAT. Others go deeper and extract every line, with description, quantity, unit price, amount, tax treatment, and the ledger account each line belongs to. Both approaches have their place. The trouble starts when bookkeepers pick the wrong one for the work in front of them.
What “header-only” capture actually gives you
Header-only tools read the top-level invoice fields: supplier name, invoice number, date, and net/VAT/gross totals. The invoice itself is attached as a PDF for reference, but the individual charges that make up the total don’t become structured data. If the invoice has twenty lines, your bookkeeping software sees one transaction with one total.
“Xero Bills gets 85% accuracy on headers. Datamolino reads every line — supplier, amount, ledger code, tax code.”
— Jason Allen, Shakti Japan (Xero user for 12 years)
When header-only capture is the right call
For a coffee shop receipt, a taxi fare, a utility bill always coded to the same account, or a subscription that never changes, there’s nothing for line-item extraction to do. The whole invoice gets coded to one account. The same goes for small expense receipts, single-service invoices (your accountant’s fee, a domain renewal), and high-volume low-complexity suppliers where every invoice looks the same.
If ninety percent of your document intake looks like this, a header-only tool will do the job with less setup effort.
When header-only capture quietly costs you money
Header-only tools don’t fail loudly. They fail invisibly. You don’t find out you needed line-item data until someone asks a question the headers can’t answer.
Mixed VAT invoices
A single invoice with standard-rated, zero-rated, and exempt items has one gross total but three different VAT treatments. Header-only capture gives you a lump sum. You then reconstruct the VAT split manually, or accept that the posting will be wrong until someone fixes it. On Xero, that means either splitting the bill by hand or creating an approximation that will fail a VAT review.
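To make the VAT split concrete, here is a minimal sketch in Python of what line-level data lets you compute. The invoice lines, the treatment labels, and the 20% standard rate are illustrative assumptions, not data from any real invoice or tool.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical lines from one mixed-VAT invoice: (description, net amount, VAT treatment).
vat_lines = [
    ("Printer paper", Decimal("100.00"), "standard"),
    ("Children's books", Decimal("50.00"), "zero"),
    ("Insurance", Decimal("30.00"), "exempt"),
]

# Assumed rates for illustration: 20% standard; zero-rated and exempt lines carry no VAT.
VAT_RATES = {"standard": Decimal("0.20"), "zero": Decimal("0"), "exempt": Decimal("0")}


def vat_summary(lines):
    """Group net and VAT amounts by VAT treatment."""
    summary = {}
    for _description, net, treatment in lines:
        vat = (net * VAT_RATES[treatment]).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        bucket = summary.setdefault(treatment, {"net": Decimal("0.00"), "vat": Decimal("0.00")})
        bucket["net"] += net
        bucket["vat"] += vat
    return summary


summary = vat_summary(vat_lines)
gross = sum(b["net"] + b["vat"] for b in summary.values())
```

A header-only capture of this invoice would hold only the gross figure. The three buckets in `summary` are the part that has to be rebuilt by hand.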
Cost centre and department coding
When one invoice covers multiple departments (a cleaning services bill for six buildings, each billed to a different cost centre), the header tells you the supplier and the total. It says nothing about which building got what share.
“Self-billing invoices contain multiple line items per invoice, each requiring separate coding for different cost centres.”
— Keith Bristow, Multi-Entity Company UK (~800 invoices/month)
At that volume, the line-level data is the work.
Public sector and regulated customers
Local authorities, NHS trusts, and government contractors often require invoice breakdowns at the line level in their contracts. If your finance system can only produce a header-level export, you’re either rebuilding the detail in a spreadsheet or asking the client to go back to the original PDF.
“Detailed line item capture is critical due to public sector customers requiring full invoice breakdowns, now representing 25% of the customer base.”
— Peter Walsh, Facilities Management UK (~1,500 invoices/week)
When a quarter of your revenue depends on line-level reporting, header-only capture becomes a blocker.
Project accounting and job costing
In construction, consulting, or agency work, the profitability of a project is the sum of its lines. Materials on one, labour on another, subcontractor time on a third. Header totals tell you what you spent. Line items tell you what you spent it on. Without the lines, your WIP reports and project margin calculations are guesswork.
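As a rough sketch of that difference, the Python below rolls hypothetical invoice lines up by project and cost category. The project names, categories, and amounts are invented for illustration.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical extracted lines: (project, cost category, net amount).
project_lines = [
    ("Project A", "materials", Decimal("1200.00")),
    ("Project A", "labour", Decimal("800.00")),
    ("Project B", "subcontractor", Decimal("500.00")),
    ("Project A", "materials", Decimal("300.00")),
]


def cost_by_project(lines):
    """Sum line amounts per project and per cost category."""
    totals = defaultdict(lambda: defaultdict(Decimal))
    for project, category, amount in lines:
        totals[project][category] += amount
    return totals


totals = cost_by_project(project_lines)

# What a header-only capture would see: one number, no breakdown.
header_total = sum(amount for _project, _category, amount in project_lines)
```

`header_total` is all a header-only tool would give you. The nested `totals` is what a margin report needs.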
Audit and VAT inspection readiness
When HMRC or an internal auditor asks why a charge was coded a particular way, “because the supplier sent an invoice for £2,847.60” won’t pass. The answer lives in the line items: which charge, which VAT code, which nominal account, and why. Without structured line-level data, every audit turns into a PDF-reading exercise.
Self-billing invoices
Self-billing flips the normal direction: the customer issues the invoice on behalf of the supplier. These are common in agriculture, construction, and any industry where the customer controls the measurement (meter readings, weigh tickets, shift logs). They’re almost always multi-line. Using header-only capture here defeats the entire point of the document.
What line-item extraction actually captures
When a tool extracts line items properly, each line becomes its own row of structured data: description, quantity, unit price, line amount, VAT rate, ledger account, tax code, and tracking categories (cost centre, department, project). That data flows into Xero, QuickBooks, or whichever finance system you’re using, pre-split and pre-coded. The bill doesn’t need to be manually rebuilt at the other end.
You’re no longer just reading the invoice. You’re reproducing it as accounting data.
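One way to picture that row of structured data is the Python sketch below. The class and field names are hypothetical. They mirror the fields listed above and are not the schema of Datamolino, Xero, or QuickBooks.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class InvoiceLine:
    """One extracted line; field names are illustrative only."""
    description: str
    quantity: Decimal
    unit_price: Decimal
    vat_rate: Decimal
    ledger_account: str
    tax_code: str
    tracking: dict = field(default_factory=dict)  # e.g. cost centre, department, project

    @property
    def amount(self) -> Decimal:
        """Net line amount, rounded to pence."""
        return (self.quantity * self.unit_price).quantize(Decimal("0.01"))


@dataclass
class Invoice:
    supplier: str
    invoice_number: str
    net_total: Decimal
    lines: list = field(default_factory=list)


# Hypothetical cleaning invoice covering two buildings.
invoice = Invoice(
    supplier="Example Cleaning Ltd",
    invoice_number="INV-001",
    net_total=Decimal("500.00"),
    lines=[
        InvoiceLine("Cleaning, Building 1", Decimal("2"), Decimal("150.00"),
                    Decimal("0.20"), "Cleaning", "Standard", {"cost_centre": "B1"}),
        InvoiceLine("Cleaning, Building 2", Decimal("1"), Decimal("200.00"),
                    Decimal("0.20"), "Cleaning", "Standard", {"cost_centre": "B2"}),
    ],
)
```

A header-only capture corresponds to the `Invoice` fields with an empty `lines` list.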
The coding problem: rules vs guessing
Line extraction alone isn’t enough. Once you have the lines, something has to decide how each line gets coded.
Some tools use AI or machine learning to guess the coding. That works much of the time, but the failures are unpredictable, and when one happens you can’t tell why, because the model’s “reasoning” is opaque. You end up reviewing everything anyway, because you can’t trust any single line without checking it.
Datamolino takes a different approach. Coding is deterministic and rule-based. You set the rules once per supplier, and those rules run the same way every time. If the rule is right, the output is right. If the rule is wrong, you fix it once, and every future invoice from that supplier is right.
This matters for audit defensibility. You can point at a rule and say: this is why this line was coded this way. An opaque AI classifier gives you nothing equivalent to point at.
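The idea of deterministic, per-supplier rules can be sketched in a few lines of Python. This is an illustration of the general technique only, not Datamolino’s actual rule engine. The rule fields, the keyword matching, and the supplier name are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingRule:
    """A hypothetical per-supplier coding rule."""
    rule_id: str
    supplier: str
    keyword: str         # matched case-insensitively; empty string = supplier default
    ledger_account: str
    tax_code: str


# Rules are checked in order, so specific rules go before the supplier default.
RULES = [
    CodingRule("R1", "Acme Supplies", "delivery", "Freight", "Standard"),
    CodingRule("R2", "Acme Supplies", "", "Materials", "Standard"),
]


def code_line(supplier, description, rules=RULES):
    """Return the first matching rule, or None if the line needs manual review."""
    for rule in rules:
        if rule.supplier == supplier and rule.keyword.lower() in description.lower():
            return rule
    return None


matched = code_line("Acme Supplies", "Delivery charge")
```

Because the function returns the rule itself, `matched.rule_id` can be stored alongside the posted line. That stored reference is what lets you answer an auditor with a rule rather than a guess.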
Export Guard: the check that header tools can’t do
Header-only tools can never catch line items that don’t reconcile to the invoice total, because they never capture the lines. With line extraction, the failure modes are familiar: OCR misreads a digit, a line gets partially captured, or a discount line is missed. Without a check, the difference posts to the wrong place and sits quietly in your ledger until someone reconciles the supplier statement months later.
Datamolino blocks export when lines don’t add up to the total. The invoice can’t leave the system until the numbers agree. A simple mechanical check, but one that only makes sense if you have line-level data in the first place.
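The check itself is easy to sketch. The Python below is a minimal illustration of the idea, assuming net line amounts and a net invoice total. The function name, the exception, and the zero tolerance are hypothetical and not taken from Datamolino.

```python
from decimal import Decimal


class ExportBlocked(Exception):
    """Raised when the lines do not add up to the invoice total."""


def check_reconciles(line_amounts, invoice_net_total, tolerance=Decimal("0.00")):
    """Return True if the lines sum to the total; otherwise block the export."""
    difference = sum(line_amounts, Decimal("0.00")) - invoice_net_total
    if abs(difference) > tolerance:
        raise ExportBlocked(f"Lines differ from invoice total by {difference}")
    return True


# Lines that agree with the total pass the check.
ok = check_reconciles([Decimal("100.00"), Decimal("250.50")], Decimal("350.50"))
```

If a misread digit turned the second line into 250.00, the same call would raise `ExportBlocked` with a difference of -0.50 instead of letting the invoice through.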
One note on pricing, since it’s a recurring question: line-item capture is included at every Datamolino tier. No “pro” plan that unlocks it, no add-on fee per line, no document-type surcharge.
“Dext does not handle line item exports — that’s exactly why we moved to Datamolino.”
— Peter Walsh, Facilities Management UK
Bottom line
Header-only capture is a reasonable choice for simple document flows. It’s faster to set up, cheaper at entry, and adequate for low-complexity invoices.
It stops being reasonable the moment your invoices carry information that matters at the line level: VAT splits, cost centres, project coding, customer-mandated detail, or audit-grade traceability. At that point, header-only tools aren’t saving you money. They’re moving work from the capture step to the posting step, where it costs more per minute and introduces more errors.
Line items aren’t always better. They’re better when your invoices are doing more than one thing.
Try Datamolino free — process 100 documents at no cost.