The first open benchmark for measuring bank statement extraction accuracy. 15 synthetic statements, 36 parsing challenges, 12 countries, across 3 difficulty tiers.
Get started in three steps.
# 1. Download a benchmark statement
curl -O https://bankstatemently.com/benchmark/statements/bsb-001-statement.pdf
# 2. Run your parser on it
your-parser bsb-001-statement.pdf > result.json
# 3. Hash the PDF and submit your transactions for scoring
HASH=$(shasum -a 256 bsb-001-statement.pdf | cut -d' ' -f1)
curl -X POST https://api.bankstatemently.com/v1/benchmark/evaluate \
-H "Content-Type: application/json" \
-H "X-API-Key: your-api-key" \
-d "{\"contentHash\": \"$HASH\", \"transactions\": $(cat result.json)}"
Real-world bank statements contain dozens of formatting quirks that break naive parsers.
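Two such quirks, a sign printed after the amount and a currency symbol embedded in the cell, are enough to make a plain numeric conversion fail. The Python sketch below shows one way a parser might tolerate both. The function name `parse_amount` is hypothetical, and the code assumes "." as the decimal separator and "," as the thousands separator, which will not hold for every country covered by the benchmark.

```python
import re
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse an amount cell that may carry a currency symbol,
    thousands separators, or a leading/trailing sign.

    Assumption: "." is the decimal separator and "," groups thousands.
    """
    # Keep only digits, the decimal point, and sign characters.
    s = re.sub(r"[^0-9.+-]", "", text)
    sign = ""
    if s and s[-1] in "+-":      # trailing sign, e.g. "123.45-"
        sign, s = s[-1], s[:-1]
    elif s and s[0] in "+-":     # conventional leading sign
        sign, s = s[0], s[1:]
    value = Decimal(s)           # raises decimal.InvalidOperation if nothing numeric is left
    return -value if sign == "-" else value


print(parse_amount("123.45-"))    # -123.45
print(parse_amount("$1,234.56"))  # 1234.56
```

`Decimal` is used instead of `float` so that cent values compare exactly during later reconciliation.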
Statements omit the year from transaction dates (e.g., "03/15" instead of "03/15/2025"), which makes it impossible to sort or reconcile without guessing. (In 4 statements.)
Statements only print the date on the first transaction of each day; the rest have blank date cells. (In 1 statement.)
Statements place the plus or minus sign after the amount (e.g., "123.45-" instead of "-123.45"). (In 1 statement.)
Statements embed currency symbols (e.g., "$1,234.56") directly in amount cells, which breaks numeric parsing in Excel and most CSV importers. (In 2 statements.)
Statements sometimes lack clear row or column lines. (In 1 statement.)
Statements use separate Credit and Debit columns in some formats, while others use a single Amount column with signs. (In 10 statements.)
Page headers and footers contain bank logos, page numbers, and disclaimers that can interfere with transaction extraction. (In 1 statement.)
Statements are often image-based PDFs with no selectable text, so copy-paste and standard PDF extractors return nothing useful. (In 3 statements.)
Your score is measured across two dimensions, the same framework we use to independently test commercial converters.
Field-by-field comparison of dates, amounts, descriptions, and balances against ground truth.
Balance reconciliation, total validation, and row-level alignment across the full statement.
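Balance reconciliation can also be run locally as a sanity check before submitting. The Python sketch below is not the benchmark's scoring code. It only illustrates the general idea that each printed balance should equal the previous balance plus the row's amount. The function name `reconcile` and the `amount` and `balance` field names are assumptions made for this example, not the API's documented schema.

```python
from decimal import Decimal


def reconcile(opening_balance, transactions):
    """Return the indices of rows whose printed balance does not equal
    the previous balance plus the row's amount."""
    mismatches = []
    running = Decimal(opening_balance)
    for i, tx in enumerate(transactions):
        running += Decimal(tx["amount"])
        if running != Decimal(tx["balance"]):
            mismatches.append(i)
            # Resync to the printed balance so one bad row does not
            # flag every row after it.
            running = Decimal(tx["balance"])
    return mismatches


rows = [
    {"amount": "-20.00", "balance": "80.00"},
    {"amount": "50.00", "balance": "130.00"},
    {"amount": "-10.00", "balance": "125.00"},  # inconsistent: 130.00 - 10.00 = 120.00
    {"amount": "-5.00", "balance": "120.00"},
]
print(reconcile("100.00", rows))  # prints [2]
```

A flagged row usually points to a misread amount, a dropped or duplicated transaction, or a sign error just before it.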