December 2020
If your product moves money on behalf of customers, and you manage the ledger, you need reconciliation. You can think of recon as the process of making sure every transaction in your system matches one in the external world. For every dollar you’ve moved, an external entity agrees with the amount and direction, and has provided “documentation” to that effect.
I’ve seen a few performant recon systems that came under stress as they scaled, and I’ve worked with startups at early enough stages that getting recon right was not existential. I’ve also never built a recon system from scratch. Everything I'll say here is from the perspective of a user of recon systems rather than a maker of them. I’ve been a customer of a couple, and their performance affected my output, so I have strong opinions about what I wish existed. Since I'm not sure what a perfect recon system would look like, I'm writing this to flesh out the thought, and to smoke out anyone who already knows.
I’ve interacted with reconciliation systems aimed at two types of problems:
Transaction oriented recon makes sure that some external party agrees with every money movement in your ledger. I saw this philosophy in the recon systems I interacted with at Cash App & Square in general. In this model, the objective is to make sure that every money movement action matches your intention. This means the state of the transaction, the direction, and the amount are what you expect. A secondary objective is ensuring the timing matches your intention. This is secondary because in a lot of cases, the actual precise timing doesn't matter as long as it happens "soon", and as long as the underlying accounts aren’t run at a $0 balance. In this case reconciliation solves an accounting problem, ensuring money movements are correct. It also helps ensure that the company's receivables and payables are complete, is useful for regulatory & financial audits, and empowers your treasury team to make good cash management decisions. I suspect most acquirers (Stripe, Square, Adyen, etc.) at least start by pursuing transaction oriented recon.
Typically in an acquiring world, you construct the “internal” ledger from the settlement/capture messages generated by the card networks. This is what product teams look at to inform customer facing features. You construct the “external” ledger from settlement files generated by the acquiring bank. Accounting teams look at the external ledger (technically accounting teams look at both ledgers, but product teams rarely look at the external ledger on an ongoing basis).
A recon system often includes an engineering team paired with an operations team, working together. In cases where the internal and external ledgers disagree, a human (on the ops team) reviews the data. They determine what’s causing the exception, whether it's systematic, how frequently it occurs, and what to do to fix it. The eng team continually optimizes the process to reduce the exception rate over time. Transaction oriented recon primarily solves accounting problems. You’re typically working towards SLAs designed for monthly/quarterly earnings close, and your outputs feed into income/cash flow statements.
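To make the matching step concrete, here is a minimal sketch of transaction oriented recon. It is not how any of the systems I used were built. It assumes both ledgers share a unique transaction identifier (the hypothetical `txn_id`), which is often the hardest part in practice, and the `Entry` type, the `reconcile` function, and the exception labels are all names I made up for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Entry:
    txn_id: str       # hypothetical identifier assumed to exist in both ledgers
    amount: Decimal   # always positive; direction carries the sign
    direction: str    # "credit" or "debit"
    state: str        # e.g. "settled", "pending", "returned"


def reconcile(internal, external):
    """Return a list of (txn_id, reason) exceptions for a human to review."""
    # Assumes txn_id is unique within each ledger.
    ext_by_id = {e.txn_id: e for e in external}
    exceptions = []
    for entry in internal:
        match = ext_by_id.pop(entry.txn_id, None)
        if match is None:
            exceptions.append((entry.txn_id, "missing_external"))
            continue
        # Compare the attributes the essay calls out: amount, direction, state.
        for field in ("amount", "direction", "state"):
            if getattr(entry, field) != getattr(match, field):
                exceptions.append((entry.txn_id, f"{field}_mismatch"))
    # Anything left over exists externally but not in our ledger.
    exceptions.extend((txn_id, "missing_internal") for txn_id in ext_by_id)
    return exceptions


internal = [
    Entry("t1", Decimal("10.00"), "credit", "settled"),
    Entry("t2", Decimal("25.00"), "debit", "settled"),
    Entry("t3", Decimal("5.00"), "credit", "settled"),
]
external = [
    Entry("t1", Decimal("10.00"), "credit", "settled"),
    Entry("t2", Decimal("24.00"), "debit", "settled"),
    Entry("t4", Decimal("7.50"), "credit", "settled"),
]
exceptions = reconcile(internal, external)
```

In this toy data, `t1` ties out, `t2` disagrees on amount, `t3` is missing externally, and `t4` is missing internally. The exception list is what the ops team would work through, and the exception rate is what the eng team would try to drive down.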
Balance oriented recon ensures precise amounts in bank or customer accounts on a periodic basis. You use the same internal and external ledgers as in transaction oriented recon. However, you're comparing not only the amounts, state and direction of a transaction, but also its timing. This type of recon system can be useful for accounting, but is ideal for building systems that report a balance at a point in time. One example is a banking system of record. In the case of a system of record, a balance oriented recon system informs customer-facing balances and FDIC insurance.
Balance oriented recon systems are required for organizations that issue instruments and are the final source of truth for their own ledger. Most financial technology companies today rely on the ledgers managed by their infrastructure providers. For instance, if you issue cards, the banking as a service platform typically connects to the bank’s core, and most traditional bank cores have a balance oriented recon framework built in.
For context - in order to provide FDIC insurance to customers, banks are required to provide an auditable record of customer balances at any point in time. This is usually solved by being able to provide a daily snapshot of customer balances. This function is one of several provided by core processors, and as a result most core processors have a balance oriented recon process built in by default.
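As a rough illustration of the daily snapshot idea, the sketch below computes customer balances as of a cutoff time and compares them to balances reported by an external party. The tuple layout, the function names `balances_as_of` and `snapshot_breaks`, and the sample data are assumptions of mine, not a description of how any core processor works.

```python
from collections import defaultdict
from datetime import datetime, timezone
from decimal import Decimal


def balances_as_of(entries, cutoff):
    """Sum signed ledger entries posted strictly before `cutoff`, per customer."""
    balances = defaultdict(Decimal)
    for customer_id, posted_at, signed_amount in entries:
        if posted_at < cutoff:
            balances[customer_id] += signed_amount
    return dict(balances)


def snapshot_breaks(entries, external_balances, cutoff):
    """Return {customer_id: ours - theirs} wherever the snapshots disagree."""
    ours = balances_as_of(entries, cutoff)
    zero = Decimal("0")
    breaks = {}
    for customer_id in sorted(set(ours) | set(external_balances)):
        diff = ours.get(customer_id, zero) - external_balances.get(customer_id, zero)
        if diff != 0:
            breaks[customer_id] = diff
    return breaks


utc = timezone.utc
entries = [
    ("alice", datetime(2020, 12, 1, 9, 0, tzinfo=utc), Decimal("100.00")),
    ("alice", datetime(2020, 12, 1, 23, 30, tzinfo=utc), Decimal("-40.00")),
    ("bob", datetime(2020, 12, 1, 12, 0, tzinfo=utc), Decimal("20.00")),
    ("bob", datetime(2020, 12, 2, 0, 5, tzinfo=utc), Decimal("15.00")),  # after cutoff
]
cutoff = datetime(2020, 12, 2, 0, 0, tzinfo=utc)
# Hypothetical balances reported by the external party for the same cutoff.
external_balances = {"alice": Decimal("100.00"), "bob": Decimal("20.00")}

breaks = snapshot_breaks(entries, external_balances, cutoff)
```

Here Bob ties out because his late entry falls after the cutoff on both sides. Alice shows a break of -40.00 because our ledger recorded her late-evening debit before the cutoff and the external record did not. That is a timing difference, which is exactly what balance oriented recon has to care about.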
However if you’re the rare card issuer managing your own ledger (or really building any kind of financial product where you’re responsible for your own ledger, such as a digital wallet where you own the money transmission licenses), you’ll need to build a balance oriented recon system eventually. It's the way you’ll be sure you have the money that you’re telling customers you have.
One overarching problem that affects all recon systems is what happens when new types of money movement impact a balance. For instance, imagine you run a digital wallet where your primary funding and cash-out transaction types are ACH debits and credits. Also, imagine you’ve built a perfect reconciliation system, with the combination of technology and human process that allows you to tie out balances and payments with zero failures (this is super unlikely). The moment you add payment cards as funding/cash out instruments, you now have a different external ledger to integrate with. It will have different edges than you're used to. You'll deal with potentially different organizations, who have different processes for resolving exceptions. No matter what you do, this will take time to get right, and long after your new feature is launched, you’ll probably discover new, undocumented quirks. Some of these quirks will only be clear when you’re processing money movements at scale. I’ve seen cases where an incorrect MID set with a card network resulted in hundreds of millions of dollars routed to the wrong (internal) account. It was a survivable error because the transactions were reconciled in aggregate, but it was bad for accounting and a distraction for the cross-functional team members pulled in to swarm the problem.
For balance oriented reconciliation systems in particular, solving timing problems is critical. Timing problems typically occur when a) the payment authorization time and the settlement time are different, and your system’s not necessarily aware, b) you’re dealing with payment types where the settlement amount can be adjusted multiple times, or c) your ledger updates customers’ balances when a new payment authorization comes in, rather than a settlement message. In all these cases you’re grappling with a few questions (I don’t actually know the right answers to these):
Very often you work with a wholesale bank whose systems are seasoned and handle the majority of exceptions using manual workflows. This can be frustrating; you’re faced with either adopting their manual processes, which bind your cost structure to theirs, or accepting a higher exception rate temporarily while you build technical systems around their process. There’s no easy trade-off here.
In the course of product development you’ll often prototype by adding new money movement types to your ledger. A lot of these prototypes (as should happen) will be discarded. Despite this, they will have moved real money and affected your real ledger, and (at least for your accounting team's sanity) you’ll need a stateful way to reconcile the money that moved to your ledger. While at Cash I once spent a year integrating into 6 card issuer processor systems while prototyping the Cash Card. With each integration we needed a float (depositing funds with the card issuer so we could test transactions in the real world), which meant our accounting team now had 6 new banking relationships to monitor, 4 of which lasted less than 6 months, but all of which required material float amounts. In a few cases, the issuer processor didn’t actually enable us to manage our own ledger, so we’d have a parallel ledger (one on our databases and a mirror on theirs) that we’d have to keep in sync. There was at least one integration that we ultimately discarded, which took us several months to reconcile, long after we’d walked away from the partnership. How you handle these cases will depend on what’s financially “material” for your organization. In our case, prototype floats were all sub $100k, so survivable at our scale. But tracking these down repeatedly was an insane level of tedium.
Sometimes you’ll contract with multiple banks for different financial services. For instance, one bank for merchant acquiring and another for card issuing. (Disappointing tip: using the same partner for both functions typically doesn’t simplify ledger management or reconciliation.) If your ledger spans both banks, you’ll need an internal process that ensures the right amounts are in the right accounts, based on customer needs (and sometimes regulatory requirements). This causes some fun problems as you scale - I once saw a process that used a SQL-like cron job - which was under-monitored - to move funds between two accounts a few times a day. Sometimes that job failed, and if you weren’t watching, you’d be stuck a few months later trying to figure out why the actual amounts in each account didn’t match your expectations.
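One cheap guard against the silent failure described above is to check that every scheduled transfer actually has a recorded success. This is only a sketch under my own assumptions: the job records a timestamp each time it succeeds, and `missed_runs` and the 15-minute tolerance are invented for illustration.

```python
from datetime import datetime, timedelta


def missed_runs(scheduled, succeeded, tolerance=timedelta(minutes=15)):
    """Return scheduled run times with no successful run within `tolerance`."""
    missed = []
    for due in scheduled:
        if not any(abs(done - due) <= tolerance for done in succeeded):
            missed.append(due)
    return missed


# Hypothetical schedule: move funds three times a day.
scheduled = [
    datetime(2020, 12, 1, 6, 0),
    datetime(2020, 12, 1, 12, 0),
    datetime(2020, 12, 1, 18, 0),
]
# Success timestamps the job is assumed to have logged.
succeeded = [
    datetime(2020, 12, 1, 6, 2),
    datetime(2020, 12, 1, 18, 1),
]

missed = missed_runs(scheduled, succeeded)
```

In this example the noon transfer never ran, so `missed` contains that one timestamp. Alerting on it the same day is a lot cheaper than hunting for the difference months later.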
This gets even more complex if the accounts operate under different regulatory constructs. For instance, transactions processed under MTLs often have to be kept in separate accounts from commercial transactions. This means a balance that mixes both p2p and commercial payments (e.g. PayPal) ends up being funded from separate payment rails.
A lot of early recon systems will reconcile against aggregate views (a batch of transactions) rather than an individual transaction level. For instance you might look at all transactions in a specific day from a specific counterparty. Even at small scale this gets messy, but it’s better than nothing and happens more than you’d think.
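Here is a small sketch of what aggregate recon might look like, grouping by day and counterparty. The row layout and the function names are hypothetical. The sample data is built to show one way this gets messy: offsetting errors inside a batch cancel out and go unnoticed.

```python
from collections import defaultdict
from decimal import Decimal


def totals_by_batch(rows):
    """Sum signed amounts per (day, counterparty) batch."""
    totals = defaultdict(Decimal)
    for day, counterparty, signed_amount in rows:
        totals[(day, counterparty)] += signed_amount
    return totals


def aggregate_breaks(internal_rows, external_rows):
    """Return {(day, counterparty): ours - theirs} for batches that disagree."""
    ours = totals_by_batch(internal_rows)
    theirs = totals_by_batch(external_rows)
    return {
        key: ours[key] - theirs[key]
        for key in sorted(set(ours) | set(theirs))
        if ours[key] != theirs[key]
    }


internal_rows = [
    ("2020-12-01", "bank_a", Decimal("10.00")),
    ("2020-12-01", "bank_a", Decimal("20.00")),
    ("2020-12-01", "bank_b", Decimal("50.00")),
]
external_rows = [
    ("2020-12-01", "bank_a", Decimal("12.00")),  # off by +2 vs our ledger
    ("2020-12-01", "bank_a", Decimal("18.00")),  # off by -2, cancels the above
    ("2020-12-01", "bank_b", Decimal("45.00")),
]

batch_breaks = aggregate_breaks(internal_rows, external_rows)
```

The `bank_b` batch is flagged with a 5.00 difference. The `bank_a` batch ties out in total even though neither of its transactions matches, which only a transaction-level comparison would catch.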
Thinking ahead - someone probably should build a startup out of this, but it will be a slog.
A while back when I was idea hunting, I asked a senior member of our accounting team: if I could solve one problem for them, what should I pick? The response: “instant month-end close”. One frustrating problem that financial services companies face is closing the books at month end (or quarter end if you're public). I've felt this pain before and know it to be true for other financial services organizations. Of course, recon is not the only thing that slows down the close, but it's often the long pole. This manifests as: 1) it's the end of the quarter and you need to close the books, 2) your internal ledger doesn't match your external ledger, and 3) your product, eng, ops, and accounting teams grind to a halt hunting for the dollars. In addition, given that "money" is your product in financial services, this function is extra important. The bigger you are, the higher the stakes; as a startup, quarterly closes probably don’t matter. As a public company, earnings are at risk. When a recon system works well, “instant month-end close” is the prize.
Even at scale, not all financial technology companies manage their own ledger. Many don’t. However, if you’re running a P2P program and relying on your own money transmission licenses, you’re likely managing a ledger yourself. State regulators can request details of customer balances during routine audits, and you’re obliged to provide them. As an example, last I checked, even gargantuan fintechs like Chime don’t manage their own ledger for the core checking account product. They rely on Galileo, their core processor, which includes a balance reconciliation system. (If you’re at Chime and this is no longer correct, please let me know.)
If your competitive edge involves innovating around money movement, you generally will need to manage the ledger yourself. This includes features like instant funds availability after a stock sale or after receiving credit card swipes. This is because current infrastructure providers aren’t mature enough to give a young product this kind of flexibility. By the time the current crop of infrastructure providers mature, the axis along which you’re trying to innovate will be a commodity, because anyone else will be able to access it via API.
If you’re building a financial services product and managing your own ledger and you have yet to build out a reconciliation process, hopefully this helps. And if you’ve encountered problems not outlined here, I’d love to hear about them: ayo [at] kunle [dot] app.
Thanks to Molly Zhang, Ryan Lea, Dimitri Dadiomov, Timothy Thairu, Jim Esposito, & Dhanji Prasanna for reviewing this in draft form.