The Agent That Manages My Personal Budget

Meet Budsy

Somewhere around week three of my OpenClaw setup, I realized I had 14 AI agents handling code, marketing, email, and research, but I was still manually categorizing bank transactions in a spreadsheet. That felt wrong.

So I built a personal budget app. The agent that runs it is nicknamed Budsy. It lives in Slack at #ai-budget, runs on Claude Sonnet 4.6, and has one job: help me understand where my money goes without me having to think about it more than necessary.

The app itself is a Next.js application backed by Supabase. Forge, my lead engineer on Opus 4.6, wrote virtually all of it. It went from a 52/100 audit score to 94/100 through a series of focused improvement sprints. Seventy-nine tests passing. Clean build. Auto-deploys to Netlify.

But the interesting part isn't the tech stack. It's the workflow.

The philosophy: suggest, don't decide

Early on I made a decision that shaped everything: the agent suggests, the human approves. Budsy never auto-creates a booking. Never auto-categorizes a transaction without my sign-off. Never sends anything to my accounting system without me hitting a button.

This sounds conservative. It is. I've seen what happens when you let AI make financial decisions unsupervised. The confidence scores look great right up until the agent miscategorizes a 47,000 NOK payment as "office supplies" because the vendor name was ambiguous.

The threshold I settled on: if the suggestion engine has 90% or higher confidence, it pre-fills the form. Below that, it suggests but doesn't pre-fill. It never auto-creates a draft booking, regardless of confidence. This means I spend a few minutes each day reviewing suggestions and approving them. That's the tradeoff, and I'm comfortable with it.
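A minimal sketch of that rule in TypeScript might look like this. The type and function names are hypothetical, not the actual Budsy code; only the 90% threshold and the "never auto-create" rule come from the description above.

```typescript
// Hypothetical names for illustration only.
type SuggestionAction = {
  prefillForm: boolean; // true: the form is filled in, ready for review
  autoCreate: false;    // always false: a human approves every booking
};

// Assumes confidence is expressed as a fraction between 0 and 1.
const PREFILL_THRESHOLD = 0.9;

function decideAction(confidence: number): SuggestionAction {
  return {
    prefillForm: confidence >= PREFILL_THRESHOLD,
    autoCreate: false,
  };
}
```

Encoding autoCreate as the literal type false means the compiler rejects any code path that tries to switch it on, whatever the confidence score.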

How transactions flow in

There are two paths for getting transactions into Budsy.

The first is CSV import. I download a statement from my bank, drop the file into the app, and it parses the transactions. Each one gets a row in the database with the date, amount, description, and vendor name.
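As a rough illustration, a CSV import step could look like the sketch below. The column names (date, amount, description, vendor), the semicolon separator, and the comma-decimal handling are all assumptions; real bank exports vary, and this version does not handle quoted fields or thousands separators written as commas.

```typescript
type ImportedRow = { date: string; amount: number; description: string; vendor: string };

function parseStatementCsv(csv: string, separator = ";"): ImportedRow[] {
  const lines = csv.split(/\r?\n/).filter((line) => line.trim() !== "");
  const [header, ...rows] = lines;
  if (header === undefined) return [];

  // Locate each required column by its (assumed) header name.
  const columns = header.split(separator).map((c) => c.trim().toLowerCase());
  const index = (name: string): number => {
    const i = columns.indexOf(name);
    if (i === -1) throw new Error(`Missing column: ${name}`);
    return i;
  };
  const iDate = index("date");
  const iAmount = index("amount");
  const iDesc = index("description");
  const iVendor = index("vendor");

  return rows.map((row) => {
    const cells = row.split(separator).map((c) => c.trim());
    // Accept "1234.50" as well as the comma-decimal form "1 234,50".
    const rawAmount = (cells[iAmount] ?? "").replace(/\s/g, "").replace(",", ".");
    const amount = Number(rawAmount);
    if (rawAmount === "" || Number.isNaN(amount)) {
      throw new Error(`Bad amount in row: ${row}`);
    }
    return {
      date: cells[iDate] ?? "",
      amount,
      description: cells[iDesc] ?? "",
      vendor: cells[iVendor] ?? "",
    };
  });
}
```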

The second path is newer and more interesting. I have a Folio agent that connects to my bank accounts via API. Every night at 2am Oslo time, a sync script pulls new transactions from Folio, deduplicates against what's already in the database using a unique activity ID, and inserts the new ones. When transactions come in through Folio, the matching engine automatically runs against any unmatched receipts.

The Folio integration was a real project. The sync script has a --dry-run flag so I can see what it would import without committing anything. It writes a log file for each run. It handles multiple bank accounts, each mapped to a specific company in my setup, because I have transactions from both my personal accounts and my company (Bakke & Co AS).
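The deduplication and dry-run behavior can be sketched roughly as follows. This is not the real sync script: the types, the function name, and the insert callback are hypothetical stand-ins for the bank API and the database layer, and the only details taken from the text are deduplicating on a unique activity ID and skipping writes in dry-run mode.

```typescript
type BankTransaction = {
  activityId: string; // unique ID from the bank, used for deduplication
  date: string;
  amount: number;
  description: string;
};

type SyncResult = {
  newTransactions: BankTransaction[]; // what was (or, in a dry run, would be) inserted
  skipped: number;                    // duplicates that were ignored
  dryRun: boolean;
};

function syncTransactions(
  incoming: BankTransaction[],
  existingIds: Set<string>,
  insert: (tx: BankTransaction) => void, // stand-in for a database write
  dryRun: boolean,
): SyncResult {
  const seen = new Set(existingIds);
  const fresh: BankTransaction[] = [];
  for (const tx of incoming) {
    // Skip anything already stored or repeated within this batch.
    if (seen.has(tx.activityId)) continue;
    seen.add(tx.activityId);
    fresh.push(tx);
  }
  if (!dryRun) {
    for (const tx of fresh) insert(tx);
  }
  return { newTransactions: fresh, skipped: incoming.length - fresh.length, dryRun };
}
```

In a real database, a unique constraint on the activity ID column would be a sensible second line of defense, so a duplicate can't slip in even if two syncs overlap.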

The suggestion engine

This is the core of Budsy's intelligence. When a transaction needs a booking (which is what Norwegians call a "bilag," an accounting entry), the suggestion engine tries to figure out the right account code.

It uses a three-signal hierarchy.

First signal: receipt line items. If there's a matched receipt, and that receipt has been processed by GPT-4o-mini to extract line items, those items give the strongest signal. A receipt from a restaurant with food items and a service charge is clearly a meal expense. A receipt from an electronics store listing a specific product is clearly an equipment purchase.

Second signal: vendor history. If I've booked transactions from this vendor before, Budsy remembers what account code I used. This is a learning loop. Every time I create a booking, the system stores the mapping between vendor name and account code. Next time a transaction from that vendor appears, it suggests the same code.

Third signal: transaction description keywords. The text that your bank puts on the transaction. "REMA 1000 MAJORSTUEN" tells you it's a grocery purchase at a specific store. "SPOTIFY" tells you it's a subscription. The engine maps common keywords to account codes.

The hierarchy matters. Receipt data is the most specific, so it wins. Vendor history is the most reliable for repeat purchases, so it comes next. Transaction descriptions are the fallback when there's no receipt and no history.
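One way to express that precedence (receipt first, then vendor history, with description keywords as the fallback) is a simple chain of early returns. Everything here is a sketch under assumptions: the types, the function name, and the keyword table are invented for illustration, and the receipt-derived account code is assumed to have been computed elsewhere from the extracted line items.

```typescript
type Suggestion = {
  accountCode: string;
  source: "receipt" | "vendor_history" | "description";
};

type Signals = {
  receiptAccountCode?: string; // assumed to come from receipt line items, if a receipt is matched
  vendorName?: string;
  description: string;         // the text the bank puts on the transaction
};

// Illustrative keyword table only; not the real mapping.
const KEYWORD_CODES: Array<[string, string]> = [
  ["KONTOR", "6300"],
  ["CIRCLE K", "7100"],
];

function suggestAccountCode(
  signals: Signals,
  vendorHistory: Map<string, string>, // lowercased vendor name -> last approved account code
): Suggestion | null {
  // 1. Receipt data is the most specific signal, so it wins.
  if (signals.receiptAccountCode) {
    return { accountCode: signals.receiptAccountCode, source: "receipt" };
  }
  // 2. Vendor history: reuse the code approved for this vendor before.
  const vendorKey = signals.vendorName ? signals.vendorName.trim().toLowerCase() : "";
  const remembered = vendorKey ? vendorHistory.get(vendorKey) : undefined;
  if (remembered) {
    return { accountCode: remembered, source: "vendor_history" };
  }
  // 3. Fallback: keyword match on the bank's description text.
  const text = signals.description.toUpperCase();
  for (const [keyword, code] of KEYWORD_CODES) {
    if (text.includes(keyword)) return { accountCode: code, source: "description" };
  }
  return null; // no signal at all: leave it for manual review
}
```

Returning the source alongside the code makes it easy to show in the review UI why a suggestion was made.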

The account code system

Norwegian accounting uses a standard chart of accounts (kontoplan). Budsy has 31 account codes seeded in the database, covering the range from 3000 (sales income) to 7770 (other operating expenses), plus balance sheet accounts.

When you're building a personal finance agent, account codes might seem like overkill. They're not. Having standardized categories means the data flows cleanly into Fiken, which is the Norwegian accounting software I use via API. Fiken expects journal entries with specific account codes. If Budsy assigns them correctly, the data moves from my bank to my accounting system with minimal friction.

The account code picker in the UI shows the code number and description in Norwegian. "6300 - Kontorrekvisita" (office supplies). "6540 - Inventar" (furniture). "7100 - Bilkostnader" (car expenses). Budsy suggests one based on the three-signal engine, and I can accept or change it.

The booking queue

Not everything gets booked immediately. Some transactions need me to check a receipt first, or confirm which company they belong to, or decide whether something is a business expense or personal.

The booking queue is a list of pending items. Each one shows the transaction details, the suggested account code, the confidence level, and whether there's a matched receipt. I can open a drawer, review the suggestion, adjust if needed, and approve.

The queue view has filters and sorting. I usually sort by confidence, handling the high-confidence items first (quick approvals) and leaving the ambiguous ones for when I have more context.

There's also a reimbursement detection feature. If a transaction comes from a personal bank account but the vendor and category suggest it was a business expense, Budsy flags it as a potential reimbursement. This happens automatically based on the source account's company type (personal vs. AS).
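The reimbursement rule boils down to a small predicate. The sketch below assumes a hypothetical flag that says whether the suggested category looks like a business expense; how that flag is derived from vendor and category is not shown here.

```typescript
type CompanyType = "personal" | "AS";

type QueueItem = {
  sourceCompanyType: CompanyType;   // type of the entity that owns the source bank account
  looksLikeBusinessExpense: boolean; // hypothetical flag derived from vendor and category
};

function isPotentialReimbursement(item: QueueItem): boolean {
  // A business-looking expense paid from a personal account may need to be reimbursed.
  return item.sourceCompanyType === "personal" && item.looksLikeBusinessExpense;
}
```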

The receipt pipeline

Receipts arrive through two channels.

The first is the #ai-receipts Slack channel. I can drop a photo or PDF of a receipt there, and it gets processed. The Folio agent picks it up, extracts the data, and stores it.

The second is email. My Hermes agent monitors Gmail for emails that look like receipts, invoices, or order confirmations. It doesn't immediately dump them all into Budsy. That was a design decision I'm glad I made. Instead, Hermes uses a lazy approach: receipts get pulled into Budsy only when a matching transaction is found. This prevents the system from filling up with receipts that might never match anything.

Matching is deterministic. No AI involved. The engine checks amount (within 10% tolerance), date (within 7 days), and vendor name (using Levenshtein distance for fuzzy matching). If those three signals line up, the receipt gets linked to the transaction.
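A deterministic matcher along those lines fits in a few dozen lines. In this sketch the 10% amount tolerance and the 7-day window come from the text; the vendor similarity threshold of 0.7, the normalization (trim and lowercase), and the assumption that receipt and transaction amounts share the same sign are my own guesses.

```typescript
type Receipt = { amount: number; date: string; vendor: string };
type Transaction = { amount: number; date: string; vendor: string };

// Classic Levenshtein edit distance, computed with a single rolling row.
function levenshtein(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value from the previous row, previous column
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

const DAY_MS = 24 * 60 * 60 * 1000;

function isMatch(receipt: Receipt, tx: Transaction, minVendorSimilarity = 0.7): boolean {
  // Amount: within 10% of the transaction amount.
  const amountOk = Math.abs(receipt.amount - tx.amount) <= 0.1 * Math.abs(tx.amount);

  // Date: at most 7 days apart (dates given as ISO strings such as "2025-03-01").
  const daysApart = Math.abs(Date.parse(receipt.date) - Date.parse(tx.date)) / DAY_MS;
  const dateOk = daysApart <= 7;

  // Vendor: edit distance normalized to a 0..1 similarity score.
  const a = receipt.vendor.trim().toLowerCase();
  const b = tx.vendor.trim().toLowerCase();
  const longest = Math.max(a.length, b.length);
  const similarity = longest === 0 ? 0 : 1 - levenshtein(a, b) / longest;

  return amountOk && dateOk && similarity >= minVendorSimilarity;
}
```

Bank descriptions often carry extra text such as a store location, which drags a plain Levenshtein score down, so a real implementation would probably strip or tokenize that text before comparing.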

When a receipt matches and a booking is created, only then does the system generate a PDF. I use Puppeteer running locally on my Mac Studio to render the receipt HTML to PDF. No cloud service, no AI, just headless Chrome. The PDF gets attached to the Fiken journal entry when the booking is posted.

I have about 855 existing receipt PDFs on disk from before this system existed. Those can be bulk-ingested since they're already rendered. New receipts follow the lazy path.

What's still manual

I want to be honest about the gaps.

Bank statement downloads are still manual for my personal accounts. The Folio integration handles the business accounts automatically, but my personal bank doesn't have a great API. So once or twice a week, I download a CSV and import it.

Ambiguous transactions still need my brain. "VIPPS *BAKKE" could be anything. The suggestion engine can't help much when the vendor name is just my own name on a Vipps transfer. These go into the queue and wait for me to remember what the payment was for.

The Fiken integration needs an access token that I haven't generated yet for production use. The code is written. The API client exists. It builds journal entries with amounts converted to øre (1/100 of a krone, the Norwegian equivalent of cents), attaches receipt PDFs, and posts to Fiken. But I need to go into the Fiken developer settings and generate the token. One of those "ten-minute tasks" that keeps getting deferred.
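The øre conversion is small but easy to get wrong with floating-point numbers. A minimal sketch, assuming amounts are stored as kroner with at most two decimals (the function name is hypothetical):

```typescript
// Convert an amount in kroner to whole øre.
// Rounding matters: 0.29 * 100 evaluates to 28.999999999999996 in floating point.
function nokToOre(amountNok: number): number {
  if (!Number.isFinite(amountNok)) {
    throw new Error("Amount must be a finite number");
  }
  return Math.round(amountNok * 100);
}
```

Truncating instead of rounding would turn 0.29 NOK into 28 øre, which is exactly the kind of one-øre discrepancy that makes a journal entry fail to balance.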

And the nightly financial summary, which is supposed to be owned by the CFO agent and posted to Slack at 7am if anything needs attention, isn't wired up yet. The pieces are there. The integration between agents isn't.

How to build something like this

If you want to replicate a personal finance agent, here's what you actually need.

You need transaction data in a structured format. CSV export from your bank is the minimum. An API connection to your bank is better. The data needs to include: date, amount, description or reference text, and counterparty/vendor name.

You need a categorization scheme. It can be as simple as ten categories (groceries, transport, entertainment, subscriptions, rent, dining, shopping, health, utilities, other) or as detailed as a full chart of accounts. Start simple. You can always add granularity.

You need a database. Supabase works well because it gives you Postgres with row-level security out of the box. Store transactions, categories, vendor mappings, and receipts.

You need a suggestion engine. This can start dead simple: a lookup table mapping vendor names to categories based on your past behavior. Every time you categorize a transaction, store the mapping. Next time that vendor appears, suggest the same category. In my case, this alone handled roughly 60-70% of transactions after a month of training.
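The "dead simple" version really is just a map with a bit of normalization. The class below is an in-memory sketch with hypothetical names; in practice the mappings would live in a database table so they survive restarts.

```typescript
class VendorMemory {
  private readonly mappings = new Map<string, string>();

  // Normalize so "Spotify AB" and "  spotify  ab " hit the same entry.
  private static normalize(vendor: string): string {
    return vendor.trim().toLowerCase().replace(/\s+/g, " ");
  }

  // Call this every time a human confirms or corrects a category.
  record(vendor: string, category: string): void {
    this.mappings.set(VendorMemory.normalize(vendor), category);
  }

  // Returns the last confirmed category, or undefined for an unknown vendor.
  suggest(vendor: string): string | undefined {
    return this.mappings.get(VendorMemory.normalize(vendor));
  }
}

// Usage: corrections simply overwrite the previous mapping.
const memory = new VendorMemory();
memory.record("Spotify AB", "entertainment");
memory.record("Spotify AB", "subscriptions"); // human corrected the category
```

Because the latest decision overwrites the old one, a single correction changes every future suggestion for that vendor.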

You need a review interface. This is the part most people skip. They try to fully automate categorization and end up with a mess. Build a queue. Show the suggestion. Let the human confirm or correct. The corrections feed back into the suggestion engine.

If you want receipt matching, you need a way to get receipts into the system (email forwarding, photo upload, or Slack channel) and a matching algorithm. Amount plus date plus vendor name, with reasonable tolerances, works surprisingly well without any AI.

The whole thing can start as a weekend project and grow from there. My version is more complex because it integrates with Norwegian accounting software and a multi-agent system. But the core loop (transactions in, suggestions generated, human review, bookings created) is something anyone can build.

What changed

Before Budsy, I'd spend a Sunday afternoon every month going through bank statements, manually entering things into accounting software, hunting for receipts in my email. I dreaded it. Most people dread it. That's why most freelancers and small business owners have messy books.

Now the transactions flow in automatically or with a quick CSV import. Receipts match themselves. Account codes get suggested correctly most of the time. My job is reviewing and approving, which takes maybe ten minutes a day when I'm consistent about it.

The booking queue currently has a backlog. I'll be honest about that too. I haven't been perfectly consistent. But the difference between "everything is in the system waiting for review" and "nothing is tracked at all" is enormous. Even with a backlog, I have a clear picture of my spending.

The agent suggests. I decide. The books stay clean. For a system that didn't exist eight weeks ago, built almost entirely by AI agents, I'll take that.