Year 2025 in Review
It's another year again, and I look back on what I did. This is mainly for myself, as I didn't write one of these in 2022, and now I can't remember what I spent all my time doing for the whole year.
Away from Databases
I started the year by cutting my losses and moving away from database work. A couple of years back, there weren't any databases that were immutable, local-first, open-source, reactive, and had any number of other properties that I wanted. I spent a good amount of time getting up to speed, but I never got to the point where my output was remotely enough to make it a reality.
I reminded myself that databases were never the point, so I cut my losses and went back to working on the reactive collaborative notebook with effects. Around January, I had to reload my brain with the research I had done on reactive systems and algebraic effects, and then I worked on coding up an algebraic effects system through March.

Visual Programming
I'm not currently working on visual programming, but I seem to have opinions that resonated with others about it.

I wrote this over the course of four days, and had to rewrite it once or twice as I worked out what I was really trying to say. It's notable for a year in review, because I remember the feeling of having something important to say, but struggling to figure out how to say it.
I'm glad that I did, because it ended up on my year in review as something I'm proud of doing. In 2026, I need to listen to that feeling more, and pause to do more writing.
DBSP
DBSP is one of those things that is terribly attractive to me, but I feel guilty working on it. Is it core to a reactive notebook with effects? No, but it's certainly tangential. I spent about a month and a half on a half-baked implementation.
I used AI to help me read the paper, and I learned a lot both from looking at a reference (but bad) implementation and from doing the implementation myself. It was really nice to have something built up like a theorem, from the ground up. However, the issue was how to structure the parts that fell outside of the theorem. I had little to no experience with this kind of domain, and there was a lot of rewriting of code. I learned a lot, but I felt it wasn't enough to tell people about it.
At around this time, Claude Code had come out, and I tried to vibe code DBSP with Claude Code using Sonnet 4.0, but it was a big failure. The code was wrong in subtle ways that I only discovered once I had to work with it. It felt a bit depressing.
Maybe the problem was that I was too worried about ruining an existing codebase with AI-generated slop. So I took a detour of a detour: a sample project that was something "like" a notebook, which I could vibe code on without worrying about terrible code.

Personal Consumer Review
I noticed that one of the things I often asked ChatGPT for was recommendations within a product category. I've often wanted Wirecutter to tell me not just what they recommend, but what to look for. I wanted them to tell me what good looked like and why that was the case. I thought a GPT-based product recommendation canvas would be a good tool to have.
I vibe coded it and was surprised that the initial prototype looked as good and worked as well as it did. I did end up abandoning the project because I figured it was too obvious, and I saw OpenAI was also doing product shopping and recommendations.

It was around this time that I saw people misusing Claude Code.
Misusing Claude Code
People had started using Claude Code for non-coding purposes on personal knowledge bases stored in the file system. I thought it was an interesting experiment, so I finally exported myself out of Roam Research into Logseq. Then I asked Claude Code some questions after it explored my personal knowledge base. The result was a breath of fresh air.
Normally, I don't have anyone to talk to about the things I'm thinking or reading about. But with Claude Code having access to my knowledge base, I didn't have to explain context to have a conversation with it. It felt like instant connection, because it mimicked the feeling of someone getting you. Of course, there are downsides to this, but I soon saw other people misusing Claude Code in different ways, from doing their SOC2 compliance to running their morning retrospectives.
I found it was a powerful combination. I could run a coach for customer development, which I could never do before.

This was the first time I had cobbled together some off-the-shelf tools in order to live in the future. I was going to pivot again, this time from a notebook to a wiki where agents and humans work in the same digital space.
System Evals
While I was doing this, I took Hamel and Shreya's course on system evals. Sri and I did write an eZine on system evals, called Forest Friends, but I felt like most of the knowledge of evals came from Sri, and I needed to shore up my own.

In the past, I never would have ponied up the money for such a course, but later in my career, it's a quick way to get up to speed on something that would have taken me much longer if I had to gather all the materials myself.
I learned some stuff here and there, but excitingly, I could also misuse Claude Code to run a system eval!

Where Agents and Humans Roam
After these experiences, I wanted a wiki that agents and humans can work on together in the same digital space. The setup of Claude Code and Obsidian seemed to have legs, and powerfully so. However, I thought there were improvements that could be made. The ideas I had for the notebook before wouldn't die, but they'd eventually be incorporated.
Around August is when I decided to start work on this, and that's when I went silent in the lab notes, and just worked on writing code.

It wasn't really out of any desire for secrecy, as much as it was that context switching between writing code and writing prose was hard for me. So I'd occasionally write an essay and then go back to writing code. I'm not sure exactly how to articulate what this is yet, but I know I need to do so in 2026. I just thought I'd be further along than I am right now.
The current application is pretty rough UX-wise. A lot of hand-wringing and code-wrangling went into supporting offline editing. I discovered first-hand where the state of local-first software is, and there are still major pieces missing before this is a no-brainer for mainstream devs.

Working with Agents
The cross-cutting theme for this year was trying to see whether vibe coding worked, and whether it worked for me. Claude Code had come out in February, and there were people claiming success with it in production.
Something I've always struggled with was speed vs quality. I tend towards quality, refactoring as I go. However, I always feel terribly slow, compared to other devs. Agentic coding promised a productivity boost, and as desperate as I was for it to work, I just couldn't get it to work for me.
In retrospect, this wasn't really my fault. There were a couple of things working against me.
- Vibe coding works well for well-known stacks. While I was using Python at the time, I wasn't using any frameworks.
- Vibe coding works well for well-known domains. What I was working on was not a web app, and DBSP is hardly a common domain.
- Vibe coding works well when you have better models. Since GPT-5.1 in Codex and Claude Sonnet 4.5 in Claude Code came on the scene, I've noticed much better results.
- Vibe coding works well when you have lots of scaffolding for the coding agent to lean on, such as unit tests, type checking, documentation, comments, PRs, etc. If you're just a single dev, you may not do these things because it doesn't make sense for a single dev to do it.
- Vibe coding works well when you know how to break things down for an agent. It was rather unintuitive to me that I needed to break a feature down into code base research, planning, and implementation stages. It was also an eye opener that you don't need to write every instruction yourself. You can ask the AI to ask you questions in order to write the prompt.
- Vibe coding works well when you're used to managing others. If you don't like managing people, you're going to dislike vibe coding.

It's been a hard adjustment. And even now, I can't completely let go of writing code by hand. There are some parts of my app where I leaned completely into vibe coding, such as scaffolding that I know I'll get rid of, or various bespoke tools not yet available for LiveStore.
One great use is for me to generate prototypes to click around. Sometimes, I need to have something to click on to know how it feels, and vibe coding is a great way to explore the space before doubling down on something.

Another great use is bridging into adjacent ecosystems and spaces that would otherwise have taken too much activation energy to get into, such as Lean.

I haven't leaned completely into vibe coding the core of the code base. At best, I take a middle road that Geoffrey Litt coined "Coding like a Surgeon". While I do ask Codex/Claude Code to research the code base and make a plan, I've found it better to ask it to generate a tutorial for me to implement.
On one hand, this could seem like I'm just bottlenecking myself. But I think the benefits are that I get to keep abreast of the code base for future edits, be in control of bloat, and use my taste and view of the future to correctly draw the system boundaries.
One day, I won't have to do this, but for now, coding agents either aren't good at drawing the correct system boundaries for code bases that are out-of-distribution, or need a lot of context in order to do so.
Retrospect on 2025
When I look back, I only work on something for about 3 to 6 months before moving on to something else. So while the ideas in my head build upon each other and compound, my work doesn't. I think it was because I never really had full conviction with anything I was working on, and I always felt guilty about it if it didn't directly lead to a product.
However, it's not happening this time with this wiki with agents thing. I had wanted it to be personally usable by the end of the year, but it looks like it'll be another month before that can happen. I hope to privately launch it by the spring.
My thesis is that a reactive notebook with effects will make building LLM-driven applications much quicker and more pleasant. But that's just a hunch. That's not something verifiable in a book, but needs to be wrought with the friction of users coming up against product. If I don't get there within 2025, I'll be disappointed in myself.
To sum up, for 2025, I wish to change my work habits to do work faster, create digital by-products that make my work more legible, and get a reactive notebook I want to use every day up and running. That's my goal, and my promise to myself a year from now.
- Me at the end of 2024
I've changed the thesis a bit, so I'm not too hard on myself there. However, I'm disappointed that I haven't created as many digital by-products to make my work more legible. I have dozens and dozens of blog post drafts and ideas that aren't finished, and I need to find a different way to do this.
Commitment to 2026
With a new year comes a new aesthetic. I wanted blog post images that stood out, so for a while I was using pastel origami. But I've been thinking that I want to explore an aesthetic that we started with Forest Friends: an urban architecture with organic curvilinear shapes mated with half-timbered architecture. I want to see what that can look like.
But more seriously, there are two things I can commit to for 2026:
- Launch a product.
- Write a lot more to be legible.
I think with the clarity and conviction that I have now, it seems a little easier. However, I know that there are going to be events that will test that conviction in the coming year.