Lab note #047: Progress on Notebook
My progress on Crafting Interpreters halted while I was researching how algebraic effects are implemented, once I realized that I might not need all the gory details. The part I wanted to get to was understanding how closures are implemented, and whether they were relevant to getting algebraic effects to work. But I don't think they're strictly necessary for the time being.
For now, the functional notebook is on an odd stack. It's a VSCode extension that runs a webview, which runs Python in WASM in a web worker. Much of the effort was spent just getting Python running in WASM in a web worker, due to VSCode's idiosyncrasies. This is the type of stuff I'd love for LLMs to help with, but there's so little training data for it that they often get the APIs wrong. Even when I add documentation to Cursor's index, it doesn't seem to help.
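For reference, the worker side of that stack looks roughly like the sketch below. This is a minimal sketch under assumptions, not the actual extension code: it assumes Pyodide as the Python-in-WASM build and the npm `pyodide` package, and it skips the webview CSP and asset-path wrangling where most of the idiosyncrasies live.

```ts
// pyodide.worker.ts: minimal sketch of the web worker that runs Python in WASM.
// Assumes the npm "pyodide" package; the real setup also has to satisfy the
// webview's content security policy and point Pyodide at its WASM/stdlib assets.
import { loadPyodide, type PyodideInterface } from "pyodide";

const pyodideReady: Promise<PyodideInterface> = loadPyodide();

self.onmessage = async (event: MessageEvent<{ id: number; code: string }>) => {
  const pyodide = await pyodideReady;
  const { id, code } = event.data;
  try {
    // Run the cell's Python source and send the result back to the webview.
    const value = await pyodide.runPythonAsync(code);
    self.postMessage({ id, ok: true, value: String(value) });
  } catch (err) {
    self.postMessage({ id, ok: false, error: String(err) });
  }
};
```

The webview side then just posts `{ id, code }` messages per cell and matches replies by `id`.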
For all the stories on Twitter of people one-shotting features through Cursor's Composer, that hasn't been my experience. Even when I write a more detailed doc, I find that some of the design choices it makes are just...odd, or at least out of context with my codebase, even though as a text editor it's supposed to have the best context of my codebase! The most detrimental thing about trying to one-shot entire features with Cursor Composer was that I would abdicate thinking about the problem in detail. Then, when things didn't work, I would assume it was my prompt, or that I hadn't provided enough context.
In my experience, the stuff I've been doing so far (theorem proving, immutable data structures, and algebraic effects) isn't what engineers are typically doing, so there's less training data for it. Even some of the stacks I end up using, such as Isabelle, Zig, or VSCode extensions, just aren't that popular, so the suggested code is often wrong.
What I found more productive was treating the LLM as a teaching tool rather than a productivity tool. When it suggested solutions, I wasn't worried about one-click application to my existing codebase. Instead, I could question why it made certain choices and not others. That way, I could identify where it wrote extraneous code, or tell when it was heading down a dead end.
I've also started leveraging Claude's Projects, similar to NotebookLM's Notes or OpenAI's Canvas. It's a new kind of writing tool, where the focus between you and the AI isn't on the conversation itself, but on some digital artifact that you're building together. It's the same feeling you get when you're playing Jenga--no one is focused on each other, but on something between us all that we interact with. Through the process, you keep gathering context for the LLM to process, either from external sources or curated from your conversation with it. I think this is a recognition that LLMs do much better when they can operate with the relevant context. But getting that relevant context into the chat is really hard at the moment. Cursor should have the best context for my codebase, but I find it's actually pretty terrible at this for the time being.
What is the UX for collaborating with an AI? I think we're all kind of fishing around for something that makes sense; so far, nothing has cracked it entirely. But I think this current crop is in the right direction. I have a feeling that Geoffrey Litt is correct that version control will play a big part in how we collaborate with AIs.
I have most of the mechanics and the communication between each part of the effects system completed. Next will be getting it to find the correct handler, executing the handler, and resuming control back to where the effect was raised. I want to get this part done so I can start trying it out, because any deployment of the notebook will have to have this reactive effects runtime running as well.
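To make that step concrete, here is a minimal sketch of the shape of that dispatch loop, modeled with generators and one-shot resumption. This is not the notebook's actual runtime, and every name in it is made up for illustration: the computation yields an effect, the runner searches outward for a matching handler, executes it, and resumes the computation where the effect was raised.

```ts
// A minimal, hypothetical sketch of effect dispatch with one-shot resumption.
type Effect = { tag: string; payload?: unknown };
type Handler = (payload: unknown) => unknown;

// Run a generator-based computation against a stack of handler scopes.
// Innermost scopes are searched first, mirroring dynamic scoping of handlers.
function run(
  comp: Generator<Effect, unknown, unknown>,
  scopes: Array<Record<string, Handler>>,
): unknown {
  let step = comp.next();
  while (!step.done) {
    const effect = step.value;
    // Find the correct handler: the nearest enclosing scope that handles this tag.
    const scope = [...scopes].reverse().find((s) => effect.tag in s);
    if (!scope) throw new Error(`unhandled effect: ${effect.tag}`);
    // Execute the handler, then resume the computation where the effect was raised.
    step = comp.next(scope[effect.tag](effect.payload));
  }
  return step.value;
}

// Usage: a computation that raises an "ask" effect and resumes with the answer.
function* program(): Generator<Effect, string, unknown> {
  const name = yield { tag: "ask", payload: "name" };
  return `hello, ${name}`;
}

console.log(run(program(), [{ ask: () => "world" }])); // "hello, world"
```

The "reactive" part of the runtime, whatever form it ends up taking in the notebook, isn't captured here; this only shows the raise, handle, and resume loop.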