
Some thoughts after playing around with Claude Code

I've spent a morning playing around with Claude Code, which is Anthropic's new "agentic coding" tool. It lives inside the terminal and can work directly on a codebase without needing to copy/paste between an editor and the AI's web interface.

Initial impressions are pretty good - after a couple of hours, I had a simple React app and Express backend service up and running, pulling data about local aircraft traffic from an ADS-B receiving station that I've got running on my local network.

That felt like a reasonable challenge for a coding tool - more than a "hello, world" use case, with some complexity around handling potentially messy data from asynchronous sources, and niche enough that there aren't hundreds of pre-built examples lying around the web which have ended up in LLM training data. I'd have been able to build it unaided, but it would have taken me longer than a morning.
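To give a flavour of what's involved: the core of this kind of backend is little more than a proxy that tidies up the receiver's feed. Below is a minimal sketch, assuming a dump1090-style receiver that publishes its aircraft list as JSON at /data/aircraft.json - the receiver address, route name and field names are illustrative assumptions, not code from the actual project.

```typescript
import express from "express";

// Minimal sketch, not the actual project code: an Express endpoint that
// proxies aircraft data from a dump1090-style ADS-B receiver on the LAN.
// The receiver address and JSON shape are assumptions based on the common
// dump1090/readsb output format.
const RECEIVER_URL = "http://192.168.1.50:8080/data/aircraft.json"; // hypothetical LAN address

interface Aircraft {
  hex: string;           // ICAO 24-bit transponder address
  flight?: string;       // callsign - often space-padded, sometimes absent
  lat?: number;
  lon?: number;
  alt_baro?: number | "ground";
}

const app = express();

app.get("/api/aircraft", async (_req, res) => {
  try {
    const upstream = await fetch(RECEIVER_URL); // global fetch, Node 18+
    if (!upstream.ok) throw new Error(`receiver returned ${upstream.status}`);
    const data = (await upstream.json()) as { aircraft: Aircraft[] };

    // The raw feed is messy: many rows have no position fix yet, and
    // callsigns arrive space-padded. Keep only aircraft we can plot.
    const visible = data.aircraft
      .filter((a) => a.lat !== undefined && a.lon !== undefined)
      .map((a) => ({ ...a, flight: a.flight?.trim() }));

    res.json({ count: visible.length, aircraft: visible });
  } catch (err) {
    // The receiver is a hobbyist box on the local network - it will
    // sometimes be switched off or unreachable.
    res.status(502).json({ error: String(err) });
  }
});

app.listen(3000, () => console.log("ADS-B proxy listening on :3000"));
```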

Before everyone rushes out to fire all their software engineers and get product managers building their products themselves, though, there are a few real-world caveats.

Some thoughts:

It's not cheap. I burnt through $25 of credits inside a couple of hours building a toy application, so the cost is going to stack up quickly if you've got an entire business using the tool. Ironically, the reason there isn't a link in this post to play with the results is that I burnt through $10 of credits just trying to fix a networking issue with the "live" deployment, and had to decide when to call a halt.

It's very quick at the simple stuff. Knocking together a basic application structure or putting up a simple UI, for example, is something you can blast through in no time. Lots of projects already have scaffold generators, but this feels like having a much more "universal" scaffolder.

Things get complicated quickly. There doesn't seem to be much, if any, intermediate stage between simple "hello, world"-level functionality on the one hand, and very verbose, potentially over-engineered approaches on the other. Without specific instructions, the tool has a tendency to build in lots of configuration and options that nobody asked for. Getting it to stick to a "do the simplest possible thing" approach takes a surprising amount of effort (one mitigation is sketched after this list).

It's easy to lose track of what's going on. It's quite hard to follow the code changes as they fly past, which means there's a risk that you end up sitting back and just typing 'yes' to every suggestion. Reconstructing what's actually happened at that point is a lot harder than keeping on top of the code from the outset.

Bugs pop up a lot. The tool tends to introduce new, unrelated problems alongside each incremental feature improvement, so the overall process becomes a mixture of getting it to resolve bugs while also adding new functionality.

Lots of the mistakes are really dumb. The bugs that creep in seem to fall into two categories: obvious howlers that a human probably wouldn't have made in the first place, and things that are a lot more subtle, which makes them harder to spot (and fix).

Getting rid of complex problems can be hard. Fixing the dumb stuff is usually just a case of pointing the problem out, but more in-depth issues can very quickly spiral into a cycle of "fix one thing, break something else" which is hard to break out of.

Sometimes that means throwing stuff away. When that spiral happens, the only way out seems to be to throw away the changes and start over. The tool isn't great at keeping track of its own changes, so you need to stay in charge of the Git history yourself - committing after each working step, so there's always a known-good state to reset back to.

The AI is a very convincing bullshitter. Superficially, you're working with a supremely confident and effective tool, which has a nice line in chatty self-assurance about what it's doing. What's actually happening, though, can be very different - some of the bugs were both subtle and well hidden. I'd feel nervous if I were building something mission-critical in performance terms, or highly sensitive in terms of the data it handles.
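On the over-engineering point above: Claude Code reads standing instructions from a CLAUDE.md file in the project root, and that's a natural place to pin down the "simplest possible thing" constraint rather than repeating it in every prompt. The wording below is my own sketch, not a tested recipe:

```markdown
# CLAUDE.md - standing instructions for this project

- Do the simplest thing that can possibly work; no config options,
  feature flags, or abstraction layers unless explicitly asked for.
- Don't add new dependencies without asking first.
- Make one small change at a time, and explain the diff before moving on.
```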

Overall, it's very impressive, and I definitely achieved a lot more in a morning than I would unaided. But - if I didn't have at least an intermediate knowledge of how to go about building the solution in the first place, I think I'd have ended up with a mess that would have been impossible to fix.

There are lots of analogies that get thrown around with all the hype surrounding AI, but one I keep seeing is that software is in the midst of an industrial revolution. We're moving from a world where everything was built by hand, to one where everything is built by machine.

The more I use these tools, though, the less convinced I am that this is the right way of looking at it. A better analogy would be the automation that you'll find on the flight deck of a commercial airliner.

The routine, non-critical stuff is largely taken care of, but there's an expert in the loop at all times. They're there both to monitor the automation, and crucially to intervene and take charge if the automation either goes wrong, or can't cope with the situation. Some flights could be automated more or less in their entirety from take-off to landing, but would you be prepared to trust your safety to wholly-automated systems on ALL flights?

That throws up two questions. Firstly, if we're automating away the boilerplate coding and the routine stuff, where will future generations of software engineers get the experience they need to reach the intermediate level needed to stay in control of the process?

Secondly, how many upcoming software disasters are going to occur because businesses will succumb to the temptation to cut costs in the short-term with AI tools, and end up with unmaintainable messes as a result?