Are LLMs about to end my career?

Last updated: 21 days ago

Published: about 1 year ago

A common refrain I keep hearing, even from people who should know better, is “you’re a programmer? Your job is dead, isn’t it?” Here’s a measured response.

DISCLAIMER: This post expresses my own personal opinions in the narrowest sense of the term. It does not reflect the opinions or decisions of my customers or employers - past, present, or future.

What is my role?

At the moment, my role is “lead engineer”. Since I’m working on a start-up, that deceptively short title stands for “wears a multitude of hats”. I’ll look at the most common one I wear, break it down, and see whether an LLM could do the work involved.

Building features

This is what I spend most of my time on. At a high level, every feature we build goes through the following phases:

  • conception
  • planning
  • implementation
  • testing
  • release

Let’s review them in order.

Conception

Broadly, I propose that there are two types of features. One of them is customer-driven, the other business-driven.

The customers drive features by asking for them. This is a laborious, human-in-the-loop process. One needs to meet with the customer (or gather their feedback in other ways). Their wishes need to be translated into usable pieces of input for feature conception. Asks from various key account managers need to be collated into a list. Similar requests need to be merged into one, while keeping track of who asked for it, when, and why.

I estimate that an LLM could help with some parts of this process while being useless at others.

For example, collating requests noted by different account managers is something an LLM could do. A sufficiently advanced one, like GPT-4, could conceivably work through the different writing voices of the managers and correctly merge them into a unified list, while noting how many times each request occurred across customers.
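
To make that concrete, here’s a minimal sketch of what that collation step could look like, assuming the ruby-openai gem; the request notes, the prompt, and the model choice are all invented for illustration:

```ruby
require "openai" # the ruby-openai gem - an assumption about the stack

# Hypothetical: free-text notes as different account managers wrote them.
requests = [
  "ACME wants to export their monthly report as CSV",
  "Acme Corp keeps asking when they can download reports as a spreadsheet",
  "Globex: CSV export for the reports, please - third time they've asked",
]

prompt = <<~PROMPT
  Merge these feature requests, combining duplicates into one item.
  For each merged item, note how many customers asked and which ones:

  #{requests.join("\n")}
PROMPT

client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
response = client.chat(
  parameters: {
    model: "gpt-4",
    messages: [{ role: "user", content: prompt }],
  }
)

puts response.dig("choices", 0, "message", "content")
```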

What it couldn’t do is provide the context that tends to exist only in the account manager’s head. Is this customer happy to wait, or are they historically impatient? Is this something critical to their business, or a pie-in-the-sky idea? Did we recently break something they were relying on, and are thus motivated to appease them? What’s this customer’s relative “pull”, i.e. how motivated are we in general to build features nobody else asks for?

Helping with conception is not normally a huge part of my day, but for completeness: I don’t think an LLM can do this job in totality.

Planning

This is where my role begins in earnest. I serve as an advisor while planning the business impact of a feature. I field questions such as: is this something we can do with ease? Can our stack take the increased load? How do we future-proof the feature? What do we need to look out for?

Since I lead the data team, I need to coordinate with the other leads on this. The feature will normally be exposed through an API, so the application lead needs to weigh in on what they need. We need to coordinate with a designer so the feature ends up in a predictable spot in the app; the designers also need to not design something we can’t build. If we need downtime, we need to coordinate with customer support to put notices out in the field. If it’s a hotly awaited feature, the relevant customers need to be notified it’s coming via the same support channels.

This step involves technical expertise and reasoning at a systems level. It also sees me coordinating with multiple stakeholders across the business. At the moment, an LLM has no hope of performing most of those tasks.

Implementation

This is where an LLM, in the hands of an experienced engineer, can help the most. I wrote before that using GitHub Copilot is like collaborating with a junior developer. It comes in useful for the obvious bits of the job, at a single-file level.

But programming is not only about - perhaps not even primarily about - writing code. It’s about the corner cases. It’s about working around that one weird external vendor who always has an outage on Tuesdays. It’s about not wrecking the database with happy-go-lucky SQL queries.
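
For a concrete example of that last point - assuming a Rails-style codebase, with the model and method names invented:

```ruby
# Happy path: loads every row into memory at once. Fine on a seed
# database, a great way to wreck production on a table with millions
# of rows.
User.all.each { |user| user.recalculate_stats! }

# The boring version: ActiveRecord's find_each fetches rows in batches
# (1,000 by default), keeping memory flat and the database unbothered.
User.find_each { |user| user.recalculate_stats! }
```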

I classed Copilot’s output as that of a junior developer because it’s decent at happy paths and complete trash at thinking through the others. It also doesn’t know everything I know about the context in which the feature exists. It can look at files and ape their contents, but that’s about it.

Not that senior engineers don’t ape. I ape existing solutions all the time. But I do it consciously, thinking hard about the implications as I go.

For this part, the verdict is: it’s going to be arduous to become a gainfully employed junior developer. If I can smash through the type of tasks I’d give to one in minutes, I cannot make the case for hiring and training one up. I’m painfully aware that this is a vicious cycle, and I discuss it below.

Testing

Again, LLMs aren’t half bad at generating automated test cases. The work they’ve produced for me, targeted at e.g. RSpec, is okay, if - again - skewed heavily towards happy paths.
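
To illustrate, with a made-up Discount class: the first spec below is the shape of what an LLM hands me readily; the ones after it are the kind I still end up writing myself.

```ruby
RSpec.describe Discount do
  # The happy path - an LLM produces this kind of spec without
  # breaking a sweat.
  it "reduces the price by the given percentage" do
    expect(Discount.new(10).apply(100.0)).to eq(90.0)
  end

  # The unhappy paths - these are the ones I invariably add by hand.
  it "rejects percentages above 100" do
    expect { Discount.new(150) }.to raise_error(ArgumentError)
  end

  it "leaves a zero price at zero" do
    expect(Discount.new(10).apply(0.0)).to eq(0.0)
  end
end
```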

I think that testing for second-order effects is impossible for an LLM. No LLM I know of can spot a memory leak, an overtaxed database, or a UI that isn’t broken but just looks iffy.

Not much else to say. Testing is hard. Partial points for the LLM for automating the most straightforward part.

Release

Not understanding second-order effects comes back to hurt an LLM’s chances in this phase, too. An automated monitoring platform - much simpler than an LLM - can spot that after deployment X, database Y took a hit. What’s missing, again, is the context.

Suppose we - the engineers responsible - knew the database was going to get taxed more. We would’ve realized it back at the planning stage, or - at the latest - while benchmarking the solution in testing. We knew, and we agreed that this was fine.

An automated system lacks this context, and the finesse to understand what is actually happening. It sees the database took a hit to performance, determines the release to have been a net detriment, and rolls it back.
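
In caricature - everything here is invented for illustration - the logic boils down to something like this:

```ruby
# The context-free reviewer, reduced to its essence: it sees a number
# move in the wrong direction and draws the only conclusion it can.
def review_deploy(p95_before_ms, p95_after_ms)
  # There's no way to express "we benchmarked this, we expected
  # the hit, and we signed off on it".
  p95_after_ms > p95_before_ms * 1.2 ? :rollback : :keep
end

review_deploy(40, 55) # => :rollback, even though we planned for this
```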

Software engineering already feels - some days at least - like fighting against obstinate machinery. If said machinery could also decide that my solution is bad, I should feel bad, and do it some other way, I would immediately quit. In that sense I suppose a misapplication of an LLM could end my career. In terms of doing my job - not even close, here.

Becoming a junior engineer is hard, will be harder

Right now, I cannot make a fiscally-sound argument for hiring a junior developer.

The type of rote task I’d normally give to a junior engineer, I can now execute myself in a fraction of the time. I realize that no juniors today means no regulars tomorrow and no seniors next week. However, if I can do it myself - and fast - hiring someone to do it slowly, on the promise of maybe becoming a decent engineer in three to four years, doesn’t make much sense.

I also believe that to become a great engineer, one needs to see multiple codebases and multiple ways of doing things. I continue to apply the best things from each of my previous jobs and projects. I would not have the experience of multiple ways to do a thing if I hadn’t worked multiple jobs.

That makes the argument even harder to make. If nobody is training juniors but us - and it’s plausible not many companies will, now - we can send our junior-cum-regular out into the world, but we won’t get a different one back.

It’s a vicious cycle that’s on my mind a lot. I don’t have any solutions for this one, I’m afraid.

You keep saying “LLM”

Yes. Over the past few months, I researched in depth how the sausage is made - that is, how an LLM is trained, deployed, and used. I now refuse to call them “artificial intelligence”.

There’s a term of art gaining popularity: “artificial general intelligence”. The sausage-peddlers use it for something they admit they don’t have yet. They say they have “AI”, implying they hold two of the three letters necessary to create “AGI”, so the rest must be right around the corner. I think this simply cannot be true.

LLMs are the world’s most elaborate autocomplete engines. Whenever they seem innovative, or intelligent, or capable of reasoning, it’s because they stole from someplace you haven’t seen yet. Peddling LLMs as AI is saying that parroting the internet at great expense is the foundation of reason. Where the seat of consciousness lies is an object of debate amongst philosophers and biologists alike, but I’m fairly sure they’d say it’s not Reddit.

Could a true AI do it?

Sure.

A true AI, capable of reason, research and self-improvement, could do my job. Arguably better than I do: my processes of research and self-improvement are done via some of the least efficient data-absorption devices. My data storage ain’t that great, either. Then there’s the entire “sleeping for 8 hours a night” business (or there should be, at least).

I don’t know when a true AI is coming. I don’t know if it is coming. I don’t believe we have the foundations for it today. It’ll be very interesting to me, as a hopeless nerd, to watch it become a thing. In that sense, I’ll at least be entertained while being made obsolete.

Conclusion

I don’t think I’m out of the fight yet. I might be at some point, and humanity making that leap will be something to watch.

I’m concerned for the people in CS courses and bootcamps right now. While the strata of programmer jobs requiring greater experience may not shrink as much, I’d expect the entry-level job pool to shrink to near-zero. And this is true of many industries, not just software engineering.

I admit, with some trepidation, that I don’t know how to fix this.

