V.Taste·March 24, 2026·7 min read

What AI website builders get wrong about taste

Speed isn't the point if the output is generic.

Knid

Founder

We build an AI website builder. Obviously I'm going to have opinions about the category, and obviously those opinions are going to flatter the thing I'm selling. I'll try to be honest anyway. The current crop of AI app builders — Lovable, Bolt, v0, Replit Agent, Framer AI, Figma Make, and the long tail of others — does a lot of things extraordinarily well. And there is one thing they all get wrong. It's the thing that matters most.

What they get right, to acknowledge it up front, is genuinely impressive. They take an English sentence and return working code. That was unthinkable three years ago. They deploy the result to a URL in minutes. They let you iterate by chatting. They handle auth, databases, payments — things that used to need a backend engineer. If you ignore the quality of the output, the engineering behind these products is some of the most remarkable work of the decade.

But you can't ignore the quality of the output, and the quality of the output is where every current AI builder — including, if I'm honest, ours in many places — still falls short. The gap isn't functional. The gap is taste.

What I mean by taste

Taste isn't an arbitrary aesthetic preference. It's the ability to make a thousand small decisions correctly in a way that adds up to something that feels specific and alive. A designer with taste picks the right shade of terracotta the first time — not because there's a rule, but because they've absorbed the context and the correct answer is obvious. A writer with taste cuts the sentence that doesn't need to be there. A developer with taste picks names that make the codebase easier to read in six months.

AI models, as of 2026, are getting alarmingly good at the large-scale stuff. They can generate a whole website. They can structure a database schema. They can write a full reservation flow with email confirmations. What they still can't reliably do is the thousand tiny decisions. They'll put the line-height at 1.5 when the correct answer is 1.6, pick a sans-serif where the serif was the whole point, copy-paste a cliché opening paragraph, and reach for the three-feature grid every single time.

The average of the internet

The reason AI output has this problem is obvious once you think about it. The models are trained on the internet. The internet is, on average, not that great. The modal landing page is a SaaS landing page, which means the output regresses to a SaaS landing page every time unless you explicitly push it somewhere else. Ask for something “clean and modern” and you'll get the average of every clean-and-modern site the model has seen. The average is mid.

This isn't the model's fault. This is a statistical problem. You can't make an original thing by averaging unoriginal things. You can only make the typical thing. If you want the atypical thing, you have to push the model away from its defaults, using specific references, specific constraints, and specific taste anchors you bring yourself.

The workaround

Here's the honest truth, which we try to bake into Wemob but which applies to every AI builder: the quality of your output is the quality of your input. If you type “make me a café website” into any of the top AI builders, you'll get something generic, because that's what the average of “café websites on the internet” looks like. If you type something specific (a tone, a mood, a reference, a constraint), you'll get something specific back.

The good AI app builders in 2026 are the ones that nudge you toward being specific. They ask clarifying questions. They surface taste defaults that aren't average. They seed the prompt with strong opinions of their own. The bad ones just accept your vague request and faithfully return the average.
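
One way a builder could nudge a user toward being specific is to check which taste details are missing from a brief and ask about only those. The sketch below is a hypothetical illustration, not how Wemob or any other builder named here actually works; the `Brief` fields and the wording of the questions are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Brief:
    """A hypothetical project brief; the field names are illustrative."""
    request: str
    tone: Optional[str] = None
    reference: Optional[str] = None
    constraint: Optional[str] = None


# Assumed questions, one per taste field the user might leave blank.
QUESTIONS = {
    "tone": "What should the site feel like, in two or three words?",
    "reference": "Is there a site or screenshot this should feel close to?",
    "constraint": "Is there anything the design must include or avoid?",
}


def clarifying_questions(brief: Brief) -> list[str]:
    """Return one question for every taste field that is empty."""
    return [q for name, q in QUESTIONS.items() if not getattr(brief, name)]


vague = Brief(request="make me a café website")
for question in clarifying_questions(vague):
    print(question)
```

A vague request triggers all three questions; a brief that already names a tone, a reference, and a constraint triggers none, so the user who arrives with specifics is not slowed down.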

The problem with training on everything

A deeper problem, which the whole category will eventually have to grapple with, is that training on the entire internet gives you a model with no taste. The model has seen every website, but it doesn't know which ones are good. It's read every design manual, but it doesn't know which advice is timeless and which is 2014-specific. It's been exposed to billions of examples, and from its perspective, all of them are equally valid data.

The analogue in human design education is a student who has been shown a thousand websites but never told which ones were masterpieces and which were embarrassments. That student graduates able to produce acceptable work but unable to tell you why one landing page is better than another. That's the current state of every AI model we've tried.

What would fix this

Fixing taste in AI models is a research problem, not a prompt-engineering problem. A few things would help:

Curated training data. A million carefully chosen websites beats a billion random ones. Quality over quantity. Some labs are starting to do this.

Strong default opinions in the tool. The builder should have taste baked in, not wait for the user to supply it. Our own Wemob makes specific choices — Instrument Serif, warm cream, italic accents — on purpose, so the default output starts with something specific rather than average. Every AI builder in the category should do the same, even if their defaults are different from ours.

A small, opinionated library of references. Rather than training on the whole internet, train on a hundred websites the people building the tool actually admire. You give up breadth, but you get a model with a point of view.

The one thing you can do today

If you're using an AI builder right now and want better output, the fastest improvement you can make is to start every new project by pasting a reference — a screenshot of a site you love, a link to a project you want to feel similar to, a short description of the mood. You're not asking the model for the average. You're giving it a specific anchor and asking it to work from there.

The output won't be as good as what a designer with taste could produce by hand. But it'll be much better than the default, and it'll be yours — not the average of everyone else's.
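
If you start projects often, the habit of leading with a reference can be turned into a small template. This is a minimal sketch under assumptions: the function name, the `Reference:` / `Mood:` / `Request:` labels, and the ordering are illustrative choices, not a format any particular builder requires.

```python
def anchored_prompt(request: str, reference: str, mood: str = "") -> str:
    """Build a prompt that states the reference before the request."""
    lines = [f"Reference: {reference}"]
    if mood:
        # The mood line is optional; skip it when nothing was given.
        lines.append(f"Mood: {mood}")
    lines.append(f"Request: {request}")
    return "\n".join(lines)


print(anchored_prompt(
    request="make me a café website",
    reference="screenshot of a site I love (attached)",
    mood="quiet, warm, unhurried",
))
```

The point of the template is only that the anchor is never forgotten: the request is the same vague sentence as before, but it no longer travels alone.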

The quality of your output is the quality of your input. Every time.