Your AI Agent Is Building 5-Star Experiences. That's the Problem.

The 11-star framework that turns AI from a shortcut into a standard

Mar 28, 2026


María opens Figma at 6 AM in her apartment in Medellín. She has four hours before her contract shift starts. She is building a wardrobe app, solo, no team, no funding. She types a prompt into her AI coding agent: “Create a screen where users can browse their closet.” The agent returns a grid of thumbnails. Rounded corners. A search bar at the top. A filter icon. It works. It is also identical to every other closet screen on every other app that has ever existed.

María stares at it. She knows something is wrong but cannot name it. The screen does what she asked. It does not do what her users need.

She is building a 5-star experience. Functional. Forgettable. Dead on arrival.

The conventional argument

The current AI-agent discourse goes like this: agents make you faster. You describe what you want. The agent builds it. Ship it. Move on. Repeat. The thesis is volume: more features, more screens, more iterations per hour than any solo founder could produce manually.

The problem is that speed toward mediocrity is still mediocrity. It just arrives sooner.

Every AI coding agent on the market (Cursor, Claude Code, Codex, Copilot) defaults to the same pattern: functional, generic, forgettable. You ask for a dashboard, you get a dashboard. You ask for an onboarding flow, you get an onboarding flow. The shapes are correct. The structure is sound. And no user will ever text a friend about it.

This is the 5-star trap. It looks like progress. It compiles. It deploys. And it compounds into a product that feels like every other product.

Where the framework comes from

In 2017, Brian Chesky sat down with Reid Hoffman for what would become one of the most cited episodes of the Masters of Scale podcast. Hoffman asked Chesky how Airbnb thinks about experience design. Chesky described an exercise he runs with his team.

Start at one star. The experience is broken: the host does not show up, the door is locked, nobody answers the phone. Move to three stars. You get in. The place is fine. Nothing memorable. Five stars. The place is clean, the bed is comfortable, there is a welcome note. This is where most companies stop.

Then Chesky pushed further. What is a six-star experience? The host leaves a handwritten note with restaurant recommendations tailored to your taste. Seven stars? A welcome basket with your favorite snacks. How did they know? Eight stars? An elephant leads a parade in your honor. Ten? You are welcomed the way the Beatles were in 1964, with thousands of fans cheering your name. Eleven? Elon Musk meets you at the airport and tells you that you are going to space.

Eleven stars is absurd. It is deliberately impossible. And that is the point.

The exercise is not about building the 11-star version. It is about thinking from the 11 and working backwards to what is actually shippable. Because when you start at 5 and try to push to 6, you add features. When you start at 11 and pull back to 8, you rethink the entire experience.

The gap between those two approaches is the gap between a product people use and a product people remember.

The emotional job

Every interface has two jobs. The functional job is obvious: browse clothes, schedule a meeting, track expenses. The emotional job is invisible and more important.

The functional job of a wardrobe app is: show me my clothes. The emotional job is: make me feel like I know what I am doing when I get dressed.

The functional job of a dashboard is: display metrics. The emotional job is: make me feel like I understand my business right now, in this moment, without digging.

AI agents only see the functional job. They cannot see the emotional one. They do not know that the user arriving at your closet screen feels overwhelmed, not curious. They do not know that the person opening your dashboard at 7 AM is anxious, not analytical.

This is why default AI output lands at 5 stars. The machine solves the functional job perfectly and ignores the emotional job entirely.

What 11-star thinking does to your agent workflow

Here is what changes when you make this framework the foundation of every interaction with your AI agents.

You stop prompting for features. You start prompting for feelings.

Instead of: “Build a screen where users browse their closet.”

You write: “This interface transforms ‘I have no idea what to wear’ into ‘I look incredible and I barely tried’ by making outfit selection feel like a stylist handed you three perfect options. The user should feel relief within 3 seconds of landing on this screen.”

The output from that prompt is categorically different. Not because the agent suddenly has taste; it does not. But because you gave it the emotional specification that determines every layout decision, every copy choice, every animation.
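One way to keep that emotional specification consistent from screen to screen is to treat it as structured data and render the prompt from it. What follows is a minimal Python sketch, not a feature of any particular agent: `EmotionalSpec`, its field names, and `render_prompt` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionalSpec:
    """Hypothetical container for the emotional job of one screen."""
    feeling_before: str   # what the user feels on arrival
    feeling_after: str    # what they should feel afterwards
    mechanism: str        # how the interface bridges the two
    target_feeling: str   # one-word name for the result, e.g. "relief"
    seconds_to_feel: int  # how quickly the shift should land


def render_prompt(spec: EmotionalSpec) -> str:
    """Render a prompt that leads with feelings instead of features."""
    return (
        f"This interface transforms '{spec.feeling_before}' into "
        f"'{spec.feeling_after}' by {spec.mechanism}. "
        f"The user should feel {spec.target_feeling} within "
        f"{spec.seconds_to_feel} seconds of landing on this screen."
    )


closet = EmotionalSpec(
    feeling_before="I have no idea what to wear",
    feeling_after="I look incredible and I barely tried",
    mechanism=(
        "making outfit selection feel like a stylist handed you "
        "three perfect options"
    ),
    target_feeling="relief",
    seconds_to_feel=3,
)

print(render_prompt(closet))
```

Because every field is required, an unanswered question about the user's feelings shows up as a missing argument before the agent ever sees the prompt.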

You map the trajectory before you write a single prompt.

Before touching the agent, write out the star levels for the specific experience you are building:

1 star: The screen loads. The user sees a blank grid. No guidance. They close the app.
3 stars: Clothes appear in a grid. No organization. The user scrolls endlessly. They find something by accident.
5 stars: Clothes are categorized. There is a search bar. Filters work. The user finds what they want in 30 seconds. Forgettable.
7 stars: The app suggests three outfits based on the weather and the user’s calendar. The user smiles. Small surprise.
9 stars: The app knows the user has a job interview tomorrow. It surfaces the outfit that got compliments last time it was worn. It accounts for the weather, the dress code, and what is clean.
11 stars: The user opens the app and a stylist has already laid out tomorrow’s outfit on their bed. Not a screen. The physical clothes. On the actual bed.

Obviously you cannot ship 11. But you can ship 9, an experience that anticipates rather than reacts. And you would never have designed that 9-star version by starting at 5 and adding features.
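The star map above can also live next to the code as plain data. The following Python sketch is one possible shape for it, under two assumptions: the `shippable` flag is a judgment you make by hand, and the names `StarLevel`, `CLOSET_STAR_MAP`, and `ship_target` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StarLevel:
    stars: int
    description: str
    shippable: bool  # assumption: decided by you, not computed


CLOSET_STAR_MAP = [
    StarLevel(1, "Blank grid, no guidance; the user closes the app.", True),
    StarLevel(3, "Unorganized grid; things are found by accident.", True),
    StarLevel(5, "Categories, search, working filters. Forgettable.", True),
    StarLevel(7, "Three outfit suggestions from weather and calendar.", True),
    StarLevel(9, "Anticipates tomorrow's interview; knows what is clean.", True),
    StarLevel(11, "A stylist lays the physical clothes on the bed.", False),
]


def ship_target(star_map: list[StarLevel]) -> StarLevel:
    """Work backwards from the top and return the highest shippable level."""
    for level in sorted(star_map, key=lambda lv: lv.stars, reverse=True):
        if level.shippable:
            return level
    raise ValueError("no shippable level in the star map")


target = ship_target(CLOSET_STAR_MAP)
print(f"Build toward {target.stars} stars: {target.description}")
```

The search runs from 11 downwards on purpose: it mirrors the exercise of starting at the impossible version and pulling back, instead of starting at 5 and adding features.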

You give your agent a design conviction, not a style preference.

Most people prompt their agents with aesthetic requests: “Make it clean and modern.” This produces the visual equivalent of elevator music. Pleasant. Unmemorable. Interchangeable.

A design conviction is a single sentence that forces every decision:

“Dense information, zero noise — Bloomberg terminal meets Notion.”

“A hand-written letter from your smartest friend.”

“The UI equivalent of a perfectly tailored black suit.”

When you give an agent a conviction instead of a style, the output has a point of view. The typography choice follows from the conviction. The spacing follows. The color palette follows. The micro-copy follows. Everything coheres because everything serves the same sentence.

The 5-star tells

Here is how you know your AI agent is building at 5 stars. Every one of these is a default pattern that agents reach for when you do not intervene:

Generic hero sections with “Welcome to [Product]” and a gradient background.
Centered spinners with no context. The user stares at a circle and wonders if the app is broken.
“Something went wrong” as an error message. No specificity. No fix. No warmth.
Gray placeholder rectangles as empty states. The user sees nothing and learns nothing.
Purple-to-blue gradients. The unofficial color scheme of AI-generated interfaces. You have seen it a thousand times. So has everyone else.
Uniform card grids with no visual hierarchy. Everything is the same size, the same weight, the same importance. Nothing leads. Nothing recedes.

These are not bugs. They are the natural output of an agent that was asked for a functional solution and delivered one. The problem is not the agent. The problem is the spec.
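Some of these tells are mechanical enough to catch before review. The Python sketch below scans generated markup for three of them with regular expressions. The patterns are rough, hypothetical heuristics that cover only the textual tells, and `ClosetApp` is a made-up product name used for the sample input.

```python
import re

# Hypothetical heuristics: each pattern is a rough proxy for one tell
# from the list above, not an exhaustive detector.
FIVE_STAR_TELLS = {
    "generic hero copy": re.compile(r"\bwelcome to\b", re.IGNORECASE),
    "vague error message": re.compile(r"\bsomething went wrong\b", re.IGNORECASE),
    "purple-to-blue gradient": re.compile(
        r"linear-gradient\([^)]*purple[^)]*blue", re.IGNORECASE
    ),
}


def find_tells(source: str) -> list[str]:
    """Return the names of the tells whose pattern appears in the source."""
    return [
        name for name, pattern in FIVE_STAR_TELLS.items()
        if pattern.search(source)
    ]


generated = """
<h1>Welcome to ClosetApp</h1>
<div style="background: linear-gradient(90deg, purple, blue)"></div>
<p class="error">Something went wrong</p>
"""

print(find_tells(generated))
```

A check like this only flags candidates. Visual tells such as uniform card grids or context-free spinners still need a human eye.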

Making it operational

This is not a philosophy you apply once. It is a filter you run on everything.

Every time you open a chat with your coding agent, ask three questions before you type:

What does the user feel right now, before they touch this?
What should they feel after?
What star level am I about to accept?

If you cannot answer those, you are not ready to prompt. You will accept whatever the agent gives you, and it will give you 5 stars.

Here is the operating rule I use: never ship the first output. The first output is always the 5-star version. It is the default. Treat it as a sketch, not a deliverable. Push the agent to 7 by naming the emotional gap. Push to 8 by adding the design conviction. Get close to 9 by specifying the sad paths: what happens when the screen is empty, when the data fails to load, when the user has 1,000 items instead of 10.
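The rounds described above can be written down as a small helper that produces the follow-up prompts in order. This is a sketch under stated assumptions: `refinement_rounds` and its parameters are hypothetical, the wording of each follow-up is only one way to phrase it, and the helper refuses to run until the gap, the conviction, and the sad paths have all been named.

```python
def refinement_rounds(
    emotional_gap: str, conviction: str, sad_paths: list[str]
) -> list[str]:
    """Build the follow-up prompts sent after the agent's first output.

    The first output is treated as a sketch, so it gets no prompt here.
    """
    if not emotional_gap or not conviction or not sad_paths:
        raise ValueError("name the gap, the conviction, and the sad paths first")
    sad_path_lines = "\n".join(f"- {path}" for path in sad_paths)
    return [
        f"Toward 7 stars. The screen misses the emotional job: {emotional_gap}",
        f"Toward 8 stars. Every decision must serve this conviction: {conviction}",
        f"Toward 9 stars. Design these sad paths explicitly:\n{sad_path_lines}",
    ]


rounds = refinement_rounds(
    emotional_gap="the user arrives overwhelmed and should leave relieved",
    conviction="The UI equivalent of a perfectly tailored black suit.",
    sad_paths=[
        "the closet is empty",
        "the data fails to load",
        "the user has 1,000 items instead of 10",
    ],
)

for follow_up in rounds:
    print(follow_up, end="\n\n")
```

Raising an error on an empty input is the code version of the three questions: if you cannot fill in the arguments, you are not ready to prompt.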

The agents are fast enough that three rounds of refinement still take less time than one round of manual coding. Use that speed to raise the floor, not to ship the default.

Why this compounds

A 5-star experience retains users at the baseline rate. They use it when they need it. They forget about it when they do not.

An 8-star experience creates a moment the user did not expect. They text someone. They leave a review that says something specific instead of “works great.” They open the app when they do not strictly need to, because it made them feel something.

That difference, between functional and felt, is the difference between a product that grows linearly through acquisition spend and a product that compounds through word of mouth. The 11-star framework is not about perfection. It is about building the habit of imagining the impossible version first, then working backwards to the version that is shippable and still makes someone pause.

Every week you ship 5-star output, you are training yourself, and your agents, to accept the default. Every week you push to 8, you are compounding craft. And craft, unlike features, does not depreciate.

Brian Chesky did not invent a design methodology. He invented a question: What would the impossible version look like? The answer to that question, trimmed back to reality, is always better than the answer to “What is the functional version?”

María in Medellín is still staring at her closet screen. The grid loads. The thumbnails are fine. She deletes the prompt and types a new one. This time she starts with how the user should feel.

The grid looks different now.

The principle is free. The skill file that makes your AI agent build at 8 stars by default (the full prompt engineering framework, the star-mapping template, and the design system it enforces) is below.

For paid subscribers: the full 11-Star UX/UI Experience Builder skill, the implementation checklist, and the design system architecture below.


The Compounding Founder