A team I keep dreaming about
A high-performing, AI-native delivery team is buildable today. The shift that matters is AI in the operating model and a culture honest enough to use it.
TL;DR. The team I keep dreaming about runs on two things. AI in the operating model, doing the legwork that delivery leads, scrum masters, agile coaches, and other leaders currently spend hours on every day. And a culture honest enough to use what the AI surfaces. Both are buildable today, at squad or portfolio scale. The agents are the easy half. The culture is the half that needs leaders to admit there is a problem.
There is a kind of team I keep dreaming about. One with a culture of fun and real psychological safety. One that wins or learns and treats both the same way. One where everyone knows what their role actually does and what it does not. One that measures the right thing, the right way. One that is AI-native and AI-ready.
One that does not polish PowerPoint decks to look good in a steering committee, does not sit through 30-minute standups, does not run forced backlog refinements, and does not do sprint retros because the calendar, scrum master, delivery lead, or agile coach says so.
Most delivery teams, programs, and portfolios are not that team. They use Claude and ChatGPT to refine retro templates, research new frameworks, and rewrite the quarterly decks. But the foundation is missing. The trust is missing. The willingness to sit in real conflict is missing. It is lipstick on a pig. The standup is still status. The retro is still theatre. The quarterly planning is still what the big boss said. The reports still get polished before they hit a leadership inbox. AI shows up in the typing, not the thinking.
That team I am dreaming about is buildable. Right now.
The two pillars of an AI-native delivery team. AI in the operating model on the left, the human culture work on the right, both resting on a shared foundation of clarity, honesty, and leaders willing to admit there is a problem. Generated by Author using Claude.
Why most teams are not there yet
When I say "team" in this piece, I mean any delivery unit. A squad, a team of teams, a program, a portfolio, a department. The pattern is the same. The scale is different.
Most teams are not there yet because they plugged AI into tools, not into the operating model. There is a Cursor licence. There is a ChatGPT tab open in every browser. Developers are running Claude Code. Autocomplete is faster. Boilerplate is faster. Unit tests are faster.
But the operating model did not move. The retro still surfaces the same three issues every fortnight. No leadership action follows. The team is doing the same job at the same speed, just with fancier autocomplete.
The shift that actually changes a team is not AI in the IDE. It is AI in the operating model and it shows up in two places: how the team runs and how the team behaves.
The AI-native delivery operating model. Four agents handle the legwork overnight, the team walks in already informed, and the humans get to do the real conversation, daily decisions, and forward planning. Generated by Author using Claude.
The standup stops being a status meeting
Most standups are status reports performed for a manager, dressed up as collaboration. A standup-update-agent kills that ritual. Underneath, it is an LLM with read access to your repo, ticket system, and chat. It does what a diligent scrum master would do at 6am, except it does not get tired and it does not get political. It reads commits, PR activity, ticket transitions, and Slack or Teams threads. By 9am, it posts a one-paragraph update for each team member. Everyone walks into standup already knowing who did what and where each ticket sits.
So what is the standup for? The actual conversation: what is blocked, what is unclear, what we should change today. Fifteen minutes that used to be superficial become fifteen minutes of real work.
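To make the shape concrete, here is a minimal sketch of the summarising core of a standup-update-agent. Everything in it is illustrative: in a real agent the events would come from the Git host, ticket system, and chat APIs, and an LLM would draft the paragraph. Here the drafting is a plain join so the input and output shapes are visible.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Event:
    author: str   # team member the activity belongs to
    kind: str     # e.g. "commit", "pr", "ticket"
    summary: str  # short human-readable description

def standup_updates(events: list[Event]) -> dict[str, str]:
    """Group yesterday's activity by person and emit one paragraph each.

    A real agent would pull these events from the repo, ticket system,
    and chat, and hand them to an LLM for drafting; this sketch only
    shows the aggregation step."""
    by_author: dict[str, list[Event]] = defaultdict(list)
    for event in events:
        by_author[event.author].append(event)

    updates = {}
    for author, items in by_author.items():
        lines = [f"{e.kind}: {e.summary}" for e in items]
        updates[author] = f"{author} — " + "; ".join(lines) + "."
    return updates

if __name__ == "__main__":
    events = [
        Event("Aisha", "commit", "refactored the billing retry logic"),
        Event("Aisha", "ticket", "moved PAY-142 to In Review"),
        Event("Tom", "pr", "opened #88 adding the export endpoint"),
    ]
    for update in standup_updates(events).values():
        print(update)
```

The point of the sketch is that the posting layer is trivial; the value is in wiring the read access and letting the summary land in the channel before the meeting.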
The reports stop being theatre
Every leadership team meeting needs an update. Every steering committee needs a deck. Every monthly review needs a status pack. So delivery managers, product managers, and GMs lose days polishing slides that most people skim.
A project-or-product-updates agent ends that polishing economy. Same shape as the standup agent, scaled up. It pulls live state from Jira, GitHub, product analytics, customer escalations, and the rest of the stack. It writes plain-English updates daily and on demand. The deck nobody read becomes the email everyone reads.
Leadership rituals change too. Instead of "give me an update", the question becomes "I read the update, here is my question". Conversations start at "why is this metric off", not "walk me through the deck". Nobody misses the deck-making industry.
Hard truths surface every morning, not every quarter
Most teams discover what is broken at the retro, weeks after it started breaking. By then the damage is baked in.
A pragmatic-brutal-facts agent does not wait. It watches the metric stream against the team's own baseline and surfaces what is drifting. Velocity drift, bug reopen rates, PR cycle time, sprint scope churn, customer escalations, promise-vs-shipped gaps, scope creep. It writes a daily one-pager in plain language: here is what is not working and here is when it started.
The team cannot avoid the conversation. That is uncomfortable at first. That is also the point. You get small, daily, honest conversations before the big quarterly explosion.
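The drift-watching core of that agent can be sketched in a few lines. The assumptions are loud ones: metrics arrive as plain time series, and the window and threshold values are illustrative defaults a real agent would tune per metric.

```python
from statistics import mean, stdev

def drifting_metrics(history: dict[str, list[float]],
                     window: int = 5,
                     threshold: float = 2.0) -> dict[str, str]:
    """Flag metrics whose recent window has drifted from the team's own baseline.

    `history` maps a metric name (e.g. "pr_cycle_time_hours") to a time
    series, oldest first. A metric is flagged when the mean of the last
    `window` points sits more than `threshold` standard deviations from
    the baseline formed by everything before the window."""
    flagged = {}
    for name, series in history.items():
        baseline, recent = series[:-window], series[-window:]
        if len(baseline) < 2:
            continue  # not enough history to form a baseline
        base_mean, base_sd = mean(baseline), stdev(baseline)
        if base_sd == 0:
            base_sd = 1e-9  # flat baseline: any movement counts as drift
        z = (mean(recent) - base_mean) / base_sd
        if abs(z) > threshold:
            direction = "up" if z > 0 else "down"
            flagged[name] = (f"drifting {direction}: recent mean "
                             f"{mean(recent):.1f} vs baseline {base_mean:.1f}")
    return flagged
```

Wrap the flagged output in a plain-language one-pager and you have the daily read: what is off, in which direction, against the team's own normal rather than an industry benchmark.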
The numbers stop being faked
Most teams polishing metrics are not evil. They are rational. Jobs, bonuses, reputation, and funding are tied to the numbers. So the metric gets a haircut before anyone outside the team sees it.
A transparency-scoreboard agent removes the polishing layer. It is not even an LLM. It is a dashboard backed by a single source of truth, with no admin override and no path to manually adjust the numbers. Engineering, product, leadership, and the newest graduate see the same view at the same time.
But there is a deeper question the agent forces on the team. What are we actually measuring? Not velocity points. Not story-point burn. Not how many tickets moved to Done. Those are activity metrics and activity is not the same as outcome. The numbers that matter are the ones the customer feels. Cycle time from idea to in-their-hands. The gap between what we promised and what we shipped. How often we changed direction because we learned something. The wrong metric, surfaced honestly, is still the wrong metric. Pick the right ones.
Then the conversation flips from "how do we look this quarter" to "what do we do about this". Energy shifts from reporting choreography to actual improvement.
These four are not the finished list
These four are examples. The same pattern works for plenty of other team-level questions and you should pick the agents that match what your team actually needs answered. A few more that make sense:
- A health check agent watching retro themes, chat tone, and 1-on-1 patterns. Produces a weekly read on whether the team is energised or burning out.
- A DORA metrics agent pulling deployment frequency, lead time for changes, change failure rate, and MTTR from your CI/CD and incident stack. Tracks the four numbers Google's research says actually predict performance.
- A team performance agent synthesising throughput, predictability, and commitment-vs-delivery trends. The team sees its own pattern without anyone having to build a slide.
- A retrospective synthesis agent reading every retro across a quarter, surfacing what keeps coming back versus what actually got fixed.
- A risk radar agent watching deadlines, dependencies, and customer signals. Flags risks before they turn into incidents.
The list is open. The principle is the same: use AI to see the team honestly, then use AI to help the team improve.
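Of the agents listed, the DORA metrics one is the most mechanical to sketch. Assuming deployment and incident records as plain dictionaries (the field names here are illustrative; a real agent would map them from the CI/CD and incident tooling), the four numbers reduce to a few lines each:

```python
from datetime import datetime, timedelta

def dora_metrics(deploys: list[dict], incidents: list[dict], days: int = 30) -> dict:
    """Compute the four DORA metrics from deployment and incident records.

    Each deploy:   {"at": datetime, "commit_at": datetime, "failed": bool}
    Each incident: {"opened": datetime, "resolved": datetime}
    Field names are illustrative, not a real tool's schema."""
    n = len(deploys)
    lead_times = [(d["at"] - d["commit_at"]).total_seconds() / 3600 for d in deploys]
    restore_times = [(i["resolved"] - i["opened"]).total_seconds() / 3600 for i in incidents]
    return {
        "deploy_frequency_per_day": n / days,
        "lead_time_hours": sum(lead_times) / n if n else None,
        "change_failure_rate": sum(d["failed"] for d in deploys) / n if n else None,
        "mttr_hours": sum(restore_times) / len(restore_times) if restore_times else None,
    }
```

The hard part is never the arithmetic. It is agreeing on what counts as a deployment, a failure, and a resolution, and then refusing to let anyone massage those definitions quarter by quarter.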
The agents do not save you from a broken foundation
This is the part nobody likes saying out loud. The agents amplify what is already there. They do not create what is not. If the team is not aligned on what it is building, the AI cannot align them. If leadership is not committed to honesty, the transparency-scoreboard agent gets ignored. If the data going into the agents is messy, the output is messier. AI is a force multiplier. Force multipliers multiply whatever is in front of them, including the bad stuff.
And the higher up you go, the more entrenched it gets. At squad level, dysfunction is annoying. At program level, it costs months. At portfolio level, it costs years and tens of millions, because the polishing layer between the work and the executives is thicker, the feedback loops are slower, and nobody is incentivised to break it.
I have sat through a Five Dysfunctions of a Team assessment, run with squad members from across functions plus a few senior leaders in the room, that scored the team 2.5 out of 5. Honest. Painful. Useful. The same leaders, the next week, softened the tone in front of the head of department until the conversation was shaped, balanced, and useless. The pulse survey said the same thing, louder. Nothing changed. Meanwhile the executive layer above would change direction mid-quarter, for the same quarter, while the teams below were still not sure what they were working on next.
If you are a portfolio leader reading this thinking "this is squad-level stuff", it is. It is the squad-level dysfunction that became your portfolio-level reality. Same pattern, several promotion cycles later.
But it is also bigger than that. A squad can do all the work in this article and lift its high-performing score from a 7 to a 9 out of 10 and still be sitting inside a portfolio that is fundamentally broken. Is the culture above the squad actually open and honest? Does anyone know what the program lead does, what the portfolio lead does, what the head of department owns versus what they delegate? Is the prioritisation real or is it three competing roadmaps in three different decks? Are business stakeholders genuinely engaged at the portfolio level or are they just signing off on outcomes they did not shape? If the answer to any of those is no, it is not a squad problem. It is a portfolio problem dressed up as a squad problem and no amount of squad-level uplift will fix it.
The harder shift is breaking the culture and no AI, no coaching engagement, no process invention does that work until the leaders themselves accept there is a problem to break. Most do not, because the books are still green.
If the department hit its financial number, the culture bit gets swept under the rug. The assessment scores, the pulse survey, the burnout, the quiet quits, every signal that something is wrong gets reframed as "we are doing fine, look at the numbers". And it works, in the short run. The numbers are visible. The culture is not. But it is exhausting, it is unhealthy, and it does not last. The people who survive are the ones willing to join a tribe, perform loyalty, and bootlick the leaders who hold the budget. Everyone else burns out. The best ones leave first.
And there is a layer below the head of department that makes all of this harder to surface. The middle managers. The careful ones. The ones whose career safety depends on staying in the leader's good books, not on telling the truth. Their teams come to them every day with the real picture: this is broken, that is broken, this person is burning out, that initiative is dead but everyone is pretending it is not. The middle managers nod. They agree in the one-on-one. They sympathise in private. Then they walk into the leadership room and tell a different story. The polished one. The one that does not put their next promotion at risk. The teams see this happen week after week and stop telling them the truth. The leadership above never hears it. The dysfunction calcifies.
Get the foundation right first. Clarity on the work. Honesty in the leadership. Clean enough data to feed the agents something real. Then add the AI. Not the other way around.
. . .
Then the humans actually have the conversation
Here's the bit the AI vendor pitch skips. The agents aren't the point. The agents are what frees up the humans. With the legwork done, the humans get to do the things humans are actually good at.
- Open to being challenged. Real continuous improvement, not the kind where you nod in the retro and change nothing. (If your team's "continuous improvement" lives in a Confluence page that hasn't been edited since Q1, you're playing the game, not doing the work.) People speak up. People disagree. People change their minds when the evidence is on the table.
- Bias for action. Decisions get made in the standup, not after a four-week "alignment" cycle. The agents have already given everyone the same picture. There's no waiting for "more data". The data is right there, in the team channel or your inbox, every morning.
- Clarity of roles. When the agents handle the routine legwork, each role's actual value-add becomes visible. The product manager connects customer signal to roadmap. The engineering manager unblocks engineers and grows people. The tech lead shapes outcomes, not Jira tickets. The scrum master coaches the team's behaviour, not its ceremony attendance. Nobody hides behind status updates because there are no more status updates to hide behind. Roles get sharper, not fuzzier.
- No us vs them. Engineering vs product. Frontline vs leadership. Delivery vs business. These factions exist because each side has different information. One side thinks of the other side as a black box. When the agents serve the same truth to everyone, the factions lose their oxygen. People stop fighting the team next door and start fighting the actual problem.
- Forward-thinking cadence. Thinking about Q1 doesn't happen in Q1. The team is already two beats ahead, because the daily admin work that used to swallow planning time is gone. You're shaping the next thing while you're shipping this one.
- Not tribal or territorial. Good people don't need to join a tribe or a cult within the organisation to survive or to keep their jobs. Because there's real transparency, there's accountability. Everyone respects everyone, but they know everyone has to perform to win. If someone is a blocker on the real win, we help them improve. If they're the "person in power", we talk to them. If they don't agree, we talk to their manager. And we still don't get fired, because we did it for the right reasons.
This is the team that wins. Wins or learns. Then we celebrate.
A week of wins or learns. Every day is one or the other, both count as progress, and the team marks the win when it lands. Generated by Author using Claude.
Two ideas fuse together. First, psychological safety stops being a workshop poster and becomes structural. When brutal facts are visible by 9am, nobody needs to be brave just to raise them. When the scoreboard is honest, honesty is not punished.
Second, the question changes. Not "did we hit the sprint goal" or "did we hit the quarter goal". It becomes "did we win or did we learn?" Both count. Both are progress.
And when we win, we celebrate. Most teams ship and move on. High-performing teams mark the moment. They ring the bell, send the message, showcase the work, tell the story, and then get back to work. They also let the team have a laugh, run a side project that has nothing to do with the roadmap, and not be serious for an hour. Fun is not a perk. It is the signal that the team has the room to do its best work.
That team. Right now.
This is not fantasy. This is not a five-years-from-now transformation program. The agents in this article are not fictional. Each is a few prompts plus a connected stack away. A standup-update-agent can be a Slack or email bot calling a few APIs. A transparency-scoreboard agent can be a dashboard that does not allow metric gaming.
Worth naming what most transformation programs avoid. The people most threatened by AI-native delivery are not the engineers. It is the layer of middle management whose job is largely performance theatre, polished updates, and managing perception upward. Much of that work disappears first. That is not a bug. That is the point. The good ones will rebuild themselves around the parts that matter, which is coaching, unblocking, and shaping outcomes. The rest will resist. Expect that, plan for it, and do not let the resistance set the pace.
If you want to start, start with the standup-update-agent. It is the smallest one. It takes a day or two to build. It changes the daily rhythm immediately. Once the team trusts that one, build the next. The whole stack should not take more than a quarter.
The harder shift is the cultural one. But here's the bit that's rarely said: the agents pull most of the cultural blockers (info asymmetry, performance theatre, fake metrics, retro archaeology) out of the way before the humans even start. They don't replace the cultural work. They make it possible. And then nobody gets fired for being brutally honest and talking about the real stuff hidden under the layers of make-up.
That team is buildable. Right now. Not in 18 months. Not after another transformation roadmap. Reckon it is time we built it.
Salam.
P.S. If your team is not there yet but wants to be, I am a message away.
P.P.S. With chai and love.
Tags: AI, Agile, Leadership, Delivery, Future of Work.