AI Red Teams: How Agentic AI Thinks Like an Attacker
This episode explores how agentic AI is changing cybersecurity by moving beyond simple scanning to objective-driven attack simulation, chaining weaknesses, probing APIs, and uncovering privilege escalation paths. It also examines why these systems are best used as force multipliers for human defenders, accelerating reconnaissance and risk discovery while leaving judgment, prioritization, and accountability to people.
Chapter 1
The AI red team that thinks like an attacker
Simon Carver
[warmly] Welcome to the show. The episode is called "The AI Red Team: How Agentic AI is Changing Cybersecurity Forever," and the reason that title matters is simple: we are not talking about a bot that checks a list and spits out a PDF. We're talking about systems that can read an environment, choose a path, test assumptions, and keep going -- almost like a junior attacker with a mission. If that sounds exciting and slightly unsettling... good. It should. And if you enjoy these quick takes, like, share, and subscribe. Lachlan Reed is here, and so is Jack Burns. Jack, I wanna start with the tension right at the center of this: when AI stops scanning and starts REASONING, what exactly changes?
Jack Burns
[calm] The shift is from predefined inspection to objective-driven behavior. A traditional scanner asks, in effect, "Do I recognize this version number, this signature, this known CVE?" It is useful, but narrow. An agentic AI pentester is given a goal -- map the environment, identify privilege escalation paths, test application weaknesses, simulate attacker behavior -- and then it determines the sequence of actions required to pursue that goal. That means enumeration, adaptation, trial and error, and, importantly, chaining small weaknesses together into something consequential.
Lachlan Reed
[curious] That phrase -- "chaining small weaknesses together" -- that's the bit that gets me. Because a normal scanner might say, right, outdated package here, sloppy permission there, weird exposed API over yonder. But the agent goes, "Beauty, if I join those three together, now I've got a path." That's a different beast, isn't it?
Jack Burns
Exactly. Most serious compromises are not one dramatic flaw. They are sequences. An exposed service leads to information disclosure. That disclosure reveals identity relationships. Those relationships expose excessive permissions. Excessive permissions enable lateral movement or privilege escalation. A goal-driven system can keep testing each step based on the response it gets. That is much closer to how a human operator thinks than how a static scanner behaves.
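To make the chaining idea concrete, here is a minimal sketch that treats each small weakness as an edge in a graph and searches for a route from an outside foothold to a high-value position. The findings, the foothold names, and the `find_chain` helper are all hypothetical and invented for illustration. They are not output from any real tool, and a real agent would build this graph incrementally from live responses.

```python
from collections import deque

# Hypothetical findings: each edge is one small weakness that moves an
# attacker from one foothold to the next. All names are illustrative.
FINDINGS = {
    "internet": [("exposed-service", "exposed debug endpoint")],
    "exposed-service": [("identity-map", "information disclosure")],
    "identity-map": [("over-privileged-role", "excessive permissions")],
    "over-privileged-role": [("admin", "privilege escalation")],
    "intranet-wiki": [("identity-map", "stale access list")],
}

def find_chain(findings, start, goal):
    """Breadth-first search over footholds.

    Returns the shortest list of (weakness, next_foothold) steps from
    start to goal, or None if no chain exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for next_node, weakness in findings.get(node, []):
            if next_node not in seen:
                seen.add(next_node)
                queue.append((next_node, path + [(weakness, next_node)]))
    return None

chain = find_chain(FINDINGS, "internet", "admin")
for weakness, foothold in chain or []:
    print(f"{weakness} -> {foothold}")
```

No single entry in `FINDINGS` reaches `admin` on its own. Only the sequence does, which is the difference between a list of findings and an attack path.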
Simon Carver
[questioning tone] So let me try to explain it back, and you tell me where I mangle it. A legacy scanner is more like a supermarket barcode reader: it knows what it's seen before. An agentic pentester is more like a junior investigator walking through a building, opening doors, checking badges, reading the signs on the wall, and changing course when one corridor is locked. Is that close?
Jack Burns
[slight chuckle] Close enough to be useful. The important addition is persistence toward an objective. It does not merely observe. It pursues. If one avenue fails, it can try another. If documentation is available, it can read it. If there is an API surface, it can inspect it. If code repositories are exposed or accessible, it can analyze them for weakness patterns. That adaptive loop is the real departure.
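The adaptive loop Jack describes can be sketched as a small work queue: try an action, read the response, and let that response add new leads to pursue. Everything here is an assumption made for illustration, including the action names, the simulated responses, and the `pursue` function. A real agent would replace the lookup table with actual probing inside an authorized scope.

```python
from collections import deque

# Simulated environment (hypothetical): each action returns
# (reached_objective, new_leads_revealed_by_the_response).
SIMULATED_RESPONSES = {
    "try-default-login": (False, []),            # dead end
    "read-docs": (False, ["inspect-api"]),       # docs mention an API
    "inspect-api": (False, ["analyze-repo"]),    # API points to a repo
    "analyze-repo": (True, []),                  # weakness pattern found
}

def pursue(initial_actions, respond, max_steps=10):
    """Keep trying actions until the objective is reached or the budget runs out.

    Returns (reached, tried), where tried is the ordered list of actions taken.
    """
    pending = deque(initial_actions)
    tried = []
    while pending and len(tried) < max_steps:
        action = pending.popleft()
        if action in tried:
            continue  # do not repeat an avenue that was already explored
        tried.append(action)
        reached, leads = respond(action)
        if reached:
            return True, tried
        pending.extend(leads)  # adapt: the response shapes what comes next
    return False, tried

def respond(action):
    # Unknown actions are treated as dead ends in this simulation.
    return SIMULATED_RESPONSES.get(action, (False, []))

reached, tried = pursue(["try-default-login", "read-docs"], respond)
```

The `max_steps` budget is a deliberate design choice in this sketch: persistence toward an objective still needs an upper bound.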
Lachlan Reed
[excited] And that's where the old "it's just ChatGPT doing a scan" idea falls over, right? Because this isn't asking for a cheeky summary. This is the AI saying, "I'm gonna enumerate the environment, poke the APIs, inspect the repo, test assumptions, maybe write proof-of-concept logic..." Mate, even a kangaroo could trip over that if they thought it was just a fancy checklist.
Simon Carver
[laughs] The "poke the APIs" part is sticking with me. Because APIs are where so much modern business actually lives now -- customer data, auth, internal services, all of it. Jack, when these systems inspect APIs and repositories and identity relationships together, is that what makes them feel less like a tool and more like a teammate?
Jack Burns
[reflective] A teammate, perhaps, but a very particular kind. It is not a strategist. It does not understand consequences in the human sense. What it does provide is relentless throughput. It can perform reconnaissance at machine speed, correlate patterns across infrastructure, and surface attack paths that a human team might need days or weeks to uncover manually. That is why I would call it augmentation rather than replacement. It extends reach. It does not own judgment.
Lachlan Reed
[skeptical] Alright, but here's the big question then. If AI can mimic an attacker -- enumerate, adapt, chain vulnerabilities, the whole box of bolts -- does that genuinely make defenders stronger? Or are we just getting a faster, more brutal mirror showing us how exposed we already were?
Jack Burns
Both. It makes defenders stronger by compressing discovery time. And it makes exposure harder to ignore because it reveals, quickly and often uncomfortably, how fragile some environments already are. That discomfort is useful. In security, clarity is preferable to false confidence.
Simon Carver
[softly] "Clarity is preferable to false confidence." That's the line I'm keeping. Because the surprise here isn't that AI makes cyber more dramatic. It's that it may force organizations to see what was already true -- they were defending sprawling systems with too little visibility and too few people.
Chapter 2
Why this changes work, risk, and accountability
Simon Carver
Jack, stay with that visibility point. Walk us through a real operating picture. What does agentic AI accelerate first: reconnaissance, privilege escalation discovery, exploit simulation... all three?
Jack Burns
[matter-of-fact] All three, and at scale. Imagine a large enterprise with cloud infrastructure, SaaS platforms, APIs, legacy systems, and millions of identity relationships -- in one example from the source material, 7 million entitlements and tens of millions of identity records. A human team can map that, but slowly. An agentic system can ingest infrastructure data, enumerate exposed services, analyze identity relationships, detect likely privilege escalation paths, simulate exploit chains, and produce candidate findings in hours rather than weeks.
Lachlan Reed
[questioning tone] Seven MILLION entitlements. That's not "messy cupboard" territory -- that's the whole shed collapsed on the lawn. So the AI can sift those 7 million relationships faster than any human crew, but the human still has to decide which ones are actually dangerous to the business, yeah?
Jack Burns
Precisely. The machine can say, "Here is a path." The human must ask, "What does that path mean in this organization, for these customers, under these obligations?" Risk is contextual. A route to a low-value internal system is not the same as a route to regulated financial data or critical infrastructure controls. Prioritization remains deeply human.
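One way to picture that division of labor: the machine supplies candidate paths, and people supply the business context used to rank them. The sketch below assumes a hypothetical `BUSINESS_IMPACT` table maintained by humans, with made-up targets and scores. Any path that reaches a target nobody has rated is set aside for review rather than silently ranked low.

```python
from dataclasses import dataclass

@dataclass
class CandidatePath:
    name: str    # label for the machine-generated finding
    target: str  # system the path ends at
    steps: int   # number of chained weaknesses needed

# Human-assigned impact scores (hypothetical values, higher is worse).
BUSINESS_IMPACT = {
    "regulated-financial-data": 10,
    "customer-portal": 6,
    "internal-wiki": 2,
}

def triage(paths, impact):
    """Rank rated paths by impact, then by chain length; separate unrated ones."""
    ranked = sorted(
        (p for p in paths if p.target in impact),
        key=lambda p: (-impact[p.target], p.steps),
    )
    needs_review = [p for p in paths if p.target not in impact]
    return ranked, needs_review

paths = [
    CandidatePath("wiki-takeover", "internal-wiki", 2),
    CandidatePath("ledger-read", "regulated-financial-data", 4),
    CandidatePath("unknown-host", "legacy-box-17", 1),
]
ranked, needs_review = triage(paths, BUSINESS_IMPACT)
```

Here the longer chain to regulated financial data outranks the shorter chain to the wiki, which mirrors the point that risk is contextual and not purely technical.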
Simon Carver
[warmly] And that lands right on the Human Workforce theme. The value isn't "remove the person." The value is "remove the friction." Let the AI do the repetitive enumeration, the pattern correlation, the first-pass exploit simulation. Let the human do judgment, ethics, prioritization, risk acceptance, and the awkward but necessary conversation that starts with, "Yes, this is technically fixable, but here's what it will cost the business."
Lachlan Reed
[responds quickly] Yeah -- that's the part people miss when they talk replacement. Cyber teams are already drowning in alert fatigue, patching backlogs, cloud sprawl, identity sprawl, exposed APIs, phishing... all of it. Giving them an accelerator isn't kicking them out of the ute. It's finally giving them a better engine.
Jack Burns
[calm] And there is a labor reality beneath that. There are not enough cybersecurity professionals globally to inspect everything manually. As systems become more complex, the premium on human expertise rises, not falls, because the human is now governing a much larger field of machine-generated insight. The role shifts upward toward validation, prioritization, and strategy.
Simon Carver
[curious] Upward is interesting. So the job doesn't vanish; it becomes more editorial, more ethical, maybe even more managerial. You're not just finding needles by hand anymore. You're deciding which pile of needles matters, and whether the magnet that found them can be trusted.
Jack Burns
That is well put. And trust is where the risk enters. These systems can hallucinate findings. They can make false exploit assumptions. If unconstrained, they can conduct unbounded testing, trigger outages, or touch sensitive systems in ways that become disruptive. An autonomous security tool can, ironically, create operational damage if it is not properly contained.
Lachlan Reed
[skeptical] That's the sting in the tail, isn't it? You build the world's smartest digital bloodhound and then forget to put a lead on it. Next thing it's sprinting through production, knocking over chairs, and everyone's asking why the "defensive" tool just caused an outage.
Jack Burns
[dryly] An inelegant image, but accurate. Which is why controls are not optional. You need sandboxing, logging, approval workflows, human validation, and explicit operational boundaries. If the system is allowed to act, you must know where it acted, why it acted, and what guardrails constrained it.
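A minimal sketch of what those controls might look like in code, assuming a hypothetical scope allowlist, a hypothetical set of action kinds that require human sign-off, and an in-memory audit log. The names `IN_SCOPE`, `NEEDS_APPROVAL`, and `guarded_action` are invented for this example, and a production system would write to tamper-resistant storage rather than a Python list.

```python
import json
from datetime import datetime, timezone

# Explicit operational boundary (hypothetical targets).
IN_SCOPE = {"staging-api", "staging-db"}
# Action kinds that must be approved by a named human first (assumption).
NEEDS_APPROVAL = {"exploit", "write"}

def guarded_action(target, kind, reason, approved_by=None, audit_log=None):
    """Decide whether an agent action may proceed, and record the decision.

    Returns the decision string. Every call is logged, including denials,
    so that where, why, and under which guardrail can be reconstructed later.
    """
    if target not in IN_SCOPE:
        decision = "denied: out of scope"
    elif kind in NEEDS_APPROVAL and approved_by is None:
        decision = "held: awaiting human approval"
    else:
        decision = "allowed"

    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "target": target,        # where it acted
        "kind": kind,
        "reason": reason,        # why it acted
        "approved_by": approved_by,
        "decision": decision,    # what guardrail applied
    }
    if audit_log is not None:
        audit_log.append(json.dumps(record))
    return decision

log = []
first = guarded_action("prod-db", "read", "enumerate tables", audit_log=log)
second = guarded_action("staging-api", "exploit", "confirm auth bypass", audit_log=log)
third = guarded_action("staging-api", "exploit", "confirm auth bypass",
                       approved_by="security-lead", audit_log=log)
```

In this sketch the production target is refused outright, the exploit attempt is held until a person approves it, and all three decisions land in the log.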
Simon Carver
[serious] Logging is the one I think listeners outside security may underestimate. Because if an AI agent tests ten paths, writes payload logic, and touches sensitive systems, you need a record precise enough to reconstruct intent and impact afterward. Otherwise accountability gets fuzzy very fast.
Jack Burns
Correct. And accountability must not become fuzzy. Cybersecurity is ultimately about protecting people -- customers, employees, citizens. You cannot outsource ethical accountability to a language model. The human organization remains responsible for what is authorized, what is tested, what is remediated, and what risk is accepted.
Lachlan Reed
[reflective] So maybe that's the real twist. The more autonomous the tool becomes, the MORE important the human becomes. Not because the human is doing every tiny task, but because someone's gotta set the boundaries, check the findings, and own the consequences. That's not less responsibility. That's more.
Simon Carver
[warmly] Yeah. AI expands visibility; humans provide wisdom. AI accelerates analysis; humans decide what should happen next. And if we get that balance right, this isn't a story about machines replacing defenders. It's a story about defenders finally getting help worthy of the threat. If you enjoyed this one, like, share, and subscribe -- and Jack, Lachlan, thanks as always.
Jack Burns
A pleasure.
Lachlan Reed
[lightly] Cheers, folks. Stay safe out there -- and maybe don't let the bloodhound into production without a fence.
