When the attacker is no longer human · Nexurion Field Notes Vol. I

Seven additions to your risk register before the next audit.

Each row maps a specific 2025–2026 capability to a control family that is now stale, the GRC framework where the staleness will be cited first, and the artifact a thoughtful auditor will start asking for. None of the artifacts are exotic. All seven are deliverable in a quarter if the trigger is named today.

§ 07 · Retractions

Four positions we are willing to retract.

The systems and benchmarks in §02 and §03 are public and citable. The thesis, that agentic pentesting is a 2026 audit-finding lever, is ours. If the next twelve months show otherwise, we will say so in print, in the next volume's masthead.

If XBOW's 28-min-vs-40-hr result fails to replicate on a second independent benchmark this year, the §04 row #01 ("annual is stale") softens to "annual is not yet stale."
If Anthropic releases Mythos broadly without restriction, the §05 Q-03 dual-use posture argument needs to be re-grounded against a different precedent, likely OpenAI's o-series cyber-offense evals or a future EU AI Office determination.
If the AICPA explicitly clarifies that "annual pentest" remains the SOC 2 CC7.1 expectation regardless of agent capability, §04 row #01 inverts to a nice-to-have rather than a finding.
If a published court decision in the First Circuit rejects a Kovel-style structure for AI-agent pentest engagements, §05 Q-04's privilege guidance is stale on issuance and needs to be rewritten before anyone relies on it.

The 2025 data (XBOW at #1, ARTEMIS at $18/hr, Mythos restricted) is unambiguous on capability. The GRC framing is the part that depends on how auditors and regulators move next. We will revise here if 2026 inverts it.

By Jack Giordano, founder: Marine · Firefighter · Security & Compliance.

Vol. I · Published Apr 2026 · Rebuilt May 2026 · No marketing review · Named author

When the attacker is no longer human.

A pentest used to be a person.
Now it is a swarm.

Three years from "clunky proof of concept" to "too capable to release."

PentestGPT: the proof of concept.

Big Sleep finds a real zero-day in SQLite.

XBOW takes #1 on HackerOne.

DARPA AIxCC finals: Team Atlanta wins $4M.

The traditional pentest contract stops being economically rational.

ARTEMIS beats 9 of 10 human pentesters at $18/hr.

Mythos Preview: too capable to release.

$237M total raised · $1B+ valuation.

Nine systems your auditor will eventually name.

Seven additions to your risk register before the next audit.

Four questions in-house counsel should be asked this quarter.

What does "prior written authorization" mean when the pentester is autonomous?

Who carries the downtime when an agent's exploit is too successful?

Is the same model your red team uses also your AI helpdesk?

Are agent logs privileged work product, or routine ESI?

The question your auditor will start asking in 2026.

Four positions we are willing to retract.
