The Case for Automatic Verification

Mosegas 2 hours ago

5 minutes read

By Sila Ozeren Hacioglu, Security Research Engineer at Picus Security.

In April 2026, Anthropic released its new frontier model, codename Mythos, to twelve partners under a gated preview. This was not a general release; the company reportedly held the model back because it was deemed too dangerous to release.

In its first 14 days inside that sandbox, it wrote 181 working Firefox exploits. The previous frontier model managed two. Uh oh.

It surfaced thousands of zero-days across every major OS and browser, including a 27-year-old bug in OpenBSD, an operating system whose entire reputation is built on not having bugs like these.

More than 99% of Mythos’s discoveries are still unpatched in production today.

That is not a prediction. That happened.

Now pair that with what is already in the wild.

Let’s back up a little. In February, AWS Threat Intelligence published a post-mortem on a FortiGate campaign run by a single operator. One person, low skill, no hands on keyboard.

The AI did the work, and it hit 2,516 devices in 106 countries in parallel, taking a few minutes per target. Zero-days were not required. Known CVEs and bugs were sufficient; the AI simply worked faster than anyone could respond.

**Figure 1. AWS Threat Intelligence FortiGate campaign reaches 2,516 devices in 106 countries**

Two data points, one message: offense now runs at machine speed. And the question every defender should be asking is not “are we compliant?” or “are we patched?” It is more granular, and more pressing:

“What exactly is getting past my controls today, and how far?”

If the honest answer involves a quarterly pentest report and screenshots of a dashboard, consider the rest of this piece required reading.

How Soon Can Attackers Exploit a CVE Published in 2026?

Ten years ago, the average time from publication of a CVE to active exploitation in the wild was measured in months, long enough for an actual patch cycle. By 2024, that window had shrunk to approximately 56 days. By 2025, it was down to 23 days.

Recent CVE-to-exploit pairings from CISA KEV, VulnCheck KEV, and the database now show an average delta of approximately 10 hours.

**Figure 2. Average CVE-to-exploit window: 2.3 years (2018) vs. ~ 10 hours (2026).**
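For teams that want to track this window against their own feeds, the arithmetic is simple. The sketch below is a minimal illustration, not the method behind the figures above: the records, the CVE identifiers, and the function names are all hypothetical, and it assumes you already have a publication timestamp and a first-exploitation timestamp for each CVE.

```python
from datetime import datetime
from statistics import mean

# Hypothetical (CVE ID, published, first exploitation seen) records.
# These are made-up values for illustration, not real KEV data.
PAIRS = [
    ("CVE-EXAMPLE-0001", "2026-03-01T08:00:00", "2026-03-01T14:00:00"),
    ("CVE-EXAMPLE-0002", "2026-03-05T09:30:00", "2026-03-05T21:30:00"),
    ("CVE-EXAMPLE-0003", "2026-03-09T12:00:00", "2026-03-10T00:00:00"),
]


def exploit_delta_hours(published: str, exploited: str) -> float:
    """Hours between CVE publication and first observed exploitation."""
    delta = datetime.fromisoformat(exploited) - datetime.fromisoformat(published)
    return delta.total_seconds() / 3600


def mean_delta_hours(pairs) -> float:
    """Average CVE-to-exploit window, in hours, across all records."""
    return mean(exploit_delta_hours(pub, exp) for _, pub, exp in pairs)


if __name__ == "__main__":
    print(f"average CVE-to-exploit window: {mean_delta_hours(PAIRS):.1f} hours")
```

Comparing that number with the length of your own patch cycle shows how much of the window is already gone before a change request is even raised.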

Reverse-engineering published patches into working exploits is no longer a specialized art; it is now a prompt.

This means that the comfortable assumptions of risk management have all quietly broken: that CVSS scores set priorities, that “exploitability” is a useful filter, and that there is time between exposure and weaponization.

A safe working assumption now is: every vulnerability has been exploited, or will be, before you finish your next change management meeting.

Unfortunately, defensive autonomy has not yet been established.

And defensive AI without validation is just guessing at machine speed, and that is an expensive hunch to bring to production.

More than 99% of Mythos’s findings remain unpublished. Glasswing’s public report is due in July.

This guide from Picus Labs includes 12 operational recommendations that defense teams need to bridge the gap between AI-speed offense and human-speed defense, including five steps for the first week.


The Real Bottleneck Isn’t a Tool – It’s a Spaghetti Handoff

Let’s start with the attacker.

At second zero, the AI script starts. At second five, the CVE is exploited. MFA is bypassed at twenty. The web shell is dropped at thirty. The data is dumped at forty-five. At seventy-three seconds, the compromise is complete.

No human involved, no hesitation, no team meetings, no coffee breaks.

Now think about the defender.

The SIEM alert fires at about the one-minute mark, by which point the attacker is all but done. A Tier 1 analyst picks it up at about minute five. Someone triggers the SOAR playbook, by hand, at minute fifteen. A Jira ticket is filed at the one-hour mark. At four hours, it lands in the IT ops queue.

The fix lands the next day, twenty-four hours after a breach that took seventy-three seconds to complete.

**Figure 3. Speed gap: AI attack (73s) vs. remediation (24h), caused by friction on the defender’s side.**
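To make the two timelines concrete, the sketch below lays them side by side and computes how long each defender step waits on the one before it. The timestamps come from the narrative above; the step labels are paraphrases, and the function names are hypothetical.

```python
# Timestamps in seconds, taken from the two timelines in the text.
# Step labels are paraphrased for illustration.
ATTACKER = [
    ("AI script starts", 0),
    ("CVE exploited", 5),
    ("MFA step", 20),
    ("web shell step", 30),
    ("data step", 45),
    ("attack complete", 73),
]
DEFENDER = [
    ("SIEM alert fires", 60),
    ("Tier 1 analyst triage", 5 * 60),
    ("SOAR playbook triggered by hand", 15 * 60),
    ("Jira ticket filed", 60 * 60),
    ("lands with IT ops", 4 * 60 * 60),
    ("fix applied", 24 * 60 * 60),
]


def step_waits(timeline):
    """Return (step, seconds elapsed since the previous step) pairs."""
    return [
        (name, t - prev_t)
        for (name, t), (_, prev_t) in zip(timeline[1:], timeline)
    ]


def speed_ratio(attacker, defender):
    """How many times longer the defender's last step lands vs. the attacker's."""
    return defender[-1][1] / attacker[-1][1]


if __name__ == "__main__":
    for name, wait in step_waits(DEFENDER):
        print(f"{name:35s} waited {wait / 60:7.1f} min")
    print(f"defender is ~{speed_ratio(ATTACKER, DEFENDER):.0f}x slower end to end")
```

On these numbers the single longest wait is the twenty hours between the ticket reaching IT ops and the fix being applied, which is a queue, not a tool.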

Notice where the time goes. It is not inside any single tool. EDR is fast. SIEM is fast. The vulnerability scanner is fast. Time dies between the tools: the Slack message, the copy-pasted hash, the PDF report emailed for review, the ticket awaiting approval, the red team script manually rebuilt for the blue team.

This is the spaghetti handoff, and it’s as messy as it sounds.

You can buy a faster scanner, plug in a smarter EDR, even wire an LLM into your SIEM, and none of them will significantly speed up your response, because the gap is not inside any of your tools. It lives between teams and between systems. Speeding up one node in a graph does not speed up the graph.
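The point about nodes and graphs can be checked with a toy model. The sketch below treats the response process as the simplest possible graph, a linear chain of stages, and splits each stage into time the tool spends working and time spent waiting on a handoff. Every number and name here is hypothetical, chosen only to show the shape of the problem.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    tool_seconds: float     # time the tool itself spends working
    handoff_seconds: float  # time spent waiting on a person or another team


# Hypothetical numbers chosen for illustration only.
PIPELINE = [
    Stage("SIEM detection", 10, 240),
    Stage("analyst triage", 60, 600),
    Stage("SOAR playbook", 30, 2700),
    Stage("ticketing", 5, 10800),
    Stage("remediation", 120, 72000),
]


def end_to_end(pipeline, tool_speedup=1.0, handoff_speedup=1.0):
    """Total response time in seconds for a serial chain of stages."""
    return sum(
        s.tool_seconds / tool_speedup + s.handoff_seconds / handoff_speedup
        for s in pipeline
    )


if __name__ == "__main__":
    base = end_to_end(PIPELINE)
    faster_tools = end_to_end(PIPELINE, tool_speedup=10)
    faster_handoffs = end_to_end(PIPELINE, handoff_speedup=10)
    print(f"baseline:             {base / 3600:5.2f} h")
    print(f"10x faster tools:     {faster_tools / 3600:5.2f} h")
    print(f"10x faster handoffs:  {faster_handoffs / 3600:5.2f} h")
```

Under these assumptions, making every tool ten times faster saves well under one percent of the total, while making the handoffs ten times faster cuts the total by roughly ninety percent.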

This is a big part of why this discussion has moved beyond the CISO’s office.

Six months ago, AI-driven cybersecurity was a technical problem to delegate. Today, boards treat it as a given and oversee it directly. Budgets are opening up, but not for “more of the same.” They are funding credible, evidence-based programs.

What Are the Three Pillars of Cyber Resilience in the Era of AI-Powered Attacks?

The fundamentals that made organizations resilient before Mythos still apply. There are three:

Pillar 1: Identify. You can’t protect what you can’t see. Even with broad exposure visibility across network, storage, cloud, and identity, and with aggressive attack surface management, blind spots (orphaned remote access, missing classification, MFA gaps) are where machine-speed attackers live.

Pillar 2: Protect. Active network and endpoint controls, fine-tuned. Complementary detections focus on credential access, lateral movement, and privilege escalation rather than generic vendor rules.

Pillar 3: Validate. This is the most overlooked pillar, and it is the one that answers the question we started with. Validation has two parts, and yes, you need both.

Defensive validation – Breach and Attack Simulation (BAS). Are my prevention and detection controls keeping up with what is hitting me right now? Which assets are my controls failing to protect? What risk is left after my stack is applied?

Offensive validation – Autonomous Pentesting. Can an attacker actually break in? Which exposures chain together into a real path to our crown jewels? What is truly exploitable in our environment, not just risky in theory?

**Figure 4. BAS and Automated Penetration Testing Together**

Run only BAS, and you’ll know whether your controls work in isolation but not whether an attacker can get around them. Run only automated pentesting, and you’ll find attack paths but you won’t know which controls are failing silently on assets the pentest never touched. Run them as one continuous loop, where each informs the other, and you end up with an answer to “what is getting past, and how far” that is based on evidence rather than theory.
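One way to picture how the two result sets inform each other is a simple correlation step. The sketch below is an assumption-heavy illustration, not how Picus or any specific product works: the data shapes, asset names, technique labels, and the `correlate` function are all hypothetical, and real BAS and pentest tools expose their own formats.

```python
# Hypothetical result shapes, invented for illustration.
# BAS: for each asset, the simulated techniques its controls failed to stop.
BAS_GAPS = {
    "web-01": {"credential-dumping"},
    "db-01": {"lateral-movement"},
    "hr-laptop": {"privilege-escalation"},
}
# Pentest: attack paths as ordered (asset, technique) steps that actually worked.
PENTEST_PATHS = [
    [("web-01", "credential-dumping"), ("db-01", "lateral-movement")],
]


def correlate(bas_gaps, pentest_paths):
    """Split control gaps into those on a proven path and those no path reached."""
    touched = {asset for path in pentest_paths for asset, _ in path}

    # Control gaps that sit on a demonstrated attack path: fix these first.
    on_path = {
        (asset, tech)
        for path in pentest_paths
        for asset, tech in path
        if tech in bas_gaps.get(asset, set())
    }
    # Gaps on assets the pentest never reached: silent failures only BAS shows.
    silent = {
        (asset, tech)
        for asset, techs in bas_gaps.items()
        if asset not in touched
        for tech in techs
    }
    return on_path, silent


if __name__ == "__main__":
    on_path, silent = correlate(BAS_GAPS, PENTEST_PATHS)
    print("gaps on a proven attack path:", sorted(on_path))
    print("silent gaps off any tested path:", sorted(silent))
```

In this toy data, the pentest alone would never surface the gap on `hr-laptop`, and BAS alone would not show that the gaps on `web-01` and `db-01` chain into a working path.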

But evidence alone is not enough. When offense runs at machine speed, the loop itself must run at machine speed.

How Picus Approaches Auto-Verification in a Post-Mythos World

A continuous loop is the correct answer. But “continuous” still means that a person is driving it. In the post-Mythos world, the essential gap is no longer about what you can see; it is between identifying an exposure and validating it fast enough that an AI-driven adversary doesn’t get there first.

This is where validation moves from continuous to autonomous: agents read an alert, run a test, run a simulation, push a fix, and write the report, while the SOC catches some much-needed sleep.


We’ll be showing exactly what that looks like (architecture, agent workflow, and the operational reality of running it inside a real business) at the Automated Verification Conference on May 12 & 14, hosted by Frost & Sullivan, along with practitioners from Kraft Heinz and Glow Financial Services and Picus CTO Volkan Erturk.

>> See it in action at the conference.

Sponsored and written by Picus Security.
