
Last weekend, the US AI company Anthropic released a report on the discovery of "the first reported AI-orchestrated cyber espionage operation."
The company said a Chinese government-backed hacker group used Anthropic’s proprietary Claude AI tools to automate key parts of its efforts to steal sensitive information from about 30 organizations.
This report attracted a lot of attention. Some, including respected experts, are warning of the future of AI-powered cyberattacks and urging cyber defenders to invest now before the coming onslaught.
At the same time, many in the cybersecurity industry are underwhelmed by Anthropic's claims, saying it is unclear what role AI actually played in the attack.
What the humans said happened
Critics say the report lacks detail, meaning some guesswork will be required to piece together what happened. With that in mind, hackers appear to have built a framework to execute cyber intrusion campaigns almost automatically.
This tedious work was performed by Anthropic’s Claude Code AI coding agent. Although Claude Code is designed to automate computer programming tasks, it can also be used to automate other computer activities.
Claude Code has built-in safety guardrails designed to prevent misuse. For example, when I asked it to write a program that could be used for hacking, it flatly refused.
However, as we’ve known since the early days of ChatGPT, one way to get around the AI system’s guardrails is to trick it into roleplaying.
Anthropic reports that this is what the hackers did: they tricked Claude Code into believing it was helping legitimate security professionals carry out authorized tests of the targets' defenses.
Missing details
The information released by Anthropic lacks the detailed information that the best cyber incident investigation reports tend to include.
Chief among these are the so-called indicators of compromise (or IoCs). When investigators publish reports about cyber intrusions, they typically include hard evidence that other cyber defenders can use to look for signs of the same attack.
Each attack campaign may use specific attack tools or be executed from specific computers under the attacker’s control. Each of these indicators forms part of the cyber intrusion signature.
Other organizations attacked from the same computers, using the same tools, can conclude that they too are victims of the same campaign.
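In practice, defenders compare their own logs and files against published IoC lists. The sketch below illustrates the idea; the IP addresses and file hash are made-up placeholder values, not indicators from any real report.

```python
# A minimal sketch of IoC matching. All indicator values below are
# hypothetical examples, not real indicators of compromise.
KNOWN_BAD_IPS = {"203.0.113.7", "198.51.100.23"}        # example attacker-controlled hosts
KNOWN_BAD_HASHES = {"d41d8cd98f00b204e9800998ecf8427e"}  # example hashes of attack tools

def matches_campaign(log_entries, file_hashes):
    """Return the indicators found in this environment that overlap
    a published campaign's IoCs."""
    hits = []
    for entry in log_entries:
        if entry.get("remote_ip") in KNOWN_BAD_IPS:
            hits.append(("ip", entry["remote_ip"]))
    for h in file_hashes:
        if h in KNOWN_BAD_HASHES:
            hits.append(("hash", h))
    return hits

logs = [{"remote_ip": "192.0.2.1"}, {"remote_ip": "203.0.113.7"}]
print(matches_campaign(logs, ["d41d8cd98f00b204e9800998ecf8427e"]))
```

Any non-empty result suggests the organization was touched by the same campaign and should investigate further. Real-world tooling works the same way at scale, typically consuming indicators in structured formats rather than hard-coded sets.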
For example, the US Cybersecurity and Infrastructure Security Agency recently partnered with government cyber agencies around the world to release information about an ongoing Chinese state-sponsored cyber espionage operation, including detailed indicators of compromise.
Unfortunately, Anthropic's report does not include such indicators. As a result, defenders cannot determine whether they too may have been victims of this AI-powered hacking campaign.
Unsurprisingly, success is limited
Another reason many people are underwhelmed by Anthropic's claims is that, on the surface, they aren't particularly surprising.
Claude Code is widely used by programmers because it improves their productivity.
Although not exactly the same as programming tasks, many of the common tasks performed during a cyber intrusion are similar enough to programming tasks that Claude Code should be able to perform them as well.
A final reason to be surprised by Anthropic's claims is that they suggest the attackers were able to get Claude Code to perform these tasks more reliably than usual.
Generative AI can accomplish amazing feats. However, ensuring that systems like ChatGPT and Claude Code can do this reliably remains a major challenge.
As critics have memorably observed, these tools too often meet difficult demands with some mix of sycophancy, refusal and hallucination: telling users what they want to hear, declining to perform hard tasks, or simply making things up.
In fact, Anthropic's report notes that Claude Code frequently deceived the attackers, claiming to have completed tasks it had not. This is a classic case of AI hallucination.
Perhaps this explains the low success rate of the attack. According to Anthropic's own report, approximately 30 organizations were targeted, but the hackers succeeded against only a handful of them.
What does this mean for the future of cybersecurity and AI?
Whatever the details of this particular campaign, AI-powered cyberattacks are here to stay.
Even if current AI-enabled hacking falls short of the hype, it would be foolish for cyber defenders to assume the current state of affairs will continue.
After all, Anthropic’s report is a timely reminder for organizations to invest in cybersecurity. Those who don’t could face a future where their secrets are stolen or their operations disrupted by autonomous AI agents.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Citation: AI institute says Chinese-backed bots are carrying out cyber-espionage attacks (November 17, 2025). Retrieved November 17, 2025 from https://techxplore.com/news/2025-11-ai-lab-chinese-bots-cyber.html
This document is subject to copyright. No part may be reproduced without written permission, except in fair dealing for personal study or research purposes. Content is provided for informational purposes only.
