
Credit: Justin Joseph from Pexels
Almost half of Australians say they have used artificial intelligence (AI) tools recently, making it increasingly important to know when and how they are being used.
Consulting firm Deloitte recently issued a partial refund to the Australian government after AI-generated errors were found in a report it delivered.
A lawyer also recently faced disciplinary action after fabricated, AI-generated citations were discovered in official court documents. And many universities are concerned about how students are using AI.
Amid these concerns, a variety of "AI detection" tools have emerged, promising to help people identify accurate, reliable, and verified content.
But how do these tools actually work, and are they effective at detecting AI-generated materials?
How do AI detectors work?
Several approaches exist, and their effectiveness depends on what type of content is involved.
Text detectors often try to infer AI involvement by looking for "signature" patterns in sentence structure, writing style, and how predictable certain words and phrases are. For example, use of the words "delve" and "showcase" has skyrocketed since AI writing tools became widely available.
However, the differences between AI and human writing patterns are shrinking, which makes signature-based tools increasingly unreliable.
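As a rough illustration of the "signature" idea, here is a minimal Python sketch that simply counts how often a handful of marker words appear. The word list and the scoring are invented for illustration; real text detectors rely on far richer statistical models than any fixed word list.

```python
import re

# Hypothetical marker words whose frequency has risen sharply in AI-assisted
# writing; real detectors use far richer statistical signals than a word list.
MARKER_WORDS = {"delve", "showcase", "tapestry", "underscore"}

def marker_rate(text: str) -> float:
    """Return marker-word occurrences per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in MARKER_WORDS)
    return 1000 * hits / len(words)

sample = "We delve into the data to showcase a rich tapestry of results."
print(f"{marker_rate(sample):.1f} marker words per 1,000 words")
```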
Image detectors may work by analyzing metadata that some AI tools embed in image files.
For example, the Content Credentials inspect tool lets you see how a piece of content was created and subsequently edited, provided compatible software was used. As with text, images can also be compared against verified datasets of known AI-generated content (such as deepfakes).
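For readers curious what metadata inspection involves in practice, here is a minimal Python sketch using the Pillow imaging library. It prints an image's EXIF fields and crudely checks for an embedded Content Credentials (C2PA) manifest. The file name is hypothetical, and a negative result proves nothing: metadata is often stripped when images are re-saved or shared online.

```python
from PIL import Image  # pip install pillow
from PIL.ExifTags import TAGS

def inspect_metadata(path: str) -> None:
    """Print EXIF fields and crudely check for Content Credentials data."""
    exif = Image.open(path).getexif()
    for tag_id, value in exif.items():
        print(f"EXIF {TAGS.get(tag_id, tag_id)}: {value}")
    with open(path, "rb") as f:
        raw = f.read()
    # C2PA manifests are embedded in JUMBF boxes labeled "c2pa".
    if b"c2pa" in raw:
        print("Possible Content Credentials manifest found")
    else:
        print("No provenance metadata found (it may simply have been stripped)")

inspect_metadata("suspect.jpg")  # hypothetical file name
```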
Finally, some AI developers have begun adding watermarks to the output of their AI systems. These are hidden patterns in any kind of content that are imperceptible to humans but can be picked up by the developer's own detection tools. However, the major developers have not yet made these detection tools publicly available.
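Published research gives a flavor of how a statistical text watermark can work. The toy sketch below is loosely inspired by "green list" schemes from the academic literature: the key and the word-level split are invented for illustration, and production systems such as SynthID operate on model tokens and are far more sophisticated (and undisclosed).

```python
import hashlib

# Toy "green list" watermark. A secret key deterministically splits the
# vocabulary into hidden halves; a watermarking generator would favor the
# "green" half at each step, and the detector tests for that excess.
def is_green(word: str, key: str = "secret-key") -> bool:
    """Assign each word to a hidden 'green' half of the vocabulary."""
    digest = hashlib.sha256((key + word.lower()).encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    """Fraction of words in the green half; the detector's test statistic."""
    words = text.split()
    return sum(is_green(w) for w in words) / max(len(words), 1)

# Unwatermarked text hovers around 0.5 on average; watermarked output
# would push this fraction measurably higher.
print(green_fraction("The quick brown fox jumps over the lazy dog"))
```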
Each of these methods has drawbacks and limitations.
How effective are AI detectors?
The effectiveness of an AI detector depends on several factors. These include what tools were used to create the content and whether the content was edited or modified after it was generated.
The tool’s training data can also affect the results.
For example, the main datasets used to detect AI-generated images don't include enough full-body photos of people, or images of people from certain cultural backgrounds. This means detection accuracy is limited from the outset in many contexts.
Watermark-based detection is very good at identifying content created by AI tools from the same company. For example, if you're using one of Google's AI models, such as Imagen, Google's SynthID watermarking tool claims to be able to identify the resulting output.
However, SynthID is not yet publicly available. It also won't work if, for example, you generated your content with ChatGPT, which is not made by Google. Interoperability across AI developers is a big issue.
AI detectors can also be fooled when the output is edited. For example, if you use a voice cloning app and then add background noise or reduce the audio quality (say, by compressing the file), voice AI detectors may fail to flag the result. The same goes for AI image detectors.
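To see why such edits matter, here is a small NumPy sketch showing how added noise or crude re-quantization measurably changes the raw signal a detector would analyze. The "audio" is a stand-in pure tone rather than real cloned speech, and no actual detector is involved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one second of cloned-voice audio at 16 kHz (a pure tone here).
t = np.linspace(0, 1, 16000, endpoint=False)
audio = np.sin(2 * np.pi * 440 * t)

# Two cheap edits of the kind that can throw off detectors:
noisy = audio + rng.normal(scale=0.02, size=audio.shape)  # faint added noise
lowres = np.round(audio * 127) / 127                      # crude re-quantization

# The signal a detector receives is measurably different after editing.
print("RMS change from added noise:  ", np.sqrt(np.mean((noisy - audio) ** 2)))
print("RMS change from requantizing: ", np.sqrt(np.mean((lowres - audio) ** 2)))
```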
Explainability is also a big issue. Many AI detectors give users a "confidence estimate" of whether something was AI-generated, but they usually don't explain their reasoning or what led them to that conclusion.
It’s important to recognize that AI detection, especially when it comes to automated detection, is still in its infancy.
A good example of this can be seen in recent attempts to detect deepfakes. The winning model in Meta's Deepfake Detection Challenge identified four out of five deepfakes. However, it was tested on data drawn from the same dataset it was trained on. That's like seeing the answers before sitting the quiz.
The model’s success rate decreased when tested against new content. Only three out of five deepfakes in the new dataset were correctly identified.
All of this means that AI detectors can and do get things wrong. They produce false positives (claiming something was AI-generated when it wasn't) and false negatives (claiming something was human-made when it was actually generated by AI).
These mistakes can be devastating for the people involved: a student whose self-written essay is rejected as AI-generated, for instance, or someone who trusts an AI-written email because they mistakenly believe it came from a real person.
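Some back-of-envelope arithmetic shows how quickly false positives pile up when most content is genuine. All the numbers below are illustrative assumptions, not measured detector performance.

```python
# All numbers are illustrative assumptions, not measured detector performance.
total = 1000        # essays submitted
ai_share = 0.10     # assume 10% were AI-generated
tpr = 0.90          # detector catches 90% of AI essays (true positive rate)
fpr = 0.05          # but also flags 5% of human essays (false positive rate)

ai_essays = total * ai_share
human_essays = total - ai_essays
true_flags = ai_essays * tpr
false_flags = human_essays * fpr

print(f"AI essays correctly flagged:  {true_flags:.0f}")
print(f"Human essays wrongly flagged: {false_flags:.0f}")
print(f"Share of all flags that are wrong: "
      f"{false_flags / (true_flags + false_flags):.0%}")
```

Under these assumptions, a third of all flagged essays would be wrongly accused human work, even though the detector sounds accurate on paper.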
And as generative AI technology is developed and improved, detectors struggle to keep up; the arms race continues.
Where do we go from here?
Relying on a single detection tool is risky. It's generally safer to use a variety of methods to assess the trustworthiness of content.
You can do this by cross-referencing your sources and double-checking the facts of what is written. Or, for visual content, you can compare the suspect image to other images allegedly taken at the same time or location. You can also ask for additional evidence or clarification if something looks or sounds suspicious.
Ultimately, however, when detection tools fall short or aren't available, trust in individuals and institutions will remain one of the most important factors.
Provided by The Conversation
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Citation: How do "AI detection" tools actually work? And are they effective? (November 16, 2025) Retrieved November 16, 2025 from https://techxplore.com/news/2025-11-ai-tools-Effective.html
This document is subject to copyright. No part may be reproduced without written permission, except in fair dealing for personal study or research purposes. Content is provided for informational purposes only.
