Spotting Deepfakes in an Election Year: How AI Detection Tools Work — and Sometimes Fail


With the progress of generative AI technologies, synthetic media is getting more realistic. Some of these outputs can still be recognized as AI-altered or AI-generated, but the quality we see today represents the lowest level of verisimilitude we can expect from these technologies moving forward.


We work for WITNESS, an organization that is examining how transparency in AI production can help mitigate the growing confusion and lack of trust in the information environment. However, disclosure techniques such as visible and invisible watermarking, digital fingerprinting, labelling, and embedded metadata still need further refinement to address, at a minimum, issues of resilience, interoperability, and adoption. WITNESS has also done extensive work on the socio-technical aspects of provenance and authenticity approaches that can help people identify real content. See, for instance, this report or these articles here and here.

A complementary approach to marking what is real is to spot what is fake. Simple visual cues, such as looking for anomalous hand features or unnatural blinking patterns in deepfake videos, are quickly outdated by ever-evolving techniques. This has led to a growing demand for AI detection tools that can determine whether a piece of audio or visual content has been generated or edited using AI, without relying on external corroboration or context.

While not a perfect fit as a term, in this article we use “real” to refer to content that has not been generated or edited by AI. Yet it is crucial to note that the distinction between real and synthetic is increasingly blurred. This is due in part to the fact that many modern cameras already integrate AI functionalities to direct light and frame objects. For instance, iPhone features such as Portrait Mode, Smart HDR, Deep Fusion, and Night mode use AI to enhance photo quality. Android phones incorporate similar features, along with additional options for in-camera AI editing.

This piece aims to offer preliminary insights into how to evaluate and understand the results provided by publicly accessible detectors that are available for free or at low cost. Based on testing conducted in February 2024 of a sample of online detectors that included Optic, Hive Moderation, V7, InVID, Deepware Scanner, Illuminarty, DeepID, and an open-source AI image detector, we discuss where these tools may currently fall short, as well as what factors need to be considered when deciding whether to use them. It is important to acknowledge that, as with generative technologies, the development of detection models is ongoing. A tool’s performance may therefore vary over time: it might struggle to accurately identify certain manipulations at one point but excel at detecting others as it evolves. This dynamic nature underscores the challenges facing synthetic media detection and the necessity of staying informed about the latest advancements.

How Understandable Are the Results?

AI detection tools provide results that require informed interpretation, and this can easily mislead users. Computational detection tools could be a great starting point as part of a verification process, along with other open source techniques, often referred to as OSINT methods. This may include reverse image search, geolocation, or shadow analysis, among many others.


However, it is essential to note that detection tools should not be considered a one-stop solution and must be used with caution. We have seen how the use of publicly available software has led to confusion, especially when used without the right expertise to help interpret the results. Moreover, even when an AI detection tool does not identify any signs of AI, this does not necessarily mean the content is not synthetic. And even when a piece of media is not synthetic, what is in the frame is always a curation of reality, or the content may have been staged.

Most AI detection tools give either a confidence interval or a probabilistic determination (e.g., 85% human), whereas others give only a binary “yes/no” result. It can be challenging to interpret these results without knowing more about the detection model, such as what it was trained to detect, the dataset used for training, and when it was last updated. Unfortunately, most online detection tools do not provide sufficient information about their development, making it difficult to evaluate and trust the detector results and their significance.
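As an illustration of how such probabilistic outputs might be read cautiously, here is a minimal Python sketch. The thresholds and wording below are our own assumptions for the purpose of the example and do not correspond to any of the tools we tested.

```python
# Illustrative only: a hypothetical helper for reading detector scores.
# The 0.30 / 0.70 thresholds are arbitrary assumptions, not values used
# by any detection tool discussed in this article.

def describe_score(p_synthetic: float) -> str:
    """Translate a detector's probability into cautious language."""
    if p_synthetic >= 0.70:
        return "Signals consistent with AI generation; corroborate with other OSINT methods."
    if p_synthetic <= 0.30:
        return "No strong AI signals found; absence of evidence is not proof the media is real."
    return "Inconclusive; the score alone should not drive a verdict."

for score in (0.85, 0.55, 0.12):
    print(f"{score:.2f} -> {describe_score(score)}")
```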

How Accurate Are These Tools?


AI detection tools are computational, data-driven processes. These tools are trained on specific datasets, including pairs of verified and synthetic content, to categorize media with varying degrees of certainty as either real or AI-generated. The accuracy of a tool depends on the quality, quantity, and type of training data used, as well as the algorithms it was designed around. For instance, a detection model may be able to spot AI-generated images, but may not be able to identify that a video is a deepfake created by swapping people’s faces.
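For readers who want a feel for the underlying mechanics, the toy sketch below fits a generic binary classifier on made-up feature vectors standing in for labeled “real” and “synthetic” examples. It is illustrative only, under those stated assumptions, and does not reflect the architecture or training data of any detector discussed here.

```python
# Toy sketch of the data-driven idea behind detectors: a binary classifier
# fitted on labeled examples. Real detectors use deep networks and large
# curated datasets; the random "features" below are stand-ins only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Pretend feature vectors extracted from 1,000 real and 1,000 synthetic images.
real = rng.normal(loc=0.0, scale=1.0, size=(1000, 16))
synthetic = rng.normal(loc=0.4, scale=1.0, size=(1000, 16))
X = np.vstack([real, synthetic])
y = np.array([0] * 1000 + [1] * 1000)  # 0 = "real", 1 = "AI-generated"

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The model only generalizes to content resembling its training data:
# a different generator, heavy compression, or a new domain can break it.
print("held-out accuracy:", clf.score(X_test, y_test))
print("p(synthetic) for one sample:", clf.predict_proba(X_test[:1])[0, 1])
```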

Similarly, a model trained on a dataset of public figures and politicians may be able to identify a deepfake of Ukrainian President Volodymyr Zelensky, but may falter with lesser-known figures, such as a journalist who lacks a substantial online footprint.

Additionally, detection accuracy may diminish in scenarios involving audio content marred by background noise or overlapping conversations, particularly if the tool was originally trained on clear, unobstructed audio samples.


A screenshot of Putin’s deepfake was detected as likely to be real, using a detection tool trained for detecting AI images but not for spotting deepfake videos created by swapping people’s faces. Image: Courtesy of Reuters Institute

We tested a detection plugin designed to identify fake profile images made by Generative Adversarial Networks (GANs), such as those seen in the This Person Does Not Exist project. GANs are particularly adept at producing high-quality, domain-specific outputs, such as lifelike faces, in contrast to diffusion models, which excel at generating intricate textures and landscapes. Diffusion models power some of the most talked-about tools of late, including DALL-E, Midjourney, and Stable Diffusion.

Detection tools calibrated to spot synthetic media crafted with GAN technology might not perform as well when faced with content generated or altered by diffusion models. In our testing, the plugin seemed to perform well at identifying GAN outputs, likely due to predictable facial features, such as eyes consistently located at the center of the image.


An AI image made with Midjourney was detected to “likely be a real person.” Image: Courtesy of Reuters Institute

An alternative approach to determining whether a piece of media has been generated by AI is to run it through the classifiers that some companies have made publicly available, such as ElevenLabs. Classifiers developed by companies determine whether a particular piece of content was produced using their own tool. This means classifiers are company-specific, and are only useful for signaling whether that company’s tool was used to generate the content. This is important because a negative result only denotes that the specific tool was not employed; the content may still have been generated or edited by another AI tool.

These classifiers face the same detection challenges. For instance, adding music to an audio clip might confuse the classifier and lower the likelihood of the classifier identifying the content as originating from that AI tool. At the moment, no company publicly offers classifiers for images.
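To make the logic of reading such a result concrete, here is a minimal, purely hypothetical sketch in Python. The function name and phrasing are our own illustration, not the output format of ElevenLabs or any other company’s classifier.

```python
# Hypothetical helper, not a real company's API: it only illustrates how to
# phrase a tool-specific classifier's verdict without over-claiming.
def interpret_company_classifier(tool_name: str, made_with_tool: bool) -> str:
    if made_with_tool:
        return f"The content was likely generated with {tool_name}."
    # A negative result only rules out this one tool.
    return (
        f"No trace of {tool_name} was found. The content may still have been "
        "generated or edited with another AI tool, or altered (e.g., by adding "
        "music) beyond what this classifier can recognize."
    )

print(interpret_company_classifier("Tool X", made_with_tool=False))
```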

How Can You Trick a Detection Tool?

It’s important to keep in mind that tools built to detect whether content is AI-generated or edited may not detect non-AI manipulation.

Editing content: In May 2023, an image purporting to show an explosion at the Pentagon went viral. Though quickly debunked, the image managed to cause a brief panic and even a dip in the stock market. Some news channels also picked up the story and reported it as real.

AI image detection tools were able to identify the image as AI-generated, but after we cropped and scaled the right-hand side of the image, the detector suggested the content was real. We ran a similar test with one of the well-known deepfakes of President Obama. After decreasing the resolution and editing out part of the clip, the result came back as “No Deepfake Detected.”


At the top: the image that was shared on social media — and correctly identified as AI-generated. At the bottom: a cut-out of the same image, with the bottom right of the original image cropped and scaled, fooled a classifier tool into thinking it was genuine. Images: Courtesy of Reuters Institute


When we lowered the resolution and edited out the end of this deepfake video of President Obama it was classified as “not a deepfake.” Image: Courtesy of Reuters Institute

Online detection tools might yield inaccurate results with a stripped version of a file (i.e., one from which information about the file has been removed). This is not necessarily malicious or even intentional. For instance, social media platforms may compress a file and eliminate certain metadata during upload.

Even when training sets include cropped, blurred, or compressed materials, the act of compressing, cropping, or resizing a file can affect its quality and resolution and throw off the detectors, partly because the training materials cannot cover every way in which this kind of degradation and metadata stripping can occur.

Therefore, detection tools may give false results when analyzing such copies. Likewise, when using a recording of an AI-generated audio clip, the quality of the audio decreases, and the original encoded information is lost. For instance, we recorded President Biden’s AI robocall, ran the recorded copy through an audio detection tool, and it was detected as highly likely to be real.
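The sketch below, which assumes only the Pillow imaging library, a local file named original.jpg, and a placeholder run_detector() function, shows how easily these degradations arise in practice: a crop, a downscale, and a low-quality JPEG re-save discard both pixel detail and metadata, which is why a circulated copy can score very differently from the original.

```python
# Minimal sketch (assumes Pillow and a local "original.jpg"): simulate the
# "innocent" transformations a file may undergo when shared online.
from PIL import Image

original = Image.open("original.jpg")
print("EXIF tags in original:", len(original.getexif()))

# Crop and downscale, as a platform or a screenshot might.
w, h = original.size
copy = original.crop((0, 0, w // 2, h)).resize((w // 4, h // 2))

# Re-encode at low JPEG quality; EXIF metadata is not carried over by default.
copy.save("circulated_copy.jpg", format="JPEG", quality=40)
print("EXIF tags in copy:", len(Image.open("circulated_copy.jpg").getexif()))

# run_detector() is a placeholder for whichever detection tool is being tested:
# for path in ("original.jpg", "circulated_copy.jpg"):
#     print(path, run_detector(path))  # scores may diverge sharply
```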


At the top, a recording of President Biden’s robocall was detected as “highly likely real.” By contrast, the bottom screenshot shows the results for a downloaded version of the file, which was detected as 74% “likely to be fake.” Images: Courtesy of Reuters Institute

Similarly, a screenshot of an AI-generated image does not contain the same visible and invisible information as the original. We took screenshots from a known case in which AI avatars were used to back a military coup in West Africa. More than half of these screenshots were mistakenly classified as not generated by AI.


A screenshot with an AI avatar from one of the clips was detected as 70% human. Image: Courtesy of Reuters Institute


A screenshot with an AI avatar from one of the clips was marked with an 8.5% probability of being AI-generated. Image: Courtesy of Reuters Institute

Can These Tools Detect All Types of AI Generation and Manipulation?


Openly available AI detection software can be fooled by the very AI techniques it is meant to detect. Here is how.

Style prompting: Detection tools are often trained on datasets created in a controlled lab environment, where the content is clear and well organized. This allows the detection model to learn and identify the relevant features accurately during training. In real-world scenarios, though, images may be blurry and out of focus, videos may be shaky and tilted, and audio may be noisy. This can make it challenging for detection tools to accurately identify and classify real-life content.

While detection tools may have been trained with content that imitates what we may find in the “wild,” there are easy ways to confuse a detector.

For instance, we built blur and compression into the generation through the prompt, to test whether we could bypass a detector. We used OpenAI’s DALL-E 2 to generate realistic images of a number of violent settings, similar to what a bystander might capture on a phone. We gave DALL-E 2 specific instructions to decrease the resolution of the output and to add blur and motion effects. These effects seemed to confuse the detection tools, making them believe the photo was less likely to be AI-generated.


An AI-generated image of a possible attack in a subway station, generated by DALL-E 2, was detected as not likely to be AI-generated. Image: Courtesy of Reuters Institute


An AI-generated image of a fake explosion at the US White House, generated by DALL-E 2, was detected with low confidence as AI-generated. Image: Courtesy of Reuters Institute

Advanced editing: Text-to-image tools have opened the door to easy editing techniques that allow users to alter elements within the frame of an image, or to add new details outside the frame, regardless of whether the image is real or AI-generated in origin. These are known as “in-painting” and “out-painting” techniques, respectively. When they are applied to real images, online tools seem to fail at detecting the manipulations. In this in-painting example, we replaced a car with a tank in a real image of Ukrainian refugees, using DALL-E 2’s text-to-image editing. The new image with the tank came back as “not likely to be AI-generated.”


In-painting example using a real image showing Ukrainian refugees, adding a tank with DALL-E 2. Image: Courtesy of Reuters Institute, original photograph by Peter Lazar/AFP/Getty Images.


The image with the in-painted tank was detected as “likely not to be AI-generated.” Image: Courtesy of Reuters Institute, original photograph by Peter Lazar/AFP/Getty Images.

As an example of out-painting, we took a real image from the Israel-Hamas war and used DALL-E 2 to add the extra context of “smoke.” DALL-E 2 also extended the buildings in the image. The final output was detected as real.


Outpaint example using a real image showing soldiers in the Gaza-Israel war. Image: Courtesy of Reuters Institute, original photograph by Reuters


This tool detected the out-painted output as “likely human.” Image: Courtesy of Reuters Institute

So What Can Journalists, Fact-checkers, and Researchers Do?

Although this piece identifies some of the limitations of online AI detection tools, they can still be a valuable resource as part of the verification process or an investigative methodology, as long as they are used thoughtfully.

It is essential to approach them with a critical eye, recognizing that their efficacy is contingent upon the data and algorithms they were built upon. Additionally, it is important to keep in mind that, with the exception of audio content, most of the footage we see online tends to be mis-contextualized material (circulated with the wrong date, time, or location) or images edited with software that has already been available for a few years.

The biggest threat brought by audiovisual generative AI is that it has opened up the possibility of plausible deniability, by which anything can be claimed to be a deepfake.


Detection tools should be used with caution and skepticism, and it is always important to research and understand how a tool was developed, but this information may be difficult to obtain.

As technology advances, previously effective algorithms begin to lose their edge, necessitating continuous innovation and adaptation to stay ahead. As soon as one method becomes obsolete, new, more sophisticated techniques must be developed to counteract the latest advancements in synthetic media creation. While more holistic responses to the threats of synthetic media are developed across the information pipeline, it is essential for those working on verification to stay abreast of both generation and detection techniques.

It is also key to consider that AI technology is increasingly incorporated across all kinds of software, from phone cameras to social media apps; and it is also involved in multiple stages of production, from stabilizing cameras and applying filters, to erasing unwanted objects and subjects from the frame.

AI may not necessarily generate new content, but it can be applied to a specific region of the content, or to a specific keyframe and moment in time. This complex and wide range of manipulations compounds the challenges of detection. New tools, versions, and features are constantly being developed, raising questions about how well, and how frequently, detectors are being updated and maintained.

Security and ethical considerations matter too. Given the uncertainty surrounding the storage and use of analyzed content by these platforms, it is vital to weigh the privacy and security risks, especially when dealing with material that could impact the privacy and safety of real individuals depicted in an image, audio, or video.

In light of these considerations, when disseminating findings, it is essential to clearly articulate the verification process, including the tools used, their known limitations, and the interpretation of their confidence levels. This openness not only bolsters the credibility of the verification but also educates the audience on the complexities of detecting synthetic media. For content bearing a visible watermark of the tool that was used to generate it, consulting the tool’s proprietary classifier can offer additional insights. However, remember that a classifier’s confirmation only verifies the use of its respective tool, not the absence of manipulation by other AI technologies.

Editor’s Note: This story was originally published by the Reuters Institute and is reposted here with permission.


shirin anlen, WITNESS: shirin anlen is an award-winning creative technologist, researcher, and artist based in New York. Her work explores the societal implications of emerging technology, with a focus on internet platforms and artificial intelligence. At WITNESS, she is part of the Technology, Threats, and Opportunities program, investigating deepfakes, media manipulation, content authenticity, and cryptography practices in the space of human rights violations. She is a research fellow at the MIT Open Documentary Lab, a member of Women+ Art AI, and holds an MFA in Cinema and Television from Tel Aviv University, where she majored in interactive documentary making.

Raquel Vázquez Llorente, WITNESS: Raquel Vázquez Llorente is a lawyer specializing in audiovisual media in conflict and human rights crises. At WITNESS, she leads a team that critically examines the impact of emerging technologies, especially generative AI and deepfakes, on our trust in audiovisual media. Her policy portfolio also focuses on the operational and regulatory challenges associated with the retention and disclosure of social media content that may be probative of international crimes. She serves on the Board of The Guardian Foundation, and on the Advisory Board of TRUE, a project that studies the impact of deepfakes on trust in user-generated evidence in accountability processes for human rights violations. She is also a member of the PAI (Partnership on AI) Policy Steering Committee, a body considering pressing questions in AI governance. She holds an MSc in International Strategy and Diplomacy from the London School of Economics and Political Science (LSE), and an Advanced Degree in Law and Business Administration from Universidad Carlos III de Madrid.
