The Present and Future of Detecting Child Sexual Abuse Material on Social Media

This article underscores the challenges in detecting Child Sexual Abuse Material (CSAM) online, spotlighting the limits of hash matching and the prospective advances offered by AI classifiers. As legal frameworks like the UK's Online Safety Bill evolve, they may propel enhanced detection technologies, despite prevailing privacy and technical hurdles. Overall, this progression may lead to mandatory adoption of detection technology and greater detection of CSAM in live video, a significant step toward combating CSAM.

Tim Bernard


Recent reports and draft legislation have highlighted the continued, and perhaps increasing, salience of the problem of child sexual abuse material (CSAM) online, including on mainstream platforms. It is therefore worth reviewing the landscape of CSAM detection technology, the obstacles to further advances, and the future of these tools, along with their legal and policy context.

(In some regions, CSAE—child sexual abuse and exploitation—may be a more familiar acronym than CSAM, but that term covers more complex issues, such as non-explicit content and grooming, which are beyond the scope of this article. Network exposure is another major issue in CSAM detection, but it is also outside of what we will cover here.)

Hash matching

The baseline technique for detecting CSAM is hash matching: comparing a string of characters (a hash) algorithmically extracted from a media file with the hashes taken from a database of confirmed CSAM to automatically identify violative content. 

There are several common hashing algorithms that produce an exact, 1:1 hash for an image, meaning that any change to the image also changes the hash. In an adversarial context like CSAM, where users are deliberately violating policy and attempting to evade enforcement, this is of very limited utility: it is easy to tweak images and videos (for example, by minor cropping) and thus change the hash. For this reason, CSAM hash matching uses fuzzy or perceptual hashing, where the hashing algorithm is designed to tolerate a certain degree of change to the media, as long as the image remains essentially the same to human vision. (This recent article by the technology's pioneer gives a fuller explanation of the perceptual hashing landscape.)
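To make the contrast concrete, here is a minimal sketch of perceptual hash comparison using the open-source imagehash library and its pHash implementation. PhotoDNA's own algorithm is not public, and the distance threshold below is purely illustrative.

```python
# Minimal sketch of perceptual hash comparison, using the open-source
# "imagehash" library as a stand-in for production systems such as PhotoDNA.
# The distance threshold is illustrative only.
from PIL import Image
import imagehash

def perceptually_similar(path_a: str, path_b: str, max_distance: int = 8) -> bool:
    """Return True if two images are perceptually similar.

    pHash produces a 64-bit hash; subtracting two hashes gives the Hamming
    distance (how many bits differ). Small distances mean the images still
    look alike after resizing, re-encoding, or minor cropping.
    """
    hash_a = imagehash.phash(Image.open(path_a))
    hash_b = imagehash.phash(Image.open(path_b))
    return (hash_a - hash_b) <= max_distance
```

A cryptographic hash such as SHA-256, by contrast, only ever matches byte-identical files, which is exactly what makes exact hashing so easy to evade.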

CSAM collections and hash databases

As mentioned, hash matching requires the use of a database of known CSAM. This presents some complexity as, in many jurisdictions, it is illegal to knowingly possess the CSAM itself. Some organizations are authorized to maintain these collections, such as the National Center for Missing & Exploited Children (NCMEC) in the US, the Internet Watch Foundation in the UK, and the Canadian Centre for Child Protection in Canada. Each of these entities has its own processes and policies for including and categorizing images in its data set, and each then provides a database of hashes to trusted partners.

By far the most prominent of the CSAM image hash matching systems is PhotoDNA. Developed by Microsoft for use on Bing and SkyDrive and then donated to NCMEC, it has become the standard CSAM image detection tool for platforms above a certain size. Some larger platforms have arranged to implement the technology internally, but Microsoft also offers a free cloud implementation. More recently, PhotoDNA for video was introduced, and Google also offers a free CSAM matching API called CSAI Match, based on its own work for YouTube. In a parallel move, Meta (then Facebook) open-sourced its perceptual hashing algorithms for images and videos, PDQ and TMK+PDQF, so that other platforms could implement them internally, though these do not come with the plug-and-play database integrations that PhotoDNA and CSAI Match offer.

Most recently, Cloudflare, the market-dominating network infrastructure provider, introduced a tool (descriptively named CSAM Scanning Tool) for its clients that draws on NCMEC's database to identify, and support the removal and reporting of, known CSAM. This service opens up the technology to many smaller platforms and websites that may not be approved for, or able to properly implement, PhotoDNA.

The non-profit Thorn also maintains a database of hashes of images reported by platforms that use its proprietary CSAM detection, actioning, and reporting solution, Safer, and provides that database to other customers as well as to law enforcement and NGOs.
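Putting these pieces together, the sketch below shows, in broad strokes, how a platform might wire a partner-supplied hash list into its upload pipeline. The loader, quarantine, and reporting functions are hypothetical placeholders; real integrations with PhotoDNA, CSAI Match, Safer, or Cloudflare's tool each have their own interfaces, vetting processes, and legal obligations.

```python
# Hypothetical upload-pipeline sketch: match new uploads against hashes
# supplied by a trusted hash-sharing partner. All helper functions below are
# placeholders, not any vendor's actual API.
from PIL import Image
import imagehash

MAX_DISTANCE = 8  # illustrative Hamming-distance threshold

def load_partner_hash_list() -> list[imagehash.ImageHash]:
    """Placeholder: in reality, hashes arrive via a vetted partner feed."""
    return []

def quarantine(path: str) -> None:
    """Placeholder: remove the file from circulation pending review."""

def queue_ncmec_report(path: str) -> None:
    """Placeholder: prepare a report to NCMEC's CyberTipline."""

KNOWN_HASHES = load_partner_hash_list()  # platforms hold hashes, never the images

def handle_upload(path: str) -> str:
    upload_hash = imagehash.phash(Image.open(path))
    if any(upload_hash - known <= MAX_DISTANCE for known in KNOWN_HASHES):
        quarantine(path)
        queue_ncmec_report(path)
        return "blocked"
    return "allowed"
```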

Shortcomings of the hash matching system

Despite substantial successes in keeping known CSAM off major platforms without undue engineering, computational, or moderation burdens, this regime still has problems, some minor, some more significant:

  • Accessibility: Very small platforms may not find it easy to meet the criteria for the PhotoDNA service or to afford Thorn's Safer, and may not be direct users of Cloudflare's services either. (The non-profit Prostasia Foundation offers a free plugin that lets Rocket.Chat admins run uploaded images through the PhotoDNA cloud service.)
  • Accuracy: Though rare, false positives known as hash collisions are inherent in hashing algorithms. Some are random, natural collisions: one documented natural collision in the TMK+PDQF hashing algorithm matched a video of three people by a river against a video of a sandwich being unboxed from a shoebox, with a high confidence level. There are also forced collisions, where a non-CSAM file is engineered to produce a hash that matches the hash of content in a CSAM registry. In theory, a bad actor could trick a victim into downloading or sharing an innocuous-appearing file that is matched with CSAM, exposing them to platform bans and even police attention. This is among the reasons that hash matching is performed via API and/or only for trusted partners. (A rough estimate of how rare random collisions are follows this list.)
  • Privacy: Some information about the files can be gleaned from perceptual hashing, prompting privacy concerns.
  • Employee health: Workers at organizations like NCMEC must shoulder the psychological burden of viewing and categorizing CSAM images. Content moderators at platforms face a comparable burden when reviewing CSAM that is not in the hash registry and has been flagged by users or by proactive means.
  • Policy autonomy: As mentioned, each database has its own policies, procedures, and categories for identifying and classifying CSAM. Platforms using hash-matching services may therefore have limited scope to define what should be treated as clearly violating content, both for on-platform enforcement and for reporting to law enforcement.
  • Comprehensiveness: Conversely, the existence of different hashes and databases means that known CSAM may go undetected on a platform because it is only contained in a database, or hashed using an algorithm, that the platform is not using. Fortunately, collaboration is common in this field, and the Tech Coalition (a large group of internet companies cooperating on child safety efforts) has been working to improve interoperability between CSAM video hashing technologies.
  • New CSAM: Most importantly, hash matching only identifies known CSAM that has been reported, classified, hashed, and included in a database, and not previously unreported “new” CSAM, the detection of which is likely the most critical for preventing ongoing abuse of children.
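On the accuracy point above, a back-of-the-envelope calculation suggests why purely random collisions are rare. The estimate below assumes an idealized 64-bit hash with independent, uniformly random bits, an assumption real perceptual hashes do not satisfy, since visually similar structures produce correlated bits.

```python
# Rough estimate of the chance that two unrelated, idealized 64-bit hashes
# land within a given Hamming distance of each other. Real perceptual hashes
# have correlated bits, so real-world collision rates differ.
from math import comb

def random_collision_probability(bits: int = 64, max_distance: int = 8) -> float:
    """P(two uniformly random hashes differ in at most max_distance bits)."""
    close_outcomes = sum(comb(bits, k) for k in range(max_distance + 1))
    return close_outcomes / 2**bits

print(f"{random_collision_probability():.1e}")  # about 2.8e-10 under these assumptions
```

Even at rates this low, a platform comparing billions of uploads against millions of database entries can eventually surface matches, which is part of why the collision examples above are more than a curiosity.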

Detecting previously unknown CSAM

New CSAM is a growing issue. Beyond familiar forms of child abuse, there has been increased awareness of online sextortion schemes, and the Stanford Internet Observatory has released reports in recent months describing trends in self-generated and AI-generated CSAM.

A portion of new CSAM is identified through user reports or by uncovering a cache or network dedicated to CSAM. Other new CSAM can be identified through automated systems, though these are not nearly as widespread as hash matching. Platforms have access to just one readily available free classifier for static images, Google's Content Safety API, which dates back to 2018 and returns a priority score based on the likelihood that an image contains CSAM. Users of the API are encouraged to also conduct their own review, implying a reasonable likelihood of false positives.
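Because such a classifier returns a score rather than a verdict, platforms still have to decide how to act on it. The sketch below is a hypothetical triage wrapper, not the Content Safety API's actual client interface; the scoring function and thresholds are placeholders.

```python
# Hypothetical triage around a CSAM classifier's priority score. The scoring
# call and the thresholds are placeholders, not Google's actual interface.

def get_priority_score(image_bytes: bytes) -> float:
    """Placeholder standing in for a call to a hosted CSAM classifier (0.0-1.0)."""
    return 0.0

def triage_upload(image_bytes: bytes) -> str:
    score = get_priority_score(image_bytes)
    if score >= 0.9:
        return "urgent_human_review"    # highest-priority queue for moderators
    if score >= 0.5:
        return "standard_human_review"  # uncertain: review before any enforcement
    return "no_action"                  # classifier sees no signal
```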

Considering the recent advances in image and video classification, the boom in vendors (including Unitary) providing sophisticated classification services, and the universal agreement on the intensely problematic nature of the content, it might be surprising that there are so few publicly available options for identifying new CSAM on platforms. (It is important to note that online platforms are not the only parties who scan for CSAM: law enforcement bodies and NGOs also make use of similar or identical hash-matching systems to identify CSAM on the open web, the dark web, peer-to-peer networks, and private systems. A number of classifier options are available for law enforcement applications.)

Legal restrictions on the possession of CSAM are perhaps the biggest obstacle to creating new classifiers. A data set of labeled images or videos is required to train a machine learning model to recognize such content and to test it for accuracy, and data labelers would then need to review images in successive iterations of the model to assess and improve its performance. Special arrangements would have to be made to allow the developers access to the imagery, and labelers would have to be exposed to traumatic material.

Researchers have proposed workarounds, such as combining existing classifiers for pornography and age estimation, breaking the task up into other discrete categories for which training data is accessible, or training a classifier on CSAM metadata. However, these do not appear to have resulted in a wealth of accessible and accurate tools for online platforms.
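To illustrate the first of these proposals, the sketch below combines two hypothetical off-the-shelf models, an explicit-content classifier and an age estimator, into a single review trigger. Both model calls and both thresholds are placeholders; as noted above, such combinations have not yet yielded production-ready tools.

```python
# Hypothetical sketch of the "combine existing classifiers" idea: flag an
# image for human review only when an explicit-content classifier and an
# age-estimation model both fire. Both models and thresholds are placeholders.

def explicit_content_score(image_bytes: bytes) -> float:
    """Placeholder for an off-the-shelf nudity/pornography classifier (0.0-1.0)."""
    return 0.0

def youngest_estimated_age(image_bytes: bytes) -> float:
    """Placeholder for the youngest estimated age among detected people, in years."""
    return 99.0

def flag_for_review(image_bytes: bytes) -> bool:
    # Require both signals; either model alone produces far too many false
    # positives, and even together this only routes content to human review.
    return (explicit_content_score(image_bytes) >= 0.8
            and youngest_estimated_age(image_bytes) < 18.0)
```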

Even with these options, some of the hesitancy in creating CSAM classifiers for trust & safety applications may also lie in the mismatch between the relatively high error rates of classifiers trained through such less-than-ideal processes (in comparison to hash matching) and the severity of the consequences for a user identified as uploading CSAM, as a few publicized cases of potential misidentification illustrate. Companies may fear negative publicity if their tools were implicated in serious harm to innocent users.

CSAM detection in the future

At some point, the legal and other obstacles to creating more AI classifiers for CSAM are bound to be overcome, so we should expect to see more of these services in the short-to-medium term. Live video (chat or broadcast) is a known problem area for CSAM and associated abuse; the format adds several layers of complexity, but classifiers that can handle live content will also be in demand. Two intertwined areas will also steer developments in CSAM detection in the coming months and years:

  1. Legislative demands for scanning: The UK's Online Safety Bill, which is in its final stages of approval, requires platforms to take responsibility for ensuring that CSAM is promptly removed. It also gives Ofcom, the British regulator charged with enforcing the new regime, the power to insist that platforms adopt approved technologies to remove this content if they are assessed to be falling short. Hash matching is a clear candidate, but the use of classifiers may also be imposed. The draft EU Child Sexual Abuse Regulation would give regulators similar powers, but it remains to be seen what the final text will look like and how it will square with the DSA's prohibition on general monitoring obligations. In the US, the EARN IT Act would create a carve-out in internet intermediaries' liability shield when it comes to CSAM, which would push platforms to make extensive efforts to remove CSAM if they are to avoid the risk of lawsuits.
  2. Scanning encrypted communications: End-to-end encrypted messaging and video calls are not currently scanned for CSAM. However, technologies are emerging that could do this, lawmakers have expressed interest in requiring it, and, in the UK, Ofcom may have the power to impose it under the Online Safety Bill. (Privacy activists and messaging platform executives have opposed these efforts vigorously, fearing that other content areas will be added to the list of what must be detected in encrypted communications.) The most prominent of these technologies is client-side scanning, where the hash matching takes place on the user's device, insulated from the service provider unless a positive match is found; Apple came close to rolling out its implementation of client-side scanning for iCloud (which is encrypted, but not end-to-end) before walking it back after complaints from civil liberties organizations. Two other approaches remain more theoretical: homomorphic encryption, where a cryptographic transformation of the hash of a user's image is compared against a CSAM hash database on a server without the hash ever being decrypted; and secure enclaves (which technically do not preserve end-to-end encryption), where special hardware on a server conducts the hash matching beyond the reach of the service provider. More information about these technologies can be found in section 6 of Hany Farid's paper cited earlier and in an EU impact assessment report (pp. 291-309). (A simplified sketch of the client-side scanning concept follows this list.)
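For readers unfamiliar with client-side scanning, here is a deliberately simplified sketch of the flow it implies: hashing happens on the device, and only a positive match triggers any disclosure. Real proposals, including Apple's shelved design, wrap this in cryptographic machinery such as blinded hash sets and threshold secret sharing; none of that is modeled here, and every helper below is a placeholder.

```python
# Deliberately simplified sketch of client-side scanning in an end-to-end
# encrypted messenger. All helpers are placeholders; real designs use blinded
# hash sets and other cryptography so the device cannot read the hash list
# and the server learns nothing except (eventually) that a match occurred.
from PIL import Image
import imagehash

ON_DEVICE_HASHES: list[imagehash.ImageHash] = []  # placeholder: shipped with the app
MAX_DISTANCE = 8                                  # illustrative threshold

def report_match(image_hash: imagehash.ImageHash) -> None:
    """Placeholder: disclose only the fact of a match, per the protocol in use."""

def send_encrypted(path: str, recipient: str) -> None:
    """Placeholder: normal end-to-end encrypted delivery."""

def send_image(path: str, recipient: str) -> None:
    local_hash = imagehash.phash(Image.open(path))  # hashing stays on-device
    if any(local_hash - known <= MAX_DISTANCE for known in ON_DEVICE_HASHES):
        report_match(local_hash)
    send_encrypted(path, recipient)
```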

CSAM detection is perhaps the area of greatest consensus in content moderation, and it has therefore been the locus of unprecedented collaboration between tech companies, academic researchers, NGOs, and government agencies, leading to a robust hash matching ecosystem. However, the difficulties involved in creating AI classifiers and the conflicts over scanning private communications have made the field's path less straightforward in recent years, and this trend seems likely to continue for some time.