Binary Hashing

26 July, 2022 NetClean

SUMMER READING:
Technologies to stop child sexual abuse material


In this short series of articles, we look at some of the technologies that are used to stop child sexual abuse material (CSAM) today. Some are used in NetClean’s products, and some are used by law enforcement and NGOs to find and remove material online.

Here we look at the first of two hashing technologies, binary hashing. Articles on robust hashing, AI and keyword matching will follow over the summer.

Binary hashing is used to fingerprint and discover child sexual abuse material at the content level: it identifies the actual images and videos depicting this abuse. Hashing technology is secure, fast and reliable, and is used in various ways, for example in detection tools, digital investigation tools and crawlers.

When new child sexual abuse material is found, the image or video is classified and given a hash value, a digital signature that is in practice unique to that file. These signatures are added to databases that software uses to match child sexual abuse material. One example of such software is NetClean ProActive, a CSAM detection tool.

Classification is done by law enforcement agencies and select NGOs that work to combat child sexual abuse. Once a hash value is in a database, specialized software can look for exact copies of the material, for example on social media sites and in IT environments that are protected by detection software.

How does it work? A binary hash is created by a mathematical algorithm that transforms the data of a file, whatever size it may be, into much shorter fixed-length data, a hash value. This acts as the file’s signature, allowing software to find and identify it.
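As a minimal sketch of this idea, the example below computes a hash value for inputs of very different sizes. The article does not name a specific algorithm, so the use of SHA-256 from Python's standard library is an assumption, and `binary_hash` is a hypothetical helper name chosen for illustration.

```python
import hashlib


def binary_hash(data: bytes) -> str:
    """Return a fixed-length hash value (hex string) for data of any size.

    SHA-256 is an assumption; the article does not say which algorithm is used.
    """
    return hashlib.sha256(data).hexdigest()


# Harmless stand-in bytes: one tiny input and one of about a megabyte.
small_input = b"a"
large_input = b"a" * 1_000_000

small_hash = binary_hash(small_input)
large_hash = binary_hash(large_input)

# Both hash values have the same length, regardless of input size.
print(len(small_hash), len(large_hash))

# Hashing the same input again gives the same hash value.
print(binary_hash(small_input) == small_hash)
```

Whatever the size of the input, the hash value here is always 64 hexadecimal characters, which is what makes it practical to store and compare as a signature.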

The output looks random, but the conversion is deterministic: the algorithm always transforms the same input data into the same output data. The output data cannot be reversed or traced back to the original input data. This one-way property means that an image cannot be recreated from a hash value.

This is an efficient and reliable technology. Because binary hashes are non-reversible and match only exact copies of files that have already been classified, it is extremely unlikely that the wrong material will be flagged. This is why law enforcement agencies use the technology to find material in investigations and to authenticate evidence.
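Matching against a database of known hash values can be sketched as a simple set lookup. This is an illustration under stated assumptions, not a description of how any particular product works: SHA-256 is assumed as the algorithm, and `known_hashes`, `sha256_hex` and `is_known` are hypothetical names. The byte strings are harmless placeholders.

```python
import hashlib
from typing import AbstractSet


def sha256_hex(data: bytes) -> str:
    """Hash value of the data (SHA-256 is an assumed algorithm)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical database: hash values of files that have already been classified.
# Only the hash values are stored, not the files themselves.
known_hashes = {
    sha256_hex(b"placeholder bytes standing in for classified file 1"),
    sha256_hex(b"placeholder bytes standing in for classified file 2"),
}


def is_known(data: bytes, database: AbstractSet[str]) -> bool:
    """Return True only if the data is an exact copy of a classified file."""
    return sha256_hex(data) in database


exact_copy = b"placeholder bytes standing in for classified file 1"
unrelated_file = b"some other file that has never been classified"

print(is_known(exact_copy, known_hashes))      # exact copy is matched
print(is_known(unrelated_file, known_hashes))  # unrelated file is not
```

Because a match requires the hash values to be identical, a file is flagged only when it is an exact copy of something already in the database.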

Binary hashing requires less computing power than robust hashing (see next week’s article), which is why NGOs and social media companies incorporate it into the web crawlers they use to actively search for known material online.

Binary hashing is a powerful tool. However, the slightest alteration of an image changes its hash value, and a crawler or other technology that relies solely on binary hashing will then not be able to find or recognize the image or video. Robust hashing, which we will look at next week, offers a solution to this problem.
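The effect of a slight alteration can be illustrated with a short sketch. It assumes SHA-256 as the hash algorithm (the article does not name one) and uses arbitrary stand-in bytes in place of a real file; `original`, `altered` and `differing_positions` are hypothetical names.

```python
import hashlib

# Arbitrary stand-in bytes in place of a real image file.
original = bytes(range(256)) * 4

# Flip a single bit in the first byte; everything else stays the same.
modified = bytearray(original)
modified[0] ^= 0x01
altered = bytes(modified)

original_hash = hashlib.sha256(original).hexdigest()
altered_hash = hashlib.sha256(altered).hexdigest()

# Count how many of the 64 hex characters differ between the two hash values.
differing_positions = sum(a != b for a, b in zip(original_hash, altered_hash))

print(original_hash)
print(altered_hash)
print(original_hash == altered_hash)  # False: the altered file no longer matches
print(differing_positions)
```

A one-bit change produces a completely different hash value, so a lookup against a database of known hash values would no longer find the altered file.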
