Artificial Intelligence – the future of fighting child sexual abuse material

15 April, 2019 NetClean

Artificial Intelligence (AI) has started to play a prominent part in child sexual abuse investigations by helping to recognise, categorise and triage material. The technology is currently used mostly by law enforcement; however, industry is now developing AI to recognise online child sexual abuse material.

In this series of blogs, we discuss technologies that are used to stop child sexual abuse. Unlike the tools we have previously covered, e.g. web crawlers and hashing technology, AI has the potential to identify new and previously unclassified child sexual abuse material. However, as always, it is not a fix-it-all solution.

The sheer volume of material that is found after arrests is a major stumbling block to processing cases and identifying offenders and children. Data in the NetClean Report 2017 shows that a ‘normal’ case can contain anywhere between 50,000 and 5 million images, many of which do not contain child sexual abuse. Analysing and classifying this number of images is a major challenge for the human brain. Help with sorting through and triaging large quantities of data is vital.

What is Artificial Intelligence?

Artificial Intelligence, AI, is the philosophical and practical idea that machines can carry out tasks in a way that humans consider ‘smart.’

Originally, AI focused on making computers perform challenging sequential tasks, like performing increasingly complex calculations. However, as technology and our understanding of the human brain changed, work on AI was directed towards mimicking human decision-making processes and carrying out tasks in more ‘human’ ways. It now relies on the adage that intelligence is learning without instruction or programming.

The current cutting edge of AI is Machine Learning, which is born from the idea that machines should be able to train on data and learn for themselves.

The cutting edge of the cutting edge

The cutting edge of machine learning is Artificial Neural Networks (ANN): complex statistical processors that were originally designed to process information in the same way that a human brain does.

Rather than being modelled on the historical system of sequential processing and execution of explicit instructions, ANN are based on efforts to model information processing in the human brain. These rely on parallel processing and implicit instructions based on recognition of patterns from external (sensory) sources.

Typically, artificial neurons are aggregated into layers, which may perform different types of transformations on their inputs. The output of one layer is the input to the next one. The more layers, the deeper the network is said to be, and this is why Machine Learning with ANN is sometimes called Deep Learning. Deeper networks enable more complex calculations and tasks to be performed.
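To make the idea of layers concrete, here is a minimal Python sketch. It is an illustration only and is not taken from any real product. The weights are hand-picked, hypothetical numbers, and the helper names `dense_layer` and `forward` are our own. A real network would learn its weights from data and would have far more neurons and layers.

```python
import math


def dense_layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs, then a sigmoid."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def forward(inputs, layers):
    """Pass the data through every layer: one layer's output is the next layer's input."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical weights: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
result = forward([1.0, 0.5, -1.0], layers)
print(result)  # a single value between 0 and 1
```

Adding another `(weights, biases)` pair to the `layers` list makes the network one layer deeper, which is all that "deep" means in this context.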

Importantly, ANN are adaptive systems that change based on the external or internal information that flows through the network. In other words, the system learns, and the network infers functions from what it has learned. This is important in applications where the complexity of the data or the task makes it difficult to design such functions by hand.

Data: the crucial component for training and learning

ANN learn and adapt through assessing data, and in order to draw the right conclusions they must train on high volumes of quality data. For example, if a system is to learn what a dinosaur is, it must train on large volumes of images and material labelled ‘Dinosaur’ and ‘Not Dinosaur.’ From this it will start to draw conclusions about what a dinosaur is (and is not). It will register colours, shapes, etc to draw conclusions, and it will come to understand that despite being as tall as some dinosaurs, a giraffe is not a dinosaur. The colouring and head shape are wrong.
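The dinosaur example can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions. Instead of a deep network looking at images, it trains a single artificial neuron (logistic regression) on two made-up features per example: height and whether the animal has scales. The data, the feature choice and the function names are all hypothetical.

```python
import math

# Hypothetical toy data: [height in tens of metres, has_scales (1.0 or 0.0)].
# Label 1 means 'Dinosaur', label 0 means 'Not Dinosaur'.
TRAINING_DATA = [
    ([0.50, 1.0], 1),  # tall and scaly
    ([0.10, 1.0], 1),  # small and scaly
    ([1.20, 1.0], 1),  # very tall and scaly
    ([0.55, 0.0], 0),  # giraffe: tall, but no scales
    ([0.05, 0.0], 0),  # cat
    ([0.30, 0.0], 0),  # elephant
]


def predict(weights, bias, features):
    """Return the model's estimated probability that the example is a dinosaur."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, learning_rate=0.5):
    """Batch gradient descent: nudge the weights to reduce the average error."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            grad_w = [g + error * x for g, x in zip(grad_w, features)]
            grad_b += error
        weights = [w - learning_rate * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


weights, bias = train(TRAINING_DATA)
giraffe_score = predict(weights, bias, [0.55, 0.0])   # tall, no scales
dinosaur_score = predict(weights, bias, [0.50, 1.0])  # tall, scaly
print(giraffe_score < 0.5, dinosaur_score > 0.5)
```

Nobody tells the model that height alone is a poor clue. It works that out from the labelled examples, in the same way that the system in the text learns that a giraffe is not a dinosaur.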

An AI application is only as good as the data on which it has trained. If the data is flawed, the system will draw the wrong conclusions and become ineffective or unhelpful. This is why, for optimum results (output), it is crucial to have high quality data and to structure the training in such a way that the system draws the right conclusions.

With the focus on quality data and training, current work has shifted towards performing specific tasks rather than investing in mimicking the human brain.

Everyday applications of AI

Tools and applications of AI are everywhere today. Examples include targeted advertisements on websites, travel planning, suggestions for articles and items to purchase, and suggestions for people you might know on LinkedIn, Twitter and similar services.

Other commonplace results of machine learning and ANN are virtual home assistants such as Alexa, Siri and OK Google. These operate by converting voice commands to text, which is then used to search the cloud for the information, music or other content that has been requested. These assistants can also turn commands into action, like turning down the lights and answering the door. Other cutting-edge examples of cloud machine learning are Google applications like Image search, Translate and Voice search.

AI and child sexual abuse material

Law enforcement works with specialised AI classifiers that can find previously unclassified material and assist with categorising and triaging images. This frees up time for investigators to focus on Victim ID, and enables police forces to bring prosecutions forward faster. Examples of classifiers that are available today to law enforcement and to NGOs that review child sexual abuse material are Amazon Rekognition, Google’s AI Tool, Griffeye Brain and Cease AI. There are also a number of law enforcement-driven initiatives aimed at the use or development of AI classifiers, for example by the Metropolitan Police in the UK.

The scope of what AI can assist with in child sexual abuse investigations is huge. In the near future, the hope is that AI will be able to analyse wider portions of images. By looking at objects in pictures, it will be able to estimate where an image was taken. It will be able to quickly run face recognition scans and, in video, listen to voices and group together material in which the same voice occurs.

The use of AI technology to detect or flag suspected child sexual abuse material outside of law enforcement is only starting to be developed. Google’s AI tool can, for example, be used to detect online child sexual abuse material in networks, services and on platforms. (It is available through a Content Safety API to NGOs and industry partners on request.) The Google product uses deep neural networks to learn what child sexual abuse material looks like. It is still in early development, but is showing great potential.

Limitations of AI

As mentioned above, AI is only as good as the data that it has been provided with. If images are categorised incorrectly, the AI will draw the wrong conclusions from the data. This means that the technology still relies heavily on human verification to ensure that the AI classification is right.
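One way to picture how a classifier and human verification fit together is the following minimal Python sketch. It is an assumption-laden illustration, not a description of how any of the tools named above work. The `Item` type, the `triage` function and the 0.5 threshold are hypothetical. The classifier score only decides the order in which a human reviewer looks at items; it never confirms a classification on its own.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    score: float  # hypothetical classifier confidence between 0 and 1


def triage(items, review_threshold=0.5):
    """Split items into a priority review queue and a lower-priority backlog.

    Nothing is confirmed automatically: both lists are meant for human
    reviewers, and the score only determines what they look at first.
    """
    ranked = sorted(items, key=lambda item: item.score, reverse=True)
    priority = [item for item in ranked if item.score >= review_threshold]
    backlog = [item for item in ranked if item.score < review_threshold]
    return priority, backlog


items = [Item("a", 0.91), Item("b", 0.12), Item("c", 0.55), Item("d", 0.48)]
priority, backlog = triage(items)
print([item.item_id for item in priority])  # ['a', 'c']
print([item.item_id for item in backlog])   # ['d', 'b']
```

Keeping the backlog rather than discarding low-scoring items matters because the classifier may have drawn the wrong conclusions from its training data.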

Although AI learns and corrects its mistakes, it is not a hundred percent reliable, and it will make mistakes. Some of those mistakes might seem strange to a human mind, because the error would have been easy for a person to spot (for example, calling a giraffe a dinosaur). It is usually very difficult, if not impossible, to trace back why the AI has made a particular mistake. The pathways are so complicated that it is nearly impossible to untangle what is going on.

Another challenge is perfecting things like face recognition. The algorithms read mature faces better than young ones, which needs to change in the context of child sexual abuse investigations and Victim ID.

Still, the scope for law enforcement and industry to detect material early and stop child suffering is tremendous, and AI technology makes the whole process of detecting and analysing material and rescuing children faster. In fact, development of AI is so quick that it is not unreasonable to believe that we will be able to return to this blog soon to update some of the content on how law enforcement and industry use AI to fight child sexual abuse.

About the Technical Model National Response

Inspired by the WeProtect Global Alliance Model, we have set out to develop an initiative that looks at technology. We call it the Technical Model National Response. It is an overview of the existing technologies that need to be applied by different sectors and businesses to effectively fight the spread of child sexual abuse material.
