SFI ADAPT Centre

Nominated Award: Best Application of AI to achieve Social Good
Website of Company: www.adaptcentre.ie
ADAPT is the world-leading SFI research centre for AI-Driven Digital Content Technology, hosted by Trinity College Dublin. ADAPT's partner institutions include Dublin City University, University College Dublin, Technological University Dublin (hosting this nominated project), Maynooth University, Munster Technological University, Athlone Institute of Technology, and the National University of Ireland Galway. ADAPT's research vision is to pioneer new forms of proactive, scalable, and integrated AI-driven Digital Content Technology that empower individuals and society to engage in digital experiences with control, inclusion, and accountability, with the long-term goal of a balanced digital society by 2030. ADAPT is pioneering new Human-Centric AI techniques and technologies including personalisation, natural language processing, data analytics, intelligent machine translation and human-computer interaction, as well as setting the standards for data governance, privacy and ethics for digital content.
Technological University Dublin, as a partner institute in ADAPT, has undertaken the N-Light project in collaboration with two national agencies: the ISPCC (Ireland's national child protection agency) and Hotline.ie (the Irish national hotline for reporting illegal content online). The project is funded by the Tech Coalition Safe Online Research Fund, an initiative of the End Violence Against Children Global Partnership (UNICEF).
Reason for Nomination
Societal Issue
The prevalence of child sexual abuse material (CSAM) online is a global problem that has grown exponentially with the digitisation of content and the ease of internet distribution. In Ireland, a quarter of the reports sent by the general public in 2020 to Hotline.ie, the Irish centre for combatting illegal content online, related to cases involving up to thousands of child sexual abuse images and videos. There was a 142% increase in the amount of child sexual abuse material that appeared to have been "self-generated". Facebook alone reported that it removed 11.6 million pieces of CSAM content in a three-month period in 2019 (Facebook Community Standards Enforcement Report, 2019).
Coercion or leading of children into the production of abusive content often involves grooming, both offline in their local environment and online via interactions on gaming platforms, social media, phone contact and other digital contact points. Once a child has been directly abused or coerced into self-generating content, distribution of that content is facilitated both by social media channels and, more covertly, by the dark web. The problem is increasing year on year, with the Internet Watch Foundation reporting a 77% increase in child "self-generated" content from 2019 to 2021 (WeProtect Global Threat Assessment 2021 report).
Two primary issues to be addressed, as pinpointed by the Tech Coalition, are: (1) The use of the internet by predators to gain access to children for the purposes of abuse, either online or offline. We need to understand exactly how children are being groomed: How are they being approached and coerced? What games, websites, mobile apps, or offline approaches are being used for contact? What patterns can we find? By better understanding these activities from the perspective of victims, we can develop educational strategies and policy recommendations to address them. (2) The sharing and distribution of abusive images and videos by CSAM sharers. How and where are CSAM sharers distributing and sourcing their content? What discussions are they having about child grooming techniques? How are these techniques changing over time?
The impact on children who have been involved in online child sexual exploitation and abuse is devastating. Beyond the original abusive crime, the crime perpetuates: the content has no shelf life and can remain available online, anywhere, with no end to the victims' trauma.
Solution
While there is no silver bullet for the problem of CSAM, a key element of reducing CSAM is to prevent children from being drawn into its production. By understanding the practices and changing techniques of groomers, we can inform stakeholders, including educators, parents, children and law enforcement, of the latest practices being used by abusers to produce and share CSAM.
ADAPT researcher at TU Dublin Dr Susan McKeever, along with her colleague Dr Christina Thorpe, has been awarded funding from the Tech Coalition Safe Online Research Fund (UNICEF) for their project, N-Light, which aims to discover patterns of child grooming and abuser activity through the use of AI applied to victim and abuser forum chats.
The project is a collaboration with the ISPCC (Ireland's national child protection agency) and the national hotline agency, Hotline.ie. The N-Light team is developing an AI-driven platform that identifies patterns of child sexual abuse and grooming behaviour from child victim reports, from public reports of illegal content, and from CSAM-related chats on internet forums on both the deep and dark web. State-of-the-art machine learning techniques for text analysis, using natural language processing, language models and supervised learning, are being developed to identify patterns and occurrences, leading to greater insight into how children are being approached and abused.
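To illustrate the kind of supervised text-classification approach described above, the minimal sketch below trains a simple model to flag conversation excerpts that resemble grooming-style approaches. It is an assumption-laden illustration only, not the N-Light platform: the tiny in-line dataset, the labels, and the TF-IDF plus logistic-regression choice are placeholders for the project's actual (and far richer) NLP and language-model techniques and its ethically governed agency data.

```python
# Minimal supervised text-classification sketch (illustrative only).
# The in-line texts and labels are invented placeholders; a real system
# would use agency-supplied, ethically governed data and stronger models
# (e.g. fine-tuned transformer language models).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder chat excerpts: 1 = pattern of concern, 0 = benign.
texts = [
    "don't tell your parents we talk, it's our secret",
    "send me a photo and I'll send you game credits",
    "good luck with your maths homework tomorrow",
    "are you coming to football practice on saturday",
]
labels = [1, 1, 0, 0]

# TF-IDF features feeding a linear classifier: a simple, transparent baseline.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

# Score a new, unseen excerpt; the output is the probability of the "concern" class.
new_text = ["this stays just between us, ok?"]
print(model.predict_proba(new_text)[0][1])
```

In practice, the same interface applies when the baseline vectoriser and classifier are replaced by the language models mentioned above; the value of such models for the agencies lies in surfacing recurring approach and coercion patterns across large volumes of reports.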
The output of the work will be a web-based tool for use by the child support and hotline agencies, enabling them to identify deeper, hidden patterns within and across their information in relation to CSAM production. The tool also offers the opportunity to scale up: child support and hotline agencies exist throughout the EU and beyond, opening the possibility of N-Light being used in other countries. The design of the tool includes the concept of forward engineering, such that data storage and uploading can be easily expanded to other jurisdictions.
Many AI-based efforts are already being applied by technology companies in an attempt to moderate the scale of abusive content uploaded to their platforms. While these efforts apply AI techniques, they address content only after it has been created and uploaded, and they do not extend to the dark web. The novelty of N-Light lies in its collaboration on expertise and data with victim-support and hotline agencies in the field, combined with state-of-the-art AI expertise in NLP and machine learning applied by ADAPT researchers.
The output of the project will be a usable tool for generating CSAE (child sexual abuse and exploitation) knowledge: identifying the specifics of how children are being targeted and how content is shared. The impact of the project will be achieved by acting on that knowledge through stakeholders: government (in education and law enforcement), parents, children and the general public. The project partnership facilitates this: both the ISPCC and Hotline.ie feed into child and online safety directives. In addition, N-Light has an advisory panel covering child-related agencies, internet safety authorities and law enforcement. This network will be used to inform policy statements on child education and online safety. Finally, the ADAPT research centre hosts an extensive range of outreach and engagement activities that will be used to promote both N-Light and its CSAM findings, educating the research community and the general public.
Additional Information:
Many efforts have been, and continue to be, made in academia and beyond to apply AI to detecting child sexual abuse and exploitation content. Gaining impact and traction from these efforts is very difficult. Data privacy makes it hard for academia to acquire large labelled social media datasets for model creation and evaluation. In parallel, the big-tech social media companies have huge budgets to drive developments in the field of AI for content moderation and content analysis, with leading NLP language models emerging from Google, Microsoft and Facebook/Meta. These developments, while encouraging, can only be evaluated and applied at scale by the social media companies themselves, as they control their own content.
A key aspect of the N-Light project is the effort made to work with real agencies in the field of CSAM: the ISPCC and Hotline.ie. This is not AI research for research's sake; it is an example of applying state-of-the-art AI in the domain of CSAM where it is most needed: with agencies who would otherwise have limited means of accessing the level of AI expertise required to analyse their own data at scale, and thus of discovering patterns of abuser and victim behaviour so that they can influence the stakeholders who matter: Government departments, law enforcement, media and the general public.
Collaborations of this kind require a large investment of time and expertise, but they help to bridge the gap between academics and practitioners. This is always a challenge, but for applied AI it is an essential one to address.
The N-Light project commenced in January 2022 and is due for completion in July 2023.