Sasha Haco

November 23, 2022


Category:

AI

Sixteen images taken from the internet. They are meant to test a person's ability to recognise harmful content out of context.


Tuesday 9th February is Safer Internet Day. Marking this day in 2021 feels more important than ever. Recent events in America have demonstrated how abuse of the internet can lead to lasting real-life damage: large-scale misinformation resulted in mass unrest, causing riots and even deaths.

Prior to this, 2020 revealed the power of the internet to let us live our lives from the safety of our own homes. The unprecedented effects of the pandemic have meant people are spending more time online than ever before: for entertainment, work, socialising, learning, and everything else that is currently not possible in person.

The power of the online world has never been more apparent, for both the good and the bad.

Alongside the enormous benefits that the internet has brought us, particularly in this time of crisis, there are serious dangers that lie within its depths. Some of these are easy to identify; others are disguised, ambiguous and difficult to recognise, even for humans.

When we talk about online harm, it’s not necessarily obvious what we mean. In fact, governments, social media platforms, regulators and startups alike have dedicated enormous effort to defining what is meant by harmful content.

Two instances where Pepe the Frog is being used in completely different contexts. The cartoon of Pepe the Frog has been used in a variety of contexts, with hugely different meanings and intents. On the left, Pepe is being used as a resistance symbol by protestors in Hong Kong. On the right, Pepe is being used in a way that has led to the cartoon becoming officially classified as a hate symbol.

There are certain types of content that are obviously harmful and should be removed from the internet: for example, terrorist propaganda, child abuse material, or other egregious and illegal content that might be shocking or disturbing to consume. This is the type of content that reaches us in the news, with reports of Facebook moderators becoming traumatised by the destructive impact of their work. As such, this kind of overtly dangerous footage is what comes to mind when we talk about the need for online content moderation.

But what about the other types of content that might be legal but still just as harmful? Some material might be ambiguous in its significance, or unsafe only in certain contexts, depending on the platform, intended viewer, or geographical location and culture. It might not be as obviously damaging, but we must still be wary of its consequences. This might include new types of content that we haven’t seen before, such as the cartoon Pepe the Frog repurposed as a hate symbol, or content that may only be unsafe for certain users. We must seriously consider the impact of pornographic material, content encouraging the use of drugs, and media advertising alcohol or weapons to underage audiences. What is considered acceptable on one platform might be extremely inappropriate on another.

Harm is not a binary label: it is determined by a wide range of contextual factors.

In the advertising world, a new industry body called the Global Alliance for Responsible Media (GARM) has recently developed a brand suitability framework with which to assess and label content that is ‘safe’ for adverts to be placed alongside. This new GARM taxonomy consists of 11 categories and 4 risk levels, designed to give advertisers more fine-grained control over the environments in which their ads are served.

An extract from the new GARM framework for categorising harmful content. Image via GARM.

For example, one category is ‘Terrorism’, and the risk levels range from illegal and disturbing content to that which is associated with terrorism in some way, such as through a news feature on the subject. But while these 11 categories are a useful tool for thinking about types of content, many grey areas, holes and ambiguities remain.
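One way to picture a framework of this shape is as a lookup from a content category and a risk level to a placement decision. The sketch below is illustrative only: the risk-level names, the `may_place_ad` helper, and the per-advertiser tolerance are assumptions made for this example, not GARM's actual definitions.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Four ordered risk levels. The names are placeholders, not GARM's wording."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    FLOOR = 4  # assumed: content that no advertiser should appear beside


def may_place_ad(content_labels: dict[str, RiskLevel],
                 tolerance: dict[str, RiskLevel],
                 default_tolerance: RiskLevel = RiskLevel.LOW) -> bool:
    """Return True if every labelled category is within the advertiser's tolerance."""
    for category, risk in content_labels.items():
        if risk == RiskLevel.FLOOR:
            return False  # never suitable, whatever the advertiser accepts
        if risk > tolerance.get(category, default_tolerance):
            return False
    return True


# Hypothetical labels for two pieces of content in the 'Terrorism' category.
news_feature = {"Terrorism": RiskLevel.LOW}
propaganda = {"Terrorism": RiskLevel.FLOOR}

# A hypothetical advertiser that only accepts low-risk content in this category.
cautious_brand = {"Terrorism": RiskLevel.LOW}

print(may_place_ad(news_feature, cautious_brand))
print(may_place_ad(propaganda, cautious_brand))
```

The point of the structure is that the same category can yield different decisions depending on the risk level and on who is asking, which is what a binary safe/unsafe label cannot express.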

The ambiguous nature of harm can make moderation a difficult task.

This problem is further complicated by the fact that the perpetrators of toxic content often deliberately try to evade detection, disguising their posts as something different and benign. For example, if someone wanted to sell drugs on Facebook’s marketplace, they wouldn’t caption the post “drugs for sale”, or post a photograph of the substance.

MailOnline: Capsicum, broccoli and light salad dressing: The strange code names 'backpacker drug dealers' used to sell MDMA, marijuana and LSD on Vegetables Australia Facebook group

Instead, they might create a post offering the sale of “light salad dressing” (LSD) or “potent broccoli” (marijuana). These deliberate attempts to avoid being caught make moderation a much greater challenge than simply the removal of content that is upsetting to watch.
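To make the evasion concrete, here is a minimal sketch of a naive keyword filter and how the coded post slips past it. The blocklist, the `naive_keyword_flag` function and the example posts are all hypothetical, written for this illustration; this is not a description of how any real platform, or Unitary, moderates content.

```python
import re

# Hypothetical blocklist of overt terms; not a real moderation list.
OVERT_TERMS = {"drugs", "lsd", "mdma", "marijuana"}


def naive_keyword_flag(post: str) -> bool:
    """Flag a post if any of its words exactly matches a blocklisted term."""
    words = re.findall(r"[a-z]+", post.lower())
    return any(word in OVERT_TERMS for word in words)


overt = "Drugs for sale, message me"
coded = "Light salad dressing and potent broccoli available this weekend"

print(naive_keyword_flag(overt))  # the overt caption is caught
print(naive_keyword_flag(coded))  # the coded caption is not
```

Adding “broccoli” or “salad dressing” to the blocklist would catch this post, but it would also flag every genuine post about vegetables, which is why the surrounding context matters more than the words themselves.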

It is precisely because of the subtle and complex nature of online content that companies must take moderation so seriously. Platforms should carefully define what they consider appropriate, and moderators, both human and automated, must undergo rigorous training in order to detect dangerous material reliably. At Unitary, we develop multimodal algorithms to identify a range of harmful content, baking in an awareness of context and feeding our models any extra signals that might be available to us in order to refine our system. Since content moderation remains an ongoing and constantly evolving task, we will continue to learn and adapt to keep up with the unpredictable demands of this problem.

As a challenge on this year’s Safer Internet Day, we encourage you to consider which of the images posted above you think have the potential to be harmful, depending on the target platform or audience. Which would you flag up on social media, or want to warn users about?

Descriptions and sources of images:

Dabs, portions of cannabis concentrates that you smoke
Lean (aka ‘purple drank’) is a recreational drug, made by combining prescription-grade cough syrup with a soft drink and hard candy.
An incense match
Artificial marijuana plant
Marijuana
Essential oil inhaler
Flag of Tennessee
Dietary supplement
Essential oils
British Union of Fascists
Hemp smokes
Vape juice (in pink lemonade flavour)
Pepe the Frog, used here in a non-racist context
Sparkling water, with the brand name Liquid Death
Vape liquid
Clown Pepe (aka ‘Honkler’), now a racist cartoon
