In 2018, according to a survey conducted on behalf of the Anti-Defamation League, 53% of Americans experienced some form of online hate or harassment. For 37% of respondents, the content was “severe,” including sustained harassment, physical threats, stalking, or sexual harassment, and more often than not a combination of these.[1] Sadly, 15% of targeted individuals felt endangered enough to take active measures such as moving their place of residence.[1]


This is grim news. Considering that we spend a considerable amount of time on social media every day (many studies report more than two hours[2]), it is not surprising that the quality of our digital interactions matters a great deal for our emotional and physical health and wellbeing.


We are living in an age where everything is more connected than ever before. We read, write, laugh, cry, announce, denounce, fall in love, break up, get back together, and most definitely sleep with our phones nearby. And there is a good reason for that: digital technology has improved our lives in many ways, and social media is one of them. Staying connected across borders, distances, and time zones, feeling a sense of belonging, and having the ability to voice and broadcast our opinions are key benefits of social media. Nevertheless, like many other good things in life, social media comes at a cost. Being targeted by, or exposed to, hate speech is a big part of that hefty price.



What is online hate speech?

Hate speech is a type of speech performed with the purpose of attacking an individual or a group of people based on their perceived attributes. Online hate speech is the kind that occurs on digital platforms. One may be exposed to online hate speech by witnessing hateful comments directed at others, by encountering such content aimed at persons or groups one associates with, or by being directly targeted oneself.


One crucial thing to emphasize is that targets are not chosen randomly; targeting is usually a purposeful act. While the accuracy of the aim depends on the skill set of the perpetrator, the intent is clear: to diminish, demonize, trigger, harm, and hurt the person or persons belonging to a particular group, typically one of the “protected categories” recognized in law (race, ethnicity, religion, sex, gender identity, national origin, sexual orientation, disability status). Sometimes the content expresses that targeting blatantly. For example, for one-third of the targets in the above-mentioned study, the comments were “explicitly” aimed at their ethnicity, race, gender identity, sexual orientation, religion, or disability.[1] At other times, however, the content is skillfully hidden between the lines, invisible to outsiders while causing its full impact on targets.



Why is it so hard to do something about hate speech?

Part 1: Detection


Online hate speech is common and severe. Accordingly, there is increasing recognition of the problem in the industry, along with growing efforts to remove or counteract it. There is also a growing body of academic research investigating the formation, perpetration, and dissemination of hate speech. Yet reliable detection remains one of the biggest challenges. The problem is extremely hard because much of the content is systematically and deliberately crafted to slip past often keyword-based supervised machine learning models. These detection models are frequently unable to catch a small twist in a phrase (such as removing spaces or replacing one letter with another), so they fail to flag carefully orchestrated content.
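To make the evasion problem concrete, here is a minimal sketch (not any real platform’s system; the blocklist term and function names are hypothetical) of how an exact keyword filter misses trivially obfuscated text, and how even light normalization recovers some of it:

```python
import re

# Hypothetical placeholder term standing in for a slur on a blocklist.
BLOCKLIST = {"badword"}

# Map common character substitutions (leetspeak) back to letters.
LEET = str.maketrans("013457@$", "oleastas")

def naive_flag(text: str) -> bool:
    """Exact word match against the blocklist -- easy to evade."""
    words = re.findall(r"\w+", text.lower())
    return any(w in BLOCKLIST for w in words)

def normalize(text: str) -> str:
    """Lowercase, undo simple substitutions, and drop non-letters,
    which collapses spaced-out spellings like 'b a d w o r d'."""
    text = text.lower().translate(LEET)
    return re.sub(r"[^a-z]", "", text)

def robust_flag(text: str) -> bool:
    """Substring match on normalized text -- catches simple twists."""
    norm = normalize(text)
    return any(term in norm for term in BLOCKLIST)
```

Note that even this small improvement has a well-known downside: substring matching over collapsed text produces false positives on innocent words that happen to contain a blocked term, which is one more reason keyword approaches alone cannot solve the problem.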


Even without such deliberate adversaries, detecting non-profane hate speech is extremely challenging to begin with. It is often context, receiver, and sender dependent. Such multi-level, context-dependent nuances of language are hard to detect even for a Skynet on steroids. Writing “lol, he alright” might be the normal practice of a dialect or a grammar mistake, or it can be a purposeful, degrading, and hateful comment if it is written to mock another person’s dialect, especially with the intention of hurting that person and the group they are affiliated with.


One of the greatest strides in this area has been made at UC Berkeley, where Dr. Claudia von Vacano and the D-Lab Online Hate Index research team have been working on a multi-dimensional approach. This strategy includes the following:


1) establishing a theoretically-grounded definition of hate speech inclusive of research, policies, and practice

2) developing and applying a multi-component labeling instrument using Item Response Theory (IRT)

3) creating a new crowdsourcing workflow to label comments at scale

4) curating an open, reliably labeled hate speech corpus that includes multiple platforms

5) growing existing data and tool repositories within principles of replicable and reproducible research, enabling greater transparency and collaboration

6) creating new knowledge through ethical online experimentation (and citizen science)

7) refining AI models


Ultimately, they seek to understand the causal mechanisms for intervention and evaluation. All of these innovations will culminate in a new open-source platform with tools that will make these resources available along with policy recommendations, supporting the Anti-Defamation League and other advocacy organizations that are educating and growing a larger community invested in countering hate speech. Visit the Online Hate Index (OHI) or, even better, attend or view the BIDS lecture on Wednesday, May 8th at 3:00 pm to learn more about this research.


Part 2: Then what?


Detection, unfortunately, is only the tip of the iceberg. The real hardship starts with strategizing what happens after we detect. Should each platform have online filters to detect and censor hate speech on the spot? A New York Times opinion piece recently suggested that a levy on targeted ad revenue might give Facebook and Google a real incentive to change their dangerous business models as they relate to hate speech. Should detection simply bring the content to the user’s attention without censoring it, in case the user was about to, albeit unknowingly, retweet that hate? If you look at the experts dedicating their lives to this cause, you realize that censorship or filtering is the last thing they want. In fact, not by random luck (which would make a great story for later), many scientists at the forefront of social media research are staunch believers in free speech. They are deeply concerned about this risk and are doing all they can to ensure that the methodologies they develop do not promote censorship of speech, particularly political speech. Coming from Turkey, a country still heavily censored both externally and internally, I fully agree with them and applaud their efforts.


The remaining viable alternative is to use that knowledge and technology to promote counterspeech. But how? Although a couple of working models are already in play, this is still an evolving conversation that requires deeper discussion to solidify strategies.


The bottom line is that reliably detecting hate speech, understanding its impacts, and defining strategies to defeat it without resorting to censorship rules and regulations is not only a very challenging technical problem but also a sociological and political issue that must be solved at a societal level. Many things remain unclear and are yet to be developed, discussed, and agreed upon, with one exception: as the impact involves all of us, so should the solution. Stopping online hate starts with every one of us, whether through supporting and participating in the research happening at D-Lab, joining the dialogue, attending workshops and lectures, reaching out, donating to OHI research, engaging with friends and family members, or standing up for those targeted. At the very least, it involves using our own digital power (those likes, loves, retweets, etc.) to disseminate love and compassion instead of hate.



[1] Anti-Defamation League. Online Hate and Harassment: The American Experience. Accessed on April 13 2019 through


[2] Statista: The Statistics Portal. Daily time spent on social networking by internet users worldwide from 2012 to 2017 (in minutes). Accessed on April 15 2019 through


Simal Ozen Irmak

Simal is a neuroscientist and researcher passionate about translating data and scientific research into meaningful information to improve health and wellbeing. She has a BA in Business Administration and a PhD in Neuroscience. She has spent most of her career in academic research, investigating how certain brain regions communicate with each other during sleep and sleep-like brain states.