Fritz_X_DargesBlue42… Who Are You?
What does it mean to be a statistic? I have always wondered who the faces are behind the dots on a plot. What are their interests? What do they care about? Do they even know they are contributing to the construction of knowledge?
This past summer, I participated in the Data Science for Social Justice Workshop, where I had the opportunity to learn Python as a method for data science through an ethical lens. My data corpus consisted of posts and comments from Reddit, a social media platform and online community where people can post content and engage in discussions on a variety of topics. What is less widely known is that Reddit’s posts and comments are downloadable, offering a rich data set for exploring the discursive metrics manifest in each user’s interactions. I instantly had over 100,000 submissions at my fingertips. With so much information loaded, my computer churned incessantly and at times froze every other minute. There was an abundance of data to analyze: a researcher’s gold mine. And in typical researcher fashion, I instantly wanted to delve into the endless conclusions I could potentially draw.
As someone who once approached data with trepidation, whether from fear of not knowing how to engage or from ethical concerns, I was challenged by this workshop to grow in all the best possible ways. Researchers gain valuable insights from their findings, but for me it was the journey itself that provided the most profound lessons. Every data point represents a person, their voice, and a lived experience; thus, it is our responsibility as data scientists to treat our work with empathy and care. This point can easily get lost in the inundation of data. For my fellow data scientists looking to humanize the data collection experience, I offer three guiding rules to ensure that we honor the people behind the data. Through these rules, we can better appreciate the stories we are privileged to analyze.
Explorers, Discoveries, and Gold Mines
How easy is it to view information as if knowledge were meant to be conquered? Our downfall as researchers is that our training has centered on extracting information, consequently dehumanizing the very people we aim to support. The people behind the data are objectified into data points ripe with findings. Mohamed et al. (2020) proclaim that “algorithmic exploitation…take[s] advantage of (often already marginalized) people by unfair or unethical means, for [an] asymmetrical benefit” (p. 667, Mohamed et al.’s parenthesis). As researchers working with and within extractive institutions, we struggle to divorce ourselves from exploitative methodologies because the tenor of our colonial histories runs deep in our scientific systems. Information acquisition is embedded in colonial logics that we struggle to unsee and unlearn. Thus, to no surprise, the faces and stories represented in the data fade into qualitative and quantitative oblivion as these institutions profit in the name of “progressing” society.
I wanted to approach this project differently.
You see, this project, like many of my research endeavors, is personal. I decided to survey the conspiracy subreddit because I wanted to understand the mechanisms driving conspiracy theories. Growing up in the South, I was surrounded by conspiracy theories. I remember my older brother once refusing to vote for Obama during his reelection campaign out of fear that the Affordable Care Act would force everyone to have microchips implanted in their brains to spy on their minds. “You’re crazy,” I would retort in an effort to invalidate his concerns. However, had I taken a moment to reflect, I would have acknowledged that our government has indeed committed some atrocious acts (e.g., the horrific Tuskegee Syphilis Study and the unethical taking of Henrietta Lacks’s cells for cancer research). Nevertheless, every perspective holds value and deserves consideration for possessing explanatory power.
Before writing the code to analyze the data, I wanted to practice humanizing the data. In many instances, when researchers access open data sources, they download the data file, load it into a program to process it, and interpret the information. In doing so, they sometimes neglect a very important step: getting to know the people behind the data.
In the process of aiming to humanize the faces that comprise a data point, I have established three go-to rules.
Rule #1: A Researcher Must Always Find the Story Behind the Data
I opened the data file and began to peruse the submission posts. Admittedly, I was overwhelmed (the volleyball of doom was also not encouraging). I scrolled through the many submissions and read them aloud.
The first post I read: “Grimes is a witch and that's why Elon's stock value went down.” I chuckled as I said this aloud. “Creative,” I thought to myself. But this user only posted once in this data set. I continued to look for someone who posted multiple times, which is when I found Fritz_X_DargesBlue42 (username modified for anonymity). Out of the 100,000+ posts, Fritz_X_DargesBlue42 had posted 87 times. Out of the top 50,892 posts, Fritz_X_DargesBlue42 had 25 of the top posts, making Fritz_X_DargesBlue42 one of this subreddit’s most frequent and provocative contributors. They became the user I decided to understand better as a person.
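The search for a repeat contributor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the project: the records, the usernames, and the field name `author` are hypothetical placeholders standing in for rows of a downloaded submissions file.

```python
from collections import Counter

# Hypothetical placeholder records; the "author" and "title" field names
# are assumptions, not the actual schema of the downloaded data.
submissions = [
    {"author": "user_a", "title": "First post"},
    {"author": "user_b", "title": "Second post"},
    {"author": "user_b", "title": "Third post"},
    {"author": "user_a", "title": "Fourth post"},
    {"author": "user_b", "title": "Fifth post"},
    {"author": "user_c", "title": "Sixth post"},
]


def posts_per_author(rows):
    """Count how many submissions each author contributed."""
    return Counter(row["author"] for row in rows)


def repeat_contributors(rows, min_posts=2):
    """Return (author, count) pairs for authors with at least
    min_posts submissions, most prolific first."""
    counts = posts_per_author(rows)
    return [(author, n) for author, n in counts.most_common() if n >= min_posts]


for author, n in repeat_contributors(submissions):
    print(f"{author}: {n} posts")
```

A count like this only points toward whom to read; it is the reading itself, post by post, that begins to tell the story behind the number.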
On Reddit, there are some important numbers to keep in mind. A score combines upvotes and downvotes, through which users can support or reject another user’s post or comment.
Score = Upvotes – Downvotes
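As a minimal sketch of the formula above, the snippet below computes a score for a handful of posts and summarizes the range. The post records, their values, and the field names `upvotes`, `downvotes`, and `num_comments` are hypothetical placeholders; I am assuming separate vote counts are available, which may not hold for every data export.

```python
def score(upvotes, downvotes):
    """Score = Upvotes - Downvotes."""
    return upvotes - downvotes


# Hypothetical placeholder posts; values are invented for illustration only.
posts = [
    {"id": "p1", "upvotes": 120, "downvotes": 20, "num_comments": 15},
    {"id": "p2", "upvotes": 300, "downvotes": 50, "num_comments": 5},
    {"id": "p3", "upvotes": 90, "downvotes": 60, "num_comments": 40},
]


def summarize(rows):
    """Attach a score to each post and report the lowest score,
    the highest score, and the id of the most-commented post."""
    scored = [{**p, "score": score(p["upvotes"], p["downvotes"])} for p in rows]
    ranked = sorted(scored, key=lambda p: p["score"], reverse=True)
    most_commented = max(scored, key=lambda p: p["num_comments"])
    return {
        "lowest": ranked[-1]["score"],
        "highest": ranked[0]["score"],
        "most_commented": most_commented["id"],
    }


print(summarize(posts))
```

Keeping the comment count next to the score makes it possible to notice posts whose score is low but whose comment count is high.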
Why does this matter?
The score is a measure of how strongly the Reddit community approves of a post. Of the top conspiracy subreddit submissions, the lowest score is 173 and the highest is 76,121. Figure 1 displays the scores and number of comments for Fritz_X_DargesBlue42’s top 25 posts. The blue bars show their score for each post, and the red bars show the number of comments the corresponding post incited.
Figure 1. Dual-axis chart representing Fritz_X_DargesBlue42’s Reddit Upscore compared to the number of comments each post invited.
Figure 2. Word cloud representing the most frequent words in Fritz_X_DargesBlue42’s posts.
Figure 2 displays the keywords in Fritz_X_DargesBlue42’s posts. Words that were the most salient in their posts were “Sub,” “white,” “Israel,” “Black,” “media,” and “diversity.” The intersection of these words suggests an overlap of topics ranging from race to geopolitics and social media. Although Figure 2 presents a glimpse of Fritz_X_DargesBlue42, my interpretations remain speculative. I decided to look further into the content of their posts.
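The word counts that underlie a word cloud like Figure 2 can be approximated with the standard library alone. The sketch below is an assumption about how such counts might be produced, not a record of the tool actually used; the sample titles are neutral placeholders, and the short stopword list is invented for illustration.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list; a real analysis
# would likely use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for"}


def word_frequencies(texts, stopwords=STOPWORDS):
    """Lowercase each text, split it into word tokens, drop stopwords,
    and count how often each remaining word appears."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


# Placeholder titles, not taken from the data set.
titles = [
    "The media and the message",
    "Media coverage of the vote",
    "A vote on the message board",
]

freq = word_frequencies(titles)
print(freq.most_common(3))
```

Frequencies like these show which words recur, but not how they are used, which is why the interpretation stays speculative until the posts themselves are read.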
Looking at Post 2 and Post 3, I can see that the scores were high: 12,286 and 10,668, respectively. I sought to further explore both posts, but found that Reddit had deleted Post 3, titled “Wanna guess why Reddit has been purging subs and banhammering like crazy?” I guess we will never know…
Figure 3. Post 2 from Fritz_X_DargesBlue42.
Post 2 (Figure 3), however, remained available. Fritz_X_DargesBlue42 posted it at the onset of the COVID-19 pandemic to share an unconfirmed narrative about China obstructing a whistleblower’s efforts to warn the public about the virus. On further research, I found that what made this post unfounded was its claim that 95% of virus cases could potentially have been prevented. However, a recent New York Times article (2024) tells the story of Professor Zhang Yongzhen, who defied the Chinese government to become the first to publish the COVID-19 genome to a global virology database. This article, to an extent, corroborates what Fritz_X_DargesBlue42 proclaimed in Post 2. Despite Post 2 possessing some accuracy, I want to acknowledge that such posts, within a conspiracy subreddit, have the potential to promote a dangerous narrative about a whole group of people. In this case, Post 2 has the potential to incite a false, generalized narrative about people of Chinese descent and potentially about all people of Asian, Asian American, or Pacific Islander (AAPI) ancestry. I am unsure if Fritz_X_DargesBlue42’s motive was to disclose pertinent information at the pandemic’s onset or to perpetuate a harmful narrative that portrayed the AAPI community in a negative light. Nonetheless, my quest to understand this person continued.
Another post that drew my attention was Post 12, which had a low score but the most comments. Fritz_X_DargesBlue42 titled Post 12: “Transgender book 'Beyond Magenta' contains graphic descriptions of a 6-year-old performing oral sex on multiple men and this book is in the youth section in many libraries.” As a queer person, I was distraught to see that Fritz_X_DargesBlue42 used their voice to depict the queer community in such a perverse light. Amidst the wave of anti-trans rhetoric occurring across the nation, Post 12 is reflective of the dangerous discourse that so easily vilifies the trans community and, more specifically, endangers trans children.
Barricading the Ripple Effect of Dehumanization
Many forms of dehumanization come into play when engaging with Post 12. The first is Fritz_X_DargesBlue42’s blatant attempt to hypersexualize trans children. This post bolsters unfounded claims about gender-affirming care and further stigmatizes an already marginalized group. Post 12 immediately diverts attention from the discrimination trans children encounter while upholding baseless assertions about gender-affirming care. To be clear, many cisgender children receive gender-affirming care, not just trans children (see Schall & Moses, 2023). Nevertheless, Fritz_X_DargesBlue42’s Post 12 is the beginning of what I deem the ripple effect of dehumanization. This phenomenon refers to an initial act of dehumanization that spreads outward to create social harm. Post 12 sets off a ripple effect that perpetuates false narratives. As netizens, we must recognize that what we put into the internet has a ripple effect.
Through their momentum, ripple effects become waves. Another form of dehumanization occurred in Post 12’s 188 comments, which resulted in 3,222 upvotes. The posts and upvotes are visible metrics but fail to account for the silent observers. These are the users who watch without engaging so as to avoid leaving a digital footprint on such a polemic parley. Although we cannot determine if silent observers approve or disapprove of Post 12, the mere dissemination of falsehood is concerning because we should not underestimate misinformation’s influence. That established, Fritz_X_DargesBlue42’s Post 12 caused a dehumanizing ripple effect that invited a wave of similarly vitriolic language that confirmed anti-trans biases. We cannot separate this wave from the wave of anti-trans movements we see affecting the nation.
And like all waves, these must inevitably crash. At this moment, the researcher risks becoming another agent of dehumanization, for my immediate reaction to the words of Fritz_X_DargesBlue42 and their fellow subreddit conspirators was a visceral ire. I found myself instinctively dismissing Fritz_X_DargesBlue42 and the other subredditors with the logic that if they were dehumanizing others, then they deserved my dehumanization of them. Yet, this logic is unproductive and deeply flawed. When we cave in to a cycle of dehumanization, there is no hope for the social collective. When researchers spend enough time with the data, the data begins to have an effect on us.
Rule #2: A Researcher Must Protect Themselves
Fritz_X_DargesBlue42, you have deeply saddened me.
In reviewing their other posts, I found rhetoric filled with negative portrayals of marginalized communities.
Fritz_X_DargesBlue42, I looked into your username. Fritz Darges was a military officer in Hitler’s close circle (Nash, 2021) who was eventually interned by the U.S. military until his release in 1948 (Mitcham, 2001). Until his dying days, Darges held Hitler in high regard (Hall, 2009). When I investigated Fritz_X_DargesBlue42 further, I found that Reddit had suspended their account.
Protecting oneself entails not falling into the trap of further dehumanization. Rule #2 is easier said than done. Researchers must avoid the binary of either being fully immersed in their data or maintaining detachment, for both extremes compromise integrity and the researcher’s well-being. There exists a tertiary—a middle path that humanizes the researcher and the research. Doing so consists of the researcher preserving proximity as a humanizing approach while establishing boundaries that promote the researcher’s well-being. Rule #2 entails this delicate balance of embracing the complexity of the human experience. Such complexity prioritizes the researcher’s capacity to engage meaningfully and responsibly. In establishing this tertiary positionality, the researcher fosters a deeper sense of humanity in their work.
Rule #3: A Researcher Must Still Humanize Participants
I found myself falling into the trap of homogenizing everyone on the conspiracy subreddit. I told myself I should not be surprised by Fritz_X_DargesBlue42’s posts.
How do I humanize someone who aims to dehumanize others?
How the tables have turned… The researcher intending to safeguard humanity in a data point now realizes this data point has upheld dehumanizing narratives. Humanizing people like Fritz_X_DargesBlue42 does not necessarily entail justifying their actions. To humanize users like Fritz_X_DargesBlue42, the researcher must understand the broader context that has led to their behavior. Researchers must make a concerted effort to find the people they hope to understand amongst the data.
When I zoom out and proceed with my project, I see you, Fritz_X_DargesBlue42. I see you in the data points, and I know a part of your story. I understand how a part of your worldview distrusts the media due to its association with disinformation.
Perhaps you feel that the world is against you.
Although you may not agree with my existence, I see you. And in doing so, I reclaim the humanity within myself that your posts have aimed to erase and deny.
With these rules grounding my research approach, I continue my analysis, guided by this ethos that shapes every aspect of my inquiry.
References
- Bradsher, K. (2024, May 1). Chinese Scientist Who Shared Covid Sequence Protests Lab Closure. The New York Times. https://www.nytimes.com/2024/05/01/world/asia/chinascientist-covid-lab.html
- Hall, A. (2009, October 30). Memoirs of Hitler aide could finally end Holocaust claims. The Telegraph. https://www.telegraph.co.uk/news/6461171/Memoirs-of-Hitler-aide-couldfinally-end-Holocaust-claims.html
- Mitcham, S. W. (2001). Crumbling empire: The German defeat in the East, 1944.
- Mohamed, S., Png, M. T., & Isaac, W. (2020). Decolonial AI: Decolonial theory as sociotechnical foresight in artificial intelligence. Philosophy & Technology, 33, 659-684.
- Nash, D. E. (2021). From the Realm of a Dying Sun: Volume III: IV. SS-Panzerkorps from Budapest to Vienna, February–May 1945 (Vol. 3). Casemate.
- Schall, T. E., & Moses, J. D. (2023). Gender‐Affirming Care for Cisgender People. Hastings Center Report, 53(3), 15-24.