Unlock the Joy and Power of Reading in Language Learning

August 21, 2023

This summer, I participated in the Data Science for Social Justice (DSSJ) Workshop. As part of the workshop, I analyzed the Language Learning subreddit to uncover the emotions that language learners associate with different methods of improving their proficiency. I took a particular interest in this subreddit because it resonated with my own journey in learning English, in which reading for pleasure transformed my English speaking and writing skills. My English-learning experience inspired my passion for promoting the joy and power of reading to all language learners. In this blog post, I will share the findings from my analyses and distill complex language learning theories into actionable tips any learner can follow to find joy and empowerment in their language learning journey.

How Reading for Pleasure Changed My Life

Back in college, I decided to apply to graduate programs in the United Kingdom, which required achieving high scores on IELTS, a standardized English proficiency test. However, despite studying English intensively for 15 years, I could not speak or write spontaneously in English. I had to translate my thoughts from Mandarin into English, word by word, laboriously piecing the words together. To improve my English, I forced myself to practice speaking with friends and native speakers, but made very little improvement. Speaking English was almost physically painful!

Upon a professor's recommendation, I started reading in English for fun. I found the process so enjoyable that I devoured over 20 English books and hundreds of articles in two months. I stopped needing to force myself to practice English: I was simply riding a continuous flow of joy, enjoying every moment. 

This immersion led to a breakthrough. A voice in my mind started talking in English without my conscious effort. I could finally speak and write spontaneously in English! My IELTS scores improved significantly, especially in speaking and writing. Fortunately, I was admitted into my dream program, the master's program in Applied Linguistics and Second Language Acquisition at the University of Oxford, which eventually led me to UC Berkeley for my PhD studies.

I would not have fulfilled my dream if I had not discovered the joy and power of reading. I want all language learners to experience the same joy and empowerment that I experienced. That's why I see it as my life mission to share this joy with every language learner so that they can fulfill their dreams, too.

Emotions Associated with Different Language Learning Methods

Beyond my personal experience, a large body of empirical research and theory supports the substantial role of reading, or input in general, in language acquisition (e.g., Day & Bamford, 1998; Krashen, 2004; Landauer & Dumais, 1997). This motivated me during the DSSJ workshop to examine whether other language learners shared similar experiences of struggling with and suffering from forced output (i.e., speaking and writing) practice but benefiting from joyful input (i.e., reading and listening). I asked: What attitudes and sentiments does the Language Learning subreddit community have about input and output practice?

The Language Learning subreddit is an online community where people interested in learning other languages can ask questions and share tips related to language learning. Thus, its contents serve as a suitable data set for answering my research question. My hypothesis was that Redditors in the Language Learning subreddit may have negative sentiments towards output practice and neutral to positive sentiments towards input. To test this hypothesis, I identified 293 posts containing keywords corresponding to input, including variations of "input"/"read"/"listen"/"watch", and 195 posts containing keywords corresponding to output, including variations of "output"/"practice"/"speak". I used Term Frequency-Inverse Document Frequency (TF-IDF), a metric that reflects how important a word is to one text within a collection of texts, to quantify the occurrence of keywords in posts.
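For readers curious about what this step can look like in practice, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the pipeline behind the counts above: the regular expressions, the three example posts, and the function names are all hypothetical, and the TF-IDF formula shown (term frequency times log(N / document frequency)) is only one common variant.

```python
import math
import re
from collections import Counter

# Hypothetical keyword patterns: the post names the keyword families,
# but the exact word list and matching rules are assumptions.
INPUT_PATTERN = re.compile(
    r"\b(input|read(s|ing)?|listen(s|ed|ing)?|watch(es|ed|ing)?)\b", re.IGNORECASE
)
OUTPUT_PATTERN = re.compile(
    r"\b(output|practic(e|es|ed|ing)|speak(s|ing)?|spoke|spoken)\b", re.IGNORECASE
)


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(posts):
    """Return one {term: weight} dict per post, using tf * log(N / df)."""
    tokenized = [tokenize(p) for p in posts]
    n_posts = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per post
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty posts
        weights.append({
            term: (count / total) * math.log(n_posts / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up example posts, not taken from the subreddit.
posts = [
    "I love reading novels and listening to podcasts every day",
    "Speaking practice makes me nervous every day",
    "Which app is best for flashcards",
]
input_posts = [p for p in posts if INPUT_PATTERN.search(p)]
output_posts = [p for p in posts if OUTPUT_PATTERN.search(p)]
weights = tf_idf(posts)
print(len(input_posts), len(output_posts))
print(weights[0]["reading"], weights[0]["every"])
```

In this toy collection, "reading" appears in only one post while "every" appears in two, so "reading" receives the higher weight in the first post. That is the sense in which TF-IDF rewards words that are distinctive for a text.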

Importantly, in the "input" keywords, I also included the phrases "comprehensible input", "comprehensive input" and "input hypothesis". These phrases are related to Dr. Stephen Krashen's Comprehensible Input Hypothesis (Krashen, 1989), which argues that we acquire language by understanding what we read and listen to in that language. Krashen makes a clear distinction between language “learning” and language “acquisition”. According to Krashen, “learning” a second language happens through rote memorization and practice of vocabulary and grammar rules. The end result is explicit or metalinguistic knowledge that cannot be used spontaneously. “Acquiring” a second language is similar to how we acquire our first language, which is through input and comprehension. The end result is implicit knowledge that we may not be able to explain but can use spontaneously.

Input Is More Associated with Positive Sentiments

I calculated the "sentiment" scores of these posts using a sentiment analysis tool called VADER (Valence Aware Dictionary and sEntiment Reasoner), which is based on a word lexicon and a set of rules. I will not go into the technical details of this tool, but the general idea is that VADER has a dictionary of the sentiment or emotion associated with each word (e.g., "sad" expresses negative sentiment whereas "excited" expresses positive sentiment). This tool can quantify the "sentiment" or emotion expressed through a piece of text on a scale from -1 to 1, where -1 means extremely negative emotions and 1 means extremely positive emotions.
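To make the dictionary idea concrete, here is a toy scorer in plain Python. It is not VADER itself: the six-word lexicon and its valence values are made up for illustration, and VADER's rules for negation, intensifiers, and punctuation are left out. The final squashing step, which divides the summed valence by the square root of its square plus a constant (15 here), is modeled on how VADER normalizes its compound score into the range from -1 to 1.

```python
import math
import re

# Hypothetical mini-lexicon with made-up valence values. VADER's real lexicon
# is far larger and is combined with rules that this sketch omits.
TOY_LEXICON = {
    "excited": 2.0,
    "enjoy": 2.0,
    "love": 3.0,
    "sad": -2.0,
    "scared": -2.0,
    "ashamed": -2.0,
}


def toy_sentiment(text, alpha=15.0):
    """Sum the valence of known words, then squash the sum into (-1, 1)."""
    words = re.findall(r"[a-z']+", text.lower())
    total = sum(TOY_LEXICON.get(word, 0.0) for word in words)
    return total / math.sqrt(total * total + alpha)


positive = toy_sentiment("I love reading and I enjoy every page")
negative = toy_sentiment("I am too scared and ashamed to speak")
neutral = toy_sentiment("I study for one hour")
print(positive, negative, neutral)
```

The first sentence scores above zero, the second below zero, and the third exactly zero because none of its words appear in the toy lexicon.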

Figure 1 shows the distribution of the sentiment scores comparing the "input" posts (represented by the blue bars) and the "output" posts (represented by the orange bars). In the range from 0.4 to 1, the blue bars are consistently higher than the orange bars, whereas in the range from -1 to 0.4, the orange bars are almost always higher than the blue bars. In other words, the "output" posts have a higher proportion of negative to neutral sentiment whereas the "input" posts have a higher proportion of positive sentiment. Thus, Redditors tend to express positive emotions when talking about “input” but negative emotions when talking about “output” or “practice”.

Figure 1. Distribution of Sentiments of “Input” Posts Vs. “Output” / “Practice” Posts

Output Practice is Associated with Anxiety and Fear

To further identify what specific emotions are more associated with the “input” words and “output” / “practice” words, I fit a Word2vec model on all the posts in the Language Learning subreddit from June 2005 to December 2022. Word2vec is a natural language processing technique that derives numeric representations of words in a corpus from their co-occurrence patterns. The underlying psychological theory of this technique is called Distributional Semantic Theory, which argues that words that occur in similar contexts tend to have similar meanings. For instance, "physician" and "doctor" are similar in meaning because both frequently occur with words like “patient” and “medicine”, even though “physician” and “doctor” seldom occur with each other. Psychology studies show that this theory is psychologically plausible and models based on this theory can simulate human children’s rates of vocabulary learning through reading (Günther et al., 2019; Landauer & Dumais, 1997). 
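The "physician" and "doctor" example can be reproduced on a tiny scale. The sketch below is a deliberate simplification: instead of Word2vec, which learns dense vectors with a small neural network, it simply counts which words appear near each other and compares those counts with cosine similarity. The five-sentence corpus, the stopword list, and the function names are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

STOPWORDS = {"the", "a"}  # tiny illustrative stopword list


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = [t for t in sentence.lower().split() if t not in STOPWORDS]
        for i, word in enumerate(tokens):
            start, stop = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up corpus in which "doctor" and "physician" never share a sentence.
corpus = [
    "the doctor examined the patient",
    "the physician examined the patient",
    "the doctor prescribed medicine",
    "the physician prescribed medicine",
    "the teacher graded the homework",
]
vectors = cooccurrence_vectors(corpus)
doctor_physician = cosine(vectors["doctor"], vectors["physician"])
doctor_teacher = cosine(vectors["doctor"], vectors["teacher"])
print(doctor_physician, doctor_teacher)
```

Even though "doctor" and "physician" never occur together, they share the same neighbors ("examined", "patient", "prescribed", "medicine"), so their vectors point in the same direction. "Teacher" shares no neighbors with "doctor" in this corpus, so the similarity is zero.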

We can use this model to discover language bias. For instance, we can check whether people are more likely to associate “boss” with male and “secretary” with female. In the Language Learning subreddit, I found that "output" words are "biased" towards a distinct set of words that represent negative emotions. These words range from relatively mild ones like “embarrassed”, “insecure”, “shy”, “awkward”, “nervous” to more intense ones like “anxious”, “terrified”, and “scared”. Figure 2 shows a graph of these words represented in the semantic space constructed by the Word2vec model. In this semantic space, words that are closer together have more similar meanings.

Figure 2. Negative Emotions “Biased” Towards “Output” / “Practice” Words
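One simple way to put a number on this kind of "bias" is to compare how similar an emotion word is, on average, to the "output" words versus the "input" words. The sketch below is an assumption about how such a comparison could be done, not a description of the exact measure behind Figure 2. The three-dimensional vectors are made up by hand; with a trained model, the vectors would come from the model itself.

```python
import math

# Made-up 3-dimensional vectors for illustration only. Real Word2vec vectors
# are learned from the corpus and usually have a hundred or more dimensions.
TOY_VECTORS = {
    "speak":    [0.9, 0.1, 0.2],
    "practice": [0.8, 0.2, 0.1],
    "read":     [0.1, 0.9, 0.2],
    "listen":   [0.2, 0.8, 0.1],
    "nervous":  [0.7, 0.1, 0.6],
    "grammar":  [0.5, 0.5, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def association_gap(word, output_words, input_words, vectors=TOY_VECTORS):
    """Mean similarity to the output words minus mean similarity to the input words."""
    def mean_similarity(targets):
        sims = [cosine(vectors[word], vectors[t]) for t in targets]
        return sum(sims) / len(sims)
    return mean_similarity(output_words) - mean_similarity(input_words)


OUTPUT_WORDS = ["speak", "practice"]
INPUT_WORDS = ["read", "listen"]
nervous_gap = association_gap("nervous", OUTPUT_WORDS, INPUT_WORDS)
grammar_gap = association_gap("grammar", OUTPUT_WORDS, INPUT_WORDS)
print(nervous_gap, grammar_gap)
```

A positive gap means the word sits closer to the "output" words, and a gap near zero means it is equally close to both groups. In this toy setup "nervous" leans towards output while "grammar" is balanced.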

Here is an example post that illustrates how some Reddit users describe their negative emotions towards output/practice:

“...every time i speak in korean i sound like an idiot, make 1000+ mistakes, am incredibly slow and struggle a lot to put sentences together, its been like this for years and ive never gotten better, im too scared to keep talking now because i feel so stupid and ashamed for it. what’s made this worse is i was watching a movie i was excited for that was korean and realised that even hearing the language now is making me feel horrible and guilty because of her and its reminding me of how poor and pathetic i am at the language and how im not trying or studying hard enough even though i do.”

This post is very similar to my own experience trying to improve my output skills by forcing myself to practice them when I had not had sufficient input and implicit knowledge to speak and write spontaneously. 

Input Is Associated with Joy and Sustainability

I didn't find any specific positive-emotion words that are “biased” towards either the "output" or the "input" words. However, I found that words such as "theory", "krashen", and "stephen krashen" are more associated with the "input" words. I then identified some "input" posts with high "sentiment" scores, and found that many of them directly cited Krashen or alluded to Krashen’s theories. As an example, the following are quotes from a popular post where someone shares their successful experience learning Japanese through a few thousand hours of joyful input:

“After around 21 months I was done with "learning Japanese" … I enjoyed what I read and watched, but I did not enjoy just progressing for the sake of learning Japanese… I dropped any form of vocabulary/grammar study as well as tracking my journey in detail… At that time, I had learned enough Japanese to just be able to watch/read what I want, understand and enjoy it. … I changed my whole view point from being motivated my goals to just doing what I really, honestly, genuinely and truly enjoy, no pressure and no goals. It almost felt like I was free… I think that if I'd have continued this goal-driven way I would have eventually quit, and I'm really glad I didn't. ... Language learning is all about time. We're talking about hundreds and thousands of hours to really get good. This time must be spent in an enjoyable way. If you're doing something for thousands of hours and you're having no fun, you're just torturing yourself.”

Notice that this Redditor alludes to Krashen’s distinction between language “learning” and “acquisition”. They also emphasize that input can be a joyful and sustainable way of learning a language, which resonates with my own experience and research.

From this analysis of the posts from the Language Learning subreddit, we can get a general picture: many learners have negative emotions towards output practice, ranging from mild anxiety to terror, whereas the value and joy of input have been acknowledged by some successful language learners.

Actionable Steps for Joyful and Effective Language Learning

To end this post, I want to provide a brief summary of the research and theory on the role of input in second language acquisition, as well as actionable steps for improving your language learning experience through joyful input.

Stephen Krashen’s Comprehensible Input Hypothesis, despite its popularity and positive influence on many language learners, has been controversial in the field of second language acquisition. The pushback usually focuses on the scientific rigor of how the theory is phrased. For instance, the definition of “comprehensible” is unclear and makes the theory unfalsifiable (in other words, there is no way to prove that this theory is wrong). Additionally, Krashen’s theory does not explain why input can lead to language acquisition. Other theories, such as Distributional Semantic Theory, the same theory underlying the Word2vec model mentioned above, provide a plausible explanation of why reading leads to growth in vocabulary, a key aspect of language learning. For those who are interested, I have introduced this theory to the field of second language acquisition in my recent publication in Applied Linguistics (Wang-Kildegaard & Ji, 2023).

However, the debates above may not be of interest to learners who simply want to improve their language skills and language learning experience. The bottom line is that researchers in the field generally support the indispensable role of input. More importantly, the reality is that most language learners have not been exposed to sufficient amounts of input, and many are suffering from strong negative emotions related to forced output practice. Reading a lot and listening to content that you enjoy can be a major boost to language learning. This is your key to finding joy in learning a new language, just as it was for me and many others. Here's how you can unlock the joy and power of input in language learning:

  1. Start with the Basics: If you're just beginning, consider taking an introductory language course or using vocabulary memorization apps to learn the most frequent words in the language you're learning. These words form the core of the language and are essential to get started.

  2. Immerse Yourself in Materials that Interest You: Look for content about topics that you are genuinely interested in (and better yet, have knowledge of). Whether it's books, movies, or podcasts, surround yourself with the language by reading and listening to it as much as possible. Find joy in the process, just like you would with any hobby.

  3. Consider Graded Readers: Before you’re ready for materials written for native speakers, try graded readers. Many languages have them, and they are designed to be fun to read for language learners at different levels.

  4. Enlist the Help of AI: If a piece of material that you are interested in reading is too difficult at the moment, consider using generative AI like ChatGPT to simplify it for you using vocabulary that you know.

  5. Don’t Force Output: If speaking or writing in the language feels stressful, focus on input through reading and listening. Speaking and writing will come more naturally with time as you receive a sufficient amount of input.

  6. Be Patient and Enjoy the Process: Language learning is a marathon, not a sprint. Enjoy the journey and don't rush yourself. Finding joy in the process can lead to better results and a more fulfilling experience.

I believe that understanding and spreading awareness about the power of reading could democratize language learning, which has implications for social justice. Many learners rely on costly test prep courses or tutoring, which can perpetuate socioeconomic disparities. Worse yet, many are suffering from strong negative emotions caused by these more “intentional” ways of learning language. On the other hand, extensive reading offers an enjoyable, cost-effective, and equitable method for language acquisition, potentially leveling the playing field for learners from diverse backgrounds.


Day, R. R., & Bamford, J. (1998). Extensive reading in the second language classroom. Cambridge University Press.

Günther, F., Rinaldi, L., & Marelli, M. (2019). Vector-space models of semantic representation from a cognitive perspective: A discussion of common misconceptions. Perspectives on Psychological Science, 14(6), 1006-1033.

Krashen, S. D. (1989). We acquire vocabulary and spelling by reading: Additional evidence for the input hypothesis. The Modern Language Journal, 73(4), 440-464.

Krashen, S. D. (2004). The power of reading: Insights from the research. ABC-CLIO.

Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211-240.

Wang-Kildegaard, B.-W., & Ji, F. (2023). Context synthesis accelerates vocabulary learning through reading: The implication of distributional semantic theory on second language vocabulary research. Applied Linguistics. DOI: 10.1093/applin/amad014