Natural Language Processing (NLP)

Tactics for Text Mining non-Roman Scripts

April 15, 2024
by Hilary Faxon, Ph.D. & Win Moe. Non-Roman scripts pose particular challenges for text mining. Here, we reflect on a project that used text mining alongside qualitative coding to understand the politicization of online content following Myanmar’s 2021 military coup.

Addison Pickrell

IUSE Undergraduate Advisory Board
Mathematics
Sociology

Addison (Class of '27) is an aspiring mathematician and social scientist. He loves collecting books he'll never read, is an open-source and open-access advocate, and is an aspiring community organizer and systems disrupter. Ask him about community-based participatory action research (CBPAR), critical pedagogy, applied mathematics, and social science.

D-Lab & Graduate Division create inclusive data science summer program

August 9, 2023
by Vanessa Navarro Rodriguez. UC Berkeley's Social Sciences D-Lab and Graduate Division created the Data Science for Social Justice Program to address underrepresentation in data science. The program teaches diverse students critical data analysis and its applications in addressing societal injustices. The 8-week free summer course for admitted University of California students focuses on Python programming, Natural Language Processing, and value-informed data practices. It aims to empower students from underrepresented backgrounds and to bridge STEM with social justice. This blog post elaborates on the program's creation and features one of the DSSJ students, Robin López, and his reasons for participating.

Unlock the Joy and Power of Reading in Language Learning

August 21, 2023
by Bowen Wang-Kildegaard. I share my story of how reading for pleasure transformed my English speaking and writing skills. This experience inspired my passion to promote the joy and power of reading to all language learners. Using natural language processing techniques, I dive into the Language Learning subreddit, revealing a trend: Learners are often highly anxious about output practices, but are generally positive about input methods like reading and listening. I then distill complex language learning theories into actionable language learning tips, emphasizing the value of extensive reading for pleasure, pointing to potential methods like using ChatGPT for customization of reading materials, and advocating for joy in the learning journey.

My Summer Exploring Data Science for Social Justice: Learnings, Tensions & Recommendations

September 5, 2023
by Genevieve Smith. This summer I joined the D-Lab-hosted Data Science for Social Justice workshop at UC Berkeley, diving into Python (including TF-IDF, sentiment analysis, word embeddings, and more) with a lens toward leveraging data science for social justice. My team explored a Reddit channel on abortion and used computational analysis to answer key questions about abortion access before versus after Roe v. Wade was overturned. Computational social science is incredibly powerful, but I continue to grapple with tensions, particularly as they relate to employing machine learning and large language models in international research, and I end with key recommendations for CSS practitioners.

Peter Amerkhanian

Graduate Student Researcher (GSR), Instructor
Goldman School of Public Policy (GSPP)

I’m a D-Lab GSR and a graduate student pursuing the Goldman School’s Master of Public Policy and the I School’s Graduate Certificate in Applied Data Science. I have 5 years of experience working on data problems in government and nonprofits. I’m interested in social policy, program evaluation, and computational methods. Python is my principal language, but I’ve developed experience using and teaching a variety of other tools, including R, Excel, Tableau, and JavaScript. I deeply enjoy teaching data science methods and am excited to be a part of the D-Lab.

Abhishek Roy

IUSE Undergraduate Advisory Board
Economics
Data Science

I'm Abhishek Roy and I'm double majoring in Economics and Data Science. I've been a part of D-Lab's IUSE project since Spring 2020 and have truly found an organization that is not only passionate about Data Science but also strives to expand its reach equitably to all communities. I am involved in Research and Project Management roles in various departments and labs at Berkeley and I'm an Editor at the Berkeley Economic Review. I love diving into anything at the intersection of Data Science, Economics, Business, and Computational Social Science. Whenever I'm free, I love writing...

Working with State-of-the-Art NLP Models: A Friendly Introduction to Hugging Face

December 13, 2021

We often read about the many new advancements being made in the field of Natural Language Processing (NLP). Each month, leading organizations release new models that seem like magic to us, such as models that can write their own code based on user prompts [1] or that help answer our queries when we use Google Search [2]. Large AI research groups like OpenAI and Google spend many years and pour millions of...

Ilya Akdemir

Data Science Fellow
School of Law

Ilya is a JSD candidate at UC Berkeley School of Law. His research focuses on natural language processing and machine learning applications that are motivated by both theoretical and practical questions in the legal domain.

Brooks Jessup, Ph.D.

Data Science Fellow
History

Brooks received his Ph.D. in History from UC Berkeley and was trained in Data Science at General Assembly. His work applies digital tools and methods to the study of modern cities and urban issues. At D-Lab, he teaches and consults on data analytics, machine learning, geospatial analysis, and natural language processing with Python and SQL.