Text Analysis

Spencer Le

Data Peer Consultant, UTech
Computer Science
Data Science

I am a senior majoring in Computer Science and minoring in Data Science. I love crunching down big data and analyzing it in order to help solve real-life issues. In my free time, I like jamming out to music, drawing, studying history, and posting on my foodstagram. If you have any questions regarding Computer Science or Data Science, please stop by!

Twitter data extraction with Selenium

March 1, 2022

Introduction

With online communities and social networks serving as important sites for computational social science research, Twitter has quickly become a popular data source for researchers (Frey et al. (2020), Kusen et al. (2017), Rao et al. (2010) and Ru et al. (2021)). This blog post will demonstrate one way to extract twitter data without using the Twitter API. This is especially useful for researchers who are new to exploring the use of Twitter data in their research, looking to develop a baseline corpus for a research question they are newly...

PoliPy: A Python Library for Scraping and Analyzing Privacy Policies

February 8, 2022

In light of recent scandals involving the misuse and improper handling of personal data by large corporations, advocacy groups and regulators alike have given increased attention to the issue of consumer privacy [e.g., 1, 2, 3, 4, 5]. National and local governments have been enacting privacy legislation that requires companies to minimize the amount of data they collect, deters the collection of sensitive data, limits the purposes for which the data are used, and critically, gives users more transparency into data collection and use.

As part...

Python Text Analysis Fundamentals: Parts 1-3

February 15, 2022, 2:00pm
This three-part workshop series will prepare participants to move forward with research that uses text analysis, with a special focus on humanities and social science applications.

Jennifer Kaplan

Consultant
French

Jennifer is a first-year graduate student in the Romance Languages & Literatures program here at Berkeley. She has experience conducting ethnographic fieldwork and is passionate about qualitative research methods.

Python Text Analysis Fundamentals: Parts 1-3

November 8, 2021, 12:00pm
This three-part workshop series will prepare participants to move forward with research that uses text analysis, with a special focus on humanities and social science applications.

Text Analysis for Public Health

October 5, 2021
October 5th, 2021 - another day in the global pandemic. Average Joes are busy tweeting about it, politicians give interviews on the latest plans, and newspapers publish article after article on vaccination levels, case counts, and the booster shot. That’s a ton of information. So much in fact, that it would be pretty nice to have some computer assisted help to sort through it. Enter stage right: text analysis. Just what is it, and in the midst of COVID-19, how can it be used to advance public health? Text analysis is a family of analytic techniques used to identify patterns and meaning from unstructured text, that is, text that a computer can’t readily understand. Aka, most qualitative data. And there is a lot of that sort of data floating around. We’re talking tweets, Reddit posts, and emails, but also electronic health records (EHRs), books, and even academic research. You’ll probably agree that in that list alone, there’s a lot of valuable data!

Python Text Analysis Fundamentals: Parts 1-3

September 21, 2021, 10:00am
This three-part workshop series will prepare participants to move forward with research that uses text analysis, with a special focus on humanities and social science applications.

Ilya Akdemir

Data Science Fellow
School of Law

Ilya is a JSD candidate at UC Berkeley School of Law. His research focuses on natural language processing and machine learning applications that are motivated by both theoretical and practical questions in the legal domain.

Brooks Jessup, Ph.D.

Data Science Fellow
History

Brooks received his Ph.D. in History from UC Berkeley and was trained in Data Science at General Assembly. His work applies digital tools and methods to the study of modern cities and urban issues. At D-Lab, he teaches and consults on data analytics, machine learning, geospatial analysis, and natural language processing with Python and SQL.