Software Output Interpretation

Theo Snow

Availability: By appointment only

Consulting Areas: Python, R, SQL, SAS, Databases & SQL, Data Manipulation and Cleaning, Data Science, Data Visualization, Geospatial Data, Maps & Spatial Analysis, Machine Learning, Mixed Methods, Qualitative methods, Surveys, Sampling & Interviews, Regression Analysis, Means Tests, Software Output Interpretation, Other, Excel, Git or Github, RStudio, RStudio Cloud, Tableau

Anusha Bishop

Availability: By appointment only

Consulting Areas: Python, R, Cloud & HPC Computing, Data Sources, Data Visualization, Geospatial Data, Maps & Analysis, Machine Learning, Research Design, Cluster analysis, Experimental design, Hierarchical Models, High dimensional statistics, Means Tests, Nonparametric methods, Regression Analysis, Software Output Interpretation, Spatial statistics, Bash or Command Line, Excel, Git or Github, RStudio

Sohail Khan

Data Science Fellow 2024-2025
School of Information

Hey everyone, I’m Sohail, a first-year Master’s student studying Data Science at the I-School. I am interested in the intersection of Computer Science, Data Science, and Cognitive Psychology, and in using these tools to understand, discover, and drive the development of assistive technologies.

I have experience building with brain-computer interfaces and developing distributed data processing applications, and I am currently working on a large-scale archival project aimed at preserving the history and memory of resistance movements through an embedding based...

Yue Lin

Data Science Fellow 2024-2025
Political Science

Yue is a Ph.D. student in Political Science at the University of California, Berkeley, with a Designated Emphasis on Political Economy. Using mixed methods, she studies foreign lobbying, geopolitical risk, and economic security to understand when, how, and why multinational corporations become the targets and weapons of state power rivalry.

Alex Stephenson

Senior Data Science Fellow
Political Science

I am a Ph.D. student in the Travers Department of Political Science. My primary research interests are military organizations, policing, the determinants of political violence, and causal inference. I am also interested in creating tools to make software easier to use for non-technical political scientists.

Amanda Glazer

Instructor
Statistics

Amanda is a PhD candidate in the statistics department at Berkeley. Her research focuses on causal inference with applications in education, political science and sports. Previously she earned her Bachelor’s degree in mathematics and statistics, with a secondary in computer science, from Harvard.

Chirag Manghani

Consultant
School of Information

Chirag is a second-year graduate student at the I-School. Proficient in Python, Java, R, and SQL, he works across software application development, machine learning, and data science. He is particularly interested in data analysis and statistical methods, and in connecting theory with practice. Chirag's dedication to excellence, adaptable mindset, and curiosity make him a dynamic problem solver in the ever-evolving tech landscape.

Deibi Sibrian

Data Science for Social Justice Fellow 2024

Deibi is a Ph.D. student in the Department of Environmental Science, Policy, and Management, centering critical interdisciplinary ecology and multispecies justice. Deibi coined the term "Cryptonocene," an interdisciplinary framework, to study the socio-environmental health impacts of cryptocurrencies and related technologies, such as AI. With over two years of experience as a graduate instructor, Deibi is now a Graduate Student Researcher, NSF Digital Transformation Fellow, and Mentored Research Fellow. Before joining Berkeley, Deibi was the project manager for an interdisciplinary team...

Propensity Score Matching for Causal Inference: Creating Data Visualizations to Assess Covariate Balance in R

June 10, 2024
by Sharon Green. Although some people consider randomized experiments the gold standard, it would often be highly unethical to assign individuals to harmful exposures in order to measure their effects. Modern causal inference techniques help scientists estimate treatment effects from observational data. In particular, propensity score matching pairs individuals so that the “treatment” and “control” groups are balanced on measured covariates. After matching, data visualizations make it easier to assess the quality of the matches before estimating effects. This blog post is a tutorial on implementing propensity score matching and creating data visualizations to assess covariate balance, that is, to check visually whether the matched individuals are balanced with respect to measured covariates.
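
To make the idea of covariate balance concrete, here is a minimal sketch in Python. It is not the code from the post, which works in R. It assumes propensity scores have already been estimated (for example, by logistic regression), uses greedy 1:1 nearest-neighbor matching without replacement as one possible matching scheme, and summarizes balance on one covariate with the standardized mean difference, the kind of quantity a balance plot typically displays. All data values and function names are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd


def nearest_neighbor_match(treated_scores, control_scores):
    """Greedy 1:1 matching on propensity score, without replacement.

    Returns a list of (treated_index, control_index) pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, score in enumerate(treated_scores):
        if not available:
            break  # no controls left to match
        # Closest remaining control; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(control_scores[k] - score), k))
        pairs.append((i, j))
        available.remove(j)
    return pairs


# Hypothetical toy data: propensity scores are assumed to be pre-estimated.
treated_scores = [0.80, 0.60, 0.70]
treated_age = [50, 40, 45]
control_scores = [0.20, 0.75, 0.30, 0.62, 0.81, 0.10]
control_age = [22, 47, 25, 41, 52, 20]

pairs = nearest_neighbor_match(treated_scores, control_scores)
matched_control_age = [control_age[j] for _, j in pairs]

smd_before = standardized_mean_difference(treated_age, control_age)
smd_after = standardized_mean_difference(treated_age, matched_control_age)

print(f"SMD for age before matching: {smd_before:.2f}")
print(f"SMD for age after matching:  {smd_after:.2f}")
```

In this toy example the absolute standardized mean difference for age shrinks after matching, which is the pattern a balance plot is meant to reveal. Conventions differ on which standard deviation to use in the denominator after matching, so treat the formula here as one reasonable choice rather than a standard.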

Tactics for Text Mining non-Roman Scripts

April 15, 2024
by Hilary Faxon, Ph.D. & Win Moe. Non-Roman scripts pose particular challenges for text mining. Here, we reflect on a project that used text mining alongside qualitative coding to understand the politicization of online content following Myanmar’s 2021 military coup.