Data Manipulation and Cleaning

Leah Lee

Senior Data Science Fellow 2024-2025, Data Science Fellow 2023-2024

Integrative Biology

I am a PhD candidate in the Department of Integrative Biology. My research interests lie at the intersection of biomechanics, entomology, and physiology. Currently I am studying how beetles use their shield-like forewings, called elytra, for flight, thermoregulation, and protection. Prior to UC Berkeley, I worked as a research assistant at the Korea Institute of Ocean Science and Technology (KIOST), studying algae phylogenetics. I received my B.A. in Biology and Mathematics from Swarthmore College.


Alex Ramiller

Senior Data Science Fellow 2024-2025, Data Science Fellow 2023-2024

City and Regional Planning

I am a PhD candidate in City and Regional Planning. My research focuses on the use of large administrative datasets to study residential mobility, neighborhood change, and housing access. I received a Master's in Geography from the University of Washington and a Bachelor's in Economics and Geography from Macalester College. I have also consulted on analytical projects for several organizations, including the San Francisco Federal Reserve Bank, PolicyLink, and the City of Seattle.


Farnam Mohebi

Data Science Fellow 2023-2024, Data Science for Social Justice Senior Fellow 2024

Haas School of Business

I am a PhD student at the Haas School of Business, University of California, Berkeley, and a researcher in the Department of Radiation Oncology at the University of California, San Francisco, having previously earned my MD and MPH degrees. My research focuses on the intersection of professionals and emerging technologies, drawing from the fields of medical sociology, organizational theory, and science and technology studies. I am particularly fascinated by the evolving relationship between physicians and artificial intelligence, the phenomenon of physician influencers, and the social...


Valeria Ramírez Castañeda

Data Science for Social Justice Fellow 2024-2025

Integrative Biology

Valeria Ramírez Castañeda is a Colombian biologist currently pursuing a PhD in the Department of Integrative Biology at the University of California, Berkeley. She completed her undergraduate degree in Biology at the National University of Colombia and earned a master's degree in Ecology and Evolution, as well as another in Science Communication. During her PhD, she is studying the interactions between snakes and frogs and how these interactions influence the evolution of toxin resistance in snakes. She is also collaborating on and leading projects regarding the consequences of English in science and the...


R Machine Learning with tidymodels: Parts 1-2

October 14, 2024, 1:00pm

Machine learning often evokes images of Skynet, self-driving cars, and computerized homes. However, these ideas are less science fiction than tangible phenomena predicated on description, classification, prediction, and pattern recognition in data. During this two-part workshop, we will discuss basic features of supervised machine learning algorithms, including k-nearest neighbor, linear regression, decision tree, random forest, boosting, and ensembling, using the tidymodels framework. To social scientists, such methods might be critical for investigating evolutionary relationships, global health patterns, voter turnout in local elections, or individual psychological diagnoses.


Python Data Wrangling and Manipulation with Pandas

October 10, 2024, 2:00pm

Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It enables practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.


Python Data Wrangling and Manipulation with Pandas

September 27, 2024, 9:00am


R Data Wrangling and Manipulation: Parts 1-2

October 1, 2024, 1:00pm

It is said that 80% of data analysis is spent on cleaning and preparing the data for exploration, visualization, and analysis. This R workshop will introduce the dplyr and tidyr packages to make data wrangling and manipulation easier. Participants will learn how to use these packages to subset and reshape data sets, do calculations across groups of data, clean data, and perform other useful tasks.


Stephanie Andrews

Availability: By appointment only

Consulting Areas: Python, SQL, HTML / CSS, Javascript, APIs, Databases & SQL, Data Manipulation and Cleaning, Data Science, Data Sources, Data Visualization, Digital Humanities, Machine Learning, Natural Language Processing, Software Tools, Text Analysis, Web Scraping, Bash or Command Line, Excel, Git or Github, Tableau


Theo Snow

Availability: By appointment only

Consulting Areas: Python, R, SQL, SAS, Databases & SQL, Data Manipulation and Cleaning, Data Science, Data Visualization, Geospatial Data, Maps & Spatial Analysis, Machine Learning, Mixed Methods, Qualitative methods, Surveys, Sampling & Interviews, Regression Analysis, Means Tests, Software Output Interpretation, Other, Excel, Git or Github, RStudio, RStudio Cloud, SAS, Tableau

