Data Manipulation and Cleaning

R Data Wrangling and Manipulation: Parts 1-2

March 19, 2024, 9:00am

It is often said that 80% of data analysis is spent cleaning and preparing the data for exploration, visualization, and analysis. This R workshop introduces the dplyr and tidyr packages, which make data wrangling and manipulation easier. Participants will learn how to use these packages to subset and reshape data sets, do calculations across groups of data, clean data, and perform other useful tasks.

Python Data Wrangling and Manipulation with Pandas

May 4, 2023, 1:00pm

Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

R Data Wrangling and Manipulation

November 16, 2021, 12:00pm

Grace Hu

Data Science for Social Justice Fellow 2024

Bioengineering

Grace is a third-year Bioengineering PhD candidate in the joint UC Berkeley-UCSF Graduate Program. Her research lies at the nexus of computational design and 3D bioprinting to advance tissue engineering for regenerative medicine. She previously studied Materials Science and Engineering (B.S.) and Computer Science (M.S.) at Stanford University, where she investigated printable batteries to power an ultra-affordable scanning electron microscope and explored computer science education research by developing AI models to augment teaching ability.

In her free time she...

Hugh Kadhem

Mathematics

Hugh Kadhem is a Ph.D. student in Applied Mathematics, with broad research interests in computational quantum physics and high-performance scientific computing.

Introduction to Propensity Score Matching with MatchIt

April 1, 2024

Alex Ramiller

When working with observational (i.e., non-experimental) data, it is often challenging to establish causal relationships between interventions and outcomes. Propensity Score Matching (PSM) is a powerful tool for causal inference with observational data: it creates comparable groups that can be used to estimate the impact of an intervention. This blog post introduces MatchIt – a software package that provides all of the necessary tools for conducting Propensity Score Matching in R – and provides step-by-step instructions on how to conduct and evaluate matches.

What Are Vowels Made Of? Graphing a Classic Dataset with R

February 13, 2024

Anna Björklund

Vowels are all around us. Mainstream US English has around twelve unique vowels. How can our brains tell these sounds apart? This blog post will help you answer this question by plotting vowel data from a classic American English dataset by Peterson and Barney (1952).

How can we use big data from iNaturalist to address important questions in Entomology?

February 26, 2024

Leah Lee

Large-scale geographic data on insect diversity, collected over time, can be used to answer important questions in Entomology. Open-source, open-access citizen science platforms like iNaturalist generate huge amounts of data on species diversity and distribution at accelerating rates. However, unstructured citizen science data contain inherent biases and need to be used with care. One way to validate big data from iNaturalist is to cross-check it against systematically collected data, such as museum specimens.

Creating the Ultimate Sweet

January 30, 2024

Emma Turtelboom

What is the best Halloween candy? In this blog post, we will identify attributes of popular sweets and create a model to understand how these attributes influence the popularity of the sweet. We’ll discuss alternative model approaches and potential drawbacks, as well as caveats to interpreting the predictions of our model.

Addison Pickrell

IUSE Undergraduate Advisory Board

Mathematics

Sociology

Addison is an aspiring mathematician and social scientist (Class of '27). He loves collecting books he'll never read, is an open-source and open-access advocate, and is an aspiring community organizer and systems disrupter. Ask him about community-based participatory action research (CBPAR), critical pedagogy, applied mathematics, and social science.
