Quantitative Analysis

Python Web Scraping

June 26, 2024, 10:00am

In this workshop, we cover how to scrape data from the web using Python. Web scraping involves downloading a webpage's source code and sifting through the material to extract desired data.
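As a rough illustration of the "sift through the source code" step described above, here is a minimal sketch that uses only the Python standard library. The page source, the `LinkExtractor` class, and the `extract_links` helper are all hypothetical and invented for this example; the workshop itself may use other tools.

```python
from html.parser import HTMLParser

# Hypothetical page source. In practice this string would be downloaded
# first, for example with urllib.request.urlopen(url).read().decode().
PAGE_SOURCE = """
<html><body>
  <h1>Upcoming workshops</h1>
  <ul>
    <li><a href="/workshops/scraping">Python Web Scraping</a></li>
    <li><a href="/workshops/pandas">Python Pandas</a></li>
  </ul>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> tag currently open, if any
        self._text = []    # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(source):
    """Parse an HTML string and return the links it contains."""
    parser = LinkExtractor()
    parser.feed(source)
    return parser.links


links = extract_links(PAGE_SOURCE)
```

The same pattern (download, parse, keep only the elements of interest) applies whichever parsing library is used.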

In this workshop, we cover how to extract data from the web using Python. We focus on two approaches to extracting data from the web: leveraging application programming interfaces (APIs) and web scraping.

Python Data Wrangling and Manipulation with Pandas

May 4, 2023, 1:00pm

Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the steps you might need to prepare data for analysis.
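To give a sense of what such preparation steps can look like, here is a small sketch. The data, the column names, and the `prepare` helper are invented for illustration and are not taken from the workshop materials.

```python
import pandas as pd

# Hypothetical survey extract with typical problems: a duplicated row,
# numbers stored as text, a missing value, and inconsistent labels.
raw = pd.DataFrame({
    "respondent": ["a", "b", "b", "c", "d"],
    "age": ["34", "29", "29", None, "41"],
    "region": [" north", "South ", "South ", "north", "SOUTH"],
})


def prepare(df):
    """Return a cleaned copy of the raw table, ready for analysis."""
    clean = df.drop_duplicates().copy()            # remove exact duplicate rows
    clean["age"] = pd.to_numeric(clean["age"])     # text -> numbers (None -> NaN)
    clean["region"] = clean["region"].str.strip().str.lower()  # normalize labels
    clean = clean.dropna(subset=["age"])           # drop rows with no age
    return clean.reset_index(drop=True)


clean = prepare(raw)

# A simple grouped summary on the cleaned data.
mean_age = clean.groupby("region")["age"].mean()
```

Keeping the cleaning steps inside one function makes it easy to rerun them when the raw data changes.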

Python Web APIs

February 8, 2024, 10:00am

In this workshop, we cover how to extract data from the web with APIs using Python. APIs are often official services offered by companies and other entities, which allow you to directly query their servers in order to retrieve their data. Platforms like The New York Times, Twitter and Reddit offer APIs to retrieve data.
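The query-then-parse pattern described above can be sketched as follows. The endpoint URL, the parameter names (`q`, `page`, `api-key`), and the response layout are assumptions made up for this example; every real API documents its own, and most require registering for a key.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/v1/articles"


def build_request_url(query, api_key, page=0):
    """Encode the query parameters into a request URL."""
    params = {"q": query, "page": page, "api-key": api_key}
    return f"{BASE_URL}?{urlencode(params)}"


# Made-up response body. A real one would come from sending the request,
# for example with urllib.request.urlopen(url).read().decode().
SAMPLE_RESPONSE = (
    '{"results": ['
    '{"title": "First article", "year": 2023}, '
    '{"title": "Second article", "year": 2024}]}'
)


def extract_titles(response_text):
    """Parse a JSON response and keep only the article titles."""
    payload = json.loads(response_text)
    return [item["title"] for item in payload["results"]]


url = build_request_url("web scraping", api_key="DEMO_KEY")
titles = extract_titles(SAMPLE_RESPONSE)
```

Because the server returns structured JSON, no HTML parsing is needed, which is the main practical difference from scraping.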

Python Data Wrangling and Manipulation with Pandas

September 20, 2021, 1:00pm

Python Web Scraping & APIs

November 2, 2022, 3:00pm

R Machine Learning with tidymodels: Parts 1-2

October 9, 2023, 2:00pm

Machine learning often evokes images of Skynet, self-driving cars, and computerized homes. However, these ideas are less science fiction than tangible phenomena predicated on description, classification, prediction, and pattern recognition in data. During this two-part workshop, we will discuss basic features of supervised machine learning algorithms, including k-nearest neighbors, linear regression, decision trees, random forests, boosting, and ensembling, using the tidymodels framework. To social scientists, such methods might be critical for investigating evolutionary relationships, global health patterns, voter turnout in local elections, or individual psychological diagnoses.

Skyler Yumeng Chen

Data Science for Social Justice Fellow 2024

Haas School of Business

Skyler is a Ph.D. student in Behavioral Marketing at the Haas School of Business. Her research centers on consumer behavior and judgment and decision-making, with a keen interest in both experimental methods and data science techniques. She holds a B.A. in Economics and a B.S. in Data Science from New York University Shanghai.

Propensity Score Matching for Causal Inference: Creating Data Visualizations to Assess Covariate Balance in R

June 10, 2024

Sharon Green

Although some people consider randomized experiments the gold standard, in many cases it would be highly unethical to assign individuals to harmful exposures to measure their effects. Modern causal inference techniques help scientists estimate treatment effects using observational data instead. In particular, propensity score matching pairs individuals so that the “treatment” and “control” groups are balanced on measured covariates. After implementing propensity score matching, data visualizations make it easier to assess the quality of the matches before estimating effects. This blog post is a tutorial for implementing propensity score matching and creating data visualizations to assess covariate balance, that is, to check visually whether the matched individuals are balanced with respect to measured covariates.
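The matching and balance-checking idea can be sketched in a few lines. This is not the post's code: the post works in R, whereas this sketch is in Python. It also assumes the propensity scores have already been estimated, and it uses greedy 1:1 nearest-neighbor matching with a caliper, which is only one of several matching strategies. The unit ids, scores, ages, and function names are all invented. The standardized mean difference computed at the end is a common numeric summary of covariate balance and the kind of quantity a balance plot would typically display.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical, already-estimated propensity scores (unit id -> score).
treated_scores = {"t1": 0.80, "t2": 0.55}
control_scores = {"c1": 0.20, "c2": 0.50, "c3": 0.78, "c4": 0.10}

# Hypothetical measured covariate for the same units.
age = {"t1": 60, "t2": 45, "c1": 25, "c2": 44, "c3": 62, "c4": 22}


def match_nearest(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    A treated unit stays unmatched if no remaining control unit has a
    propensity score within `caliper` of its own.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


def standardized_mean_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd


pairs = match_nearest(treated_scores, control_scores)

# Balance on age before matching (all controls) and after (matched only).
treated_age = [age[t] for t in treated_scores]
smd_before = standardized_mean_difference(
    treated_age, [age[c] for c in control_scores]
)
smd_after = standardized_mean_difference(
    [age[t] for t, _ in pairs], [age[c] for _, c in pairs]
)
```

In this toy data, matching moves the standardized mean difference for age much closer to zero, which is the pattern a balance plot is meant to make visible across many covariates at once.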

Enhancing Research Transparency Inspired by Grounded Theory

April 30, 2024

Farnam Mohebi

Grounded theory, a powerful tool for qualitative analysis, can enhance data science research by improving transparency and impact. By meticulously documenting the entire research journey, from initial data exploration to developing and refining theories, including the decisions they make and the rationale behind them, researchers can create a vivid record of their process. Embracing grounded theory principles, such as iterative coding and constant comparison, can help data scientists build robust, data-driven theories while ensuring transparency throughout the research process. This approach makes research more replicable and understandable, and it invites others to engage with the work, fostering collaboration and constructive critique and ultimately elevating the value and reach of the findings.
