Statistics

Causal Effect Estimation in Observational Field Studies of Thermal Comfort

April 1, 2025
by Ruiji Sun. We introduce and apply regression discontinuity to thermal comfort field studies, which are typically observational. The method exploits policy thresholds in China, where the winter district heating policy is based on cities' geographical locations relative to the Huai River. Using the regression discontinuity method, we quantify the causal effects of the treatment (district heating) on the physical indoor environment and on the subjective responses of building occupants. By contrast, we demonstrate with conventional correlational analysis that the correlation between indoor operative temperature and thermal sensation votes does not accurately reflect the causal relationship between the two. This highlights the importance of causal inference methods in thermal comfort field studies and in other observational studies in building science where the regression discontinuity method might apply.
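
To make the idea concrete, here is a minimal sketch of a sharp regression discontinuity estimate in plain Python. It is not the author's specification: the data are synthetic, the running variable (distance from the policy line, with positive values assumed to be the heated side), the bandwidth, and the 4-degree jump are all illustrative assumptions, and the helper names `fit_line` and `rd_estimate` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(running, outcome, cutoff=0.0, bandwidth=2.0):
    """Sharp RD: gap between the two local linear fits at the cutoff."""
    # Centre the running variable so each intercept is the fit at the cutoff.
    left = [(r - cutoff, o) for r, o in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, o) for r, o in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left


# Synthetic data (assumption): indoor temperature rises smoothly with the
# running variable and jumps by 4.0 at the cutoff where heating starts.
rng = random.Random(0)
running = [rng.uniform(-2, 2) for _ in range(400)]
outcome = [15 + 0.5 * r + (4.0 if r >= 0 else 0.0) + rng.gauss(0, 0.5)
           for r in running]

print(round(rd_estimate(running, outcome), 2))
```

The estimate is the vertical gap between the two fitted lines at the threshold. A naive correlation across the whole sample would instead mix that jump with the smooth trend on either side.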

Looking Ahead: How Adolescents’ Consideration of Future Consequences Shapes Their Developmental Outcomes

March 25, 2025
by Elaine Luo. Adolescents constantly balance immediate impulses with long-term goals. Our research explored how adolescents differ in their tendency to think about immediate versus future consequences, and how these differences relate to academic performance, stress, and perceived life chances. Using Latent Profile Analysis, we identified three distinct groups: Indifferent (low consideration overall), Future-Focused (prioritizing future outcomes), and Dual-Focused (high consideration of both immediate and future outcomes). Results indicated that Dual-Focused adolescents had higher academic achievement, whereas the Future-Focused group perceived the most positive life prospects. We also discuss practical implications and future research directions for supporting balanced decision-making among adolescents.

Why Data Disaggregation Matters: Exploring the Diversity of Asian American Economic Outcomes Using Public Use Microdata Sample (PUMS) Data

February 11, 2025
by Taesoo Song. Asian Americans are often overlooked in discussions of racial inequality due to their high average socioeconomic attainment. Many academic and policy researchers treat Asians as a single racial category in their analysis. However, this broad categorization can mask significant within-group disparities, leaving many disadvantaged individuals without access to vital resources and policy support. Song emphasizes the importance of data disaggregation in revealing Asian American inequalities, particularly in areas like income and homeownership, and demonstrates how breaking down these categories can lead to more targeted and effective policy solutions.

Which Coin Should I Flip? The Multi-Arm Bandit

February 4, 2025
by Bruno Smaniotto. Consider the following game: You are given the option to choose between two coins to flip. These coins are possibly biased, so the probability of getting Heads for each coin might differ from 50/50. Each time you flip Heads, you win one dollar. There are a total of 10 rounds. Which coin should you flip in each round? In this blog post, we will analyze this problem through the lens of a famous decision-making framework called the Multi-Arm Bandit, exploring how to structure the problem mathematically and how it can be solved for particular examples.
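
As a rough illustration of the game, the sketch below plays the two-coin, 10-round setup with Thompson sampling, one common bandit heuristic. This is an assumption on our part and not necessarily the solution method used in the post. The coin probabilities and the function name `play_thompson` are hypothetical.

```python
import random


def play_thompson(true_probs, rounds=10, seed=0):
    """Play the coin game with Thompson sampling and Beta(1, 1) priors.

    true_probs holds each coin's (hidden) probability of Heads.
    Returns (winnings, heads_per_coin, tails_per_coin).
    """
    rng = random.Random(seed)
    heads = [0] * len(true_probs)
    tails = [0] * len(true_probs)
    winnings = 0
    for _ in range(rounds):
        # Draw a plausible Heads probability for each coin from its posterior.
        samples = [rng.betavariate(1 + h, 1 + t) for h, t in zip(heads, tails)]
        # Flip the coin whose sampled probability is highest.
        coin = samples.index(max(samples))
        if rng.random() < true_probs[coin]:
            heads[coin] += 1
            winnings += 1  # one dollar per Heads
        else:
            tails[coin] += 1
    return winnings, heads, tails


# Hypothetical biases, unknown to the player: coin 0 is better than coin 1.
winnings, heads, tails = play_thompson((0.7, 0.4))
print("winnings:", winnings, "heads:", heads, "tails:", tails)
```

Early rounds spread flips across both coins because the posteriors are wide. As evidence accumulates, the sampled probabilities concentrate and the better-looking coin gets flipped more often.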

Field Experiments in Corporations

January 28, 2025
by Yue Lin. How do social science researchers conduct field experiments with private actors? Yue Lin provides a brief overview of recent developments in political economy and management strategy, with a focus on conducting field experiments within private corporations. Unlike conventional targets such as individuals and government agencies, private companies are an emerging sweet spot for scholars to test important theories on topics such as sustainability, censorship, and market behavior. After comparing the strengths and weaknesses of this powerful yet nascent method, Lin brainstorms some practical solutions to improve the success rate of field experimental studies. She aims to introduce a new methodological tool in a young research field and shed some light on improving experimental quality while adhering to ethical standards.

Finley Golightly

IT Support & Helpdesk Supervisor
Applied Mathematics

Finley joined D-Lab as full-time staff launching their career in Data Science after graduating with a Bachelor's degree in Applied Math from UC Berkeley.

They have been with D-Lab since Fall 2020, formerly as part of the UTech Management team before joining as full-time staff in Fall 2023. They love the learning environment of D-Lab and their favorite part of the job is their co-workers! In their free time, they enjoy reading, boxing, listening to music, and playing Dungeons & Dragons. Feel free to stop by the front desk to ask them any questions or...

What are Time Series Made of?

December 10, 2024
by Bruno Smaniotto. Trend-cycle decompositions are statistical tools that help us understand the different components of Time Series – Trend, Cycle, Seasonal, and Error. In this blog post, we will provide an introduction to these methods, focusing on the intuition behind the definition of the different components, providing real-life examples and discussing applications.
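
For intuition, here is a minimal sketch of a classical additive decomposition in plain Python. It is a simplification under stated assumptions: the trend is a centred moving average (so any slower cycle is folded into the trend), the seasonal component is the average detrended value at each position in the period, and the series is synthetic. The function name `decompose_additive` is hypothetical.

```python
def decompose_additive(series, period):
    """Split a series into trend, seasonal, and residual lists.

    Trend and residual are None near the edges, where the centred
    moving-average window does not fit.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: give the two end points half weight each.
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period

    # Average the detrended values at each position within the period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    centre = sum(means) / period  # force the seasonal effects to sum to zero
    seasonal = [means[t % period] - centre for t in range(n)]

    residual = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual


# Synthetic quarterly-style series (assumption): linear trend plus a
# repeating pattern of length 4.
pattern = [1.0, -1.0, 2.0, -2.0]
series = [0.5 * t + pattern[t % 4] for t in range(24)]
trend, seasonal, residual = decompose_additive(series, period=4)
print(trend[2:6], seasonal[:4])
```

With real data the residual would not vanish. What is left after removing the trend and seasonal parts is the Error component, which is often the first place to look for unusual events.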

Language Models in Mental Health Conversations – How Empathetic Are They Really?

December 3, 2024
by Sohail Khan. Language models are becoming integral to daily life as trusted sources of advice. While their utility has expanded from simple tasks like text summarization to more complex interactions, the empathetic quality of their responses is crucial. This article explores methods to assess the emotional appropriateness of these models, using word-overlap metrics such as BLEU and ROUGE alongside semantic similarity computed with Sentence Transformers. By analyzing models like LLaMA in mental health dialogues, we learn that while they score poorly on traditional word-based metrics, LLaMA's performance in capturing empathy through semantic similarity is promising. We also advocate for continuous monitoring to ensure these models support their users' mental well-being effectively.
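
To see why word-based metrics can be harsh on empathetic paraphrases, the sketch below computes a simplified ROUGE-1 F1 score (unigram overlap with whitespace tokenization) in plain Python. The two sentences are invented for illustration and are not from the article's data, and the function name `rouge1_f1` is hypothetical. Real evaluations typically use an established implementation with proper tokenization and stemming.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: unigram overlap on lowercased, whitespace-split text."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Two made-up replies that express similar support with different words.
reference = "I am sorry you are going through this"
candidate = "That sounds really hard and I feel for you"
print(round(rouge1_f1(reference, candidate), 3))
```

The two replies share only two words, so the overlap score is low even though a human reader would judge them similar in intent. Embedding-based similarity is designed to capture that kind of closeness in meaning.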

A Recipe for Reliable Discoveries: Ensuring Stability Throughout Your Data Work

November 19, 2024
by Jaewon Saw. Imagine perfecting a favorite recipe, then sharing it with others, only to find their results differ because of small changes in tools or ingredients. How do you ensure the dish still reflects your original vision? This challenge captures the principle of stability in data science: achieving acceptable consistency in outcomes relative to reasonable perturbations of conditions and methods. In this blog post, I reflect on my research journey and share why grounding data work in stability is essential for reproducibility, adaptability, and trust in the final results.

Exploring Rental Affordability in the San Francisco Bay Area Neighborhoods with R

November 5, 2024
by Taesoo Song. Many American cities continue to face severe rental burdens. However, we rarely examine rental affordability through the lens of quantitative data. In this blog post, I demonstrate how to download and visualize rental affordability data for the San Francisco Bay Area using R packages like `tidycensus` and `sf`. This exercise shows that mapping census data can be a straightforward and powerful way to understand the spatial patterns of housing dynamics and can offer valuable insights for research, policy, and advocacy.