Today something comes out of the shadows and into the light.

The Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation are announcing a $37M investment to build “data science environments” for academic research. Through a year-long selection process, three universities have been chosen as partners. Berkeley is one of them.

It’s been an incredible ride, and it’ll be exciting to be part of what is about to unfold.

To say one thing up front: “Data science” is an indefinite term. People argue about whether it’s a set of intersecting competencies (shorthand: Venn diagram), an assemblage of communities and practices, a discipline-in-formation, or a platform for statistically informed, computationally intensive research. Or whether it’s a “thing” at all. However you think about it, the ways we do social science are being opened up to change, just like the ways we do other domain-area research. I’ll leave others to reflect on the invocation of scientific revolution. What I find enticing is the invitation to institutional change.

Pointedly, Moore and Sloan are investing in the institutional and cultural environment for data science. They care about data science as a means to an end, and they care about making it stick. They care especially about creating a sustainable space for it inside universities, with all their existing structures and paths.

Here are the three goals of Moore-Sloan (from today’s press release):

  • Develop meaningful and sustained interactions and collaborations between researchers with backgrounds in specific subjects (such as astrophysics, genetics, economics) and in the methodology fields (such as computer science, statistics and applied mathematics), with the specific aim of recognizing what it takes to move each of the sciences forward.

  • Establish career paths that are long-term and sustainable, using alternative metrics and reward structures to retain a new generation of scientists whose research focuses on the multi-disciplinary analysis of massive, noisy, and complex scientific data and the development of the tools and techniques that enable this analysis.

  • Build on current academic and industrial efforts to work towards an ecosystem of analytical tools and research practices that is sustainable, reusable, extensible, learnable, easy to translate across research areas and enables researchers to spend more time focusing on their science.

These are ambitious goals. They’re also really smart ones. Any social scientist will recognize they’re way harder to achieve than just happily saying, “Do good data science.” Institutional and cultural change is at the core of it. If universities want to play this game, they are going to have to try out new ways of doing things and tackle “academia’s disconnect.”

Berkeley is one of the places this experiment will happen, along with the University of Washington and NYU. In December there will be a public launch of the Berkeley Institute for Data Science (BIDS - for now, here). BIDS is still being defined, but it will sit at the center of a landscape stretching across campus institutes and domain areas. D-Lab will be one of the “data science” touch-downs in the social sciences. D-Lab supports a broad and integrated spectrum of methods and approaches – it starts from whatever ways Berkeley researchers choose to define “data intensive social science.” Part of D-Lab’s portfolio is assisting faculty, grad students, postdocs, and staff who want to play in the “data science” space, and we’ll be developing new on-ramps, working groups, and tools.

We’ll have more to say about this over the next months, and there are many questions to answer about how BIDS will work. For now, two institutional observations:

1) It’s energizing how the social and behavioral sciences have been welcomed - in our Moore-Sloan discussions and specifically here. Berkeley stands for research excellence across every domain. The strength of Berkeley social science, D-Lab’s institutional footing, and our advocacy for organizational change have shaped how BIDS conceives its domain. Along with opportunities for social scientists doing data science, BIDS includes data science ethnography and social scientific evaluation. This is an effort to be reflexive about data science, its culture, and its institutions.

2) Going through the Moore-Sloan process has reminded me of what Berkeley can do, and what it can do is amazing. It’s inspiring to connect with the IPython team and the AMPLab, with researchers in computational biology and cosmology and astrophysics and statistics and the ISchool and so many other domains. BIDS embodies the Berkeley ethos of experimentation, open-source engagement, and attentiveness to innovative thinking across the data science ecosystem. We’re building the new Berkeley here. There’s no place like it on earth.


For more about social science and BIDS,

For more information about the announcement: