Data science at UC Berkeley: 2,000 undergraduates, 50 majors, no command line
Engaging critically with data is now a required skill for students in all areas, but many traditional data science programs aren't easily accessible to those without prior computing experience. Gunjan Baid and Vinitra Swamy explore UC Berkeley's Data Science program (2,000 students across 50 majors), explaining how its pedagogy was designed to make data science accessible to everyone.
Talk Title | Data science at UC Berkeley: 2,000 undergraduates, 50 majors, no command line |
Speakers | Gunjan Baid (UC Berkeley), Vinitra Swamy (UC Berkeley) |
Conference | JupyterCon in New York 2017 |
Location | New York, New York |
Date | August 23-25, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Engaging critically with data is now a required skill for students in all areas, but many traditional data science programs aren’t easily accessible to those without prior computing experience. Gunjan Baid and Vinitra Swamy explore UC Berkeley’s Data Science program, which has no math, computing, or statistics prerequisites and is designed to be accessible to students of all backgrounds. At the introductory level, the program consists of a fundamentals course that introduces students to concepts of computer programming and statistics, along with a diverse set of connector courses that let students apply data science to their area of interest, such as geography, immunotherapy, or cognitive science.

Using Jupyter notebooks, students get hands-on experience working with data without the burden of setting up and maintaining a development environment. The program has developed a tool that lets students obtain the notebooks and datasets for an assignment with one click, and autograding, user authentication, and submission are all handled through Jupyter notebooks. This lets instructors focus on real-world issues, such as racial profiling and California water usage, instead of the technical details of the computing infrastructure. The numbers reflect the approach's reach: over 2,000 students across 50 majors have taken the fundamentals course and the connector courses in the past four semesters.

Gunjan and Vinitra explain the program in more detail and expand upon the pedagogical challenges faced in scaling Jupyter notebooks for use in large courses. They conclude by discussing how the program’s vision can be applied more generally to teaching data science with Jupyter at other universities and institutions.