Student-Led Course Sparks Interest in Data Science

Walk into Gates Hall G01 on Wednesdays at 5 p.m. and you’ll see a crowd of students gathered in their seats. A lecturer at the front of the room runs through carefully prepared slides, describing machine learning algorithms and data processing techniques. Students take notes and ask questions, which are answered in detail on the chalkboard. This scene looks and sounds like a typical Cornell lecture, aside from one detail: all of the material is prepared and taught by students.

The Cornell Data Science Training Program is a one-credit unofficial course offered through the Cornell Data Science project team. The twelve-week course focuses on manipulating data, visualizing trends, and implementing machine learning algorithms. Students are not expected to have any programming experience and are taught the R programming language to supplement the concepts they encounter. They then use their new skills to complete four assignments and two real-world data science projects.

The course materials are the work of Dae Won Kim ’17, Amit Mizrahi ’19, Chase Thomas ’19, Kenta Takatsu ’19, and Jared Lim ’20. The five noticed a demand in the undergraduate community for a practical, hands-on introduction to data science. The group hopes to equip students with the tools and knowledge needed to take on their own projects.

“The Data Science Training Program can be more sensitive to trending industry standards because we’re not bound by the same constraints as College courses and we are positioned to provide a more industry-sensitive introduction to data science.”

The course is aimed at freshmen and sophomores from a variety of majors and interests. “Everything today from marketing, to healthcare, to finance, to agriculture uses data science,” said Thomas. “There’s a lot of buzz around data science right now; we saw the course as an opportunity to provide an introduction accessible to people from all backgrounds.”

Over winter break, the group collaborated to create lecture slides, notes, assignments, and projects from scratch. The team members drew on past experience, including data science internships and online courses from peer institutions like MIT and Berkeley.

“With the widespread growth of big data tools, there has been no better time to learn data science,” Mizrahi said. “It’s exciting to be able to help teach a field that hardly existed fifteen or twenty years ago.”

The course will continue through May and will be offered again next semester. Cornell Data Science has been selected as a co-winner of the 2017 College of Engineering Alumni Association Albert R. George Student Team Award.