Data Science
Overview
Data Science, the art and science of extracting information from datasets, is an interdisciplinary field that offers many exciting and challenging opportunities. Formed from the amalgamation of Computer Science, Statistics, and Mathematics, Data Science aims to solve real-world problems by revealing information hidden in data. As today’s businesses and IT systems continue to produce massive and ever-increasing amounts of digital data, the need for data scientists is greater than ever. Whether you are interested in analyzing consumer transactions, tweets, call data records, text corpora, or sounds in nature, or in creating stunning data visualizations, you will find that the concepts, techniques, and tools covered in our Data Science program are useful in a wide range of industrial domains and disciplines. These skills also form a strong foundation for advanced graduate studies.
The area of concentration starts with courses that build foundational knowledge and skills in mathematics, statistics, computer programming, and databases, followed by more advanced courses that depend on the discipline-specific track. We offer three such tracks: Computational, Statistical, and Mathematical Data Science. Each student chooses one of the three as their advanced area of focus. Students then complete a data science thesis under the supervision of two faculty members from their chosen track (Computational, Statistical, or Mathematical) and at least one faculty member from another Data Science track.
We also offer a Secondary Field in Data Science option, which is primarily aimed at students who want to gain skills and hands-on experience working with data in their own fields of application. [NOTE: Due to large curricular overlaps, students already pursuing an AOC in Computer Science, Statistics, or Applied Mathematics - the three core disciplines that primarily constitute Data Science - are encouraged to do a Secondary Field in another core discipline instead (e.g., a Computer Science AOC with a Secondary Field in Statistics).]
Faculty in Data Science
Andrey Skripnikov, Associate Professor of Applied Statistics and Data Science [AOC coordinator, main point of contact]
David Gillman, Associate Professor of Computer Science
Patrick McDonald, Professor of Mathematics
Melissa Crow, Instructor of Statistics
Tania Roy, Associate Professor of Human Centered Computing
Necmettin Yildirim, Professor of Mathematics/Soo Bong Chae Chair of Applied Mathematics
Fahmida Hamid, Assistant Professor of Computer Science
Vlad Serban, Assistant Professor of Mathematics
Rohan Loveland, Assistant Professor of Computer Science and Data Science
Mans Hulden, Assistant Professor of Computer Science
Toby Wade, Assistant Professor of Statistics and Data Science
Christopher Kottke, Associate Professor of Mathematics (On Leave)
Gil Salu, Visiting Assistant Professor of Computer Science
Bernhard Klingenberg, Professor of Statistics/Director of Data Science Masters Program
Requirements for the AOC in Data Science
A minimum of twelve and a half (12.5) academic units, with eight and a half (8.5) units of core requirements and four (4) units of discipline-based track requirements. Each student must pick one discipline-based track (Computational, Statistical, or Mathematical).
| Code | Title |
|---|---|
| Core requirements (8.5 units) | |
| Computer Science (3 units): | |
| COP 2047 | Introduction to Programming in Python |
| DATA 3130 | Databases for Data Science |
| CSCI 3370 | Machine Learning |
| or CSCI 4200 | Artificial Intelligence |
| or CSCI 4210 | Artificial Intelligence and Data Mining |
| Statistics (3 units): | |
| STA 2023 | Introduction to Applied Statistics (Dealing with Data I)¹ |
| STA 3024 | Dealing with Data II |
| STA 3100 | R for Data Science |
| or DATA 3110 | Data Munging and Exploratory Data Analysis |
| Mathematics (2.5 units): | |
| MAC 2311 | Calculus I¹ |
| STA 2442 | Probability I |
| MAS 3105 | Advanced Linear Algebra |
| Computational Data Science Track (4 units): | |
| Required courses for the track (1 unit): | |
| | Object Oriented Programming |
| Electives for the track (select 3 units): | |
| | Distributed Computing |
| | Algorithms |
| | Full Stack Application Development |
| | Data Structures |
| | Object Oriented Design |
| | Natural Language Processing |
| | Software Engineering |
| | Reinforcement Learning |
| Faculty-approved internship or Research Experience for Undergraduates (REU) - 0.5 unit | |
| Statistical Data Science Track (4 units): | |
| Electives for the track (select 4 units): | |
| | Statistical Learning |
| | Data Visualization and Communication |
| | Applied Linear Models |
| | Financial Markets Modeling using Machine Learning |
| | Applied Time Series Analysis |
| | Probability II |
| Faculty-approved internship or Research Experience for Undergraduates (REU) - 0.5 unit | |
| Mathematical Data Science Track (4 units): | |
| Required courses for the track (0.5 unit): | |
| | Probability II |
| Electives for the track (select at least 3.5 units): | |
| | Calculus II |
| MATH 2250 | |
| | Introduction to Number Theory |
| MATH 3220 | |
| | Ordinary Differential Equations |
| | Mathematical Modeling |
| | Mathematics Seminar |
| | Introduction to Numerical Methods |
| | Partial Differential Equations |
| Faculty-approved internship or Research Experience for Undergraduates (REU) - 0.5 unit | |
¹ AP or IB credit may be counted toward this requirement. Please contact the program coordinator to confirm.
NOTES:
- If a student is pursuing an AOC in one of the following three disciplines - Computer Science, Statistics, Applied Mathematics - and is also interested in a double AOC with Data Science, they must choose a track that is different from their AOC discipline. For example, a student pursuing a Computer Science AOC can only do a double AOC with the Statistical or Mathematical Data Science track.
- If a student pursuing a Data Science AOC is also interested in a Secondary Field in Computer Science, Statistics, or Applied Mathematics, they can only do a Secondary Field in an area that is different from their Data Science track. For example, a student pursuing the Computational Data Science track can only do a Secondary Field in Statistics or in Applied Mathematics.
Requirements for a Secondary Field in Data Science
A minimum of five and a half (5.5) academic units.
| Code | Title |
|---|---|
| Computer Science (2 units): | |
| COP 2047 | Introduction to Programming in Python |
| CSCI 3370 | Machine Learning |
| or CSCI 4200 | Artificial Intelligence |
| or CSCI 4210 | Artificial Intelligence and Data Mining |
| Statistics (2 units): | |
| STA 2023 | Introduction to Applied Statistics (Dealing with Data I)¹ |
| STA 3100 | R for Data Science |
| or DATA 3110 | Data Munging and Exploratory Data Analysis |
| Mathematics (1.5 units): | |
| STA 2442 | Probability I |
| MAS 3105 | Advanced Linear Algebra |
¹ AP or IB credit may be counted toward this requirement. Please contact the program coordinator to confirm.
NOTE: Due to heavy curricular overlap, the Secondary Field in Data Science is not available to students already pursuing an AOC in one of the three disciplines - Computer Science, Statistics, Applied Mathematics. Instead, to enhance their Data Science experience and credentials, those students are encouraged to pursue a Secondary Field in another of those three disciplines (e.g., a Computer Science AOC student is encouraged to pursue a Secondary Field in either Statistics or Applied Mathematics).
Applied Data Science Masters program and the 3+2 pathway
If you are interested in the Applied Data Science Masters program or the 3+2 pathway, please contact the Director of the Data Science Masters Program, Dr. Bernhard Klingenberg, at your earliest convenience.
Sample Pathways
Sample Four-Year Pathway
| First Year | | |
|---|---|---|
| Fall Term | Spring Term | |
| Dealing with Data 1 | Dealing with Data 2 | |
| Intro. to Programming in Python | Linear Algebra | |
| Calculus 1 | | |
| Second Year | | |
| Fall Term | Spring Term | |
| Databases for Data Science | Machine Learning / Artificial Intelligence | |
| Probability 1 (0.5 unit) | Track-based course #1 | |
| R for Data Science | | |
| Third Year | | |
| Fall Term | Spring Term | |
| Track-based course #2 | Track-based course #4 | |
| Track-based course #3 | | |
| Fourth Year | | |
| Fall Term | ISP | Spring Term |
| Thesis | Thesis | Thesis |
Sample Two-Year Pathway
This pathway assumes a student has completed an introductory programming course in Python, an introductory statistics course, Calculus 1, and either Linear Algebra or an intermediate statistics course.
| First Year | | |
|---|---|---|
| Fall Term | Spring Term | |
| Databases for Data Science | Machine Learning / Artificial Intelligence | |
| Probability 1 | Either Dealing with Data 2 OR Linear Algebra | |
| R for Data Science | Track-based course #1 | |
| Second Year | | |
| Fall Term | ISP | Spring Term |
| Track-based course #2 | Thesis | Track-based course #4 |
| Track-based course #3 | | Thesis |
| Thesis | | |
Requirements for 3+2 Pathway for Combined Undergraduate + Graduate Degrees (BA and MS in Data Science)
If you are interested in the 3+2 pathway, or in the Applied Data Science Masters program in general, please contact the Director of the Data Science Masters Program, Dr. Bernhard Klingenberg, at your earliest convenience.
Data Science Facilities
New College has a number of servers that support students and faculty in the computer science and data science programs. These include 5 HP physical servers with NVIDIA graphics processing units (Tesla, Titan X, and 1080 Ti); 1 SuperMicro physical server with 4 NVIDIA graphics processing units (Quadro RTX 6000); 1 SuperMicro physical server with 4 NVIDIA graphics processing units (RTX A5000 and 1080 Ti); and 12 virtual servers used in a variety of computer science, data science, and statistics courses.