Tutorials

Find tutorials from past quarters at github.com/datascienceucsc/workshops

Software

Slack

We primarily communicate through Slack at datascienceucsc.slack.com. Join here.

Git

We collaborate through GitHub at datascienceucsc, which hosts most of our past work, including competitions and workshops.

To get access to our GitHub organization, message an officer on Slack or on Instagram @datascienceuscs with your GitHub username and email.

In addition, we recommend installing git locally.
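As a quick check that the local installation worked, a minimal sketch like the following can be run from Python. The helper name tool_version is hypothetical (it is not part of any club tooling), and the sketch assumes only that git, once installed, is reachable on your PATH.

```python
import shutil
import subprocess


def tool_version(tool="git"):
    """Return the tool's version string, or None if it is not on the PATH.

    Hypothetical helper for illustration; not part of any club tooling.
    """
    path = shutil.which(tool)  # full path to the executable, or None
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = tool_version("git")
    print(version or "git was not found on the PATH; install it first")
```

If the script reports that git was not found, install it and open a new terminal so the updated PATH is picked up.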

Development Environments

Working on projects requires access to a development environment for Python or R.

Python

We recommend one of the following three options to set up a development environment for Python and JupyterLab.

1. Google Colab

Google Colab serves Jupyter notebooks in the browser without a local installation.

This is a great way to get started if you are not comfortable with the command line or Python package management, but we recommend that you eventually learn how to set up a local environment.

2. Conda

conda is a package and environment manager, which we recommend for creating and managing local Python installations.

To help you get started, we provide pre-defined environments.
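Once an environment is activated, a short script can confirm that Python is running inside it and that the packages you expect are importable. The sketch below is illustrative only: the function name environment_report is hypothetical, and the default package list (numpy, pandas, jupyterlab) is an assumption, not the contents of the pre-defined environments. It relies on the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated.

```python
import importlib.util
import os
import sys


def environment_report(packages=("numpy", "pandas", "jupyterlab")):
    """Summarize the current Python environment.

    The default package names are assumptions for illustration; replace
    them with whatever your environment is supposed to provide.
    """
    return {
        # Version of the interpreter running this script, e.g. "3.11.4"
        "python": sys.version.split()[0],
        # Name of the active conda environment, or None if none is active
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Top-level packages that cannot be found by the import system
        "missing": [
            name for name in packages
            if importlib.util.find_spec(name) is None
        ],
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

An empty "missing" list and a non-empty "conda_env" suggest the environment is active and complete. If "conda_env" is None, the script is probably running outside of conda.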

3. Docker (advanced)

Coming soon

Recommended Readings

We recommend the following books as references for data science basics. All are available legally for free online.

Programming:

Theory:

  • An Introduction to Statistical Learning, Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. Theory fundamentals (regression, classification, validation, clustering) using high-school-level math.
  • The Data Science Design Manual (requires UCSC network/VPN), Steven Skiena. An overview of fundamental techniques and problems in data science.

Useful documents