Tutorials
Find tutorials from past quarters at github.com/datascienceucsc/workshops
Software
Slack
We primarily communicate through Slack at datascienceucsc.slack.com. Join here.
Git
We collaborate through GitHub at datascienceucsc. The organization hosts most of our past work, including competitions and workshops.
To get access to our GitHub organization, message an officer on Slack or on Instagram @datascienceuscs with your GitHub username and email.
In addition, we recommend installing git locally.
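Once git is installed, a quick way to confirm that it is reachable from your programs is to look for it on your PATH. The following is a minimal sketch using only the Python standard library; the function name git_version is a hypothetical helper, not part of any club tooling.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if git is not on PATH.

    Hypothetical helper for illustration only.
    """
    git_path = shutil.which("git")  # full path to the git executable, or None
    if git_path is None:
        return None
    result = subprocess.run(
        [git_path, "--version"],
        capture_output=True,
        text=True,
        check=False,  # do not raise if git exits with a non-zero status
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = git_version()
    print(version if version else "git was not found on PATH")
```

If the script reports that git was not found, the installation either did not complete or its location is not on your PATH.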
Development Environments
Working on projects requires access to a development environment for Python or R.
Python
We recommend one of the following three options to set up a development environment for Python and JupyterLab.
1. Google Colab
Google Colab serves Jupyter notebooks in the browser without a local installation.
This is a great way to get started if you are not comfortable with the command line or Python package management, but we recommend that you eventually learn how to set up a local environment.
2. Conda
conda is a package and environment manager, which we recommend for creating and managing local Python installations.
To help you get started, we provide pre-defined environments.
3. Docker (advanced)
Coming soon
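Whichever option you choose, it can help to confirm that the environment actually contains the libraries you expect. The sketch below checks whether a few packages are importable without importing them. The package list is an assumption based on the libraries named in the recommended readings (pandas, Matplotlib, scikit-learn), not the actual contents of our pre-defined environments, and missing_packages is a hypothetical helper name.

```python
import importlib.util

# Assumed package list for illustration; adjust it to match your environment.
# Note that scikit-learn is imported under the module name "sklearn".
PACKAGES = ["pandas", "matplotlib", "sklearn"]


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are available.")
```

You can run this in a Colab cell or in a local Python session after activating your conda environment.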
Recommended Readings
We recommend the following books as references for data science basics. All are available legally for free online.
Programming:
- A Whirlwind Tour of Python and Python Data Science Handbook, Jake VanderPlas. Together these teach the main Python data science libraries (Pandas, Matplotlib, Scikit-Learn)
- R for Data Science, Hadley Wickham. Teaches modern R syntax and the Tidyverse family of data science libraries
Theory:
- An Introduction to Statistical Learning, Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. Theory fundamentals (regression, classification, validation, clustering) using high-school-level math
- The Data Science Design Manual (requires UCSC network/VPN), Steven Skiena. An overview of fundamental techniques and problems in data science