This is the repository for D-Lab’s Introduction to Machine Learning in R workshop. The workshop covers the following topics:
- Background on machine learning
  - Classification vs regression
  - Performance metrics
- Data preprocessing
  - Missing data
  - Train/test splits
- Algorithm walkthroughs
  - K-nearest neighbors
  - Decision trees
  - Random forests
  - Gradient boosted machines
  - SuperLearner ensembling
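To give a flavor of the train/test split and performance metric topics listed above, here is a minimal sketch in base R. It is not taken from the workshop materials: the `mtcars` dataset, the 70/30 split, the choice of predictors, and the use of a linear model are all assumptions made purely for illustration.

```r
# Illustrative only: mtcars ships with base R. The split ratio, seed, and
# predictors are arbitrary choices, not the workshop's own setup.
set.seed(1)

# Randomly assign about 70% of rows to the training set.
n <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

# Fit a simple regression on the training rows only.
fit <- lm(mpg ~ wt + hp, data = train)

# Evaluate on the held-out rows with root mean squared error (RMSE).
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))
print(rmse)
```

Holding out the test rows before fitting means the RMSE reflects performance on data the model has not seen, which is the point of the split.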
We assume that participants have familiarity with:
- basic R syntax
- statistical concepts such as mean and standard deviation
Please bring a laptop with the following:
- R version 3.5 or greater
- RStudio integrated development environment (IDE) is highly recommended but not required.
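One way to confirm that an installation meets the version requirement above is to run a short check in the R console. This is a small sketch using base R's `getRversion()`; the variable names and messages are illustrative.

```r
# Minimum version stated in the requirements above.
required <- "3.5.0"

# getRversion() returns the running R version as a comparable object.
installed <- getRversion()
meets_requirement <- installed >= required

if (meets_requirement) {
  message("R ", as.character(installed), " meets the requirement.")
} else {
  message("R ", as.character(installed), " is older than ", required,
          "; please upgrade before the workshop.")
}
```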
Browse resources listed on the D-Lab Machine Learning Working Group repository. Scroll down to see code examples in R and Python, books, courses at UC Berkeley, online classes, and other resources and groups to help you along your machine learning journey!