Chapter 2: Stack Overflow Developer Survey

Stack Overflow is the world's largest online community for developers, and you have probably used it to find an answer to a programming question. The second chapter of this course uses data from the annual Stack Overflow Developer Survey to practice predictive modeling and find which developers are more likely to work remotely.

1. Essential copying and pasting from Stack Overflow
2. Choose an appropriate model
3. Explore the Stack Overflow survey
4. Training and testing data
5. Dealing with imbalanced data
6. Preprocess with a recipe
7. Downsampling
8. Understand downsampling
9. Downsampling in your workflow
10. Predicting remote status
11. Train models
12. Confusion matrix
13. Classification model metrics

About this course

This is a free, open source course on supervised machine learning in R. In this course, you'll work through four case studies and practice skills from exploratory data analysis through model evaluation. Ines Montani designed the web framework that runs this course, and Florencia D'Andrea helped build the site.

Contributions and comments on how to improve this course are welcome! Please file an issue or submit a pull request if you find something that could be fixed or improved.

Creative Commons License

About me

My name is Julia Silge, and I'm a data scientist and software engineer at RStudio, where I build modeling tools. I am both an international keynote speaker and a real-world practitioner focused on data analysis and machine learning practice. I love making beautiful charts and communicating about technical topics with diverse audiences.