
Exercise

First Steps in Data Science

1. Today we will start by walking through a Jupyter Notebook that introduces a bit of Python and shows how to begin working with data.

Binder

Completed Notebook

2. Next, we will explore data related to college mobility. We will first describe the distributions of access, success rates, and mobility rates across institutions. We use the definitions of these terms given in the paper and described in lecture:

  • access: the percentage of enrolled students who are ‘low income’, meaning their parents' income is in the bottom quintile (bottom 20%) of the parental income distribution. Note: values range from 0 to 100.

  • success: the percentage of low-income students with post-graduation incomes in the top quintile (top 20%) of the student income distribution, measured at ages 32-34.

  • mobility: the percentage of enrolled students who are both ‘low income’ and later have earnings in the top quintile (top 20%) of the student income distribution.

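Recall that mobility = access × success when the rates are written as fractions; with percentages on a 0 to 100 scale, divide the product by 100. Hence, institutions with high mobility tend to enroll more low-income students and to have high 'success' rates with those students.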
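As a quick check on the relationship between the three measures, here is a minimal Python sketch. The institution names and numbers are made up for illustration and do not come from the course data files, and the helper `mobility_rate` is a hypothetical name, not something defined in the notebooks. It assumes access and success are both percentages on a 0 to 100 scale, so the product is divided by 100.

```python
# Hypothetical institutions with made-up numbers, for illustration only.
# access and success are percentages on a 0-100 scale, as defined above.
institutions = [
    {"name": "College A", "access": 10.0, "success": 50.0},
    {"name": "College B", "access": 40.0, "success": 10.0},
    {"name": "College C", "access": 5.0, "success": 60.0},
]


def mobility_rate(access_pct, success_pct):
    """Return mobility as a percentage, given access and success as percentages."""
    # Dividing by 100 keeps the result on the same 0-100 scale as the inputs.
    return access_pct * success_pct / 100.0


# Attach a mobility rate to each institution.
for row in institutions:
    row["mobility"] = mobility_rate(row["access"], row["success"])

# Rank institutions from highest to lowest mobility.
ranked = sorted(institutions, key=lambda row: row["mobility"], reverse=True)
for row in ranked:
    print(f"{row['name']}: mobility = {row['mobility']:.1f}%")
```

In this toy example, College C has the highest success rate but the lowest mobility, because it enrolls few low-income students. College B has the highest access but a low success rate, so it also ranks below College A.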

Binder

Completed Notebook

Data Files

Data files used in this analysis: