{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "CiaKDpAltTxe"
},
"source": [
"# Visualization in Python \n",
"\n",
"#### Computing Showcase \n",
"\n",
"**April 6, 2022**\n",
"\n",
"\n",
"*Mobility data and example adapted from UGBA 88 course materials*\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "K9U62tY3iR2p"
},
"source": [
"\n",
"## Workshop Goals\n",
"\n",
"In this workshop, we will explore visualization of the data looking at College Mobility. We will focus on public universities and community colleges in Michigan. An important justification for public spending on higher education is that colleges and universities may be seen as the 'engines of social mobility'. \n",
"\n",
"We will do three things. First, we will investigate how access, success, and upward mobility rates vary across institutions. Second, we will explore how access has changed over time, as Michigan’s spending on public higher education has declined or stagnated. Third, we will write a function that generates a Report Card for a provided institution.\n",
"\n",
"The exercises are intended to illustrate how visualizations can provide valuable insights and motivate new questions.\n",
"\n",
"\n",
"## Economic Mobility at Universities\n",
"\n",
"\n",
"In 2017, a team of researchers used anonymized data from the federal government to publish statistics for each college in the U.S. on the distribution of students’ earnings in their thirties and their parents’ incomes. They showed that students from low-income families have excellent long-term outcomes after attending selective schools, but that there are very few low-income students at these schools.\n",
"\n",
"This work was highlighted in several news sites: \n",
"\n",
"* [NYTimes](https://www.nytimes.com/interactive/2017/01/18/upshot/some-colleges-have-more-students-from-the-top-1-percent-than-the-bottom-60.html) including interactive visualizations\n",
"* [Vox](https://www.vox.com/policy-and-politics/2017/2/28/14359140/chetty-friedman-college-mobility)\n",
"\n",
"Full details on the data used here as well as many related data sets can be found at [opportunityinsights.org](https://opportunityinsights.org/education/)\n",
"\n",
"\n",
"\n",
"\n",
"### Table of Contents\n",
"1 - [Comparing Outcomes Across Institutions](#compare)
\n",
"2 - [How Does Access Vary Over Time?](#access)
\n",
"3 - [Creating a College Report Card](#card)
\n",
"\n",
"\n",
"**Dependencies:**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "9n25VHRQtS14"
},
"outputs": [],
"source": [
"# import required libraries\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"# data Visualization libraries\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sb\n",
"%matplotlib inline\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Zri0aZmvjPg3"
},
"source": [
"## 1. Comparing Outcomes Mobility Across Institutions \n",
"The first dataset we'll use has one row of data for each college and university in the US.\n",
"\n",
"(Though we discuss the columns we'll use in this lab, look [here](http://www.equality-of-opportunity.org/data/college/Codebook%20MRC%20Table%202.pdf) for more documentation on the remaining contents of these data.)\n",
"\n",
"\n",
"First, let's load the data and the specific columns we'll use in this lab."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 400
},
"id": "lZOdPgmYjQJg",
"outputId": "06e41f2e-a370-48c3-8066-1631e5d1434b"
},
"outputs": [],
"source": [
"mobility = pd.read_csv(\"data/mrc_q1.csv\")\n",
"\n",
"print(\"Data Dimensions:\", mobility.shape[1] , \"X\" , mobility.shape[0])\n",
"mobility.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iLK_4eY3jSFP"
},
"source": [
"In this exercise, we will focus on Michigan public institutions. Let’s filter the data to reflect this."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mi_pub_mobility = mobility[(mobility['type']=='Public') & (mobility['state']=='MI')]\n",
"\n",
"\n",
"print(\"Data Dimensions:\", mi_pub_mobility.shape[1] , \"X\" , mi_pub_mobility.shape[0])\n",
"mi_pub_mobility.head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are left with 40 institutions. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploring the Data \n",
"\n",
"We will first describe the distributions of _access, success rates, and mobility rates_ across institutions. We use the same definitions of these terms used in the paper and described in lecture:\n",
"\n",
"- **`access`:** the percentage of students enrolled that are ‘low income’–those whose parents' income is in the bottom quintile (bottom 20%) of the parental income distribution. Note: values range from 0 to 100.\n",
"\n",
"- **`success`:** the percentage of low income students with post-graduation incomes in the top quintile (top 20%) of the student income distribution, measured at age 32-34.\n",
"\n",
"- **`mobility`:** the percentage of students enrolled that are both ‘low income’ and later have earnings in the top quintile (top 20%) of the student income distribution.\n",
"\n",
"Recall that `mobility` $=$ `access` $\\times$ `success`. Hence, institutions with high mobility will tend to have more low income students and high 'success' rates with those students."
]
},
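{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the relationship between the three measures concrete, here is a small worked example. The numbers are hypothetical and are not taken from the data. Suppose 10% of a college's students are low income (`access` $= 10$) and 20% of those low income students reach the top earnings quintile (`success` $= 20$). Treating the rates as fractions:\n",
"\n",
"$$\\text{mobility} = \\text{access} \\times \\text{success} = 0.10 \\times 0.20 = 0.02$$\n",
"\n",
"That is, 2% of all enrolled students at this hypothetical college are both low income and later top-quintile earners. On the 0 to 100 percentage scale, the same calculation reads $10 \\times 20 / 100 = 2$."
]
},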
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Success Rates\n",
"\n",
"