@mehak-sachdeva
Last active May 17, 2017 17:44
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# JCPenney Store Closings\n",
"\n",
"\n",
"## Workflow\n",
"\n",
"Investigate JCPenney store closings$^1$ by:\n",
"\n",
"* Tagging locations as urban vs. rural (using population density from the Data Observatory)\n",
"* Drawing 10-minute walk or drive isochrones, depending on whether the location is urban\n",
"* Visualizing data with cartoframes\n",
"* Augmenting isochrones with Data Observatory measures\n",
"* Visualizing data in Builder and adding widgets for specific measures and store properties\n",
"\n",
"Final dashboard: https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed\n",
"\n",
"1. The closing status is real, but the actual close date is chosen randomly from the last five years."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installing dependencies\n",
"\n",
"Install [cartoframes](https://github.com/cartodb/cartoframes) (which is currently in beta). I recommend installing it in a virtual environment to keep things clean and sandboxed.\n",
"\n",
"## Getting the data\n",
"\n",
"Download the JCPenney store location data from here:\n",
"* <http://mehak-carto.carto.com/api/v2/sql?q=select%20*%20from%20jc_penny_stores&format=csv&filename=jc_penny_stores>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Workflow for obtaining data\n",
"\n",
"Pull JCPenney locations from my CARTO account into cartoframes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import cartoframes\n",
"import json\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")\n",
"\n",
"USERNAME = '' # <-- Put your carto username here\n",
"APIKEY = '' # <-- Put your carto api key here\n",
"\n",
"# use cartoframes.credentials.set_creds() to save credentials for future use\n",
"cc = cartoframes.CartoContext(api_key=APIKEY,\n",
" base_url='https://{}.carto.com/'.format(USERNAME))\n",
"table_name = 'jc_penny_stores'\n",
"\n",
"# load JCPenney locations into a DataFrame\n",
"df = cc.read(table_name)\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## JCPenney Store Closings\n",
"\n",
"* Purple = stores closing\n",
"* Orange = stores staying open"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from cartoframes import Layer\n",
"from cartoframes.styling import vivid\n",
"\n",
"cc.map(layers=Layer(table_name,\n",
" color={'column': 'status', 'scheme': vivid(10, 'category')}),\n",
" interactive=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Augment with the Data Observatory (DO) to get an 'urban-ness' metric (population density)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# get population density (total population normalized by area) at each store location\n",
"# More info about this Data Observatory measure here:\n",
"# https://cartodb.github.io/bigmetadata/united_states/age_gender.html#total-population\n",
"df = cc.data_augment(table_name, [{'numer_id': 'us.census.acs.B01003001',\n",
" 'normalization': 'area',\n",
" 'numer_timespan': '2011 - 2015'}])\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get a sense of the range of data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"df.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create isochrones based on travel inferences\n",
"\n",
"Create a derivative table whose geometries are isochrones of walk or drive times from the store locations. If population density is above 5000 people per sq. km, assume it's a walkable area. Otherwise, assume cars are the primary mode of transportation.\n",
"\n",
"**Note:** This functionality is a planned cartoframes method."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%time\n",
"df = cc.query('''\n",
" SELECT \n",
" CASE WHEN total_pop_area_2011_2015 > 5000\n",
" THEN (cdb_isochrone(the_geom, 'walk', Array[600])).the_geom\n",
" ELSE (cdb_isochrone(the_geom, 'car', Array[600])).the_geom\n",
" END as the_geom,\n",
" {keep_columns}\n",
" FROM\n",
" {table_name}\n",
" '''.format(table_name=table_name,\n",
" keep_columns=', '.join(set(df.columns) - {'the_geom', 'the_geom_webmercator'})))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"iso_table_name = (table_name + '_isochrones')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is already an issue in the repo to introduce batch API queries to avoid timeouts:\n",
"https://github.com/CartoDB/cartoframes/issues/85\n",
"\n",
"Bonus points for finding bugs and opening issues!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"cc.write(df, iso_table_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If this fails because of a lack of credits (i.e., reaching quota), then replace `(cdb_isochrone(the_geom, 'walk', Array[600])).the_geom` with `ST_Buffer(the_geom::geography, 800)::geometry` for an approximate 10-minute walk (a straight-line, 'as the crow flies' distance of 800 m), and replace `(cdb_isochrone(the_geom, 'car', Array[600])).the_geom` with `ST_Buffer(the_geom::geography, 12000)::geometry` for an approximate 10-minute drive (45 mph on average for 10 minutes is 7.5 miles, or about 12 km)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from cartoframes import BaseMap\n",
"cc.map(layers=[BaseMap('light'),\n",
" Layer(iso_table_name),\n",
" Layer(table_name)],\n",
" zoom=12, lng=-73.9668, lat=40.7306,\n",
" interactive=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# show choropleth of isochrones by pop density\n",
"from cartoframes.styling import vivid\n",
"cc.map(layers=[Layer(iso_table_name,\n",
" color='total_pop_area_2011_2015'),\n",
" Layer(table_name, size=6, color={'column': 'status', 'scheme': vivid(2)})],\n",
" zoom=8, lng=-74.7729, lat=39.9771,\n",
" interactive=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Data Observatory measures: median income, male age 30-34 (both ACS)\n",
"# Male age 30-34: https://cartodb.github.io/bigmetadata/united_states/age_gender.html#male-age-30-to-34\n",
"# Median Income: https://cartodb.github.io/bigmetadata/united_states/income.html#median-household-income-in-the-past-12-months\n",
"\n",
"# Note: this may take a minute or two because all the measures are being calculated based on the custom geographies\n",
"# that are passed in using spatially interpolated calculations (area-weighted measures)\n",
"\n",
"data_obs_measures = [{'numer_id': 'us.census.acs.B01001012'},\n",
" {'numer_id': 'us.census.acs.B19013001'}]\n",
"df = cc.data_augment(iso_table_name, data_obs_measures)\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you might have already heard, the Data Observatory just launched to help provide CartoDB users with a universe of data. One of the reasons we built the Data Observatory is because getting the third-party data you need is oftentimes the hardest part of analyzing your own data. Data wrangling shouldn't be such a big roadblock to mapping and analyzing your world.\n",
"\n",
"https://carto.com/blog/create-location-data-easily"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize isochrones based on Data Observatory measure"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
"cc.map(layers=Layer(iso_table_name,\n",
" color='median_income_prenormalized_2011_2015'),\n",
" zoom=8, lng=-74.3115, lat=40.1621,\n",
" interactive=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Builder Dashboard\n",
"\n",
"https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"HTML('<iframe width=\"100%\" height=\"520\" frameborder=\"0\" src=\"https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed\" allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}