@diegovalle
Last active July 27, 2023 15:45
Download the shapefiles from the 2020 Mexican census
#!/usr/bin/env bash
####################################################
# Make sure `rename` is available on your system
####################################################
# Exit on errors, on undefined variables, and on failures inside pipelines;
# append '|| true' to commands that are allowed to exit non-zero
set -euo pipefail
# The directory from which the script is running
readonly LOCALDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
readonly TARGET_DIR="$LOCALDIR/censo2020"
IFS=$'\n\t'
# Array indices start at zero, so there's an empty string at the beginning
# to make each index match the INEGI state number (1-32)
declare -a states=("" "ags" "bc" "bcs" "camp" "coah" "col"
"chis" "chih" "cdmx" "dgo" "gto" "gro" "hgo" "jal"
"mex" "mich" "mor" "nay" "nl" "oax" "pue" "qro"
"qroo" "slp" "sin" "son" "tab" "tamps" "tlax" "ver"
"yuc" "zac");
# Download Censo 2020 shapefiles from https://www.inegi.org.mx/temas/mg/#Descargas
main() {
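# Fall back to /tmp when TMPDIR is unset, which 'set -u' would otherwise treat as an error
local TEMP_DIR="${TMPDIR:-/tmp}"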
wget -O "$TEMP_DIR"/censo2020.zip -nc "https://www.inegi.org.mx/contenidos/productos//prod_serv/contenidos/espanol/bvinegi/productos/geografia/marcogeo/889463807469_s.zip" || true
unzip -o "$TEMP_DIR"/censo2020.zip -d "$TEMP_DIR"/.zip
for i in {1..32}
do
# INEGI uses a leading zero for all one-digit state numbers
printf -v FILENUM '%02d' "$i"
mkdir -p "$TARGET_DIR/${states[$i]}"
unzip -o "$TEMP_DIR"/.zip/$FILENUM*.zip -d "$TARGET_DIR/${states[$i]}"
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"ent.* -exec rename "s/[0-9]{2}ent/${states[$i]}_estatales/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"mun.* -exec rename "s/[0-9]{2}mun/${states[$i]}_municipales/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"ar.* -exec rename "s/[0-9]{2}ar/${states[$i]}_ageb_rural/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"l.* -exec rename "s/[0-9]{2}l/${states[$i]}_localidades_amanzanadas/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"lpr.* -exec rename "s/[0-9]{2}lpr/${states[$i]}_localidades_puntos_rurales/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"ti.* -exec rename "s/[0-9]{2}ti/${states[$i]}_territorio_insular/" {} \; || true
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"a.* -exec rename "s/[0-9]{2}a/${states[$i]}_ageb_urbanas/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"m.* -exec rename "s/[0-9]{2}m/${states[$i]}_manzanas/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"fm.* -exec rename "s/[0-9]{2}fm/${states[$i]}_frentes_de_manzana/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"e.* -exec rename "s/[0-9]{2}e/${states[$i]}_ejes_viales/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"cd.* -exec rename "s/[0-9]{2}cd/${states[$i]}_caserio_disperso/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"sia.* -exec rename "s/[0-9]{2}sia/${states[$i]}_servicios_area/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"sil.* -exec rename "s/[0-9]{2}sil/${states[$i]}_servicios_linea/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"sip.* -exec rename "s/[0-9]{2}sip/${states[$i]}_servicios_punto/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"pe.* -exec rename "s/[0-9]{2}pe/${states[$i]}_poligono_externo/" {} \;
find "$TARGET_DIR"/${states[$i]}/conjunto_de_datos/"$FILENUM"pem.* -exec rename "s/[0-9]{2}pem/${states[$i]}_poligono_externo_manzana/" {} \;
done
}
merge() {
# Merge a bunch of shapefiles
# The filename of the merged file
local FILEOUT=$1
# A glob matching the names of the files to merge, you can change this to
# "*ageb_urbanas.shp" or "*ejes_viales.shp", etc
local FILTER=$2
local PROJECTION="+proj=longlat +ellps=WGS84 +no_defs +towgs84=0,0,0"
for i in $(find "$TARGET_DIR" -name "$FILTER")
do
if [ -f "$FILEOUT" ]
then
echo "adding state $i to $FILEOUT"
ogr2ogr -f 'ESRI Shapefile' -update -append "$FILEOUT" "$i" -nln "$(basename -s .shp "$FILEOUT")" -t_srs "$PROJECTION"
else
echo "starting merge..."
echo "adding state $i to $FILEOUT"
ogr2ogr -f 'ESRI Shapefile' "$FILEOUT" "$i" -t_srs "$PROJECTION"
fi
done
}
main
merge municipios.shp "*municipales.shp"
merge estados.shp "*estatales.shp"
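
The loop in `main` relies on two conventions: INEGI prefixes every file with a two-digit state number, and the `states` array maps that number to an abbreviation. A minimal sketch that isolates the lookup is shown here; `state_prefix` is a hypothetical helper that is not part of the script, and it assumes the same abbreviation list as above.

```shell
#!/usr/bin/env bash
# Same abbreviations as the script; index 0 is an empty placeholder
declare -a states=("" "ags" "bc" "bcs" "camp" "coah" "col"
"chis" "chih" "cdmx" "dgo" "gto" "gro" "hgo" "jal"
"mex" "mich" "mor" "nay" "nl" "oax" "pue" "qro"
"qroo" "slp" "sin" "son" "tab" "tamps" "tlax" "ver"
"yuc" "zac")

# Hypothetical helper: print the zero-padded INEGI prefix and the
# abbreviation for the state number passed as $1 (1-32)
state_prefix() {
  local n=$1
  printf '%02d %s\n' "$n" "${states[$n]}"
}
```

For example, `state_prefix 9` prints `09 cdmx`, which corresponds to the `09*.zip` archive and the `censo2020/cdmx` output directory.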
@cbeddow

cbeddow commented Feb 3, 2021

Is there a problem with the certificate for the INEGI site?

CSB@DESKTOP-L73DBC4 MINGW64 /c/users/CSB/downloads
$ sh censo_2020.sh
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
--2021-02-03 09:37:26--  https://www.inegi.org.mx/contenidos/productos//prod_serv/contenidos/espanol/bvinegi/productos/geografia/marcogeo/889463807469_s.zip
Resolving www.inegi.org.mx... 200.23.8.5
Connecting to www.inegi.org.mx|200.23.8.5|:443... connected.
ERROR: cannot verify www.inegi.org.mx's certificate, issued by `/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=Thawte RSA CA 2018':
  Unable to locally verify the issuer's authority.
To connect to www.inegi.org.mx insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.
Archive:  /tmp/censo2020.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of /tmp/censo2020.zip or
        /tmp/censo2020.zip.zip, and cannot find /tmp/censo2020.zip.ZIP, period.

@diegovalle
Author

I don't get any errors accessing the INEGI website. If you are absolutely certain you are not being man-in-the-middled, you could add the '--no-check-certificate' parameter to wget. It could also be that old versions of wget are unable to access the Windows certificate store.
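
A minimal sketch of how the download step could pick between those options follows. `CA_BUNDLE` and `ALLOW_INSECURE` are hypothetical environment variables, and `wget_tls_args` and `download_census` are names made up for this example; none of them exist in the script above. Pointing wget at a CA bundle with `--ca-certificate` is an assumption about what might help a wget build that cannot find the system certificates, and it has not been tested against the error reported here.

```shell
#!/usr/bin/env bash

# Print the extra TLS option wget should receive, if any.
# CA_BUNDLE and ALLOW_INSECURE are hypothetical variables for this sketch.
wget_tls_args() {
  if [ -n "${CA_BUNDLE:-}" ]; then
    # Preferred: verify the server against an explicit CA bundle file
    printf '%s\n' "--ca-certificate=$CA_BUNDLE"
  elif [ "${ALLOW_INSECURE:-0}" = "1" ]; then
    # Last resort: disables certificate verification entirely
    printf '%s\n' "--no-check-certificate"
  fi
}

# Download $1 to $2, adding the TLS option only when one was selected
download_census() {
  local url=$1 out=$2 tls_arg
  tls_arg=$(wget_tls_args)
  if [ -n "$tls_arg" ]; then
    wget "$tls_arg" -nc -O "$out" "$url"
  else
    wget -nc -O "$out" "$url"
  fi
}
```

With this in place, running the script as `ALLOW_INSECURE=1 bash censo_2020.sh` would opt in to the insecure download explicitly, while the default behaviour keeps certificate checks on.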
