@enjalot
enjalot / ndjson.md
Last active December 15, 2021 04:10
Tips for processing Quick, Draw! data with ndjson-cli

Quick, Draw! ndjson data

The Quick, Draw! dataset uses ndjson as one of the formats to store its millions of drawings.

We can use the ndjson-cli utility to quickly create interesting subsets of this dataset.
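
As a rough illustration of what subsetting newline-delimited JSON involves, here is a minimal Python sketch (it is not ndjson-cli itself). Each line is parsed as its own JSON object and kept only if it passes a test. The sample records and the `word` and `recognized` field names are assumptions made for the example, so check them against the actual files.

```python
import json

def filter_ndjson(lines, predicate):
    """Yield parsed records from newline-delimited JSON that satisfy predicate."""
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        record = json.loads(line)  # one JSON object per line
        if predicate(record):
            yield record

# Hypothetical sample records; the field names are assumed for illustration.
sample = [
    '{"word": "cat", "recognized": true}',
    '{"word": "dog", "recognized": false}',
    '{"word": "cat", "recognized": false}',
]

recognized_cats = list(
    filter_ndjson(sample, lambda d: d["word"] == "cat" and d["recognized"])
)
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of `sample`, which avoids loading a large file into memory at once.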

The drawings (stroke data and associated metadata) are stored as one JSON object per line. e.g.:

{
@fitnr
fitnr / presidential_17.csv
Created April 26, 2017 19:44
Commune-level results of the first round of the 2017 French presidential election
commune,votes,le_pen,macron,fillon,melenchon,dupont_aig,hamon,asselineau,arthaud,poutou,cheminade,lassalle
01001,495,126,119,110,59,34,29,6,4,4,2,2
01002,176,48,37,34,33,6,13,1,2,2,0,0
01004,6452,1667,1332,1084,1412,346,344,71,40,91,5,60
01005,933,306,191,197,126,45,37,10,5,10,0,6
01006,77,18,15,14,19,4,3,0,1,2,0,1
01007,1548,458,348,233,296,80,82,11,9,17,1,13
01008,473,135,95,84,89,28,23,2,3,8,3,3
01009,209,40,55,44,39,6,8,6,0,3,0,8
01010,589,207,110,83,103,33,20,10,3,5,1,14
import asyncio
import aiohttp
import os
import random
import re
import sys
import traceback
from io import StringIO
from lxml.html import parse, make_links_absolute
from lxml.cssselect import CSSSelector
@yanofsky
yanofsky / Makefile
Last active May 4, 2019 06:45
This is a workflow for downloading, processing, and mosaicing Landsat scenes, as a Makefile
# landsat-util directory
LANDSAT = ~/landsat
# scenes to target
LANDSAT_IDS = \
LC81220442016038LGN00 \
LC81220452016038LGN00 \
LC81210442014281LGN00 \
LC81210452014281LGN00
@benib
benib / orakel.js
Created May 18, 2016 15:41
NZZ Euro 2016 Orakel Algorithmus
/*
NZZ Euro 2016 Orakel
http://nzz.ch/-ld.17757
The algorithm
this works like this:
// tables with germany shape and the cities we are using in our application
var germany = ee.FeatureCollection('ft:1KDrYXBDlAx1fhcfmWRx7u_qqN2O_gwBNInjnGmnZ');
var cities = ee.FeatureCollection('ft:1w4PgU3okfzwKFEIpH32oPMlOtei6hUWa9tkXv5Rt');
// landsat properties we need to create our image collection over different years
// we use a feature collection here, because we can easily filter it
var landsats = ee.FeatureCollection([
ee.Feature(null, { collection: ee.ImageCollection('LANDSAT/LT5_L1T_TOA'), nir: 'B4', red: 'B3', from: 1984, to: 1992 }),
ee.Feature(null, { collection: ee.ImageCollection('LANDSAT/LT4_L1T_TOA'), nir: 'B4', red: 'B3', from: 1992, to: 1994 }),
ee.Feature(null, { collection: ee.ImageCollection('LANDSAT/LT5_L1T_TOA'), nir: 'B4', red: 'B3', from: 1994, to: 1999 }),
@dannguyen
dannguyen / guardian-articles-day-api.md
Last active November 23, 2023 12:28
How to use The Guardian's API to download article data for content analysis (in Python 3.x)

The Guardian offers an API as deep and robust as the New York Times Article API when it comes to content analysis.

The Guardian's API offers more than "1.7 million pieces of content", with published items as far back as 1999. You can register as a developer here, which gets you 5,000 API hits a day and an API key that looks something like this:

zzzyyyyy-9a9z-999z-z999-9e8a83922516

The Guardian has a handy interactive explorer for tweaking the query parameters.
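
A minimal sketch of assembling a search URL for one day of articles. The endpoint `https://content.guardianapis.com/search` and the `from-date`, `to-date`, `page-size`, `page`, and `api-key` parameters reflect my understanding of the Content API and should be confirmed in the explorer. The key is the placeholder shown above, and `build_search_url` is a hypothetical helper. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the API explorer before relying on it.
API_ENDPOINT = "https://content.guardianapis.com/search"

def build_search_url(api_key, day, page_size=50, page=1):
    """Build a search URL restricted to a single day (YYYY-MM-DD)."""
    params = {
        "from-date": day,
        "to-date": day,
        "page-size": page_size,
        "page": page,
        "api-key": api_key,
    }
    return API_ENDPOINT + "?" + urlencode(params)

# Placeholder key from the text, not a working credential.
url = build_search_url("zzzyyyyy-9a9z-999z-z999-9e8a83922516", "2016-01-01")
```

Keeping URL construction separate from the actual request makes it easy to inspect the query before spending any of the daily API quota.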

@celoyd
celoyd / hi8-anim-howto.md
Last active August 1, 2022 15:37
A way to make Himawari-8 animations

Himawari-8 animation tutorial

Here’s how to make animations like this one. It requires intermediate Unix command-line knowledge, to install some tools and to debug if they don’t work. You’ll need these utilities:

  • curl (or you can translate to wget)
  • convert and montage, part of ImageMagick
  • ffmpeg, plus whatever codecs
  • parallel, for iteration that’s nicer than shell for loops or xargs
  • run everything in zsh for leading 0s in numerical ranges to work
@dannguyen
dannguyen / faa-333-pdf-gathering.md
Last active June 19, 2021 13:18
Using wget + grep to explore inconveniently organized federal data (FAA Section 333 Exemptions)

if !database: wget + grep

The Federal Aviation Administration is posting PDFs of the Section 333 exemptions that it grants, i.e. the exemptions for operators who want to fly drones commercially before the FAA finishes its rulemaking. A journalist wanted to look for exemptions granted to operators in a given U.S. state. But the FAA doesn't appear to have an easy-to-read data file to use and doesn't otherwise list exemptions by location of operator.

However, since their exemptions page is just one giant HTML table for listing the PDFs, we can just use wget to fetch all the PDFs, run pdftotext on each file, and then [grep](https://medium.com/@rualthanzauva/grep-was-a-private-command-of-m
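
The search step can also be sketched in Python, assuming the PDFs have already been fetched and converted so that a directory holds one `.txt` file per exemption. The directory layout and the `files_mentioning` helper are hypothetical, and the demonstration writes two throwaway files rather than using real FAA data.

```python
import re
import tempfile
from pathlib import Path

def files_mentioning(text_dir, term):
    """Return sorted names of .txt files whose contents contain term (case-insensitive)."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    matches = []
    for path in sorted(Path(text_dir).glob("*.txt")):
        if pattern.search(path.read_text(errors="ignore")):
            matches.append(path.name)
    return matches

# Throwaway files standing in for pdftotext output.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("Operator located in Austin, Texas")
    Path(tmp, "b.txt").write_text("Operator located in Reno, Nevada")
    texas_files = files_mentioning(tmp, "texas")
```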

@ErisDS
ErisDS / examples.md
Last active May 2, 2024 08:23
Ghost Filter Query examples

Filter Queries - Example Use Cases

Here are a few example use cases that combine filter with other parameters to make useful API queries. The syntax for any of this may change between now, implementation, and release; they're meant as illustrative examples :)

Fetch 3 posts with tags which match 'photo' or 'video' and aren't the post with id 5.

api.posts.browse({filter: "tags:[photo, video] + id:-5", limit: "3"});

GET /api/posts?filter=tags%3A%5Bphoto%2Cvideo%5D%2Bid%3A-5&limit=3
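
The encoded query string in the GET request above can be reproduced by percent-encoding the filter expression. This is a small sketch using Python's standard library. `build_posts_url` is a hypothetical helper, and the filter is written without spaces to match the URL form.

```python
from urllib.parse import quote

def build_posts_url(filter_expr, limit):
    """Percent-encode a filter expression and place it in a posts query URL."""
    # safe="" forces reserved characters such as ':', '[', ',', and '+' to be encoded
    encoded = quote(filter_expr, safe="")
    return f"/api/posts?filter={encoded}&limit={limit}"

url = build_posts_url("tags:[photo,video]+id:-5", 3)
```

Encoding the `+` as `%2B` matters because an unencoded `+` in a query string is commonly decoded as a space.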