@iaindillingham
iaindillingham / dataset_definition.py
Last active October 2, 2023 16:56
Dataset definition for the ehrQL tutorial
from ehrql import case, codelist_from_csv, create_dataset, days, when
from ehrql.tables.beta.core import medications, patients
from ehrql.tables.beta.tpp import (
    addresses,
    clinical_events,
    hospital_admissions,
    practice_registrations,
)
index_date = "2023-10-01"
import glob
import pathlib
def match_paths_1(pattern):
    # Not a generator function: an ordinary function that returns a
    # generator expression.
    return (pathlib.Path(path) for path in glob.iglob(pattern))
def match_paths_2(pattern):
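The comment in match_paths_1 points at a distinction that is easy to miss: a function that returns a generator expression is not itself a generator function. The sketch below illustrates the difference with inspect.isgeneratorfunction. The names match_paths_expr and match_paths_gen are hypothetical; match_paths_gen is an assumed yield-based counterpart and is not the body of the truncated match_paths_2 above.

```python
import glob
import inspect
import pathlib


def match_paths_expr(pattern):
    # Ordinary function: it runs immediately and returns a generator expression.
    return (pathlib.Path(path) for path in glob.iglob(pattern))


def match_paths_gen(pattern):
    # Generator function: the body contains `yield`, so nothing runs until
    # the caller starts iterating.
    for path in glob.iglob(pattern):
        yield pathlib.Path(path)


# Only the yield-based version is reported as a generator function...
is_expr_genfunc = inspect.isgeneratorfunction(match_paths_expr)
is_gen_genfunc = inspect.isgeneratorfunction(match_paths_gen)

# ...but calling either one hands back a generator object.
both_return_generators = inspect.isgenerator(
    match_paths_expr("*")
) and inspect.isgenerator(match_paths_gen("*"))
```

Callers can iterate over either result in the same way; the difference only shows up in introspection and in when the function body starts executing.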
iaindillingham / join_strategies.py
Last active March 25, 2022 16:25
Join Strategies
# A common pattern when using OpenSAFELY for time series analysis is to extract one
# cohort for slow-to-extract variables that we don't expect to change over time, and
# multiple cohorts (e.g. by week or by month) for fast-to-extract variables that we
# expect to change over time. Each "fast" cohort is then joined to the "slow" cohort for
# analysis.
#
# In this gist, we compare the memory profiles of two join strategies found in the
# OpenSAFELY documentation: a map strategy and a merge strategy. We find that on a
# dataset with an order of magnitude difference between the population size and the
# sample size, the map strategy uses roughly 2.9 times more memory than the merge
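As a rough illustration of the two join strategies named in the comment, here is a minimal pandas sketch. It is not the gist's own code and it does not measure memory. The data frames, the column names (patient_id, region, had_event), and the exact form of each strategy are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical "slow" cohort: one row per patient, extracted once.
slow = pd.DataFrame(
    {"patient_id": [1, 2, 3, 4], "region": ["N", "S", "E", "W"]}
)

# Hypothetical "fast" cohort: extracted repeatedly (e.g. once per month).
fast = pd.DataFrame({"patient_id": [2, 4], "had_event": [1, 0]})

# Map strategy (assumed form): look up each slow variable through a Series
# indexed by patient_id.
mapped = fast.copy()
mapped["region"] = mapped["patient_id"].map(
    slow.set_index("patient_id")["region"]
)

# Merge strategy (assumed form): a left join of the fast cohort onto the
# slow cohort.
merged = fast.merge(slow, on="patient_id", how="left")
```

Both strategies attach the same region values to the fast cohort here; the gist's comparison concerns how much memory each one needs, which this sketch does not attempt to reproduce.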
iaindillingham / descriptors.py
Created February 16, 2022 14:30
Attribute Descriptors
import logging
logging.basicConfig(level=logging.INFO)
class LoggingProperty:
    def __init__(self):
        logging.info(f"Initializing '{self.__class__.__name__}'")
    def __set_name__(self, owner, name):
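The preview stops at __set_name__, so the rest of LoggingProperty is not visible. For readers unfamiliar with the descriptor protocol, here is a self-contained sketch of how __set_name__, __get__, and __set__ usually fit together. The class LoggedAttribute, the Point class, and the underscore-prefixed storage name are all hypothetical and are not taken from the gist.

```python
import logging

logging.basicConfig(level=logging.INFO)


class LoggedAttribute:
    """Hypothetical data descriptor that logs every read and write."""

    def __set_name__(self, owner, name):
        # Called once, when the owning class body is executed.
        self.public_name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self
        value = getattr(instance, self.private_name)
        logging.info("Getting %r: %r", self.public_name, value)
        return value

    def __set__(self, instance, value):
        logging.info("Setting %r to %r", self.public_name, value)
        setattr(instance, self.private_name, value)


class Point:
    x = LoggedAttribute()

    def __init__(self, x):
        self.x = x  # Routed through LoggedAttribute.__set__


point = Point(3)
```

Because __set_name__ receives the attribute name, the descriptor does not need the name passed to its constructor, which keeps the class body free of repetition.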
iaindillingham / dataclasses_vs_namedtuples.py
Created September 10, 2021 20:19
Data Classes vs Named Tuples
import dataclasses
import random
import typing
@dataclasses.dataclass
class PointDC:
    x: float
    y: float
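Only the data class half of the comparison is visible in the preview. The sketch below sets it beside a named tuple to show a few behavioural differences between the two. PointNT is an assumed counterpart, not necessarily the name used in the gist, and the checks are illustrative and not the gist's own comparison.

```python
import dataclasses
import typing


@dataclasses.dataclass
class PointDC:
    x: float
    y: float


class PointNT(typing.NamedTuple):
    # Hypothetical named-tuple counterpart to PointDC.
    x: float
    y: float


dc = PointDC(1.0, 2.0)
nt = PointNT(1.0, 2.0)

# Data class instances are mutable by default.
dc.x = 5.0

# Named tuple fields cannot be reassigned.
try:
    nt.x = 5.0
    nt_is_mutable = True
except AttributeError:
    nt_is_mutable = False

# A named tuple is a tuple: it unpacks and compares equal to a plain tuple.
x, y = nt
nt_equals_tuple = nt == (1.0, 2.0)

# A data class needs an explicit conversion before it can be compared to one.
dc_as_tuple = dataclasses.astuple(dc)
```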
[
  {
    "id": 0,
    "parent": null,
    "name": "Name"
  },
  {
    "id": 1,
    "parent": 0,
    "name": "Name"

countries_topology.json is derived from the Natural Earth 1:110m Cultural Vectors. Each feature has an id property, which holds the country's ISO A2 code, and a properties.title property, which holds the country's English-language name.

iaindillingham / df1.csv
Last active June 23, 2019 17:06
Reporting a potential bug in Vega's joinaggregate transform
dimension metric
A 1
A 1
B 2
B 2
iaindillingham / df1.csv
Last active June 23, 2019 16:20
Reporting a potential bug in Vega's pivot transform
A 1
A 1
B 2
B 2
iaindillingham / .block
Last active May 18, 2018 20:49
Reusable chart with d3.dispatch
license: mit
height: 500
scrolling: no
border: yes