David Selby Selbosh

@Selbosh
Selbosh / Sets.jl
Last active December 28, 2021 20:22
Faster set operations on sequences in Julia
# Faster set operations for integer sequences
import Base.setdiff, Base.union, Base.collect
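function setdiff(a::UnitRange{Int}, b::UnitRange{Int})
    if intersect(a, b) == a
        # a lies entirely within b, so the result is an empty range
        # (last(a):first(a) would be non-empty when a has a single element)
        first(a):first(a)-1
    elseif isempty(intersect(a, b))
        a
    # a0 --- b0 --- a1 --- b1
    elseif first(a) <= first(b) && last(a) <= last(b)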
Selbosh / unrainbow.R
Last active November 6, 2019 17:25
Seen a plot that uses a rainbow colour palette? Convert it to a less contemptuous colour scheme with this script.
#############################
# Fix a rainbow-coloured plot
#############################
rainbow2viridis <- function(img) {
  # Input: `img`, an H x W x 3 array of RGB values from an image file
  dimnames(img) <- list(height = NULL, width = NULL,
                        rgb = c('r', 'g', 'b'))
  data <- reshape2::melt(img)
  data <- reshape2::dcast(data, width + height ~ rgb)
  data <- subset(data, !(r == 1 & g == 1 & b == 1)) # omit white pixels
Selbosh / robust_QS_test.R
Last active April 16, 2018 11:34
Simulating from Bradley-Terry models and comparing variance estimators
simulate <- function(means, ..., sampler = rpois) {
  # Simulate from a dataset and return
  # a sample that is in the same format
  sample <- means
  sample[] <- sampler(length(means), means, ...)
  return(sample)
}
make_QS <- function(n, mean = 10) {
  D <- diag(sample.int(n))
Selbosh / cranly.R
Last active March 27, 2018 13:24
Playing with Ioannis Kosmidis's cranly package
# devtools::install_github("ikosmidis/cranly")
library(cranly)
package_db <- clean_CRAN_db()
(cranly_ts <- attr(package_db, "timestamp"))
package_network <- build_network(package_db)
library(igraph)
cranlig <- as.igraph(package_network, reverse = TRUE)
npkgs <- vcount(cranlig)
nlinks <- ecount(cranlig)
Selbosh / unnest.R
Last active March 15, 2018 11:57
Minimal working example of un-nesting a data frame
# Create data frame with a list column
users <- data.frame(
  id = 1:3,
  username = c('Tom', 'Dick', 'Harry'),
  following = I(list(
    c('TeaStats', 'RobertJBlincoe'),
    c('JeremyCorbyn', 'realDonaldTrump'),
    c('Scottish_Tweets')
  ))
)
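
The preview stops before the un-nesting step itself. As a minimal sketch of one way to do it in base R (the names `n_following` and `unnested` are illustrative assumptions, and this is not necessarily the approach the gist takes), each user's scalar columns can be repeated once per followed account while the list column is flattened:

```r
# Recreate the data frame with a list column, as in the preview
users <- data.frame(
  id = 1:3,
  username = c('Tom', 'Dick', 'Harry'),
  following = I(list(
    c('TeaStats', 'RobertJBlincoe'),
    c('JeremyCorbyn', 'realDonaldTrump'),
    c('Scottish_Tweets')
  ))
)

# Number of followed accounts per user (hypothetical helper name)
n_following <- lengths(users$following)

# Repeat id and username once per followed account, and flatten
# the list column into an ordinary vector: one row per (user, account) pair
unnested <- data.frame(
  id = rep(users$id, times = n_following),
  username = rep(users$username, times = n_following),
  following = unlist(users$following)
)
```

The result has one row for every account a user follows, so a user who follows two accounts appears twice.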
Selbosh / .block
Last active February 28, 2018 11:52
Queued animations for scatter plots with error bars
license: gpl-3.0
height: 700
scrolling: no
border: no
Selbosh / ks_test.R
Last active February 15, 2018 16:42
Kolmogorov–Smirnov tests with non-standard evaluation
library(rlang)
library(dplyr)
# ----------------------------------------------------------------
# First implementation: specify desired variable names explicitly.
# ----------------------------------------------------------------
test1 <- function(var1, var2, data, name1, name2) {
  qvar1 <- enquo(var1)
  qvar2 <- enquo(var2)
Selbosh / custom.css
Last active July 19, 2017 14:42
R-themed styling for Xaringan (Remark.js) slideshow presentations
@import url(https://fonts.googleapis.com/css?family=Merriweather:400,400i,700|Merriweather+Sans:300,400);
@import url(https://fonts.googleapis.com/css?family=Source+Code+Pro:400,700);
body {
  font-family: Merriweather, Georgia, serif;
  color: #515151;
  line-height: 1.5;
}
.remark-slide-content {
Selbosh / README.md
Last active March 3, 2016 18:08
A visualisation makeover

The original graphic (reproduced below) features two concentric doughnut charts with an accompanying legend, and the reader is challenged to match up eight seemingly arbitrary colours with the different categories.

Proportions are difficult to ascertain because there are too many groups and many of them have similar sizes. All the numbers are printed on the chart, which effectively turns it into a badly laid-out table.

Original visualisation

My makeover simplifies everything into a grouped bar chart, with direct labelling of categories and a simple legend to distinguish online from direct services. From this graph it is easy to see that the majority of Citizens Advice issues are dealt with online, whereas inferring this from the original graph requires reading off the actual numbers.

This graphic could be improved further by cleaning up