@brandonhamric
brandonhamric / hash_rdd.py
Last active May 9, 2016 06:03
Create a unique hash for an RDD
from pyspark.shuffle import ExternalSorter
from pyspark.rdd import _parse_memory


def hash_rdd(rdd, id_func=lambda el: repr(el), hash_function='sha256', num_partitions=200):
    """
    Return a unique hash representing all records in the RDD. The order of items doesn't affect the hash.
    params:
        id_func - a function that returns a unique string identifier for an RDD element. If none is specified, repr(element) is used.
        hash_function - the name of the hashlib algorithm to use.
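
The preview above is cut off before the function body, so the following is only a minimal, Spark-free sketch of the order-independent idea the docstring describes: map each element to a string identifier, sort the identifiers, and feed them to a hashlib digest. The helper name `hash_items` is hypothetical, and this is not the gist's actual implementation, which works on an RDD.

```python
import hashlib


def hash_items(items, id_func=repr, hash_function="sha256"):
    """Hash a collection so that element order does not affect the result.

    Hypothetical helper: it works on any in-memory iterable, not on an RDD.
    """
    digest = hashlib.new(hash_function)
    # Sorting the identifiers first makes the digest independent of input order.
    for ident in sorted(id_func(el) for el in items):
        digest.update(ident.encode("utf-8"))
        # Separator so that ids "ab", "c" and "a", "bc" do not collide
        # (assumes identifiers contain no newline).
        digest.update(b"\n")
    return digest.hexdigest()


same = hash_items([1, 2, 3]) == hash_items([3, 1, 2])
different = hash_items([1, 2, 3]) != hash_items([1, 2, 4])
```

Sorting everything in memory only works for small inputs; a distributed version would have to sort or combine per-partition results instead.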
{"elements":[{"label":"codon","data":{"value":"gat"},"elements":[{"label":"base","data":{"value":"g"}},{"label":"base","data":{"value":"a"}},{"label":"base","data":{"value":"t"}}]},{"label":"codon","data":{"value":"cac"},"elements":[{"label":"base","data":{"value":"c"}},{"label":"base","data":{"value":"a"}},{"label":"base","data":{"value":"c"}}]},{"label":"codon","data":{"value":"agg"},"elements":[{"label":"base","data":{"value":"a"}},{"label":"base","data":{"value":"g"}},{"label":"base","data":{"value":"g"}}]},{"label":"codon","data":{"value":"tct"},"elements":[{"label":"base","data":{"value":"t"}},{"label":"base","data":{"value":"c"}},{"label":"base","data":{"value":"t"}}]},{"label":"codon","data":{"value":"atc"},"elements":[{"label":"base","data":{"value":"a"}},{"label":"base","data":{"value":"t"}},{"label":"base","data":{"value":"c"}}]},{"label":"codon","data":{"value":"acc"},"elements":[{"label":"base","data":{"value":"a"}},{"label":"base","data":{"value":"c"}},{"label":"base","data":{"value":"c"}}]},{"l

This simple force-directed graph shows character co-occurrence in Les Misérables. A physical simulation of charged particles and springs places related characters in closer proximity, while unrelated characters are farther apart. Layout algorithm inspired by Tim Dwyer and Thomas Jakobsen. Data based on character coappearance in Victor Hugo's Les Misérables, compiled by Donald Knuth.

Compare this display to a force layout with curved links, a force layout with fisheye distortion and a matrix diagram.
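
To make the "charged particles and springs" description concrete, here is a toy sketch in plain Python rather than D3. Every node repels every other node, and each link acts as a spring. The function name `layout_step`, the node names, and all constants are assumptions chosen for illustration; this is not the D3 force layout algorithm itself.

```python
import math


def layout_step(pos, links, repulsion=1.0, stiffness=1.0, rest_length=1.0, step=0.01):
    """Advance a toy force-directed layout by one step and return new positions."""
    force = {n: [0.0, 0.0] for n in pos}

    def direction(a, b):
        # Unit vector pointing from a to b, plus the distance between them.
        dx = pos[b][0] - pos[a][0]
        dy = pos[b][1] - pos[a][1]
        d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
        return dx / d, dy / d, d

    def push(node, ux, uy, magnitude):
        force[node][0] += magnitude * ux
        force[node][1] += magnitude * uy

    nodes = list(pos)
    # Charge-like repulsion between every pair of nodes.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            ux, uy, d = direction(a, b)
            f = repulsion / (d * d)
            push(a, ux, uy, -f)
            push(b, ux, uy, f)
    # Spring force along each link, pulling linked nodes toward rest_length.
    for a, b in links:
        ux, uy, d = direction(a, b)
        f = stiffness * (d - rest_length)
        push(a, ux, uy, f)
        push(b, ux, uy, -f)
    return {n: (pos[n][0] + step * force[n][0], pos[n][1] + step * force[n][1])
            for n in pos}


# Hypothetical three-node graph: A and B are linked, C is unrelated.
pos = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0)}
links = [("A", "B")]
for _ in range(300):
    pos = layout_step(pos, links)
```

After the loop, the linked pair A and B sits closer together than either does to the unlinked node C, which is the effect the description attributes to the simulation.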

brandonhamric / InstallRedisOnUbuntu.markdown
Last active December 20, 2015 01:09
Installing Redis on Ubuntu 12.04

Introduction

This is a quick guide to setting up the latest Redis version on Ubuntu 12.04. I'm working on a 64-bit Ubuntu 12.04 Amazon EC2 instance (ami-dof89fb0), but this should work on most Ubuntu versions and other Linux flavors.

Dependencies

Redis doesn't have many dependencies, just Make, gcc, and Tcl.
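
As a quick sanity check before building, a small Python sketch like the one below can report which of those tools are missing from the PATH. The function name `missing_build_tools` is hypothetical, and the executable names are assumptions, in particular `tclsh` as the Tcl interpreter; adjust them for your system.

```python
import shutil


def missing_build_tools(tools=("make", "gcc", "tclsh")):
    """Return the tools from `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_build_tools()
if missing:
    print("Missing build tools:", ", ".join(missing))
else:
    print("All build tools found.")
```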

brandonhamric / index.html
Last active December 18, 2015 09:59 — forked from mbostock/.block
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Force Layouts - Multiple Foci Collision</title>
<script type="text/javascript" src="http://mbostock.github.com/d3/d3.js"></script>
<script type="text/javascript" src="http://mbostock.github.com/d3/d3.geom.js"></script>
<script type="text/javascript" src="http://mbostock.github.com/d3/d3.layout.js"></script>
<style type="text/css">
brandonhamric / README.md
Last active December 18, 2015 09:00 — forked from mbostock/.block

Bubble charts encode data in the area of circles. Although less perceptually accurate than bar charts, they can pack hundreds of values into a small space. Implementation based on work by Jeff Heer.
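
Because the value is encoded in the circle's area rather than its radius, the radius has to grow with the square root of the value. A minimal sketch of that scaling follows; the function name `bubble_radius` and the 50-pixel maximum are assumptions for illustration, not taken from the original chart.

```python
import math


def bubble_radius(value, max_value, max_radius=50.0):
    """Radius whose circle area is proportional to `value`."""
    # area ~ value and area = pi * r**2, so r ~ sqrt(value).
    return max_radius * math.sqrt(value / max_value)


largest = bubble_radius(100, 100)  # 50.0
quarter = bubble_radius(25, 100)   # 25.0: a quarter of the value gives half the radius
```

Scaling the radius linearly with the value instead would exaggerate large values, since the visible area would then grow with the square of the value.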

testing with different data