Joseph Elsherbini elsherbini

@elsherbini
elsherbini / alignment_parsing.py
Created September 15, 2020 15:06
Code coaching example
from itertools import *
f = open("nucleotide_alignment.fa", "r")
ntseqs = [] # nucleotide sequences
ntgaps = [] # gaps in nucleotide sequences
ntnames = []
oldseq = ""
name = ""
coordinate = 0
elsherbini / README.md
Last active December 18, 2019 14:41
Papal Pizzeria

Files in support of a Cross Validated (stats.stackexchange.com) question:

https://stats.stackexchange.com/questions/441250/pope-effect-on-pizza-regression-with-presence-absence-and-similarity-data-as-d

Files

pope_pizza.Rmd - code to generate plots and simulate pizzas
pizzeria_coordinates_and_pope_preferences.csv - the locations of 20 pizzerias and which popes frequented them
pizzeria_menus.csv - presence/absence of different menu items in the 20 pizzerias
pizzeria_menu_item_similarities.csv - pairwise similarities of menu items at the different pizzerias

country year emissions
Brazil 1996 1.7268642168
China 1996 2.8443095815
India 1996 0.901348777
Russian Federation 1996 10.8866506666
United States 1996 19.496024737
Brazil 1997 1.7938286777
China 1997 2.8205678906
India 1997 0.9200723802
Russian Federation 1997 10.3167892489
library(tidyverse)
library(cowplot)
library(scales)
library(colorblindr)
point_map <- tribble(~place, ~old_value, ~new_value,
1,6,10,
2,5,7,
3,4,4,
4,3,2,
elsherbini / hierarchical_labels.R
Created May 9, 2018 14:04
Trying to get nice labels
library(tidyverse)
library(cowplot)
dummy_data <- tibble(
  combo_id = seq_len(8),
  factor1 = rep(c("1a", "1b"), times = 4),
  factor2 = rep(rep(c("2a", "2b"), each = 2), times = 2),
  factor3 = rep(c("3a", "3b"), each = 4)
) %>%
  {replicate(8, ., simplify = FALSE)} %>%
  bind_rows() %>%
  group_by(combo_id) %>%
  mutate(value = rpois(8, 2 * combo_id))
head(dummy_data)
elsherbini / tidy_nfl_salaries.R
Last active April 9, 2018 18:44
A submission for #tidytuesday week 2
library(tidyverse)
library(cowplot)
library(colorblindr)
library(ggbeeswarm)
library(ggrepel)
# human_usd from https://github.com/fdryan/R/blob/master/ggplot2_formatter.r copy and source that!
df <- read_csv("nfl_salaries.csv") # from week 2 @ https://github.com/rfordatascience/tidytuesday
elsherbini / README.md
Last active October 12, 2017 16:02
cutadapt adapter trimming

A quick and somewhat untested pipeline to run cutadapt.

This will take paired reads corresponding to a single genome (already demultiplexed) and perform quality filtering and adapter trimming.

input (2 files)

  1. forward reads
  2. reverse reads
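
As a rough illustration of the quality filtering and adapter trimming step described above, here is a minimal Python sketch that only assembles a paired-end cutadapt command line. It is not the gist's actual pipeline: the function name, the file names, the adapter placeholders, and the default cutoffs are all assumptions chosen for illustration. The flags used (`-q`, `-m`, `-a`, `-A`, `-o`, `-p`) are standard cutadapt options for paired-end input.

```python
import subprocess


def build_cutadapt_command(forward_reads, reverse_reads,
                           out_forward, out_reverse,
                           adapter_fwd, adapter_rev,
                           quality_cutoff=20, min_length=50):
    """Assemble a paired-end cutadapt call as an argument list.

    All defaults here are illustrative assumptions, not values from the gist.
    """
    return [
        "cutadapt",
        "-q", str(quality_cutoff),   # trim low-quality 3' ends
        "-m", str(min_length),       # drop pairs with a read shorter than this
        "-a", adapter_fwd,           # 3' adapter on the forward read
        "-A", adapter_rev,           # 3' adapter on the reverse read
        "-o", out_forward,           # trimmed forward reads
        "-p", out_reverse,           # trimmed reverse reads
        forward_reads,
        reverse_reads,
    ]


def run_cutadapt(command):
    """Run the assembled command; requires cutadapt to be installed."""
    subprocess.run(command, check=True)


# Hypothetical file names and adapter placeholders, for illustration only.
example = build_cutadapt_command(
    "genome_R1.fastq.gz", "genome_R2.fastq.gz",
    "trimmed_R1.fastq.gz", "trimmed_R2.fastq.gz",
    adapter_fwd="ADAPTER_FWD", adapter_rev="ADAPTER_REV",
)
print(" ".join(example))
```

Building the command as a list and passing it to `subprocess.run` without a shell avoids quoting problems with unusual file names.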

output:

elsherbini / example_fastadb
Created June 1, 2017 18:50
mmseqs_issue33
>9CS106_NODE_2_length_702_cov_1339.861816_1 # 177 # 665 # -1 # ID=2_1;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.425
MDSLNNTDFKKLASQQKTIQMKMRLLALAHFKDGLSRTQIAKSLKVSRTSVNKWVRIFFE
EGLEGLQEKPRTGRPAYLTDEQRAQLSAFIKKEAESPSGGRLVGSDIHDYIVKHFDKHYH
PNSIYYLLDHMGFSWITSRSKHPKQSQQIQDDFKKIPNRNDP*
>9CS106_NODE_3_length_68154_cov_55.956436_1 # 196 # 2079 # 1 # ID=3_1;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.471
MKLTLKQKLIGASLSAVVVMATALTWLSANQLFEQTRNGVYLRAESVSEAASEGIKNWID
IRTDIASAFNDFSREDDVVPFLRQARVAGGFDDIFLGTPEGGMYRSHPERNRADYDPRQR
PWYQEANAAGKQIITTAYQDAITKALLVTIAEPVRHNGQLVGVVGADVLIDQLVNDVISL
DVGDNAYAMLIDASDGTFLAHPDSALSLKPVSQLSNDISMPIIENAVRTGSIEIIKERGA
EKLLYFTKVPNTNWIFAVQMDKATEEANHSTLLTQLITTAVIITLIVIVLVSWLVSFLFR
elsherbini / calculate_syn_and_non.py
Last active March 3, 2017 15:34
calculate parsimonious synonymous and nonsynonymous mutations for any two codons
import networkx as nx
def setup_codon_graph():
    """Create a networkx graph with codons as nodes and mutations as edges.
    If the mutation is synonymous, the edge weight is 0.01.
    If the mutation is non-synonymous, the edge weight is 1.01.
    Then, by calculating the shortest path length between two codons,
    you can get the most parsimonious path
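
The preview above is cut off, so the following is only a sketch of the idea the docstring describes, not the gist's own implementation. It swaps networkx for a small standard-library Dijkstra search so it runs without dependencies. It assumes the standard genetic code, allows paths to pass through stop codons, and decodes the path length by taking the integer part as the number of non-synonymous steps. The names `count_syn_and_non`, `shortest_path_length`, and `neighbors` are hypothetical.

```python
import heapq
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

SYN_WEIGHT = 0.01      # edge weight for a synonymous single-base change
NONSYN_WEIGHT = 1.01   # edge weight for a non-synonymous single-base change


def neighbors(codon):
    """Yield (other_codon, weight) for every single-base mutation of codon."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                other = codon[:pos] + base + codon[pos + 1:]
                same_aa = CODON_TABLE[other] == CODON_TABLE[codon]
                yield other, SYN_WEIGHT if same_aa else NONSYN_WEIGHT


def shortest_path_length(start, end):
    """Dijkstra over the implicit codon graph; returns the weighted length."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, codon = heapq.heappop(queue)
        if codon == end:
            return dist
        if dist > best.get(codon, float("inf")):
            continue  # stale queue entry
        for other, weight in neighbors(codon):
            new_dist = dist + weight
            if new_dist < best.get(other, float("inf")):
                best[other] = new_dist
                heapq.heappush(queue, (new_dist, other))
    raise ValueError(f"no path from {start} to {end}")


def count_syn_and_non(start, end):
    """Return (synonymous, non_synonymous) counts along the cheapest path."""
    dist = shortest_path_length(start, end)
    non = int(dist)  # each non-synonymous step contributes a whole unit
    syn = round((dist - non * NONSYN_WEIGHT) / SYN_WEIGHT)
    return syn, non


print(count_syn_and_non("TTA", "CTG"))  # two leucine codons
```

Because synonymous edges are so cheap, the search prefers routes with as few amino acid changes as possible, even when such a route takes more single-base steps than the direct one.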