@trvrb
Created July 30, 2012 15:22
Ruby script to collate nodes and attributes from a BEAST MCC file
#!/usr/bin/ruby
# This script reads through an MCC tree and reads out tip attributes
# Converting these to a more parsable tab-delimited format
# Relies on each label existing for every node
# Load tree from file (keeps the last line matching /^\s+tree/)
filename = ARGV[0]
tree = ""
File.foreach(filename) { |line|
  tree = line if line =~ /^\s+tree/
}
# Substitute "|" for "," within label sets
while tree =~ /\{([A-Za-z0-9\-\.\"\|]+),/
  tree.gsub!(/\{([A-Za-z0-9\-\.\"\|]+),/, '{\1|')
end
# Remove "posterior" label
tree.gsub!(/posterior=[0-9\-\.\"\{\}\|]+/, "")
# Replace tiny numbers with 0s
tree.gsub!(/\d+\.\d+E\-\d+/, "0.0")
# collect labels
labels = tree.scan(/[\,\&]([A-Za-z0-9\_\%\.][A-Za-z0-9\_\%\.]+)=/)
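# Non-bang methods: uniq! returns nil when there are no duplicates,
# which would make the chained sort! fail
labels = labels.flatten.uniq.sort
puts labels.join("\t")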
count = labels.length
# Scan tree for labels. Add to label hash and print a row when a node's annotation closes.
h = Hash.new("NA")
tree.scan(/([A-Za-z0-9\_\%\.][A-Za-z0-9\_\%\.]+)=([A-Za-z0-9\-\.\"\{\}\|]+)([\,\]])/) { |label, value, delim|
  # Store the attribute before printing so that the last attribute of a node
  # ends up in that node's row rather than the next one
  h[label] = value.gsub("|", ",")
  if delim == "]"
    puts labels.map { |s| h[s] }.join("\t")
    h.clear
  end
}
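
To make the steps above easier to follow, here is a small self-contained sketch run on a made-up annotated tree string. It is not the script itself: the helper name `collate_annotations`, the taxon names and the attribute values are all assumptions for illustration, and it splits each `[&...]` block directly rather than streaming through `label=value` matches as the script does. It also leaves out the posterior and small-number clean-up steps.

```ruby
# Hypothetical helper (not part of the script above): returns [labels, rows]
# for one annotated Newick string.
def collate_annotations(tree)
  tree = tree.dup
  # Protect commas inside {...} sets by turning them into pipes,
  # so that commas only separate attributes.
  tree.gsub!(/\{[^{}]*\}/) { |set| set.tr(",", "|") }
  # Every attribute name is preceded by "&" or "," and followed by "=".
  labels = tree.scan(/[,&]([A-Za-z0-9_%.]+)=/).flatten.uniq.sort
  rows = []
  # One [&...] block per annotated node.
  tree.scan(/\[&([^\]]*)\]/).flatten.each do |body|
    row = Hash.new("NA") # attributes missing on a node print as NA
    body.split(",").each do |pair|
      label, value = pair.split("=", 2)
      row[label] = value.tr("|", ",") # restore commas inside sets
    end
    rows << labels.map { |l| row[l] }
  end
  [labels, rows]
end

# Made-up two-tip tree; names and values are illustrative only.
sample = '(A[&rate=0.5,height_range={0.0,1.0}]:1.0,B[&rate=0.7]:1.0)' \
         '[&rate=0.6,height_range={1.0,2.0}]:0.0;'

labels, rows = collate_annotations(sample)
puts labels.join("\t")
rows.each { |r| puts r.join("\t") }
```

Each output row corresponds to one annotated node, in the order the annotations appear in the string, with columns sorted by label name. Tip `B` has no `height_range` in this sample, so that column falls back to `NA`.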