Skip to content

Instantly share code, notes, and snippets.

@nitaku
Last active August 29, 2015 13:56
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save nitaku/8822395 to your computer and use it in GitHub Desktop.
Save nitaku/8822395 to your computer and use it in GitHub Desktop.
Untangle a tangled tree

This example uses python graph-tool to convert a tangled tree into a regular tree. A tangled tree is a tree with some nodes with multiple parents, or, more precisely, a Directed Acyclic Graph with a node identified as root, in which not too many nodes have multiple parents.

Starting from this example tangled tree rooted in the node A:

Tangled tree

Two different methods are presented: a simple algorithm yielding one of the shortest tree covering all nodes, and a more complex one that returns one of the longest. In both cases, for each node, its distance from the root is computed. The difference is that the shortest tree algorithm uses the shortest distance (by invoking the corresponding function in graph-tool), while the other one uses the longest distance (that's a lot more difficult to compute. see this page for a description of an algorithm that leverages the topological sort of the graph).

Distances

By filtering the obtained graphs (imposing that the recursion level of a walk that starts from the root must be equal to the distance of the child node reached by an edge), the following results are obtained:

Results

As we can see from the pictures, only the shortest (or longest) paths to the root are retained.

In some cases, longest trees are more useful than shortest. For example, if we consider nodes G, H and I, we can see that the edge from G to I is still indirectly represented in the longest tree (as a path from I to H to G). In the shortest tree instead, the information associated with the edge that goes from H to I is completely lost. Unfortunately, there are still many cases in which the information is lost forever even with longest trees (e.g. the edge from C to F).

Display the source blob
Display the rendered blob
Raw
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
from graph_tool.all import *
print 'Creating the tangled tree...'
g = Graph()
v0 = g.add_vertex()
v1 = g.add_vertex()
v2 = g.add_vertex()
v3 = g.add_vertex()
g.add_edge(v0, v1)
g.add_edge(v0, v2)
g.add_edge(v0, v3)
v4 = g.add_vertex()
v5 = g.add_vertex()
v6 = g.add_vertex()
g.add_edge(v1, v4)
g.add_edge(v1, v5)
g.add_edge(v2, v5) # v5 has two parents
g.add_edge(v2, v6)
v7 = g.add_vertex()
v8 = g.add_vertex()
v9 = g.add_vertex()
g.add_edge(v6, v7)
g.add_edge(v6, v8)
g.add_edge(v8, v9)
g.add_edge(v7, v8) # v8 has two parents in two different levels
label = g.new_vertex_property('string')
label[v0] = 'A'
label[v1] = 'B'
label[v2] = 'C'
label[v3] = 'D'
label[v4] = 'E'
label[v5] = 'F'
label[v6] = 'G'
label[v7] = 'H'
label[v8] = 'I'
label[v9] = 'J'
# use the same layout for all drawings
layout = sfdp_layout(g)
print 'Drawing the tangled tree...'
graph_draw(g,
pos=layout,
output_size=(400, 400),
output='tangled.pdf',
vertex_size=32,
vertex_fill_color=(0.9,0.9,0.9,1),
vertex_pen_width=2,
vertex_color=(0.7,0.7,0.7,1),
vertex_text=label,
vertex_font_family='sans-serif',
edge_pen_width=2,
edge_marker_size=8
)
print 'Computing shortest distances from root...'
shortest_dist = shortest_distance(g, source=v0, directed=None)
print 'Drawing the tangled tree with shortest distances...'
graph_draw(g,
pos=layout,
output_size=(400, 400),
output='shortest_distances.pdf',
vertex_size=32,
vertex_fill_color=(0.9,0.9,0.9,1),
vertex_pen_width=2,
vertex_color = (0.7,0.7,0.7,1),
vertex_text = shortest_dist,
vertex_font_family='sans-serif',
edge_pen_width=2,
edge_marker_size=8
)
print 'Creating the shortest path tree...'
# all edges have to lead to a node with depth equal to shortest_dist, otherwise they are part of a not-minimal path
# in the case of a tie, an already seen node must not be taken twice
from itertools import izip
retain_shortest = g.new_edge_property('bool')
already_visited = set()
def walk(node, depth):
for edge, child in izip(node.out_edges(), node.out_neighbours()):
if shortest_dist[child] == depth and child not in already_visited:
retain_shortest[edge] = True
already_visited.add(child)
walk(child, depth+1)
else:
retain_shortest[edge] = False
tree = walk(g.vertex(0), 1)
# keep the edges covered by the tree
g.set_edge_filter(retain_shortest)
print 'Drawing the resulting shortest path tree...'
graph_draw(g,
pos=layout,
output_size=(400, 400),
output='shortest.pdf',
vertex_size=32,
vertex_fill_color=(0.9,0.9,0.9,1),
vertex_pen_width=2,
vertex_color=(0.7,0.7,0.7,1),
vertex_text=label,
vertex_font_family='sans-serif',
edge_pen_width=2,
edge_marker_size=8
)
# reset the filter
g.set_edge_filter(None)
print 'Computing topological sort...'
topological_order = topological_sort(g)
# print topological_order
print 'Computing longest distances from root...'
print 'Assigning initial distances...'
longest_dist = g.new_vertex_property('int32_t')
for v in g.vertices():
longest_dist[v] = -2147483648 # minimum int32_t number as -Inf ???
longest_dist[g.vertex(0)] = 0 # root is initialized to 0
from itertools import imap
# unweighted topological sort (all edges have weight=1)
# adapted from http://www.geeksforgeeks.org/find-longest-path-directed-acyclic-graph/
for u in imap(g.vertex, reversed(topological_order)):
# print 'Considering node %d (current distance %d)' % (u, longest_dist[u])
for v in u.out_neighbours():
# print ' Comparing with node %d (current distance %d)' % (v, longest_dist[v])
if longest_dist[v] < longest_dist[u]+1:
longest_dist[v] = longest_dist[u]+1
# print ' Updated distance to %d' % (longest_dist[v])
print 'Drawing the tangled tree with longest distances...'
graph_draw(g,
pos=layout,
output_size=(400, 400),
output='longest_distances.pdf',
vertex_size=32,
vertex_fill_color=(0.9,0.9,0.9,1),
vertex_color = (0.7,0.7,0.7,1),
vertex_pen_width=2,
vertex_text = longest_dist,
vertex_font_family='sans-serif',
edge_pen_width=2,
edge_marker_size=8
)
print 'Creating the longest path tree...'
# all edges have to lead to a node with depth equal to longest_dist, otherwise they are part of a not-maximal path
# in the case of a tie, an already seen node must not be taken twice
from itertools import izip
retain_longest = g.new_edge_property('bool')
already_visited = set()
def walk(node, depth):
for edge, child in izip(node.out_edges(), node.out_neighbours()):
if longest_dist[child] == depth and child not in already_visited:
retain_longest[edge] = True
already_visited.add(child)
walk(child, depth+1)
else:
retain_longest[edge] = False
tree = walk(g.vertex(0), 1)
# keep the edges covered by the tree
g.set_edge_filter(retain_longest)
print 'Drawing the resulting longest path tree...'
graph_draw(g,
pos=layout,
output_size=(400, 400),
output='longest.pdf',
vertex_size=32,
vertex_fill_color=(0.9,0.9,0.9,1),
vertex_pen_width=2,
vertex_color=(0.7,0.7,0.7,1),
vertex_text=label,
vertex_font_family='sans-serif',
edge_pen_width=2,
edge_marker_size=8
)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment