20 Sep 2011 21:09
Re: Fast method for duplicated edge removal
Tiago de Paula Peixoto <tiago <at> skewed.de>
2011-09-20 19:09:58 GMT
Hi Giuseppe,

On 09/20/2011 04:36 PM, Giuseppe Profiti wrote:
> Hello,
> I'm using a csv reader to parse a very big file and create an
> undirected graph accordingly.
> The file can contain duplicated edges (i.e. A B in one row, B A in
> another one), so I'm checking
>     if g.edge(v1,v2)==None:
>         e = g.add_edge(v1,v2)
> in order to discard them (v1 and v2 are vertices created from what
> it's read from the file).
> However the graph contains a lot of edges (few millions) and vertices
> (many thousands), with a potentially high degree for the vertices, and
> it takes a lot to process the data.
> As far as I read in the source code, the Graph.edge() method checks
> all the outgoing edges of the source node, but even if I check which
> node has the highest degree, it takes a lot of time to build the
> graph.
>
> Is there any other way to remove the duplicated edges? Maybe an edge
> filter based on some lambda wizardry?

The library includes a 'remove_parallel_edges()' function which does
what you want, and is fast. If you want to mask the parallel edges
temporarily, you can use filtering, as such:

    l = label_parallel_edges(g)
    g = GraphView(g, efilt=lambda e: l[e] == 0)

Cheers,
Tiago

-- 
Tiago de Paula Peixoto <tiago <at> skewed.de>
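
For readers who would rather not insert the duplicates in the first place, a different approach from the one suggested above is to filter the rows while reading the file, before any call to g.add_edge(). The sketch below is plain Python and does not use graph-tool at all; the helper name unique_undirected_edges and the sample rows are hypothetical, and it assumes the two endpoint labels in a row can be compared with each other (e.g. both are strings).

```python
def unique_undirected_edges(pairs):
    """Yield each undirected edge once, treating (a, b) and (b, a) as the same edge.

    Hypothetical helper for illustration; not part of graph-tool.
    """
    seen = set()
    for a, b in pairs:
        # Normalize the pair so that "A B" and "B A" map to the same key.
        key = (a, b) if a <= b else (b, a)
        if key not in seen:
            seen.add(key)
            yield a, b


# Made-up sample rows standing in for what the csv reader would produce.
rows = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "B"), ("C", "B")]
edges = list(unique_undirected_edges(rows))
```

Each membership test on the set takes roughly constant time on average, so the cost no longer depends on vertex degree the way a per-row g.edge(v1, v2) lookup does. The price is memory for one key per distinct edge. The surviving pairs can then be mapped to vertices and passed to g.add_edge(v1, v2) as in the original loop.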
_______________________________________________
graph-tool mailing list
graph-tool <at> skewed.de
http://lists.skewed.de/mailman/listinfo/graph-tool