Skip to content

Latest commit

 

History

History
44 lines (37 loc) · 1.99 KB

README.md

File metadata and controls

44 lines (37 loc) · 1.99 KB

agglom_cluster

Agglomerative clustering tool for network-x graphs

Clustering

Implements the algorithm described by: "Fast algorithm for detecting community structure in networks" M. E. J. Newman. 2004 http://arxiv.org/pdf/cond-mat/0309508v1.pdf This is a greedy agglomerative hierarchical clustering algorithm

This implementation uses a heap to select the best pair to cluster at each iteration

  • A naive implementation considers all "n" edges in the graph (O(n))
  • A heap reduces this search dramatically (O(log(n))

Problems

  • The actual Modularity score does not exactly match the Modularity score of the example on the wikipedia page
  • Does not work for directed graphs
  • Does not work for negative graphs (TODO add this capability)

Stores the following information

  • Supergraph
    • Duplicate of the original graph. Gets manipulated during the clustering: edges and nodes are condensed and reweighted
    • Cluster degree stored in node attribute
    • Number of connections between clusters stored in edge attribute (weighted edge)
    • Implicitly keeps track of the existing nodes which is required for the heap
  • Dendrogram
    • Stores the clustering history in a tree
  • Heap
    • Stores the modularity quality difference for combining each pair of existing nodes
    • Popping returns the node pair with the largest modulairty/quality difference, then smallest id1, then smallest id2
      • Stored as a tuple (value, id1, id2) where the modularity/value is negated, and id1 is the smaller of the two
      • Processing the smaller IDs first means that smaller clusters will be chosen first during modularity tie breaking
    • Non-existing nodes are not actively removed from the heap, so pairs with non-existing nodes are ignored when popping
  • Quality History
    • Charts the Modularity score for each number of clustering

Author

Author(s): Ethan Lozano & Matthew Seal

Collaborator(s):

(C) Copyright 2013, Opengov