Unwinding the hairball graph: Pruning algorithms for weighted complex networks
Empirical networks of weighted dyadic relations often contain “noisy” edges that alter the global characteristics of the network and obfuscate the most important structures therein. Graph pruning is the process of identifying the most significant edges according to a generative null model and extracting the subgraph consisting of those edges. Here, we focus on integer-weighted graphs commonly arising when weights count the occurrences of an “event” relating the nodes. We introduce a simple and intuitive null model related to the configuration model of network generation and derive two significance filters from it: the marginal likelihood filter (MLF) and the global likelihood filter (GLF). The former is a fast algorithm assigning a significance score to each edge based on the marginal distribution of edge weights, whereas the latter is an ensemble approach which takes into account the correlations among edges. We apply these filters to the network of air traffic volume between US airports and recover a geographically faithful representation of the graph. Furthermore, compared with thresholding based on edge weight, we show that our filters extract a larger and significantly sparser giant component.