We show a sampling-based randomized approximation algorithm for estimating the betweenness centrality of all nodes in a graph, and to keep the estimations up-to-date as the graph evolves. Our algorithm, called ABRA, employs progressive sampling and a stopping condition that uses efficient-to-compute bounds to the Rademacher Averages, a fundamental concept from statistical learning theory. We also use pseudodimension to prove an upper bound to the number of samples needed by the algorithm. ABRA outperforms existing state-of-the-art algorithms offering the same quality guarantees, both in running time and in the number of samples needed.
Matteo Riondato is a Research Scientist in the Labs at Two Sigma Investments and a Visiting Assistant Professor in Computer Science at Brown University. His research focus is in algorithmic data science, developing theory and methods to extract the most information from large datasets, as fast as possible and in a statistically sound way. He got is PhD from Brown and held postdoc positions at Brown and Stanford. His works received the best student poster award at the 2014 SIAM International Conference on Data Mining (SDM) and the best student paper award at the 2016 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). He tweets @teorionda and lives at http://matteo.rionda.to.