CoreScope: Graph Mining Using k-Core Analysis--Patterns, Anomalies, and Algorithms
How do the k-core structures of real-world graphs look like? What are the common patterns and the anomalies? How can we use them for algorithm design and applications? A k-core is the maximal subgraph where all vertices have degree at least k. This concept has been applied to such diverse areas as hierarchical structure analysis, graph visualization, and graph clustering. Here, we explore pervasive patterns that are related to k-cores and emerging in graphs from several diverse domains. Our discoveries are as follows: (1) MIRROR PATTERN: coreness of vertices (i.e., maximum k such that each vertex belongs to the k-core) is strongly correlated to their degree. (2) CORETRIANGLE PATTERN: degeneracy of a graph (i.e., maximum k such that the k-core exists in the graph) obeys a 3-to-1 power law with respect to the count of triangles. (3) STRUCTURED CORE PATTERN: degeneracy-cores are not cliques but have non-trivial structures such as core-periphery and communities. Our algorithmic contributions show the usefulness of these patterns. (1) CORE-A, which measures the deviation from MIRROR PATTERN, successfully finds anomalies in real-world graphs complementing densest-subgraph based anomaly detection methods. (2) CORE-D, a single-pass streaming algorithm based on CORE-TRIANGLE PATTERN, accurately estimates the degeneracy of billion-scale graphs up to 7times faster than a recent multipass algorithm. (3) CORE-S, inspired by STRUCTURED CORE PATTERN, identifies influential spreaders up to 17times faster than top competitors with comparable accuracy.