Alexander Gates
Talk recording
One of the most fundamental approaches for understanding complex data is clustering; for example, in network science, communities capture central organizing principles of the link structure and are critical for understanding the dynamical processes that operate on networks. Across many clustering problems, such as evaluating clustering methods, identifying consensus clusterings, and tracking the evolution of clusters over time, the most basic task is quantitatively comparing clusterings. Most existing methods focus on comparing the clusters, either by measuring statistical independence, matching similar clusters, or counting co-clustered element pairs. Yet all common measures have critical biases, and no measure accommodates both overlapping and hierarchical clusterings. Here, in collaboration with Ian Wood & Y.Y. Ahn, I demonstrate how standard clustering similarity measures fail to meet common-sense expectations and propose a new framework that not only addresses such biases but also unifies the comparison of overlapping and hierarchically structured clusterings. Furthermore, we demonstrate that our framework can provide detailed insights into how the clusterings differ. We apply our method to neuroscience, handwriting, and social network datasets to illustrate the strengths of our framework and reveal new insights into these datasets. The universality of clustering across disciplines suggests the far-reaching impact of our framework across all areas of science.