A wide range of graph embedding objectives decompose into two components: one that enforces similarity, attracting the embeddings of nodes that are perceived as similar, and another that enforces dissimilarity, repelling the embeddings of nodes that are perceived as dissimilar. Without repulsion, the embeddings would collapse into trivial solutions. Skip-Gram Negative Sampling (SGNS) is a popular and efficient repulsion approach that prevents collapse by repelling each node from a sample of dissimilar nodes. In this work, we show that when repulsion is most needed and the embeddings approach collapse, SGNS node-wise repulsion is, in the aggregate, an approximate re-centering of the node embedding dimensions. Such dimension operations are more scalable than node operations and produce a simpler geometric interpretation of the repulsion. Our theoretical result establishes dimension regularization as an effective, and more efficient, alternative to skip-gram node contrast for enforcing dissimilarity among node embeddings. We use this result to propose a flexible algorithm augmentation framework that improves the scalability of any existing algorithm that uses SGNS. The framework prioritizes node attraction and replaces SGNS with dimension regularization. We instantiate this generic framework for LINE and node2vec and show that the augmented algorithms preserve downstream link-prediction performance while reducing GPU memory usage by up to 33.3% and training time by 23.4%. Moreover, we show that completely removing repulsion (a special case of our augmentation framework) in LINE reduces training time by 70.9% on average, while increasing link-prediction performance, especially for graphs that are globally sparse but locally dense. Global sparsity slows down dimensional collapse, while local density ensures that node attraction brings the nodes near their neighbors. In general, however, repulsion is needed, and dimension regularization provides an efficient alternative to SGNS.
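To make the contrast between the two forms of repulsion concrete, the sketch below compares a standard SGNS negative-sampling term with a dimension-centering penalty. This is a minimal illustration in PyTorch, not the paper's implementation: the function names, uniform negative sampling, and loss weighting are assumptions made only for exposition.

```python
import torch
import torch.nn.functional as F

def sgns_repulsion(emb: torch.Tensor, num_neg: int = 5) -> torch.Tensor:
    """Node-wise repulsion: push each node away from a uniform sample of negatives.

    emb: (n, d) node embedding matrix. Returns the mean SGNS negative-sample loss.
    (Illustrative; real SGNS implementations typically sample from a unigram^0.75
    distribution and pair negatives with observed edges.)
    """
    n = emb.shape[0]
    neg_idx = torch.randint(0, n, (n, num_neg))            # hypothetical uniform negatives
    neg = emb[neg_idx]                                      # (n, num_neg, d)
    scores = torch.einsum("nd,nkd->nk", emb, neg)           # dot products with negatives
    return -F.logsigmoid(-scores).mean()                    # standard -log sigma(-u.v) term

def dimension_regularization(emb: torch.Tensor) -> torch.Tensor:
    """Dimension-wise alternative: penalize nonzero per-dimension means,
    i.e., re-center the embedding dimensions in aggregate.
    Cost is O(n*d) with no negative sampling."""
    return emb.mean(dim=0).pow(2).sum()
```

Under this reading, the paper's result says that near collapse the aggregate gradient of the first term behaves approximately like the gradient of the second, which is why the cheaper dimension operation can stand in for node-wise negative sampling.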



