Studying the (in)effectiveness of low dimensional graph embeddings
C. Seshadhri
C. Seshadhri, University of California Santa Cruz
Past Talk
Friday
Apr 30, 2021
Watch video
4:00 pm
Virtual
177 Huntington Ave.
11th floor
Online
Register here

Register in advance for this meeting here. After registering, you will receive a confirmation email containing information about joining the meeting.

Low dimensional graph embeddings are a fundamental and popular tool used for machine learning on graphs. Given a graph, the basic idea is to produce a low-dimensional vector for each vertex, such that "similarity" in geometric space corresponds to "proximity" in the graph. These vectors can then be used as features in a plethora of machine learning tasks, such as link prediction, community labeling, recommendations, etc. Despite many results emerging in this area over the past few years, there is less study on the core premise of these embeddings. Can such low-dimensional embeddings effectively capture the structure of real-world (such as social) networks? Contrary to common wisdom, we mathematically prove and empirically demonstrate that popular low-dimensional graph embeddings do not capture salient properties of real-world networks. We mathematically prove that common low-dimensional embeddings cannot generate graphs with both low average degree and large clustering coefficients, which have been widely established to be empirically true for real-world networks. Empirically, we observe that the embeddings generated by popular methods fail to recreate the triangle structure of real-world networks, and do not perform well on certain community labeling tasks. This is joint work with Ashish Goel, Caleb Levy, Aneesh Sharma, and Andrew Stolman.

About the speaker
About the speaker
C. Seshadhri (Sesh) is a professor of Computer Science at the University of California, Santa Cruz. Prior to joining UCSC, he was a researcher at Sandia National Labs, Livermore. His primary interests are in theoretical computer science and the mathematical foundations of big data. His work spans many areas: sublinear algorithms, graph algorithms, graph modeling, scalable computation, and data mining. In the theory world, his work has resolved numerous open problems in monotonicity testing and graph property testing. A number of his papers in the interface of TCS and applied algorithms have received paper awards at KDD, WWW, ICDM, SDM, and WSDM. He received the 2019 SDM/IBM Early Career Award for Excellence in Data Analytics.
C. Seshadhri (Sesh) is a professor of Computer Science at the University of California, Santa Cruz. Prior to joining UCSC, he was a researcher at Sandia National Labs, Livermore. His primary interests are in theoretical computer science and the mathematical foundations of big data. His work spans many areas: sublinear algorithms, graph algorithms, graph modeling, scalable computation, and data mining. In the theory world, his work has resolved numerous open problems in monotonicity testing and graph property testing. A number of his papers in the interface of TCS and applied algorithms have received paper awards at KDD, WWW, ICDM, SDM, and WSDM. He received the 2019 SDM/IBM Early Career Award for Excellence in Data Analytics.