Clustering and phase transitions in transformers
Visiting speaker
Yury Polyanskiy
Professor of Electrical Engineering and Computer Science, MIT
Past Talk
Hybrid talk
Friday
Mar 8, 2024
1:00 pm
EST
Virtual
177 Huntington Ave.
11th floor
Devon House
58 St Katharine's Way
London E1W 1LP, UK
Online
Almost all of the recent advances in AI (large language models, image/video diffusion models, recommendation systems, and visuomotor policies in robotics) are built on a neural architecture known as the transformer. We identify the propagation of representations through the layers of a transformer with a particular kind of interacting particle system, whose dynamics exhibit several interesting properties. In the initial (short) phase, the particles coalesce to form medium-sized clusters. In the second (long) phase, these clusters slowly spin around and occasionally merge until they form a single lump. We hypothesize that the (metastable) clustering of the first phase may explain transformers' ability to perform robust long-horizon logical deductions. The second phase corresponds to the synchronization behavior discovered in certain dynamics, such as the Kuramoto model, which turns out to be a special case of the transformer.
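To make the particle picture concrete, here is a minimal sketch of one way such dynamics could be simulated. It is a toy model under stated assumptions, not the speaker's construction: each token is treated as a point on the unit sphere, the query, key, and value matrices are taken to be the identity, and each layer is approximated by a small explicit Euler step. The names `attention_step`, `simulate`, `beta`, `dt`, and the `spread` diagnostic are all hypothetical choices made for illustration.

```python
import numpy as np


def attention_step(x, beta=1.0, dt=0.1):
    """One explicit Euler step of a toy attention flow on the unit sphere.

    Assumption: identity query/key/value matrices, so the attention
    weights depend only on pairwise inner products of the particles.
    """
    # Row-wise softmax of beta * <x_i, x_j> (shifted for numerical stability).
    logits = beta * (x @ x.T)
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Attention-weighted average of all particles, seen from each particle.
    velocity = weights @ x

    # Keep only the component tangent to the sphere at each x_i.
    velocity -= np.sum(velocity * x, axis=1, keepdims=True) * x

    # Euler step, then renormalize so every particle stays on the sphere.
    x_new = x + dt * velocity
    return x_new / np.linalg.norm(x_new, axis=1, keepdims=True)


def simulate(n=32, d=3, beta=4.0, dt=0.1, steps=2000, seed=0):
    """Run the toy flow from random initial points and record a spread measure."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)

    spread = np.empty(steps)
    for t in range(steps):
        x = attention_step(x, beta=beta, dt=dt)
        # 1 - |mean| is 0 when all particles coincide and grows as they disperse.
        spread[t] = 1.0 - np.linalg.norm(x.mean(axis=0))
    return x, spread


if __name__ == "__main__":
    final_x, spread = simulate()
    print("spread after first step:", spread[0])
    print("spread after last step: ", spread[-1])
```

Plotting `spread` against the step index is one simple way to look for the two time scales described above; how pronounced they are in this toy model will depend on `beta`, `n`, and `d`.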
About the speaker
Yury Polyanskiy is a Professor of Electrical Engineering and Computer Science, a member of IDSS and LIDS at MIT, and an IEEE Fellow. He received his M.S. degree in applied mathematics and physics from the Moscow Institute of Physics and Technology, Moscow, Russia, in 2005, and his Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2010. His research interests span information theory, statistical learning, error-correcting codes, wireless communication, and fault tolerance. Dr. Polyanskiy won the 2020 IEEE Information Theory Society James Massey Award, the 2013 NSF CAREER Award, and the 2011 IEEE Information Theory Society Paper Award.