How history is told is perpetually contested, shaping how nations and people understand both their pasts and the current moment. Who and what from history gets recorded, and what is missing from our collective historical understanding? Wikipedia represents the largest attempt to summarize all human knowledge and is increasingly recognized as the global consensus view about people, places, events and things around the world, providing an opportunity to systematically analyze precisely what is included in, and what is omitted from, our collective histories. Using women’s movements in the United States as a case study of a complex and debated historical field, we quantify Wikipedia’s recall, that is, the comprehensiveness of its coverage of these movements. Using primary documents from three sections of the women’s movement between 1899 and 1935 (a professional women’s organization, a working-class organization, and writings from Black women suffragists), we apply the phrase-mining RAKE algorithm to identify key phrases used by this movement, producing a “ground truth” of things and concepts that ought to be covered in Wikipedia. Using a contemporary snapshot of Wikipedia, we then examine which of these phrases are present in Wikipedia, whether they appear in history articles, and whether there are systematic differences in coverage across these three subcollections. In doing so, we identify a typology of mechanisms leading to historical omissions: paucity, restrictive paradigms, and categorical narrowness. We discuss the implications for both historians and Wikipedia editors.
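To make the notion of recall concrete, the following is a minimal sketch of how the share of key phrases found in a set of articles might be computed per subcollection. It assumes the key phrases have already been extracted (for example by RAKE) and that matching is a simple case-insensitive substring test; the function names, the collection labels, and the toy data are all hypothetical and invented for illustration, not taken from the study.

```python
from typing import Dict, Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so matching ignores case and spacing."""
    return " ".join(text.lower().split())


def recall(phrases: Iterable[str], articles: List[str]) -> float:
    """Fraction of distinct phrases that occur verbatim in at least one article."""
    unique = {normalize(p) for p in phrases}
    if not unique:
        return 0.0
    corpus = [normalize(a) for a in articles]
    found = sum(1 for p in unique if any(p in a for a in corpus))
    return found / len(unique)


def recall_by_collection(
    collections: Dict[str, List[str]], articles: List[str]
) -> Dict[str, float]:
    """Compute recall separately for each named subcollection of phrases."""
    return {name: recall(phrases, articles) for name, phrases in collections.items()}


# Toy, invented data for illustration only (assumption, not study data).
toy_articles = [
    "The Nineteenth Amendment granted women the right to vote.",
    "Trade unions organized for the eight-hour day.",
]
toy_collections = {
    "collection_a": ["right to vote", "equal pay"],
    "collection_b": ["eight-hour day", "Trade Unions"],
}
toy_recall = recall_by_collection(toy_collections, toy_articles)
```

Substring matching is a deliberate simplification here: it can count a phrase as present when it only appears inside a longer, unrelated expression, so a real analysis would likely need tokenized or otherwise stricter matching.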
Laura K. Nelson is an assistant professor of sociology in the College of Social Sciences and Humanities at Northeastern University, where she is also core faculty at the NULab for Texts, Maps, and Networks and serves on the Executive Committee for the Women's, Gender, and Sexuality Studies program. Previously, she was a postdoctoral research fellow at Digital Humanities @ Berkeley and the Berkeley Institute for Data Science at the University of California, Berkeley, and in the Management and Organizations Department in the Kellogg School of Management at Northwestern University, where she was also a research affiliate at the Northwestern Institute on Complex Systems. She uses computational tools, principally automated text analysis and network analysis, to study social movements, culture, gender, institutions, and organizations. She is currently working on a book manuscript that combines these tools and concepts to understand the long-term trajectory of women's movements in the United States. She has published in Sociological Methods and Research and with Oxford University Press and Springer, among other outlets, and has given invited talks and workshops on computational methods throughout the U.S. and internationally.