Efficient access to Big Data and the development of related technologies have driven the study of complex social systems. In this thesis, I focus on two aspects of the study of complex social systems: predictability and inequality.
I first demonstrate the predictability aspect of complex social systems in the publishing industry. I begin with a Big Data analysis of New York Times bestsellers and bestselling authors, including the genre composition, longevity, and sales of bestsellers, as well as the gender composition and career characteristics of bestselling authors. Next, I examine weekly book sales curves and discover a universal pattern of ``fast rise, slow decline.'' Though modeling this pattern enables prediction of total sales by the ``peeking strategy,'' such a prediction requires at least 25 weeks of sales data for a given book, a period well beyond the peak of the sales curve. To predict book sales early, I extract features of the author, the book, and the publisher that are available before publication. I then develop the ``Learning to Place'' algorithm, which addresses the imbalance in book sales, i.e., most books have low sales and far fewer have high sales. This framework also helps identify the characteristics that drive book sales, which matters for understanding complex social systems.
The second aspect of complex social systems that I examine in this thesis is inequality. First, I conduct a large-scale investigation of gender underrepresentation in the art world, using various statistical tools and complex networks. I propose two criteria, gender-neutral and gender-balanced, to categorize the representation of women artists in each institution. I find a systematic underrepresentation of women artists in institutions, which may hinder their career development and access to auctions. Finally, I use logistic regression to connect inequality in institutional exhibitions with inequality in auctions, and find that exhibition inequality affects artists' access to auctions.
Continuing with inequality, the last project focuses on how inequality in information access emerges in a network. One of the most important functions of networks is the dissemination of information, and it is argued that information is the basis for all kinds of inequality. In this project, I measure the information inequality of different network models under different spreading processes and identify which properties are related to it. I propose different models with a majority/minority dichotomy, along with mechanisms such as homophily, preferential attachment, and diversity. I simulate information spreading under different settings, varying the type of process, the transmission rate, and the seeding. I propose a measure of inequality in information access that allows us to examine inequality at different stages of the process. I find that equality in information access depends on both the network structure and the spreading process. I also observe that, under certain circumstances, there may be a trade-off between equality and efficiency in information spreading.
Ultimately, the goal of this thesis is to provide a starting point and inspiration for exploring the predictability of complex social systems, especially for other cultural products such as films, music, and videos. It also aims to encourage the study of inequalities in complex social systems, not only through specific case studies such as gender or racial bias, but also by asking how inequalities arise and which interventions might promote information equality.
Albert-László Barabási (Chair), Network Science Institute, Northeastern University
Tina Eliassi-Rad, Network Science Institute, Northeastern University
Christoph Riedl, Network Science Institute, Northeastern University
Emilio Ferrara, Viterbi School of Engineering, University of Southern California
Xindi is a fifth-year Network Science PhD student working with Prof. Albert-László Barabási at CCNR. She received her Bachelor of Engineering from the University of Electronic Science & Technology of China (UESTC) in 2015. She is broadly interested in network science, computational social science, and complex systems. Her current focus is computational social science and the relationship between technology and society, including projects on unraveling gender bias in the art world and fairness in machine learning.