Babak Fotouhi
Social networks are valuable tools for studying the diverse linkages between social structure and human behavior. Network sampling methods are used to estimate the structural properties of these networks from limited sets of observations. Sampling social networks in particular is costly and challenging in practice. One ubiquitous obstacle, which both introduces noise and severely restricts the amount of information that can be collected, is respondent fatigue during interviews. The conventional remedy is to impose an artificial limit on the number of social ties each respondent may name, typically fewer than 10. This is called the Fixed Choice Design (FCD).
FCD is used in almost all social network studies, including large-scale surveys such as the National Longitudinal Study of Adolescent Health, the General Social Survey, and the National Social Life, Health, and Aging Project. No framework exists in the literature for inferring the structural properties of the underlying network from FCD data. Consequently, although it is common knowledge that FCD inevitably discards much information about connectivity, most social network studies (e.g., of diffusion and social contagion) have been forced to use the crude sampled network as is, without any inference.
In this paper, we propose computationally feasible estimators for several important network statistics and corroborate their accuracy via extensive simulations. We demonstrate that using the crude estimates leads to considerable error, namely consistent and gross underestimation of the magnitude of epidemics and social contagion, and that employing the proposed estimators alleviates this problem. Our proposal offers substantial room for improvement in the analysis of almost all existing offline social network datasets.
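To make the Fixed Choice Design concrete, the following is a minimal sketch of how a cap on named ties truncates a network and lowers its crude mean degree. It is an illustration only, not the estimators proposed in the paper. The random-graph model, the assumption that respondents name ties uniformly at random, the rule that an edge is observed if either endpoint names the other, and all function names (`random_graph`, `fixed_choice_sample`, `mean_degree`) are assumptions made for this example.

```python
import random


def random_graph(n, p, rng):
    """Undirected random graph as adjacency sets.

    Assumption: an Erdos-Renyi graph stands in for the 'true' social network.
    """
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def fixed_choice_sample(adj, k, rng):
    """Apply a Fixed Choice Design with cap k.

    Assumptions: each respondent names at most k of their ties, chosen
    uniformly at random, and a tie is observed if at least one of its
    endpoints names the other.
    """
    observed = {v: set() for v in adj}
    for v, neighbours in adj.items():
        named = rng.sample(sorted(neighbours), min(k, len(neighbours)))
        for u in named:
            observed[v].add(u)
            observed[u].add(v)
    return observed


def mean_degree(adj):
    """Average number of ties per node."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


rng = random.Random(0)
true_net = random_graph(300, 0.05, rng)          # expected mean degree near 15
fcd_net = fixed_choice_sample(true_net, 5, rng)  # each respondent names at most 5 ties

print("true mean degree: ", mean_degree(true_net))
print("crude FCD estimate:", mean_degree(fcd_net))
```

With a cap of 5 and 300 respondents, at most 1500 ties can be observed, so the crude mean degree cannot exceed 10 regardless of the true value. This is the kind of systematic underestimation that using the sampled network without inference carries into downstream analyses.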