ε - WGX: Adaptive Edge Probing for Enhancing Incomplete Networks
No matter how meticulously constructed, network datasets are often partially observed and incomplete. For example, most of the publicly available data from online social networking services (such as Facebook and Twitter) are collected via apps, users who make their accounts public, and/or the resources available to the researcher/practitioner. Such incompleteness can lead to inaccurate findings. We introduce the Adaptive Edge Probing problem. Suppose that one has observed a networked phenomenon via some form of sampling and has a budget to enhance the incomplete network by asking for additional information about specific nodes, with the ultimate goal of obtaining the most valuable information about the network as a whole. Which nodes should be further explored? We present ε-WGX, a network-based explore-exploit algorithm for identifying which nodes in the incomplete network to probe. Aggregated over multiple datasets and a wide range of probing budgets, we find that ε-WGX outperforms other explore-exploit strategies and baseline probing strategies. For example, for the task of adding as many nodes as possible, over incomplete networks observed via four popular sampling methods, ε-WGX outperforms the best comparison strategy by 12%-23% on average.