
Northeastern’s newest infrastructure to accelerate scientific research is now operational. As online platforms continue to restrict access to their data, researchers around the world have found it more difficult to study online behavior. Developing software, recruiting participants, and keeping pace with evolving technologies have become increasingly daunting, especially amid budget cuts and grant terminations. The National Internet Observatory aims to change that, offering researchers access to large-scale data on how Americans experience the internet.
The race to study Information Technology
Information technology has fundamentally transformed American life. Generative AI is suddenly everywhere, and social media has overtaken traditional outlets as the primary source of information for many. Yet, just as these platforms grow in influence, they are becoming harder for academic researchers to study. Companies like Facebook and Reddit have sharply restricted data access, creating major barriers to understanding online behavior.
A team co-led by David Lazer, University Distinguished Professor at the Network Science Institute, and Northeastern's Dr. Christo Wilson and Dr. David Choffnes, with support from Dr. John Basl and Dr. Michelle N. Meyer, proposed the National Internet Observatory as a solution: collect data on Americans’ online experiences and make that data available to the scientific community for research. The project was awarded the first Mid-scale grant the NSF has ever given for a social science infrastructure project. Now, four years later, the project is welcoming its first external users from the research community.
The vision of a telescope for the Internet
The National Internet Observatory (NIO) is intended to collect data on the Internet as Americans experience it. What news articles do people see, and in what order? What are they asking ChatGPT, and how does it respond? How do they feel when they visit social media? Since 2023, the project has recruited over 10,000 participants to download software on their phones and web browsers. The software collects information about what they see, what they do, and where they go. The goal has been to generate a representative picture of the American internet while operating under a robust research ethics framework that emphasizes ongoing informed consent and multiple layers of participant protection, both technical and legal.
Promising early results
Early investigations into the data collected by the NIO have been promising. One initial study sought to replicate an earlier study, conducted from 2018 to 2020, that found no partisan filtering of Google search results. In 2023, NIO researchers confirmed the original finding of no partisan differences in what conservatives and liberals see when they search on Google. However, this replication took only several weeks to complete, compared with the three years it took to conduct the original study. Moreover, the analysis was powered by data from three times as many participants.
The team went even further. Using NIO data, they compared these results across different online platforms and found something striking: partisan differences varied substantially from one platform to another. While liberals and conservatives generally saw similar information on Google and Reddit, they saw very different information on Facebook and X.
Outreach efforts and access
To make its datasets available to a broader research community, the NIO team has been developing a phased access system. In the initial phase, researchers are invited to work with NIO datasets through specific projects developed by the NIO team and its Board of Directors, which comprises leading internet researchers and ethicists. Currently, there are 13 research projects based on proposals from over 50 faculty members across more than 20 universities. Topics range from the relationship between social media and depression to the effectiveness of privacy technologies in search results. In the second phase, NIO will open a submission system for independent research proposals, and approved projects will gain access to the NIO datasets.
The response from the research community has so far been overwhelmingly positive, with strong expressions of interest and a waiting list of 100 researchers eager to work with NIO data on one of the 13 approved projects. This momentum is due in part to the outreach efforts of two Network Science Institute postdocs, Pranav Goel and Kai-Cheng Yang (now an external affiliate), who have led NIO workshops at academic conferences nationwide since 2024. In addition to sharing NIO’s capabilities, they have engaged in dialogue with faculty, postdocs, and students about future uses of the platform and collected feedback on what data researchers across disciplines are most interested in accessing.
A new path forward for Internet science
The National Internet Observatory represents a new approach to internet research: one where the data are ethically sourced, trustworthy, and representative of the population, and where they are tied to individuals’ real lives, opinions, and traits. Researchers are particularly excited about using validated data, which spares them the need to build and maintain new software or manage the recruitment of participants from scratch.
This summer, NIO will onboard the 13 current research proposals and open the application process to the waitlisted researchers interested in working on these projects. Over the next six months, the number of projects is expected to grow to nearly 50. Researchers interested in accessing the platform and participating in one of the approved proposals can submit their information to the waitlist form or email help@nationalinternetobservatory.org.
For more information on the National Internet Observatory and on the submission process, visit https://nationalinternetobservatory.org