Using Off-the-Shelf Harmful Content Detection Models: Best Practices for Model Reuse

Angela Schöpke-Gonzalez, Siqi Wu, Sagar Kumar, Libby Hemphill
ACM Digital Library
May 2, 2025
Supervised machine learning is a common approach to automated harmful content detection in support of content moderation. This approach relies on human-annotated data to train models to recognize classes of harmful content. For detection tasks, researchers or content moderation communities typically either design their own annotation tasks to generate training data for new harmful content detection models or reuse off-the-shelf (OTS) pre-trained harmful content detection models. OTS model reuse can enable detection tasks in resource-constrained settings and can help reduce the environmental impact of training new models -- an energy-intensive process. However, given the plethora of OTS models now available, determining which model to reuse for a particular task, and how to use it, can be challenging, especially because many of these models were developed for specific contexts that do not transfer easily to others. This work provides best practices for reusing OTS models for harmful content detection tasks. Using content analysis and statistical methods to evaluate assumptions about OTS model utility and reusability, we show that model reusers cannot assume that a model claimed to detect a particular concept will actually detect that concept. Based on our findings, we offer a decision tree for assessing whether an OTS model is appropriate for reuse in a new harmful content detection task. The decision tree directs model reusers to critically assess concept definitions, annotation task design, and additional features specified in our content analysis codebook to identify expected model output, and consequently to evaluate whether that OTS model is appropriate for reuse for a new detection task.
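As a concrete illustration of the reuse workflow the paper examines, the sketch below loads a publicly available OTS classifier with the Hugging Face transformers library and probes it with examples from a hypothetical new deployment context. The checkpoint name and probe texts are illustrative assumptions, not artifacts from the paper; the point is that a reuser should inspect a model's actual outputs on context-relevant inputs rather than rely on the concept the model is claimed to detect.

```python
# Minimal sketch of OTS model reuse, assuming the Hugging Face
# `transformers` library and the `unitary/toxic-bert` checkpoint as a
# representative OTS classifier (chosen for illustration; not a model
# evaluated in this paper).
from transformers import pipeline

# Load an off-the-shelf harmful content classifier.
classifier = pipeline("text-classification", model="unitary/toxic-bert")

# Probe the model with examples drawn from the *new* deployment context
# before trusting its label names: a model claimed to detect "toxicity"
# may operationalize that concept differently than the reuser expects.
probes = [
    "You are a complete idiot.",      # likely within the model's concept
    "I strongly disagree with you.",  # disagreement, not necessarily toxic
]
for text in probes:
    print(text, "->", classifier(text))
```

Comparing these outputs against the reuser's own definition of the target concept is one lightweight way to apply the paper's recommendation to assess concept definitions and expected model output before committing to reuse.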