Stefano Cresci wants platforms to be honest about their moderation – and he is using data from the Digital Services Act (DSA) to check. His research shows that bans and removals work for most users, but a vocal minority becomes more aggressive. ‘There is no one-size-fits-all solution,’ he cautions.
One of the most surprising findings from his research, he says, is that people do not all react to moderation in the same way. ‘Many online moderation interventions – such as content removals, temporary suspensions, or community bans – are applied at scale, often affecting thousands of users at once.
‘Most users show modest, “desired” changes in behaviour – for example, they may reduce their activity or post less toxic content after being moderated. But we often found small minorities of users who responded in the opposite direction. This backlash highlights how complex human responses to moderation can be.’
Cresci’s team analysed 1.5 billion moderation records from the DSA Transparency Database (Article 17), covering the EU’s eight largest platforms, and cross-checked them against company Transparency Reports (Article 15).
The results revealed striking inconsistencies. ‘One notable case was X, which declared that it moderates only synthetic media – such as deepfakes – entirely manually and at instantaneous speed. This is clearly implausible for a platform of its scale,’ Cresci says. Other platforms also showed gaps in how they report automation.
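To give a concrete sense of how such cross-checks can work, the sketch below aggregates automation rates per platform from a Transparency Database export and sets them against the automation share a platform claims in its Article 15 report. It is a minimal illustration only: the column names and values follow the public statement-of-reasons schema but should be verified against the current documentation, and the file name and ‘claimed’ figures are placeholders rather than real report data.

```python
# Illustrative sketch: compare automation rates observed in a DSA Transparency
# Database export with the automation shares a platform self-reports.
# Column names and enum values follow the public statement-of-reasons schema
# but should be checked against the current documentation; the file name and
# the claimed figures are placeholders.
import pandas as pd

sor = pd.read_csv(
    "sor-sample.csv",
    usecols=["platform_name", "automated_detection", "automated_decision"],
)

automation = (
    sor.assign(
        auto_detect=sor["automated_detection"].eq("Yes"),
        fully_auto_decision=sor["automated_decision"].eq("AUTOMATED_DECISION_FULLY"),
    )
    .groupby("platform_name")[["auto_detect", "fully_auto_decision"]]
    .mean()  # share of statements flagged as automated, per platform
)

# Hypothetical self-reported automation shares taken from Transparency Reports.
claimed = pd.Series({"Platform A": 0.90, "Platform B": 0.05}, name="claimed_automation")

comparison = automation.join(claimed, how="left")
comparison["gap"] = comparison["fully_auto_decision"] - comparison["claimed_automation"]
print(comparison.sort_values("gap", ascending=False))
```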
He points out that machine-readable reporting and centralised access are steps forward, but flexible categorisation creates ambiguity.
‘The Transparency Database gives platforms considerable flexibility in what they report and how they categorise their actions. This flexibility sometimes leads to opacity, unsupported reporting features, and ambiguity in how certain decisions should be encoded under the existing schema. Strengthening these aspects would significantly improve our ability to evaluate platforms’ claims and ensure that transparency translates into real oversight.’
Post-API lifeline
Access to platform data has become increasingly complex. Many social media companies have restricted, or even entirely revoked, the access that academics once relied on. After Elon Musk acquired Twitter/X, the platform’s API – the Application Programming Interface widely used for research – was put behind a paywall, while Meta recently shut down CrowdTangle.
‘This “post-API era” has made independent research on content moderation far more challenging,’ Cresci explains. ‘In this context, the Digital Services Act (DSA) provides an essential lifeline for researchers. Mechanisms such as the Transparency Reports, the Transparency Database, and the data-access provisions under Article 40 allow researchers to request and analyse moderation and other platform data in ways that were previously daunting or outright impossible. The DSA is a first, essential step toward a research environment where we can rigorously hold platforms accountable while continuing to push the frontiers of knowledge in this rapidly evolving field.’
From blanket bans to smart moderation
Cresci’s ERC Starting Grant project, DEDUCE, flips the script on one-size-fits-all moderation, which often ignores user differences in demographics, ideology, and personality – and can backfire badly for some users.
‘In my research, we use causal inference techniques to measure what actually happens after interventions, and the results reveal a striking pattern: user reactions are highly heterogeneous. These findings suggest that when we assess the success of moderation policies, we need to look beneath the surface averages: what seems like a mild overall improvement could in fact conceal serious adverse effects for a subset of users.’
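The intuition behind this point is easy to illustrate with a toy calculation. The snippet below uses entirely synthetic numbers – it is not Cresci’s data or method – to show how a modest average improvement can coexist with a sizeable backlash subgroup once the change in toxicity is broken down by user group.

```python
# Synthetic illustration: an average effect can mask backlash.
# Numbers are invented; a real analysis would rely on causal inference
# (e.g. matched control users) rather than raw before/after changes.
import numpy as np

rng = np.random.default_rng(0)

# Assume 90% of moderated users respond as intended (toxicity drops)
# while 10% react with backlash (toxicity rises).
delta_desired = rng.normal(loc=-0.15, scale=0.05, size=900)
delta_backlash = rng.normal(loc=0.40, scale=0.10, size=100)
deltas = np.concatenate([delta_desired, delta_backlash])

print(f"Average change in toxicity:  {deltas.mean():+.3f}")   # modest overall drop
print(f"Desired-response subgroup:   {delta_desired.mean():+.3f}")
print(f"Backlash subgroup:           {delta_backlash.mean():+.3f}")
print(f"Users who became more toxic: {(deltas > 0).mean():.1%}")
```

Read at face value, the average suggests the intervention worked; only the subgroup breakdown reveals the minority whose toxicity increased – exactly the adverse effect that surface averages can conceal.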
DEDUCE develops causal metrics to test the effectiveness and fairness of current practices, models how ‘dark personality’ traits drive toxicity (with a psychologist on board), and prototypes personalised interventions. The goal is proactive, data-driven moderation that platforms can simulate and audit independently – turning trial-and-error into science.

Toward real accountability
Cresci warns that top-down moderation has benefits but also risks. ‘Centralised policies are broad, inflexible, and can miss linguistic, cultural, or situational nuances, disproportionately affecting minority groups. They may also push platforms to over-moderate, and small mistakes can be amplified at scale. Centralised moderation places authority in the hands of a single entity, whose goals may not reflect those of users or society.’
Bottom-up approaches like community notes are not perfect either: ‘Ordinary users may lack the expertise for complex decisions, and crowdsourced systems can be manipulated. In content moderation, there is no one-size-fits-all solution. Effective approaches must be nuanced, adaptable, and transparent.’

Stefano Cresci is a researcher at the Institute of Informatics and Telematics of the National Research Council (IIT-CNR) in Pisa, Italy.