A world of data in a few hands
Drawing on the ERC-funded research project NEWORLDatA, Simone Turchetti explains why EU science diplomacy cannot ignore global data inequality.
The European Framework for Science Diplomacy aims to transform international research into a driving force for European integration and global stewardship. The 89-page report also acknowledges the need for data sharing and openness within this framework, given the explosion of big-data-driven research that typifies the global scientific enterprise today. But what if the very foundations of this system are unequal?
This is the question at the heart of NEWORLDatA, an ERC-funded project I lead, which explores how the diplomacy of research data has shaped global science over the past century. Our findings reveal a troubling pattern: while data are often celebrated as a neutral, universal resource, their governance has historically favoured a few – leaving many regions at a disadvantage. Targeted interventions on research data governance are needed if science diplomacy is to play the transformative role it aspires to have.
Research data are the foundation of modern science. While they vary in nature, classification and use across disciplines, data are both an enabling factor and a core element in any scientific study. Over the last century, they have also defined a distinct sub-domain of diplomacy, as bi- and multilateral talks and agreements on research data have informed their compilation and circulation globally. Yet, as NEWORLDatA shows, data governance has produced a skewed global framework characterised by an imbalanced distribution of data repositories and geographically uneven data flows.
Our team – researchers from Europe, Brazil, Kenya and China – first mapped the global distribution of data centres, finding that they are predominantly based in North America and Western Europe, with sparse presence elsewhere. Why does this matter? Where data are stored, managed and analysed determines who benefits from them.
Historical records explain the roots of this imbalance. From the 1950s, new data-gathering projects were advertised as world-reaching, yet participation remained uneven, with leading roles going to a few individuals from specific national groups. The World Data Centre system, a global data network set up during the 1957-58 International Geophysical Year, promised to make data on the earth and its features globally available, but its operations primarily served earth scientists in the US, Soviet Russia, Western Europe and Japan, where the relevant data hubs were sited. From 1966, the International Council of Scientific Unions’ Committee on Data for Science and Technology propelled data sharing in physics and chemistry. Yet the committee’s membership initially assigned responsibility for data compilation to national research groups from a few ‘scientifically developed’ countries.
The Cold War influenced these arrangements, stimulating both competition and collaboration in data-gathering exercises, though mainly among scientists from the Western and Soviet blocs. Security and export control provisions further lessened research data availability globally. Meanwhile, the decolonisation process heightened tensions between developed and developing countries on data sharing as competing views emerged on the goals of international research cooperation. As expected, newly independent nations called for greater investments in development-oriented research, while denouncing how global research unfairly extracted data from the Global South.
Since the 1990s, access to datasets has broadened as the global research enterprise has become more inclusive and new technologies – particularly the internet – have made datasets available on digital platforms. Yet while new ties with previously marginalised research groups brought research data flows to them, this coincided with even faster growth in connections between groups that already shared data. Moreover, during that decade, private companies began to acquire previously state-owned datasets, further limiting access to data that had been free. As a result, the distribution of research data within the global research network has not only remained uneven; the unevenness has been amplified.
The picture that NEWORLDatA paints suggests that it is vital to acknowledge these global data imbalances when advancing the EU’s science diplomacy framework. Overlooking them risks reproducing a familiar pattern: uncritically underscoring the benefits of international cooperation in science while downplaying its selective character in terms of membership and influence. By contrast, NEWORLDatA highlights the merits of tailored interventions to make the global scientific enterprise more inclusive, especially in data production and circulation, and recognises the need for new policies to address the historical legacy of imbalances in the global data infrastructure. Thus, while constructive, the recent EU framework’s pledge in favour of data openness and sharing seems only to scratch the surface of a deeper structural issue connected to global research data inequality.
So, what could be more effective? The EU’s framework already calls for strategic priorities to drive action.
To address global research data disparities, some of these interventions should focus on limits in global infrastructural capacity for distributing data flows, rather than assuming that the means to widen access to research data already exist globally, or that it is sufficient for compilers simply to share existing datasets.
For instance, the EU Data Act, which became applicable in September 2025, has increased awareness of the need to prevent a few private technology giants from owning data platforms, including cloud-based systems, and exercising monopolistic control over data sharing. As some current research involves big data exchanges in terabyte volumes, it is vital that new provisions truly enable global research data sharing while avoiding the growing reliance on privately owned infrastructures for data transfer.
Above all, NEWORLDatA's findings demonstrate that research data have shaped global governance mechanisms, demanding diplomatic interventions beyond simply ensuring openness and sharing. For instance, the EU should initiate a dialogue on how this governance aligns with global priorities, especially as United Nations experts have recently noted that progress on the Sustainable Development Goals – including targets on climate change, poverty and economic growth – is undermined by insufficient support for research in the Global South aligned with the 2030 agenda.
Furthermore, data compilation capabilities and strategies should be comprehensively reassessed, assigning new roles to groups and organisations whose research priorities have been overlooked. One might even dare say that, while research data have been in the hands of a few over the last century, there is now scope to focus on a small number of targeted science diplomacy interventions that could reshape global data governance.
The EU’s science diplomacy framework presents a unique opportunity to move beyond rhetoric on data openness and toward structural reform. The alternative – perpetuating a system where data remain concentrated in the hands of a few – risks undermining the very goals of global cooperation that science diplomacy seeks to advance.

Simone Turchetti is a historian and scholar of science diplomacy, based at the Centre for the History of Science, Technology and Medicine, University of Manchester.