Public Library of Science

PLOS Open Science Indicators

posted on 2023-04-03, 14:19 authored by Public Library of Science

This dataset contains article metadata and information about Open Science Indicators for approximately 71,000 research articles published in PLOS from 1 January 2019 to 31 December 2022 and a set of approximately 7,600 comparator articles published in non-PLOS journals. This is the second release of this dataset, which will be updated with new versions as newly published content is analysed.

This version of the Open Science Indicators dataset focuses on detection of three Open Science practices by analysing the XML of published research articles:

  • Sharing of research data, in particular data shared in data repositories
  • Sharing of code
  • Posting of preprints

The dataset provides data and code generation and sharing rates, as well as the location of shared data and code (whether in Supporting Information or in an online repository). It also provides preprint sharing rates and details of the shared preprint, such as publication date, URL and preprint server used. Additional data fields are provided for each article analysed, such as geographic information (‘Country’) and research topics (‘Discipline’).

Further information on the methods used to collect and analyse the data can be found in OSI-Methods-Statement_v2_Mar23.pdf with accompanying information in OSI-Column-Descriptions_v2_Mar23.pdf and OSI-Repository-List_v1_Dec22.xlsx. Further information on the principles and requirements for developing Open Science Indicators is available in

The data files PLOS-Dataset_v2_Mar23.csv and Comparator-Dataset_v2_Mar23.csv contain:

  • descriptive metadata (e.g. article title, publication date, author countries) taken from the article .xml files
  • additional information on the Open Science Indicators, derived algorithmically using Natural Language Processing.
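
As a rough illustration of how an indicator column in one of these CSV files might be summarised, the sketch below computes the share of rows marked as positive for a given column. The column name `Data_Shared`, the `Yes`/`No` coding, the placeholder DOIs and the helper `sharing_rate` are all hypothetical and are not taken from the dataset; the actual column names and value coding should be checked against OSI-Column-Descriptions_v2_Mar23.pdf.

```python
import csv
import io


def sharing_rate(csv_file, column):
    """Return the fraction of rows whose `column` holds a positive marker.

    `csv_file` is any open text file (or file-like object) with a header row.
    The accepted markers ("yes", "true", "1") are an assumption about the
    coding, not something documented for the real dataset.
    """
    reader = csv.DictReader(csv_file)
    total = 0
    shared = 0
    for row in reader:
        total += 1
        value = (row.get(column) or "").strip().lower()
        if value in {"yes", "true", "1"}:
            shared += 1
    return shared / total if total else 0.0


# Tiny in-memory stand-in for a data file; the rows are invented placeholders.
sample = io.StringIO(
    "DOI,Data_Shared\n"
    "10.0000/example.1,Yes\n"
    "10.0000/example.2,No\n"
    "10.0000/example.3,Yes\n"
    "10.0000/example.4,No\n"
)
rate = sharing_rate(sample, "Data_Shared")
print(f"Sharing rate: {rate:.2f}")
```

To apply the same function to a downloaded file, it could be opened with `open("PLOS-Dataset_v2_Mar23.csv", newline="", encoding="utf-8")` and passed in place of `sample`; the UTF-8 encoding is likewise an assumption.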

The OSI-Summary-statistics_v2_Mar23.xlsx file contains the summary data for both PLOS-Dataset_v2_Mar23.csv and Comparator-Dataset_v2_Mar23.csv used in

Contact details for further information:

Iain Hrynaszkiewicz, Director, Open Research Solutions, PLOS

Lauren Cadwallader, Open Research Manager, PLOS


Thanks to Allegra Pearce and Tim Vines of DataSeer for contributing to data acquisition and supporting information.


No external funding was received for this work.