“Research has integrity when it is carried out in a way that is trustworthy, ethical, and responsible”
UK Committee on Research Integrity
There is growing interest in ensuring the transparency and reproducibility of published scientific research in order to build trust. Although aspects of reproducibility and transparency (eg, data and code availability) have improved in the last few years, further improvements are still required to make research fully reproducible across disciplines. In particular, features that highlight the integrity of research should be made more prominent.1 In this blog we focus primarily on data availability as a marker of trust, to understand the practice of data sharing and to see how it is changing over time.
Digital Science’s Dimensions has recently integrated a research integrity dashboard to provide access to data for trust markers in research publications, which are hallmarks of research integrity and open science. These include statements regarding data availability, code availability, competing interests, conflict of interest, and ethics approval, all of which are the markers of trustworthiness and reproducibility.
We take a detailed look at trust markers in a particular research area, vaccine hesitancy, and evaluate the proportion of scientific research publications that report on one of the sources of trust markers. Vaccine hesitancy is defined as “a delay in acceptance, or refusal of vaccination despite availability of vaccination services”2, and is driven by a number of factors. It is a global phenomenon supported by anti-vaccination groups, fake news, and misinformation spread through social media.3
In 2019 the World Health Organization (WHO) identified vaccine hesitancy as a top global health threat.4 According to WHO, it threatens to reverse the historic global efforts to stop vaccine-preventable diseases. Vaccine hesitancy was chosen as a subject with which to explore issues concerning trust because of the nature of the research and its potential to include trust markers.5 Markers such as ethics approval, data availability, and data availability status (eg, supplementary files providing access to data) are likely to be required by a funder and/or journal to ensure the integrity of research, including its reproducibility and transparency.
Vaccine hesitancy is closely linked to the clinical sciences as a research area. However, this topic is relevant in a societal context, from a public health perspective and in understanding why there is hesitancy. We might also expect that developing effective health communications and campaigns to correct vaccine misinformation, for example, would link to the social sciences. In this context, we will also look for interdisciplinarity within the vaccine hesitancy research output and compare the coverage of data availability in the social sciences with the clinical sciences, while at the same time assessing any crossover, providing evidence of interdisciplinarity.
Research questions
1. Vaccine hesitancy and its representation in research publications based on research classifications:
- Research, Condition and Disease Categorisation (RCDC)
2. Do trust markers play a role in vaccine hesitancy research?
- Looking at categories of availability within one trust marker – data availability
3. Do patterns emerge amongst the data?
- Looking at interdisciplinarity with social sciences tagged research and clinical medicine tagged research
- Comparisons between trust markers in research publications included pre-Covid (2017-2019) and post-Covid (2020-2022).
Methodology
1. A ‘vaccine hesitancy’ search string was sourced and adapted from a recent paper on vaccine hesitancy and Covid-19.6 The search string is included below as an Appendix.
2. Relevant research publications were used to pull out data from Google BigQuery (GBQ) relating to:
- data availability
- data availability for the top five Research, Condition and Disease Categorisation (RCDC) areas.
3. The Dimensions Research Integrity dataset was used in conjunction with GBQ to access data relating to trust markers in research associated with vaccine hesitancy. These data feed into the Dimensions Research Integrity dashboard that is accessible in Dimensions.
4. Python programming was used to analyse the data.
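To illustrate how the wildcard terms in the search string (see Appendix) behave, here is a minimal Python sketch that converts a term into a regular expression and checks a title against it. It rests on assumptions: that `*` stands for zero or more word characters, that matching is case-insensitive, and that terms match on whole-word boundaries. The actual matching rules applied by Dimensions may differ, the boolean `AND` group in the search string is not handled, and the function names are hypothetical.

```python
import re

# A few terms copied from the search string in the Appendix
SAMPLE_TERMS = [
    "vaccin* hesitan*",
    "vaccin* refusal",
    "antivax",
    "intent* to vaccin*",
]


def term_to_regex(term):
    """Compile one search term into a regex.

    Assumption: * means "zero or more word characters"; words in a term
    may be separated by any run of whitespace in the text.
    """
    word_patterns = []
    for word in term.split():
        # Escape the literal pieces and rejoin them with \w* where * stood
        pieces = [re.escape(piece) for piece in word.split("*")]
        word_patterns.append(r"\w*".join(pieces))
    pattern = r"\s+".join(word_patterns)
    # Lookarounds stop the term matching inside a longer word
    return re.compile(r"(?<!\w)" + pattern + r"(?!\w)", re.IGNORECASE)


def matches_any(text, terms):
    """Return True if at least one term matches somewhere in the text."""
    return any(term_to_regex(term).search(text) for term in terms)


if __name__ == "__main__":
    for title in (
        "Drivers of vaccine hesitancy among parents",
        "Intention to vaccinate against influenza",
        "Influenza surveillance in 2019",
    ):
        print(title, "->", matches_any(title, SAMPLE_TERMS))
```

Under these assumptions the first two example titles match and the third does not. The example titles are invented for illustration and are not drawn from the dataset.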
Results
To get an initial sense of the data, we first analysed the vaccine hesitancy research publications from Dimensions to ascertain how the research is distributed across subject areas. We looked at the top five RCDC areas, which provide the bulk of research in this area. We then used these data to unpick the inclusion of data availability statements alongside research outputs.
Table 1 below demonstrates the acceleration in data availability statements in the last five years.
Year | Number of vaccine hesitancy research publications | Number including a data availability statement | Percentage including a data availability statement |
2018 | 41 | 4 | 9.8 % |
2019 | 59 | 6 | 10.2 % |
2020 | 107 | 15 | 14.0 % |
2021 | 269 | 75 | 27.9 % |
2022 | 302 | 99 | 32.8 % |
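The percentages in Table 1 follow directly from the two count columns. A minimal Python sketch of that calculation is given here, with the counts copied from the table. The function and variable names are illustrative and are not taken from the analysis code used for this blog.

```python
# Counts copied from Table 1:
# year -> (publications, publications with a data availability statement)
TABLE_1 = {
    2018: (41, 4),
    2019: (59, 6),
    2020: (107, 15),
    2021: (269, 75),
    2022: (302, 99),
}


def share_with_statement(total, with_statement):
    """Return the percentage of publications with a statement, to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * with_statement / total, 1)


def yearly_shares(table):
    """Map each year to its percentage, in chronological order."""
    return {year: share_with_statement(*table[year]) for year in sorted(table)}


if __name__ == "__main__":
    for year, pct in yearly_shares(TABLE_1).items():
        print(f"{year}: {pct} %")
```

Run as a script, this reproduces the final column of Table 1, rising from 9.8 % in 2018 to 32.8 % in 2022.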
To provide an example of the research integrity data available in the Dimensions Research Integrity dataset, we explored one trust marker, data availability statements, and extracted the data attached to each of the categories of data availability. Figure 3 below displays the percentage for each category over a seven-year time period. Although there is an overall increase in data files made available on request from authors (peacock blue), the same increase has not translated to the inclusion of data made available as a file attached to the research publication. Other categories of data availability (online repository, not publicly available, etc) are small in number and show no pattern.
Figure 4 highlights the transformation in the uptake of data availability statements in published research as categorised by the RCDC classification system available in Dimensions. We see an extremely small proportion of publications including a data availability statement in 2011 (the year Dimensions established its reporting on trust markers), increasing to 82% uptake in 2022. This rise in data availability is very marked and almost certainly related to the speed with which the research community responded to the Covid-19 pandemic. The arrival of Covid marked a shift in the use of data availability statements, with data either acknowledged in or physically attached to vaccine hesitancy research publications.
The word clouds below represent the concepts that occur most frequently in research publications associated with vaccine hesitancy. What is noticeable is that this research is associated with a number of vaccines pre-Covid but shifts to a predominance of Covid vaccine research during the post-Covid years.
It is also notable that, of 147 vaccine hesitancy research publications published pre-Covid (2017-2019), 12 (8.2%) include a data availability statement, whereas of 725 published post-Covid (2020-2022), 190 (26%) do. Although vaccine research pivoted to respond to the Covid pandemic, which likely accounts for the marked increase in data availability, there are still signs of vaccine research on infectious diseases more generally (see Figures 5 & 6).
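The pre- and post-Covid comparison can be expressed as a short calculation. The Python sketch below uses only the counts quoted in the paragraph above and reports the change both in percentage points and as a ratio. The function name and the choice of these two summary measures are illustrative assumptions, not part of the original analysis.

```python
# Counts quoted in the text: (publications, publications with a statement)
PRE_COVID = (147, 12)    # 2017-2019
POST_COVID = (725, 190)  # 2020-2022


def compare_periods(before, after):
    """Compare two (total, with_statement) pairs.

    Returns both percentages, the difference in percentage points,
    and the ratio of the later share to the earlier one.
    """
    (n_before, k_before), (n_after, k_after) = before, after
    pct_before = 100 * k_before / n_before
    pct_after = 100 * k_after / n_after
    return {
        "before_pct": round(pct_before, 1),
        "after_pct": round(pct_after, 1),
        "point_change": round(pct_after - pct_before, 1),
        "ratio": round(pct_after / pct_before, 1),
    }


if __name__ == "__main__":
    print(compare_periods(PRE_COVID, POST_COVID))
```

On these counts the share of publications with a data availability statement roughly triples between the two periods, a rise of about 18 percentage points.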
Identifying and understanding the social basis of vaccine hesitancy is important for matters such as future public health policy planning and developing and implementing methods to spread accurate information about the safety and effectiveness of vaccination. This would be important for reducing or eliminating vaccine hesitancy.
Figure 7 displays four distinct clusters showing the connections within and between each topic area. The four clusters fall into two broader groups: i) two clinical/health research clusters (HPV and, more recently, a Covid-related domain) and ii) two social research clusters (religious exemption and conscientious objection, connected by the concept of ‘law’). The topic network visualisation gives us a sense of the multi- and interdisciplinary nature of vaccine-related research.
Conclusions
The scientific research community is aware that the integrity and trustworthiness of its published research are of increasing importance, and research integrity practices are changing rapidly in response. Data transparency has played a key role in research conducted to develop a Covid vaccine. This blog demonstrates the considerable increase in the adoption of just one trust marker, data availability statements, as we move towards an era where open and trustworthy science is crucial. The more that data is made publicly available, the more transparency, accountability, and democratisation of the research process are enabled.
Dimensions Research Integrity
To learn more about Dimensions Research Integrity and to request a demo or a free quote, click here: https://www.dimensions.ai/request-a-demo-or-quote/
Footnotes
1. https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2006930
2. https://pubmed.ncbi.nlm.nih.gov/25896383/
3. https://doi.org/10.29333/ejgm/13186
4. World Health Organization. Ten Threats to Global Health in 2019; WHO: Geneva, Switzerland, 2019.
5. Trust markers are explicit statements on a research publication such as funding, data availability, conflict of interest, author contributions, and ethical approval and represent a contract between authors and readers that proper research practices have been observed. Trust markers highlight a level of transparency within a publication and reduce the reputational risks of allowing non-compliance to research integrity policies to go unobserved.
7. The Research, Condition, and Disease Categorization (RCDC) is a classification scheme used by the US National Institutes of Health (NIH) for reporting required by the US Congress. The implementation of this system used automated allocation of RCDC codes to documents in Dimensions based on category definitions defined by machine learning.
Appendix
Vaccine hesitancy search string:
"vaccin* hesitan*" OR "hesitan* to vaccine*" OR "vaccin* refusal" OR "refusal to vaccine*" OR "vaccin* opposition" OR "opposit* to vaccin*" OR "antivacc* group*" OR "antivax" OR antivaxx OR antivaccination OR "object* to vaccin*" OR "resilience to vaccin*" OR "debate against vaccin*" OR "vaccin* *compliance" OR "vaccine* *adherence" OR "resist* to vaccin*" OR "incomplete vaccin*" OR "misinformation about vaccine*" OR "vaccin* misinformation" OR "vaccin* criticism*" OR "delaying vaccin*" OR "anxiety from vaccin*" OR "criticism to vaccin*" OR "barrier* to vaccin*" OR "lack of intent to vaccin*" OR "poor completion of vaccin*" OR "compulsory vaccin*" OR "negative perception about vaccin*" OR "engagement in vaccin*" OR "choice to vaccin*" OR "awareness about vaccin*" OR "knowledge about vaccin*" OR "behavi* toward vaccin*" OR "poor vaccin* uptake" OR "vaccin* uptake rate" OR "doubts about vaccine*" OR "acceptance of vaccine*" OR "acceptability of vaccine*" OR "contravers* about vaccine*" OR "fear from vaccin*" OR "belief in vaccin*" OR "mandatory vaccin*" OR "compulsory vaccin*" OR "willingness to accept vaccin*" OR "willing to accept a vaccin*" OR "parental control of child* vaccin*" OR "willingness to vaccinate" OR "willingness to accept vaccin*" OR ("religious exemption" AND vaccin*) OR "vaccin* accept*" OR "vaccin* resist*" OR "vaccin* conspiracy" OR "vaccin* skepticism" OR "accept* of the vaccin*" OR "intent* to vaccin*" OR "intent* to get vaccin*" OR "attitude* toward* vaccin*"