Discovery of scientific fraud aimed at boosting research impact

Published by Adrien,
Source: The Conversation under Creative Commons license

By Lonni Besançon, Linköping University, and Guillaume Cabanac, Institute of Research in Computer Science of Toulouse

The image of a researcher working alone while ignoring the scientific community is a myth. Research is founded on constant exchange, first and foremost to understand the work of others and then to make one's own results known. Reading and writing articles published in journals or at scientific conferences are at the heart of researchers' activities.


When writing an article, it is essential to cite the work of peers, whether to describe a context, detail sources of inspiration, or explain differences in approaches and results. Being cited by other researchers for "good reasons" is a measure of the importance of one's results. But what happens when this citation system is manipulated? Our recent study reveals an insidious method for artificially inflating citation counts: "phantom references."

The mechanics of manipulation


The workings of scientific publishing, along with its potential pitfalls and their causes, are recurring topics in science communication. Here, we focus on a new type of abuse affecting citations between scientific articles, which are supposed to reflect the intellectual contribution and influence of a cited article on the citing one.

Citations of scientific work rely on a standardized referencing system: authors explicitly mention in the text of their article at least the title of the cited article, the names of its authors, the year of publication, the name of the journal or conference, and the page numbers...

These pieces of information appear in the bibliography of the article (a list of references) and are recorded in the form of auxiliary data (not visible in the article text) known as metadata, notably when assigning the DOI (digital object identifier), a unique identifier for each scientific publication.

References in a scientific publication allow authors to justify methodological choices or recall the results of past studies. The references listed in each scientific article are the visible manifestation of the iterative and collaborative nature of science. However, some unscrupulous actors have apparently added extra references, invisible in the text but present in the article's metadata when it is registered by publishers. The result? The citation counts of some researchers or journals skyrocket for no valid reason, since these references do not appear in the articles that supposedly cite them.

A new type of fraud and an opportunistic discovery


It all started with Guillaume Cabanac (co-author of this article), who published a post-publication review report on PubPeer, a site where scientists discuss and analyze publications. He noticed an inconsistency: an article in a journal published by Hindawi, probably fraudulent because it contained "tortured expressions," had received many more citations than downloads, which is very unusual. This post attracted the attention of several "scientific detectives," and a team was quickly formed with Lonni Besançon, Guillaume Cabanac, Cyril Labbé, and Alexander Magazinov.

We tried to locate, via a scientific search engine, the articles citing the original article, but Google Scholar returned no results while others (Crossref, Dimensions) did. It turns out that Google Scholar, on the one hand, and Crossref and Dimensions, on the other, do not retrieve citations in the same way: Google Scholar relies on the actual text of the scientific article, whereas Crossref and Dimensions rely on the article metadata supplied by publishers.

To understand the extent of the manipulation, we then examined three scientific journals that seemed to be massively citing the Hindawi article. Here is our approach in three steps:
- First, we listed the references explicitly present in the HTML or PDF versions of the articles;
- Then, we compared these lists with the metadata recorded by Crossref, an agency that assigns DOIs and their metadata. We discovered that some additional references had been added here but did not appear in the articles;
- Finally, we checked a third source, Dimensions, a bibliometric platform that uses Crossref metadata to calculate citations. Once again, we found inconsistencies.

The result? In these three journals, at least 9% of the recorded references were "phantom references." These additional references do not appear in the articles but only in the metadata, thereby distorting citation counts and unfairly giving some authors an advantage. Some references actually present in articles are also "lost" in the metadata.
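The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: it assumes each reference can be identified by its DOI, the function names `normalize_doi` and `compare_references` are hypothetical, and the DOIs in the example are made up.

```python
def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common prefixes so equal DOIs compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def compare_references(in_article, in_metadata):
    """Compare references visible in an article with those recorded in its metadata.

    in_article:  DOIs listed in the HTML or PDF version of the article.
    in_metadata: DOIs recorded for the same article in the registered metadata.
    """
    article = {normalize_doi(d) for d in in_article}
    metadata = {normalize_doi(d) for d in in_metadata}

    phantom = metadata - article   # in the metadata only, absent from the article
    lost = article - metadata      # in the article only, missing from the metadata
    # Share of recorded (metadata) references that are phantom references.
    phantom_share = len(phantom) / len(metadata) if metadata else 0.0

    return {
        "phantom": sorted(phantom),
        "lost": sorted(lost),
        "phantom_share": phantom_share,
    }


# Made-up DOIs, for illustration only.
article_refs = ["10.1234/a", "https://doi.org/10.1234/B", "10.1234/c"]
metadata_refs = ["10.1234/a", "10.1234/b", "10.1234/x", "10.1234/y"]

result = compare_references(article_refs, metadata_refs)
print(result)
```

In this toy example, two of the four recorded references exist only in the metadata, and one reference present in the article is missing from it. Matching on DOIs is a simplification: references without a DOI would need fuzzier matching on titles and authors.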

Implications and potential solutions


Why is this discovery important? Citation counts significantly influence research funding, academic promotions, and institutional rankings. They are used differently depending on the institutions and countries but always play a role in such decisions.

Manipulating citations can thus lead to injustices and to decisions based on false data. More worryingly, this discovery raises questions about the integrity of scientific impact measurement systems, questions that researchers have been raising for several years now.

Indeed, many researchers have already pointed out that these measures can be manipulated and, more importantly, that they foster unhealthy competition among researchers, who may then be tempted to cut corners to publish faster or to obtain more striking results that will be cited more.

A potentially more serious consequence of these productivity measures is the waste of effort and scientific resources caused by the competition they create.

To combat this practice, the "Invisible College," an informal collective of scientific detectives to which our team contributes, recommends several measures:
- Thorough verification of metadata by publishers and agencies like Crossref.
- Independent audits to ensure data reliability.
- Increased transparency in the management of references and citations.

This study highlights the importance of accurate, trustworthy metadata, because metadata too can be manipulated. It is also important to note that Crossref and Dimensions have confirmed the study's findings, and it seems that some corrections have been made by the publisher that manipulated the metadata entrusted to Crossref and, subsequently, passed on to bibliometric platforms like Dimensions.

Pending corrective measures, which can be very slow to arrive or never arrive at all, this discovery serves as a reminder of the need for constant vigilance in the academic world.