Preprints

What are preprints?

A preprint is a version of a document that precedes the formal peer review and publication stage in peer-reviewed journals. A preprint of a document may be available before, and remain available after, publication. As a result, preprints may contain content similar, or identical, to the documents they preceded, and could therefore influence the Similarity Report.

What to do with preprints?

iThenticate has identified a list of websites that contain preprint sources. As the administrator, you have the ability to decide how users of your account see these sources.

  1. The options for preprint sources are available from the Settings page, accessible from the sidebar.
  2. From the Preprints heading, you will have the option to do the following with preprint sources:

    Label preprint sources

    This will label any source in the Similarity Report that iThenticate has identified as a preprint. With this option selected, preprint sources will appear in the Sources panel of the Similarity Report. Your account users will be able to exclude preprint sources from the Similarity Report using the Similarity Report settings page or from within the report.

    Label and exclude preprint sources

    This will label any source in the Similarity Report that iThenticate has identified as a preprint. iThenticate will automatically exclude these preprint sources. They will appear in the Similarity exclusion area of the Similarity Report. Users will not be able to reinclude these sources in the Similarity Report.

    Don’t label preprint sources

    This option means if any preprint sources are found in the Similarity Report, they will be identified as regular sources. There will be no differentiation between preprint source matches and regular matches.

  3. Use the Save button to confirm any changes you’ve made.
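
The effect of the three options can be summarized in a small sketch. This is an illustration only, not iThenticate's implementation: the names PreprintSetting, PreprintHandling, and handling_for are hypothetical and were chosen to mirror the option descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class PreprintSetting(Enum):
    """Hypothetical names for the three options under the Preprints heading."""
    LABEL = "Label preprint sources"
    LABEL_AND_EXCLUDE = "Label and exclude preprint sources"
    DONT_LABEL = "Don't label preprint sources"


@dataclass(frozen=True)
class PreprintHandling:
    labeled: bool          # source is shown with a preprint label
    auto_excluded: bool    # source starts in the Similarity exclusion area
    exclusion_locked: bool # users cannot reinclude the source


def handling_for(setting: PreprintSetting) -> PreprintHandling:
    """Describe how an identified preprint source is treated under a setting."""
    if setting is PreprintSetting.LABEL:
        # Labeled and listed in the Sources panel; users may exclude it themselves.
        return PreprintHandling(labeled=True, auto_excluded=False, exclusion_locked=False)
    if setting is PreprintSetting.LABEL_AND_EXCLUDE:
        # Labeled, excluded automatically, and users cannot reinclude it.
        return PreprintHandling(labeled=True, auto_excluded=True, exclusion_locked=True)
    # DONT_LABEL: the source is treated like any regular source.
    return PreprintHandling(labeled=False, auto_excluded=False, exclusion_locked=False)
```

For example, handling_for(PreprintSetting.LABEL_AND_EXCLUDE) describes a source that is labeled, excluded from the start, and cannot be reincluded by users.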

How do we identify preprints?

When enabled, this feature will label, or label and exclude (depending on the setting), sources that have been identified as preprints.

iThenticate uses harvested metadata or a source's URL to differentiate a preprint from a published article.

iThenticate has identified repositories that host only preprint content. Sources from these repositories can therefore be identified as preprints with certainty, which ensures that published content is not erroneously labeled for exclusion.
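
As a rough illustration of URL-based identification, the sketch below checks a source's URL against a small list of preprint-only repositories. It is not iThenticate's actual matching logic, which is not described on this page: the PREPRINT_ONLY table and the is_preprint_source function are hypothetical, and the table holds only a handful of the repositories listed on this page.

```python
from urllib.parse import urlparse

# Hypothetical subset of the preprint-only repositories listed on this page.
# Each host maps to a required path prefix ("" means the whole site).
PREPRINT_ONLY = {
    "arxiv.org": "",
    "biorxiv.org": "",
    "medrxiv.org": "",
    "chemrxiv.org": "",
    "peerj.com": "/preprints",
    "easychair.org": "/publications/preprints",
}


def is_preprint_source(url: str) -> bool:
    """Return True if the URL falls under a known preprint-only repository."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.org and example.org alike
    prefix = PREPRINT_ONLY.get(host)
    if prefix is None:
        return False  # unknown host: treat as a regular source
    if prefix == "":
        return True  # the whole site hosts only preprints
    # Only the preprint section of a mixed site counts.
    return parsed.path == prefix or parsed.path.startswith(prefix + "/")
```

A site that hosts both preprints and published content is deliberately left out of the table in this sketch, because a URL match alone could not tell the two kinds of content apart.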

Included preprint repositories

Below is a list of the current repositories that iThenticate uses to correctly identify preprints:

Repository URL
APSA Preprints https://preprints.apsanet.org/
arXiv https://arxiv.org/
Beilstein Archives https://www.beilstein-archives.org/
bioRxiv https://www.biorxiv.org/
Cambridge Open Engage https://www.cambridge.org/engage
ChemRxiv https://chemrxiv.org/
ChinaXiv http://chinaxiv.org/
Cryptology ePrint Archive https://eprint.iacr.org/
EarthArXiv https://eartharxiv.org/
EasyChair https://easychair.org/publications/preprints
EcoEvoRxiv https://ecoevorxiv.org/
engrXiv https://engrxiv.org/
Jxiv https://jxiv.jst.go.jp/
Mathematical Physics Preprint Archive https://web.ma.utexas.edu/mp_arc/
medRxiv https://www.medrxiv.org/
Nature Precedings https://www.nature.com/npre/
PeerJ PrePrints https://peerj.com/preprints/
Research Square https://www.researchsquare.com/
RIN arxiv (formerly INArxiv) https://rinarxiv.lipi.go.id/
SciELO Preprints https://preprints.scielo.org/
TechRxiv https://www.techrxiv.org/
ViXra https://vixra.org/
WikiJournal preprints https://en.wikiversity.org/wiki/WikiJournal_Preprints/

Identified but not yet included preprint repositories

iThenticate is always working towards expanding its preprint coverage. Below is a list of preprint repositories that we are aware of and are actively working on adding:

Repository URL
AfricArXiv https://osf.io/preprints/africarxiv
Arabixiv https://arabixiv.org/
Authorea https://www.authorea.com/
BioHackrXiv https://biohackrxiv.org/
BodoArXiv https://osf.io/preprints/bodoarxiv
ECSarXiv https://ecsarxiv.org/
EdArXiv https://edarxiv.org/
ESS Open Archive https://essopenarchive.org/
FocUS Archive https://osf.io/preprints/focusarchive
FrenXiv https://osf.io/preprints/frenxiv
INA-Rxiv https://osf.io/preprints/inarxiv
LawArXiv https://osf.io/preprints/lawarxiv
LIS Scholarship Archive https://osf.io/preprints/lissa
MarXiv https://osf.io/preprints/marxiv
MediArXiv https://mediarxiv.org/
MetaArXiv https://osf.io/preprints/metaarxiv/
MindRxiv https://mindrxiv.org/
NutriXiv https://osf.io/preprints/nutrixiv
Optimization online http://www.optimization-online.org/
OSF Preprints https://osf.io/preprints/
PaleorXiv https://paleorxiv.org/
PsyArXiv https://psyarxiv.com/
SocArXiv https://osf.io/preprints/socarxiv/
SportRxiv https://osf.io/preprints/sportrxiv

Unincluded preprint repositories

iThenticate is also aware of sites that host both preprint content and published content, as well as sites that are not accessible to iThenticate. As a result, content from these sites cannot be correctly labeled. Below is a list of repositories that we are either unable to access or unable to label accurately:

Repository URL
Advance: a SAGE Preprints Community https://advance.sagepub.com/
AgriRXiv https://www.cabidigitallibrary.org/journal/agrirxiv
ARPHA Preprints https://preprints.arphahub.com/
Cell Sneak Peek https://www.ssrn.com/index.cfm/en/cell-press-sneak-peeks/
JMIR Preprints https://preprints.jmir.org/
NBER Working Papers https://www.nber.org/papers
Preprints with The Lancet on SSRN https://www.ssrn.com/index.cfm/en/the-lancet/
Preprints.org https://www.preprints.org/
Preprints.ru https://preprints.ru/
SSRN https://www.ssrn.com/
Therapoid https://therapoid.net/en/preprint/
Zenodo https://zenodo.org/

If you are aware of any preprint repositories that are missing from this page, or you are able to assist us with accessing repositories we have been unable to access so far, please contact content@turnitin.com.