BioRxiv - CC & Public Domain Catalog - 2020
BioRxiv

folder biorxiv_2020 (65814 files)
file01/104331.meca 6.98MB
file01/106666.meca 4.09MB
file01/107243.meca 33.92MB
file01/112102.meca 198.55MB
file01/125088.meca 52.11MB
file01/143875.meca 2.80MB
file01/146308.meca 6.27MB
file01/148650.meca 14.93MB
file01/148700.meca 20.63MB
file01/155267.meca 6.94MB
file01/160374.meca 5.23MB
file01/166306.meca 16.33MB
file01/170159.meca 34.72MB
file01/199778.meca 43.39MB
file01/221465.meca 10.13MB
file01/232926.meca 3.03MB
file01/270850.meca 5.96MB
file01/276592.meca 4.80MB
file01/280131.meca 1.26MB
file01/288696.meca 707.31kB
file01/290825.meca 14.31MB
file01/317610.meca 466.80kB
file01/330787.meca 13.05MB
file01/330928.meca 1.43MB
file01/332353.meca 10.01MB
file01/338350.meca 45.09MB
file01/338749.meca 5.10MB
file01/343392.meca 24.35MB
file01/344200.meca 1.13MB
file01/351098.meca 38.46MB
file01/352567.meca 14.08MB
file01/363119.meca 1.68MB
file01/367607.meca 622.52kB
file01/370080.meca 72.82MB
file01/372896.meca 12.65MB
file01/376939.meca 16.91MB
file01/387282.meca 6.83MB
file01/388447.meca 33.69MB
file01/396622.meca 25.29MB
file01/396663.meca 24.95MB
file01/400648.meca 8.79MB
file01/401620.meca 4.86MB
file01/406124.meca 8.05MB
file01/406439.meca 2.90MB
file01/407007.meca 1.52MB
file01/419010.meca 2.13MB
file01/423418.meca 46.52MB
file01/426569.meca 9.29MB
file01/427161.meca 3.55MB
Type: Dataset
Tags: preprints, bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprint

Bibtex:
@article{,
title= {BioRxiv - CC & Public Domain Catalog - 2020},
journal= {},
author= {BioRxiv},
year= {},
url= {https://sciop.net/datasets/biorxiv},
abstract= {Part of a set of torrents
- Index: https://sciop.net/datasets/biorxiv
- Back Catalogue: (in progress)
- 2018: https://academictorrents.com/details/1509e322f49fd946ab441aa7b092f53879971d87
- 2019: https://academictorrents.com/details/1956fb55a853aaf0558a20f75adfcb65154b7c6a
- 2020: (this torrent)
- 2021: https://academictorrents.com/details/c8ab36be273872466f6a391af4e42e6541c8e65a
- 2022: https://academictorrents.com/details/d33e08e51ece62509bc72042514a19a66d225bd6
- 2023: https://academictorrents.com/details/4a5dc447e3a3e0b338abaa26517689c5e804c13f
- 2024: (in progress)
- 2025 through 25-03-10: https://academictorrents.com/details/d70fda6123588f88478e36204b8be9a751f415da

---

Full archive of [MECA](https://www.niso.org/standards-committees/meca)-formatted dumps from BioRxiv's [full text S3 endpoint](https://www.biorxiv.org/tdm).

The archive is scraped on an annual basis; the initial upload in March 2025 is only partially complete.

## Format

These torrents are hybrid BitTorrent v1/v2 torrents - this facilitates mutation, indexing, and download of individual files. You should use a BitTorrent v2 capable client to download them (e.g. qBittorrent with libtorrent 2, listed as `qt6 lt20` on the download page).

Academictorrents currently does not understand v2 torrent files - **the total size of the torrent listed on academictorrents is thus incorrect.** Hybrid torrents contain [BEP 47](https://www.bittorrent.org/beps/bep_0047.html) padding files to align the v1 pieces so each covers at most one file. A torrent client that understands v2 will *not download these files* since they are just empty placeholders.
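
To see how much of a v1 file list is padding rather than data, the padding entries can be skipped when summing file lengths. The sketch below is illustrative only: it assumes the torrent's v1 `info` dictionary has already been bdecoded into plain Python dicts with string keys (many bencode libraries return byte-string keys instead, so adapt accordingly), and `payload_size` is a hypothetical helper name. BEP 47 marks padding files with a `p` in the optional `attr` string.

```python
def payload_size(info):
    """Return (payload_bytes, padding_bytes) for a multi-file v1 info dict."""
    payload = padding = 0
    for entry in info["files"]:
        attr = entry.get("attr", "")
        if isinstance(attr, bytes):
            # Some decoders leave attr as bytes; normalise before testing.
            attr = attr.decode("ascii", "replace")
        if "p" in attr:
            padding += entry["length"]  # BEP 47 padding file, carries no data
        else:
            payload += entry["length"]
    return payload, padding


# Made-up file list for illustration; lengths are not taken from this torrent.
example_info = {
    "files": [
        {"path": ["file01", "104331.meca"], "length": 1000},
        {"path": [".pad", "24"], "length": 24, "attr": "p"},
        {"path": ["file01", "106666.meca"], "length": 500},
    ]
}
print(payload_size(example_info))
```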

These torrents also had to modify the original source structure in order to fit within the 10MB torrent limit on academictorrents ([see issue](https://github.com/academictorrents/academictorrents-docs/issues/46)). The primary contributor to torrent size is the duplication of file names in hybrid torrents, so the original meca filenames have been replaced with the DOI suffix of the item in the meca.

If you seed this torrent, consider also snatching and seeding the v2-only torrent, which will be uploaded to sciop shortly and should be much more efficient.

Individual item metadata is contained within the JATS XML of the meca (a meca is just a zip file, so it can be read without decompressing the whole archive), but some summary metadata is included for indexing purposes:

- `doi_map.json`: maps the item DOI to the location within the torrent
- `license_map.json`: maps the license to the meca
- `license_counts.json`: summary statistics for each license kind
- `errors.json`: any errors that were encountered while creating the torrent.
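
Because a meca is a zip file, a single XML entry can be read with Python's standard `zipfile` module without unpacking the rest. The following is a minimal sketch, not the tooling used to build the torrent. The layout of `doi_map.json` is an assumption (a flat JSON object mapping a DOI string to a path), the helper names are hypothetical, and the member names in the demo are made up; list the entries of a real meca to find its JATS file.

```python
import io
import json
import zipfile


def list_xml_members(meca_path):
    """List XML entries in a meca by reading only the zip central directory."""
    with zipfile.ZipFile(meca_path) as archive:
        return [n for n in archive.namelist() if n.lower().endswith(".xml")]


def read_member(meca_path, member):
    """Decompress a single entry; the other entries are left untouched."""
    with zipfile.ZipFile(meca_path) as archive:
        return archive.read(member)


def locate(doi, doi_map_path):
    """Look up a DOI, ASSUMING doi_map.json is a flat object of DOI -> path."""
    with open(doi_map_path, encoding="utf-8") as handle:
        return json.load(handle)[doi]


# Demo on an in-memory stand-in archive (hypothetical member names).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("content/example.xml", "<article/>")
    archive.writestr("content/example.pdf", b"%PDF-")

members = list_xml_members(demo)
print(members)
print(read_member(demo, members[0]))
```

With a real download, `list_xml_members` and `read_member` take the path to a `.meca` file in place of the in-memory buffer.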

## Legality

BioRxiv's bulk access page (currently) reads:

> The TDM repository is not intended as a source for further redistribution of articles posted on bioRxiv, or their derivatives, nor does it grant others permission to re-host content posted on bioRxiv.  For most articles submitted to bioRxiv, authors retain copyright and reuse rights.  If you build indexing services or tools based on the full text of articles, you must therefore link back to the text hosted at bioRxiv rather than re-host content.  For reuse/redistribution of individual articles or their derivatives, please consult the licensing terms applied by the authors, which are provided in the metadata.  In most cases, this will require you to contact the copyright holder in advance to obtain permission.

It is true that *authors determine the copyright status of their work*, but it is not necessarily true that *"in most cases, this will require you to contact the copyright holder in advance to obtain permission."* The majority of work published by BioRxiv is licensed under some variant of a [Creative Commons](https://creativecommons.org/) license that expressly permits redistribution. We respect the authors' intent by redistributing all the CC and public domain works free of charge, with attribution, here. All works licensed under restrictive licenses that prohibit redistribution have been removed from the dataset and are not present in the torrent.

This work is listed on academictorrents as CC BY-NC-ND 4.0, the most restrictive of the licenses found in the dataset, but the license for each work is provided in a `license_map.json` within the torrent.
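
Anyone who needs only the works under particular licenses can filter on the license map before downloading individual files. The sketch below assumes the map is a JSON object keyed by license name, with a list of meca paths as each value; both that shape and the license keys shown are guesses, so check the real file first. `mecas_for_licenses` is a hypothetical helper name.

```python
def mecas_for_licenses(license_map, wanted):
    """Collect meca paths whose license key is in `wanted`.

    ASSUMES license_map looks like {"<license>": ["<meca path>", ...]}.
    """
    selected = []
    for license_name, mecas in license_map.items():
        if license_name in wanted:
            selected.extend(mecas)
    return sorted(selected)


# Made-up stand-in for the parsed license map; real keys and pairings may differ.
example_map = {
    "cc_by": ["file01/104331.meca"],
    "cc_by_nc_nd": ["file01/106666.meca"],
    "cc0": ["file01/107243.meca"],
}
print(mecas_for_licenses(example_map, {"cc_by", "cc0"}))
```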

See https://sciop.net/datasets/biorxiv for further details about the creation of these torrents},
keywords= {bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprints, preprint},
terms= {},
license= {CC BY-NC-ND 4.0},
superseded= {}
}

