BioRxiv - CC & Public Domain Catalog - 2021
BioRxiv

folder biorxiv_2021 (66002 files)
file01/010850.meca 15.37MB
file01/011791.meca 8.11MB
file01/012690.meca 11.05MB
file01/012757.meca 16.27MB
file01/017186.meca 20.00MB
file01/017210.meca 27.99MB
file01/017228.meca 20.76MB
file01/018671.meca 42.28MB
file01/022079.meca 16.27MB
file01/022327.meca 4.72MB
file01/022905.meca 1.49MB
file01/024414.meca 19.67MB
file01/026070.meca 7.06MB
file01/026989.meca 24.69MB
file01/027664.meca 8.75MB
file01/027896.meca 27.45MB
file01/029918.meca 5.57MB
file01/030189.meca 6.28MB
file01/030700.meca 13.69MB
file01/032466.meca 473.94MB
file01/034942.meca 8.45MB
file01/038893.meca 16.07MB
file01/040022.meca 2.51MB
file01/041137.meca 29.70MB
file01/041293.meca 19.79kB
file01/043323.meca 39.06MB
file01/043471.meca 3.95MB
file01/043612.meca 124.16MB
file01/046052.meca 5.58MB
file01/046565.meca 7.19MB
file01/050609.meca 9.68MB
file01/051631.meca 14.07MB
file01/055384.meca 18.73MB
file01/055756.meca 13.09MB
file01/062158.meca 38.03MB
file01/062802.meca 5.46MB
file01/063735.meca 1.67MB
file01/064568.meca 5.89MB
file01/066365.meca 16.76MB
file01/066555.meca 3.95MB
file01/069708.meca 11.98MB
file01/073163.meca 6.23MB
file01/073940.meca 9.82kB
file01/076927.meca 3.69MB
file01/077461.meca 5.97MB
file01/077602.meca 81.87MB
file01/078477.meca 34.46MB
file01/078733.meca 31.06MB
file01/080325.meca 7.57MB
Type: Dataset
Tags: preprints, bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprint

Bibtex:
@article{,
title= {BioRxiv - CC & Public Domain Catalog - 2021},
journal= {},
author= {BioRxiv},
year= {},
url= {https://sciop.net/datasets/biorxiv},
abstract= {Part of a set of torrents
- Index: https://sciop.net/datasets/biorxiv
- Back Catalogue: (in progress)
- 2018: https://academictorrents.com/details/1509e322f49fd946ab441aa7b092f53879971d87
- 2019: https://academictorrents.com/details/1956fb55a853aaf0558a20f75adfcb65154b7c6a
- 2020: https://academictorrents.com/details/b81c1be4f0b7ec622ec9cbde2551aaf1547dc33c
- 2021: (this torrent)
- 2022: https://academictorrents.com/details/d33e08e51ece62509bc72042514a19a66d225bd6
- 2023: https://academictorrents.com/details/4a5dc447e3a3e0b338abaa26517689c5e804c13f
- 2024: (in progress)
- 2025 through 25-03-10: https://academictorrents.com/details/d70fda6123588f88478e36204b8be9a751f415da

---

Full archive of [MECA](https://www.niso.org/standards-committees/meca)-formatted dumps from BioRxiv's [full text S3 endpoint](https://www.biorxiv.org/tdm).

## Format

These are hybrid BitTorrent v1/v2 torrents, which facilitates mutation, indexing, and download of individual files. You should use a BitTorrent v2 capable client to download them (e.g. qBittorrent built with libtorrent 2, listed as `qt6 lt20` on the download page).

Academictorrents currently does not understand v2 torrent files - **the total size of the torrent listed on academictorrents is thus incorrect.** Hybrid torrents contain [BEP 47](https://www.bittorrent.org/beps/bep_0047.html) padding files to align the v1 pieces so each covers at most one file. A torrent client that understands v2 will *not download these files* since they are just empty placeholders.
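
As a rough illustration of why a tool that counts padding files reports an inflated size, the padding implied by piece alignment can be computed from each file's length and the piece length. This is a minimal sketch: the 4 MiB piece length and the file sizes are made-up values, not taken from this torrent, and the function names are hypothetical.

```python
def pad_length(file_length: int, piece_length: int) -> int:
    """Bytes of padding needed after a file so the next file starts on a piece boundary."""
    return (-file_length) % piece_length


def padded_total(file_lengths: list[int], piece_length: int) -> int:
    """Total size as seen by a tool that counts padding files as real data.

    No padding is added after the final file, since nothing follows it.
    """
    total = 0
    for i, length in enumerate(file_lengths):
        total += length
        if i < len(file_lengths) - 1:
            total += pad_length(length, piece_length)
    return total


# Hypothetical values for illustration only.
PIECE = 4 * 1024 * 1024
sizes = [10_000_000, 20_000, 5_500_000]
print("real bytes:", sum(sizes))
print("with padding:", padded_total(sizes, PIECE))
```

Small files are the worst case here: a 20 kB file followed by another file pulls in almost a full piece of padding, which is why the discrepancy can be large for a torrent with many small mecas.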

The original source structure also had to be modified so that these torrents fit within the 10MB torrent limit on academictorrents ([see issue](https://github.com/academictorrents/academictorrents-docs/issues/46)). The primary contributor to torrent size is the duplication of file names in hybrid torrents, so the original meca filenames have been replaced with the DOI suffix of the item in the meca.
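
Since each meca is named after its DOI suffix, a path inside the torrent can in principle be turned back into a DOI. The sketch below assumes that bioRxiv DOIs use the `10.1101` prefix and that the filename without its `.meca` extension is the complete suffix; `doi_from_path` is a hypothetical helper, and `doi_map.json` remains the authoritative mapping.

```python
from pathlib import PurePosixPath

BIORXIV_DOI_PREFIX = "10.1101"  # assumption: the DOI prefix used by bioRxiv


def doi_from_path(torrent_path: str) -> str:
    """Rebuild a DOI from a torrent-relative path such as 'file01/010850.meca'."""
    name = PurePosixPath(torrent_path).name
    if not name.endswith(".meca"):
        raise ValueError(f"not a meca file: {torrent_path!r}")
    suffix = name[: -len(".meca")]  # strip only the extension, keep dots in the suffix
    return f"{BIORXIV_DOI_PREFIX}/{suffix}"


print(doi_from_path("file01/010850.meca"))
```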

If you seed this torrent, consider also snatching and seeding the v2-only torrent, which will be uploaded to sciop shortly and should be much more efficient.

Individual item metadata is contained within the JATS XML of the meca (a meca is just a zip file, so it can be read without decompressing the whole archive), but some summary metadata is included for indexing purposes:

- `doi_map.json`: maps the item DOI to the location within the torrent
- `license_map.json`: maps the license to the meca
- `license_counts.json`: summary statistics for each license kind
- `errors.json`: any errors that were encountered while creating the torrent.
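
Because a meca is a zip file, its XML can be read straight from the archive without extracting anything to disk. The following is a minimal sketch using only the Python standard library. It does not assume any particular member name inside the meca, so it lists the XML entries first; the helper names are hypothetical, and looking up `article-title` assumes the JATS document declares no default XML namespace.

```python
import xml.etree.ElementTree as ET
import zipfile
from typing import Optional


def list_xml_members(meca_path) -> list[str]:
    """Names of the XML entries in a meca, read from the zip index only."""
    with zipfile.ZipFile(meca_path) as zf:
        return [n for n in zf.namelist() if n.lower().endswith(".xml")]


def article_title(meca_path, member: str) -> Optional[str]:
    """Parse one XML member and return the text of its first <article-title>, if any."""
    with zipfile.ZipFile(meca_path) as zf:
        with zf.open(member) as fh:
            root = ET.parse(fh).getroot()
    node = root.find(".//article-title")
    if node is None:
        return None
    # itertext() also collects text inside inline markup such as <italic>.
    return "".join(node.itertext()).strip()
```

A typical use would be to call `list_xml_members("file01/010850.meca")`, pick the entry that holds the article, and pass it to `article_title`. Only the selected member is decompressed, which matters for the larger archives in this torrent.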

## Legality

BioRxiv's bulk access page (currently) reads:

> The TDM repository is not intended as a source for further redistribution of articles posted on bioRxiv, or their derivatives, nor does it grant others permission to re-host content posted on bioRxiv.  For most articles submitted to bioRxiv, authors retain copyright and reuse rights.  If you build indexing services or tools based on the full text of articles, you must therefore link back to the text hosted at bioRxiv rather than re-host content.  For reuse/redistribution of individual articles or their derivatives, please consult the licensing terms applied by the authors, which are provided in the metadata.  In most cases, this will require you to contact the copyright holder in advance to obtain permission.

It is true that *authors determine the copyright status of their work*, but it is not necessarily true that *"in most cases, this will require you to contact the copyright holder in advance to obtain permission."* The majority of work published by BioRxiv is licensed under some variant of a [Creative Commons](https://creativecommons.org/) license that expressly permits redistribution. We respect the authors' intent by redistributing all the CC and public domain works free of charge, with attribution, here. All works under restrictive licenses that prohibit redistribution have been removed from the dataset and are not present in the torrent.

This work is listed on academictorrents as CC BY-NC-ND 4.0, the most restrictive of the licenses found in the dataset, but the license for each work is provided in `license_map.json` within the torrent.

See https://sciop.net/datasets/biorxiv for further details about the creation of these torrents.},
keywords= {bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprints, preprint},
terms= {},
license= {CC BY-NC-ND 4.0},
superseded= {}
}

