BioRxiv - CC & Public Domain Catalog - 2023
BioRxiv

folder biorxiv_2023 (71756 files)
file01/014852.meca 4.01MB
file01/043737.meca 9.63MB
file01/050252.meca 29.75MB
file01/062158.meca 39.48MB
file01/104687.meca 6.20MB
file01/168211.meca 4.71MB
file01/194738.meca 14.25MB
file01/209510.meca 6.03MB
file01/219907.meca 16.65MB
file01/220467.meca 9.91MB
file01/237206.meca 16.91MB
file01/275362.meca 24.38MB
file01/289355.meca 12.09MB
file01/308197.meca 4.72MB
file01/313999.meca 13.60MB
file01/328799.meca 7.49MB
file01/334532.meca 82.73MB
file01/378257.meca 9.48MB
file01/417337.meca 7.19MB
file01/424231.meca 12.70MB
file01/424999.meca 35.73MB
file01/429268.meca 2.88MB
file01/429938.meca 24.35MB
file01/430561.meca 14.19MB
file01/431552.meca 6.26MB
file01/432055.meca 13.30MB
file01/432260.meca 35.07MB
file01/432310.meca 12.59MB
file01/435381.meca 16.89MB
file01/435846.meca 46.72MB
file01/436475.meca 26.96MB
file01/436634.meca 151.12MB
file01/439207.meca 12.59MB
file01/439527.meca 9.27MB
file01/440574.meca 5.88MB
file01/441269.meca 62.21MB
file01/441612.meca 6.28MB
file01/441631.meca 11.90MB
file01/441744.meca 3.73MB
file01/444717.meca 1.04MB
file01/445002.meca 27.80MB
file01/446004.meca 12.95MB
file01/446439.meca 19.51MB
file01/446785.meca 16.28MB
file01/448519.meca 9.22MB
file01/449937.meca 20.91MB
file01/450026.meca 3.92MB
file01/451295.meca 207.23MB
file01/451905.meca 5.86MB
Type: Dataset
Tags: bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprints, preprint

Bibtex:
@article{,
title= {BioRxiv - CC & Public Domain Catalog - 2023},
journal= {},
author= {BioRxiv},
year= {},
url= {https://sciop.net/datasets/biorxiv},
abstract= {Part of a set of torrents
- Index: https://sciop.net/datasets/biorxiv
- Back Catalogue: (in progress)
- 2018: https://academictorrents.com/details/1509e322f49fd946ab441aa7b092f53879971d87
- 2019: https://academictorrents.com/details/1956fb55a853aaf0558a20f75adfcb65154b7c6a
- 2020: https://academictorrents.com/details/b81c1be4f0b7ec622ec9cbde2551aaf1547dc33c
- 2021: https://academictorrents.com/details/c8ab36be273872466f6a391af4e42e6541c8e65a
- 2022: https://academictorrents.com/details/d33e08e51ece62509bc72042514a19a66d225bd6
- 2023: (this torrent)
- 2024: (in progress)
- 2025 (through 2025-03-10): https://academictorrents.com/details/d70fda6123588f88478e36204b8be9a751f415da

---

Full archive of [MECA](https://www.niso.org/standards-committees/meca)-formatted dumps from BioRxiv's [full text S3 endpoint](https://www.biorxiv.org/tdm).

## Format

These are hybrid BitTorrent v1/v2 torrents, a format that facilitates mutation, indexing, and download of individual files. You should use a BitTorrent v2-capable client to download them (e.g. qbittorrent with libtorrent 2, listed as `qt6 lt20` on the download page).

Academictorrents currently does not understand v2 torrent files - **the total size of the torrent listed on academictorrents is thus incorrect.** Hybrid torrents contain [BEP 47](https://www.bittorrent.org/beps/bep_0047.html) padding files to align the v1 pieces so each covers at most one file. A torrent client that understands v2 will *not download these files* since they are just empty placeholders.
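
As a rough illustration of the padding arithmetic, the sketch below computes how many padding bytes follow a file so that the next file starts on a piece boundary, and sums the real payload size by skipping padding entries. It assumes the torrent's v1 `files` list has already been bencode-decoded into plain dicts with string keys; the helper names and the toy sizes are hypothetical and are not taken from this torrent.

```python
def pad_length(file_length, piece_length):
    """Padding bytes needed after a file so the next file starts on a piece boundary."""
    return (-file_length) % piece_length


def is_pad_file(entry):
    # BEP 47 marks padding files with a "p" in the optional "attr" string.
    return "p" in entry.get("attr", "")


def payload_size(files):
    """Total size of the real files, ignoring BEP 47 padding entries."""
    return sum(entry["length"] for entry in files if not is_pad_file(entry))


# Toy example with a 4-byte piece length (real piece lengths are far larger).
example_files = [
    {"path": ["a.meca"], "length": 5},
    {"path": [".pad", "3"], "length": pad_length(5, 4), "attr": "p"},
    {"path": ["b.meca"], "length": 7},
]
print(payload_size(example_files))  # 12; the 3 padding bytes are not counted
```

A size computed by naively summing every entry, padding included, would overstate the payload, which is the kind of discrepancy described above.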

These torrents also modify the original source structure in order to fit within the 10MB limit on torrent file size on academictorrents ([see issue](https://github.com/academictorrents/academictorrents-docs/issues/46)). The primary contributor to torrent file size is the duplication of the file names in hybrid torrents, so the original meca filenames have been replaced with the DOI suffix for the item in the meca.

If you seed this torrent, consider also snatching and seeding the v2-only torrent, which will be uploaded to sciop shortly and should be much more efficient.

Individual item metadata is contained within the JATS XML of the meca (a meca is just a zip file, so it can be read without decompressing the whole archive), but some summary metadata is included for indexing purposes:

- `doi_map.json`: maps the item DOI to the location within the torrent
- `license_map.json`: maps the license to the meca
- `license_counts.json`: summary statistics for each license kind
- `errors.json`: any errors that were encountered while creating the torrent.
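
A minimal sketch of reading item metadata directly from a meca without unpacking it, using only the Python standard library. It relies on two assumptions that are not confirmed by this description: that `doi_map.json` is a flat JSON object mapping a DOI to a path inside the torrent, and that the JATS file can be recognised as the XML member whose root element is `<article>`. The function names are hypothetical.

```python
import json
import zipfile
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def load_doi_map(path):
    # Assumption: a flat JSON object of DOI -> location within the torrent.
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def read_jats_summary(meca):
    """Return the title and license links from the JATS XML inside one meca.

    `meca` may be a file path or a file-like object. Returns None if no
    JATS article is found.
    """
    with zipfile.ZipFile(meca) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".xml"):
                continue
            # Only this member is decompressed, not the whole archive.
            with zf.open(name) as fh:
                try:
                    root = ET.parse(fh).getroot()
                except ET.ParseError:
                    continue
            # JATS documents use <article> as the root element; other XML
            # members (such as a manifest) are skipped.
            if root.tag != "article":
                continue
            title = root.find(".//article-title")
            hrefs = [el.get(XLINK_HREF) for el in root.iter("license")]
            return {
                "member": name,
                "title": "".join(title.itertext()).strip() if title is not None else None,
                "licenses": [href for href in hrefs if href],
            }
    return None
```

With a loaded map, the value stored for a DOI can be joined to the torrent's download directory and passed to `read_jats_summary`. The `licenses` list may be empty if a given JATS file records its license in some other way.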

## Legality

BioRxiv's bulk access page (currently) reads:

> The TDM repository is not intended as a source for further redistribution of articles posted on bioRxiv, or their derivatives, nor does it grant others permission to re-host content posted on bioRxiv.  For most articles submitted to bioRxiv, authors retain copyright and reuse rights.  If you build indexing services or tools based on the full text of articles, you must therefore link back to the text hosted at bioRxiv rather than re-host content.  For reuse/redistribution of individual articles or their derivatives, please consult the licensing terms applied by the authors, which are provided in the metadata.  In most cases, this will require you to contact the copyright holder in advance to obtain permission.

It is true that *authors determine the copyright status of their work*, but it is not necessarily true that *"in most cases, this will require you to contact the copyright holder in advance to obtain permission."* The majority of work published by BioRxiv is licensed under some variant of a [Creative Commons](https://creativecommons.org/) license that expressly permits redistribution. We respect the authors' intent by redistributing all the CC and public domain works free of charge, with attribution, here. All works licensed under restrictive licenses that prohibit redistribution have been removed from the dataset and are not present in the torrent.

This work is listed on academictorrents as CC BY-NC-ND 4.0, the most restrictive of the licenses found in the dataset, but the license for each work is provided in `license_map.json` within the torrent.

See https://sciop.net/datasets/biorxiv for further details about the creation of these torrents},
keywords= {preprints, bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprint},
terms= {},
license= {CC BY-NC-ND 4.0},
superseded= {}
}

