Type: Dataset
Tags: preprints, bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprint
Bibtex:
@article{, title= {BioRxiv - CC & Public Domain Catalog - 2025 through 25-03-10}, journal= {}, author= {BioRxiv}, year= {}, url= {https://sciop.net/datasets/biorxiv}, abstract= {Part of a set of torrents

- Index: https://sciop.net/datasets/biorxiv
- Back Catalogue: (in progress)
- 2018: https://academictorrents.com/details/1509e322f49fd946ab441aa7b092f53879971d87
- 2019: https://academictorrents.com/details/1956fb55a853aaf0558a20f75adfcb65154b7c6a
- 2020: (in progress)
- 2021: (in progress)
- 2022: (in progress)
- 2023: (in progress)
- 2024: (in progress)
- 2025 through 25-03-10: (this torrent)

---

Full archive of [MECA](https://www.niso.org/standards-committees/meca)-formatted dumps from BioRxiv's [full text S3 endpoint](https://www.biorxiv.org/tdm). Scraped on an annual basis, with the initial upload in March 2025 partially complete.

## Format

These torrents are hybrid bittorrent v1/v2 torrents - this facilitates mutation, indexing, and download of individual files. You should use a bittorrent v2 capable client to download (e.g. qbittorrent with libtorrent 2, listed as `qt6 lt20` in the download page). Academictorrents currently does not understand v2 torrent files - **the total size of the torrent listed on academictorrents is thus incorrect.**

Hybrid torrents contain [BEP 47](https://www.bittorrent.org/beps/bep_0047.html) padding files to align the v1 pieces so each covers at most one file. A torrent client that understands v2 will *not download these files* since they are just empty placeholders.

## Legality

BioRxiv's bulk access page (currently) reads:

> The TDM repository is not intended as a source for further redistribution of articles posted on bioRxiv, or their derivatives, nor does it grant others permission to re-host content posted on bioRxiv. For most articles submitted to bioRxiv, authors retain copyright and reuse rights.
If you build indexing services or tools based on the full text of articles, you must therefore link back to the text hosted at bioRxiv rather than re-host content. For reuse/redistribution of individual articles or their derivatives, please consult the licensing terms applied by the authors, which are provided in the metadata. In most cases, this will require you to contact the copyright holder in advance to obtain permission.

It is true that *authors determine the copyright status of their work*, but it is not necessarily true that *"in most cases, this will require you to contact the copyright holder in advance to obtain permission."* The majority of work published by BioRxiv is licensed under some variant of a [Creative Commons](https://creativecommons.org/) license that expressly permits redistribution. We respect the authors' intent by redistributing all the CC and public domain works free of charge, with attribution, here. All works licensed under restrictive licenses that prohibit redistribution have been removed from the dataset and are not present in the torrent.

This work is listed on academictorrents as CC BY-NC-ND 4.0, the most restrictive of the licenses found in the dataset, but the license for each work is provided in a `licenses_map.json` within the torrent.

See https://sciop.net/datasets/biorxiv for further details about the creation of these torrents.
}, keywords= {bibliometrics, biorxiv, rxiv, scholarly-publishing, cc, creative-commons, scholcomm, preprints, preprint}, terms= {}, license= {CC BY-NC-ND 4.0}, superseded= {} }
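
The Format notes above describe hybrid v1/v2 torrents and BEP 47 padding files. As a rough illustration only, the sketch below shows one way to check a torrent's `info` dictionary for both properties in Python. It assumes the standard metadata keys (`pieces` for v1, `file tree` and `meta version` for v2, and an `attr` string containing `p` for padding entries). The helper names `bdecode`, `is_hybrid`, `split_padding` and the `SAMPLE` bytes are hypothetical, written for this example, and are not part of the dataset or of any library.

```python
def bdecode(data: bytes):
    """Decode a bencoded byte string (minimal sketch, no error recovery)."""
    def parse(i):
        c = data[i:i + 1]
        if c == b"i":  # integer: i<digits>e
            end = data.index(b"e", i)
            return int(data[i + 1:end]), end + 1
        if c == b"l":  # list: l<items>e
            i += 1
            items = []
            while data[i:i + 1] != b"e":
                item, i = parse(i)
                items.append(item)
            return items, i + 1
        if c == b"d":  # dictionary: d<key><value>...e
            i += 1
            result = {}
            while data[i:i + 1] != b"e":
                key, i = parse(i)
                value, i = parse(i)
                result[key] = value
            return result, i + 1
        # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length

    value, _ = parse(0)
    return value


def is_hybrid(info: dict) -> bool:
    """True if the info dict carries both v1 and v2 metadata."""
    has_v1 = b"pieces" in info
    has_v2 = b"file tree" in info and info.get(b"meta version") == 2
    return has_v1 and has_v2


def split_padding(info: dict):
    """Split the v1 file list into (real_files, pad_files) as (path, length)."""
    real, pads = [], []
    for entry in info.get(b"files", []):
        path = "/".join(p.decode("utf-8", "replace") for p in entry[b"path"])
        # BEP 47 marks padding files with a "p" in the attr string.
        is_pad = b"p" in entry.get(b"attr", b"")
        (pads if is_pad else real).append((path, entry[b"length"]))
    return real, pads


# Hand-built info dict for illustration: one real file and one padding file.
SAMPLE = (
    b"d5:filesl"
    b"d6:lengthi3e4:pathl5:a.xmlee"
    b"d4:attr1:p6:lengthi5e4:pathl4:.pad1:5eee"
    b"9:file treede12:meta versioni2e6:pieces0:e"
)
info = bdecode(SAMPLE)
real_files, pad_files = split_padding(info)
```

For a real `.torrent` file, the `info` dictionary sits under the top-level `info` key, so something like `bdecode(open(path, "rb").read())[b"info"]` would be the starting point, with `path` supplied by the reader. Summing the lengths of only the real files is one way to estimate the actual payload size when a listing counts padding or mishandles v2 metadata.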