OpenWebText-urls-26M-filtered.xz
eukaryote and jcpeterson

Size: 480.28MB
Type: Dataset
Tags: WebText, Reddit, gpt2

Bibtex:
@article{,
title= {OpenWebText-urls-26M-filtered.xz},
journal= {},
author= {eukaryote and jcpeterson},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {Every outbound Reddit link posted before 31 Dec 2018 with at least 3 karma. The list is filtered to remove image sites, non-scraper-friendly sites, and other media files.},
keywords= {WebText, Reddit, gpt2},
terms= {},
license= {},
superseded= {}
}
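
The abstract describes the archive as a filtered list of URLs. As a rough sketch of how such a list could be inspected, the Python below streams lines from an xz-compressed text file and counts URLs per host. It assumes the decompressed content is plain UTF-8 text with one URL per line, which this page does not confirm; if the archive is packaged differently (for example as a tar of several files), it would need to be unpacked first. The function names `iter_urls` and `top_domains` are hypothetical and are not part of the openwebtext repository.

```python
import lzma
from collections import Counter
from urllib.parse import urlsplit


def iter_urls(path):
    """Yield non-empty, stripped lines from an xz-compressed text file.

    Assumption: one URL per line, UTF-8 encoded. Undecodable bytes are
    replaced rather than raising, so a few bad lines do not stop the scan.
    """
    with lzma.open(path, mode="rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            url = line.strip()
            if url:
                yield url


def host_of(url):
    """Return the lowercased host part of a URL, or '' if it cannot be parsed."""
    try:
        return urlsplit(url).netloc.lower()
    except ValueError:
        # urlsplit rejects some malformed inputs (e.g. broken IPv6 brackets).
        return ""


def top_domains(urls, n=10):
    """Count URLs per host and return the n most common (host, count) pairs."""
    counts = Counter(host_of(u) for u in urls)
    return counts.most_common(n)
```

Because `iter_urls` is a generator, `top_domains(iter_urls(path))` never holds the full list in memory; only the per-host counts are kept.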

