WebText dataset urls.txt.tar.gz

Type: Dataset
Tags: WebText, Reddit

title= {WebText dataset urls.txt.tar.gz},
journal= {},
author= {},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {A collection of URLs hosting the content used in the WebText dataset, which OpenAI describes here: https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf

The URLs were obtained with the scripts by eukaryote31.},
keywords= {WebText, Reddit},
terms= {},
license= {},
superseded= {}
