OpenWebText-urls-26M-filtered.xz | 480.28MB |
Type: Dataset
Tags: WebText, Reddit, gpt2
Bibtex:
@article{,
  title= {OpenWebText-urls-26M-filtered.xz},
  journal= {},
  author= {eukaryote and jcpeterson},
  year= {},
  url= {https://github.com/eukaryote31/openwebtext},
  abstract= {Every outbound Reddit link from before 31 Dec 2018 with at least 3 karma. The list is filtered to remove image sites, non-scraper-friendly sites, and other media files.},
  keywords= {WebText, Reddit, gpt2},
  terms= {},
  license= {},
  superseded= {}
}
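The abstract describes the archive as a filtered list of URLs. A minimal sketch of how such a list might be streamed and summarized follows, assuming the decompressed file is UTF-8 text with one URL per line (the listing does not state the internal layout). The function names `iter_urls` and `count_hosts` are hypothetical helpers, not part of the openwebtext repository.

```python
import lzma
from collections import Counter
from urllib.parse import urlsplit


def iter_urls(path):
    """Yield non-empty lines from an xz-compressed text file, one at a time.

    Assumption: the archive holds one URL per line, encoded as UTF-8.
    """
    with lzma.open(path, mode="rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            url = line.strip()
            if url:
                yield url


def count_hosts(urls):
    """Count how many URLs point at each (lowercased) host name."""
    return Counter(urlsplit(u).netloc.lower() for u in urls)
```

Streaming line by line avoids holding the whole decompressed list in memory. For example, `count_hosts(iter_urls("OpenWebText-urls-26M-filtered.xz")).most_common(10)` would list the ten most frequent hosts, provided the one-URL-per-line assumption holds.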