The Pile: An 800GB Dataset of Diverse Text for Language Modeling
EleutherAI

