| Wikipedia Training Data for Megatron-LM | 2 | 2021-08-28 | 7.84GB | 95 | 0 | 0 |
| Abstract Meaning Representation AMR Annotation Release 3.0 LDC2017T10 | 1 | 2022-07-11 | 38.82MB | 179 | 8 | 1 |
| Subreddit comments/submissions 2005-06 to 2022-12 | 39963 | 2023-02-28 | 1.66TB | 377 | 10 | 5 |
| Wallstreetbets submissions/comments | 2 | 2021-09-14 | 4.38GB | 1,167 | 0 | 0 |
| US Stock Market End of Day dataset | 1 | 2016-12-24 | 250.71MB | 1,321 | 6 | 0 |
| Reddit comments/submissions 2023-02 | 2 | 2023-03-19 | 34.43GB | 1,856 | 28 | 1 |
| Reddit comments/submissions 2023-01 | 2 | 2023-03-19 | 46.98GB | 2,015 | 24 | 1 |
| Arizona State University Twitter Data Set | 1 | 2013-12-23 | 354.77MB | 14,858 | 12+ | 0 |