| Synthetic Data for Text Localisation in Natural Images | 15 | 2021-11-15 | 73.50GB | 3,745 | 9 | 4 |
| Reading Text in the Wild with Convolutional Neural Networks | 1 | 2021-11-12 | 10.68GB | 41,448 | 36 | 1 |
| PMC Open Access Subset | 16 | 2020-05-24 | 84.14GB | 246 | 4+ | 0 |
| r/WritingPrompts, Text (2018) | 1 | 2019-06-19 | 87.47MB | 401 | 3 | 0 |
| OpenWebText (Gokaslan's distribution, 2019), GPT-2 Tokenized | 395 | 2019-06-01 | 16.02GB | 211 | 8 | 0 |
| Flickr8k Dataset | 2 | 2019-03-09 | 1.12GB | 14,430 | 21+ | 0 |
| Common Crawl corpus - training-parallel-commoncrawl.tgz (CS-EN, DE-EN, ES-EN, FR-EN, RU-EN) | 1 | 2019-02-04 | 918.31MB | 114 | 2+ | 0 |
| UN corpus - training-parallel-un.tgz (ES-EN, FR-EN) | 1 | 2019-02-04 | 2.37GB | 61 | 2+ | 0 |
| Europarl v7 - training-parallel-europarl-v7.tgz (CS-EN, DE-EN, ES-EN, FR-EN) | 1 | 2019-02-04 | 657.63MB | 50 | 2+ | 0 |
| Phishing corpus | 4555 | 2019-01-02 | 37.48MB | 989 | 4+ | 0 |
| 30M Factoid Question-Answer Corpus (30MQA) | 2 | 2018-11-29 | 529.34MB | 4,144 | 9+ | 0 |
| Indiana University - Chest X-Rays (XML Reports) | 1 | 2018-11-22 | 1.11MB | 43,767 | 27+ | 0 |
| Yelp reviews - Polarity | 1 | 2018-10-16 | 166.37MB | 442 | 2+ | 0 |
| Yelp reviews - Full | 1 | 2018-10-16 | 196.15MB | 385 | 2+ | 0 |
| Sogou news | 1 | 2018-10-16 | 384.27MB | 263 | 2+ | 0 |
| DBPedia ontology | 1 | 2018-10-16 | 68.34MB | 131 | 2+ | 0 |
| Amazon reviews - Polarity | 1 | 2018-10-16 | 688.34MB | 1,102 | 2+ | 0 |
| Amazon reviews - Full | 1 | 2018-10-16 | 643.70MB | 1,114 | 4+ | 0 |
| AG News | 1 | 2018-10-16 | 11.78MB | 222 | 3+ | 0 |
| WMT 2015 French/English parallel texts | 1 | 2018-10-16 | 2.60GB | 2,099 | 5+ | 0 |
| Wikitext-2 | 1 | 2018-10-16 | 4.07MB | 249 | 2+ | 0 |
| Wikitext-103 | 1 | 2018-10-16 | 190.20MB | 738 | 3+ | 0 |
| IMDb Large Movie Review Dataset | 1 | 2018-10-16 | 26.40MB | 900 | 4+ | 0 |
| Microsoft Academic Graph - 2016/02/05 | 1 | 2016-12-25 | 28.94GB | 259 | 2+ | 0 |
| MovieLens 20M Dataset | 1 | 2016-12-16 | 198.70MB | 2,055 | 6+ | 0 |
| Sentiment Labelled Sentences Data Set | 1 | 2016-08-26 | 512.21kB | 517 | 5+ | 0 |
| Online News Popularity Data Set | 1 | 2016-02-11 | 7.48MB | 3,074 | 3+ | 0 |
| Structured Web Data Extraction Dataset (SWDE) | 1 | 2015-11-29 | 207.31MB | 2,778 | 5 | 0 |
| SMS Spam Collection Data Set | 2 | 2015-11-28 | 695.38kB | 815 | 2+ | 0 |
| Enwiki Word2vec model 1000 Dimensions | 1 | 2015-04-09 | 8.63GB | 3,480 | 7 | 0 |
| Yale YouTube Video Text | 1 | 2014-10-20 | 434.77MB | 8,102 | 5+ | 0 |
| Lerman Twitter 2010 Dataset | 3 | 2014-08-15 | 292.17MB | 3,443 | 11+ | 0 |