Downsampled Open Images V4 Dataset

downsampled-open-images-v4 (16 files)
256px/test-256.tar.gz.md5 0.05kB
512px/test-512.tar.gz.md5 0.05kB
256px/train-256.tar.gz.md5 0.05kB
512px/train-512.tar.gz.md5 0.05kB
256px/validation-256.tar.gz.md5 0.06kB
512px/validation-512.tar.gz.md5 0.06kB
256px/test_challenge_2018-256.tar.gz.md5 0.07kB
512px/test_challenge_2018-512.tar.gz.md5 0.07kB
256px/validation-256.tar.gz 407.93MB
256px/test_challenge_2018-256.tar.gz 997.54MB
256px/test-256.tar.gz 1.23GB
512px/validation-512.tar.gz 1.33GB
512px/test_challenge_2018-512.tar.gz 3.31GB
512px/test-512.tar.gz 4.01GB
256px/train-256.tar.gz 17.19GB
512px/train-512.tar.gz 56.75GB
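
Each archive in the listing ships with a `.md5` file, so a download can be checked before extraction. The following is a minimal Python sketch, not part of the original release. It assumes each `.md5` file follows the common `md5sum` layout, where the first whitespace-separated token is the hex digest. The helper names `md5_of_file` and `matches_md5_sidecar` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_sidecar(archive_path, sidecar_path=None):
    """Compare an archive's MD5 with the digest stored in its .md5 file.

    Assumption: the first whitespace-separated token of the .md5 file
    is the hex digest (the usual md5sum output layout).
    """
    archive_path = Path(archive_path)
    if sidecar_path is None:
        # e.g. validation-256.tar.gz -> validation-256.tar.gz.md5
        sidecar_path = archive_path.with_name(archive_path.name + ".md5")
    expected = Path(sidecar_path).read_text().split()[0].lower()
    return md5_of_file(archive_path) == expected
```

For example, `matches_md5_sidecar("256px/validation-256.tar.gz")` would return `True` only if the downloaded archive matches its published digest, under the layout assumption above.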
abstract= {This is the downsampled version of the Open Images V4 Dataset.

The Open Images V4 dataset contains 15.4M bounding boxes for 600 categories on 1.9M images and 30.1M human-verified image-level labels for 19,794 categories. The dataset is available at this link. The total size of the full dataset is 18TB. There is also a smaller version in which images are rescaled to at most 1024 pixels on the longest side. However, the rescaled dataset is still large (513GB for training, 12GB for validation and 36GB for testing).

I provide a much smaller version of the Open Images Dataset V4, inspired by the Downsampled ImageNet datasets @PatrykChrabaszcz. These downsampled datasets are small enough that everyone can download them with ease (59GB for training with the 512px version and 16GB for training with the 256px version). Experiments on these downsampled datasets are also much faster than on the original.

| Dataset  | Train Size | Validation Size | Test Size | Test Challenge Size |
|----------|------------|-----------------|-----------|---------------------|
| Original | 513 GB     | 12 GB           | 36 GB     | 9.7 GB              |
| 512px    | 52.8 GB    | 1.23 GB         | 3.72 GB   | 3.08 GB             |
| 256px    | 16 GB      | 0.4 GB          | 1.14 GB   | 0.95 GB             |},
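
The sizes in the table differ slightly from those in the file listing, for example 52.8 GB against 56.75GB for the 512px training archive. One possible reading is that the listing reports decimal gigabytes (10^9 bytes) while the table reports binary gigabytes (2^30 bytes). This is an assumption, not something the source states. The Python sketch below shows the conversion under that assumption; `gb_to_gib` is a hypothetical helper name.

```python
def gb_to_gib(size_gb):
    """Convert decimal gigabytes (10**9 bytes) to binary gigabytes (2**30 bytes)."""
    return size_gb * 10**9 / 2**30


# Training archive sizes taken from the file listing, in decimal GB.
listing_sizes_gb = {"512px": 56.75, "256px": 17.19}

for name, size_gb in listing_sizes_gb.items():
    # Under the unit assumption above, these should land close to the
    # table's training sizes of 52.8 and 16.
    print(f"{name}: {size_gb} GB is about {gb_to_gib(size_gb):.2f} GiB")
```

If the assumption holds, the converted values should fall within rounding distance of the table entries.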
