PASCAL Visual Object Classes Challenge 2008 (VOC2008) Complete Dataset
Everingham, M. and Van Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.

folder voc2008 (3 files)
VOCdevkit_14-Apr-2008.tar 254.98kB
VOCpatch_14-Jul-2008.tar 3.97MB
VOCtrainval_14-Jul-2008.tar 577.03MB
Type: Dataset

title= {PASCAL Visual Object Classes Challenge 2008 (VOC2008) Complete Dataset},
journal= {},
author= {Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.},
year= {2008},
url= {},
abstract= {Data

To download the training/validation data, see the development kit. In total there are 10,057 images.

The training data provided consists of a set of images; each image has an annotation file giving a bounding box and object class label for each object in one of the twenty classes present in the image. Note that multiple objects from multiple classes may be present in the same image. Some example images can be viewed online.
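
As a rough illustration of the annotation structure described above, the following Python sketch reads one annotation and lists the class label and bounding box of each object. It assumes the annotations are XML files in the usual VOC layout (`object`, `name`, `difficult`, `bndbox` with `xmin`/`ymin`/`xmax`/`ymax`). The sample document, its values, and the function name `parse_annotation` are made up for illustration; the development kit documentation remains the reference for the real format.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, assumed to follow the VOC-style XML layout.
SAMPLE_XML = """
<annotation>
  <filename>2008_000001.jpg</filename>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""

def parse_annotation(xml_text):
    """Return one dict per annotated object: class label, difficult flag, bounding box."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        objects.append({
            "label": obj.findtext("name"),
            # Treat a missing <difficult> tag as "not difficult".
            "difficult": obj.findtext("difficult", default="0") == "1",
            "bbox": tuple(int(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax")),
        })
    return objects

objects = parse_annotation(SAMPLE_XML)
for o in objects:
    print(o["label"], o["bbox"], "difficult" if o["difficult"] else "")
```

Because several objects from several classes may appear in one image, the function returns a list rather than a single label.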

Annotation was performed according to a set of guidelines distributed to all annotators.

The data will be made available in two stages; in the first stage, a development kit will be released consisting of training and validation data, plus evaluation software (written in MATLAB). One purpose of the validation set is to demonstrate how the evaluation software works ahead of the competition submission.

In the second stage, the test set will be made available for the actual competition. As in the VOC2007 challenge, no ground truth for the test data will be released until after the challenge is complete.

The data has been split into 50% for training/validation and 50% for testing. The distributions of images and objects by class are approximately equal across the training/validation and test sets. In total there are 10,057 images. Further statistics are online - statistics for the test data will be released after the challenge.

Development Kit

The development kit consists of the training/validation data, MATLAB code for reading the annotation data, support files, and example implementations for each competition.

Download the training/validation data (550MB tar file) - includes patch of 14-Jul-2008
Download the development kit code and documentation (250KB tar file)

Patch 14-Jul-08

There were errors in the 14-Apr-2008 release of the training/validation data as follows:

image labels in x_train/x_trainval.txt (classification task) did not include the "don't care" (zero) label
the test set for the main challenge (classification/detection) included images used for the layout challenge - these will be ignored in the evaluation
some images contained only "difficult" objects - these will be ignored in the evaluation (classification/detection)
The errors will not affect evaluation. Participants who want to take advantage of the "don't care" label without computing it themselves should download the patch, which contains updated image lists and can be untarred over the original development kit.
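
To make the role of the "don't care" label concrete, here is a minimal Python sketch that reads a per-class image list and separates positive, negative and "don't care" images. It assumes each line holds an image identifier followed by a label of 1, -1 or 0. The sample lines and the name `read_image_labels` are hypothetical, and the exact file layout should be checked against the development kit documentation.

```python
def read_image_labels(lines):
    """Map image identifier -> label (1 positive, -1 negative, 0 "don't care")."""
    labels = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        image_id, label = parts[0], int(parts[1])
        if label not in (-1, 0, 1):
            raise ValueError(f"unexpected label {label} for {image_id}")
        labels[image_id] = label
    return labels

# Hypothetical contents of a per-class list such as x_trainval.txt.
sample = ["2008_000001 -1", "2008_000002  1", "2008_000003  0", ""]
labels = read_image_labels(sample)

# Split the identifiers by label; images labelled 0 end up in neither class.
positives = [i for i, l in labels.items() if l == 1]
negatives = [i for i, l in labels.items() if l == -1]
dont_care = [i for i, l in labels.items() if l == 0]
print(len(positives), len(negatives), len(dont_care))
```

With the unpatched lists the third group would simply be empty, which is why the patch is optional for evaluation purposes.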

Running on VOC2007 test data

If at all possible, participants are requested to submit results for both the VOC2008 and VOC2007 test sets provided in the test data, to allow comparison of results across the years. In both cases, the VOC2008 training/validation data should be used for training, i.e.:

Train on VOC2008 train+val, test on VOC2008 test.
Train on VOC2008 train+val, test on VOC2007 test.
The updated development kit provides a switch to select between test sets. Results are placed in two directories, results/VOC2007/ or results/VOC2008/ according to the test set.
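
The directory convention above could be mirrored in a script roughly as follows. This is only a sketch: the function name `results_dir` and its `testset` argument are hypothetical and are not the development kit's own switch, which is part of its MATLAB code.

```python
from pathlib import PurePosixPath

# The two test sets named in the text; training always uses VOC2008 train+val.
VALID_TEST_SETS = ("VOC2007", "VOC2008")

def results_dir(testset, root="results"):
    """Return the results directory for the chosen test set."""
    if testset not in VALID_TEST_SETS:
        raise ValueError(f"unknown test set: {testset}")
    return PurePosixPath(root) / testset

for testset in VALID_TEST_SETS:
    print(f"train on VOC2008 train+val, test on {testset} test -> {results_dir(testset)}/")
```

Keeping the two result sets in separate directories means a single trained model can be evaluated on both test sets without one run overwriting the other.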

Publication Policy

The main mechanism for dissemination of the results will be the challenge webpage.

For VOC2008, the detailed output of each submitted method will be published online, e.g. per-image confidence for the classification task and bounding boxes for the detection task. The intention is to assist others in the community in carrying out detailed analysis and comparison with their own methods. The published results will not be anonymous: by submitting results, participants agree to have their results shared online.


We gratefully acknowledge the following, who spent many long hours providing annotation for the VOC2008 database: Jan-Hendrik Becker, Patrick Buehler, Kian Ming Chai, Miha Drenik, Chris Engels, Jan Van Gemert, Hedi Harzallah, Nicolas Heess, Zdenek Kalal, Lubor Ladicky, Marcin Marszalek, Alastair Moore, Maria-Elena Nilsback, Paul Sturgess, David Tingdahl, Hirofumi Uemura, Martin Vogt.


The preparation and running of this challenge is supported by the EU-funded PASCAL Network of Excellence on Pattern Analysis, Statistical Modelling and Computational Learning.},
keywords= {},
terms= {The VOC2008 data includes images obtained from the "flickr" website. Use of these images must respect the corresponding terms of use:

"flickr" terms of use
For the purposes of the challenge, the identity of the images in the database, e.g. source and name of owner, has been obscured. Details of the contributor of each image can be found in the annotation to be included in the final release of the data, after completion of the challenge. Any queries about the use or ownership of the data should be addressed to the organizers.}
