Great Zebra and Giraffe Count ID Dataset
Wild Me

Type: Dataset
Tags: zebra, wildlife, coco, identification, giraffe

title= {Great Zebra and Giraffe Count ID Dataset},
journal= {},
author= {Wild Me},
year= {2020},
url= {},
abstract= {Our dataset for plains zebra (Equus quagga) is taken from a two-day census of the Nairobi National Park, located just south of the capital’s airport in Nairobi, Kenya.  The “Great Zebra and Giraffe Count” (GZGC) photographic census was held on February 28th and March 1st, 2015. It involved 27 teams of citizen scientists and 55 photographers in total, and collected 9,406 images of plains zebra and Masai giraffe (Giraffa tippelskirchi) (Parham et al. 2017).  Only images containing either zebras or giraffes were included in the exported dataset, a total of 4,948 images, with the biographical information of the original contributors removed.  Images are labeled with bounding boxes only around the individual animals for which there is ID metadata. Some animals therefore have no box, and the dataset is not intended to be used for object detection training or testing.  Viewpoints for all animal annotations were also added.  All ID assignments were completed using the HotSpotter algorithm (Crall et al. 2013) by visually matching the stripes and spots as seen on the body of the animal.  A total of 2,056 combined names are released for 6,286 individual zebra and 639 giraffe sightings.  This dataset presents a challenging comparison to the whale shark dataset because it contains a significantly higher number of animals that are seen only once during the survey.

The dataset is released in the Microsoft COCO format and therefore uses flat image folders with associated JSON metadata files. We have collapsed the entire dataset into a single "train" label and have left "val" and "test" empty; we do this as an invitation to researchers to experiment with their own novel approaches for dealing with the unbalanced and chaotic distribution of the number of sightings per individual.  All of the images in the dataset have been resized to have a maximum linear dimension of 3,000 pixels.  The metadata for each animal sighting is defined by an axis-aligned bounding box and includes information on the rotation of the box (theta), the viewpoint of the animal, a species (category) ID, a source image ID, an individual string ID name, and other miscellaneous values.  The temporal ordering of the images, and an anonymized ID for the original photographer, can be determined from the metadata for each image.
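
As a rough illustration of working with the sightings-per-individual distribution described above, the following Python sketch counts annotations per individual in a COCO-style file and reports how many individuals are seen only once. The "annotations" list is part of the standard COCO layout, but the per-annotation key for the individual's string ID (assumed here to be "name") and all function names are assumptions for illustration; check them against the released metadata before relying on this.

```python
import json
from collections import Counter


def load_annotations(path):
    """Load a COCO-style JSON file and return its list of annotations."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["annotations"]


def sightings_per_individual(annotations, name_key="name"):
    """Count annotations (sightings) per individual.

    name_key is an ASSUMED key for the individual string ID; annotations
    with a missing or empty value under that key are skipped.
    """
    return Counter(a[name_key] for a in annotations if a.get(name_key))


def singleton_fraction(counts):
    """Return the fraction of individuals that were sighted exactly once."""
    if not counts:
        return 0.0
    singles = sum(1 for n in counts.values() if n == 1)
    return singles / len(counts)
```

Calling load_annotations on the metadata file for the "train" label, then passing the result through sightings_per_individual and singleton_fraction, gives a quick measure of how unbalanced the distribution is before choosing a custom train/validation split.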

For research or press contact, please direct all correspondence to Wild Me.  Wild Me is a registered 501(c)(3) not-for-profit based in Portland, Oregon, USA and brings state-of-the-art computer vision tools to ecology researchers working around the globe on wildlife conservation.

Direct download mirror:},
keywords= {zebra, wildlife, coco, identification, giraffe},
terms= {Scientific research that uses this dataset must provide attribution under the CDLA-Permissive License (version 1.0) and must also cite the original research publication:

  title={Animal population censusing at scale with citizen science and photographic identification},
  author={Parham, Jason and Crall, Jonathan and Stewart, Charles and Berger-Wolf, Tanya and Rubenstein, Daniel I},
  booktitle={AAAI Spring Symposium-Technical Report},
license= {Community Data License Agreement – Permissive – Version 1.0},
superseded= {}
