training_data.zip	10.93GB

Type: Dataset
Tags:

Bibtex:

@article{,
title= {DENTEX_CHALLENGE},
keywords= {},
author= {},
abstract= {https://i.imgur.com/sAXITsB.png

The DENTEX dataset comprises panoramic dental X-rays obtained from three different institutions using standard clinical conditions but varying equipment and imaging protocols, resulting in diverse image quality reflecting heterogeneous clinical practice. The dataset includes X-rays from patients aged 12 and above, randomly selected from the hospital's database to ensure patient privacy and confidentiality.

To enable effective use of the FDI system, the dataset is hierarchically organized into three types of data:

(a) 693 X-rays labeled for quadrant detection and quadrant classes only,

(b) 634 X-rays labeled for tooth detection with quadrant and tooth enumeration classes,

(c) 1005 X-rays fully labeled for abnormal tooth detection with quadrant, tooth enumeration, and diagnosis classes.

The diagnosis class includes four specific categories: caries, deep caries, periapical lesions, and impacted teeth. An additional 1571 unlabeled X-rays are provided for pre-training.

## Data Split for Evaluation and Training

The DENTEX 2023 dataset comprises three types of data: (a) partially annotated quadrant data, (b) partially annotated quadrant-enumeration data, and (c) fully annotated quadrant-enumeration-diagnosis data. The first two types are intended for training and development purposes, while the third type is used for both training and evaluation.

To comply with standard machine learning practices, the fully annotated third dataset, consisting of 1005 panoramic X-rays, is partitioned into training, validation, and testing subsets, comprising 705, 50, and 250 images, respectively. Ground truth labels are provided only for the training data, while the validation data is provided without associated ground truth, and the testing data is kept hidden from participants.
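
As a quick sanity check on the split sizes stated above, the following minimal sketch sums the three subsets and reports each one's share of the fully annotated data. The dictionary name `SPLITS` and the variable names are illustrative assumptions, not part of any official DENTEX tooling.

```python
# Split sizes as stated in the text (fully annotated third dataset).
# SPLITS is a hypothetical name used only for this illustration.
SPLITS = {"training": 705, "validation": 50, "testing": 250}

total = sum(SPLITS.values())

# Share of each subset relative to the whole fully annotated dataset.
fractions = {name: count / total for name, count in SPLITS.items()}

print(total)  # 1005
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

The shares work out to roughly 70.1%, 5.0%, and 24.9%, so the bulk of the fully annotated images is available for training.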

Participants are allowed to use additional public data for augmenting the provided DENTEX dataset or for pre-training models on such datasets to enhance performance. However, they must ensure that all the data they use is publicly available. Additionally, they must document the use of external data clearly in their final short paper submission, providing details on the dataset and its source.

## Annotation Protocol

DENTEX provides three hierarchically annotated datasets that facilitate various dental detection tasks: (1) quadrant-only for quadrant detection, (2) quadrant-enumeration for tooth detection, and (3) quadrant-enumeration-diagnosis for abnormal tooth detection. Although it may seem redundant to provide a quadrant detection dataset, it is crucial for utilizing the FDI Numbering System. The FDI system is a globally used system that assigns each quadrant of the mouth a number from 1 through 4. The top right is 1, the top left is 2, the bottom left is 3, and the bottom right is 4. Within each quadrant, the eight teeth are numbered 1 through 8, starting at the front middle tooth and rising toward the back. For example, the back tooth on the lower right side is 48 in FDI notation, which means quadrant 4, number 8. Therefore, the quadrant detection dataset can significantly simplify the dental enumeration task, even though evaluations will be made only on the fully annotated third dataset.
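
The two-digit FDI scheme described above can be sketched in a few lines of Python. The helper names `fdi_number` and `parse_fdi` are hypothetical and do not come from the DENTEX release, and the sketch assumes only the permanent-tooth quadrants 1 through 4 covered in the text.

```python
# Hypothetical helpers; the quadrant layout follows the description above.
QUADRANTS = {1: "top right", 2: "top left", 3: "bottom left", 4: "bottom right"}


def fdi_number(quadrant, position):
    """Combine a quadrant (1-4) and a tooth position (1-8) into an FDI number."""
    if quadrant not in QUADRANTS:
        raise ValueError(f"quadrant must be 1-4, got {quadrant}")
    if not 1 <= position <= 8:
        raise ValueError(f"position must be 1-8, got {position}")
    return quadrant * 10 + position


def parse_fdi(number):
    """Split an FDI number into (quadrant, position), validating both parts."""
    quadrant, position = divmod(number, 10)
    fdi_number(quadrant, position)  # raises ValueError if either part is out of range
    return quadrant, position


print(fdi_number(4, 8))  # 48
print(parse_fdi(38))     # (3, 8)
```

Because the quadrant is the tens digit, a model that already predicts the quadrant only needs to resolve the position 1 through 8 within it, which is one way to read the claim that quadrant labels simplify enumeration.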

All annotations in the DENTEX dataset are produced by a team of dental experts. Specifically, each image is annotated by a final-year dental student, and the annotations are then verified and corrected by one of three expert dentists with over 15 years of experience. This two-stage review is intended to keep the annotations accurate, which makes DENTEX a valuable resource for dental research.},
terms= {},
license= {},
superseded= {},
url= {https://dentex.grand-challenge.org/}
}