BRATS2013 Tumor-NoTumor Dataset (T-NT)

Type: Dataset
Tags: TNT

title= {BRATS2013 Tumor-NoTumor Dataset (T-NT)},
keywords= {TNT},
author= {},
abstract= {This dataset (called T-NT) contains brain images with and without a tumor, each paired with a segmentation of the brain matter and the tumor. It is intended for simulating bias in data in a controlled fashion.

# Dataset Construction 

This dataset is constructed from the synthetic data of the BRATS2013 dataset. Each brain contains a tumor, but it is typically on only one side. Only the right side of each image is kept, which yields examples that do not have tumors.
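
The right-side crop can be illustrated with a minimal sketch. It assumes that "right side" means the second half of the column axis of a 2D array; the actual orientation used to build the dataset is not stated here, and `take_right_half` is a hypothetical helper name.

```python
import numpy as np

def take_right_half(slice_2d):
    """Return the columns from the midpoint onward (assumed to be the right side)."""
    width = slice_2d.shape[1]
    return slice_2d[:, width // 2:]

# Hypothetical 4x6 label slice with a non-zero region only in the left half.
demo = np.zeros((4, 6), dtype=np.uint8)
demo[1:3, 0:2] = 4

right = take_right_half(demo)
print(right.shape)  # shape of the cropped half
```

In this toy case the cropped half contains no non-zero label, which is how a slice from a brain with a tumor can become a NoTumor example.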

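Each image is filtered to ensure it contains enough brain (more than 30% of the pixels). An image is considered to have a tumor if the tumor takes up more than 1% of the brain pixels. Note that the snippet below computes the brain measure as the ratio of non-zero pixels to zero pixels and requires it to exceed 0.30.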

Here is a snippet from the code used to construct the dataset:

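def get_labels(rightside):
    """Compute brain/tumor ratios and threshold flags for one right-side array."""
    met = {}
    # Ratio of non-zero (brain) pixels to zero (background) pixels.
    met['brain'] = (
        1. * (rightside != 0).sum() / (rightside == 0).sum())
    # Fraction of non-zero pixels with a value above 2; the epsilon avoids division by zero.
    met['tumor'] = (
        1. * (rightside > 2).sum() / ((rightside != 0).sum() + 1e-10))
    met['has_enough_brain'] = met['brain'] > 0.30
    met['has_tumor'] = met['tumor'] > 0.01
    return met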
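
To see how the thresholds behave, here is a small usage sketch on a toy array. The function body repeats the snippet above so the example runs on its own; the array size and the label values (0, 1, 4) are hypothetical and chosen only for illustration.

```python
import numpy as np

def get_labels(rightside):
    # Same logic as the snippet above, repeated so this example is self-contained.
    met = {}
    met['brain'] = 1. * (rightside != 0).sum() / (rightside == 0).sum()
    met['tumor'] = 1. * (rightside > 2).sum() / ((rightside != 0).sum() + 1e-10)
    met['has_enough_brain'] = met['brain'] > 0.30
    met['has_tumor'] = met['tumor'] > 0.01
    return met

# Hypothetical 10x10 half-slice: 0 = background, 1 = brain, 4 = tumor.
toy = np.zeros((10, 10), dtype=np.int64)
toy[2:8, 2:8] = 1   # 36 non-zero pixels, leaving 64 zero pixels
toy[4:6, 4:6] = 4   # 4 of the non-zero pixels have a value above 2

labels = get_labels(toy)
print(labels)
```

Here `met['brain']` is 36/64 (non-zero over zero pixels) and `met['tumor']` is about 4/36, so both flags are set for this toy input.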

# File and Folder structure
The files are organized by split (train or holdout) and then by image type (flair, t1, segmentation), as shown in the folder tree further down.

The pixel values of the segmentation images correspond to the following 6 classes:

Non Tumor classes: 0, 10, 20
Tumor classes: 40
Unknown classes: 30, 50
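
The class groups above can be turned into boolean masks. This is a minimal sketch assuming a segmentation image has already been loaded as a NumPy integer array; the constant and function names are hypothetical.

```python
import numpy as np

# Pixel values taken from the class list above.
NON_TUMOR = (0, 10, 20)
TUMOR = (40,)
UNKNOWN = (30, 50)

def tumor_mask(segmentation):
    """True where a pixel value belongs to a tumor class."""
    return np.isin(segmentation, TUMOR)

def unknown_mask(segmentation):
    """True where a pixel value belongs to an unknown class."""
    return np.isin(segmentation, UNKNOWN)

# Hypothetical 2x3 segmentation containing each of the 6 values once.
seg = np.array([[0, 10, 40],
                [20, 30, 50]], dtype=np.uint8)

print(tumor_mask(seg))
print(unknown_mask(seg))
```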

Example images (not reproduced here): a Tumor example and a NoTumor example.

The folders are divided into training (train) and testing (holdout) sets by patient. Each set is then divided into flair, t1, and segmentation images.
train (2125 images, 1421 tumor, 704 notumor)
├── flair 
├── segmentation
└── t1
holdout (1415 images, 1051 tumor, 364 notumor)
├── flair
├── segmentation
└── t1
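
A quick way to check a local copy against the tree above is to count the files in each folder. This is a sketch under assumptions: `root` is wherever the dataset was extracted, `count_files` is a hypothetical helper, and nothing is assumed about file names or extensions.

```python
from pathlib import Path

# Folder names taken from the tree above.
SPLITS = ("train", "holdout")
MODALITIES = ("flair", "segmentation", "t1")

def count_files(root):
    """Count regular files in root/<split>/<modality>; missing folders count as 0."""
    root = Path(root)
    counts = {}
    for split in SPLITS:
        for modality in MODALITIES:
            folder = root / split / modality
            if folder.is_dir():
                counts[(split, modality)] = sum(
                    1 for p in folder.iterdir() if p.is_file())
            else:
                counts[(split, modality)] = 0
    return counts
```

Calling `count_files("path/to/dataset")` returns a dictionary keyed by `(split, modality)`, which can be compared with the image counts listed for each split.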

Patients in training: ['HG0018' 'HG0019' 'HG0012' 'HG0013' 'HG0010' 'HG0011' 'HG0016' 'HG0017'
 'HG0014' 'HG0015' 'HG0023' 'HG0022' 'HG0021' 'LG0005' 'LG0004' 'LG0007'
 'LG0006' 'LG0001' 'LG0003' 'LG0002' 'LG0025' 'LG0024' 'LG0009' 'LG0022'
 'LG0021' 'LG0020' 'HG0009' 'HG0008' 'HG0002' 'HG0025']

Patients in test: ['HG0001' 'HG0003' 'HG0024' 'HG0005' 'HG0004' 'HG0007' 'HG0006' 'HG0020'
 'LG0023' 'LG0008' 'LG0016' 'LG0017' 'LG0014' 'LG0015' 'LG0012' 'LG0013'
 'LG0010' 'LG0011' 'LG0018' 'LG0019']
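
Because the split is made by patient, no patient should appear in both lists. The sketch below copies the two lists above into Python and checks that they do not overlap; the variable names are hypothetical.

```python
train_patients = [
    'HG0018', 'HG0019', 'HG0012', 'HG0013', 'HG0010', 'HG0011', 'HG0016', 'HG0017',
    'HG0014', 'HG0015', 'HG0023', 'HG0022', 'HG0021', 'LG0005', 'LG0004', 'LG0007',
    'LG0006', 'LG0001', 'LG0003', 'LG0002', 'LG0025', 'LG0024', 'LG0009', 'LG0022',
    'LG0021', 'LG0020', 'HG0009', 'HG0008', 'HG0002', 'HG0025',
]
test_patients = [
    'HG0001', 'HG0003', 'HG0024', 'HG0005', 'HG0004', 'HG0007', 'HG0006', 'HG0020',
    'LG0023', 'LG0008', 'LG0016', 'LG0017', 'LG0014', 'LG0015', 'LG0012', 'LG0013',
    'LG0010', 'LG0011', 'LG0018', 'LG0019',
]

# Patients present in both splits; an empty set means the split is clean.
overlap = set(train_patients) & set(test_patients)
print(len(train_patients), len(test_patients), sorted(overlap))
```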

Sample Flair Images (Tumor and NoTumor columns; images not reproduced here)

# Citation

If you use this dataset, please cite:

Distribution Matching Losses Can Hallucinate Features in Medical Image Translation
Joseph Paul Cohen, Margaux Luck, Sina Honari
Medical Image Computing & Computer Assisted Intervention (MICCAI)

author = {Cohen, Joseph Paul and Luck, Margaux and Honari, Sina},
journal = {Medical Image Computing & Computer Assisted Intervention (MICCAI)},
title = {Distribution Matching Losses Can Hallucinate Features in Medical Image Translation},
year = {2018}

## License
The original files are shared under the following license, so our dataset is shared under the same license.

"Except where otherwise noted, content is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Switzerland License."

The following papers describe the original dataset:

Menze et al., The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS), IEEE Trans. Med. Imaging, 2015.

Kistler et al., The virtual skeleton database: an open access repository for biomedical research and collaboration. JMIR, 2013.
terms= {},
license= {Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)},
superseded= {},
url= {}
