MosMedData: Chest CT Scans with COVID-19 Related Findings COVID19_1110 1.0
MosMed

COVID19_1110 (1166 files)
dataset_registry.xlsx 35.24kB
LICENSE 30.71kB
masks/study_0255_mask.nii.gz 30.65kB
masks/study_0256_mask.nii.gz 26.84kB
masks/study_0257_mask.nii.gz 21.43kB
masks/study_0258_mask.nii.gz 25.35kB
masks/study_0259_mask.nii.gz 23.07kB
masks/study_0260_mask.nii.gz 27.50kB
masks/study_0261_mask.nii.gz 22.63kB
masks/study_0262_mask.nii.gz 20.78kB
masks/study_0263_mask.nii.gz 28.01kB
masks/study_0264_mask.nii.gz 28.56kB
masks/study_0265_mask.nii.gz 31.89kB
masks/study_0266_mask.nii.gz 18.50kB
masks/study_0267_mask.nii.gz 21.64kB
masks/study_0268_mask.nii.gz 28.31kB
masks/study_0269_mask.nii.gz 27.84kB
masks/study_0270_mask.nii.gz 34.22kB
masks/study_0271_mask.nii.gz 21.97kB
masks/study_0272_mask.nii.gz 19.98kB
masks/study_0273_mask.nii.gz 24.18kB
masks/study_0274_mask.nii.gz 28.13kB
masks/study_0275_mask.nii.gz 26.70kB
masks/study_0276_mask.nii.gz 30.41kB
masks/study_0277_mask.nii.gz 27.94kB
masks/study_0278_mask.nii.gz 25.62kB
masks/study_0279_mask.nii.gz 25.88kB
masks/study_0280_mask.nii.gz 26.42kB
masks/study_0281_mask.nii.gz 27.94kB
masks/study_0282_mask.nii.gz 21.14kB
masks/study_0283_mask.nii.gz 22.69kB
masks/study_0284_mask.nii.gz 24.84kB
masks/study_0285_mask.nii.gz 23.28kB
masks/study_0286_mask.nii.gz 32.38kB
masks/study_0287_mask.nii.gz 22.16kB
masks/study_0288_mask.nii.gz 39.78kB
masks/study_0289_mask.nii.gz 26.58kB
masks/study_0290_mask.nii.gz 24.80kB
masks/study_0291_mask.nii.gz 26.12kB
masks/study_0292_mask.nii.gz 25.78kB
masks/study_0293_mask.nii.gz 21.99kB
masks/study_0294_mask.nii.gz 25.79kB
masks/study_0295_mask.nii.gz 26.06kB
masks/study_0296_mask.nii.gz 18.89kB
masks/study_0297_mask.nii.gz 57.67kB
masks/study_0298_mask.nii.gz 32.68kB
masks/study_0299_mask.nii.gz 27.38kB
masks/study_0300_mask.nii.gz 21.90kB
masks/study_0301_mask.nii.gz 21.43kB
Type: Dataset
Tags: COVID19_1110, CT, pulmonary, viral, infection, lungs, chest, COVID-19, computed tomography, radiology

Bibtex:
@article{,
title= {MosMedData: Chest CT Scans with COVID-19 Related Findings  COVID19_1110 1.0},
keywords= {COVID19_1110, CT, pulmonary, viral, infection, lungs, chest, COVID-19, computed tomography, radiology},
author= {MosMed},
abstract= {This dataset contains anonymised human lung computed tomography (CT) scans with COVID-19 related findings, as well as without such findings. A small subset of studies has been annotated with binary pixel masks depicting regions of interest (ground-glass opacifications and consolidations). CT scans were obtained between 1 March 2020 and 25 April 2020, and were provided by hospitals in Moscow, Russia.

https://i.imgur.com/hLFBdBH.png

## Data Structure
```
.
|-- dataset_registry.xlsx
|-- LICENSE
|-- README_EN.md
|-- README_RU.md
|-- README_EN.pdf
|-- README_RU.pdf
|-- masks
|   |-- study_BBBB_mask.nii.gz
|   |-- ...
|   `-- study_BBBB_mask.nii.gz
`-- studies
    |-- CT-0
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    |-- CT-1
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    |-- CT-2
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    |-- CT-3
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    `-- CT-4
        |-- study_BBBB.nii.gz
        |-- ...
        `-- study_BBBB.nii.gz
    
```

* `README_EN.md` and `README_RU.md` contain general information about the dataset in `Markdown` format, in English and Russian respectively. `README_EN.pdf` and `README_RU.pdf` contain the same information in `PDF` format for convenience.
* The `LICENSE` file contains the full text of the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) License.
* `dataset_registry.xlsx` is a spreadsheet listing every study in the dataset, with the relative path to each study file and to its binary mask, if present.
* The `studies` directory contains the subdirectories `CT-0`, `CT-1`, `CT-2`, `CT-3`, and `CT-4` (see below for details). Each subdirectory contains studies in `NIfTI` format, compressed with `Gzip`. Each study has a unique name of the form `study_BBBB.nii.gz`, where `BBBB` is the sequential number of the study within the whole dataset.
* The `masks` directory contains binary pixel masks in `NIfTI` format, compressed with `Gzip`. Each mask has a unique name of the form `study_BBBB_mask.nii.gz`, where `BBBB` is the number of the corresponding study.
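
The naming convention above makes it possible to derive the expected mask path from a study path. The following is a minimal sketch, assuming the archive has been extracted under some root directory; the helper names `study_id` and `mask_path_for` are hypothetical and not part of the dataset.

```python
import re
from pathlib import Path
from typing import Optional

# Matches names like "study_0255.nii.gz" and captures the number.
# The digit count is left open, since the README only calls it "BBBB".
_STUDY_RE = re.compile(r"^study_(\d+)\.nii\.gz$")


def study_id(study_path: Path) -> Optional[str]:
    """Return the BBBB part of a study file name, or None if it does not match."""
    match = _STUDY_RE.match(study_path.name)
    return match.group(1) if match else None


def mask_path_for(root: Path, study_path: Path) -> Optional[Path]:
    """Build the expected mask path for a study (the file may not exist)."""
    number = study_id(study_path)
    if number is None:
        return None
    return root / "masks" / f"study_{number}_mask.nii.gz"
```

Because only a small subset of studies has a mask, a caller would still need to check that the returned path exists, or read the mask column of `dataset_registry.xlsx` instead.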

## Data Overview

| Property | Value |
| :--- | :--- |
| Number of studies, pcs. | 1110 |
| Number of patients, ppl. | 1110 |
| Distribution by sex, % (M/ F/ O) | 42/ 56/ 2 |
| Distribution by age, years (min./ median/ max.) | 18/ 47/ 97 |
| Number of binary pixel masks (Class A Annotation), pcs. | 50 |
| Number of studies in each category (Class C Annotation), pcs. (CT-0/ CT-1/ CT-2/ CT-3/ CT-4) | 254/ 684/ 125/ 45/ 2 |
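
The category counts in the table add up to the 1110 studies and are strongly imbalanced. A small sketch that checks the total and derives each category's share; the variable names are illustrative only.

```python
# Study counts per category, copied from the table above.
counts = {"CT-0": 254, "CT-1": 684, "CT-2": 125, "CT-3": 45, "CT-4": 2}

total = sum(counts.values())
assert total == 1110  # matches "Number of studies"

# Share of each category as a percentage of all studies.
shares = {name: 100 * n / total for name, n in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The shares matter mainly when splitting the data for training, since the `CT-4` category contains only two studies.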

### Data Preprocessing

* Each study corresponds to a unique patient.
* Each study is represented by one series of images reconstructed in a soft-tissue mediastinal window.
```
SeriesDescription LIKE '%BODY%'
```

* During the `DICOM`-to-`NIfTI` conversion, only every 10th image (Instance) was preserved.
```
InstanceNumber % 10 = 0
```
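
The selection rule above can be illustrated in Python. This is a sketch of the rule only, not the authors' conversion pipeline; the function name `keep_every_tenth` is hypothetical, and the input is assumed to be the `InstanceNumber` values of one series.

```python
from typing import Iterable, List


def keep_every_tenth(instance_numbers: Iterable[int]) -> List[int]:
    """Keep only instances whose InstanceNumber is a multiple of 10."""
    return [n for n in instance_numbers if n % 10 == 0]


# A series with 45 slices numbered 1..45 keeps 4 of them.
kept = keep_every_tenth(range(1, 46))
print(kept)  # [10, 20, 30, 40]
```

One consequence is that each `NIfTI` volume holds roughly a tenth of the slices of the original series, so the spacing between slices is correspondingly coarse.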

### Class C Annotation Principles
Studies are distributed into [5 categories](http://medradiology.moscow/f/svodnye_dannye_po_ocenke_tyazhesti_sostoyaniya_pacienta_s_covid-19_v4.pdf)<sup>1</sup>:
* **CT-0** (`/studies/CT-0` directory): normal lung tissue, no CT signs of viral pneumonia.
* **CT-1** (`/studies/CT-1` directory): several ground-glass opacifications, involvement of lung parenchyma is less than 25%.
* **CT-2** (`/studies/CT-2` directory): ground-glass opacifications, involvement of lung parenchyma is between 25 and 50%. 
* **CT-3** (`/studies/CT-3` directory): ground-glass opacifications and regions of consolidation, involvement of lung parenchyma is between 50 and 75%.
* **CT-4** (`/studies/CT-4` directory): diffuse ground-glass opacifications and consolidation as well as reticular changes in lungs. Involvement of lung parenchyma exceeds 75%.
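
The percentage thresholds in the list above can be written as a simple mapping. This is a simplified sketch: the function name `ct_category` is hypothetical, the handling of values at exactly 25, 50 and 75 is an assumption (the text leaves those boundaries open), and the real categories were assigned by radiologists who also considered qualitative findings such as consolidation and reticular changes.

```python
def ct_category(involvement_percent: float) -> str:
    """Map lung parenchyma involvement (in percent) to a CT-0..CT-4 label.

    Assumptions of this sketch: 0 means no CT signs of viral pneumonia,
    and the boundary values 25, 50 and 75 fall into the lower-numbered
    category that mentions them (25 and 50 -> CT-2, 75 -> CT-3).
    """
    if not 0 <= involvement_percent <= 100:
        raise ValueError("involvement must be between 0 and 100 percent")
    if involvement_percent == 0:
        return "CT-0"
    if involvement_percent < 25:
        return "CT-1"
    if involvement_percent <= 50:
        return "CT-2"
    if involvement_percent <= 75:
        return "CT-3"
    return "CT-4"
```

For example, under these assumptions `ct_category(10)` gives `"CT-1"` and `ct_category(80)` gives `"CT-4"`.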

### Class A Annotation Principles
A small subset of studies (50 pcs.) has been annotated by experts of the Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department. During annotation, ground-glass opacifications and regions of consolidation in every image were marked as positive (white) pixels on the corresponding binary pixel mask. The resulting masks were saved in `NIfTI` format and then compressed with `Gzip`.

The [MedSeg](http://medicalsegmentation.com) software has been used for annotation purposes (© 2020 Artificial Intelligence AS).

## Sharing and Access Information

### License
Copyright © 2020 Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department.

This dataset is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) License. See the `LICENSE` file or follow the [link](https://creativecommons.org/licenses/by-nc-nd/3.0/) for more information.


### Citation
English version of recommended citation:
> Morozov, S., Andreychenko, A., Blokhin, I., Vladzymyrskyy, A., Gelezhe, P., Gombolevskiy, V., Gonchar, A., Ledikhova, N., Pavlov, N., Chernina, V. MosMedData: Chest CT Scans with COVID-19 Related Findings, 2020, v. 1.0, https://mosmed.ai/datasets/covid19_1110

Russian version of recommended citation:
> Морозов С. П., Андрейченко А. Е., Блохин И. А., Владзимирский А. В., Гележе П. Б., Гомболевский В. А., Гончар А. П., Ледихова Н. В., Павлов Н. А., Чернина В. Ю. MosMedData: результаты исследований компьютерной томографии органов грудной клетки с признаками COVID-19, 2020 г., версия 1.0, https://mosmed.ai/datasets/covid19_1110

https://www.medrxiv.org/content/10.1101/2020.05.20.20100362v1

},
terms= {},
license= {Creative Commons Attribution-NonCommercial-NoDerivs 3.0},
superseded= {},
url= {https://mosmed.ai/datasets/covid19_1110}
}

