MosMedData: Chest CT Scans with COVID-19 Related Findings COVID19_1110 1.0

COVID19_1110 (1166 files)
dataset_registry.xlsx 35.24kB
masks/study_0255_mask.nii.gz 30.65kB
masks/study_0256_mask.nii.gz 26.84kB
masks/study_0257_mask.nii.gz 21.43kB
masks/study_0258_mask.nii.gz 25.35kB
masks/study_0259_mask.nii.gz 23.07kB
masks/study_0260_mask.nii.gz 27.50kB
masks/study_0261_mask.nii.gz 22.63kB
masks/study_0262_mask.nii.gz 20.78kB
masks/study_0263_mask.nii.gz 28.01kB
masks/study_0264_mask.nii.gz 28.56kB
masks/study_0265_mask.nii.gz 31.89kB
masks/study_0266_mask.nii.gz 18.50kB
masks/study_0267_mask.nii.gz 21.64kB
masks/study_0268_mask.nii.gz 28.31kB
masks/study_0269_mask.nii.gz 27.84kB
masks/study_0270_mask.nii.gz 34.22kB
masks/study_0271_mask.nii.gz 21.97kB
masks/study_0272_mask.nii.gz 19.98kB
masks/study_0273_mask.nii.gz 24.18kB
masks/study_0274_mask.nii.gz 28.13kB
masks/study_0275_mask.nii.gz 26.70kB
masks/study_0276_mask.nii.gz 30.41kB
masks/study_0277_mask.nii.gz 27.94kB
masks/study_0278_mask.nii.gz 25.62kB
masks/study_0279_mask.nii.gz 25.88kB
masks/study_0280_mask.nii.gz 26.42kB
masks/study_0281_mask.nii.gz 27.94kB
masks/study_0282_mask.nii.gz 21.14kB
masks/study_0283_mask.nii.gz 22.69kB
masks/study_0284_mask.nii.gz 24.84kB
masks/study_0285_mask.nii.gz 23.28kB
masks/study_0286_mask.nii.gz 32.38kB
masks/study_0287_mask.nii.gz 22.16kB
masks/study_0288_mask.nii.gz 39.78kB
masks/study_0289_mask.nii.gz 26.58kB
masks/study_0290_mask.nii.gz 24.80kB
masks/study_0291_mask.nii.gz 26.12kB
masks/study_0292_mask.nii.gz 25.78kB
masks/study_0293_mask.nii.gz 21.99kB
masks/study_0294_mask.nii.gz 25.79kB
masks/study_0295_mask.nii.gz 26.06kB
masks/study_0296_mask.nii.gz 18.89kB
masks/study_0297_mask.nii.gz 57.67kB
masks/study_0298_mask.nii.gz 32.68kB
masks/study_0299_mask.nii.gz 27.38kB
masks/study_0300_mask.nii.gz 21.90kB
masks/study_0301_mask.nii.gz 21.43kB
masks/study_0302_mask.nii.gz 22.64kB
masks/study_0303_mask.nii.gz 28.86kB
masks/study_0304_mask.nii.gz 23.27kB 9.47kB
README_EN.pdf 286.91kB 14.55kB
README_RU.pdf 296.92kB
studies/CT-0/study_0001.nii.gz 11.17MB
studies/CT-0/study_0002.nii.gz 10.93MB
studies/CT-0/study_0003.nii.gz 10.41MB
studies/CT-0/study_0004.nii.gz 12.03MB
studies/CT-0/study_0005.nii.gz 10.53MB
studies/CT-0/study_0006.nii.gz 9.80MB
studies/CT-0/study_0007.nii.gz 11.57MB
studies/CT-0/study_0008.nii.gz 9.03MB
studies/CT-0/study_0009.nii.gz 9.70MB
studies/CT-0/study_0010.nii.gz 13.60MB
studies/CT-0/study_0011.nii.gz 11.37MB
studies/CT-0/study_0012.nii.gz 10.90MB
studies/CT-0/study_0013.nii.gz 10.03MB
studies/CT-0/study_0014.nii.gz 11.55MB
studies/CT-0/study_0015.nii.gz 11.29MB
studies/CT-0/study_0016.nii.gz 10.84MB
studies/CT-0/study_0017.nii.gz 11.35MB
studies/CT-0/study_0018.nii.gz 11.54MB
studies/CT-0/study_0019.nii.gz 11.83MB
studies/CT-0/study_0020.nii.gz 10.71MB
studies/CT-0/study_0021.nii.gz 11.09MB
studies/CT-0/study_0022.nii.gz 10.96MB
studies/CT-0/study_0023.nii.gz 11.55MB
studies/CT-0/study_0024.nii.gz 11.52MB
studies/CT-0/study_0025.nii.gz 11.55MB
studies/CT-0/study_0026.nii.gz 11.51MB
studies/CT-0/study_0027.nii.gz 9.42MB
studies/CT-0/study_0028.nii.gz 9.43MB
studies/CT-0/study_0029.nii.gz 8.93MB
studies/CT-0/study_0030.nii.gz 10.99MB
studies/CT-0/study_0031.nii.gz 8.71MB
studies/CT-0/study_0032.nii.gz 11.09MB
studies/CT-0/study_0033.nii.gz 11.52MB
studies/CT-0/study_0034.nii.gz 8.60MB
studies/CT-0/study_0035.nii.gz 11.32MB
studies/CT-0/study_0036.nii.gz 10.90MB
studies/CT-0/study_0037.nii.gz 11.62MB
studies/CT-0/study_0038.nii.gz 12.10MB
studies/CT-0/study_0039.nii.gz 11.31MB
studies/CT-0/study_0040.nii.gz 13.03MB
studies/CT-0/study_0041.nii.gz 10.71MB
studies/CT-0/study_0042.nii.gz 8.44MB
studies/CT-0/study_0043.nii.gz 11.18MB
studies/CT-0/study_0044.nii.gz 9.67MB
studies/CT-0/study_0045.nii.gz 10.33MB
studies/CT-0/study_0046.nii.gz 10.46MB
studies/CT-0/study_0047.nii.gz 10.92MB
studies/CT-0/study_0048.nii.gz 11.28MB
studies/CT-0/study_0049.nii.gz 11.35MB
studies/CT-0/study_0050.nii.gz 11.49MB
studies/CT-0/study_0051.nii.gz 11.49MB
studies/CT-0/study_0052.nii.gz 9.96MB
studies/CT-0/study_0053.nii.gz 9.95MB
studies/CT-0/study_0054.nii.gz 10.14MB
studies/CT-0/study_0055.nii.gz 10.24MB
studies/CT-0/study_0056.nii.gz 11.60MB
studies/CT-0/study_0057.nii.gz 9.83MB
studies/CT-0/study_0058.nii.gz 10.55MB
studies/CT-0/study_0059.nii.gz 9.92MB
studies/CT-0/study_0060.nii.gz 11.81MB
studies/CT-0/study_0061.nii.gz 10.97MB
studies/CT-0/study_0062.nii.gz 10.58MB
studies/CT-0/study_0063.nii.gz 11.36MB
studies/CT-0/study_0064.nii.gz 10.74MB
studies/CT-0/study_0065.nii.gz 9.62MB
studies/CT-0/study_0066.nii.gz 10.95MB
studies/CT-0/study_0067.nii.gz 11.18MB
studies/CT-0/study_0068.nii.gz 9.37MB
studies/CT-0/study_0069.nii.gz 10.30MB
studies/CT-0/study_0070.nii.gz 10.72MB
studies/CT-0/study_0071.nii.gz 11.59MB
studies/CT-0/study_0072.nii.gz 11.43MB
studies/CT-0/study_0073.nii.gz 10.52MB
studies/CT-0/study_0074.nii.gz 9.91MB
studies/CT-0/study_0075.nii.gz 10.57MB
studies/CT-0/study_0076.nii.gz 8.95MB
studies/CT-0/study_0077.nii.gz 9.51MB
studies/CT-0/study_0078.nii.gz 10.50MB
studies/CT-0/study_0079.nii.gz 10.20MB
studies/CT-0/study_0080.nii.gz 8.75MB
studies/CT-0/study_0081.nii.gz 10.45MB
studies/CT-0/study_0082.nii.gz 10.96MB
studies/CT-0/study_0083.nii.gz 10.76MB
studies/CT-0/study_0084.nii.gz 10.80MB
studies/CT-0/study_0085.nii.gz 10.52MB
studies/CT-0/study_0086.nii.gz 11.50MB
studies/CT-0/study_0087.nii.gz 11.94MB
studies/CT-0/study_0088.nii.gz 10.90MB
studies/CT-0/study_0089.nii.gz 9.98MB
studies/CT-0/study_0090.nii.gz 9.99MB
studies/CT-0/study_0091.nii.gz 10.41MB
studies/CT-0/study_0092.nii.gz 8.93MB
studies/CT-0/study_0093.nii.gz 9.97MB
studies/CT-0/study_0094.nii.gz 10.95MB
studies/CT-0/study_0095.nii.gz 9.08MB
studies/CT-0/study_0096.nii.gz 10.89MB
studies/CT-0/study_0097.nii.gz 9.62MB
studies/CT-0/study_0098.nii.gz 11.30MB
studies/CT-0/study_0099.nii.gz 9.81MB
studies/CT-0/study_0100.nii.gz 10.09MB
studies/CT-0/study_0101.nii.gz 10.02MB
studies/CT-0/study_0102.nii.gz 10.19MB
studies/CT-0/study_0103.nii.gz 11.44MB
studies/CT-0/study_0104.nii.gz 11.37MB
studies/CT-0/study_0105.nii.gz 11.83MB
studies/CT-0/study_0106.nii.gz 11.49MB
studies/CT-0/study_0107.nii.gz 10.73MB
studies/CT-0/study_0108.nii.gz 11.15MB
studies/CT-0/study_0109.nii.gz 10.10MB
studies/CT-0/study_0110.nii.gz 9.56MB
studies/CT-0/study_0111.nii.gz 10.99MB
studies/CT-0/study_0112.nii.gz 10.43MB
studies/CT-0/study_0113.nii.gz 12.26MB
studies/CT-0/study_0114.nii.gz 10.48MB
studies/CT-0/study_0115.nii.gz 11.29MB
studies/CT-0/study_0116.nii.gz 10.18MB
studies/CT-0/study_0117.nii.gz 11.31MB
studies/CT-0/study_0118.nii.gz 11.11MB
studies/CT-0/study_0119.nii.gz 12.65MB
studies/CT-0/study_0120.nii.gz 11.22MB
studies/CT-0/study_0121.nii.gz 11.62MB
studies/CT-0/study_0122.nii.gz 10.37MB
studies/CT-0/study_0123.nii.gz 12.18MB
studies/CT-0/study_0124.nii.gz 11.43MB
studies/CT-0/study_0125.nii.gz 11.81MB
studies/CT-0/study_0126.nii.gz 11.54MB
studies/CT-0/study_0127.nii.gz 12.18MB
studies/CT-0/study_0128.nii.gz 11.66MB
studies/CT-0/study_0129.nii.gz 11.68MB
studies/CT-0/study_0130.nii.gz 10.32MB
studies/CT-0/study_0131.nii.gz 10.25MB
studies/CT-0/study_0132.nii.gz 11.12MB
studies/CT-0/study_0133.nii.gz 10.11MB
studies/CT-0/study_0134.nii.gz 11.89MB
studies/CT-0/study_0135.nii.gz 11.30MB
studies/CT-0/study_0136.nii.gz 11.50MB
studies/CT-0/study_0137.nii.gz 8.91MB
studies/CT-0/study_0138.nii.gz 10.45MB
studies/CT-0/study_0139.nii.gz 8.74MB
studies/CT-0/study_0140.nii.gz 8.21MB
studies/CT-0/study_0141.nii.gz 17.43MB
studies/CT-0/study_0142.nii.gz 10.55MB
studies/CT-0/study_0143.nii.gz 11.98MB
Type: Dataset
Tags: COVID19_1110, CT, pulmonary, viral, infection, lungs, chest, COVID-19, computed tomography

title= {MosMedData: Chest CT Scans with COVID-19 Related Findings COVID19_1110 1.0},
keywords= {COVID19_1110, computed tomography, CT, pulmonary, viral, infection, lungs, chest, COVID-19},
author= {MosMed},
abstract= {This dataset contains anonymised human lung computed tomography (CT) scans with COVID-19 related findings, as well as without such findings. A small subset of studies has been annotated with binary pixel masks depicting regions of interest (ground-glass opacifications and consolidations). CT scans were obtained between 1st of March, 2020 and 25th of April, 2020, and provided by medical hospitals in Moscow, Russia.

## Data Structure
|-- dataset_registry.xlsx
|-- README_EN.pdf
|-- README_RU.pdf
|-- masks
|   |-- study_BBBB_mask.nii.gz
|   |-- ...
|   `-- study_BBBB_mask.nii.gz
`-- studies
    |-- CT-0
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    |-- CT-1
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    |-- CT-2
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    |-- CT-3
    |   |-- study_BBBB.nii.gz
    |   |-- ...
    |   `-- study_BBBB.nii.gz
    `-- CT-4
        |-- study_BBBB.nii.gz
        |-- ...
        `-- study_BBBB.nii.gz

* The Markdown README files contain general information about the dataset in English and Russian. `README_EN.pdf` and `README_RU.pdf` contain the same information in `PDF` format for convenience.
* The `LICENSE` file contains the full text of the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) License.
* `dataset_registry.xlsx` is a spreadsheet with the full list of studies included in the dataset, as well as the relative paths to each study file and to its binary mask, if present.
* The `studies` directory contains the subdirectories `CT-0`, `CT-1`, `CT-2`, `CT-3`, and `CT-4` (for more information see below). Each subdirectory contains studies in `NIfTI` format, compressed with `Gzip`. Each study has a unique name of the form `study_BBBB.nii.gz`, where `BBBB` is the sequential number of the study in the whole dataset.
* The `masks` directory contains binary pixel masks in `NIfTI` format, compressed with `Gzip`. Each mask has a unique name of the form `study_BBBB_mask.nii.gz`, where `BBBB` is the number of the corresponding study.
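
The naming scheme described above can be sketched as a pair of path helpers. This is only an illustration: the function names `study_path` and `mask_path` are hypothetical, and the four-digit zero padding is an assumption based on the file names shown in the listing (for example `study_0001.nii.gz`).

```python
from pathlib import PurePosixPath

# Category directories named in the dataset description.
CATEGORIES = ("CT-0", "CT-1", "CT-2", "CT-3", "CT-4")


def study_path(category, number):
    """Relative path of a study file (hypothetical helper)."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    # Assumption: BBBB is the study number zero-padded to four digits.
    return PurePosixPath("studies") / category / f"study_{number:04d}.nii.gz"


def mask_path(number):
    """Relative path where a mask for this study would live, if one exists."""
    return PurePosixPath("masks") / f"study_{number:04d}_mask.nii.gz"


print(study_path("CT-0", 1))  # studies/CT-0/study_0001.nii.gz
print(mask_path(255))         # masks/study_0255_mask.nii.gz
```

Only 50 studies have a mask, so a path returned by `mask_path` may not exist on disk. `dataset_registry.xlsx` remains the authoritative source for which study belongs to which category and whether it has a mask.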

## Data Overview

| Property | Value |
| :--- | :--- |
| Number of studies, pcs. | 1110 |
| Number of patients, ppl. | 1110 |
| Distribution by sex, % (M/ F/ O) | 42/ 56/ 2 |
| Distribution by age, years (min./ median/ max.) | 18/ 47/ 97 |
| Number of binary pixel masks (Class A Annotation), pcs. | 50 |
| Number of studies in each category (Class C Annotation), pcs. (CT–0/ CT–1/ CT–2/ CT–3/ CT–4) | 254/ 684/ 125/ 45/ 2 |
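
The category counts in the table are strongly imbalanced, which matters when the categories are used as classification labels. The short sketch below takes the counts from the table, checks that they add up to the stated number of studies, and prints each category's share. The variable names are illustrative.

```python
# Per-category study counts, copied from the table above.
counts = {"CT-0": 254, "CT-1": 684, "CT-2": 125, "CT-3": 45, "CT-4": 2}

total = sum(counts.values())
# Share of each category as a percentage of all studies.
shares = {name: 100 * n / total for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]:4d} studies ({share:.1f}%)")
print("total:", total)  # total: 1110
```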

### Data Preprocessing

* Each study corresponds to a unique patient.
* Each study is represented by one series of images reconstructed into a soft-tissue mediastinal window.
  `SeriesDescription LIKE '%BODY%'`

* During the `DICOM`-to-`NIfTI` conversion, only every 10th image (Instance) was preserved.
  `InstanceNumber % 10 = 0`
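
The effect of the `InstanceNumber % 10 = 0` condition can be illustrated with a small sketch. It is not the authors' conversion pipeline: the series is a hypothetical list of dictionaries standing in for DICOM instances, and the helper name `keep_every_tenth` is invented for this example.

```python
def keep_every_tenth(instances):
    """Keep only instances whose InstanceNumber is a multiple of 10."""
    return [inst for inst in instances if inst["InstanceNumber"] % 10 == 0]


# Hypothetical series of 45 slices with instance numbers 1..45.
series = [{"InstanceNumber": n} for n in range(1, 46)]

kept = keep_every_tenth(series)
print([inst["InstanceNumber"] for inst in kept])  # [10, 20, 30, 40]
```

In this example a 45-slice series is reduced to 4 slices, so the resulting volumes are much coarser along the slice axis than the original scans.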

### Class C Annotation Principles
Studies are distributed into 5 categories:
* **CT-0** (`/studies/CT-0` directory): normal lung tissue, no CT-signs of viral pneumonia.
* **CT-1** (`/studies/CT-1` directory): several ground-glass opacifications, involvement of lung parenchyma is less than 25%.
* **CT-2** (`/studies/CT-2` directory): ground-glass opacifications, involvement of lung parenchyma is between 25 and 50%. 
* **CT-3** (`/studies/CT-3` directory): ground-glass opacifications and regions of consolidation, involvement of lung parenchyma is between 50 and 75%.
* **CT-4** (`/studies/CT-4` directory): diffuse ground-glass opacifications and consolidation as well as reticular changes in lungs. Involvement of lung parenchyma exceeds 75%.
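
The percentage thresholds in the list above can be written as a simple lookup. This is a rough sketch, not the annotation procedure: the real categories also depend on the type of findings (ground-glass opacification, consolidation, reticular changes), which a single number cannot capture. The function name `ct_category` is hypothetical, and assigning exactly 50% to CT-2 and exactly 75% to CT-3 is an assumption, since the text does not say which side the boundaries fall on.

```python
def ct_category(involvement_percent):
    """Map lung parenchyma involvement (0-100 %) to a Class C category."""
    if not 0 <= involvement_percent <= 100:
        raise ValueError("involvement must be between 0 and 100 percent")
    if involvement_percent == 0:
        return "CT-0"  # no CT signs of viral pneumonia
    if involvement_percent < 25:
        return "CT-1"  # less than 25 %
    if involvement_percent <= 50:
        return "CT-2"  # 25 to 50 % (upper boundary assigned here by assumption)
    if involvement_percent <= 75:
        return "CT-3"  # 50 to 75 % (upper boundary assigned here by assumption)
    return "CT-4"      # exceeds 75 %


for value in (0, 10, 25, 60, 80):
    print(value, ct_category(value))
```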

### Class A Annotation Principles
A small subset of studies (50 pcs.) has been annotated by the experts of the Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department. During annotation, ground-glass opacifications and regions of consolidation on every image were marked as positive (white) pixels on the corresponding binary pixel mask. The resulting masks were saved in `NIfTI` format and then compressed with `Gzip`.

The MedSeg software has been used for annotation purposes (© 2020 Artificial Intelligence AS).

## Sharing and Access Information

### License
Copyright © 2020 Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department.

This dataset is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) License. See the `LICENSE` file for more information.

### Citation
English version of recommended citation:
> Morozov, S., Andreychenko, A., Blokhin, I., Vladzymyrskyy, A., Gelezhe, P., Gombolevskiy, V., Gonchar, A., Ledikhova, N., Pavlov, N., Chernina, V. MosMedData: Chest CT Scans with COVID-19 Related Findings, 2020, v. 1.0,

Russian version of recommended citation:
> Морозов С. П., Андрейченко А. Е., Блохин И. А., Владзимирский А. В., Гележе П. Б., Гомболевский В. А., Гончар А. П., Ледихова Н. В., Павлов Н. А., Чернина В. Ю. MosMedData: результаты исследований компьютерной томографии органов грудной клетки с признаками COVID-19, 2020 г., версия 1.0,

terms= {},
license= {Creative Commons Attribution-NonCommercial-NoDerivs 3.0},
superseded= {},
url= {}
