Cardiac Assessment and Classification of Ultrasound (CACTUS) dataset

CactusDataset_2025.110003.zip 13.18GB
Type: Dataset

Bibtex:
@article{,
title= {Cardiac Assessment and Classification of Ultrasound (CACTUS) dataset},
keywords= {Image classification, Ultrasound Imaging, Cardiac Dataset, Convolutional Neural Network, Transfer Learning, Image Grading, Echocardiography, CACTUS},
author= {},
abstract= {The Cardiac Assessment and Classification of Ultrasound (CACTUS) dataset is an open-graded dataset designed for the evaluation and classification of cardiac ultrasound images. The dataset was created as part of the ARQUS project, which aims to develop an autonomous robotic system capable of performing ultrasound scans and extracting quantitative measurements. This project is funded by the NSERC (Natural Sciences and Engineering Research Council of Canada).

The dataset contains ultrasound images obtained from scans of the CAE Blue Phantom, a synthetic model used to simulate the human heart. These images represent a variety of heart views and exhibit different quality levels. A detailed grading schema was developed by two medical imaging experts to assess the quality of each image, which ensures that the dataset contains a diverse range of both high- and low-quality ultrasound scans.

The CACTUS dataset is particularly valuable for applications in artificial intelligence, specifically in the domain of echocardiography. It has been used in the development of automated system for the classification of cardiac ultrasound images and the assessment of image quality, which can assist medical practitioners in automating these traditionally labor-intensive tasks.



1. Title of Dataset:  CACTUS: An open dataset and framework for automated Cardiac Assessment and Classification of Ultrasound images using deep transfer learning. 

2. Author Information

A. Principal Investigator Contact Information
Name: Hanae Elmekki
Institution:  Concordia University
Email: hanae.elmekki@mail.concordia.ca

B. Associate or Co-investigator Contact Information
Name: Amanda Spilkin
 Institution:  Concordia University
Email: amanda.spilkin@mail.concordia.ca

3. Date of data collection (single date, range, approximate date): 2025-03-05

4. Geographic location of data collection: Concordia University, Montreal, Quebec, Canada

5. Information about funding sources that supported the collection of the data: Natural Sciences and Engineering Research Council of Canada (NSERC), Discovery Horizons Program and Individual Discovery Grant

---------------------------
SHARING/ACCESS INFORMATION
---------------------------

1. Licenses/restrictions placed on the data:  These data are available under a CC BY 4.0 license <https://creativecommons.org/licenses/by/4.0/> 

2. Links to publications that cite or use the data: https://doi.org/10.1016/j.compbiomed.2025.110003
@article{DBLP:journals/cbm/ElmekkiASSZZBKXPMOSM25,
  author       = {Hanae Elmekki and
                  Ahmed Alagha and
                  Hani Sami and
                  Amanda Spilkin and
                  Antonela Zanuttini and
                  Ehsan Zakeri and
                  Jamal Bentahar and
                  Lyes Kadem and
                  Wen{-}Fang Xie and
                  Philippe Pibarot and
                  Rabeb Mizouni and
                  Hadi Otrok and
                  Shakti Singh and
                  Azzam Mourad},
  title        = {{CACTUS:} An open dataset and framework for automated Cardiac Assessment
                  and Classification of Ultrasound images using deep transfer learning},
  journal      = {Comput. Biol. Medicine},
  volume       = {190},
  pages        = {110003},
  year         = {2025},
  url          = {https://doi.org/10.1016/j.compbiomed.2025.110003},
  doi          = {10.1016/J.COMPBIOMED.2025.110003},
  timestamp    = {Sun, 06 Jul 2025 13:23:07 +0200},
  biburl       = {https://dblp.org/rec/journals/cbm/ElmekkiASSZZBKXPMOSM25.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

3. Was data derived from another source? no

4. Recommended citation for this dataset: Elmekki, H., Alagha, A., Sami, H., Spilkin, A., Zanuttini, A.M., Zakeri, E., Bentahar, J., Kadem, J., Xie, W.F., Pibarot, P., Mizouni, R., Otrok, H., Singh, S., and Mourad, A. (2025). CACTUS: An open dataset and framework for automated Cardiac Assessment and Classification of Ultrasound images using deep transfer learning. Federated Research Data Repository. DOI: 10.20383/103.01484


---------------------
DATA & FOLDER OVERVIEW
---------------------

1. Folder List

   A. Folder name: Images Dataset      
      Short description: This folder contains the images used for training, validation, and testing the AI framework. It is organized into six subfolders: five representing different cardiac views and one containing random images. Each image is named starting with a number, which indicates the grade of the image. The included cardiac views are:

 	● Apical Four Chamber (A4C)  
  	● Subcostal Four Chamber (SC)  
  	● Parasternal Long Axis (PL)  
  	● Parasternal Short Axis - Aortic Valve (PSAV)  
  	● Parasternal Short Axis - Mitral Valve (PSMV)
  
   B. Folder name: Grades
      Short description: This folder contains CSV files for each cardiac view, listing the grades for each image. The grading was conducted by cardiovascular imaging experts, with random images assigned a grade of 0, and other cardiac views graded on a scale from 1 to 10.

   C. Folder name: Videos      
      Short description: This folder is subdivided into two subfolders:
	● Training: Contains videos and corresponding CSV files for the classes and grades.
	● Real-Time Scan: Contains a real-time scanning scenario with the output results through the proposed framework. The scan itself is also provided separately (without the output) for reference.


2. Relationship between folders, if important: The folder “Grades” contains the grades assigned to the images stored in the folder “Image Dataset”.

3. Additional related data collected that was not included in the current data package: No additional related data.

4. Are there multiple versions of the dataset? no


---------------------------
METHODOLOGICAL INFORMATION
---------------------------

1. Description of methods used for collection/generation of data: The data were collected by scanning the CAE Blue Phantom with the GE M4S Matrix Probe and the GE Healthcare Vivid-Q ultrasound machine.

2. Methods for processing the data: The phantom scanning process produces ultrasound images that are saved in a computer linked to the ultrasound machine. The images are categorized into the predefined cardiac views and are then stored in a repository for evaluation by skilled cardiovascular imaging experts.

3. Instrument- or software-specific information needed to interpret the data: CAE Blue Phantom with the GE M4S Matrix Probe and the GE Healthcare Vivid-Q ultrasound machine

4. Environmental/experimental conditions: to achieve optimal and clear views of the targeted structures. These parameters, including depth, gain, dynamic range, frequency and power, are fundamental for guiding a cardiac US examination. The range of values for these parameters is documented in the paper publishing this dataset.

5. Describe any quality-assurance procedures performed on the data:  The CACTUS dataset is evaluated by imaging experts, who have created a grading system centered on two key factors: completeness and clarity. Completeness evaluates the visibility of the targeted cardiac structures in the image, assigning higher grades to images displaying the entire structure compared to those revealing only partial views. Clarity examines the luminosity of images and their purity from speckles and noise. The grading scale spans from 0 to 10, where 0 signifies an image that fails to capture a specific cardiac window, rendering it uninterpretable, whereas 10 represents a fully visible cardiac view with distinctly identifiable structures and optimal gain/power settings for clear delineation. 

6. People involved with sample collection, processing, analysis and/or submission: Amanda Spilkin and Antonela Mariel Zanuttin (medical imaging experts).

-----------------------------------------------------------------
DATA-SPECIFIC INFORMATION FOR: CACTUS
-----------------------------------------------------------------

1. dataset consists of image, video, and CSV files, not tabular variables.

2. Total number of images: 37,736  
      – Apical Four Chamber (A4C): 7,422  
      – Subcostal Four Chamber (SC): 6,345  
      – Parasternal Long Axis (PL): 6,102  
      – Parasternal Short Axis – Aortic Valve (PSAV): 5,832  
      – Parasternal Short Axis – Mitral Valve (PSMV): 6,014  
      – Random Images: 6,021 

3. Missing data codes: None (all images and corresponding grade files are included)

https://www.sciencedirect.com/science/article/pii/S0010482525003543

https://users.encs.concordia.ca/~kadem/cactus/},
terms= {},
license= {https://creativecommons.org/licenses/by/4.0/},
superseded= {},
url= {https://www.frdr-dfdr.ca/repo/dataset/86beb91e-c0ba-496b-ad38-732b67e3d5f6}
}


Send Feedback