Longitudinal-CT

Creators: Küstner, Thomas¹; Peisen, Felix¹; Gatidis, Sergios¹; Wagner, Andreas¹; Megne, Ornela¹; Othman, Ahmed²; Sanner, Antoine²; Loßau, Tanja³; Moltz, Jan Hendrik³; Kohlbrandt, Temke³; Hering, Alessa⁴

1. Universitätsklinikum Tübingen
2. University Medical Center of the Johannes Gutenberg University Mainz
3. Fraunhofer Institute for Digital Medicine
4. Radboud University Nijmegen Medical Centre

Contact persons: Küstner, Thomas¹; Peisen, Felix¹; Gatidis, Sergios¹

1. Universitätsklinikum Tübingen

Description

Introduction

Longitudinal-CT is a publicly available dataset of annotated longitudinal Computed Tomography (CT) studies. The dataset comprises whole-body CT scans from 300 melanoma patients undergoing longitudinal imaging for therapy response assessment. Each patient has two imaging timepoints: a baseline staging scan and a follow-up scan acquired after therapy. The dataset includes training data from a single site (UKT).

All CT examinations were acquired on state-of-the-art CT scanners using standardized protocols following international guidelines. The imaging protocol includes whole-body CT imaging, typically extending from the skull base to mid-thigh level, with possible extensions to include the entire body when clinically relevant. All data are defaced. The dataset provides anonymized NIfTI files of all CT scans along with manually annotated segmentation masks of malignant tumors, including primary tumors and metastases. The lesion center of gravity is provided for each individual lesion in the volume (baseline and follow-up scans). Between baseline and follow-up, tumors can change shape (progression or regression), split or merge, disappear (complete response) or newly appear (metastasis). Additionally, scripts for image processing and conversion to different file formats (DICOM, mha, hdf5) are available.
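To make the notion of a per-lesion center of gravity concrete, the following minimal sketch computes the mean voxel coordinate of every label in an integer mask. It assumes that each lesion carries its own positive integer label and that 0 is background; whether the provided coordinates were derived in exactly this way is not stated here. The function name `centers_of_gravity` and the nested-list input are illustrative choices so that the example runs with the standard library only; in practice the mask would be loaded from the NIfTI file with a suitable reader.

```python
from collections import defaultdict


def centers_of_gravity(mask):
    """Return {label: (x, y, z)} with the mean voxel index of each label.

    `mask` is a nested list indexed as mask[x][y][z] holding integer labels.
    Assumption: 0 is background and every lesion has a distinct label.
    """
    # label -> [sum_x, sum_y, sum_z, voxel_count]
    sums = defaultdict(lambda: [0, 0, 0, 0])
    for x, plane in enumerate(mask):
        for y, line in enumerate(plane):
            for z, label in enumerate(line):
                if label:  # skip background voxels
                    acc = sums[label]
                    acc[0] += x
                    acc[1] += y
                    acc[2] += z
                    acc[3] += 1
    return {
        label: (sx / n, sy / n, sz / n)
        for label, (sx, sy, sz, n) in sums.items()
    }


# Tiny made-up 2x2x2 volume: label 1 covers two voxels, label 2 covers one.
toy_mask = [
    [[1, 0], [0, 0]],
    [[1, 0], [0, 2]],
]
cogs = centers_of_gravity(toy_mask)
```

The result is expressed in voxel indices, matching the "3D pixel coordinates" wording used for the CSV columns; converting to physical coordinates would additionally require the image affine.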

The dataset is designed to facilitate the development and evaluation of AI-based lesion detection and segmentation algorithms in longitudinal CT imaging for oncology applications. The inclusion of multiple imaging timepoints allows for the assessment of lesion progression and therapy response, providing a clinically realistic dataset for algorithm training and validation.

Structure and usage

Filenames start with a unique patient ID (10 characters, e.g. c6f057b865). The data is organized in the following structure:

|--- inputsTr
|    |--- c6f057b865.csv (lesion information for patient)
|    |--- c6f057b865_BL_00.json (lesion center of gravity per lesion in baseline CT; Grand-Challenge JSON format)
|    |--- c6f057b865_BL_img_BL_img_00.nii.gz (CT baseline image)
|    |--- c6f057b865_BL_mask_BL_img_00.nii.gz (CT baseline lesion mask, integer mask)
|    |--- c6f057b865_FU_00.json (lesion center of gravity per lesion in first follow-up CT; Grand-Challenge JSON format)
|    |--- c6f057b865_FU_01.json (lesion center of gravity per lesion in second follow-up CT; Grand-Challenge JSON format; if available)
|    |--- c6f057b865_FU_img_FU_img_00.nii.gz (CT follow-up image, first body region)
|    |--- c6f057b865_FU_img_FU_img_01.nii.gz (CT follow-up image, second body region; if available)
|    |--- ...
|--- targetsTr
|    |--- c6f057b865_FU_mask_FU_img_00.nii.gz (CT follow-up lesion mask of first body region, integer mask)
|    |--- c6f057b865_FU_mask_FU_img_01.nii.gz (CT follow-up lesion mask of second body region, integer mask; if available)
|    |--- ...
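The naming scheme above can be parsed programmatically to collect all files that belong to one patient. The sketch below is a minimal example under the assumption that every file follows the patterns shown in the listing; the regular expression is inferred from the example names only, and the helper names `parse_name` and `group_by_patient` as well as the role labels are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the example filenames; treat it as an assumption.
NAME_RE = re.compile(
    r"^(?P<patient>[0-9A-Za-z]{10})"          # 10-character patient ID
    r"(?:_(?P<timepoint>BL|FU)"               # baseline or follow-up
    r"(?:_(?P<kind>img|mask)_(?:BL|FU)_img)?"  # present for NIfTI files only
    r"_(?P<index>\d{2}))?"                    # two-digit image index
    r"\.(?P<ext>csv|json|nii\.gz)$"
)


def parse_name(filename):
    """Split a dataset filename into its parts, or return None if it does not match."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    ext = match.group("ext")
    if ext == "csv":
        role = "lesion_table"        # per-patient lesion information
    elif ext == "json":
        role = "cog"                 # lesion centers of gravity
    else:
        role = match.group("kind")   # "img" or "mask"
    index = match.group("index")
    return {
        "patient": match.group("patient"),
        "timepoint": match.group("timepoint"),
        "role": role,
        "index": int(index) if index is not None else None,
    }


def group_by_patient(filenames):
    """Map each patient ID to the parsed entries of its files."""
    groups = defaultdict(list)
    for name in filenames:
        info = parse_name(name)
        if info is not None:
            groups[info["patient"]].append(info)
    return dict(groups)
```

Passing the names from both inputsTr and targetsTr to `group_by_patient` yields, per patient, the baseline image and mask, the follow-up images and the follow-up target masks.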

CSV file

The CSV file contains the following columns:

lesion_id: Continuous ID count in the respective patient
cog_bl: Lesion center of gravity in baseline image as 3D pixel coordinates
img_id_bl: baseline image ID (either 0 or 1)
cog_propagated: Lesion center of gravity (as 3D pixel coordinates) propagated from baseline to follow-up scan using a conventional registration (not available for all lesions)
cog_fu: Lesion center of gravity in follow-up image as 3D pixel coordinates
img_id_fu: follow-up image ID (either 0 or 1)
lesion_type: Anatomical lesion location
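The columns listed above can be read with the Python standard library, as in the following minimal sketch. The exact text encoding of the coordinate cells is not specified here, so the example assumes that each cell contains three plain decimal numbers (for instance "[10.0, 20.5, 30.0]") and treats anything else, including empty cells, as missing. The helper names `parse_coords` and `read_lesions` are hypothetical, and the sample row is made up purely for illustration.

```python
import csv
import io
import re

COORD_COLUMNS = ("cog_bl", "cog_propagated", "cog_fu")


def parse_coords(text):
    """Return (x, y, z) as floats, or None if the cell does not hold three numbers."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text or "")
    if len(numbers) != 3:
        return None
    return tuple(float(n) for n in numbers)


def read_lesions(handle):
    """Read the per-patient CSV and convert coordinate columns to tuples."""
    rows = []
    for row in csv.DictReader(handle):
        for col in COORD_COLUMNS:
            row[col] = parse_coords(row.get(col))
        rows.append(row)
    return rows


# Made-up sample row (not taken from the dataset); cog_propagated is left empty
# because it is not available for all lesions.
sample = io.StringIO(
    "lesion_id,cog_bl,img_id_bl,cog_propagated,cog_fu,img_id_fu,lesion_type\n"
    '1,"[10.0, 20.5, 30.0]",0,,"[11.0, 21.0, 29.5]",0,liver\n'
)
lesions = read_lesions(sample)
```

For a real patient file, `read_lesions(open(path, newline=""))` would be used instead of the in-memory sample. If the actual files encode coordinates differently (for example in scientific notation), `parse_coords` would need to be adapted.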

We demonstrate how this dataset can be used for deep learning-based automated analysis of CT data and provide the trained deep learning model: www.autopet.org

CT acquisition protocol

All CT scans were acquired using Siemens CT scanners, including Siemens Sensation 64, Siemens SOMATOM Definition AS, Siemens SOMATOM Definition Flash, Siemens SOMATOM Force, and the Siemens Biograph128 PET/CT scanner. Patients were scanned using an in-house whole-body staging protocol in the supine position with arms raised above the head. The scanning procedure was performed during the portal-venous phase after the administration of body-weight-adapted contrast medium via the cubital vein.

To ensure consistent image quality, attenuation-based tube current modulation (CARE Dose, reference mAs 240) and a fixed tube voltage of 120 kV were applied. The following scan parameters were used across different CT scanners:

SOMATOM Force: Collimation 128 × 0.6 mm, rotation time 0.5 s, pitch 0.6.

Sensation64: Collimation 64 × 0.6 mm, rotation time 0.5 s, pitch 0.6.

SOMATOM Definition Flash: Collimation 128 × 0.6 mm, rotation time 0.5 s, pitch 1.0.

SOMATOM Definition AS: Collimation 64 × 0.6 mm, rotation time 0.5 s, pitch 0.6.

Biograph128: Collimation 128 × 0.6 mm, rotation time 0.5 s, pitch 0.8.

Slice thickness and increment were set to 3 mm, and image reconstruction was performed using a medium smooth kernel.

Annotation

All data were manually annotated by two experienced radiologists. To this end, tumor lesions were manually segmented on the CT image data using dedicated software.
The following annotation protocol was defined:
Step 1: Identification of tumor lesions by visual assessment of CT information together with the clinical examination reports.
Step 2: Manual free-hand segmentation of identified lesions in axial slices.
Step 3: Baseline and follow-up segmentations are viewed side-by-side to mark the matching lesions.

Files

Longitudinal-CT.zip (53.3 GiB)
md5:64dd468d45d2c1826d67a4def19a52d6
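Given the size of the archive, it can be worth checking the download against the MD5 checksum listed above before unpacking. The sketch below reads the file in chunks so that it does not have to fit into memory; the function name `md5_of_file` and the local path in the usage comment are assumptions.

```python
import hashlib

# Checksum as listed for Longitudinal-CT.zip on this page.
EXPECTED_MD5 = "64dd468d45d2c1826d67a4def19a52d6"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assuming the archive was saved to the current directory):
#   if md5_of_file("Longitudinal-CT.zip") != EXPECTED_MD5:
#       raise SystemExit("Checksum mismatch: the download may be incomplete.")
```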