PSMA-FDG-PET-CT-Lesions
- Creators
  - Gatidis, Sergios (1)
  - Küstner, Thomas (1)
  - Ingrisch, Michael (2)
  - Hepp, Tobias (1)
  - Früh, Marcel (1)
  - Nikolaou, Konstantin (1)
  - La Fougère, Christian (1)
  - Pfannenberg, Christina (1)
  - Fabritius, Matthias (2)
  - Jeblick, Katharina (2)
  - Schachtner, Balthasar (2)
  - Dexl, Jakob (2)
  - Wesp, Philipp (2)
  - Mittermeier, Andreas (2)
  - Unterrainer, Lena (2)
  - Sheikh, Gabriel (2)
  - Böning, Guido (2)
  - Brendel, Matthias (2)
  - Ricke, Jens (2)
  - Gu, Sijing (2)
  - Geyer, Thomas (2)
  - Cyran, Clemens (2)
Description
Introduction
This is a publicly available dataset of annotated positron emission tomography/computed tomography (PET/CT) studies. It includes 1014 whole-body fluorodeoxyglucose (FDG)-PET/CT studies (900 patients) and 597 prostate-specific membrane antigen (PSMA)-PET/CT studies (378 patients) acquired between 2014 and 2022. The FDG cohort comprises 501 studies of patients with histologically proven malignant melanoma, lymphoma, or lung cancer, along with 513 negative control studies. The PSMA cohort includes pre- and/or post-therapeutic PET/CT images of male individuals with prostate carcinoma, encompassing studies with (537) and without (60) PSMA-avid tumor lesions. The two training cohorts differ in age distribution: the FDG cohort from University Hospital Tübingen (UKT) spans 570 studies of male patients (mean age: 60; std: 16) and 444 studies of female patients (mean age: 58; std: 16), whereas the PSMA cohort from LMU Munich tends to be older, with 378 male patients (mean age: 71; std: 8). Imaging conditions also differ between the two cohorts, particularly in the types and number of PET/CT scanners used for acquisition. The PSMA dataset was acquired using three different scanner types (Siemens Biograph 64-4R TruePoint, Siemens Biograph mCT Flow 20, and GE Discovery 690), whereas the FDG dataset was acquired using a single scanner (Siemens Biograph mCT).
Structure and usage
The data is organized in the nnUNet structure:
|--- imagesTr
     |--- tracer_patient1_study1_0000.nii.gz (CT image resampled to PET)
     |--- tracer_patient1_study1_0001.nii.gz (PET image in SUV)
     |--- ...
|--- labelsTr
     |--- tracer_patient1_study1.nii.gz (manual annotations of tumor lesions)
     |--- ...
|--- dataset.json (nnUNet dataset description)
|--- dataset_fingerprint.json (nnUNet dataset fingerprint)
|--- splits_final.json (reference 5-fold split)
|--- psma_metadata.csv (metadata csv for psma)
|--- fdg_metadata.csv (original metadata csv for fdg)
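The naming convention in the tree above can be used to pair each study's CT, PET, and label files. The following is a minimal sketch that relies only on the suffixes shown in the tree (`_0000` for CT, `_0001` for PET); the function names `collect_cases` and `incomplete_cases` are illustrative and not part of the dataset or of nnUNet.

```python
from collections import defaultdict
from pathlib import Path

# Channel suffixes as described in the file tree: _0000 = CT, _0001 = PET.
CHANNELS = {"0000": "ct", "0001": "pet"}
SUFFIX = ".nii.gz"


def collect_cases(root):
    """Map each case ID (tracer_patient_study) to its CT, PET and label paths."""
    root = Path(root)
    cases = defaultdict(dict)
    for image in sorted((root / "imagesTr").glob("*" + SUFFIX)):
        stem = image.name[: -len(SUFFIX)]
        # Split off the trailing channel index, e.g. "..._study1" + "0000".
        case_id, _, channel = stem.rpartition("_")
        if channel in CHANNELS:
            cases[case_id][CHANNELS[channel]] = image
    for label in sorted((root / "labelsTr").glob("*" + SUFFIX)):
        cases[label.name[: -len(SUFFIX)]]["label"] = label
    return dict(cases)


def incomplete_cases(cases):
    """Return the IDs of cases that lack a CT, PET or label file."""
    required = {"ct", "pet", "label"}
    return sorted(cid for cid, files in cases.items() if required - set(files))
```

Running `incomplete_cases(collect_cases(dataset_root))` after download is a quick way to confirm that every study has all three files before training; the returned paths can then be passed to any NIfTI reader.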
We demonstrate how this dataset can be used for deep learning-based automated analysis of PET/CT data and provide the trained deep learning model: www.autopet.org
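To reproduce results against the reference split, the case IDs of one fold can be read from splits_final.json. The sketch below assumes the usual nnUNet layout for this file, a JSON list with one entry per fold, each holding `"train"` and `"val"` lists of case IDs; this layout is an assumption and should be checked against the downloaded file. The helper name `load_fold` is illustrative.

```python
import json
from pathlib import Path


def load_fold(splits_path, fold):
    """Return (train_ids, val_ids) for one fold of the reference split.

    Assumes the file is a JSON list of {"train": [...], "val": [...]} entries.
    """
    splits = json.loads(Path(splits_path).read_text())
    if not 0 <= fold < len(splits):
        raise IndexError(f"fold {fold} is outside 0..{len(splits) - 1}")
    entry = splits[fold]
    train_ids, val_ids = entry["train"], entry["val"]
    # A case listed on both sides would leak validation data into training.
    overlap = set(train_ids) & set(val_ids)
    if overlap:
        raise ValueError(f"{len(overlap)} cases appear in both train and val")
    return train_ids, val_ids
```

Because a patient can contribute several studies, it is worth also checking that the patient part of each case ID does not occur on both sides of a fold.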
PET/CT acquisition protocol
FDG dataset: Patients fasted for at least 6 h prior to the injection of approximately 350 MBq 18F-FDG. Whole-body PET/CT acquisition on a Biograph mCT PET/CT scanner (Siemens Healthcare GmbH, Erlangen, Germany) was initiated approximately 60 min after intravenous tracer administration. Diagnostic CT scans of the neck, thorax, abdomen, and pelvis (200 reference mAs; 120 kV) were acquired 90 s after intravenous injection of a contrast agent (90-120 ml Ultravist 370, Bayer AG) or without contrast agent (in case of existing contraindications). PET images were reconstructed iteratively (three iterations, 21 subsets) with Gaussian post-reconstruction smoothing (2 mm full width at half-maximum). Slice thickness on contrast-enhanced CT was 2 or 3 mm.
PSMA dataset: Examinations were acquired on different PET/CT scanners (Siemens Biograph 64-4R TruePoint, Siemens Biograph mCT Flow 20, and GE Discovery 690). The imaging protocol mainly consisted of a diagnostic CT scan from the skull base to the mid-thigh with the following scan parameters: reference tube current-exposure time product of 143 mAs (mean); tube voltage of 100 kV or 120 kV for most cases; slice thickness of 3 mm for Biograph 64 and Biograph mCT, and 2.5 mm for GE Discovery 690 (except for 3 cases with 5 mm). Intravenous contrast enhancement was used in most studies (571), except for patients with contraindications (26).
The whole-body PSMA-PET scan was acquired on average 74 minutes after intravenous injection of 246 MBq 18F-PSMA (mean, 369 studies) or 214 MBq 68Ga-PSMA (mean, 228 studies). The PET data were reconstructed with attenuation correction derived from the corresponding CT data. For the GE Discovery 690, reconstruction used a VPFX algorithm with a voxel size of 2.73 mm × 2.73 mm × 3.27 mm; for the Siemens Biograph mCT Flow 20, a PSF+TOF algorithm (2 iterations, 21 subsets) with a voxel size of 4.07 mm × 4.07 mm × 3.00 mm; and for the Siemens Biograph 64-4R TruePoint, a PSF algorithm (3 iterations, 21 subsets) with a voxel size of 4.07 mm × 4.07 mm × 5.00 mm.
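Because the reconstructed voxel size differs between scanners, lesion volumes derived from voxel counts depend on the spacing of the individual study. The sketch below illustrates the arithmetic with the voxel sizes quoted above; the helper names are illustrative, and in practice the spacing should be read from each NIfTI header, since the provided files may not retain the reconstruction grid.

```python
# Reconstructed PET voxel sizes in mm, copied from the protocol description.
PET_VOXEL_MM = {
    "GE Discovery 690": (2.73, 2.73, 3.27),
    "Siemens Biograph mCT Flow 20": (4.07, 4.07, 3.00),
    "Siemens Biograph 64-4R TruePoint": (4.07, 4.07, 5.00),
}


def voxel_volume_ml(spacing_mm):
    """Volume of a single voxel in millilitres (1 ml = 1000 mm^3)."""
    x, y, z = spacing_mm
    return x * y * z / 1000.0


def lesion_volume_ml(n_voxels, spacing_mm):
    """Volume in millilitres of a lesion mask covering n_voxels voxels."""
    return n_voxels * voxel_volume_ml(spacing_mm)


if __name__ == "__main__":
    for scanner, spacing in PET_VOXEL_MM.items():
        print(f"{scanner}: {voxel_volume_ml(spacing):.4f} ml per voxel")
```

A single voxel on the Biograph 64-4R TruePoint grid is more than three times the volume of one on the GE Discovery 690 grid, so raw voxel counts are not comparable across scanners.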
Annotation
FDG PET/CT training and test data from UKT were annotated by a radiologist with 10 years of experience in hybrid imaging and experience in machine learning research. FDG PET/CT test data from LMU were annotated by a radiologist with 8 years of experience in hybrid imaging. PSMA PET/CT training and test data from LMU, as well as PSMA PET/CT test data from UKT, were annotated by a single reader and reviewed by a radiologist with 5 years of experience in hybrid imaging.
The following annotation protocol was defined:
Step 1: Identification of tracer-avid tumor lesions by visual assessment of PET and CT information together with the clinical examination reports.
Step 2: Manual free-hand segmentation of identified lesions in axial slices.
Additional details
- Accuracy
  - The dataset consists of histologically proven cases, ensuring the correctness of cancer diagnosis.
  - PET/CT acquisition follows standard imaging protocols, reducing measurement inaccuracies.
  - Expert radiologists manually annotated the data, minimizing errors in segmentation and labeling.
- Completeness
  - All required patient metadata, imaging parameters, and annotation details are available.
  - Each study includes PET, CT, and segmentation masks.
  - Missing data cases, such as patients without contrast-enhanced CT due to contraindications, are documented.
- Conformity
  - The dataset adheres to standard DICOM imaging formats and follows institutional and international imaging guidelines.
  - Annotation follows predefined protocols ensuring standardization.
  - Ethical and data privacy regulations were met through approval by institutional review boards.
- Consistency
  - Imaging acquisition for each cohort was performed following defined protocols, ensuring consistency within groups.
  - Variations in PET/CT scanner types and acquisition parameters are documented to account for systematic differences.
  - Metadata consistency was verified and corrected where necessary.
- Credibility
  - Data was collected from reputable medical institutions (LMU Munich and UKT).
  - Annotations were performed by experienced radiologists with review by additional experts.
  - Published guidelines and established methodologies were followed.
  - The dataset has been published in a peer-reviewed journal (Nature Scientific Data) and the DICOM files are hosted by The Cancer Imaging Archive (TCIA).
  - Datasets are in use by the autoPET machine learning challenge series.
- Processability
  - Data is provided in NIfTI format, ensuring compatibility with medical imaging software.
  - Conversion scripts are available to transform data into commonly used formats (DICOM, mha, hdf5) for easier analysis.
  - Metadata is structured in CSV format, enabling automated data handling.
- Relevance
  - The dataset focuses on oncological PET/CT imaging, directly supporting research in cancer diagnostics and treatment assessment.
  - Includes both positive and negative controls for model training.
  - Facilitates AI-driven image analysis and automated segmentation for clinical applications.
- Timeliness
  - Data was collected between 2014 and 2022, ensuring it remains relevant for current medical research and AI model training.
  - Imaging protocols reflect current clinical practices.
- Understandability
  - Data documentation includes descriptions of imaging protocols, scanner models, and patient demographics.
  - Annotations follow a step-by-step protocol, ensuring clarity.
  - The dataset is structured logically, with clear labels for PET, CT, and segmentation files.
- Data file format: zip
- Data source type: medical/clinical registers/records/accounts
- General data format: still image
- Code repository: https://github.com/lab-midas/TCIA_processing/
- Copyright holder: University Hospital Tübingen
- Copyright year: 2022