PSMA-PET-CT-Lesions
Creators
- Jeblick, Katharina (Contact person)¹
- Schachtner, Balthasar¹
- Mittermeier, Andreas¹
- Jakob, Dexl¹
- Wesp, Philipp¹
- Küstner, Thomas (Contact person)²
- Gatidis, Sergios²
- Früh, Marcel²
- Fabritius, Matthias¹
- Unterrainer, Lena¹
- Sheikh, Gabriel¹
- Delker, Astrid¹
- Guido, Böhning¹
- Brendel, Matthias¹
- Ricke, Jens¹
- Werner, Rudolf¹
- Gu, Sijing¹
- Ingrisch, Michael (Contact person)¹
- Geyer, Thomas¹
- Cyran, Clemens (Contact person)¹
Description
Introduction
We provide a large, annotated dataset of 597 whole-body PSMA-PET/CT studies from 378 male patients with suspected or diagnosed prostate carcinoma to support developing and benchmarking machine learning (ML) models for automated quantitative PET/CT analysis. Alongside the FDG-PET/CT dataset, this dataset addresses the scarcity of publicly available, high-quality annotated PET/CT data. The FDG and PSMA-PET/CT datasets were jointly provided as training data for developing ML models in the autoPET III and IV Grand Challenges for automated lesion segmentation in whole-body PET/CT.
Data Acquisition
Scans were conducted at LMU University Hospital, LMU Munich, between 2014 and 2022 using three clinical PET/CT scanners: Siemens Biograph mCT Flow 20, Siemens Biograph 64-4R TruePoint, and GE Discovery 690. Of the 597 studies, 537 contain at least one PSMA-avid tumor lesion and 60 contain none. The imaging protocol consisted of a diagnostic CT scan, usually from the skull base to the mid-thigh, with the following scan parameters: reference tube current-exposure time product of 143 mAs (mean); tube voltage of 120 kV or 100 kV in most cases (range: 80-140 kV); slice thickness of 2.5-5.0 mm (mean: 2.82 mm); and x-y resolution of mainly 0.98 mm. Intravenous contrast enhancement was used in most studies, except for patients with contraindications (26 studies). The whole-body PSMA-PET scan was acquired on average 74 minutes after intravenous injection of 246 MBq 18F-PSMA (mean, 369 studies) or 214 MBq 68Ga-PSMA (mean, 228 studies). The PET data was reconstructed with attenuation correction derived from the corresponding CT data using standard, vendor-provided image reconstruction algorithms, with a slice thickness of 3.0-5.0 mm (mean: 3.49 mm) and an x-y resolution of 2.73-4.07 mm (mean: 3.56 mm).
Data Annotation
All PSMA-avid tumor lesions, including the primary tumor and/or all metastases, were manually segmented on the PET images by a single reader with 3 years of experience in hybrid imaging using dedicated software (mint Medical, Heidelberg, Germany) and validated by board-certified medical imaging experts with 4 years and >10 years of experience in hybrid imaging. Tumor lesions with significantly increased PSMA expression were segmented in 3D space by drawing circular VOIs, in which voxels with uptake values above a user-defined threshold were pre-segmented automatically and then manually corrected slice by slice, resulting in 3D binary segmentation masks.
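The threshold-based pre-segmentation step described above can be illustrated with a minimal NumPy sketch. This is not the algorithm of the annotation software, whose internals are not described here; it only shows the general idea of keeping the voxels inside a spherical VOI whose uptake exceeds a user-chosen threshold. The function name `presegment_in_sphere`, the voxel-based radius, and the assumption of isotropic voxels are all hypothetical simplifications.

```python
import numpy as np


def presegment_in_sphere(suv, center, radius_vox, threshold):
    """Binary mask of voxels inside a spherical VOI with SUV above threshold.

    Assumes isotropic voxels, so the radius is given in voxel units.
    """
    zz, yy, xx = np.ogrid[: suv.shape[0], : suv.shape[1], : suv.shape[2]]
    cz, cy, cx = center
    # Squared distance of every voxel to the VOI center
    dist2 = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2
    inside = dist2 <= radius_vox ** 2
    return (inside & (suv > threshold)).astype(np.uint8)


# Toy volume: two hot voxels inside the VOI, one hot voxel far outside it
suv = np.zeros((9, 9, 9), dtype=np.float32)
suv[4, 4, 4] = 12.0
suv[4, 4, 5] = 6.0
suv[0, 0, 0] = 20.0

mask = presegment_in_sphere(suv, center=(4, 4, 4), radius_vox=2, threshold=4.0)
print(int(mask.sum()))  # 2: the voxel at (0, 0, 0) lies outside the VOI
```

In the actual annotation workflow, a mask obtained this way was only a starting point and was then corrected manually slice by slice.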
Data Processing
For DICOM-to-NIfTI conversion, CT volumes were resampled to match the size and resolution of the corresponding PET volume, and PET voxel values were normalized to standardized uptake values (SUV) based on body mass. In addition, patient metadata was extracted from the imaging DICOM tags and saved in a CSV file: patient age at imaging in years, PET/CT manufacturer and model name, PET radionuclide, and use of CT contrast agent. Information on radionuclides and the use of CT contrast agents was visually reviewed and validated by a radiologist with 10 years of experience in hybrid imaging. Each study is uniquely identified by an anonymized case identifier number and the study date. The study date was shifted by a global patient-level offset, so that the differences between the study dates of a patient are preserved.
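For readers unfamiliar with body-mass-based SUV, the commonly used definition divides the measured activity concentration by the decay-corrected injected activity per gram of body mass. The sketch below follows that textbook definition; it is not the conversion code used to build this dataset. The function name, the parameter names, and the approximate half-lives (about 109.8 min for 18F and 67.7 min for 68Ga) are assumptions, as is the convention that the injected activity is decayed to the time of the scan.

```python
# Approximate physical half-lives in minutes (assumed values for illustration)
HALF_LIFE_MIN = {"18F": 109.8, "68Ga": 67.7}


def suv_bw(activity_bq_per_ml, body_mass_kg, injected_bq,
           minutes_since_injection, nuclide):
    """Body-mass SUV: tissue concentration / (remaining dose / body mass in g)."""
    # Activity remaining at scan time after physical decay
    remaining_bq = injected_bq * 0.5 ** (
        minutes_since_injection / HALF_LIFE_MIN[nuclide]
    )
    # 1 g of tissue is treated as 1 mL, so the result is dimensionless
    return activity_bq_per_ml * (body_mass_kg * 1000.0) / remaining_bq


# Hypothetical voxel: 10 kBq/mL in an 80 kg patient, 246 MBq 18F, 74 min p.i.
print(round(suv_bw(10_000.0, 80.0, 246e6, 74.0, "18F"), 2))
```

Because the PET volumes in this dataset are already stored in SUV, such a conversion is only needed when working from the original DICOM data.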
Data structure
The NIfTI dataset is organized in nnU-Net structure.
|--- imagesTr
     |--- <tracer>_<patient_1>_<study_1>_0000.nii.gz (CT image resampled to PET)
     |--- <tracer>_<patient_1>_<study_1>_0001.nii.gz (PET image in SUV)
     |--- ...
|--- labelsTr
     |--- <tracer>_<patient_1>_<study_1>.nii.gz (SEG mask)
     |--- ...
|--- dataset.json (nnUNet dataset description)
|--- dataset_fingerprint.json (nnUNet dataset fingerprint)
|--- splits_final.json (reference 5-fold split)
|--- psma_metadata.csv (metadata csv for psma)
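Given this layout, the CT, PET, and label files of each study can be matched purely by filename. The following sketch uses only the Python standard library and relies on the `_0000` (CT) and `_0001` (PET) suffixes shown above; the helper name `list_cases` and the returned dictionary keys are hypothetical.

```python
from pathlib import Path


def list_cases(root):
    """Pair each label mask with its CT (_0000) and PET (_0001) volume."""
    root = Path(root)
    suffix = ".nii.gz"
    cases = []
    for label in sorted((root / "labelsTr").glob(f"*{suffix}")):
        # The case identifier is the label filename without the extension
        case_id = label.name[: -len(suffix)]
        cases.append(
            {
                "case_id": case_id,
                "ct": root / "imagesTr" / f"{case_id}_0000{suffix}",
                "pet": root / "imagesTr" / f"{case_id}_0001{suffix}",
                "label": label,
            }
        )
    return cases


def missing_files(cases):
    """Return the expected image paths that do not exist on disk."""
    return [p for c in cases for p in (c["ct"], c["pet"]) if not p.exists()]
```

Calling `list_cases` on the unpacked dataset folder and then `missing_files` on the result is a quick way to check that the archive was extracted completely before training.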
Usage
The dataset can be used for training deep learning models for automated lesion segmentation in whole-body PET/CT: www.autopet.org
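When training with the reference 5-fold split, the case identifiers of one fold can be read from `splits_final.json`. The sketch below assumes the file follows the usual nnU-Net convention of a JSON list with one entry per fold, each holding `"train"` and `"val"` lists of case identifiers; this layout is an assumption and should be checked against the actual file. The helper name `load_fold` is hypothetical.

```python
import json


def load_fold(splits_path, fold):
    """Return (train_ids, val_ids) for one fold of an nnU-Net style split file."""
    with open(splits_path, encoding="utf-8") as f:
        splits = json.load(f)
    entry = splits[fold]
    train_ids, val_ids = list(entry["train"]), list(entry["val"])
    # A case must never appear in both the training and validation set
    overlap = set(train_ids) & set(val_ids)
    if overlap:
        raise ValueError(f"fold {fold} has overlapping cases: {sorted(overlap)}")
    return train_ids, val_ids
```

Since several studies can belong to the same patient, it is worth verifying that a custom split, if one is used instead of the reference split, keeps all studies of a patient in the same fold.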
Files
| Name | Size | MD5 |
|---|---|---|
| PSMA-PET-CT-Lesions_v1.zip | 17.8 GB | 6c9432870ef7e55b8b614f2374f6c92c |
Additional details
Data quality
- Accuracy
  - PET/CT acquisition follows standard imaging protocols, reducing measurement inaccuracies.
  - Expert radiologists manually annotated the data, minimizing errors in segmentation and labeling.
  - Metadata was verified and corrected where necessary.
- Completeness
  - All required patient metadata, imaging parameters, and annotation details are available.
  - Each study includes PET, CT, and segmentation masks.
  - Missing data cases, such as patients without contrast-enhanced CT due to contraindications, are documented.
- Conformity
  - The dataset adheres to standard DICOM imaging formats and follows institutional and international imaging guidelines.
  - Annotation follows predefined protocols, ensuring standardization.
  - Ethical and data privacy regulations were met through approval by institutional review boards.
- Consistency
  - Imaging acquisition for each cohort was performed following defined protocols, ensuring consistency within groups.
  - Variations in PET/CT scanner types and acquisition parameters are documented to account for systematic differences.
  - Metadata consistency was verified and corrected where necessary.
- Credibility
  - Data was collected from reputable medical institutions (LMU Munich and UKT).
  - Annotations were performed by experienced radiologists with review by additional experts.
  - Published guidelines and established methodologies were followed.
  - The DICOM files are hosted by The Cancer Imaging Archive (TCIA).
  - Datasets are in use by the autoPET machine learning challenge series.
- Processability
  - Data is provided in NIfTI format, ensuring compatibility with medical imaging software.
  - Metadata is structured in CSV format, enabling automated data handling.
- Relevance
  - The dataset focuses on oncological PET/CT imaging, directly supporting research in cancer diagnostics and treatment assessment.
  - Includes both positive and negative controls for model training.
  - Facilitates AI-driven image analysis and automated segmentation for clinical applications.
- Timeliness
  - Data was collected between 2014 and 2022, ensuring it remains relevant for current medical research and AI model training.
  - Imaging protocols reflect current clinical practices.
- Understandability
  - Data documentation includes descriptions of imaging protocols, scanner models, and patient demographics.
  - The dataset is structured logically, with clear labels for PET, CT, and segmentation files.