Publications - AImageLab

Multi-Structure Segmentation in CBCT Volumes: the ToothFairy2 Challenge

Authors: Bolelli, Federico; Lumetti, Luca; Van Nistelrooij, Niels; Vinayahalingam, Shankeeth; Di Bartolomeo, Mattia; Marchesini, Kevin; Pellacani, Arrigo; Candeloro, Ettore; Rosati, Gabriele; Xi, Tong; Isensee, Fabian; Kirchhoff, Yannick; Krämer, Lars; Rokuss, Maximilian; Ulrich, Constantin; Maier-Hein, Klaus; Jiang, Yuxian; Liu, Yusheng; Wang, Lisheng; Wang, Haoshen; Chen, Siyu; Cui, Zhiming; Shi, Pengcheng; Pan, Zhaohong; Liang, Xiaokun; Ma, Qi; Konukoglu, Ender; Wodzinski, Marek; Müller, Henning; Mai, Haipeng; Dang, Xiaobing; Bhandary, Shrajan; Grosu, Radu; Bergé, Stefaan; Anesi, Alexandre; Grana, Costantino

Published in: MEDICAL IMAGE ANALYSIS

Cone-beam computed tomography (CBCT) is widely used for dento-maxillofacial diagnostics and treatment planning, but comprehensive multi-structure segmentation remains time-consuming, limiting large-scale, reproducible research. In this article, we present ToothFairy2, a MICCAI 2024 challenge on multi-structure segmentation in maxillofacial CBCT. The accompanying dataset comprises 530 CBCT volumes (480 public training, 50 hidden test) with expert 3D annotations of 42 classes, including maxilla, mandible, crowns, bridges, implants, inferior alveolar canals, maxillary sinuses, pharynx, and teeth using the International Tooth Numbering System (FDI). Twenty-six international teams participated in ToothFairy2, and their methods were run and evaluated for voxel-wise multi-class segmentation using a standardized protocol. This report extends the evaluation of teeth to also investigate the current capabilities of tooth detection and FDI numbering. Furthermore, ranking stability was analyzed to assess the robustness of the final challenge outcome. Overall, challenge participants achieved consistently high performance for large, high-contrast structures such as jawbones, pharynx, and most teeth, while maxillary sinuses, dental restorations, and fine structures remain challenging due to class imbalance and metal artifacts. Analysis of tooth-related metrics further revealed that assigning correct FDI numbers was more challenging than delineating individual teeth. By releasing CBCT data, 3D annotations, baseline models, and evaluation code, ToothFairy2 establishes a long-term benchmark to drive the development of automated methods for robust, clinically meaningful multi-structure segmentation in maxillofacial CBCT.

2026 Journal article

Ontology-Grounded Structured Prediction for Dental CBCT Reporting

Authors: Lumetti, Luca; Di Bartolomeo, Mattia; Pellacani, Arrigo; Anesi, Alex; Grana, Costantino; Bolelli, Federico

We present a dataset and baseline for ontology-grounded structured prediction from dental Cone-Beam Computed Tomography (CBCT) volumes. Building on the public ToothFairy3 benchmark (532 volumes with expert-level segmentations), we contribute (i) a total of 893 free-text clinical reports for 529 publicly available CBCT volumes, (ii) their conversion into validated RDF/Turtle (Resource Description Framework) instances aligned with a clinician-designed OWL (Web Ontology Language) ontology spanning 13 finding types and multiple qualifier axes, and (iii) a strong baseline demonstrating the effectiveness of our setup and establishing a foundation for future work. We formulate CBCT reporting as a three-stage structured prediction problem (finding detection, anatomical slot allocation, and property prediction) and introduce a hierarchical evaluation suite of six clinically interpretable metrics that decouple detection, localization, and characterization. A baseline model using frozen multi-scale VoxTell features, a structure-indexed encoder, and ontology-driven prediction heads achieves strong results under 5-fold cross-validation, with stage-decoupled analysis identifying presence detection as the primary deployment bottleneck. Dataset, ontology, and code are publicly released: https://github.com/AImageLab-zip/CBCT-Report

2026 Conference proceedings paper

Scaling Artificial Intelligence for the Analysis of Oral and Dental Images (original title: Scalare l’Intelligenza Artificiale per l’Analisi di Immagini Orali e Dentali)

Authors: Lumetti, Luca

Cone-beam computed tomography (CBCT) is central to contemporary dental and maxillofacial practice, but progress in automated analysis has been limited by the scarcity of publicly available datasets. This thesis addresses that bottleneck by creating an open, extensible ecosystem that combines datasets, annotation tools, and algorithmic advances, and shows how these elements interact cyclically to accelerate research and its translation into clinical products. The Maxillo dataset was the first of its kind, providing 91 densely annotated volumes and 256 sparsely annotated scans for Inferior Alveolar Canal annotation. The ToothFairy series, to which this thesis contributed, built on these foundations: the first ToothFairy release increased the dense annotations to 156 volumes; ToothFairy2 expanded to 480 CBCT volumes, each with 42 semantic classes; and ToothFairy3 further extended the corpus to 532 volumes and 77 classes, while improving annotation quality and the diversity of the scanners used. Complementing the CBCT data, the Bits2Bites dataset, also part of this thesis, provided 200 pairs of registered intra-oral scans with multi-label occlusion annotations. All resources were released openly to enable reproducible benchmarking and follow-up work. To scale annotation without sacrificing clinical fidelity, I developed semi-automated annotation tools and a rigorous quality-control pipeline that combines predictive models with expert review.
Crucially, dataset creation, tooling, and model development progressed cyclically: additional data enabled better models; better models powered faster and more accurate annotation tools; and improved tools in turn produced larger, higher-quality datasets. This cycle constitutes the central intellectual contribution of this work. On this data foundation, I improved volumetric segmentation methods: transformer-based modules that explicitly encode spatial relationships between patches to preserve voxel-level detail while aggregating long-range context, and adaptations of the Mamba architecture for efficient, high-accuracy 3D segmentation. Finally, I introduced U-Net Transplant, a model-merging framework that proposes novel techniques for updating and specializing clinical models without full retraining, reducing redeployment costs, storage requirements, and data-exposure risks. Overall, this ecosystem delivered the largest open CBCT benchmark for maxillofacial segmentation to date, together with a coherent set of methods and tools that substantially improved the accuracy, efficiency, and lifecycle management of clinical AI, enabling faster, safer, and more reproducible dental AI research and deployment.

2026 Doctoral thesis

The paper has a GitHub, the GitHub has a README, the README has nothing: Reproducibility Signals for Review Support

Authors: Bolelli, Federico; Santoli, Davide; Marchesini, Kevin; Lumetti, Luca; Grana, Costantino

Reproducibility policies promise "checkable" medical-imaging science, yet many submissions still ship unverifiable artifacts. Our analysis of 3722 MICCAI papers shows code-linking rising from 51.8% (2021) to 72.5% (2025), but ~13% of linked repositories are inaccessible or empty. We present paper-snitch, a reviewer-facing decision-support tool that turns these signals into an evidence-grounded report. Paper-snitch parses PDFs, resolves and sanity-checks repositories, and applies policy-aware checklists aligned with MICCAI expectations, producing a review-time verifiability score decomposed into interpretable sub-scores plus criterion-linked excerpts and artifacts reviewers can inspect. It never executes untrusted code or attempts GPU-heavy reproduction, focusing instead on bounded, verifiable checks. We compare paper-snitch against human annotators on 100 randomly sampled MICCAI 2025 papers using shared evaluation criteria; the comparison indicates that automated, bounded checks can scale reproducibility screening while keeping final decisions with reviewers.
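As an illustration of what a bounded, non-executing check might look like, the sketch below extracts GitHub repository links from already-parsed paper text using only a regular expression. It is not paper-snitch's implementation: the function name `find_repo_links`, the regex, and the normalization rules are assumptions made for this example, and it deliberately performs no network access and runs no repository code.

```python
import re
from typing import List

# Hypothetical pattern: matches "https://github.com/<owner>/<repo>".
GITHUB_REPO = re.compile(
    r"https?://github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)"
)


def find_repo_links(text: str) -> List[str]:
    """Return unique "owner/repo" slugs in order of first appearance."""
    slugs: List[str] = []
    for owner, repo in GITHUB_REPO.findall(text):
        repo = repo.rstrip(".")       # drop a sentence-final period
        if repo.endswith(".git"):     # normalize clone-style URLs
            repo = repo[:-4]
        slug = f"{owner}/{repo}"
        if slug not in slugs:
            slugs.append(slug)
    return slugs


sample = (
    "The source code is publicly available at "
    "https://github.com/AImageLab-zip/TesticulUS/. "
    "Dataset and code: https://github.com/AImageLab-zip/CBCT-Report"
)
links = find_repo_links(sample)
print(links)
```

A real screening tool would go on to check whether each repository is reachable and non-empty; that step is omitted here because it depends on network access and API details not described in the abstract.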

2026 Conference proceedings paper

ToothFairy3: Scaling CBCT Maxillofacial Segmentation to 77 Classes with U-Mamba2

Authors: Lumetti, Luca; Tan, Zhi Qin; Borghi, Lorenzo; Van Nistelrooij, Niels; Rosati, Gabriele; Addison, Owen; Li, Yupeng; Vinayahalingam, Shankeeth; Grana, Costantino; Bolelli, Federico

Accurate delineation of maxillofacial anatomy in Cone-Beam Computed Tomography (CBCT) is essential for dental planning, but robust automated segmentation remains challenging due to limited public multi-structure datasets and the high computational burden of 3D deep learning models. We present and release ToothFairy3, a large-scale CBCT benchmark that extends ToothFairy2 with 102 additional fully annotated scans and an expanded taxonomy covering 77 classes, including 32 tooth-specific pulp cavities and small neurovascular structures. ToothFairy3 comprises 582 volumes (over 40,000 annotated objects), with 532 released with voxel-level labels and 50 held out for leakage-free, server-side evaluation. We also introduce U-Mamba2, an efficient U-Net-style architecture that inserts a Mamba2 state-space block at the bottleneck to capture global context with favorable computational scaling. Our proposed domain-informed training further improves the learning of maxillofacial anatomies. Across CNN, Transformer, and Mamba baselines, U-Mamba2 achieves competitive Dice/HD95 scores with lower latency and, compared with training on state-of-the-art public CBCT datasets, ToothFairy3-trained models generalize best to the hidden test set, particularly for maxillary structures.
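For readers unfamiliar with the Dice score mentioned above, the following is a minimal sketch of a per-class Dice computation over flattened voxel labels, using the standard definition Dice = 2|P ∩ T| / (|P| + |T|). It is not the benchmark's evaluation code: the function name `per_class_dice`, the toy label sequences, and the choice to return NaN for a class absent from both prediction and ground truth are assumptions for illustration. Real pipelines would typically operate on 3D arrays with an array library.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def per_class_dice(
    pred: Sequence[int], target: Sequence[int], classes: Iterable[int]
) -> Dict[int, float]:
    """Compute Dice = 2|P ∩ T| / (|P| + |T|) for each requested class."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")
    pred_count = Counter(pred)      # |P| per class
    target_count = Counter(target)  # |T| per class
    # |P ∩ T| per class: voxels where both label maps agree on that class.
    overlap = Counter(p for p, t in zip(pred, target) if p == t)
    scores: Dict[int, float] = {}
    for c in classes:
        denom = pred_count[c] + target_count[c]
        # Assumption: a class missing from both maps is reported as NaN.
        scores[c] = 2 * overlap[c] / denom if denom else float("nan")
    return scores


# Toy example: 8 voxels, label 0 is background.
pred = [0, 1, 1, 2, 2, 2, 0, 0]
target = [0, 1, 2, 2, 2, 2, 0, 1]
scores = per_class_dice(pred, target, classes=[1, 2])
print(scores)  # class 1 -> 0.5, class 2 -> 6/7
```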

2026 Conference proceedings paper

Accurate 3D Medical Image Segmentation with Mambas

Authors: Lumetti, Luca; Pipoli, Vittorio; Marchesini, Kevin; Ficarra, Elisa; Grana, Costantino; Bolelli, Federico

Published in: PROCEEDINGS INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING

CNNs and Transformer-based architectures have recently come to dominate the field of 3D medical segmentation. While CNNs are limited by their local receptive field, Transformers require significant memory and data, making them less suitable for analyzing large 3D medical volumes. Consequently, fully convolutional models such as U-Net still lead in 3D segmentation. Although efforts have been made to reduce the computational complexity of Transformers, such optimized models still struggle with content-based reasoning. This paper examines Mamba, a Recurrent Neural Network (RNN) based on State Space Models (SSMs), which achieves linear complexity and has outperformed Transformers in long-sequence tasks. Specifically, we assess Mamba’s performance in 3D medical segmentation on three widely used datasets and propose architectural enhancements that improve its segmentation effectiveness by mitigating the primary shortcomings of existing Mamba-based solutions.

2025 Conference proceedings paper

Bits2Bites: Intra-oral Scans Occlusal Classification

Authors: Borghi, Lorenzo; Lumetti, Luca; Cremonini, Francesca; Rizzo, Federico; Grana, Costantino; Lombardo, Luca; Bolelli, Federico

We introduce Bits2Bites, the first publicly available dataset for occlusal classification from intra-oral scans, comprising 200 paired upper and lower dental arches annotated across multiple clinically relevant dimensions (sagittal, vertical, transverse, and midline relationships). Leveraging this resource, we propose a multi-task learning benchmark that jointly predicts five occlusal traits from raw 3D point clouds using state-of-the-art point-based neural architectures. Our approach includes extensive ablation studies assessing the benefits of multi-task learning against single-task baselines, as well as the impact of automatically-predicted anatomical landmarks as input features. Results demonstrate the feasibility of directly inferring comprehensive occlusion information from unstructured 3D data, achieving promising performance across all tasks. Our entire dataset, code, and pretrained models are publicly released to foster further research in automated orthodontic diagnosis.

2025 Conference proceedings paper

Enhancing Testicular Ultrasound Image Classification Through Synthetic Data and Pretraining Strategies

Authors: Morelli, Nicola; Marchesini, Kevin; Lumetti, Luca; Santi, Daniele; Grana, Costantino; Bolelli, Federico

Testicular ultrasound imaging is vital for assessing male infertility, with testicular inhomogeneity serving as a key biomarker. However, subjective interpretation and the scarcity of publicly available datasets pose challenges to automated classification. In this study, we explore supervised and unsupervised pretraining strategies using a ResNet-based architecture, supplemented by diffusion-based generative models to synthesize realistic ultrasound images. Our results demonstrate that pretraining significantly enhances classification performance compared to training from scratch, and synthetic data can effectively substitute real images in the pretraining process, alleviating data-sharing constraints. These methods offer promising advancements toward robust, clinically valuable automated analysis of male infertility. The source code is publicly available at https://github.com/AImageLab-zip/TesticulUS/.

2025 Conference proceedings paper

Investigating the ABCDE Rule in Convolutional Neural Networks

Authors: Bolelli, Federico; Lumetti, Luca; Marchesini, Kevin; Candeloro, Ettore; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Convolutional Neural Networks (CNNs) have been broadly employed in dermoscopic image analysis, mainly due to the large amount of data gathered by the International Skin Imaging Collaboration (ISIC). But where do neural networks look? Several authors have claimed that the ISIC dataset is affected by strong biases, i.e., spurious correlations between samples that machine learning models unfairly exploit while discarding the useful patterns they are expected to learn. These strong claims have been supported by showing that deep learning models maintain excellent performance even when "no information about the lesion remains" in the debased input images. In this paper, we explore the interpretability of CNNs in dermoscopic image analysis by analyzing which characteristics automated classification algorithms take into account. Starting from a standard setting, the experiments gradually conceal well-known, crucial dermoscopic features and investigate how CNN performance changes as a result. Experiments on two well-known CNNs, EfficientNet-B3 and ResNet-152, demonstrate that neural networks autonomously learn to extract features that are known to be important for melanoma detection. Even when some of these features are removed, the remaining ones still suffice to achieve satisfactory classification performance. These results indicate that the bias claims in the literature are not supported by our experiments. Finally, to demonstrate the generalization capabilities of state-of-the-art CNN models for skin lesion classification, a large private dataset is used as an additional test set.

2025 Conference proceedings paper

Location Matters: Harnessing Spatial Information to Enhance the Segmentation of the Inferior Alveolar Canal in CBCTs

Authors: Lumetti, Luca; Pipoli, Vittorio; Bolelli, Federico; Ficarra, Elisa; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

The segmentation of the Inferior Alveolar Canal (IAC) plays a central role in maxillofacial surgery and has drawn significant attention in current research. Because of their outstanding results, deep learning methods are widely adopted for segmenting 3D medical volumes, including the IAC in Cone Beam Computed Tomography (CBCT) data. One of the main challenges when segmenting large volumes, including those obtained through CBCT scans, arises from the use of patch-based techniques, which are required to fit memory constraints. Such training approaches compromise neural network performance because they reduce the global contextual information available to the model. The degradation is most evident when the target objects are small with respect to the background, as is the case with the inferior alveolar nerve, which runs across the mandible but occupies only a few voxels of the entire scan. To address this issue and advance state-of-the-art performance in IAC segmentation, we propose an approach that exploits the spatial information of extracted patches and integrates it into a Transformer architecture. By incorporating prior knowledge about patch location, our model improves on the state of the art by ~2 Dice points when integrated with the standard U-Net architecture. The source code of our proposal is publicly released.
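To make the idea of "patch location" concrete, the sketch below enumerates patches on a regular grid over a 3D volume and records, for each patch, its origin and its center normalized to the [0, 1] range along each axis. This is only one plausible way to expose location to a model and is not taken from the paper: the function name `patch_positions`, the grid sampling, and the normalized-center encoding are all assumptions for illustration.

```python
from itertools import product
from typing import Iterator, Tuple

Coord = Tuple[int, int, int]
NormCoord = Tuple[float, float, float]


def patch_positions(
    volume_shape: Coord, patch_shape: Coord, stride: Coord
) -> Iterator[Tuple[Coord, NormCoord]]:
    """Yield (origin, normalized_center) for each patch on a regular grid.

    Patches that would extend past the volume border are skipped
    (a simplification; real pipelines often pad or overlap instead).
    """
    axes = [
        range(0, v - p + 1, s)
        for v, p, s in zip(volume_shape, patch_shape, stride)
    ]
    for origin in product(*axes):
        # Patch center expressed as a fraction of the volume extent.
        center = tuple(
            (o + p / 2) / v
            for o, p, v in zip(origin, patch_shape, volume_shape)
        )
        yield origin, center


# Toy example: a 160^3 volume tiled by non-overlapping 80^3 patches.
positions = list(patch_positions((160, 160, 160), (80, 80, 80), (80, 80, 80)))
print(len(positions))   # 8 patches
print(positions[0])     # ((0, 0, 0), (0.25, 0.25, 0.25))
```

The normalized center could then be passed to a network alongside the patch itself, for example as an extra input to a positional embedding; how the paper actually integrates location into the Transformer is not specified in the abstract.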

2025 Conference proceedings paper
