Publications - AImageLab

Enabling 8B Bitwise Autoregressive Image Generation on Edge GPUs

Authors: Vezzali, Enrico; Bolelli, Federico; Grana, Costantino; Benini, Luca; Li, Yawei

Visual Autoregressive (VAR) models face a severe "Memory Wall" on edge devices due to large model size and substantial KV-cache requirements. In this work, we analyze the Infinity VAR family (2B and 8B) and propose a compression pipeline for deployment on constrained NVIDIA Jetson systems. We diagnose two critical bottlenecks: activation outliers reaching 353x the median, and channel-skewed cache variance. To address them, we propose a hybrid pipeline that combines SVDQuant, which structurally decouples weight outliers, with Asymmetric Per-Channel KV8 quantization. Our approach reduces the Infinity-8B footprint by 64% (37.1 GB → 13.3 GB), fitting it on the mid-range Orin NX with a 4.1x speedup over Flux.1-dev (W4A4), while achieving superior aesthetic alignment (ImageReward 1.13 vs 0.935). Crucially, we also unlock entry-level feasibility for the Infinity-2B, compressing it from 16.0 GB to 7.71 GB to enable deployment on the Orin Nano. These results establish a new efficiency standard for high-fidelity generative AI at the edge. The code is available at https://github.com/Henvezz95/deepcompressor.
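As a rough illustration of what asymmetric per-channel 8-bit quantization of a cache can look like, the sketch below stores one scale and one offset per channel, so that a channel with a wide value range does not waste the resolution of a narrow one. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the min/max calibration, and the plain list-of-lists layout (rows are tokens, columns are channels) are all hypothetical choices made for readability.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # assumed layout: rows = tokens, columns = channels


def quantize_per_channel(cache: Matrix) -> Tuple[List[List[int]], List[float], List[float]]:
    """Asymmetric uint8 quantization with one (scale, offset) pair per channel."""
    n_channels = len(cache[0])
    scales: List[float] = []
    offsets: List[float] = []
    for c in range(n_channels):
        column = [row[c] for row in cache]
        lo, hi = min(column), max(column)
        # A constant channel would give a zero scale; fall back to 1.0.
        scales.append((hi - lo) / 255.0 if hi > lo else 1.0)
        offsets.append(lo)  # asymmetric: the range starts at the channel minimum
    codes = [
        [
            # Round to the nearest code and clamp to the uint8 range.
            min(255, max(0, round((row[c] - offsets[c]) / scales[c])))
            for c in range(n_channels)
        ]
        for row in cache
    ]
    return codes, scales, offsets


def dequantize_per_channel(
    codes: List[List[int]], scales: List[float], offsets: List[float]
) -> Matrix:
    """Map uint8 codes back to approximate floating-point values."""
    return [
        [code * scales[c] + offsets[c] for c, code in enumerate(row)]
        for row in codes
    ]


if __name__ == "__main__":
    # Channel 0 spans 0.02 units, channel 1 spans 350 units (skewed variance).
    cache = [[0.10, -50.0], [0.12, 20.0], [0.11, 300.0]]
    codes, scales, offsets = quantize_per_channel(cache)
    print(codes)
    print(dequantize_per_channel(codes, scales, offsets))
```

With per-channel parameters, the reconstruction error of each channel is bounded by half of that channel's own step size, which is why a narrow channel stays accurate even when a neighbouring channel has a much larger range.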

2026 Conference proceedings paper

Metodi di Deep Learning Efficienti e Adattivi per Sistemi di Automatic Data Capture (Efficient and Adaptive Deep Learning Methods for Automatic Data Capture Systems)

Authors: Vezzali, Enrico

Automatic Data Capture (ADC) systems are a fundamental technology for modern logistics, commerce, and manufacturing, enabling traceability, automation, and process monitoring through the rapid acquisition of visual or encoded information. Among these technologies, barcodes remain one of the most widespread and cost-effective solutions for product identification. However, despite their maturity, recognizing codes and symbols is still difficult under real industrial conditions, where lighting variations, blur, long distances, or low resolution reduce readability. Traditional computer vision algorithms (based on geometric analysis, morphological operators, or the Hough transform) are reliable in controlled settings, but not when acquisition conditions deviate from nominal parameters. Deep learning techniques, by contrast, offer greater flexibility and robustness, but require substantial computational resources, which limits their use on embedded platforms. Bridging this gap between accuracy and efficiency is therefore essential for the next generation of intelligent ADC systems. This thesis analyzes strategies for benchmarking, optimizing, and deploying efficient deep learning models for industrial ADC applications. The work, carried out in collaboration with Datalogic S.p.A., focuses on integrating adaptive neural architectures into constrained, real-time environments. The first part addresses the lack of open-source data and reproducible benchmarks for barcode localization. To this end, BarBeR (Barcode Benchmark Repository) was developed: a public framework with 8,748 annotated images that unifies classical approaches and deep learning methods under common protocols, ensuring fair comparisons and reproducibility.
The tests confirmed that, although deep models outperform traditional ones in accuracy, their computational cost remains an obstacle to real-time execution on embedded devices. To overcome this limitation, BaFaLo was proposed: a lightweight segmentation-based localizer optimized to run on CPUs without accelerators. Inspired by the Fast-SCNN paradigm, BaFaLo balances speed and precision, detecting small or degraded codes under difficult conditions while maintaining real-time performance. Because localization alone is not enough, and codes must also be read under adverse conditions, Mosaic-SR was introduced: an adaptive multi-step super-resolution method that allocates computing resources to the most complex regions. Guided by an uncertainty estimate, Mosaic-SR improves accuracy and latency over uniform approaches, enabling high-quality reconstructions on embedded hardware. The last part, carried out at the Integrated Systems Laboratory of ETH Zurich, concerns the quantization and deployment of generative models. By combining advanced strategies such as SVDQuant and cache quantization, the required memory was reduced by more than 50% without compromising quality or stability. These results pave the way for the use of generative models on resource-limited platforms and for the creation of synthetic datasets when real or open-source data are insufficient. In summary, the thesis shows how efficient and adaptive deep learning makes advanced visual capabilities accessible to real-time ADC systems. Through benchmarking, optimization, and deployment of neural architectures for detection, enhancement, and generation, the work contributes to the evolution of industrial vision: from rigid, rule-based pipelines to flexible, data-driven solutions that remain reliable under real operating conditions.

2026 Doctoral thesis

A Deep-Learning-Based Method for Real-Time Barcode Segmentation on Edge CPUs

Authors: Vezzali, Enrico; Vorabbi, Lorenzo; Grana, Costantino; Bolelli, Federico

Barcodes are a critical technology in industrial automation, logistics, and retail, enabling fast and reliable data capture. While deep learning has significantly improved barcode localization accuracy, most modern architectures remain too computationally demanding for real-time deployment on embedded systems without dedicated hardware acceleration. In this work, we present BaFaLo (Barcode Fast Localizer), an ultra-lightweight segmentation-based neural network for barcode localization. Our model is specifically optimized for real-time performance on low-power CPUs while maintaining high localization accuracy for both 1D and 2D barcodes. It features a two-branch architecture, comprising a local feature extractor and a global context module, and is tailored for low-resolution inputs to further improve inference speed. We benchmark BaFaLo against several lightweight architectures for object detection or segmentation, including YOLO Nano, Fast-SCNN, BiSeNet V2, and ContextNet, using the BarBeR dataset. BaFaLo achieves the fastest inference time among all deep-learning models tested, operating at 57.62 ms per frame on a single CPU core of a Raspberry Pi 3B+. Despite its compact design, it achieves a decoding rate nearly equivalent to YOLO Nano for 1D barcodes and only 3.5 percentage points lower for 2D barcodes, while being approximately nine times faster.

2025 Conference proceedings paper

BarBeR: A Barcode Benchmarking Repository

Authors: Vezzali, E.; Bolelli, F.; Santi, S.; Grana, C.

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Since their invention in 1949, barcodes have remained the preferred method for automatic data capture, playing a crucial role in supply chain management. To detect a barcode in an image, multiple algorithms have been proposed in the literature, with a significant increase of interest in the topic since the rise of deep learning. However, research in the field suffers from many limitations, including the scarcity of public datasets and code implementations, which hampers the reproducibility and reliability of published results. For this reason, we developed "BarBeR" (Barcode Benchmark Repository), a benchmark designed for testing and comparing barcode detection algorithms. This benchmark includes the code implementation of various detection algorithms for barcodes, along with a suite of useful metrics. It offers a range of test setups and can be expanded to include any localization algorithm. In addition, we provide a large, annotated dataset of 8748 barcode images, combining multiple public barcode datasets with standardized annotation formats for both detection and segmentation tasks. Finally, we share the results obtained from running the benchmark on our dataset, offering valuable insights into the performance of different algorithms.

2025 Conference proceedings paper

Mosaic-SR: An Adaptive Multi-step Super-Resolution Method for Low-Resolution 2D Barcodes

Authors: Vezzali, Enrico; Vorabbi, Lorenzo; Grana, Costantino; Bolelli, Federico

QR and Datamatrix codes are widely used in warehouse logistics and high-speed production pipelines. Still, distant or small barcodes often yield low-pixel-density images that are hard to read. Conventional solutions rely on costly hardware or enhanced lighting, raising expenses and potentially reducing depth of field. We propose Mosaic-SR, a multi-step, adaptive super-resolution (SR) method that devotes more computation to barcode regions than to uniform backgrounds. For each patch, it predicts an uncertainty value that determines how many refinement steps are required. Our experiments show that Mosaic-SR surpasses state-of-the-art SR models on 2D barcode images, achieving higher PSNR and decoding rates in less time. All code and trained models are publicly available at https://github.com/Henvezz95/mosaic-sr.
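The idea of spending more refinement steps on uncertain patches can be sketched as a simple thresholding rule. The snippet below is only an illustrative assumption: the abstract does not say how uncertainty is mapped to a step count, so the function names, the threshold values, and the rule "one step per threshold exceeded" are all hypothetical and may differ from the released Mosaic-SR code.

```python
from typing import List, Sequence


def steps_for_patch(uncertainty: float, thresholds: Sequence[float]) -> int:
    """Hypothetical rule: one refinement step per threshold the uncertainty exceeds."""
    return sum(1 for t in thresholds if uncertainty > t)


def allocate_steps(
    uncertainties: Sequence[float],
    thresholds: Sequence[float] = (0.1, 0.3, 0.6),  # assumed values, for illustration
) -> List[int]:
    """Assign a refinement budget to every patch from its predicted uncertainty."""
    return [steps_for_patch(u, thresholds) for u in uncertainties]


if __name__ == "__main__":
    # A flat background patch, a mildly textured patch, and a dense barcode patch.
    per_patch = allocate_steps([0.05, 0.2, 0.9])
    uniform_cost = len(per_patch) * 3  # every patch refined with the maximum 3 steps
    print(per_patch, sum(per_patch), uniform_cost)
```

Under these assumed thresholds, the adaptive schedule runs fewer total steps than refining every patch with the maximum budget, which is the intuition behind the reported gains in both quality and latency.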

2025 Conference proceedings paper

State-of-the-art Review and Benchmarking of Barcode Localization Methods

Authors: Vezzali, Enrico; Bolelli, Federico; Santi, Stefano; Grana, Costantino

Published in: ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Barcodes, despite their long history, remain an essential technology in supply chain management. In addition, barcodes have found extensive use in industrial engineering, particularly in warehouse automation, component tracking, and robot guidance. To detect a barcode in an image, multiple algorithms have been proposed in the literature, with a significant increase in interest in the topic since the rise of deep learning. However, research in the field suffers from many limitations, including the scarcity of public datasets and code implementations, which hinders the reproducibility and reliability of published results. For this reason, we developed "BarBeR" (Barcode Benchmark Repository), a benchmark designed for testing and comparing barcode detection algorithms. This benchmark includes the code implementation of various detection algorithms for barcodes, along with a suite of useful metrics. The supported localization methods include multiple deep-learning detection models, which are used to assess the recent contributions of Artificial Intelligence to this field. In addition, we provide a large, annotated dataset of 8748 barcode images, combining multiple public barcode datasets with standardized annotation formats for both detection and segmentation tasks. Finally, we provide a thorough summary of the history and literature on barcode localization and share the results obtained from running the benchmark on our dataset, offering valuable insights into the performance of different algorithms when applied to real-world problems.

2025 Journal article

BarBeR: A Barcode Benchmarking Repository

Authors: Vezzali, Enrico; Bolelli, Federico; Santi, Stefano; Grana, Costantino

Since their invention in 1949, barcodes have remained the preferred method for automatic data capture, playing a crucial role in supply chain management. To detect a barcode in an image, multiple algorithms have been proposed in the literature, with a significant increase of interest in the topic since the rise of deep learning. However, research in the field suffers from many limitations, including the scarcity of public datasets and code implementations, which hampers the reproducibility and reliability of published results. For this reason, we developed "BarBeR" (Barcode Benchmark Repository), a benchmark designed for testing and comparing barcode detection algorithms. This benchmark includes the code implementation of various detection algorithms for barcodes, along with a suite of useful metrics. It offers a range of test setups and can be expanded to include any localization algorithm. In addition, we provide a large, annotated dataset of 8748 barcode images, combining multiple public barcode datasets with standardized annotation formats for both detection and segmentation tasks. Finally, we share the results obtained from running the benchmark on our dataset, offering valuable insights into the performance of different algorithms.

2024 Conference proceedings paper
