Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.


May the Forgetting Be with You: Alternate Replay for Learning with Noisy Labels

Authors: Millunzi, Monica; Bonicelli, Lorenzo; Porrello, Angelo; Credi, Jacopo; Kolm, Petter N.; Calderara, Simone

Forgetting presents a significant challenge during incremental training, making it particularly demanding for contemporary AI systems to assimilate new knowledge in streaming data environments. To address this issue, most approaches in Continual Learning (CL) rely on the replay of a restricted buffer of past data. However, the presence of noise in real-world scenarios, where human annotation is constrained by time limitations or where data is automatically gathered from the web, frequently renders these strategies vulnerable. In this study, we address the problem of CL under Noisy Labels (CLN) by introducing Alternate Experience Replay (AER), which takes advantage of forgetting to maintain a clear distinction between clean, complex, and noisy samples in the memory buffer. The idea is that complex or mislabeled examples, which hardly fit the previously learned data distribution, are the most likely to be forgotten. To reap the benefits of such a separation, we equip AER with Asymmetric Balanced Sampling (ABS): a new sample selection strategy that prioritizes purity on the current task while retaining relevant samples from the past. Through extensive computational comparisons, we demonstrate the effectiveness of our approach in terms of both accuracy and purity of the obtained buffer, resulting in a remarkable average gain of 4.71 percentage points in accuracy with respect to existing loss-based purification strategies. Code is available at https://github.com/aimagelab/mammoth.

2024 Paper in conference proceedings

MONOT: High-Quality Privacy-compliant Morphed Synthetic Images for Everyone

Authors: Borghi, Guido; Di Domenico, Nicolò; Ferrara, Matteo; Franco, Annalisa; Latif, Uzma; Maltoni, Davide

2024 Paper in conference proceedings

Multi-Class Unlearning for Image Classification via Weight Filtering

Authors: Poppi, Samuele; Sarto, Sara; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Published in: IEEE INTELLIGENT SYSTEMS

Machine Unlearning is an emerging paradigm for selectively removing the impact of training datapoints from a network. Unlike existing methods that target a limited subset or a single class, our framework unlearns all classes in a single round. We achieve this by modulating the network's components using memory matrices, enabling the network to demonstrate selective unlearning behavior for any class after training. By discovering weights that are specific to each class, our approach also recovers a representation of the classes which is explainable by design. We test the proposed framework on small- and medium-scale image classification datasets, with both convolution- and Transformer-based backbones, showcasing the potential for explainable solutions through unlearning.

2024 Journal article

ONOT: a High-Quality ICAO-compliant Synthetic Mugshot Dataset

Authors: Di Domenico, N.; Borghi, G.; Franco, A.; Maltoni, D.

Nowadays, state-of-the-art AI-based generative models represent a viable solution to overcome privacy issues and biases in the collection of datasets containing personal information, such as faces. Following this intuition, in this paper we introduce ONOT (One, No one and One hundred Thousand; L. Pirandello, 1926), a synthetic dataset specifically focused on the generation of high-quality faces in adherence to the requirements of the ISO/IEC 39794-5 standard, which, following the guidelines of the International Civil Aviation Organization (ICAO), defines the interchange formats of face images in electronic Machine-Readable Travel Documents (eMRTD). The strictly controlled and varied mugshot images included in ONOT are useful in research fields related to the analysis of face images in eMRTD, such as Morphing Attack Detection and Face Quality Assessment. The dataset is publicly released (https://miatbiolab.csr.unibo.it/icao-synthetic-dataset), together with the details of the generation procedure, in order to improve reproducibility and enable future extensions.

2024 Paper in conference proceedings

Optimizing Resource Consumption in Diffusion Models through Hallucination Early Detection

Authors: Betti, Federico; Baraldi, Lorenzo; Baraldi, Lorenzo; Cucchiara, Rita; Sebe, Nicu

2024 Paper in conference proceedings

P. I. E. N. O.—Petrol-Filling Itinerary Estimation aNd Optimization

Authors: Savarese, M.; De Blasi, A.; Zaccagnino, C.; Grazia, C. A.

Published in: IEEE ACCESS

The recent rise of intelligent transportation systems (ITS) has challenged the integration between different data sources. Reaching the goal of sustainable mobility requires properly managing and merging information coming from inside the vehicle (intra-) and information coming from outside the vehicle (inter-). In this paper, we provide a proof of concept that merges data from intra- and inter-vehicle networking, and we present our framework: Petrol-Filling Itinerary Estimation aNd Optimization (PIENO). PIENO is a system that not only automates the search for the best fuel station but also paves the way to significant reductions in fuel consumption, making eco-driving a practical reality from a user perspective. The PIENO framework is designed to be fuel-type independent, ensuring its adaptability to different vehicles and conditions. It achieves this by merging data from the vehicle, obtained through a CAN Access Module (CAM), with data from outside the vehicle, obtained through a mobile application connected to the internet. Several domains are involved in reaching this goal: microcontrollers and OEMs to retrieve the fuel level from the car, national authorities to retrieve the daily fuel price, AI models to predict the price trend for the following days, and algorithms to compute the best fuel station and the best time to fill up. The modularity of PIENO allows it to adapt to different OEMs by modifying the intra-network interface to properly collect the fuel level, as well as to different markets and countries by modifying the inter-network interface that retrieves station locations and fuel prices.

2024 Journal article

Pain and Fear in the Eyes: Gaze Dynamics Predicts Social Anxiety from Fear Generalisation

Authors: Patania, Sabrina; D’Amelio, Alessandro; Cuculo, Vittorio; Limoncini, Matteo; Ghezzi, Marco; Conversano, Vincenzo; Boccignone, Giuseppe

Published in: LECTURE NOTES IN COMPUTER SCIENCE

2024 Paper in conference proceedings

Parameter Identification of a 6-DoF Serial Manipulator with Coupled Joints and Load-Assisting Springs for Industrial Applications

Authors: Nini, Matteo; Ferraguti, Federica; Ragaglia, Matteo; Bertuletti, Mattia; Di Napoli, Simone; Fantuzzi, Cesare

This paper presents a novel approach for identifying the dynamic parameters of a 6-DoF serial manipulator characterized by coupled joints and load-assisting springs, a common mechanical design for industrial robots. The proposed method consists of two steps: first, a static identification process estimates the masses and centers of gravity (CoGs) of the links; then, a dynamic identification process determines the link inertias, motor inertias, and friction terms. In the dynamic identification process, a trajectory is used to generate the required dynamic response of the system, and a regression matrix is employed to combine the identified parameters. Finally, a constrained optimization method is used to extract the parameters. The proposed method has been validated through simulations and experiments, showing high accuracy and reliability. This research contributes to the advancement of robot modeling and control, and has potential applications in various industrial fields.

Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images

Authors: Amoroso, Roberto; Morelli, Davide; Cornia, Marcella; Baraldi, Lorenzo; Del Bimbo, Alberto; Cucchiara, Rita

Published in: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING, COMMUNICATIONS AND APPLICATIONS

Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language. While these models have numerous benefits across various sectors, they have also raised concerns about the potential misuse of fake images and put new pressure on fake image detection. In this work, we pioneer a systematic study on the detection of deepfakes generated by state-of-the-art diffusion models. First, we conduct a comprehensive analysis of the performance of contrastive and classification-based visual features, extracted, respectively, from CLIP-based models and from ResNet or Vision Transformer (ViT)-based architectures trained on image classification datasets. Our results demonstrate that fake images share common low-level cues, which render them easily recognizable. Further, we devise a multimodal setting wherein fake images are synthesized from different textual captions, which are used as seeds for a generator. Under this setting, we quantify the performance of fake detection strategies and introduce a contrastive-based disentangling method that lets us analyze the role of the semantics of textual descriptions and of low-level perceptual cues. Finally, we release a new dataset, called COCOFake, containing about 1.2 million images generated from the original COCO image–caption pairs using two recent text-to-image diffusion models, namely Stable Diffusion v1.4 and v2.0.

2024 Journal article

Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments

Authors: Barsellotti, Luca; Bigazzi, Roberto; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Published in: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS

In recent years, research interest in visual navigation towards objects in indoor environments has grown significantly. This growth can be attributed to the recent availability of large navigation datasets in photo-realistic simulated environments, like Gibson and Matterport3D. However, the navigation tasks supported by these datasets are often restricted to the objects present in the environment at acquisition time. Also, they fail to account for the realistic scenario in which the target object is a user-specific instance that can be easily confused with similar objects and may be found in multiple locations within the environment. To address these limitations, we propose a new task called Personalized Instance-based Navigation (PIN), in which an embodied agent is tasked with locating and reaching a specific personal object by distinguishing it among multiple instances of the same category. The task is accompanied by PInNED, a dedicated new dataset composed of photo-realistic scenes augmented with additional 3D objects. In each episode, the target object is presented to the agent using two modalities: a set of visual reference images on a neutral background and manually annotated textual descriptions. Through comprehensive evaluations and analyses, we showcase the challenges of the PIN task as well as the performance and shortcomings of currently available methods designed for object-driven navigation, considering modular and end-to-end agents.

2024 Paper in conference proceedings
