Publications
Explore our research publications: papers, articles, and conference proceedings from AImageLab.
Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation
Authors: Liu, Yahui; Sangineto, Enver; Chen, Yajing; Bao, Linchao; Zhang, Haoxian; Sebe, Nicu; Lepri, Bruno; Wang, Wei; Nadai, Marco De
Published in: PROCEEDINGS - IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
Supporting Skin Lesion Diagnosis with Content-Based Image Retrieval
Authors: Allegretti, Stefano; Bolelli, Federico; Pollastri, Federico; Longhitano, Sabrina; Pellacani, Giovanni; Grana, Costantino
Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION
In recent years, many attempts have been dedicated to the creation of automated devices that could assist both expert and beginner dermatologists towards fast and early diagnosis of skin lesions. Tasks such as skin lesion classification and segmentation have been extensively addressed with deep learning algorithms, which in some cases reach a diagnostic accuracy comparable to that of expert physicians. However, the general lack of interpretability and reliability severely hinders the ability of those approaches to actually support dermatologists in the diagnosis process. In this paper, a novel skin image retrieval system is presented, which exploits features extracted by Convolutional Neural Networks to gather similar images from a publicly available dataset, in order to assist the diagnosis process of both expert and novice practitioners. In the proposed framework, ResNet-50 is initially trained for the classification of dermoscopic images; then, the feature extraction part is isolated, and an embedding network is built on top of it. The embedding learns an alternative representation, which makes it possible to check image similarity by means of a distance measure. Experimental results reveal that the proposed method is able to select meaningful images, which can effectively boost the classification accuracy of human dermatologists.
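The retrieval step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's actual pipeline: it assumes the ResNet-50-based embedding network has already mapped each gallery image (and the query) to a fixed-length vector, and simply ranks the gallery by Euclidean distance in that embedding space.

```python
import numpy as np

def retrieve_similar(query_emb, gallery_embs, k=3):
    """Return the indices of the k gallery images whose embeddings are
    closest to the query embedding (Euclidean distance)."""
    dists = np.linalg.norm(gallery_embs - query_emb, axis=1)
    return np.argsort(dists)[:k].tolist()

# Illustrative usage with toy 2-D "embeddings":
gallery = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = np.array([0.9, 1.1])
nearest = retrieve_similar(query, gallery, k=3)
```

In practice the returned images, rather than a class label, are what is shown to the dermatologist, which is where the interpretability benefit of the retrieval formulation comes from.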
The color out of space: learning self-supervised representations for Earth Observation imagery
Authors: Vincenzi, Stefano; Porrello, Angelo; Buzzega, Pietro; Cipriano, Marco; Fronte, Pietro; Cuccu, Roberto; Ippoliti, Carla; Conte, Annamaria; Calderara, Simone
Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION
The recent growth in the number of satellite images fosters the development of effective deep-learning techniques for Remote Sensing (RS). However, their full potential is untapped due to the lack of large annotated datasets. Such a problem is usually countered by fine-tuning a feature extractor that was previously trained on the ImageNet dataset. Unfortunately, the domain of natural images differs from the RS one, which hinders the final performance. In this work, we propose to learn meaningful representations from satellite imagery, leveraging its high-dimensional spectral bands to reconstruct the visible colors. We conduct experiments on land cover classification (BigEarthNet) and West Nile Virus detection, showing that colorization is a solid pretext task for training a feature extractor. Furthermore, we qualitatively observe that models pre-trained on natural images and models pre-trained via colorization rely on different parts of the input. This paves the way to an ensemble model that eventually outperforms both the above-mentioned techniques.
The DeepHealth Toolkit: A Key European Free and Open-Source Software for Deep Learning and Computer Vision Ready to Exploit Heterogeneous HPC and Cloud Architectures
Authors: Aldinucci, Marco; Atienza, David; Bolelli, Federico; Caballero, Mónica; Colonnelli, Iacopo; Flich, José; Gómez, Jon A.; González, David; Grana, Costantino; Grangetto, Marco; Leo, Simone; López, Pedro; Oniga, Dana; Paredes, Roberto; Pireddu, Luca; Quiñones, Eduardo; Silva, Tatiana; Tartaglione, Enzo; Zapater, Marina
At the present time, we are immersed in the convergence between Big Data, High-Performance Computing and Artificial Intelligence. Technological progress in these three areas has accelerated in recent years, forcing different players like software companies and stakeholders to move quickly. The European Union is dedicating a lot of resources to maintaining its relevant position in this scenario, funding projects to implement large-scale pilot testbeds that combine the latest advances in Artificial Intelligence, High-Performance Computing, Cloud and Big Data technologies. The DeepHealth project is an example focused on the health sector whose main outcome is the DeepHealth toolkit, a European unified framework that offers deep learning and computer vision capabilities, completely adapted to exploit underlying heterogeneous High-Performance Computing, Big Data and cloud architectures, and ready to be integrated into any software platform to facilitate the development and deployment of new applications for specific problems in any sector. This toolkit is intended to be one of the European contributions to the field of AI. This chapter introduces the toolkit with its main components and complementary tools, providing a clear view to facilitate and encourage its adoption and wide use by the European community of developers of AI-based solutions and data scientists working in the healthcare sector and others.
The DeepHealth Toolkit: A Unified Framework to Boost Biomedical Applications
Authors: Cancilla, Michele; Canalini, Laura; Bolelli, Federico; Allegretti, Stefano; Carrión, Salvador; Paredes, Roberto; Ander Gómez, Jon; Leo, Simone; Enrico Piras, Marco; Pireddu, Luca; Badouh, Asaf; Marco-Sola, Santiago; Alvarez, Lluc; Moreto, Miquel; Grana, Costantino
Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION
Given the overwhelming impact of machine learning over the last decade, several libraries and frameworks have been developed in recent years to simplify the design and training of neural networks, providing array-based programming, automatic differentiation and user-friendly access to hardware accelerators. None of those tools, however, was designed with native and transparent support for Cloud Computing or heterogeneous High-Performance Computing (HPC). The DeepHealth Toolkit is an open-source Deep Learning toolkit aimed at boosting the productivity of data scientists operating in the medical field by providing a unified framework for the distributed training of neural networks, which is able to leverage hybrid HPC and cloud environments in a way that is transparent to the user. The toolkit is composed of a Computer Vision library, a Deep Learning library, and a front-end for non-expert users; all of the components are focused on the medical domain, but they are general purpose and can be applied to any other field. In this paper, the principles driving the design of the DeepHealth libraries are described, along with details about the implementation and the interaction between the different elements composing the toolkit. Finally, experiments on common benchmarks prove the efficiency of each separate component and of the DeepHealth Toolkit overall.
Training convolutional neural networks to score pneumonia in slaughtered pigs
Authors: Bonicelli, L.; Trachtman, A. R.; Rosamilia, A.; Liuzzo, G.; Hattab, J.; Alcaraz, E. M.; Del Negro, E.; Vincenzi, S.; Dondona, A. C.; Calderara, S.; Marruchella, G.
Published in: ANIMALS
The slaughterhouse can act as a valid checkpoint to estimate the prevalence and the economic impact of diseases in farm animals. At present, scoring lesions is a challenging and time-consuming activity, which is carried out by veterinarians serving the slaughter chain. Over recent years, artificial intelligence (AI) has gained traction in many fields of research, including livestock production. In particular, AI-based methods appear able to solve highly repetitive tasks and to consistently analyze large amounts of data, such as those collected by veterinarians during postmortem inspection in high-throughput slaughterhouses. The present study aims to develop an AI-based method capable of recognizing and quantifying enzootic pneumonia-like lesions on digital images captured from slaughtered pigs under routine abattoir conditions. Overall, the data indicate that the AI-based method proposed herein could properly identify and score enzootic pneumonia-like lesions without interfering with the slaughter chain routine. According to European legislation, the application of such a method avoids the handling of carcasses and organs, decreasing the risk of microbial contamination, and could provide further alternatives in the field of food hygiene.
TriGAN: image-to-image translation for multi-source domain adaptation
Authors: Roy, S.; Siarohin, A.; Sangineto, E.; Sebe, N.; Ricci, E.
Published in: MACHINE VISION AND APPLICATIONS
Most domain adaptation methods consider the problem of transferring knowledge to the target domain from a single-source dataset. However, in practical applications, we typically have access to multiple sources. In this paper we propose the first approach for multi-source domain adaptation (MSDA) based on generative adversarial networks. Our method is inspired by the observation that the appearance of a given image depends on three factors: the domain, the style (characterized in terms of low-level feature variations) and the content. For this reason, we propose to project the source image features onto a space where only the dependence on the content is kept, and then re-project this invariant representation onto the pixel space using the target domain and style. In this way, new labeled images can be generated which are used to train a final target classifier. We test our approach using common MSDA benchmarks, showing that it outperforms state-of-the-art methods.
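The project-then-re-project idea can be illustrated with per-channel feature statistics, in the spirit of instance normalization and AdaIN-style re-styling. This is an assumption-laden sketch, not TriGAN's actual architecture: normalizing a feature map strips its domain/style statistics (leaving a content-only representation), and re-applying the target's statistics re-projects it toward the target domain.

```python
import numpy as np

def to_content(feat, eps=1e-5):
    """Strip style/domain statistics from a (channels, H, W) feature map
    by per-channel normalization, keeping only the 'content'."""
    mu = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    return (feat - mu) / (std + eps)

def apply_style(content, target_feat, eps=1e-5):
    """Re-project a content-only representation using the target
    feature map's per-channel statistics (AdaIN-style)."""
    mu = target_feat.mean(axis=(1, 2), keepdims=True)
    std = target_feat.std(axis=(1, 2), keepdims=True)
    return content * (std + eps) + mu

rng = np.random.default_rng(0)
source = rng.normal(size=(2, 4, 4))
target = rng.normal(loc=3.0, scale=2.0, size=(2, 4, 4))
restyled = apply_style(to_content(source), target)
```

After the round trip, the restyled features carry the target's channel-wise statistics while keeping the source's spatial structure, which is the intuition behind generating new labeled target-style images for classifier training.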
Unifying tensor factorization and tensor nuclear norm approaches for low-rank tensor completion
Authors: Du, S.; Xiao, Q.; Shi, Y.; Cucchiara, R.; Ma, Y.
Published in: NEUROCOMPUTING
Low-rank tensor completion (LRTC) has gained significant attention due to its powerful capability of recovering missing entries. However, it has to repeatedly calculate the time-consuming singular value decomposition (SVD). To address this drawback, we propose, based on the tensor-tensor product (t-product), a new LRTC method, the unified tensor factorization (UTF), for 3-way tensor completion. We first integrate the tensor factorization (TF) and the tensor nuclear norm (TNN) regularization into a framework that inherits the benefits of both TF and TNN: fast calculation and convex optimization. The conditions under which TF and TNN are equivalent are analyzed. Then, UTF for tensor completion is presented, and an efficient iterative update algorithm based on the alternating direction method of multipliers (ADMM) is used for our UTF optimization; the solution of the proposed alternating minimization algorithm is also proven to converge to a Karush-Kuhn-Tucker (KKT) point. Finally, numerical experiments on synthetic data completion and image/video inpainting tasks demonstrate the effectiveness of our method over other state-of-the-art tensor completion methods.
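The core building block that ADMM-based TNN methods repeat is singular value thresholding under the t-product: frontal-slice-wise soft-thresholding of singular values in the Fourier domain. The sketch below shows only this proximal operator, not the full UTF solver (which alternates it with the factorization update and data-consistency steps).

```python
import numpy as np

def tnn_svt(T, tau):
    """Proximal operator of the tensor nuclear norm under the t-product:
    FFT along the third mode, soft-threshold singular values of each
    frontal slice, then inverse FFT back. T: real (n1, n2, n3) array."""
    Tf = np.fft.fft(T, axis=2)                       # to Fourier domain
    out = np.empty_like(Tf)
    for k in range(T.shape[2]):
        U, s, Vh = np.linalg.svd(Tf[:, :, k], full_matrices=False)
        out[:, :, k] = (U * np.maximum(s - tau, 0.0)) @ Vh  # shrink spectrum
    return np.real(np.fft.ifft(out, axis=2))

rng = np.random.default_rng(1)
T = rng.normal(size=(3, 3, 4))
```

Setting the threshold `tau` to zero recovers the input exactly, while larger values shrink the tensor spectrum toward low tubal rank; the SVD per frontal slice is exactly the repeated cost that UTF's factorization view is designed to avoid.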
Vehicle and method for inspecting a railway line
Authors: Avizzano, Carlo Alberto; Borghi, Guido; Calderara, Simone; Cucchiara, Rita; Fedeli, Eugenio; Ermini, Mirko; Gonnelli, Mirco; Labanca, Giacomo; Frisoli, Antonio; Gasparini, Riccardo; Solazzi, Massimiliano; Tiseni, Luca; Leonardis, Daniele; Satler, Massimo