Publications by Simone Calderara

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Classifying Signals on Irregular Domains via Convolutional Cluster Pooling

Authors: Porrello, Angelo; Abati, Davide; Calderara, Simone; Cucchiara, Rita

Published in: PROCEEDINGS OF MACHINE LEARNING RESEARCH

We present a novel and hierarchical approach for supervised classification of signals spanning over a fixed graph, reflecting shared properties of the dataset. To this end, we introduce a Convolutional Cluster Pooling layer exploiting a multi-scale clustering in order to highlight, at different resolutions, locally connected regions on the input graph. Our proposal generalises well-established neural models such as Convolutional Neural Networks (CNNs) on irregular and complex domains, by means of the exploitation of the weight sharing property in a graph-oriented architecture. In this work, such property is based on the centrality of each vertex within its soft-assigned cluster. Extensive experiments on NTU RGB+D, CIFAR-10 and 20NEWS demonstrate the effectiveness of the proposed technique in capturing both local and global patterns in graph-structured data out of different domains.

2019 Conference proceedings paper

End-to-end 6-DoF Object Pose Estimation through Differentiable Rasterization

Authors: Palazzi, Andrea; Bergamini, Luca; Calderara, Simone; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Here we introduce an approximated differentiable renderer to refine a 6-DoF pose prediction using only 2D alignment information. To this end, a two-branched convolutional encoder network is employed to jointly estimate the object class and its 6-DoF pose in the scene. We then propose a new formulation of an approximated differentiable renderer to re-project the 3D object on the image according to its predicted pose; in this way the alignment error between the observed and the re-projected object silhouette can be measured. Since the renderer is differentiable, it is possible to back-propagate through it to correct the estimated pose at test time in an online learning fashion. Finally, we show how to leverage the classification branch to profitably re-project a representative model of the predicted class (i.e. a medoid) instead. Each object in the scene is processed independently, and novel viewpoints in which both object arrangement and mutual pose are preserved can be rendered. Differentiable renderer code is available at: https://github.com/ndrplz/tensorflow-mesh-renderer.

2019 Conference proceedings paper

Gait-Based Diplegia Classification Using LSMT Networks

Authors: Ferrari, Alberto; Bergamini, Luca; Guerzoni, Giorgio; Calderara, Simone; Bicocchi, Nicola; Vitetta, Giorgio; Borghi, Corrado; Neviani, Rita; Ferrari, Adriano

Published in: JOURNAL OF HEALTHCARE ENGINEERING

Diplegia is a specific subcategory of the wide spectrum of motion disorders gathered under the name of cerebral palsy. Recent works proposed to use gait analysis for diplegia classification paving the way for automated analysis. A clinically established gait-based classification system divides diplegic patients into 4 main forms, each one associated with a peculiar walking pattern. In this work, we apply two different deep learning techniques, namely, multilayer perceptron and recurrent neural networks, to automatically classify children into the 4 clinical forms. For the analysis, we used a dataset comprising gait data of 174 patients collected by means of an optoelectronic system. The measurements describing walking patterns have been processed to extract 27 angular parameters and then used to train both kinds of neural networks. Classification results are comparable with those provided by experts in 3 out of 4 forms.

2019 Journal article

Latent Space Autoregression for Novelty Detection

Authors: Abati, Davide; Porrello, Angelo; Calderara, Simone; Cucchiara, Rita

Novelty detection is commonly referred to as the discrimination of observations that do not conform to a learned model of regularity. Despite its importance in different application settings, designing a novelty detector is utterly complex due to the unpredictable nature of novelties and its inaccessibility during the training procedure, factors which expose the unsupervised nature of the problem. In our proposal, we design a general framework where we equip a deep autoencoder with a parametric density estimator that learns the probability distribution underlying its latent representations through an autoregressive procedure. We show that a maximum likelihood objective, optimized in conjunction with the reconstruction of normal samples, effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors. In addition to providing a very general formulation, extensive experiments of our model on publicly available datasets deliver on-par or superior performances if compared to state-of-the-art methods in one-class and video anomaly detection settings. Differently from prior works, our proposal does not make any assumption about the nature of the novelties, making our work readily applicable to diverse contexts.

2019 Conference proceedings paper

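METODO DI VALUTAZIONE DI UNO STATO DI SALUTE DI UN ELEMENTO ANATOMICO, RELATIVO DISPOSITIVO DI VALUTAZIONE E RELATIVO SISTEMA DI VALUTAZIONE [Method for evaluating the health status of an anatomical element, and related evaluation device and evaluation system]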

Authors: Marruchella, Giuseppe; Bergamini, Luca; Porrello, Angelo; Del Negro, Ercole; Capobianco Dondona, Andrea; Di Tondo, Francesco; Calderara, Simone

A system capable of detecting lesions on half-carcasses at the slaughterhouse, using deep learning techniques to identify the type of lesions present.

2019 Patent

Predicting the Driver's Focus of Attention: the DR(eye)VE Project

Authors: Palazzi, Andrea; Abati, Davide; Calderara, Simone; Solera, Francesco; Cucchiara, Rita

Published in: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

(Submitted on 10 May 2017 (v1); last revised 6 Jun 2018 (v3).) In this work we aim to predict the driver's focus of attention. The goal is to estimate what a person would pay attention to while driving, and which part of the scene around the vehicle is more critical for the task. To this end we propose a new computer vision model based on a multi-branch deep architecture that integrates three sources of information: raw video, motion and scene semantics. We also introduce DR(eye)VE, the largest dataset of driving scenes for which eye-tracking annotations are available. This dataset features more than 500,000 registered frames, matching ego-centric views (from glasses worn by drivers) and car-centric views (from roof-mounted camera), further enriched by other sensors measurements. Results highlight that several attention patterns are shared across drivers and can be reproduced to some extent. The indication of which elements in the scene are likely to capture the driver's attention may benefit several applications in the context of human-vehicle interaction and driver attention analysis.

2019 Journal article

Segmentation Guided Scoring of Pathological Lesions in Swine Through CNNs

Authors: Bergamini, L.; Trachtman, A. R.; Palazzi, A.; Negro, E. D.; Capobianco Dondona, A.; Marruchella, G.; Calderara, S.

Published in: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE

The slaughterhouse is widely recognised as a useful checkpoint for assessing the health status of livestock. At the moment, this is implemented through the application of scoring systems by human experts. The automation of this process would be extremely helpful for veterinarians to enable a systematic examination of all slaughtered livestock, positively influencing herd management. However, such systems are not yet available, mainly because of a critical lack of annotated data. In this work we: (i) introduce a large scale dataset to enable the development and benchmarking of these systems, featuring more than 4000 high-resolution swine carcass images annotated by domain experts with pixel-level segmentation; (ii) exploit part of this annotation to train a deep learning model in the task of pleural lesion scoring. In this setting, we propose a segmentation-guided framework which stacks together a fully convolutional neural network performing semantic segmentation with a rule-based classifier integrating a-priori veterinary knowledge in the process. Thorough experimental analysis against state-of-the-art baselines proves our method to be superior both in terms of accuracy and in terms of model interpretability. Code and dataset are publicly available here: https://github.com/lucabergamini/swine-lesion-scoring.

2019 Conference proceedings paper

Self-Supervised Optical Flow Estimation by Projective Bootstrap

Authors: Alletto, Stefano; Abati, Davide; Calderara, Simone; Cucchiara, Rita; Rigazio, Luca

Published in: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS

Dense optical flow estimation is complex and time consuming, with state-of-the-art methods relying either on large synthetic data sets or on pipelines requiring up to a few minutes per frame pair. In this paper, we address the problem of optical flow estimation in the automotive scenario in a self-supervised manner. We argue that optical flow can be cast as a geometrical warping between two successive video frames and devise a deep architecture to estimate such transformation in two stages. First, a dense pixel-level flow is computed with a projective bootstrap on rigid surfaces. We show how such global transformation can be approximated with a homography and extend spatial transformer layers so that they can be employed to compute the flow field implied by such transformation. Subsequently, we refine the prediction by feeding a second, deeper network that accounts for moving objects. A final reconstruction loss compares the warping of frame Xₜ with the subsequent frame Xₜ₊₁ and guides both estimates. The model has the speed advantages of end-to-end deep architectures while achieving competitive performances, both outperforming recent unsupervised methods and showing good generalization capabilities on new automotive data sets.

2019 Journal article

Spotting Insects from Satellites: Modeling the Presence of Culicoides Imicola Through Deep CNNs

Authors: Vincenzi, Stefano; Porrello, Angelo; Buzzega, Pietro; Conte, Annamaria; Ippoliti, Carla; Candeloro, Luca; Di Lorenzo, Alessio; Capobianco Dondona, Andrea; Calderara, Simone

Nowadays, Vector-Borne Diseases (VBDs) pose a severe threat to public health, accounting for a considerable amount of human illnesses. Recently, several surveillance plans have been put in place for limiting the spread of such diseases, typically involving on-field measurements. A systematic and effective plan is still missing, due to the high costs and efforts required for implementing it. Ideally, any attempt in this field should consider the triangle vectors-host-pathogen, which is strictly linked to the environmental and climatic conditions. In this paper, we exploit satellite imagery from the Sentinel-2 mission, as we believe it encodes the environmental factors responsible for the vector's spread. Our analysis - conducted in a data-driven fashion - couples spectral images with ground-truth information on the abundance of Culicoides imicola. In this respect, we frame our task as a binary classification problem, underpinning Convolutional Neural Networks (CNNs) as being able to learn useful representation from multi-band images. Additionally, we provide a multi-instance variant, aimed at extracting temporal patterns from a short sequence of spectral images. Experiments show promising results, providing the foundations for novel supportive tools, which could depict where surveillance and prevention measures could be prioritized.

2019 Conference proceedings paper

Attentive Models in Vision: Computing Saliency Maps in the Deep Learning Era

Authors: Cornia, Marcella; Abati, Davide; Baraldi, Lorenzo; Palazzi, Andrea; Calderara, Simone; Cucchiara, Rita

Published in: INTELLIGENZA ARTIFICIALE

Estimating the focus of attention of a person looking at an image or a video is a crucial step which can enhance many vision-based inference mechanisms: image segmentation and annotation, video captioning, autonomous driving are some examples. The early stages of the attentive behavior are typically bottom-up; reproducing the same mechanism means to find the saliency embodied in the images, i.e. which parts of an image pop out of a visual scene. This process has been studied for decades both in neuroscience and in terms of computational models for reproducing the human cortical process. In the last few years, early models have been replaced by deep learning architectures, that outperform any early approach compared against public datasets. In this paper, we discuss the effectiveness of convolutional neural networks (CNNs) models in saliency prediction. We present a set of Deep Learning architectures developed by us, which can combine both bottom-up cues and higher-level semantics, and extract spatio-temporal features by means of 3D convolutions to model task-driven attentive behaviors. We will show how these deep networks closely recall the early saliency models, although improved with the semantics learned from the human ground-truth. Eventually, we will present a use-case in which saliency prediction is used to improve the automatic description of images.

2018 Journal article
