Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes

Authors: D'Amelio, Alessandro; Cartella, Giuseppe; Cuculo, Vittorio; Lucchi, Manuele; Cornia, Marcella; Cucchiara, Rita; Boccignone, Giuseppe

Attention guides our gaze to fixate the proper location of the scene and holds it there for the amount of time warranted by current processing demands, before shifting to the next location. As such, gaze deployment is crucially a temporal process. Existing computational models have made significant strides in predicting the spatial aspects of observers' visual scanpaths (where to look), while often relegating the temporal facet of attention dynamics (when to look) to the background. In this paper we present TPP-Gaze, a novel and principled approach to modelling scanpath dynamics based on Neural Temporal Point Processes (TPP), which jointly learns the temporal dynamics of fixation positions and durations, integrating deep learning methodologies with point process theory. We conduct extensive experiments across five publicly available datasets. Our results show the overall superior performance of the proposed model compared to state-of-the-art approaches.
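The point-process view of a scanpath, in which each fixation carries both a position ("where") and a duration ("when"), can be illustrated with a toy generative sketch. This is not the TPP-Gaze model (which learns a neural conditional intensity); all function names, rates, and parameters below are assumptions chosen purely for illustration.

```python
import numpy as np

def sample_scanpath(n_fix=8, beta=5.0, step_sigma=0.1, rng=None):
    """Toy generative scanpath: fixation durations drawn from an
    exponential point process with rate beta, fixation positions from a
    Gaussian random walk clipped to the unit image plane. Illustrates
    jointly sampling the 'where' (x, y) and 'when' (duration) of each
    fixation, as a temporal point process would."""
    rng = np.random.default_rng(rng)
    pos = np.array([0.5, 0.5])  # start at the image centre
    scanpath = []
    for _ in range(n_fix):
        duration = rng.exponential(1.0 / beta)  # inter-event (holding) time
        scanpath.append((pos[0], pos[1], duration))
        pos = np.clip(pos + rng.normal(0.0, step_sigma, 2), 0.0, 1.0)
    return scanpath
```

In a learned TPP, the constant rate `beta` would be replaced by a history-dependent conditional intensity, so that where the gaze has been modulates both where it goes next and how long it dwells there.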

2025 Conference proceedings

Using Gaze for Behavioural Biometrics

Authors: D’Amelio, Alessandro; Patania, Sabrina; Bursic, Sathya; Cuculo, Vittorio; Boccignone, Giuseppe

Published in: SENSORS

A principled approach to the analysis of eye movements for behavioural biometrics is laid down. The approach is grounded in foraging theory, which provides a sound basis for capturing the uniqueness of individual eye movement behaviour. We propose a composite Ornstein-Uhlenbeck process for quantifying the exploration/exploitation signature characterising foraging eye behaviour. The relevant parameters of the composite model, inferred from eye-tracking data via Bayesian analysis, are shown to yield a suitable feature set for biometric identification; the latter is eventually accomplished via a classical classification technique. A proof of concept of the method is provided by measuring its identification performance on a publicly available dataset. Data and code for reproducing the analyses are made available. Overall, we argue that the approach offers a fresh view on both the analysis of eye-tracking data and prospective applications in this field.
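The Ornstein-Uhlenbeck dynamics underlying the exploration/exploitation signature can be sketched with a minimal Euler-Maruyama simulation. This is a generic one-dimensional OU simulation under assumed parameter values, not the paper's composite model or published code.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, x0, dt=0.001, n_steps=1000, rng=None):
    """Euler-Maruyama simulation of dX = theta * (mu - X) dt + sigma dW:
    a mean-reverting diffusion pulled towards mu with strength theta."""
    rng = np.random.default_rng(rng)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))  # Brownian increment
        x[t] = x[t - 1] + theta * (mu - x[t - 1]) * dt + sigma * dw
    return x

# Two illustrative regimes for a 1-D gaze coordinate:
# "exploitation" = strong pull to the current fixation, small diffusion;
# "exploration"  = weak pull, large diffusion.
exploit = simulate_ou(theta=50.0, mu=0.0, sigma=1.0, x0=0.5, rng=0)
explore = simulate_ou(theta=2.0, mu=0.0, sigma=10.0, x0=0.5, rng=0)
```

Fitting `theta`, `mu`, and `sigma` per observer (the paper does this via Bayesian inference over a composite of such processes) yields the features used for identification.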

2023 Journal article

How to look next? A data-driven approach for scanpath prediction

Authors: Boccignone, G.; Cuculo, V.; D'Amelio, A.

Published in: LECTURE NOTES IN COMPUTER SCIENCE

By and large, current visual attention models mostly rely, when considering static stimuli, on the following procedure. Given an image, a saliency map is computed, which, in turn, might serve the purpose of predicting a sequence of gaze shifts, namely a scanpath instantiating the dynamics of visual attention deployment. The temporal pattern of attention unfolding is thus confined to the scanpath generation stage, whilst salience is conceived as a static map, at best conflating a number of factors (bottom-up information, top-down cues, spatial biases, etc.). In this note we propose a novel sequential scheme consisting of three processing stages that rely on a center-bias model, a context/layout model, and an object-based model, respectively. Each stage contributes, at different times, to the sequential sampling of the final scanpath. We compare the method against classic scanpath generation that exploits a state-of-the-art static saliency model. Results show that accounting for the structure of the temporal unfolding leads to gaze dynamics close to human gaze behaviour.
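The classic baseline procedure the abstract contrasts against, sampling a scanpath from a single static saliency map, can be sketched as follows. This is an illustrative sketch under common assumptions (salience-proportional sampling plus inhibition of return), not the authors' implementation; the function name and the inhibition scheme are invented for the example.

```python
import numpy as np

def scanpath_from_saliency(saliency, n_fix=5, ior_radius=2, rng=None):
    """Sample a scanpath from a static saliency map: at each step, draw
    the next fixation with probability proportional to salience, then
    suppress a neighbourhood of the chosen cell (inhibition of return)
    so the gaze moves on rather than refixating the same peak."""
    rng = np.random.default_rng(rng)
    s = np.asarray(saliency, dtype=float).copy()
    h, w = s.shape
    fixations = []
    for _ in range(n_fix):
        p = s.ravel() / s.ravel().sum()
        idx = rng.choice(h * w, p=p)
        y, x = divmod(idx, w)
        fixations.append((y, x))
        y0, y1 = max(0, y - ior_radius), min(h, y + ior_radius + 1)
        x0, x1 = max(0, x - ior_radius), min(w, x + ior_radius + 1)
        s[y0:y1, x0:x1] *= 1e-6  # inhibition of return
    return fixations
```

Note that all temporal structure here lives in the sampling loop while the map itself never changes; the paper's three-stage scheme instead lets different models drive the sampling at different times.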

2020 Conference proceedings

Problems with Saliency Maps

Authors: Boccignone, Giuseppe; Cuculo, Vittorio; D’Amelio, Alessandro

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Despite the popularity that saliency models have gained in the computer vision community, they are most often conceived, exploited, and benchmarked without taking heed of a number of problems and subtle issues they bring about. When saliency maps are used as proxies for the likelihood of fixating a location in a viewed scene, one such issue is the temporal dimension of visual attention deployment. Through a simple simulation it is shown how neglecting this dimension leads to results that at best cast doubt on the predictive performance of a model and its assessment via benchmarking procedures.

2019 Conference proceedings

Worldly eyes on video: Learnt vs. reactive deployment of attention to dynamic stimuli

Authors: Cuculo, V.; D'Amelio, A.; Grossi, G.; Lanzarotti, R.

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Computational visual attention is a hot topic in computer vision. However, most efforts are devoted to modelling saliency, whilst the actual eye guidance problem, which brings into play the sequence of gaze shifts characterising overt attention, is overlooked. Further, in those cases where the generation of gaze behaviour is considered, the stimuli of interest are by and large static (still images) rather than dynamic ones (videos). Under such circumstances, the work described in this note has a twofold aim: (i) addressing the problem of estimating and generating visual scanpaths, that is, the sequences of gaze shifts over videos; (ii) investigating the effectiveness in scanpath generation of features dynamically learned from human observers' attention dynamics, as opposed to bottom-up derived features. To this end, a probabilistic model is proposed. Using a publicly available dataset, our approach is compared against a model of scanpath simulation that does not rely on a learning step.

2019 Conference proceedings