Publications - AImageLab

Video surveillance and multimedia forensics: an application to trajectory analysis

Authors: Calderara, Simone; Prati, Andrea; Cucchiara, Rita

This paper reports an application of trajectory analysis in which forensics and video surveillance techniques are jointly employed to provide a new tool for multimedia forensics. Advanced video surveillance techniques are used to extract the trajectories of moving people from a multi-camera system; the trajectories are then modelled by either their positions (projected on the ground plane) or their directions of movement. Both representations are well suited to querying large video repositories by searching for similar trajectories in terms of either sequences of positions or trajectory shape (encoded as a sequence of angles, in which positions do not matter). Preliminary examples of the possible use of this approach are shown.
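The shape representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a trajectory is a list of ground-plane `(x, y)` points; the function name `directions` and the sample data are hypothetical and not taken from the paper.

```python
import math

def directions(points):
    """Turn ground-plane positions [(x, y), ...] into heading angles in radians."""
    angles = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # no movement between samples, so no direction to record
        angles.append(math.atan2(dy, dx))
    return angles

# Two hypothetical trajectories with the same shape at different positions.
a = [(0, 0), (1, 0), (2, 0), (2, 1)]
b = [(10, 5), (11, 5), (12, 5), (12, 6)]
print(directions(a) == directions(b))  # prints True
```

Because only the differences between consecutive points are used, the angle sequence ignores where the trajectory lies, which is what makes it usable for shape-only queries.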

2009 Conference proceedings paper

"Inside the Bible": Segmentation, Annotation and Retrieval for a New Browsing Experience

Authors: Grana, Costantino; Borghesani, Daniele; Calderara, Simone; Cucchiara, Rita

In this paper we present a system for automatic segmentation, annotation and content-based image retrieval, focused on illuminated manuscripts and in particular the Borso D'Este Holy Bible. To enhance the interaction possibilities with this work, which is full of decorations and illustrations, we exploit some well-known document analysis techniques in addition to some new approaches, in order to achieve good segmentation of pages into meaningful visual objects with their corresponding annotations. We extend the standard keyword-based retrieval approach in a commentary with a modern visual retrieval by appearance similarity, delivered as a complete software user interface for exploration and visual search of illuminated manuscripts.

2008 Conference proceedings paper

A Markerless Approach for Consistent Action Recognition in a Multi-camera System

Authors: Calderara, Simone; Prati, Andrea; Cucchiara, Rita

This paper presents a method for recognizing human actions in a multi-camera setup. The proposed method automatically extracts significant points on the human body, without the need for artificial markers. An appearance-based tracker able to cope with occlusions is exploited to extract a probability map for each moving object. A segmentation technique based on a mixture of Gaussians is then employed to extract and track significant points on this map, corresponding to significant regions of the human silhouette. The point tracking produces a set of 3D trajectories that are compared with other trajectories by means of global alignment and dynamic programming techniques. Preliminary experiments show the potential of the proposed approach.

2008 Conference proceedings paper

Action Signature: a Novel Holistic Representation for Action Recognition

Authors: Calderara, Simone; Cucchiara, Rita; Prati, Andrea

Recognizing different actions with a single approach can be a difficult task. This paper proposes a novel holistic representation of actions that we call "action signature". This 1D trajectory is obtained by parsing the 2D image containing the orientations of the gradient calculated on the motion feature map called the motion-history image. In this way, the trajectory is a sketch representation of how the object motion varies in time. A robust statistical framework based on mixtures of von Mises distributions and dynamic programming for sequence alignment are used to compare and classify actions/trajectories. The experimental results show rather high accuracy in distinguishing quite complicated actions, such as drinking, jumping, or abandoning an object.

2008 Conference proceedings paper

Bayesian-competitive Consistent Labeling for People Surveillance

Authors: Calderara, Simone; Cucchiara, Rita; Prati, Andrea

Published in: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

This paper presents a novel and robust approach to consistent labeling for people surveillance in multi-camera systems. A general framework scalable to any number of cameras with overlapped views is devised. An off-line training process automatically computes the ground-plane homography and recovers the epipolar geometry. When a new object is detected in any one camera, hypotheses for potential matching objects in the other cameras are established. Each hypothesis is evaluated using a prior and a likelihood value. The prior accounts for the positions of the potential matching objects, while the likelihood is computed by warping the vertical axis of the new object onto the field of view of the other cameras and measuring the amount of match. In the likelihood, two contributions (forward and backward) are considered so as to correctly handle the case of groups of people merged into single objects. Finally, a maximum-a-posteriori approach estimates the best label assignment for the new object. Comparisons with other methods based on homography and extensive outdoor experiments demonstrate that the proposed approach is accurate and robust in coping with segmentation errors and in disambiguating groups.
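The final maximum-a-posteriori step described in this abstract can be sketched as follows. The sketch assumes the prior and the likelihood of each hypothesis are already available as numbers; how the paper computes them (position-based prior, forward and backward axis warping) is not reproduced here. The function name `map_label` and the label names are hypothetical.

```python
def map_label(hypotheses):
    """Pick the label with the highest posterior.

    hypotheses maps a candidate label to a (prior, likelihood) pair.
    Returns (best_label, posteriors); best_label is None if every score is zero.
    """
    # Unnormalized posterior: prior times likelihood for each hypothesis.
    scores = {label: p * l for label, (p, l) in hypotheses.items()}
    total = sum(scores.values())
    if total == 0:
        return None, {label: 0.0 for label in scores}
    posteriors = {label: s / total for label, s in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors

# Hypothetical candidates seen in the other cameras: (prior, likelihood).
hyp = {"person_3": (0.5, 0.2), "person_7": (0.3, 0.9), "person_9": (0.2, 0.1)}
best, post = map_label(hyp)
print(best)  # prints person_7
```

Here `person_3` has the largest prior, yet `person_7` is selected because its likelihood is much stronger, so the product of the two decides the assignment.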

2008 Journal article

HECOL: Homography and Epipolar-based Consistent Labeling for Outdoor Park Surveillance

Authors: Calderara, Simone; Prati, Andrea; Cucchiara, Rita

Published in: COMPUTER VISION AND IMAGE UNDERSTANDING

Outdoor surveillance is one of the most attractive applications of video processing and analysis. Robust algorithms must be defined and tuned to cope with the non-idealities of outdoor scenes. For instance, in a public park, an automatic video surveillance system must discriminate between shadows, reflections, waving trees, people standing still or moving, and other objects. Visual knowledge coming from multiple cameras can disambiguate cluttered and occluded targets by providing a continuous consistent labeling of tracked objects among the different views. This work proposes a new approach for coping with this problem in multi-camera systems with overlapped Fields of View (FoVs). The presence of overlapped zones allows the definition of a geometry-based approach to reconstruct correspondences between FoVs, using only homography and epipolar lines (hereinafter HECOL: Homography and Epipolar-based COnsistent Labeling) computed automatically in a training phase. We also propose a complete system that provides segmentation and tracking of people in each camera module. Segmentation is performed by means of the SAKBOT (Statistical and Knowledge Based Object Tracker) approach, suitably modified to cope with multi-modal backgrounds, reflections and other artefacts typical of outdoor scenes. The extracted objects are tracked using a statistical appearance model robust against occlusions and segmentation errors. The main novelty of this paper is the approach to consistent labeling. A specific Camera Transition Graph is adopted to efficiently select the possible correspondence hypotheses between labels. A Bayesian MAP optimization assigns consistent labels to objects detected from several points of view: the object axis is computed from the shape tracked in each camera module, and homography and epipolar lines allow a correct axis warping onto the other image planes. Both forward and backward probability contributions from the two different warping directions make the approach robust against segmentation errors and capable of disambiguating groups of people. The system has been tested in a real setup in an urban public park, within the Italian LAICA (Laboratory of Ambient Intelligence for a friendly city) project. The experiments show how the system can correctly track and label objects in a distributed system with real-time performance. Comparisons with simpler consistent labeling methods and extensive outdoor experiments with ground truth demonstrate the accuracy and robustness of the proposed approach.

2008 Journal article

Reliable smoke detection system in the domains of image energy and color

Authors: Piccinini, Paolo; Calderara, Simone; Cucchiara, Rita

Published in: PROCEEDINGS - INTERNATIONAL CONFERENCE ON IMAGE PROCESSING

Smoke detection calls for a reliable and fast distinction between background, moving objects and variable shapes that are recognizable as smoke. In our system we propose a stable background suppression module joined with a smoke detection module working on segmented objects. It exploits two features: the energy variation in the wavelet domain and a color model of the smoke. The decrease of the energy ratio in the wavelet domain between the background and the current image is a clue for detecting smoke, since it represents the variation in texture level. A mixture of Gaussians models the temporal evolution of this texture ratio. The color model is used as a reference to measure the deviation of the current pixel color from the model. The two features are combined using a Bayesian classifier to detect smoke in the scene. Experiments on real data and a comparison between our background model and the Gaussian Mixture (MoG) model for smoke detection are presented. © 2008 IEEE.
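The energy-ratio feature described in this abstract can be illustrated with a minimal sketch. It assumes a one-level 2D Haar transform and takes the energy to be the sum of squared detail coefficients; the paper's actual wavelet, block size and thresholds are not given here. The function names and the toy images are hypothetical, and smoke is imitated simply by blending the background toward gray.

```python
def haar_detail_energy(img):
    """Sum of squared one-level 2D Haar detail coefficients.

    img is a grayscale image as a list of rows with even width and height.
    """
    energy = 0.0
    for r in range(0, len(img), 2):
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            col_diff = (a - b + d - e) / 2.0   # left column minus right column
            row_diff = (a + b - d - e) / 2.0   # top row minus bottom row
            diag_diff = (a - b - d + e) / 2.0  # diagonal difference
            energy += col_diff ** 2 + row_diff ** 2 + diag_diff ** 2
    return energy

def energy_ratio(background, current):
    """Detail energy of the current image relative to the background model."""
    bg = haar_detail_energy(background)
    return haar_detail_energy(current) / bg if bg > 0 else None

# Textured toy background, and the same scene blended halfway toward gray.
background = [[0, 100, 0, 100], [100, 0, 100, 0],
              [0, 100, 0, 100], [100, 0, 100, 0]]
current = [[0.5 * v + 64 for v in row] for row in background]
print(energy_ratio(background, current))  # prints 0.25
```

A ratio well below 1 means the texture has been smoothed out relative to the background, which is the kind of clue the abstract associates with smoke.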

2008 Conference proceedings paper

Smoke detection in video surveillance: A MoG model in the wavelet domain

Authors: Calderara, Simone; Piccinini, Paolo; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

The paper presents a new fast and robust technique for smoke detection in video surveillance images. The approach aims at detecting the onset or the presence of smoke by analyzing color and texture features of moving objects, segmented with background subtraction. The proposal embodies some novelties. First, the temporal behavior of the smoke is modeled by a Mixture of Gaussians (MoG) of the energy variation in the wavelet domain. The MoG takes into account the image energy variation due to either external luminance changes or smoke propagation, and allows it to be distinguished from the energy variation due to the presence of real moving objects such as people and vehicles. Second, this textural analysis is enriched by a color analysis based on the blending function. Third, a Bayesian model is defined in which the texture and color features, detected at block level, contribute to modeling the likelihood, while a global evaluation of the entire image models the prior probability contribution. The resulting approach is very flexible and can be adopted in conjunction with any video surveillance system based on a dynamic background model. Several tests in tens of different contexts, both outdoor and indoor, prove its robustness and precision. © 2008 Springer-Verlag Berlin Heidelberg.

2008 Conference proceedings paper

Smoke detection in videosurveillance: the use of VISOR (Video Surveillance On-line Repository)

Authors: Vezzani, Roberto; Calderara, Simone; Piccinini, Paolo; Cucchiara, Rita

ViSOR (VIdeo Surveillance Online Repository) is a large video repository, designed for containing annotated video surveillance footage, comparing annotations, evaluating system performance, and performing retrieval tasks. The web interface allows video browsing, querying by annotated concepts or by keywords, compressed video preview, and media download and upload. The repository contains metadata annotations, both manually created ground-truth data and automatically obtained outputs of particular systems. An example application is the collection of videos and annotations for smoke detection, an important video surveillance task. In this paper we present the architecture of ViSOR, the built-in surveillance ontology, which integrates many concepts, including some from LSCOM and MediaMill, the annotation tools, and the visualization of results for performance evaluation. The annotation is obtained with an automatic smoke detection system, capable of detecting people, moving objects, and smoke in real time.

2008 Conference proceedings paper

Using circular statistics for trajectory shape analysis

Authors: Prati, Andrea; Calderara, Simone; Cucchiara, Rita

Published in: PROCEEDINGS - IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

The analysis of patterns of movement is a crucial task for several surveillance applications, for instance to classify people's trajectories as normal or abnormal on the basis of their occurrence. This paper proposes to model the shape of a single trajectory as a sequence of angles described using a Mixture of Von Mises (MoVM) distribution. A complete EM (Expectation Maximization) algorithm is derived for MoVM parameter estimation, and an on-line version is proposed to meet real-time requirements. Maximum-A-Posteriori is used to encode the trajectory as a sequence of symbols corresponding to the MoVM components. Iterative k-medoids clustering groups trajectories into a variable number of similarity classes. The similarity is computed by aligning two sequences (with dynamic programming) and taking as the symbol-to-symbol distance the Bhattacharyya distance between von Mises distributions. Extensive experiments have been performed on both synthetic and real data. ©2008 IEEE.
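The similarity measure described in this abstract can be sketched under stated assumptions. The Bhattacharyya distance is taken here as the negative log of the Bhattacharyya coefficient, which for two von Mises densities has the closed form I0(R/2) / sqrt(I0(k1) I0(k2)), with R² = k1² + k2² + 2·k1·k2·cos(mu1 − mu2). The alignment is a plain global alignment with a constant gap penalty. The paper's exact distance variant, gap handling and parameters are not stated here, so the function names, the `gap` value and the sample components are hypothetical.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term, q = 1.0, 1.0, (x * x) / 4.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def bhattacharyya_vm(mu1, k1, mu2, k2):
    """Bhattacharyya distance (-ln of the coefficient) between two von Mises."""
    r2 = k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * math.cos(mu1 - mu2)
    r = math.sqrt(max(0.0, r2))  # guard against tiny negative rounding
    bc = bessel_i0(r / 2.0) / math.sqrt(bessel_i0(k1) * bessel_i0(k2))
    return -math.log(min(bc, 1.0))

def align_cost(seq_a, seq_b, components, gap=1.0):
    """Global alignment cost of two symbol sequences by dynamic programming.

    components maps each symbol to its (mean direction, concentration) pair.
    """
    n, m = len(seq_a), len(seq_b)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = bhattacharyya_vm(*components[seq_a[i - 1]],
                                 *components[seq_b[j - 1]])
            cost[i][j] = min(cost[i - 1][j - 1] + d,  # match the two symbols
                             cost[i - 1][j] + gap,    # skip a symbol of seq_a
                             cost[i][j - 1] + gap)    # skip a symbol of seq_b
    return cost[n][m]

# Hypothetical MoVM components: symbol -> (mu, kappa).
components = {0: (0.0, 4.0), 1: (math.pi / 2, 4.0), 2: (math.pi, 4.0)}
same = align_cost([0, 0, 1], [0, 0, 1], components)
different = align_cost([0, 0, 1], [0, 0, 2], components)
print(same < different)  # prints True
```

A pairwise cost of this kind could then feed a k-medoids clustering, which only needs distances between trajectories rather than an explicit feature space.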

2008 Conference proceedings paper
