Publications - AImageLab

Modeling Local Descriptors with Multivariate Gaussians for Object and Scene Recognition

Authors: Serra, Giuseppe; Grana, Costantino; Manfredi, Marco; Cucchiara, Rita

Common techniques represent images by quantizing local descriptors and summarizing their distribution in a histogram. In this paper we propose to employ a parametric description and compare its capabilities to histogram-based approaches. We use the multivariate Gaussian distribution, applied over SIFT descriptors extracted with dense sampling on a spatial pyramid. Every distribution is converted to a high-dimensional descriptor by concatenating the mean vector and the projection of the covariance matrix on the Euclidean space tangent to the Riemannian manifold. Experiments on Caltech-101 and ImageCLEF2011 are performed using a Stochastic Gradient Descent solver, which makes it possible to handle large-scale datasets and high-dimensional feature spaces.
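The encoding step in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the tangent space is taken at the identity matrix, so the projection reduces to the matrix logarithm of the covariance (the abstract does not state the tangent point). The function name `gaussian_descriptor` and the regularization value `eps` are hypothetical, and SIFT extraction and the spatial pyramid are omitted.

```python
import numpy as np

def gaussian_descriptor(local_descriptors, eps=1e-3):
    """Encode an (n x d) set of local descriptors as [mean, vec(log(cov))].

    Assumption: the covariance is mapped to the tangent space at the
    identity, i.e. through the matrix logarithm. Requires n >= 2, d >= 2.
    """
    x = np.asarray(local_descriptors, dtype=np.float64)
    d = x.shape[1]
    mean = x.mean(axis=0)
    # Regularize so the covariance is strictly positive definite
    # (eps is an illustrative value, not taken from the paper).
    cov = np.cov(x, rowvar=False) + eps * np.eye(d)
    # Matrix logarithm of a symmetric positive definite matrix
    # via its eigendecomposition: V diag(log(lambda)) V^T.
    eigvals, eigvecs = np.linalg.eigh(cov)
    log_cov = (eigvecs * np.log(eigvals)) @ eigvecs.T
    # Keep the upper triangle; scale off-diagonal entries by sqrt(2) so the
    # Euclidean norm of the vector equals the Frobenius norm of log_cov.
    iu = np.triu_indices(d)
    weights = np.where(iu[0] == iu[1], 1.0, np.sqrt(2.0))
    return np.concatenate([mean, weights * log_cov[iu]])
```

For d-dimensional local descriptors the resulting vector has d + d(d+1)/2 entries, which is why the abstract speaks of a high-dimensional descriptor and a solver suited to high-dimensional feature spaces.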

2013 Conference proceedings paper

DOI IRIS

On the design of embedded solutions to banknote recognition

Authors: Rashid, A.; Prati, A.; Cucchiara, R.

Published in: OPTICAL ENGINEERING

Banknote recognition systems have many applications in the modern world of automatic monetary transaction machines. They are traditionally based on simple classifiers applied over manually selected areas. A new solution in this field, borrowed from content-based image retrieval (CBIR), which is based on dense scale-invariant feature transform features in a bag-of-words framework followed by a support vector machine (SVM) classifier, is explored. The proposed computer vision system for banknote recognition, on one hand, enables recognition at high accuracy and speed, and, on the other hand, provides a basis for further applications, e.g., counterfeit detection and fitness testing. This approach makes the system robust to various defects, which may occur during image acquisition or during the circulation life of a banknote. We implemented and tested three state-of-the-art classification techniques [SVM, artificial neural network (ANN), and hidden Markov model (HMM)] on an embedded platform. The comparative results are reported for accuracy with different sizes of the training datasets and with various types of datasets. In this framework, the SVM classifier outperforms ANN and HMM on the basis of speed and accuracy on our embedded platform. © 2013 Society of Photo-Optical Instrumentation Engineers.

2013 Journal article

DOI IRIS

People reidentification in surveillance and forensics: a Survey

Authors: Vezzani, Roberto; Baltieri, Davide; Cucchiara, Rita

Published in: ACM COMPUTING SURVEYS

The field of surveillance and forensics research is currently shifting focus and is now showing an ever-increasing interest in the task of people reidentification. This is the task of assigning the same identifier to all instances of a particular individual captured in a series of images or videos, even after the occurrence of significant gaps over time or space. People reidentification can be a useful tool for people analysis in security as a data association method for long-term tracking in surveillance. However, the identification techniques currently in use present many difficulties and shortcomings. For instance, they rely solely on the exploitation of visual cues such as color, texture, and the object's shape. Despite the many advances in this field, reidentification is still an open problem. This survey aims to tackle all the issues and challenging aspects of people reidentification while simultaneously describing the previously proposed solutions for the encountered problems. It begins with the first attempts at holistic descriptors and progresses to the more recently adopted 2D and 3D model-based approaches. The survey also includes an exhaustive treatment of all the aspects of people reidentification, including available datasets, evaluation metrics, and benchmarking.

2013 Journal article

DOI IRIS

Sensing floors for privacy-compliant surveillance of wide areas

Authors: Lombardi, Martino; Pieracci, Augusto; Santinelli, Paolo; Vezzani, Roberto; Cucchiara, Rita

Surveillance systems can benefit from the integration of multiple and heterogeneous sensors. In this paper we describe an innovative sensing floor. Thanks to its low cost and ease of installation, the floor is suitable for both private and public environments, from narrow zones to wide areas. The floor is made by adding a sensing layer below commercial floating tiles. The sensor is scalable, reliable, and completely invisible to the users. The temporal and spatial resolutions of the data are high enough to identify the presence of people, to recognize their behavior, and to detect events in a privacy-compliant way. Experimental results on a real prototype implementation confirm the potential of the framework.

2013 Conference proceedings paper

DOI IRIS

Structured learning for detection of social groups in crowd

Authors: Solera, Francesco; Calderara, Simone; Cucchiara, Rita

Group detection in crowds will play a key role in future behavior analysis surveillance systems. In this work we build a new Structural SVM-based learning framework able to solve the group detection task by exploiting annotated video data to deduce a sociologically motivated distance measure founded on Hall's proxemics and Granger's causality. We improve over state-of-the-art results even in the most crowded test scenarios, while keeping the classification time affordable for quasi-real time applications. A new scoring scheme specifically designed for the group detection task is also proposed.

2013 Conference proceedings paper

DOI IRIS

UNIMORE at ImageCLEF 2013: Scalable Concept Image Annotation

Authors: Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita; Martoglia, Riccardo; Mandreoli, Federica

Published in: CEUR WORKSHOP PROCEEDINGS

In this paper we propose a large-scale image annotation system for the Scalable Concept Image Annotation task. For each concept to be detected, a separate classifier is built using the provided textual annotation. Images are represented as a Multivariate Gaussian distribution of a set of local features extracted over a dense regular grid. Textual analysis of the web pages containing training images is performed to retrieve a relevant set of samples for learning each concept classifier. An online SVM solver based on Stochastic Gradient Descent is used to manage the large amount of training data. Experimental results show that the combination of different kinds of local features encoded with our strategy achieves very competitive performance in terms of both mAP and mean F-measure.
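As an illustration of what an online SVM solver based on Stochastic Gradient Descent looks like, the sketch below uses a Pegasos-style update for a linear SVM with hinge loss. The abstract does not name the specific solver, so the update rule, the absence of a bias term, and the names `sgd_linear_svm`, `lam`, and `epochs` are all assumptions made for the example.

```python
import numpy as np

def sgd_linear_svm(x, y, lam=1e-2, epochs=20, seed=0):
    """Train a linear SVM (no bias term) with Pegasos-style SGD.

    x: (n, d) feature matrix; y: (n,) labels in {-1, +1}.
    lam is the L2 regularization strength (illustrative default).
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        # Visit the samples one at a time in random order (online updates).
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)          # decreasing step size
            margin = y[i] * (x[i] @ w)     # margin under the current weights
            w *= 1.0 - eta * lam           # gradient step on the regularizer
            if margin < 1.0:               # hinge loss is active for this sample
                w += eta * y[i] * x[i]
    return w

def predict(w, x):
    """Return labels in {-1, +1} for the rows of x."""
    return np.where(x @ w >= 0.0, 1, -1)
```

Each update touches a single sample, so memory use does not grow with the size of the training set, which is the property the abstract relies on for large amounts of training data. One such classifier would be trained per concept.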

2013 Conference proceedings paper

IRIS

Video surveillance online repository (ViSOR)

Authors: Vezzani, Roberto; Cucchiara, Rita

This paper describes ViSOR (Video Surveillance Online Repository), designed with the aim of establishing an open platform for collecting, annotating, retrieving, and sharing surveillance videos, as well as evaluating the performance of automatic surveillance systems. The repository is free and researchers can collaborate by sharing their own videos or datasets. Most of the included videos are annotated. Annotations are based on a reference ontology which has been defined by integrating hundreds of concepts, some of them coming from the LSCOM and MediaMill ontologies. A new annotation classification schema is also provided, which is aimed at identifying the spatial, temporal and domain detail level used. The web interface allows video browsing, querying by annotated concepts or by keywords, compressed video previewing, and media downloading and uploading. Finally, ViSOR includes a performance evaluation desk which can be used to compare different annotations.

2013 Conference proceedings paper

DOI IRIS

2D Images Map Warping for Improved User Interaction

Authors: Borghesani, Daniele; Grana, Costantino; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

In this paper, we suggest an interaction model designed to fit users' expectations when using an image retrieval system. A lightweight relevance feedback strategy, working directly on the 2D projection of image features, allows the user to spatially navigate the media collection while meeting the real-time constraint. A preliminary evaluation of this relevance feedback strategy shows good performance compared with other known approaches.

2012 Conference proceedings paper

IRIS

A human vs. machine challenge in fashion color classification

Authors: Grana, C.; Borghesani, D.; Cucchiara, R.

Published in: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE

For this demo, we present a set of stark applications designed to evaluate the performance of a color similarity retrieval system against human operators' performance in the same tasks. The proposed series of tests gives some interesting insights into the perception of color classes and the reliability of manual annotation in the fashion context. © 2012 Springer-Verlag.

2012 Conference proceedings paper

DOI IRIS

Class-based color bag of words for fashion retrieval

Authors: Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita

Published in: PROCEEDINGS IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO

Color signatures, histograms and bags of colors are basic and effective strategies for describing the color content of images, for retrieving images by their color appearance, or for providing color annotation. In some domains, colors assume a specific meaning for users, and color-based classification and retrieval should mirror the initial suggestions given by users in the training set. For instance, in the fashion world, the names given to the dominant color of a garment or a dress reflect fashion dictates and not a uniform division of the color space. In this paper we propose a general approach to implement a color signature as a trained bag of words, defined on the basis of user-defined color classes. The novel Class-based Color Bag of Words is an easily computable bag of words of color, constructed following an approach similar to the Median Cut algorithm, but biased by the color distribution in the trained classes. Moreover, to dramatically reduce the computational effort we propose 3D integral histograms, a 3D extension of integral images, easily extensible to many histogram-based signatures in 3D color space. Several comparisons on large fashion datasets confirm the discriminant power of this signature.
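One plausible reading of the 3D integral histogram idea is sketched below; it is an interpretation of the abstract, not the authors' code. A 3D color histogram is turned into a cumulative table so that the number of pixels falling inside any axis-aligned box of color space is obtained from eight lookups, which is the kind of query a Median Cut style splitting procedure issues repeatedly. The function names, the 8-bit RGB input, and the `bins` parameter are assumptions.

```python
import numpy as np

def integral_histogram_3d(pixels, bins=32):
    """Cumulative 3D color histogram with a zero border.

    pixels: array of 8-bit color triples (any shape ending in 3).
    Returns an array of shape (bins + 1, bins + 1, bins + 1).
    """
    pixels = np.asarray(pixels).reshape(-1, 3)
    # Map each 0..255 channel value to a bin index in 0..bins-1.
    idx = (pixels.astype(np.int64) * bins) // 256
    hist = np.zeros((bins, bins, bins), dtype=np.int64)
    np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    # Cumulative sums along the three axes; the leading zero planes make
    # box queries free of boundary special cases.
    integral = np.zeros((bins + 1,) * 3, dtype=np.int64)
    integral[1:, 1:, 1:] = hist.cumsum(0).cumsum(1).cumsum(2)
    return integral

def box_count(integral, lo, hi):
    """Count pixels whose bin index lies in [lo, hi) on every axis,
    using 3D inclusion-exclusion on the cumulative table."""
    (r0, g0, b0), (r1, g1, b1) = lo, hi
    return (integral[r1, g1, b1]
            - integral[r0, g1, b1] - integral[r1, g0, b1] - integral[r1, g1, b0]
            + integral[r0, g0, b1] + integral[r0, g1, b0] + integral[r1, g0, b0]
            - integral[r0, g0, b0])
```

Once the table is built, every box query costs a constant number of operations regardless of the box size, which is where the reduction in computational effort would come from when many candidate splits of the color space are evaluated.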

2012 Conference proceedings paper

DOI IRIS
