Publications - AImageLab

A people counting system for business analytics

Authors: Pane, C.; Gasparini, M.; Prati, A.; Gualdi, G.; Cucchiara, R.

This paper deals with people counting in stores for business analytics using stereo vision. Among the several problems in this type of application, two are the most relevant for our purposes: the management of occlusions and the distinction between adults (potential customers) and other objects (children, trolleys, strollers, animals, etc.). The proposed solution uses a novel approach to object detection (based on background suppression on a so-called 'depth bird-eye view' and clustering of the 3D point cloud by means of mean shift with a cylindrical kernel), followed by an adult classifier that exploits a fitness measure with respect to a cylindrical human body model. The fitness is computed using Monte Carlo sampling to estimate the volume occupation. Experiments are conducted on two real setups (including a store on a normal day of activity) and compared with previous work. The results demonstrate the accuracy of the proposed solution. © 2013 IEEE.
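
The Monte Carlo volume estimate mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, an upright cylinder as the body model and a caller-supplied occupancy predicate; the function names, parameters and the toy occupancy test are hypothetical and are not taken from the paper.

```python
import math
import random


def sample_in_cylinder(rng, radius, height):
    """Draw one point uniformly from an upright cylinder centred on the z axis."""
    # The square root keeps the density uniform over the disc area.
    r = radius * math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    return (r * math.cos(theta), r * math.sin(theta), height * rng.random())


def occupied_fraction(is_occupied, radius, height, n_samples=20000, seed=0):
    """Estimate the fraction of the cylinder volume where is_occupied(p) holds."""
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(n_samples)
        if is_occupied(sample_in_cylinder(rng, radius, height))
    )
    return hits / n_samples


# Toy occupancy (an assumption): everything below half the cylinder height
# counts as occupied, so the estimate should come out close to 0.5.
fraction = occupied_fraction(lambda p: p[2] < 0.9, radius=0.3, height=1.8)
print(round(fraction, 2))
```

In a real system the predicate would be replaced by a test against the measured 3D point cloud; the sampling and counting logic stays the same.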

2013 Conference proceedings paper

AN AUTOMATED PICKING WORKSTATION FOR HEALTHCARE APPLICATIONS

Authors: Piccinini, P.; Gamberini, Rita; Prati, A.; Rimini, Bianca; Cucchiara, Rita

Published in: COMPUTERS & INDUSTRIAL ENGINEERING

The costs associated with the management of healthcare systems have been subject to continuous scrutiny for some time now, with a view to reducing them without affecting the quality perceived by final users. A number of different solutions have arisen based on the centralisation of healthcare services and investments in Information Technology (IT). One such example is the centralised management of pharmaceuticals among a group of hospitals, which is then incorporated into the different steps of the automation supply chain. This paper focuses on a new picking workstation that can be inserted into automated pharmaceutical distribution centres and is capable of replacing manual workstations and improving working time. The workstation uses a sophisticated computer vision algorithm to allow picking of very diverse and complex objects randomly placed on a belt or in bins. The algorithm exploits state-of-the-art feature descriptors for an approach that is robust against occlusions and distracting objects, and invariant to scale, rotation or illumination changes. Finally, the performance of the designed picking workstation is tested in a large-scale experiment focused on the management of pharmaceutical items.

2013 Journal article

Automatic Single-Image People Segmentation and Removal for Cultural Heritage Imaging

Authors: Manfredi, Marco; Grana, Costantino; Cucchiara, Rita

In this paper, the problem of automatic people removal from digital photographs is addressed. Removing unintended people from a scene can be very useful to focus further steps of image analysis only on the object of interest. A supervised segmentation algorithm is presented and tested in several scenarios.

2013 Conference proceedings paper

Beyond Bag of Words for Concept Detection and Search of Cultural Heritage Archives

Authors: Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Several local features have become quite popular for concept detection and search, due to their ability to capture distinctive details. Typically a Bag of Words approach is followed, where a codebook is built by quantizing the local features. In this paper, we propose to represent SIFT local features extracted from an image as a multivariate Gaussian distribution, obtaining a mean vector and a covariance matrix. Unlike common techniques based on the Bag of Words model, our solution does not rely on the construction of a visual vocabulary, thus removing the dependence of the image descriptors on the specific dataset and allowing the features to be immediately retargeted to different classification and search problems. Experiments are conducted on two very different Cultural Heritage image archives, composed of illuminated manuscript miniatures and pictures of architectural elements collected from the web, on which the proposed approach outperforms the Bag of Words technique in both classification and retrieval.
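
As a rough illustration of the idea, the sketch below summarises a set of local feature vectors by their mean and sample covariance and flattens both into one fixed-length vector. It is a simplified sketch, not the authors' implementation: the function name is hypothetical, tiny 2-D toy vectors stand in for SIFT descriptors, and any further mapping of the covariance matrix that the paper may apply is omitted.

```python
def gaussian_descriptor(features):
    """Summarise equal-length feature vectors by their mean and covariance.

    Returns the mean vector followed by the upper triangle of the sample
    covariance matrix (the matrix is symmetric, so the rest is redundant).
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    d = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    cov = [
        [
            sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in features) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    upper = [cov[i][j] for i in range(d) for j in range(i, d)]
    return mean + upper


# Three toy 2-D "local features" (an assumption made for the example).
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
descriptor = gaussian_descriptor(toy)
print(descriptor)
```

The descriptor length depends only on the feature dimensionality, not on a codebook learned from a particular dataset, which is the property the abstract contrasts with Bag of Words.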

2013 Conference proceedings paper

Hand Segmentation for Gesture Recognition in EGO-Vision

Authors: Serra, Giuseppe; Camurri, Marco; Baraldi, Lorenzo; Michela, Benedetti; Cucchiara, Rita

Portable devices for first-person camera views will play a central role in future interactive systems. One necessary step for feasible human-computer guided activities is gesture recognition, preceded by reliable hand segmentation from egocentric vision. In this work we provide a novel hand segmentation algorithm based on Random Forest superpixel classification that integrates light, time and space consistency. We also propose a gesture recognition method based on Exemplar SVMs, since it requires only a small set of positive samples and is therefore well suited to egocentric video applications. Furthermore, this method is enhanced by using segmented images instead of full frames during the test phase. Experimental results show that our hand segmentation algorithm outperforms state-of-the-art approaches and improves gesture recognition accuracy on both the publicly available EDSH dataset and our dataset designed for cultural heritage applications.

2013 Conference proceedings paper

Human Behavior Understanding with Wide Area Sensing Floors

Authors: Lombardi, Martino; Pieracci, Augusto; Santinelli, Paolo; Vezzani, Roberto; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Research on innovative and natural interfaces aims at developing devices able to capture and understand human behavior without the need for direct interaction. In this paper we propose and describe a framework based on a sensing floor device. The pressure field generated by people or objects standing on the floor is captured and analyzed. Local and global features are computed by a low-level processing unit and sent to high-level interfaces. The framework can be used in different applications, such as entertainment, education or surveillance. A detailed description of the sensing element and the processing architectures is provided, together with some sample applications developed to test the device capabilities.

2013 Conference proceedings paper

Image Classification with Multivariate Gaussian Descriptors

Authors: Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Techniques based on the Bag of Words approach represent images by quantizing local descriptors and summarizing their distribution in a histogram. In contrast, in this paper we describe an image as a multivariate Gaussian distribution, estimated over the extracted local descriptors. The estimated distribution is mapped to a high-dimensional descriptor by concatenating the mean vector and the projection of the covariance matrix on the Euclidean space tangent to the Riemannian manifold. To deal with large-scale datasets and high-dimensional feature spaces, a Stochastic Gradient Descent solver is adopted. The experimental results on Caltech-101 and ImageCLEF2011 show that the method obtains performance competitive with state-of-the-art approaches.

2013 Conference proceedings paper

Intelligent video surveillance as a service

Authors: Prati, A.; Vezzani, R.; Fornaciari, M.; Cucchiara, R.

Intelligent video surveillance has become an essential tool for several security-related applications. With the growth in the number of installed cameras and the increasing complexity of the required algorithms, in-house self-contained video surveillance systems have become a chimera for most institutions and (small) companies. The paradigm of Video Surveillance as a Service (VSaaS) helps distribute not only storage space in the cloud (necessary for handling large amounts of video data), but also infrastructure and computational power. This chapter briefly introduces the motivations and main characteristics of a VSaaS system, providing a case study in which research-lab computer vision algorithms are integrated into a VSaaS platform. The lessons learnt and some future directions on this topic are also highlighted.

2013 Book chapter/Essay

Learning articulated body models for people re-identification

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita

People re-identification is a challenging problem in surveillance and forensics that aims at associating multiple instances of the same person acquired from different points of view and after a temporal gap. Image-based appearance features are usually adopted but, in addition to their intrinsically low discriminability, they are subject to perspective and viewpoint issues. We propose to completely change the approach by mapping local descriptors extracted from RGB-D sensors onto a 3D body model to create a view-independent signature. An original bone-wise color descriptor is generated and reduced with PCA to compute the person signature. The virtual bone set used to map appearance features is learned using a recursive splitting approach. Finally, people matching for re-identification is performed using Relaxed Pairwise Metric Learning, which simultaneously provides feature reduction and weighting. Experiments on a specific dataset created with the Microsoft Kinect sensor and the OpenNI libraries show the advantages of the proposed technique with respect to state-of-the-art methods based on 2D or non-articulated 3D body models.

2013 Conference proceedings paper

Lightweight Sign Recognition for Mobile Devices

Authors: Fornaciari, Michele; Prati, Andrea; Grana, Costantino; Cucchiara, Rita

The diffusion of powerful mobile devices has laid the basis for new applications that implement sophisticated computer vision and pattern recognition algorithms directly on the devices (which are embedded devices). This paper describes the implementation of a complete system for the automatic recognition of places localized on a map through the recognition of significant signs by means of the camera of a mobile device (smartphone, tablet, etc.). The paper proposes a novel classification algorithm based on the innovative use of bag-of-words on ORB features. Recognition is achieved using a simple yet effective search scheme that exploits GPS localization to limit the possible matches. This simple solution brings several advantages, such as speed even on limited-resource devices, usability even with limited training samples, and ease of adaptation to new training samples and classes. The overall system is based on a REST-JSON client-server architecture. The experiments were conducted in a real scenario, evaluating the different parameters that influence performance.
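
The GPS-based pruning of candidate matches can be sketched as a simple distance filter applied before any visual matching. This is a minimal sketch under stated assumptions: the record layout, the function names, the 500 m radius and the coordinates are all made up for the example, and the paper's actual search scheme may differ.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_signs(signs, lat, lon, max_distance_m):
    """Keep only the sign records within max_distance_m of the device position."""
    return [
        s for s in signs
        if haversine_m(lat, lon, s["lat"], s["lon"]) <= max_distance_m
    ]


# Hypothetical sign database; names and coordinates are illustrative only.
signs = [
    {"name": "sign_a", "lat": 44.6290, "lon": 10.9480},
    {"name": "sign_b", "lat": 44.6300, "lon": 10.9500},
    {"name": "sign_c", "lat": 45.4642, "lon": 9.1900},
]
candidates = nearby_signs(signs, lat=44.6295, lon=10.9490, max_distance_m=500)
print([s["name"] for s in candidates])
```

Only the surviving candidates would then be compared visually, which keeps the per-query cost low on a limited-resource device.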

2013 Conference proceedings paper
