Publications - AImageLab

An Indoor Location-aware System for an IoT-based Smart Museum

Authors: Alletto, Stefano; Cucchiara, Rita; Del Fiore, Giuseppe; Mainetti, Luca; Mighali, Vincenzo; Patrono, Luigi; Serra, Giuseppe

Published in: IEEE INTERNET OF THINGS JOURNAL

The new technologies characterizing the Internet of Things (IoT) make it possible to build truly smart environments able to provide advanced services to users. Recently, these smart environments have also been exploited to renew users' interest in cultural heritage by guaranteeing truly interactive cultural experiences. In this paper, we design and validate an indoor location-aware architecture able to enhance the user experience in a museum. In particular, the proposed system relies on a wearable device that combines image recognition and localization capabilities to automatically provide users with cultural content related to the observed artworks. The localization information is obtained from a Bluetooth Low Energy (BLE) infrastructure installed in the museum. Moreover, the system interacts with the Cloud to store multimedia content produced by the user and to share environment-generated events on his/her social networks. Finally, several location-aware services running in the system control the environment status, also according to users' movements. These services interact with physical devices through a multiprotocol middleware. The system has been designed to be easily extensible to other IoT technologies, and its effectiveness has been evaluated in the MUST museum, Lecce, Italy.
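
The abstract does not say how BLE readings are turned into a position. As a rough illustration only, the sketch below assumes room-level localization by picking the beacon with the strongest recent average RSSI. The function name `locate_room`, the beacon identifiers, and the smoothing window are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def locate_room(readings, window=5):
    """Return the beacon id with the highest mean RSSI over its last `window` readings.

    readings: list of (beacon_id, rssi_dbm) tuples in chronological order.
    Assumption: one beacon per room, so the strongest beacon names the room.
    """
    per_beacon = defaultdict(list)
    for beacon_id, rssi in readings:
        per_beacon[beacon_id].append(rssi)
    # Average only the most recent readings to smooth out RSSI noise.
    averages = {b: mean(values[-window:]) for b, values in per_beacon.items()}
    return max(averages, key=averages.get)


# Hypothetical scan results: RSSI is in dBm, so values closer to 0 are stronger.
scan = [("room_a", -70), ("room_b", -60), ("room_a", -72), ("room_b", -58)]
current_room = locate_room(scan)
```

Averaging over a short window is one simple way to damp the fluctuations typical of RSSI; a deployed system would likely need calibration per environment.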

2016 Journal article

Analysis and Re-use of Videos in Educational Digital Libraries with Automatic Scene Detection

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

The advent of modern approaches to education, like Massive Open Online Courses (MOOCs), has made video the basic medium for educating and transmitting knowledge. However, IT tools are still not adequate to allow video content re-use, tagging, annotation and personalization. In this paper we analyze the problem of identifying coherent sequences, called scenes, in order to provide users with a more manageable editing unit. A simple spectral clustering technique is proposed and compared with state-of-the-art results. We also discuss correct ways to evaluate the performance of automatic scene detection algorithms.

2016 Paper in conference proceedings

Bridging the experiential gap in cultural visits with computer vision

Authors: Cucchiara, R.; Del Bimbo, A.

This paper discusses the role of computer vision in bridging the experiential gap between the cultural and emotional experience of visitors in museums or cultural heritage sites. We do not argue against the use of multiple sensors to provide a more complete cultural experience, but claim a primary role for computer vision in such a task. Although many research challenges are still far from being solved effectively, especially for detection, re-identification, tracking and recognition, we believe that the technology can already be deployed in real contexts and support concrete applications, with interesting results that will open the door to valuable future applications.

2016 Paper in conference proceedings

Context Change Detection for an Ultra-Low Power Low-Resolution Ego-Vision Imager

Authors: Paci, Francesco; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita; Benini, Luca

Published in: LECTURE NOTES IN COMPUTER SCIENCE

With the increasing popularity of wearable cameras, such as GoPro or Narrative Clip, research on continuous activity monitoring from egocentric cameras has received a lot of attention. Research in hardware and software is devoted to finding new efficient, stable and long-running solutions; however, devices are too power-hungry for truly always-on operation, and are aggressively duty-cycled to achieve acceptable lifetimes. In this paper we present a wearable system for context change detection based on an egocentric camera with ultra-low power consumption that can collect data 24/7. Although the resolution of the captured images is low, experimental results in real scenarios demonstrate how our approach, based on Siamese Neural Networks, can achieve visual context awareness. In particular, we compare our solution with hand-crafted features and with state-of-the-art techniques, and propose a novel and challenging dataset composed of roughly 30,000 low-resolution images.

2016 Paper in conference proceedings

Exploring Architectural Details Through a Wearable Egocentric Vision Device

Authors: Alletto, Stefano; Abati, Davide; Serra, Giuseppe; Cucchiara, Rita

Published in: SENSORS

Augmented user experiences in the cultural heritage domain are in increasing demand among the new digital-native tourists of the 21st century. In this paper, we propose a novel solution that aims at assisting the visitor during an outdoor tour of a cultural site using the unique first-person perspective of wearable cameras. In particular, the approach exploits computer vision techniques to retrieve the details by proposing a robust descriptor based on the covariance of local features. Using a lightweight wearable board, the solution can localize the user with respect to the 3D point cloud of the historical landmark and provide him with information about the details he is currently looking at. Experimental results validate the method both in terms of accuracy and computational effort. Furthermore, user evaluation based on real-world experiments shows that the proposal is deemed effective in enriching a cultural experience.
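
To make the idea of a descriptor "based on the covariance of local features" concrete, the sketch below computes a plain sample covariance matrix over a set of local feature vectors and flattens its upper triangle into a fixed-length vector. This is only the generic building block; the paper's actual descriptor and any further processing it applies are not reproduced here, and the function names are hypothetical.

```python
def covariance_matrix(features):
    """Sample covariance (d x d) of a list of n feature vectors of dimension d."""
    n, d = len(features), len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    return [
        [
            sum((f[a] - mean[a]) * (f[b] - mean[b]) for f in features) / (n - 1)
            for b in range(d)
        ]
        for a in range(d)
    ]


def covariance_descriptor(features):
    """Flatten the upper triangle of the covariance matrix into one vector.

    The matrix is symmetric, so the upper triangle holds all the information.
    The output length is d * (d + 1) / 2 regardless of how many local
    features were extracted from the image region.
    """
    cov = covariance_matrix(features)
    d = len(cov)
    return [cov[a][b] for a in range(d) for b in range(a, d)]


# Three toy 2-D local features (values chosen only for illustration).
toy_features = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
descriptor = covariance_descriptor(toy_features)
```

A useful property of this construction is that regions with different numbers of local features still yield descriptors of the same size, which makes them directly comparable.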

2016 Journal article

Fast gesture recognition with Multiple Stream Discrete HMMs on 3D Skeletons

Authors: Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

HMMs are widely used in action and gesture recognition due to their implementation simplicity, low computational requirements, scalability and high parallelism. They achieve worthwhile performance even with a limited training set. All these characteristics are hard to find together in other, even more accurate, methods. In this paper, we propose a novel double-stage classification approach, based on Multiple Stream Discrete Hidden Markov Models (MSD-HMM) and 3D skeleton joint data, able to reach high performance while maintaining all the advantages listed above. The approach allows both quick classification of pre-segmented gestures (offline classification) and temporal segmentation of streams of gestures (online classification) faster than real time. We test our system on three public datasets, MSRAction3D, UTKinect-Action and MSRDailyAction, and on a new dataset, Kinteract Dataset, explicitly created for Human Computer Interaction (HCI). We obtain state-of-the-art performance on all of them.
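
The offline classification step can be illustrated with a generic discrete HMM classifier: one model per gesture, scored with the scaled forward algorithm, with the highest log-likelihood winning. The sketch below is not the paper's MSD-HMM; in particular, it assumes that the symbol streams are independent given the hidden state, and the gesture names and all probabilities are invented toy values.

```python
import math


def forward_log_likelihood(obs, start, trans, emits):
    """Log P(obs | model) via the scaled forward algorithm.

    obs   : list of tuples, one symbol index per stream at each time step
    start : start[i]       = P(first state is i)
    trans : trans[i][j]    = P(next state is j | current state is i)
    emits : emits[s][i][k] = P(symbol k on stream s | state i)
    """
    n = len(start)

    def emission(state, symbols):
        # Assumption: streams are conditionally independent given the state.
        p = 1.0
        for stream, symbol in enumerate(symbols):
            p *= emits[stream][state][symbol]
        return p

    alpha = [start[i] * emission(i, obs[0]) for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emission(j, obs[t])
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik


def classify(obs, models):
    """Return the name of the model that assigns obs the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Two toy left-to-right models with 2 states, 2 streams and 2 symbols per stream.
start = [1.0, 0.0]
trans = [[0.7, 0.3], [0.0, 1.0]]
low_then_high = [[[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]]
high_then_low = [[[0.1, 0.9], [0.9, 0.1]], [[0.2, 0.8], [0.8, 0.2]]]
models = {
    "wave": (start, trans, low_then_high),
    "push": (start, trans, high_then_low),
}
label = classify([(0, 0), (0, 0), (1, 1), (1, 1)], models)
```

Because each model is scored independently, the per-gesture evaluations can run in parallel, which is consistent with the parallelism the abstract attributes to HMMs.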

2016 Paper in conference proceedings

Historical Document Digitization through Layout Analysis and Deep Content Classification

Authors: Corbelli, Andrea; Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

Document layout segmentation and recognition is an important task in the creation of digitized document collections, especially when dealing with historical documents. This paper presents a hybrid approach to layout segmentation as well as a strategy to classify document regions, which is applied to the process of digitization of a historical encyclopedia. Our layout analysis method merges a classic top-down approach and a bottom-up classification process based on local geometrical features, while regions are classified by means of features extracted from a Convolutional Neural Network merged in a Random Forest classifier. Experiments are conducted on the first volume of the "Enciclopedia Treccani", a large dataset containing 999 manually annotated pages from the historical Italian encyclopedia.

2016 Paper in conference proceedings

Layout analysis and content enrichment of digitized books

Authors: Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Coppi, Dalia; Cucchiara, Rita

Published in: MULTIMEDIA TOOLS AND APPLICATIONS

In this paper we describe a system for automatically analyzing old documents and creating hyperlinks between different epochs, thus opening ancient documents to young people and making them available on the web with old and current content. We propose a supervised learning approach to segment text and illustrations in digitized old documents using a texture feature based on local correlation, aimed at detecting the repeating patterns of text regions and differentiating them from pictorial elements. Moreover, we present a solution to help the user find contemporary content connected to what is automatically extracted from the ancient documents.

2016 Journal article

Multi-Level Net: a Visual Saliency Prediction Model

Authors: Cornia, Marcella; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

State-of-the-art approaches for saliency prediction are based on Fully Convolutional Networks, in which saliency maps are built using the last layer. In contrast, we here present a novel model that predicts saliency maps by exploiting a non-linear combination of features coming from different layers of the network. We also present a new loss function to deal with the imbalance issue on saliency masks. Extensive results on three public datasets demonstrate the robustness of our solution. Our model outperforms the state of the art on SALICON, which is the largest unconstrained dataset available, and obtains competitive results on the MIT300 and CAT2000 benchmarks.

2016 Paper in conference proceedings

Optimizing image registration for interactive applications

Authors: Gasparini, Riccardo; Alletto, Stefano; Serra, Giuseppe; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

With the spread of wearable and mobile devices, the demand for interactive augmented reality applications is constantly growing. Among the different possibilities, we focus on the cultural heritage domain, where a key step in the development of applications for augmented cultural experiences is to obtain a precise localization of the user, i.e. the 6-degree-of-freedom pose of the camera acquiring the images used by the application. Current state-of-the-art methods perform this task by extracting local descriptors from a query and exhaustively matching them to a sparse 3D model of the environment. While this procedure obtains good localization performance, the vast search space involved in the retrieval of 2D-3D correspondences often makes it infeasible in real-time and interactive environments. In this paper we hence propose to perform descriptor quantization to reduce the search space and employ multiple KD-Trees combined with a principal component analysis dimensionality reduction to enable an efficient search. We experimentally show that our solution can halve the computational requirements of the correspondence search with regard to the state of the art while maintaining similar accuracy levels.
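
The search structure mentioned in the abstract can be illustrated with a single exact KD-tree over descriptor vectors. This sketch deliberately omits the descriptor quantization, the PCA reduction, and the use of multiple trees described above; the random 4-dimensional vectors stand in for real descriptors, and all names are hypothetical.

```python
import random


def build_kdtree(points, depth=0):
    """Recursively build a KD-tree, splitting on one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # the median point becomes this node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Exact nearest neighbour of `query`, pruning subtrees that cannot win."""
    if node is None:
        return best
    if best is None or sq_dist(query, node["point"]) < sq_dist(query, best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(query, best):
        best = nearest(far, query, best)
    return best


# Stand-in data: 500 random 4-D vectors in place of real image descriptors.
rng = random.Random(0)
descriptors = [tuple(rng.random() for _ in range(4)) for _ in range(500)]
tree = build_kdtree(descriptors)
query = (0.5, 0.5, 0.5, 0.5)
match = nearest(tree, query)
```

KD-tree pruning becomes less effective as dimensionality grows, which is one plausible reason for pairing the trees with a dimensionality reduction step as the abstract describes.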

2016 Paper in conference proceedings
