Publications - AImageLab

Historical Handwritten Text Images Word Spotting through Sliding Window HOG Features

Authors: Bolelli, Federico; Borghi, Guido; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper we present an innovative technique to semi-automatically index handwritten word images. The proposed method is based on HOG descriptors and exploits the Dynamic Time Warping technique to compare feature vectors computed from single handwritten words. Our strategy is applied to a new, challenging dataset extracted from Italian civil registries of the 19th century. Experimental results, compared with some previously developed word spotting strategies, confirm that our method outperforms its competitors.
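The comparison step described in the abstract can be illustrated with a minimal Dynamic Time Warping sketch. This is not the authors' implementation: the function name `dtw_distance`, the Euclidean frame distance, and the toy two-dimensional vectors (standing in for per-window HOG descriptors, whose extraction is omitted) are all assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of equal-length feature vectors.

    Each sequence is a list of vectors, e.g. one descriptor per sliding-window
    position along a word image (hypothetical input layout).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] = minimal cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two frames (an assumed choice).
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


# Toy data: the second sequence is a time-stretched copy of the first.
query = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
stretched = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
different = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

print(dtw_distance(query, stretched))  # stretched copy aligns with zero cost
print(dtw_distance(query, different))  # a dissimilar sequence costs more
```

Because DTW allows one frame to match several frames of the other sequence, words written at different widths can still be compared, which is presumably why it suits sliding-window features.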

2017 Conference Proceedings Paper

Learning to Map Vehicles into Bird's Eye View

Authors: Palazzi, Andrea; Borghi, Guido; Abati, Davide; Calderara, Simone; Cucchiara, Rita

Awareness of the road scene is an essential component for both autonomous vehicles and Advanced Driver Assistance Systems, and it is gaining importance in both academia and car companies. This paper presents a way to learn a semantic-aware transformation that maps detections from a dashboard camera view onto a broader bird's eye occupancy map of the scene. To this end, a huge synthetic dataset featuring 1M pairs of frames, taken from both the car dashboard and a bird's eye view, has been collected and automatically annotated. A deep network is then trained to warp detections from the first view to the second. We demonstrate the effectiveness of our model against several baselines and observe that it is able to generalize to real-world data despite having been trained solely on synthetic data.

2017 Conference Proceedings Paper

POSEidon: Face-from-Depth for Driver Pose Estimation

Authors: Borghi, Guido; Venturelli, Marco; Vezzani, Roberto; Cucchiara, Rita

Published in: PROCEEDINGS - IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

Fast and accurate upper-body and head pose estimation is a key task for automatic monitoring of driver attention, a challenging context characterized by severe illumination changes, occlusions and extreme poses. In this work, we present a new deep learning framework for head localization and pose estimation on depth images. The core of the proposal is a regression neural network, called POSEidon, which is composed of three independent convolutional nets followed by a fusion layer, specially conceived for understanding the pose from depth. In addition, to recover the intrinsic value of face appearance for understanding head position and orientation, we propose a new Face-from-Depth approach for learning face images from depth. Results in face reconstruction are qualitatively impressive. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Results show that our method outperforms all recent state-of-the-art works, running in real time at more than 30 frames per second.

2017 Conference Proceedings Paper

Fast gesture recognition with Multiple Stream Discrete HMMs on 3D Skeletons

Authors: Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

HMMs are widely used in action and gesture recognition due to their implementation simplicity, low computational requirements, scalability and high parallelism. They perform well even with a limited training set. All these characteristics are hard to find together in other, even more accurate, methods. In this paper, we propose a novel double-stage classification approach, based on Multiple Stream Discrete Hidden Markov Models (MSD-HMM) and 3D skeleton joint data, able to reach high performance while maintaining all the advantages listed above. The approach allows both quick classification of pre-segmented gestures (offline classification) and temporal segmentation of streams of gestures (online classification) faster than real time. We test our system on three public datasets, MSRAction3D, UTKinect-Action and MSRDailyAction, and on a new dataset, Kinteract Dataset, explicitly created for Human Computer Interaction (HCI). We obtain state-of-the-art performance on all of them.
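As a rough illustration of offline classification with discrete HMMs, the sketch below scores a pre-segmented gesture with the scaled forward algorithm and picks the best-scoring gesture model. It is not the paper's MSD-HMM: the function names, the toy model parameters, and in particular the choice to combine streams by summing per-stream log-likelihoods (treating streams as independent) are assumptions made for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, using the scaled forward algorithm.

    obs: list of symbol indices; start[s], trans[p][s], emit[s][symbol] are
    the usual initial, transition and emission probabilities.
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        # Rescale to avoid underflow; the log of the scales sums to log P(obs).
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def classify(streams, models):
    """Return the label whose per-stream HMMs best explain the observed streams.

    Assumption: streams are independent, so their log-likelihoods are summed.
    """
    def score(label):
        return sum(
            forward_log_likelihood(obs, *hmm)
            for obs, hmm in zip(streams, models[label])
        )
    return max(models, key=score)


# Hypothetical two-state, two-symbol models; each gesture has one HMM per stream.
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
up_hmm = ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]])
down_hmm = ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]])
models = {"up": [up_hmm, up_hmm], "down": [down_hmm, down_hmm]}

streams = [[0, 0, 1, 1], [0, 1, 1, 1]]  # quantized joint features, one list per stream
print(classify(streams, models))
```

The forward pass costs time linear in the sequence length for a fixed number of states, which is consistent with the low computational requirements the abstract attributes to HMMs.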

2016 Conference Proceedings Paper

Shot, scene and keyframe ordering for interactive video re-use

Authors: Baraldi, L.; Grana, C.; Borghi, G.; Vezzani, R.; Cucchiara, R.

This paper presents a complete system for shot and scene detection in broadcast videos, as well as a method to select the most representative key-frames, which could be used in new interactive interfaces for accessing large collections of edited videos. The final goal is to enable improved access to video footage and the re-use of video content through the direct management of user-selected video clips.

2016 Conference Proceedings Paper
