Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

From Depth Data to Head Pose Estimation: a Siamese approach

Authors: Venturelli, Marco; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

The correct estimation of the head pose is a problem of great importance for many applications. For instance, it is an enabling technology in automotive for driver attention monitoring. In this paper, we tackle the pose estimation problem through a deep learning network working in a regression manner. Traditional methods usually rely on visual facial features, such as facial landmarks or nose tip position. In contrast, we exploit a Convolutional Neural Network (CNN) to perform head pose estimation directly from depth data. We exploit a Siamese architecture and propose a novel loss function to improve the learning of the regression network layer. The system has been tested on two public datasets, Biwi Kinect Head Pose and ICT-3DHP database. The reported results demonstrate the improvement in accuracy with respect to current state-of-the-art approaches and the real-time capabilities of the overall framework.

2017 Conference proceedings paper

From Groups to Leaders and Back. Exploring Mutual Predictability Between Social Groups and Their Leaders

Authors: Solera, Francesco; Calderara, Simone; Cucchiara, Rita

Recently, social theories and empirical observations identified small groups and leaders as the basic elements which shape a crowd. This leads to an intermediate level of abstraction that is placed between the crowd as a flow of people and the crowd as a collection of individuals. Consequently, automatic analysis of crowds in computer vision is also experiencing a shift in focus from individuals to groups and from small groups to their leaders. In this chapter, we present state-of-the-art solutions to the group and leader detection problem, which are able to account for physical factors as well as for sociological evidence observed over short time windows. The presented algorithms are framed as structured learning problems over the set of individual trajectories. However, the way trajectories are exploited to predict the structure of the crowd is not fixed but rather learned from recorded and annotated data, enabling the method to adapt these concepts to different scenarios, densities, cultures, and other unobservable complexities. Additionally, we investigate the relation between leaders and their groups and propose the first attempt to exploit leadership as prior knowledge for group detection.

2017 Book chapter/Essay

FuGePrior: A novel gene fusion prioritization algorithm based on accurate fusion structure analysis in cancer RNA-seq samples

Authors: Paciello, Giulia; Ficarra, Elisa

Published in: BMC BIOINFORMATICS

2017 Journal article

Generative Adversarial Models for People Attribute Recognition in Surveillance

Authors: Fabbri, Matteo; Calderara, Simone; Cucchiara, Rita

In this paper we propose a deep architecture for detecting people attributes (e.g., gender, race, clothing) in surveillance contexts. Our proposal explicitly deals with the poor resolution and occlusion issues that often occur in surveillance footage by enhancing the images by means of Deep Convolutional Generative Adversarial Networks (DCGAN). Experiments show that by combining our Generative Reconstruction and Deep Attribute Classification Network we can effectively extract attributes even when resolution is poor and in the presence of strong occlusions covering up to 80% of the whole person figure.

2017 Conference proceedings paper

Guest Editorial Special Issue on Wearable and Ego-Vision Systems for Augmented Experience

Authors: Serra, G.; Cucchiara, R.; Kitani, K. M.; Civera, J.

Published in: IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS

2017 Journal article

Hierarchical Boundary-Aware Neural Encoder for Video Captioning

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

Published in: PROCEEDINGS - IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can discover and leverage the hierarchical structure of the video. Unlike the classical encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose a novel LSTM cell which can identify discontinuity points between frames or segments and modify the temporal connections of the encoding layer accordingly. We evaluate our approach on three large-scale datasets: the Montreal Video Annotation dataset, the MPII Movie Description dataset and the Microsoft Video Description Corpus. Experiments show that our approach can discover appropriate hierarchical representations of input videos and improve the state-of-the-art results on movie description datasets.

2017 Conference proceedings paper

Historical Handwritten Text Images Word Spotting through Sliding Window HOG Features

Authors: Bolelli, Federico; Borghi, Guido; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper we present an innovative technique to semi-automatically index handwritten word images. The proposed method is based on HOG descriptors and exploits the Dynamic Time Warping technique to compare feature vectors extracted from single handwritten words. Our strategy is applied to a new challenging dataset extracted from Italian civil registries of the 19th century. Experimental results, compared with some previously developed word spotting strategies, confirm that our method outperforms competitors.

2017 Conference proceedings paper

Indexing of Historical Document Images: Ad Hoc Dewarping Technique for Handwritten Text

Authors: Bolelli, Federico

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

This work presents a research project, named XDOCS, aimed at extending to a much wider audience the possibility to access a variety of historical documents published on the web. The paper presents an overview of the indexing process that will be used to achieve this goal, focusing on the adopted dewarping technique. The proposed dewarping approach performs its task with the help of a transformation model which maps the projection of a curved surface to a 2D rectangular area. The novelty of this work is the possibility of applying dewarping to document images which contain both handwritten and typewritten text.

2017 Conference proceedings paper

isomiR-SEA: miRNA and isomiR expression level detection in seven RNA-Seq datasets

Authors: Urgese, Gianvito; Paciello, Giulia; Macii, Enrico; Acquaviva, Andrea; Ficarra, Elisa

Background: Massively parallel sequencing of transcriptomes revealed the presence of miRNA variants named isomiRs. The sequence variations identified within isomiR molecules can affect their targeting activity, with consequences for gene expression and a potential impact on multi-factorial diseases. miRNAs are considered good biomarkers, making their adoption for disease characterization highly desirable. Several methodologies and tools were devised to identify and quantify miRNAs from sequencing data. However, all these tools are built on top of general-purpose alignment algorithms, providing results of low accuracy and no information concerning isomiRs and conserved miRNA-mRNA interaction sites. Method: To overcome these limitations we developed the isomiR-SEA algorithm. By implementing a miRNA-specific alignment procedure, isomiR-SEA analysis accounts for accurate miRNA/isomiR expression levels and for a precise evaluation of the conserved interaction sites. First, isomiR-SEA identifies miRNA seeds within the tags. If the seed is found, the alignment is extended and the positions of the encountered mismatches are recorded. Then, the collected information is evaluated to distinguish between miRNAs and isomiRs and to assess the conservation of the interaction sites. Results & Conclusion: isomiR-SEA performance was assessed on 7 public RNA-Seq datasets. 40% of the reads attributed to miRNAs (189M) come from mature miRNAs, 50% derive instead from 3' isomiRs, and the remaining reads account for 5'/SNP isomiRs or combinations of them. Furthermore, about 2% of reads lost some interaction sites. This proves the importance of a miRNA-specific alignment algorithm to correctly evaluate miRNA targeting activity. Expression levels of isomiRs detected in the two experiments were aggregated and classified at two levels of detail. In experiment 1, isoforms with indels (at one or both ends) are grouped together. In experiment 2, by contrast, we distinguish between reads aligned on the mature miRNA with an insertion (+) or deletion (-) at the 5' or 3' end. This shows the capability of isomiR-SEA to generate enriched results that can be used in downstream analyses customized for the purpose of the investigation.

2017 Poster

Layout analysis and content classification in digitized books

Authors: Corbelli, Andrea; Baraldi, Lorenzo; Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

Automatic layout analysis has proven to be extremely important in the process of digitization of large amounts of documents. In this paper we present a mixed approach to layout analysis, introducing an SVM-aided layout segmentation process and a classification process based on local and geometrical features. The final output of the automatic analysis algorithm is a complete and structured annotation in JSON format, containing the digitized text as well as all the references to the illustrations of the input page, which can be used by visualization interfaces as well as annotation interfaces. We evaluate our algorithm on a large dataset built upon the first volume of the “Enciclopedia Treccani”.

2017 Conference proceedings paper
