Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Indexing of Historical Document Images: Ad Hoc Dewarping Technique for Handwritten Text

Authors: Bolelli, Federico

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

This work presents a research project, named XDOCS, aimed at making a variety of historical documents published on the web accessible to a much wider audience. The paper gives an overview of the indexing process that will be used to achieve this goal, focusing on the adopted dewarping technique. The proposed dewarping approach relies on a transformation model which maps the projection of a curved surface to a 2D rectangular area. The novelty of this work is that dewarping can be applied to document images which contain both handwritten and typewritten text.

2017 Conference proceedings paper

isomiR-SEA: miRNA and isomiR expression level detection in seven RNA-Seq datasets

Authors: Urgese, Gianvito; Paciello, Giulia; Macii, Enrico; Acquaviva, Andrea; Ficarra, Elisa

Background: Massively parallel sequencing of transcriptomes revealed the presence of miRNA variants named isomiRs. The sequence variations identified within isomiR molecules can affect their targeting activity, with consequences for gene expression and a potential impact on multi-factorial diseases. miRNAs are considered good biomarkers, making their adoption for disease characterization highly desirable. Several methodologies and tools have been devised to identify and quantify miRNAs from sequencing data. However, all these tools are built on top of general-purpose alignment algorithms, which yield results of limited accuracy and no information concerning isomiRs and conserved miRNA-mRNA interaction sites. Method: To overcome these limitations we developed the isomiR-SEA algorithm. By implementing a miRNA-specific alignment procedure, isomiR-SEA provides accurate miRNA/isomiR expression levels and a precise evaluation of the conserved interaction sites. First, isomiR-SEA identifies miRNA seeds within the tags. If the seed is found, the alignment is extended and the positions of the encountered mismatches are recorded. Then, the collected information is evaluated to distinguish between miRNAs and isomiRs and to assess the conservation of the interaction sites. Results & Conclusion: isomiR-SEA performance was assessed on 7 public RNA-Seq datasets. 40% of the reads attributed to miRNAs (189M) come from mature miRNAs, 50% derive instead from 3' isomiRs, and the remaining reads account for 5'/SNP isomiRs or combinations of them. Furthermore, about 2% of reads lost some interaction sites. This proves the importance of a miRNA-specific alignment algorithm for correctly evaluating miRNA targeting activity. Expression levels of isomiRs detected in the two experiments were aggregated and classified at two levels of depth. In experiment 1, isoforms with an indel (at one or both ends) are grouped together, whereas in experiment 2 we distinguish between reads aligned on the mature miRNA with an insertion (+) or deletion (-) at the 5' or 3' end. This shows the capability of isomiR-SEA to generate enriched results that can be used in downstream analyses customized for the purpose of the investigation.

2017 Poster
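
The seed-then-extend idea described in the isomiR-SEA abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `seed_and_extend`, the choice of miRNA positions 2-8 as the seed, and the ungapped extension are all assumptions made here for illustration only.

```python
def seed_and_extend(mirna, tag, seed_start=1, seed_len=7):
    """Toy ungapped seed-and-extend (hypothetical helper, not isomiR-SEA itself).

    Looks for the miRNA seed (assumed here to be positions 2-8, i.e.
    0-based slice [1:8]) inside a sequencing tag. If the seed is found,
    the alignment is extended over the whole miRNA and the 0-based miRNA
    positions that mismatch the tag are recorded.
    """
    seed = mirna[seed_start:seed_start + seed_len]
    pos = tag.find(seed)
    if pos < 0:
        return None  # seed not found: the tag is not assigned to this miRNA

    # Tag index that lines up with miRNA index 0.
    offset = pos - seed_start

    mismatches = []
    for i, base in enumerate(mirna):
        j = i + offset
        # Compare only where the miRNA overlaps the tag.
        if 0 <= j < len(tag) and tag[j] != base:
            mismatches.append(i)

    return {"offset": offset, "mismatches": mismatches}


if __name__ == "__main__":
    mirna = "TGAGGTAGTAGGTTGTATAGTT"
    tag = "TGAGGTAGTAGGATGTATAGTT"  # differs from mirna at 0-based position 12
    print(seed_and_extend(mirna, tag))
```

In a sketch like this, the recorded mismatch positions and the alignment offset are the kind of information that could then be used to separate exact mature miRNAs from variants; how isomiR-SEA actually performs that classification is not specified in the abstract.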

Layout analysis and content classification in digitized books

Authors: Corbelli, Andrea; Baraldi, Lorenzo; Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

Automatic layout analysis has proven to be extremely important in the digitization of large amounts of documents. In this paper we present a mixed approach to layout analysis, introducing an SVM-aided layout segmentation process and a classification process based on local and geometrical features. The final output of the automatic analysis algorithm is a complete and structured annotation in JSON format, which contains the digitized text as well as all the references to the illustrations of the input page, and which can be used by both visualization and annotation interfaces. We evaluate our algorithm on a large dataset built upon the first volume of the “Enciclopedia Treccani”.

2017 Conference proceedings paper

Learning to Map Vehicles into Bird's Eye View

Authors: Palazzi, Andrea; Borghi, Guido; Abati, Davide; Calderara, Simone; Cucchiara, Rita

Awareness of the road scene is an essential component for both autonomous vehicles and Advanced Driver Assistance Systems, and is gaining importance for both academia and car companies. This paper presents a way to learn a semantic-aware transformation which maps detections from a dashboard camera view onto a broader bird's eye occupancy map of the scene. To this end, a huge synthetic dataset featuring 1M pairs of frames, taken from both the car dashboard and the bird's eye view, has been collected and automatically annotated. A deep network is then trained to warp detections from the first view to the second. We demonstrate the effectiveness of our model against several baselines and observe that it is able to generalize to real-world data despite having been trained solely on synthetic data.

2017 Conference proceedings paper

Learning Where to Attend Like a Human Driver

Authors: Palazzi, Andrea; Solera, Francesco; Calderara, Simone; Alletto, Stefano; Cucchiara, Rita

Despite the advent of autonomous cars, it is likely, at least in the near future, that human attention will still maintain a central role as a guarantee in terms of legal responsibility during the driving task. In this paper we study the dynamics of the driver's gaze and use it as a proxy to understand related attentional mechanisms. First, we build our analysis upon two questions: where is the driver looking, and at what? Second, we model the driver's gaze by training a coarse-to-fine convolutional network on short sequences extracted from the DR(eye)VE dataset. Experimental comparison against different baselines reveals that the driver's gaze can indeed be learnt to some extent, despite i) being highly subjective and ii) having only one driver's gaze available for each sequence due to the irreproducibility of the scene. Finally, we advocate for a new assisted driving paradigm which suggests to the driver, with no intervention, where she should focus her attention.

2017 Conference proceedings paper

Mining textural knowledge in biological images: applications, methods and trends

Authors: Di Cataldo, Santa; Ficarra, Elisa

Published in: COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL

Texture analysis is a major task in many areas of computer vision and pattern recognition, including biological imaging. Indeed, visual textures can be exploited to distinguish specific tissues or cells in a biological sample, to highlight chemical reactions between molecules, as well as to detect subcellular patterns that can be evidence of certain pathologies. This makes automated texture analysis fundamental in many applications of biomedicine, such as the accurate detection and grading of multiple types of cancer, the differential diagnosis of autoimmune diseases, or the study of physiological processes. Due to their specific characteristics and challenges, the design of texture analysis systems for biological images has attracted ever-growing attention in the last few years. In this paper, we perform a critical review of this important topic. First, we provide a general definition of texture analysis and discuss its role in the context of bioimaging, with examples of applications from the recent literature. Then, we review the main approaches to automated texture analysis, with special attention to the methods of feature extraction and encoding that can be successfully applied to microscopy images of cells or tissues. Our aim is to provide an overview of the state of the art, as well as a glimpse into the latest and future trends of research in this area.

2017 Journal article

Modeling Multimodal Cues in a Deep Learning-based Framework for Emotion Recognition in the Wild

Authors: Pini, Stefano; Ben Ahmed, Olfa; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita; Huet, Benoit

In this paper, we propose a multimodal deep learning architecture for emotion recognition in video, developed for our participation in the audio-video based sub-challenge of the Emotion Recognition in the Wild 2017 challenge. Our model combines cues from multiple video modalities, including static facial features, motion patterns related to the evolution of the human expression over time, and audio information. Specifically, it is composed of three sub-networks trained separately: the first and second ones extract static visual features and dynamic patterns through 2D and 3D Convolutional Neural Networks (CNN), while the third one consists of a pretrained audio network which is used to extract useful deep acoustic signals from video. In the audio branch, we also apply Long Short-Term Memory (LSTM) networks in order to capture the temporal evolution of the audio features. To identify and exploit possible relationships among different modalities, we propose a fusion network that merges cues from the different modalities into one representation. The proposed architecture outperforms the challenge baselines (38.81% and 40.47%): we achieve an accuracy of 50.39% and 49.92% on the validation and the testing data, respectively.

2017 Conference proceedings paper

NeuralStory: an Interactive Multimedia System for Video Indexing and Re-use

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

In recent years video has been swamping the Internet: websites, social networks, and business multimedia systems are adopting video as the most important form of communication and information. Videos are normally accessed as a whole and are not indexed by their visual content. Thus, they are often uploaded as short, manually cut clips with user-provided annotations, keywords and tags for retrieval. In this paper, we propose a prototype multimedia system which addresses these two limitations: it overcomes the need for human intervention in the video setting, thanks to fully deep learning-based solutions, and decomposes the storytelling structure of the video into coherent parts. These parts can be shots, key-frames, scenes and semantically related stories, and are exploited to provide an automatic annotation of the visual content, so that parts of the video can be easily retrieved. This also allows a principled re-use of the video itself: users of the platform can produce new storytelling by means of multi-modal presentations, add text and other media, and propose a different visual organization of the content. We present the overall solution, and some experiments on the re-use capability of our platform in edutainment, conducted through an extensive user evaluation.

2017 Conference proceedings paper

Personalized Egocentric Video Summarization of Cultural Tour on User Preferences Input

Authors: Varini, P.; Serra, G.; Cucchiara, R.

Published in: IEEE TRANSACTIONS ON MULTIMEDIA

In this paper, we propose a new method for customized summarization of egocentric videos according to specific user preferences, so that different users can extract different summaries from the same stream. Our approach, tailored to a cultural heritage scenario, relies on creating a short synopsis of the original video focused on key shots, in which concepts relevant to user preferences can be visually detected and the chronological flow of the original video is preserved. Moreover, we release a new dataset, composed of egocentric streams taken in uncontrolled scenarios, capturing tourists' cultural visits in six art cities, with geolocalization information. Our experimental results show that the proposed approach is able to leverage user preferences, with an emphasis on the chronological flow of the storyline and on visual smoothness.

2017 Journal article

Pixel classification methods to detect skin lesions on dermoscopic medical images

Authors: Balducci, Fabrizio; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In recent years the interest of the biomedical and computer vision communities in the acquisition and analysis of epidermal images has increased, because melanoma is one of the deadliest forms of skin cancer and its early identification could save lives while reducing unnecessary medical treatments. User-friendly automatic tools can be very useful for physicians and dermatologists: high-resolution images and their annotated data, combined with analysis pipelines and machine learning techniques, are the basis for developing intelligent and proactive diagnostic systems. In this work we present two skin lesion detection pipelines for dermoscopic medical images, which exploit standard techniques combined with workarounds that improve results; moreover, to highlight their performance, we consider a set of metrics combined with pixel labeling and classification. A preliminary but functional evaluation phase has been conducted on a subset of hard-to-treat images, in order to check which of the proposed detection pipelines reaches the best results.

2017 Conference proceedings paper
