Method for generating probabilistic representations and deep neural network
Authors: Garattoni, Lorenzo; Francesca, Gianpiero; Pini, Stefano; Simoni, Alessandro; Vezzani, Roberto; Borghi, Guido
Authors: Gibertoni, Giovanni; Rovati, Luigi; Borghi, Guido
The present invention relates to a method for automatically estimating a compliant position of a patient's pupil during an ophthalmic examination. The method is based on acquiring images representative of the pupil and processing them with classification algorithms, including machine learning techniques, in order to determine the position of the pupil with respect to the optical axis of an ophthalmic device or to assess a state parameter of the pupil. The invention also relates to a device for ophthalmic examinations implementing this method, comprising an optical module that includes a dichroic mirror configured to deflect a light signal representative of the pupil towards an optical image-acquisition sensor, while allowing a further light signal representative of the pupil to propagate without significant interference towards the internal optical components of the ophthalmic device for carrying out the examination of interest. The invention further comprises an electronic kit that can be connected to an existing ophthalmic device, enabling a functional upgrade for pupil position estimation without altering the original diagnostic functionalities. The proposed solution improves the reliability, repeatability, and usability of ophthalmic examinations performed by specialized personnel, while maintaining compatibility with existing ophthalmic instrumentation.
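The abstract describes the pipeline only at a high level: acquire pupil images, then classify them to estimate the pupil position relative to the device's optical axis. Purely as an illustration of that classification step, here is a minimal sketch; the PupilNet architecture, the grayscale input, and the five position labels are hypothetical and not part of the patented method.

```python
# Minimal sketch of the image-classification step described in the abstract:
# a small CNN labels each acquired frame as "aligned" with the device's
# optical axis or as off-axis in one of four directions. The architecture
# and label set are hypothetical illustrations, not the patented design.
import torch
import torch.nn as nn

CLASSES = ["aligned", "left", "right", "up", "down"]  # assumed label set

class PupilNet(nn.Module):
    def __init__(self, n_classes: int = len(CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

model = PupilNet().eval()
frame = torch.rand(1, 1, 128, 128)  # grayscale pupil image from the sensor
with torch.no_grad():
    label = CLASSES[model(frame).argmax(dim=1).item()]
print(f"estimated pupil position: {label}")
```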
Authors: Pianfetti, E.; Lovino, M.; Ficarra, E.; Martignetti, L.
Published in: BMC BIOINFORMATICS
Messenger RNA (mRNA) plays an essential role in the protein production process. Predicting mRNA expression levels accurately is crucial for understanding gene regulation, and various models (statistical and neural network-based) have been developed for this purpose. A few models predict mRNA expression levels from the DNA sequence, exploiting sequence and gene features (e.g., number of exons/introns, gene length). Other models include information about long-range interaction molecules (i.e., enhancers/silencers) and transcriptional regulators as predictive features, such as transcription factors (TFs) and small RNAs (e.g., microRNAs - miRNAs). Recently, a convolutional neural network (CNN) model, called Xpresso, has been proposed for mRNA expression level prediction, leveraging the promoter sequence and mRNAs' half-life features (gene features). To push mRNA level prediction forward, we present miREx, a CNN-based tool that includes information about miRNA targets and expression levels in the model. Indeed, each miRNA can target specific genes, and the model exploits this information to guide the learning process. In detail, not all miRNAs are included: only a selected subset with the highest impact on the model is used. MiREx has been evaluated on four cancer primary sites from the Genomics Data Commons (GDC) database: lung, kidney, breast, and corpus uteri. Results show that mRNA level prediction benefits from selected miRNA targets and expression information. Future model developments could include other transcriptional regulators or be trained with proteomics data to infer protein levels.
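As an illustration of the kind of model the abstract describes (a CNN over the promoter sequence whose output is combined with half-life features and selected miRNA expression values before regression), here is a minimal two-branch sketch; the layer sizes, the 10.5 kb input length, and the number of miRNA features are assumptions, not the released miREx implementation.

```python
# Minimal two-branch sketch of the kind of model the abstract describes:
# a 1D CNN over the one-hot promoter sequence, concatenated with gene
# half-life features and selected miRNA expression values before the
# regression head. All sizes are assumptions, not the miREx release.
import torch
import torch.nn as nn

class PromoterMirnaCNN(nn.Module):
    def __init__(self, seq_len=10500, n_half_life=8, n_mirnas=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(4, 64, kernel_size=6), nn.ReLU(), nn.MaxPool1d(8),
            nn.Conv1d(64, 32, kernel_size=9), nn.ReLU(), nn.MaxPool1d(8),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(32 + n_half_life + n_mirnas, 64), nn.ReLU(),
            nn.Linear(64, 1),  # predicted (log) mRNA expression level
        )

    def forward(self, seq, half_life, mirna_expr):
        # seq: (B, 4, L) one-hot promoter; the others are per-gene vectors
        z = self.conv(seq)
        return self.head(torch.cat([z, half_life, mirna_expr], dim=1))

model = PromoterMirnaCNN()
y = model(torch.rand(2, 4, 10500), torch.rand(2, 8), torch.rand(2, 50))
print(y.shape)  # torch.Size([2, 1])
```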
Authors: Baldrati, Alberto; Morelli, Davide; Cartella, Giuseppe; Cornia, Marcella; Bertini, Marco; Cucchiara, Rita
Published in: PROCEEDINGS IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION
Fashion illustration is used by designers to communicate their vision and to bring the design idea from conceptualization to realization, showing how clothes interact with the human body. In this context, computer vision can thus be used to improve the fashion design process. Unlike previous works that mainly focused on the virtual try-on of garments, we propose the task of multimodal-conditioned fashion image editing, guiding the generation of human-centric fashion images with multimodal prompts such as text, human body poses, and garment sketches. We tackle this problem by proposing a new architecture based on latent diffusion models, an approach that has not been used before in the fashion domain. Given the lack of existing datasets suitable for the task, we also extend two existing fashion datasets, namely Dress Code and VITON-HD, with multimodal annotations collected in a semi-automatic manner. Experimental results on these new datasets demonstrate the effectiveness of our proposal, both in terms of realism and coherence with the given multimodal inputs. Source code and collected multimodal annotations are publicly available at: https://github.com/aimagelab/multimodal-garment-designer.
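As an illustration of how multimodal prompts can condition a latent diffusion denoiser, here is a minimal sketch in which text token embeddings, body-pose keypoints, and a garment sketch are projected into a single cross-attention context; all dimensions, the keypoint format, and the tiny sketch encoder are assumptions, not the architecture released by the authors.

```python
# Minimal sketch of multimodal conditioning for a latent denoiser: text
# tokens, body-pose keypoints, and a garment sketch are projected into a
# shared space and concatenated into one cross-attention context. This is
# a hypothetical illustration, not the authors' released architecture.
import torch
import torch.nn as nn

d = 256  # shared conditioning width (assumption)

text_proj = nn.Linear(768, d)      # e.g. CLIP text token embeddings
pose_proj = nn.Linear(2, d)        # (x, y) per body keypoint
sketch_enc = nn.Sequential(        # tiny encoder for the garment sketch
    nn.Conv2d(1, d, 8, stride=8), nn.Flatten(2),
)

text = torch.rand(1, 77, 768)      # token embeddings from a text encoder
pose = torch.rand(1, 18, 2)        # 18 keypoints (assumed format)
sketch = torch.rand(1, 1, 64, 64)  # binary garment sketch

context = torch.cat([
    text_proj(text),                     # (1, 77, d)
    pose_proj(pose),                     # (1, 18, d)
    sketch_enc(sketch).transpose(1, 2),  # (1, 64, d)
], dim=1)

# A denoising step would attend from the noisy latents to this context.
attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
latents = torch.rand(1, 32 * 32, d)      # flattened noisy latent grid
out, _ = attn(latents, context, context)
print(out.shape)  # torch.Size([1, 1024, 256])
```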
Authors: Marconato, E.; Bontempo, G.; Ficarra, E.; Calderara, S.; Passerini, A.; Teso, S.
Published in: PROCEEDINGS OF MACHINE LEARNING RESEARCH
We introduce Neuro-Symbolic Continual Learning, where a model has to solve a sequence of neuro-symbolic tasks, that is, it has to map sub-symbolic inputs to high-level concepts and compute predictions by reasoning consistently with prior knowledge. Our key observation is that neuro-symbolic tasks, although different, often share concepts whose semantics remains stable over time. Traditional approaches fall short: existing continual strategies ignore knowledge altogether, while stock neuro-symbolic architectures suffer from catastrophic forgetting. We show that leveraging prior knowledge by combining neuro-symbolic architectures with continual strategies does help avoid catastrophic forgetting, but also that doing so can yield models affected by reasoning shortcuts. These undermine the semantics of the acquired concepts, even when detailed prior knowledge is provided upfront and inference is exact, and in turn harm continual performance. To overcome these issues, we introduce COOL, a COncept-level cOntinual Learning strategy tailored for neuro-symbolic continual problems that acquires high-quality concepts and remembers them over time. Our experiments on three novel benchmarks highlight how COOL attains sustained high performance on neuro-symbolic continual learning tasks in which other strategies fail.
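As an illustration of the neuro-symbolic setup the abstract builds on, here is a minimal sketch of a classic task of this kind (predicting the sum of two digit images): a network maps sub-symbolic inputs to digit concepts, and the prediction is obtained by exact reasoning over the knowledge "label = digit_a + digit_b". This shows the setting only, not the COOL strategy.

```python
# Minimal sketch of a neuro-symbolic task in the sense of the abstract:
# a network maps sub-symbolic inputs (digit images) to high-level concepts
# (digit classes), and the prediction is computed by reasoning with prior
# knowledge (the label of a pair is the sum of the two digits). This
# illustrates the setup only; it is not the COOL strategy itself.
import torch
import torch.nn as nn

concept_net = nn.Sequential(  # image -> distribution over digit concepts
    nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10),
)

def predict_sum(img_a, img_b):
    # Exact inference under the knowledge "label = digit_a + digit_b":
    # marginalize the joint concept distribution over all (a, b) pairs.
    pa = concept_net(img_a).softmax(dim=-1)    # (B, 10)
    pb = concept_net(img_b).softmax(dim=-1)    # (B, 10)
    joint = pa.unsqueeze(2) * pb.unsqueeze(1)  # (B, 10, 10)
    sums = torch.zeros(img_a.shape[0], 19)
    for a in range(10):
        for b in range(10):
            sums[:, a + b] += joint[:, a, b]
    return sums                                # P(label = 0..18)

probs = predict_sum(torch.rand(4, 1, 28, 28), torch.rand(4, 1, 28, 28))
print(probs.shape, probs.sum(dim=1))  # (4, 19); each row sums to 1
```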
Authors: Millunzi, M.; Bonicelli, L.; Zurli, A.; Salman, A.; Credi, J.; Calderara, S.
Published in: CEUR WORKSHOP PROCEEDINGS
Many Machine Learning and Deep Learning algorithms are widely used with remarkable success in scenarios where the benchmark datasets consist of reliable data. However, they often struggle to handle realistic scenarios, particularly in the financial sector, where the available data constantly vary, increase daily, and may contain noise. Accordingly, we present an overview of the ongoing research at the AImageLab research laboratory of the University of Modena and Reggio Emilia, in collaboration with AxyonAI, focused on exploring Continual Learning methods in the presence of noisy data, with a special focus on noisy labels. To the best of our knowledge, this problem has received limited attention from the scientific community thus far.
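As an illustration of one common ingredient in this line of work, here is a minimal sketch of rehearsal with a small-loss selection criterion, under the heuristic that samples with noisy labels tend to keep a high loss; the toy model, buffer, and keep ratio are assumptions, not a specific method from this collaboration.

```python
# Minimal sketch of rehearsal with a small-loss criterion, under the
# heuristic that noisy-label samples tend to keep a high loss. This
# illustrates the problem setting only; it is not a specific method
# from the AImageLab / AxyonAI collaboration.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(16, 4)        # toy task head
buffer_x, buffer_y = [], []     # rehearsal memory
KEEP_RATIO = 0.7                # fraction of a batch assumed clean

def store_clean(x, y):
    # Keep only the lowest-loss (presumably clean) samples in the buffer.
    with torch.no_grad():
        loss = F.cross_entropy(model(x), y, reduction="none")
    k = max(1, int(KEEP_RATIO * len(y)))
    idx = loss.topk(k, largest=False).indices
    buffer_x.append(x[idx])
    buffer_y.append(y[idx])

x = torch.rand(32, 16)
y = torch.randint(0, 4, (32,))  # labels, possibly noisy
store_clean(x, y)
print(buffer_x[0].shape)  # torch.Size([22, 16])
```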
Authors: D’Amelio, Alessandro; Lanzarotti, Raffaella; Patania, Sabrina; Grossi, Giuliano; Cuculo, Vittorio; Valota, Andrea; Boccignone, Giuseppe
Published in: LECTURE NOTES IN COMPUTER SCIENCE
Authors: Cartella, Giuseppe; Baldrati, Alberto; Morelli, Davide; Cornia, Marcella; Bertini, Marco; Cucchiara, Rita
Published in: LECTURE NOTES IN COMPUTER SCIENCE
The inexorable growth of online shopping and e-commerce demands scalable and robust machine learning-based solutions to accommodate customer requirements. In the context of automatic tagging, classification, and multimodal retrieval, prior works either defined supervised learning approaches with limited generalization capability or more reusable CLIP-based techniques that were, however, trained on closed-source data. In this work, we propose OpenFashionCLIP, a vision-and-language contrastive learning method that adopts only open-source fashion data stemming from diverse domains and characterized by varying degrees of specificity. Our approach is extensively validated across several tasks and benchmarks, and experimental results highlight a significant out-of-domain generalization capability and consistent improvements over state-of-the-art methods in terms of both accuracy and recall. Source code and trained models are publicly available at: https://github.com/aimagelab/open-fashion-clip.
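As an illustration of the contrastive objective underlying CLIP-based methods such as OpenFashionCLIP, here is a minimal sketch of the standard symmetric image-text InfoNCE loss; the embedding size and temperature are assumptions, and this is the generic objective rather than the authors' released training code.

```python
# Minimal sketch of the CLIP-style contrastive objective behind methods
# like OpenFashionCLIP: matched image/text embeddings are pulled together
# and mismatched pairs pushed apart with a symmetric cross-entropy loss.
# This is the generic objective, not the authors' training code.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(len(img))      # i-th image matches i-th text
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

loss = clip_contrastive_loss(torch.rand(8, 512), torch.rand(8, 512))
print(loss.item())
```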
Authors: Sarto, Sara; Barraco, Manuele; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita
Published in: PROCEEDINGS IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
The CLIP model has been recently proven to be very effective for a variety of cross-modal tasks, including the evaluation of captions generated from vision-and-language models. In this paper, we propose a new recipe for a contrastive-based evaluation metric for image captioning, namely Positive-Augmented Contrastive learning Score (PAC-S), that in a novel way unifies the learning of a contrastive visual-semantic space with the addition of generated images and text on curated data. Experiments spanning several datasets demonstrate that our new metric achieves the highest correlation with human judgments on both images and videos, outperforming existing reference-based metrics like CIDEr and SPICE and reference-free metrics like CLIP-Score. Finally, we test the system-level correlation of the proposed metric when considering popular image captioning approaches, and assess the impact of employing different cross-modal features. We publicly release our source code and trained models.
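As an illustration of the reference-free scoring recipe that PAC-S improves upon, here is a minimal CLIP-Score-style sketch using the public openai/CLIP package; the 2.5 scale and the clipping at zero follow CLIP-Score, while PAC-S additionally learns its embedding space with positive-augmented contrastive training, which is not reproduced here.

```python
# Minimal sketch of how a contrastive-based captioning metric scores a
# candidate caption: embed image and caption with a pretrained CLIP model
# and take the (scaled) cosine similarity. This shows the generic
# CLIP-Score recipe, not the PAC-S model or its training procedure.
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def caption_score(image_path: str, caption: str, w: float = 2.5) -> float:
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    text = clip.tokenize([caption]).to(device)
    with torch.no_grad():
        img_emb = model.encode_image(image)
        txt_emb = model.encode_text(text)
    sim = torch.cosine_similarity(img_emb, txt_emb).item()
    return w * max(sim, 0.0)  # CLIP-Score-style rescaling

# Usage (hypothetical file): caption_score("photo.jpg", "a dog in a park")
```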