Publications - AImageLab

PIK3R1 fusion drives chemoresistance in ovarian cancer by activating ERK1/2 and inducing rod and ring-like structures

Authors: Rausio, H.; Cervera, A.; Heuser, V. D.; West, G.; Oikkonen, J.; Pianfetti, E.; Lovino, M.; Ficarra, E.; Taimen, P.; Hynninen, J.; Lehtonen, R.; Hautaniemi, S.; Carpen, O.; Huhtinen, K.

Published in: NEOPLASIA

Gene fusions are common in high-grade serous ovarian cancer (HGSC). Such genetic lesions may promote tumorigenesis, but the pathogenic mechanisms are currently poorly understood. Here, we investigated the role of a PIK3R1-CCDC178 fusion identified from a patient with advanced HGSC. We show that the fusion induces HGSC cell migration by regulating ERK1/2 and increases resistance to platinum treatment. Platinum resistance was associated with rod and ring-like cellular structure formation. These structures contained, in addition to the fusion protein, CIN85, a key regulator of PI3K-AKT-mTOR signaling. Our data suggest that the fusion-driven structure formation induces a previously unrecognized cell survival and resistance mechanism, which depends on ERK1/2-activation.

2024 Journal article

BERT Classifies SARS-CoV-2 Variants

Authors: Ghione, G.; Lovino, M.; Ficarra, E.; Cirrincione, G.

Published in: SMART INNOVATION, SYSTEMS AND TECHNOLOGIES

Medical diagnostics faced numerous difficulties during the COVID-19 pandemic. One of these has been the need for ongoing monitoring of SARS-CoV-2 mutations. Genomics is the technique most frequently used for precisely identifying variants. The ongoing global gathering of RNA samples of the virus has made such an approach possible. Nevertheless, variant identification techniques are frequently resource-intensive. As a result, the diagnostic capability of small medical laboratories might not be sufficient. In this work, an effective deep learning strategy for identifying SARS-CoV-2 variants is presented. This work makes two contributions: (1) a fine-tuning architecture of Bidirectional Encoder Representations from Transformers (BERT) to identify SARS-CoV-2 variants; (2) biological insights obtained by exploiting BERT self-attention. Such an approach enables the analysis of the S gene of the virus to quickly recognize its variant. The selected model, BERT, is a transformer-based neural network first developed for natural language processing. Nonetheless, it has been effectively used in numerous applications, such as genomic sequence analysis. Thus, BERT was fine-tuned to adapt it to the RNA sequence domain, achieving a 98.59% F1-score on test data: it was successful in identifying variants circulating to date. The interpretability of the model was examined, since BERT utilizes the self-attention mechanism. In fact, it was discovered that by attending to particular areas of the S gene, BERT extracts pertinent biological information on variants. Thus, the presented approach allows obtaining insights into the particular characteristics of SARS-CoV-2 RNA samples.

2023 Book chapter/Essay

Enhancing PFI Prediction with GDS-MIL: A Graph-based Dual Stream MIL Approach

Authors: Bontempo, Gianpaolo; Bartolini, Nicola; Lovino, Marta; Bolelli, Federico; Virtanen, Anni; Ficarra, Elisa

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Whole-Slide Images (WSIs) are emerging as a promising resource for studying biological tissues, demonstrating great potential in aiding cancer diagnosis and improving patient treatment. However, the manual pixel-level annotation of WSIs is extremely time-consuming and practically unfeasible in real-world scenarios. Multi-Instance Learning (MIL) has gained attention as a weakly supervised approach able to address tasks that lack annotations. MIL models aggregate patches (e.g., croppings of a WSI) into bag-level representations (e.g., the WSI label), but neglect the spatial information of the WSIs, which is crucial for histological analysis. In the High-Grade Serous Ovarian Cancer (HGSOC) context, spatial information is essential to predict a prognosis indicator (the Platinum-Free Interval, PFI) from WSIs. Such a prediction would bring highly valuable insights both for patient treatment and prognosis of chemotherapy resistance. Indeed, NeoAdjuvant ChemoTherapy (NACT) induces changes in tumor tissue morphology and composition, making the prediction of PFI from WSIs extremely challenging. In this paper, we propose GDS-MIL, a method that integrates a state-of-the-art MIL model with a Graph ATtention layer (GAT for short) to inject a local context into each instance before MIL aggregation. Our approach achieves a significant improvement in accuracy on the "Ome18" PFI dataset. In summary, this paper presents a novel solution for enhancing PFI prediction in HGSOC, with the potential of significantly improving treatment decisions and patient outcomes.

2023 Conference proceedings paper

MiREx: mRNA levels prediction from gene sequence and miRNA target knowledge

Authors: Pianfetti, E.; Lovino, M.; Ficarra, E.; Martignetti, L.

Published in: BMC BIOINFORMATICS

Messenger RNA (mRNA) has an essential role in the protein production process. Predicting mRNA expression levels accurately is crucial for understanding gene regulation, and various models (statistical and neural network-based) have been developed for this purpose. A few models predict mRNA expression levels from the DNA sequence, exploiting the DNA sequence and gene features (e.g., number of exons/introns, gene length). Other models include information about long-range interaction molecules (i.e., enhancers/silencers) and transcriptional regulators as predictive features, such as transcription factors (TFs) and small RNAs (e.g., microRNAs, or miRNAs). Recently, a convolutional neural network (CNN) model, called Xpresso, has been proposed for mRNA expression level prediction leveraging the promoter sequence and mRNAs’ half-life features (gene features). To push forward the mRNA level prediction, we present MiREx, a CNN-based tool that includes information about miRNA targets and expression levels in the model. Indeed, each miRNA can target specific genes, and the model exploits this information to guide the learning process. In detail, not all miRNAs are included, only a selected subset with the highest impact on the model. MiREx has been evaluated on four cancer primary sites from the Genomic Data Commons (GDC) database: lung, kidney, breast, and corpus uteri. Results show that mRNA level prediction benefits from selected miRNA targets and expression information. Future model developments could include other transcriptional regulators or be trained with proteomics data to infer protein levels.

2023 Journal article

Predicting gene and protein expression levels from DNA and protein sequences with Perceiver

Authors: Stefanini, Matteo; Lovino, Marta; Cucchiara, Rita; Ficarra, Elisa

Published in: COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE

Background and objective: The functions of an organism and its biological processes result from the expression of genes and proteins. Therefore, quantifying and predicting mRNA and protein levels is a crucial aspect of scientific research. Concerning the prediction of mRNA levels, the available approaches use the sequence upstream and downstream of the Transcription Start Site (TSS) as input to neural networks. State-of-the-art models (e.g., Xpresso and Basenji) predict mRNA levels exploiting Convolutional (CNN) or Long Short-Term Memory (LSTM) Networks. However, CNN prediction depends on convolutional kernel size, and LSTM struggles to capture long-range dependencies in the sequence. Concerning the prediction of protein levels, as far as we know, there is no model for predicting protein levels by exploiting the gene or protein sequences. Methods: Here, we exploit a new model type (called Perceiver) for mRNA and protein level prediction, exploiting a Transformer-based architecture with an attention module to attend to long-range interactions in the sequences. In addition, the Perceiver model overcomes the quadratic complexity of the standard Transformer architectures. This work's contributions are: (1) a DNAPerceiver model to predict mRNA levels from the sequence upstream and downstream of the TSS; (2) a ProteinPerceiver model to predict protein levels from the protein sequence; (3) a Protein&DNAPerceiver model to predict protein levels from TSS and protein sequences. Results: The models are evaluated on cell lines, mice, glioblastoma, and lung cancer tissues. The results show the effectiveness of the Perceiver-type models in predicting mRNA and protein levels. Conclusions: This paper presents a Perceiver architecture for mRNA and protein level prediction. In the future, inserting regulatory and epigenetic information into the model could improve mRNA and protein level predictions. The source code is freely available at https://github.com/MatteoStefanini/DNAPerceiver.
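
The abstract above notes that the Perceiver avoids the quadratic cost of standard Transformer attention. In the general Perceiver design this is done by letting a small set of latent vectors attend over the long input, so the number of attention scores is M * N (latents times input positions) instead of N * N. The following is a minimal sketch of that idea only, not the authors' DNAPerceiver code: the function names, toy sizes, and values are hypothetical, and the learned query/key/value projections are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(latents, inputs):
    """Each latent vector attends over all input vectors (dot-product attention).

    latents: M vectors of dimension d; inputs: N vectors of dimension d.
    Computes M * N scores rather than the N * N of full self-attention.
    Learned projections are left out to keep the sketch short.
    """
    d = len(inputs[0])
    output = []
    for q in latents:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in inputs]
        weights = softmax(scores)
        # Weighted average of the input vectors, one output row per latent.
        output.append([sum(w * v[j] for w, v in zip(weights, inputs)) for j in range(d)])
    return output


# Hypothetical toy sizes: 6 input positions, 2 latent vectors, dimension 3.
inputs = [[float((i + j) % 3) for j in range(3)] for i in range(6)]
latents = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
summary = cross_attention(latents, inputs)
```

Whatever the input length, `summary` has one row per latent vector, which is what keeps later processing independent of sequence length.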

2023 Journal article

Transformer-Based Approach to Melanoma Detection

Authors: Cirrincione, G.; Cannata, S.; Cicceri, G.; Prinzi, F.; Currieri, T.; Lovino, M.; Militello, C.; Pasero, E.; Vitabile, S.

Published in: SENSORS

Melanoma is a malignant cancer type that develops when DNA damage occurs (mainly due to environmental factors such as ultraviolet rays). Often, melanoma results in intense and aggressive cell growth that, if not caught in time, can be fatal. Thus, early identification at the initial stage is fundamental to stopping the spread of cancer. In this paper, a ViT-based architecture able to classify melanoma versus non-cancerous lesions is presented. The proposed predictive model is trained and tested on public skin cancer data from the ISIC challenge, and the obtained results are highly promising. Different classifier configurations are considered and analyzed in order to find the most discriminating one. The best one reached an accuracy of 0.948, sensitivity of 0.928, specificity of 0.967, and AUROC of 0.948.
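
For readers less familiar with the metrics reported above, accuracy, sensitivity, and specificity can all be derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and chosen only for illustration, not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true-positive rate: melanomas correctly flagged
    specificity = tn / (tn + fp)  # true-negative rate: benign lesions correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only; not results from the paper.
acc, sens, spec = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Sensitivity and specificity are reported separately because a classifier can score well on accuracy while still missing many melanomas when the classes are imbalanced.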

2023 Journal article

A survey on data integration for multi-omics sample clustering

Authors: Lovino, Marta; Randazzo, Vincenzo; Ciravegna, Gabriele; Barbiero, Pietro; Ficarra, Elisa; Cirrincione, Giansalvo

Published in: NEUROCOMPUTING

2022 Journal article

FusionFlow: an integrated system workflow for gene fusion detection in genomic samples

Authors: Citarrella, Francesca; Bontempo, Gianpaolo; Lovino, Marta; Ficarra, Elisa

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

2022 Conference proceedings paper

Identifying the oncogenic potential of gene fusions exploiting miRNAs

Authors: Lovino, M.; Montemurro, M.; Barrese, V. S.; Ficarra, E.

Published in: JOURNAL OF BIOMEDICAL INFORMATICS

It is estimated that oncogenic gene fusions cause about 20% of human cancer morbidity. Identifying potentially oncogenic gene fusions may improve affected patients’ diagnosis and treatment. Previous approaches to this issue included exploiting specific gene-related information, such as gene function and regulation. Here we propose a model that profits from the previous findings and includes the microRNAs in the oncogenic assessment. We present ChimerDriver, a tool to classify gene fusions as oncogenic or not oncogenic. ChimerDriver is based on a specifically designed neural network and trained on genetic and post-transcriptional information to obtain a reliable classification. The designed neural network integrates information related to transcription factors, gene ontologies, microRNAs and other detailed information related to the functions of the genes involved in the fusion and the gene fusion structure. As a result, the performance on the test set reached a 0.83 F1-score and 96% recall. The comparison with state-of-the-art tools returned comparable or higher results. Moreover, ChimerDriver performed well in a real-world case where 21 out of 24 validated gene fusion samples were detected by the gene fusion detection tool STAR-Fusion. ChimerDriver integrates transcriptional and post-transcriptional information in an ad-hoc designed neural network to effectively discriminate oncogenic gene fusions from passenger ones. ChimerDriver source code is freely available at https://github.com/martalovino/ChimerDriver.

2022 Journal article

Predicting gene expression levels from DNA sequences and post-transcriptional information with transformers

Authors: Pipoli, Vittorio; Cappelli, Mattia; Palladini, Alessandro; Peluso, Carlo; Lovino, Marta; Ficarra, Elisa

Published in: COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE

Background and objectives: In recent years, the prediction of gene expression levels has been crucial due to its potential applications in the clinic. In this context, Xpresso and other methods based on Convolutional Neural Networks and Transformers were first proposed for this purpose. However, all these methods embed data with a standard one-hot encoding algorithm, resulting in very sparse matrices. In addition, post-transcriptional regulation processes, which are of utmost importance in the gene expression process, are not considered in the model. Methods: This paper presents Transformer DeepLncLoc, a novel method to predict the abundance of the mRNA (i.e., gene expression levels) by processing gene promoter sequences, managing the problem as a regression task. The model exploits a transformer-based architecture, introducing the DeepLncLoc method to perform the data embedding. Since DeepLncLoc is based on the word2vec algorithm, it avoids the sparse matrices problem. Results: Post-transcriptional information related to mRNA stability and transcription factors is included in the model, leading to significantly improved performance compared to state-of-the-art works. Transformer DeepLncLoc reached an R² of 0.76, compared to 0.74 for Xpresso. Conclusion: The Multi-Headed Attention mechanism, which characterizes the transformer methodology, is suitable for modeling the interactions between DNA locations, surpassing recurrent models. Finally, the integration of the transcription factor data in the pipeline leads to substantial gains in predictive power.
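
The sparsity problem mentioned in the abstract above is easy to see on a toy input. The sketch below one-hot encodes a short DNA string at the single-nucleotide level and measures the fraction of zero entries; the helper names and the example sequence are hypothetical, and the encoding granularity is an assumption made for illustration rather than a description of the paper's pipeline.

```python
ALPHABET = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in sequence.upper():
        row = [0] * len(ALPHABET)
        if base in index:  # unknown symbols such as N stay all-zero
            row[index[base]] = 1
        rows.append(row)
    return rows


def sparsity(matrix):
    """Fraction of entries in the matrix that are zero."""
    entries = [value for row in matrix for value in row]
    return entries.count(0) / len(entries)


# Hypothetical example sequence, for illustration only.
encoded = one_hot("ACGTAC")
zero_fraction = sparsity(encoded)
```

With a four-letter alphabet, three of every four entries are zero regardless of sequence length, and the proportion grows with larger vocabularies. A learned dense embedding such as word2vec replaces each sparse row with a short real-valued vector.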

2022 Journal article
