PhD in Informatics Seminar #1 2021/2022 | DI Ciências ULisboa

Title: Deep learning interpretation to understand RNA splicing
Speaker: Pedro Barbosa, LASIGE/DI-FCUL
Date: March 3, 12h
Where: Room C1.3.14

Abstract: Human cells have highly precise mechanisms to regulate fundamental molecular processes, such as RNA splicing. In splicing, specific signals embedded in RNA sequences are recognized, yet the way these signals are combined to produce a given phenotype is poorly understood. This is particularly relevant in a disease context, where a proper assessment of the impact of genetic variants on splicing has large clinical implications.

In this work, we use a high-performing deep learning model, SpliceAI, to study these mechanisms. SpliceAI is an opaque system that lacks explanatory value for its target audience. Hence, we aim to explore new strategies to interpret what SpliceAI learns. To accomplish this goal, we generated curated datasets deemed relevant for the task by finding regions in the genome whose splicing is affected by the removal of individual variables (i.e., splicing factors). Afterwards, we used these datasets to check that SpliceAI's performance is not driven by underlying biases. We applied sensitivity analysis to the network and found that SpliceAI recapitulates previously known biological effects, reinforcing the potential of using these black-box models as a source of knowledge. These results lay the foundations for the development of a new explainable AI solution that decodes what SpliceAI learns into biologically interpretable expressions.