IEEE VIS 2024 Content: Inside an interpretable-by-design machine learning model: enabling RNA splicing rational design

Mateus Silva Aragao - New York University, New York, United States

Shiwen Zhu - New York University, New York, United States

Nhi Nguyen - New York University, New York, United States

Alejandro Garcia - University of Pennsylvania, Philadelphia, United States

Susan Elizabeth Liao - New York University, New York, United States

Room: Bayshore I

2024-10-13T12:30:00Z
Abstract

Deciphering the regulatory logic of RNA splicing, a critical process in genome function, remains a major challenge in modern biology. Although various machine learning models have been proposed to address this problem, many lack interpretability and cannot articulate how they arrive at their predictions. We recently introduced an interpretable machine learning model that predicts splicing outcomes from input sequence and structure. Here, we present a series of interactive data visualization tools that illuminate the process behind the network's predictions. Specifically, we introduce visualizations that address both the global and local interpretability of our model. They expose the model's intermediate reasoning stages, tracing how specific RNA features contribute to the final splicing prediction. We show how these visualizations can be used to explain the network's performance on prior training and validation datasets. Finally, we explore how these interactive visualizations can facilitate domain-specific applications, such as the rational design of RNA sequences with desired splicing outcomes. Together, these visualizations highlight the role of data visualization and interactivity in enhancing machine learning interpretability and model adoption.