In silico saturation mutagenesis of cancer genes

Muinos, Ferran; Martinez-Jimenez, Francisco; Pich, Oriol; Gonzalez-Perez, Abel; Lopez-Bigas, Nuria

NATURE
2021
Volume 596, beginning page 428, end page +
Abstract
Despite the existence of good catalogues of cancer genes (1,2), identifying the specific mutations of those genes that drive tumorigenesis across tumour types is still a largely unsolved problem. As a result, most mutations identified in cancer genes across tumours are of unknown significance to tumorigenesis (3). We propose that the mutations observed in thousands of tumours (natural experiments testing their oncogenic potential, replicated across individuals and tissues) can be exploited to solve this problem. From these mutations, features that describe the mechanism of tumorigenesis of each cancer gene and tissue may be computed and used to build machine learning models that encapsulate these mechanisms. Here we demonstrate the feasibility of this solution by building and validating 185 gene-tissue-specific machine learning models that outperform experimental saturation mutagenesis in the identification of driver and passenger mutations. The models and their assessment of each mutation are designed to be interpretable, thus avoiding a black-box prediction device. Using these models, we outline the blueprints of potential driver mutations in cancer genes, and demonstrate the role of mutation probability in shaping the landscape of observed driver mutations. These blueprints will support the interpretation of newly sequenced tumours in patients and the study of the mechanisms of tumorigenesis of cancer genes across tissues.

Access level

Green submitted
