TSMDA: Target and symptom-based computational model for miRNA-disease association prediction

Abstract: The emergence of high-throughput sequencing techniques has revealed a primary role of miRNAs in a wide range of diseases, including cancers and neurodegenerative disorders. Understanding novel relationships between miRNAs and diseases can potentially unveil complex pathogenesis mechanisms, leading to effective diagnosis and treatment. The investigation of novel miRNA-disease associations, however, is currently costly and time-consuming. Over the years, several computational models have been proposed to prioritize potential miRNA-disease associations, but with limited usability or predictive capability. To fill this gap, we introduce TSMDA, a novel machine learning method that leverages target and symptom information, together with negative sample selection, to predict miRNA-disease associations. TSMDA significantly outperforms similar methods, achieving an area under the ROC curve (AUC) of 0.989 under 5-fold cross-validation and 0.982 on a blind test. As case studies, we also demonstrate the capability of the method to uncover potential miRNA-disease associations in breast, prostate, and lung cancers. We believe TSMDA will be an invaluable tool for the community to explore and prioritize potentially new miRNA-disease associations for further experimental characterization.