Abstract
RNA-protein interactions (RPIs) play a pivotal role in the regulation of various biological processes. Experimental validation of RPIs is time-consuming, which has motivated the development of computational prediction methods. The main limitation of these methods is the accuracy and confidence of their predictions; our in-house experiments show that they fail to accurately predict RPIs involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on high-resolution structural data of RNA-protein complexes. The minimum structural unit, consisting of five residues, is used as the descriptor. Comparative analysis against existing methods shows that our method performs consistently better irrespective of the length of the RNA involved in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA and TERRA RNA. It is also shown to successfully predict the RNA and protein hubs present in the RPI networks of four different organisms. The robustness of this method provides a way to predict RPI networks of as yet unknown interactions for both long noncoding RNA and microRNA.
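As a rough illustration of the descriptor idea in the abstract, the sketch below maps each residue to a class and counts every window of five consecutive classes. It is not the authors' implementation: the residue grouping in `AA_GROUPS`, the choice to leave nucleotides ungrouped, and the function names `kmer_descriptor` and `pair_features` are all assumptions made for this example, whereas the paper derives its classes from structural data of RNA-protein complexes.

```python
from collections import Counter
from itertools import product

# HYPOTHETICAL residue classes (rough physicochemical grouping). The paper
# derives its classes from structural data; this grouping is only a stand-in.
AA_GROUPS = {"h": "AVLIMFWPG", "p": "STCYNQ", "+": "KRH", "-": "DE"}
AA_CLASS = {aa: label for label, members in AA_GROUPS.items() for aa in members}

# ASSUMPTION: nucleotides keep their identity; DNA-style T is read as U.
NT_CLASS = {"A": "A", "C": "C", "G": "G", "U": "U", "T": "U"}


def kmer_descriptor(seq, class_of, alphabet, k=5):
    """Return normalized frequencies of all k-residue windows over `alphabet`."""
    # Unknown symbols are dropped, so their neighbours become adjacent.
    reduced = [class_of[ch] for ch in seq.upper() if ch in class_of]
    counts = Counter(tuple(reduced[i:i + k]) for i in range(len(reduced) - k + 1))
    total = sum(counts.values())
    # One feature per possible window, in a fixed order; all zeros if seq is shorter than k.
    return [counts[kmer] / total if total else 0.0
            for kmer in product(alphabet, repeat=k)]


def pair_features(protein, rna, k=5):
    """Concatenate the protein and RNA descriptors into one feature vector."""
    protein_part = kmer_descriptor(protein, AA_CLASS, sorted(AA_GROUPS), k)
    rna_part = kmer_descriptor(rna, NT_CLASS, sorted(set(NT_CLASS.values())), k)
    return protein_part + rna_part


if __name__ == "__main__":
    # Toy inputs: an arbitrary peptide and two UUAGGG repeats (the TERRA repeat unit).
    features = pair_features("MKRLDEAVST", "UUAGGGUUAGGG")
    print(len(features), "features,", sum(1 for x in features if x > 0), "non-zero")
```

A vector like this, labelled as interacting or non-interacting, could then be passed to any gradient boosting implementation (for example scikit-learn's `GradientBoostingClassifier`); that training step is omitted here because the abstract gives no details about it.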
Details
- Category: Articles
- Type: journal articles
- Published in: Scientific Reports, no. 8, pages 1-10, ISSN: 2045-2322
- Language: English
- Publication year: 2018
- Bibliographic description: Jain D., Gupte S., Aduri R.: A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine // Scientific Reports, Vol. 8, iss. 1 (2018), pp. 1-10
- DOI: 10.1038/s41598-018-27814-2
- Verified by: Gdańsk University of Technology