Abstract
Machine Learning (ML) methods have been used with varying degrees of success on protein prediction tasks, with two inherent limitations. First, prediction performance often depends upon the features extracted from the proteins. Second, experimental data may be insufficient to construct reliable ML models. Here we introduce MP3vec, a transferable representation for protein sequences that is designed to be used specifically for sequence-to-sequence learning tasks. We use transfer learning to generate the MP3vecs by training a deep neural network on the source problem of protein secondary structure prediction, and then extracting representations learned by the trained network for use in related downstream prediction tasks. ML methods using MP3vecs perform as well as or better than the state of the art on the target problems, while being orders of magnitude faster in terms of training time. We suggest that MP3vec can act as a strong baseline for comparative work on the use of ML in protein-prediction tasks, and for future extensions with domain-specific features.
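The transfer-learning idea in the abstract (train a network on secondary structure prediction, then reuse its learned hidden representation as per-residue features) can be illustrated with a minimal NumPy sketch. This is not the MP3vec architecture, which the abstract does not specify: the one-hidden-layer `TinyTagger`, the window encoding, the hyperparameters, and the toy sequence with its labels are all hypothetical and chosen only to keep the example small.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
SS_CLASSES = "HEC"  # helix, strand, coil


def one_hot_windows(seq, window=3):
    """Encode each residue as the concatenated one-hot vectors of its window."""
    n, k = len(seq), len(AMINO_ACIDS)
    half = window // 2
    x = np.zeros((n, window * k))
    for i in range(n):
        for w, j in enumerate(range(i - half, i + half + 1)):
            if 0 <= j < n:  # positions past the sequence ends stay all-zero
                x[i, w * k + AA_INDEX[seq[j]]] = 1.0
    return x


class TinyTagger:
    """Hypothetical stand-in for the source-task network (one hidden layer)."""

    def __init__(self, in_dim, hidden_dim=8, n_classes=3, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0, 0.1, (in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0, 0.1, (hidden_dim, n_classes))
        self.b2 = np.zeros(n_classes)

    def hidden(self, x):
        # The hidden activations are what gets reused as features downstream.
        return np.tanh(x @ self.w1 + self.b1)

    def predict_proba(self, x):
        logits = self.hidden(x) @ self.w2 + self.b2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)

    def train_step(self, x, y, lr=0.5):
        """One full-batch gradient step on cross-entropy; returns the loss before the step."""
        h = self.hidden(x)
        p = self.predict_proba(x)
        rows = np.arange(len(y))
        grad_logits = p.copy()
        grad_logits[rows, y] -= 1.0
        grad_logits /= len(y)
        grad_h = (grad_logits @ self.w2.T) * (1.0 - h ** 2)
        self.w2 -= lr * (h.T @ grad_logits)
        self.b2 -= lr * grad_logits.sum(axis=0)
        self.w1 -= lr * (x.T @ grad_h)
        self.b1 -= lr * grad_h.sum(axis=0)
        return float(-np.log(p[rows, y]).mean())


# Source task: per-residue secondary structure labels (toy, made-up data).
toy_seq = "AEELLKKVVTVTGGPG"
toy_ss = "HHHHHHEEEEEECCCC"
x = one_hot_windows(toy_seq)
y = np.array([SS_CLASSES.index(c) for c in toy_ss])

model = TinyTagger(in_dim=x.shape[1])
losses = [model.train_step(x, y) for _ in range(200)]

# Transfer step: embed a new sequence with the trained hidden layer.
features = model.hidden(one_hot_windows("MKVLGA"))  # one feature row per residue
```

In this sketch `features` holds one 8-dimensional vector per residue, which a separate downstream model could consume in place of hand-crafted features. Only the smaller downstream model would then need training on the target task.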
Details
- Category: Conference activity
- Type: publication in a peer-reviewed collective work (including conference proceedings)
- Published in: 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 421-425
- Language: English
- Publication year: 2020
- Bibliographic description: Gupte S. R., Jain D. S., Srinivasan A., Aduri R.: MP3vec: A Reusable Machine-Constructed Feature Representation for Protein Sequences // 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2020, pp. 421-425
- DOI: 10.1109/bibm49941.2020.9313301
- Verified by: Politechnika Gdańska