
SPRED: A machine learning approach for the identification of classical and non-classical secretory proteins in mammalian genomes

Krishna Kumar Kandaswamy, Ganesan Pugalenthi, Enno Hartmann, Kai Uwe Kalies, Steffen Möller, P. N. Suganthan, Thomas Martinetz*

*Corresponding author for this work

Abstract

Eukaryotic protein secretion generally occurs via the classical secretory pathway, which traverses the ER and Golgi apparatus. Secreted proteins usually contain a signal sequence with all the essential information required to target them for secretion. However, some proteins, such as fibroblast growth factors (FGF-1, FGF-2), interleukins (IL-1 alpha, IL-1 beta), galectins and thioredoxin, are exported by an alternative pathway. This is known as leaderless or non-classical secretion and works without a signal sequence. Most computational methods for the identification of secretory proteins use the signal peptide as an indicator and are therefore unable to identify substrates of non-classical secretion. In this work, we report a random forest method, SPRED, that identifies secretory proteins from protein sequences irrespective of N-terminal signal peptides, thus also allowing correct classification of non-classical secretory proteins. Training was performed on a dataset containing 600 extracellular proteins and 600 cytoplasmic and/or nuclear proteins. The algorithm was tested on 180 extracellular proteins and 1380 cytoplasmic and/or nuclear proteins. We obtained 85.92% accuracy in training and 82.18% accuracy in testing. Because SPRED does not rely on N-terminal signals, non-classical secreted proteins can be detected by using SignalP to filter out those predicted secreted proteins that carry an N-terminal signal. SPRED predicted 15 out of 19 experimentally verified non-classical secretory proteins. By scanning the entire human proteome, we identified 566 protein sequences potentially undergoing non-classical secretion. The dataset and a standalone version of the SPRED software are available at http://www.inb.uni-luebeck.de/tools-demos/spred/spred.
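The two-stage logic described in the abstract (a sequence-based call for "secreted", followed by removal of proteins that carry an N-terminal signal peptide) can be illustrated with a minimal Python sketch. This is not the SPRED implementation and does not call SignalP. The `Prediction` type, the `classify` and `fraction_recovered` functions, and the toy protein identifiers are all hypothetical, and both boolean inputs are assumed to come from external predictors. The only figures taken from the abstract are the 15 recovered proteins out of 19 experimentally verified ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical container for the two per-protein predictions."""
    protein_id: str
    predicted_secreted: bool   # assumed output of a sequence-based classifier
    has_signal_peptide: bool   # assumed output of a signal-peptide predictor


def classify(pred: Prediction) -> str:
    """Combine both predictions into one of three illustrative labels."""
    if not pred.predicted_secreted:
        return "non-secreted"
    if pred.has_signal_peptide:
        return "classical"
    # Predicted secreted, but no N-terminal signal peptide was found.
    return "non-classical candidate"


def fraction_recovered(n_predicted: int, n_verified: int) -> float:
    """Share of verified proteins that were predicted correctly."""
    if n_verified <= 0:
        raise ValueError("n_verified must be positive")
    return n_predicted / n_verified


# Toy inputs with made-up identifiers; these are not real predictions.
examples = [
    Prediction("toy_A", predicted_secreted=True, has_signal_peptide=True),
    Prediction("toy_B", predicted_secreted=True, has_signal_peptide=False),
    Prediction("toy_C", predicted_secreted=False, has_signal_peptide=False),
]
labels = {p.protein_id: classify(p) for p in examples}

# Figures from the abstract: 15 of 19 verified non-classical proteins.
recovered = fraction_recovered(15, 19)
print(labels)
print(f"{recovered:.1%}")  # prints 78.9%
```

The sketch keeps the two predictions as separate inputs so that the filtering step stays independent of whichever classifier or signal-peptide tool produces them.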

Original language: English
Journal: Biochemical and Biophysical Research Communications
Volume: 391
Issue number: 3
Pages (from - to): 1306-1311
Number of pages: 6
ISSN: 0006-291X
Publication status: Published - 15 Jan 2010

UN SDGs

This output contributes to the following Sustainable Development Goal(s):

  1. SDG 3 – Good Health and Well-being
  2. SDG 9 – Industry, Innovation and Infrastructure
