
Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics

Anne Laure Boulesteix*, Silke Janitza, Jochen Kruppa, Inke R. König

*Corresponding author for this work

Abstract

The random forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables, and returns measures of variable importance. This paper synthesizes 10 years of RF development with emphasis on applications to bioinformatics and computational biology. Special attention is paid to practical aspects such as the selection of parameters, available RF implementations, and important pitfalls and biases of RF and its variable importance measures (VIMs). The paper surveys recent developments of the methodology relevant to bioinformatics as well as some representative examples of RF applications in this context and possible directions for future research.

Original language: English
Journal: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Volume: 2
Issue number: 6
Pages (from - to): 493-507
Number of pages: 15
ISSN: 1942-4787
DOIs
Publication status: Published - November 2012

UN SDGs

This output contributes to the following Sustainable Development Goal(s)

  1. SDG 3 – Good Health and Well-being
