Projekte pro Jahr
Abstract
AbstractIn this thesis, we are concerned with two major challenges in computationallearning theory: learning in the presence of large amounts of irrelevant information and learning from noisy data. We model the former issue by assumingthat the concepts to be learned depend only on few relevant attributes—suchconcepts are calledjuntas. The latter issue is modeled by a random noise processthat affects attribute and classification values. Thereby, we confine ourselves tostudying Boolean attributes and classifications. We approach the coincidence ofboth issues from two different perspectives.First, we investigate a specific greedy algorithm for finding the relevant attributes of a Boolean concept. This algorithm is very simple and successfullyused in practice. We provide a precise characterization of the concepts for whichthe greedy algorithm is successful. This characterization is based on a propertyof the Fourier spectrum of the concept under consideration. In addition, we showthat the algorithm can tolerate quite general attribute and classification noise.Second, we design and analyze Fourierbased algorithms with the explicit goalof efficiently learning large classes of juntas from uniformly distributed noisecorrupted examples. It turns out that the Fourier method and an extension ofthe greedy method are capable of learning exactly the same concept classes withequal efficiency. We extend the Fourier approach to nonuniformly distributedexamples and prove that monotone juntas and parity functions with few relevantvariables can be efficiently learned in this setting.Both approaches are inefficient in learning the class of parity juntas fromuniformly distributed noisy examples. For this task, we propose an alternativemethod, which is based on the method of minimizing the disagreements between the learning data and the output hypothesis. While the running time ofthis method is also very high, it works independently of the input distribution.
Furthermore, we prove lower bounds on the sample size that is necessary forsuccessful learning of parity juntas from noisy data in terms of certain noiseparameters.As a sideproduct, we prove a characterization of general learnability fromnoiseaffected uniformly distributed examples.
Furthermore, we prove lower bounds on the sample size that is necessary forsuccessful learning of parity juntas from noisy data in terms of certain noiseparameters.As a sideproduct, we prove a characterization of general learnability fromnoiseaffected uniformly distributed examples.
Originalsprache  Englisch 

Qualifikation  Doctorate 
Gradverleihende Hochschule  
Betreuer/in / Berater/in 

Datum der Vergabe  08.12.2006 
Publikationsstatus  Veröffentlicht  01.08.2006 
Projekte
 1 Abgeschlossen

Robuste Lernverfahren und Datenkomprimierung
Reischuk, R.
01.01.04 → 31.12.08
Projekt: DFGProjekte › DFG Einzelförderungen