SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493. SRI International is a nonprofit corporation.

AIC Seminar Series

AveBoost2: Boosting for Noisy Data

Nikunj C. Oza, NASA Ames Research Center

Date:  2004-10-05 at 16:00

Location:  EJ228

   Abstract

AdaBoost is a well-known ensemble learning algorithm that constructs its base models in sequence. To create each base model, AdaBoost constructs a distribution over the training examples. This distribution, represented as a vector, is constructed with the goal of making the next base model's mistakes uncorrelated with those of the previous base model. We previously developed an algorithm, AveBoost, that first constructs a distribution the same way as AdaBoost but then averages it with the previous models' distributions to create the next base model's distribution. Our experiments demonstrated the superior accuracy of this approach. In this paper, we slightly revise our algorithm to obtain non-trivial theoretical results: bounds on the training error and on the generalization error (the difference between training and test error). Our averaging process has a regularizing effect, which gives our algorithm a worse training error bound than AdaBoost's but a better generalization error bound. This leads us to suspect that our new algorithm works better than AdaBoost on noisy data. For this paper, we experimented with the data that we used previously, both as originally supplied and with added label noise, in which some of the examples have their original labels changed randomly. Our algorithm's experimental performance improvement over AdaBoost is even greater on the noisy data than on the original data.
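The averaging idea described in the abstract can be illustrated with a small sketch. This is not the exact update presented in the talk: the function names are hypothetical, and the running-average weighting (t parts old distribution to 1 part new candidate) is an assumption chosen only to show how averaging damps the reweighting. The AdaBoost-style step uses the standard normalized form in which misclassified examples end up with half of the total weight.

```python
def adaboost_reweight(dist, mistakes, eps):
    """AdaBoost-style candidate distribution (normalized form).

    Misclassified examples are scaled by 1/(2*eps) and correctly
    classified ones by 1/(2*(1-eps)), so each group sums to 1/2.
    """
    return [
        d / (2.0 * eps) if wrong else d / (2.0 * (1.0 - eps))
        for d, wrong in zip(dist, mistakes)
    ]


def averaged_reweight(dist, mistakes, t):
    """Blend the AdaBoost-style candidate with the current distribution.

    ASSUMPTION: a running average with weights t/(t+1) and 1/(t+1),
    where t is the number of base models built so far. The weighting
    used in the talk's algorithm may differ.
    """
    # Weighted error of the current base model under dist.
    eps = sum(d for d, wrong in zip(dist, mistakes) if wrong)
    if not 0.0 < eps < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    candidate = adaboost_reweight(dist, mistakes, eps)
    return [(t * d + c) / (t + 1) for d, c in zip(dist, candidate)]


# Four training examples, uniform start; the first base model
# misclassifies only the first example.
uniform = [0.25, 0.25, 0.25, 0.25]
mistakes = [True, False, False, False]

candidate = adaboost_reweight(uniform, mistakes, eps=0.25)
averaged = averaged_reweight(uniform, mistakes, t=1)
print(candidate)
print(averaged)
```

In this toy case the misclassified example's weight moves from 0.25 to 0.5 under the AdaBoost-style step, but only to 0.375 after averaging. That smaller shift is the kind of regularizing effect the abstract refers to, which would matter when the misclassified example is a mislabeled one.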

   Bio for Nikunj C. Oza

Nikunj C. Oza received his B.S. in Mathematics with Computer Science from the Massachusetts Institute of Technology (MIT) in 1994. He received his M.S. (in 1998) and Ph.D. (in 2001) in Computer Science from the University of California at Berkeley. He then joined NASA Ames Research Center as a research scientist and is a member of the Data Mining and Complex Adaptive Systems group at NASA. His research interests include machine learning (especially ensemble learning and online learning), data mining, fault detection, and satellite image understanding.

   Note for Visitors to SRI

Please arrive at least 10 minutes early, as you will need to sign in by following the instructions posted by the lobby phone at Building E. SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking lots off Fourth Street. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page. There are two entrances to SRI International on Ravenswood Avenue. Please check the signage for the Building E entrance.
