Ensemble and Boosting
Presented by Lunjun Wan
University of Science & Technology of China
Ensemble
An ensemble is a group of learning models that solve a problem jointly.
Ensemble Design
Individual accuracy
Diversity
[Figure: four example ensembles of three classifiers each —
three similar classifiers, each with ACC = 90%;
three random classifiers, each with ACC = 50%;
three similar classifiers, each with ACC = 90%;
three perfectly diverse classifiers, each with ACC = 67%.]
Handling the accuracy/diversity trade-off is crucial
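A quick numeric sketch (not part of the original slides) of why diversity matters: under majority voting, three identical 90% classifiers stay at 90%, three independent 90% classifiers reach about 97%, and three "perfectly diverse" 67% classifiers, whose error sets never overlap, are always right.

```python
# Majority-vote accuracy of three classifiers under different diversity assumptions.
from itertools import product

def majority_vote_accuracy(p, independent=True):
    """Accuracy of a 2-of-3 majority vote when each classifier is correct with prob p."""
    if not independent:
        return p  # three identical classifiers always agree, so the vote changes nothing
    acc = 0.0
    for outcome in product([True, False], repeat=3):    # each classifier correct or not
        prob = 1.0
        for correct in outcome:
            prob *= p if correct else (1 - p)
        if sum(outcome) >= 2:                           # the majority is correct
            acc += prob
    return acc

print(majority_vote_accuracy(0.90, independent=False))  # 3 similar classifiers  -> 0.90
print(majority_vote_accuracy(0.50))                      # 3 independent at 50%   -> 0.50
print(majority_vote_accuracy(0.90))                      # 3 independent at 90%   -> 0.972
# 3 perfectly diverse classifiers at ACC = 67%: at most one of the three is ever
# wrong on a given event, so the 2-of-3 majority vote is always correct.
```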
Ensemble Methods
Boosting (Y. Freund, 1995)
Bagging (L. Breiman, 1996)
Negative Correlation Learning (X. Yao, 1996)
Random Forest (L. Breiman, 2001)
…
Boosting
[Figure: boosting scheme — the training sample is used to fit classifier C^(0)(x); the sample is then re-weighted, and the weighted samples are used to fit C^(1)(x), C^(2)(x), C^(3)(x), ..., C^(m)(x).]
The classifiers are combined as
$$y(x) = \sum_{i}^{N_{\mathrm{Classifier}}} w^{(i)} C^{(i)}(x)$$
AdaBoost
[Figure: the same boosting scheme — fit C^(0)(x) on the training sample, then repeatedly re-weight and fit C^(1)(x), C^(2)(x), ..., C^(m)(x) on the weighted samples.]
AdaBoost re-weights events misclassified by the previous classifier by
$$\frac{1 - f_{\mathrm{err}}}{f_{\mathrm{err}}}, \qquad \text{with } f_{\mathrm{err}} = \frac{\text{misclassified events}}{\text{all events}}$$
AdaBoost also weights the classifiers using the error rate of each individual classifier, according to
$$y(x) = \sum_{i}^{N_{\mathrm{Classifier}}} \log\!\left(\frac{1 - f_{\mathrm{err}}^{(i)}}{f_{\mathrm{err}}^{(i)}}\right) C^{(i)}(x)$$
AdaBoost
1. Start with weights $w_i = 1/N$, $i = 1, \dots, N$
2. Repeat for $m = 1, 2, \dots, M$:
   a) Fit the classifier $G_m(x) \in \{-1, 1\}$ using weights $w_i$ on the training data
   b) Compute $err_m = E_w[1_{y \neq G_m(x)}] = \dfrac{\sum_{i=1}^{N} w_i I(y_i \neq G_m(x_i))}{\sum_{i=1}^{N} w_i}$
   c) Compute $\alpha_m = \log\dfrac{1 - err_m}{err_m}$
   d) Set $w_i \leftarrow w_i \cdot \exp[\alpha_m I(y_i \neq G_m(x_i))]$, $i = 1, 2, \dots, N$
3. Output $G(x) = \mathrm{sign}\!\left[\sum_{m=1}^{M} \alpha_m G_m(x)\right]$
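A minimal NumPy sketch of this loop, using decision stumps as the weak classifiers $G_m(x)$; function names such as `fit_stump` and `adaboost` are illustrative, not from the slides.

```python
import numpy as np

def fit_stump(X, y, w):
    """Find the decision stump minimizing the weighted error (steps 2a/2b)."""
    best = None
    for j in range(X.shape[1]):                       # candidate feature
        for thr in np.unique(X[:, j]):                # candidate threshold
            for sign in (1, -1):                      # direction of the split
                pred = sign * np.where(X[:, j] > thr, 1, -1)
                err = np.sum(w * (pred != y)) / np.sum(w)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best                                       # (err_m, feature, threshold, sign)

def predict_stump(stump, X):
    _, j, thr, sign = stump
    return sign * np.where(X[:, j] > thr, 1, -1)

def adaboost(X, y, M=20):
    """Labels y must be in {-1, +1}; returns the list of (alpha_m, stump_m)."""
    N = len(y)
    w = np.full(N, 1.0 / N)                           # 1. w_i = 1/N
    ensemble = []
    for m in range(M):                                # 2. for m = 1, ..., M
        stump = fit_stump(X, y, w)                    # 2a. fit G_m with weights w_i
        err = np.clip(stump[0], 1e-10, 1 - 1e-10)     # 2b. weighted error, kept away from 0 and 1
        alpha = np.log((1 - err) / err)               # 2c. alpha_m = log[(1 - err_m)/err_m]
        miss = predict_stump(stump, X) != y
        w = w * np.exp(alpha * miss)                  # 2d. up-weight misclassified points
        w = w / w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, X):
    """3. G(x) = sign[sum_m alpha_m * G_m(x)]"""
    F = sum(alpha * predict_stump(stump, X) for alpha, stump in ensemble)
    return np.sign(F)
```

With labels in {-1, +1}, `predict(adaboost(X, y), X)` reproduces the weighted majority vote of step 3.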
AdaBoost fits an additive model in a forward stagewise manner.
"A Statistical View of Boosting" (Friedman et al., 2000)
Additive models
Suppose we have M basis functions:
$$F(x) = \sum_{m=1}^{M} \beta_m b(x; \gamma_m)$$
Select a specific loss function $L(y, f(x))$ and solve
$$\min_{\{\beta_m, \gamma_m\}_1^M} \; \sum_{i=1}^{N} L\!\left(y_i, \sum_{m=1}^{M} \beta_m b(x_i; \gamma_m)\right)$$
This joint optimization is very hard to solve.
Additive models
But if we add a single basis function at a time, the problem becomes much easier:
$$\min_{\beta, \gamma} \; \sum_{i=1}^{N} L(y_i, \beta b(x_i; \gamma))$$
1. Start with $f_0(x) = 0$
2. Repeat for $m = 1, 2, \dots, M$:
   a) $(\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \sum_{i=1}^{N} L\big(y_i, f_{m-1}(x_i) + \beta b(x_i; \gamma)\big)$
   b) $f_m(x) = f_{m-1}(x) + \beta_m b(x; \gamma_m)$
Forward Stagewise Additive Method
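A small sketch of this loop for the special case of squared-error loss $L(y, f) = \tfrac{1}{2}(y - f)^2$, where step 2a reduces to fitting the current residuals $y_i - f_{m-1}(x_i)$. Regression stumps stand in for the basis functions $b(x; \gamma)$; all names are illustrative.

```python
import numpy as np

def fit_residual_stump(x, r):
    """Least-squares regression stump on residuals r (solves step 2a for squared loss)."""
    best = None
    for thr in np.unique(x)[:-1]:                    # candidate split points
        left, right = r[x <= thr], r[x > thr]
        pred = np.where(x <= thr, left.mean(), right.mean())
        sse = np.sum((r - pred) ** 2)
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    _, thr, lo, hi = best
    return thr, lo, hi

def forward_stagewise(x, y, M=10):
    """Builds f_M(x) = sum_m b(x; gamma_m); beta_m is absorbed into the stump values."""
    f = np.zeros_like(y, dtype=float)                # 1. f_0(x) = 0
    stumps = []
    for _ in range(M):                               # 2. for m = 1, ..., M
        r = y - f                                    # residuals act as the target in step 2a
        thr, lo, hi = fit_residual_stump(x, r)
        f = f + np.where(x <= thr, lo, hi)           # 2b. f_m = f_{m-1} + b(x; gamma_m)
        stumps.append((thr, lo, hi))
    return stumps
```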
Boosting fits an additive model (1)
Take the exponential loss
$$L(y, f(x)) = \exp(-y f(x))$$
and use $G_m(x) \in \{-1, 1\}$ from AdaBoost as the basis function; the method can then be rewritten as:
1. Start with $f_0(x) = 0$
2. Repeat for $m = 1, 2, \dots, M$:
   a) $(\beta_m, G_m) = \arg\min_{\beta, G} \sum_{i=1}^{N} \exp[-y_i (f_{m-1}(x_i) + \beta G(x_i))]$
   b) $f_m(x) = f_{m-1}(x) + \beta_m G_m(x)$
Forward Stagewise Additive Method
Boosting fits an additive model (2)
Define $w_i^m = \exp(-y_i f_{m-1}(x_i))$. Then
$$
\begin{aligned}
(\beta_m, G_m) &= \arg\min_{\beta, G} \sum_{i=1}^{N} \exp[-y_i (f_{m-1}(x_i) + \beta G(x_i))] \\
&= \arg\min_{\beta, G} \sum_{i=1}^{N} w_i^m \exp[-y_i \beta G(x_i)] \\
&= \arg\min_{\beta, G} \; e^{-\beta} \sum_{y_i = G(x_i)} w_i^m + e^{\beta} \sum_{y_i \neq G(x_i)} w_i^m \\
&= \arg\min_{\beta, G} \; (e^{\beta} - e^{-\beta}) \sum_{i=1}^{N} w_i^m I(y_i \neq G(x_i)) + e^{-\beta} \sum_{i=1}^{N} w_i^m
\end{aligned}
$$
Boosting fits an additive model (3)
$$G_m = \arg\min_{G} \sum_{i=1}^{N} w_i^m \, I(y_i \neq G(x_i))$$
$$\beta_m = \frac{1}{2} \log\frac{1 - err_m}{err_m}, \qquad err_m = \frac{\sum_{i=1}^{N} w_i^m I(y_i \neq G_m(x_i))}{\sum_{i=1}^{N} w_i^m}$$
$$
\begin{aligned}
w_i^{m+1} &= \exp[-y_i f_m(x_i)] = \exp[-y_i (f_{m-1}(x_i) + \beta_m G_m(x_i))] \\
&= w_i^m \exp[-y_i \beta_m G_m(x_i)] = w_i^m \exp[\alpha_m I(y_i \neq G_m(x_i))] \cdot e^{-\beta_m}
\end{aligned}
$$
where $\alpha_m = 2\beta_m$; notice that $-y_i G_m(x_i) = 2 I(y_i \neq G_m(x_i)) - 1$.
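As a brief check (not spelled out on the original slides), $\beta_m$ follows by fixing $G_m$ in the split-sum form of the objective on the previous slide and setting the derivative with respect to $\beta$ to zero:
$$\frac{\partial}{\partial \beta}\left[e^{-\beta} \sum_{y_i = G_m(x_i)} w_i^m + e^{\beta} \sum_{y_i \neq G_m(x_i)} w_i^m\right] = 0
\;\Longrightarrow\; e^{2\beta} = \frac{\sum_{y_i = G_m(x_i)} w_i^m}{\sum_{y_i \neq G_m(x_i)} w_i^m} = \frac{1 - err_m}{err_m}
\;\Longrightarrow\; \beta_m = \frac{1}{2}\log\frac{1 - err_m}{err_m}$$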
AdaBoost builds an additive model that minimizes $\exp(-y f(x))$.
Gradient Boosting Machine (1)
Consider the gradient descent method in function space: at each step, compute the pointwise negative gradient $g_m(x_i) = -\left[\partial L(y_i, F(x_i)) / \partial F(x_i)\right]_{F = F_{m-1}}$ and use $\{x_i, g_m(x_i)\}$ to fit the gradient over the data distribution with a base learner.
Gradient Boosting Machine (2)
Loss functions:
Distribution                Loss function L(y_i, F(x_i))         -∂L(y_i, F(x_i))/∂F(x_i)
Least-squares               (1/2)[y_i - F(x_i)]^2                y_i - F(x_i)
LAD                         |y_i - F(x_i)|                       sign[y_i - F(x_i)]
Binomial log-likelihood     log(1 + exp(-2 y_i F(x_i)))          2 y_i / (1 + exp(2 y_i F(x_i)))
AdaBoost                    exp(-y_i F(x_i))                     y_i exp(-y_i F(x_i))
Gradient Boosting is a way to implement boosting with arbitrary loss functions by approximating, in some fashion, the gradient of the loss function.
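A minimal sketch of this idea (illustrative names; the learning-rate shrinkage is a common practical choice, not something prescribed by the slides): at each step a least-squares regression stump is fit to the pointwise negative gradient from the table, and F is updated by a small step along that fit.

```python
import numpy as np

# Negative gradients from the table (y in {-1, +1} for the classification losses).
NEG_GRADIENTS = {
    "least_squares": lambda y, F: y - F,
    "lad":           lambda y, F: np.sign(y - F),
    "binomial":      lambda y, F: 2 * y / (1 + np.exp(2 * y * F)),
    "adaboost":      lambda y, F: y * np.exp(-y * F),
}

def fit_stump(x, target):
    """Least-squares regression stump fit to `target` (the gradient approximation step)."""
    best = None
    for thr in np.unique(x)[:-1]:
        left, right = target[x <= thr], target[x > thr]
        pred = np.where(x <= thr, left.mean(), right.mean())
        sse = np.sum((target - pred) ** 2)
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    _, thr, lo, hi = best
    return lambda x_new: np.where(x_new <= thr, lo, hi)

def gradient_boost(x, y, loss="least_squares", M=100, lr=0.1):
    neg_grad = NEG_GRADIENTS[loss]
    F = np.zeros_like(y, dtype=float)          # F_0(x) = 0
    stumps = []
    for _ in range(M):
        g = neg_grad(y, F)                     # pseudo-residuals {x_i, g_m(x_i)}
        h = fit_stump(x, g)                    # base learner fit to the negative gradient
        F += lr * h(x)                         # gradient step in function space
        stumps.append(h)
    return lambda x_new: sum(lr * h(x_new) for h in stumps)
```

With loss="least_squares" the pseudo-residuals are exactly y - F, which recovers the residual-fitting loop from the forward stagewise sketch earlier; the other entries show how the same loop accommodates arbitrary loss functions.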