Outline Why R? How to find out about stuff? What is Machine Learning? Show me the money Learn More Questions
Machine Learning with R
Joshua Reich
josh@i2pi.com
April 2, 2009
Why R?
How to find out about stuff?
What is Machine Learning?
Show me the money
Learn More
Questions
ML Alternatives
• Matlab
• Weka
• Python
• Standalone tools (e.g. Vowpal Wabbit)
“The best thing about R is that it was developed by statisticians.
The worst thing about R is that it was developed by statisticians.”
–Bo Cowgill, Google (at SF R Meetup)
Why R?
• Working with the CLI - iterative discovery
• Integrated graphics
• Community supported packages (CRAN)
• ODBC Integration
• You already use it
How to find out about stuff?
• ?function
• help.search("search string")
• RSiteSearch("search string")
• http://rseek.org/
• names(object) or attributes(object)
• > kmeans
function (x, centers, iter.max = 10, nstart = 1,
algorithm = c("Hartigan-Wong",
"Lloyd", "Forgy", "MacQueen"))
...
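A short session using the lookups above might look like the following sketch. The names `pts` and `fit` and the random input are made up for illustration. The help calls are commented out because they open a viewer or a browser.

```r
# Illustrative session; `pts` and `fit` are hypothetical names.
set.seed(1)
pts <- matrix(rnorm(100), ncol = 2)   # 50 random points in 2-D
fit <- kmeans(pts, centers = 2)

names(fit)     # components of the fitted object, e.g. "cluster", "centers"
class(fit)     # "kmeans"
args(kmeans)   # the argument list without the function body

# Interactive lookups, left commented out:
# ?kmeans
# help.search("clustering")
# RSiteSearch("k-means")
```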
What is Machine Learning?
Statistics          Machine Learning
Probability Model   Learning Model
Observations        Observations
Estimation          Training
MLE                 Optimization
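The last row of the table can be made concrete with a small sketch: maximum likelihood estimation written as a generic optimization problem. The simulated data, the normal model and all variable names are assumptions chosen for illustration.

```r
# Assumption: data are i.i.d. Normal(mu, sigma); all names are illustrative.
set.seed(42)
x <- rnorm(200, mean = 3, sd = 2)

# Negative log-likelihood. sigma is parameterised on the log scale so the
# optimiser can search over all real numbers.
nll <- function(par) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

opt <- optim(c(0, 0), nll)   # Nelder-Mead by default
mu_hat    <- opt$par[1]
sigma_hat <- exp(opt$par[2])

# Closed-form MLEs for comparison
mu_closed    <- mean(x)
sigma_closed <- sqrt(mean((x - mean(x))^2))

c(mu_hat = mu_hat, mu_closed = mu_closed)
c(sigma_hat = sigma_hat, sigma_closed = sigma_closed)
```

The statistician's "estimate" and the optimiser's "minimum of a loss" should agree here up to numerical tolerance.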
Semantics v. Pragmatics
• For most statistical models there are either closed-form solutions or
quick numerical approximations for finding model properties,
e.g., confidence intervals. If you believe that your
data-generating process is accurately captured by your model,
then you can make direct statements about unseen events.
• Machine learning is a close cousin of non-parametric
techniques and relies on training/validation/testing cycles,
bootstrapping and cross-validation to measure
reliability.
But invariably, simple models and a lot of data trump more
elaborate models based on less data
–Halevy, Norvig & Pereira.
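The contrast between a model-based statement and a resampling-based one can be sketched as follows. The sample is simulated and its parameters are arbitrary. The t interval stands in for the "closed form" route and a percentile bootstrap for the resampling route.

```r
set.seed(7)
x <- rnorm(50, mean = 10, sd = 3)   # simulated sample; parameters are arbitrary

# Closed form: t-based 95% confidence interval for the mean,
# valid if the normal model is believed.
ci_model <- t.test(x)$conf.int

# Resampling: percentile bootstrap, with no distributional formula needed.
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))
ci_boot <- quantile(boot_means, c(0.025, 0.975))

ci_model
ci_boot
```

On well-behaved data like this the two intervals are usually close. They can diverge when the model assumptions fail.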
Inductive Bias
In the days when Sussman was a novice, Minsky once
came to him as he sat hacking at the PDP-6.
“What are you doing?”, asked Minsky.
“I am training a randomly wired neural net to play
Tic-tac-toe”, Sussman replied.
“Why is the net wired randomly?”, asked Minsky.
“I do not want it to have any preconceptions of how to
play”, Sussman said.
Minsky then shut his eyes.
“Why do you close your eyes?” Sussman asked his
teacher.
“So that the room will be empty.”
What is Machine Learning?
• Regression vs. Classification
$Y \in \mathbb{R}^p$
vs.
$Y \in \{Y_1, Y_2, \ldots, Y_N\}$
• Supervised vs. Unsupervised Learning
$Y = f(X)$
vs.
$X_1, X_2, \ldots, X_N$
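A minimal sketch of the three settings, using data sets that ship with R (`cars`, `iris`, `faithful`). These data sets and the particular models are stand-ins chosen for illustration, not the ones used in the talk.

```r
# Regression: the response is numeric (stopping distance in `cars`).
reg <- lm(dist ~ speed, data = cars)
is.numeric(cars$dist)

# Classification: the response is one of a finite set of labels.
two <- droplevels(subset(iris, Species != "setosa"))
clf <- glm(Species ~ Petal.Width, data = two, family = binomial)
levels(two$Species)

# Unsupervised: only X is given; there are no labels to learn from.
set.seed(1)
km <- kmeans(scale(faithful), centers = 2)
table(km$cluster)
```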
General Supervised Learning Framework
(Ch 7 of ElemStatLearn)
• Training / Validation / Test
• Variance - Bias Decomposition: Overfitting
• Feature Selection / Regularization
• Bootstrapping / Cross-Validation
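The framework above can be sketched in base R. The example assumes the built-in `mtcars` data and two hypothetical candidate models. It holds out a test set, compares the candidates by 5-fold cross-validation on the training rows, then scores the chosen model once on the test set.

```r
set.seed(123)
n <- nrow(mtcars)

# Hold out a test set that is not touched during model selection.
test_idx <- sample(n, size = round(0.25 * n))
train <- mtcars[-test_idx, ]
test  <- mtcars[test_idx, ]

# 5-fold cross-validation on the training rows.
k <- 5
fold <- sample(rep(1:k, length.out = nrow(train)))
cv_mse <- function(formula) {
  errs <- sapply(1:k, function(i) {
    fit  <- lm(formula, data = train[fold != i, ])
    pred <- predict(fit, newdata = train[fold == i, ])
    mean((train$mpg[fold == i] - pred)^2)
  })
  mean(errs)
}

f_simple  <- mpg ~ wt
f_complex <- mpg ~ wt + hp + disp + drat + qsec
cv_simple  <- cv_mse(f_simple)
cv_complex <- cv_mse(f_complex)

# Refit the better candidate on all training rows; score once on the test set.
best  <- if (cv_simple <= cv_complex) f_simple else f_complex
final <- lm(best, data = train)
test_mse <- mean((test$mpg - predict(final, newdata = test))^2)

c(cv_simple = cv_simple, cv_complex = cv_complex, test_mse = test_mse)
```

Keeping the test rows out of the cross-validation loop is what makes `test_mse` an honest estimate. Reusing them for model choice would bias it downward.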
What we will walk through
• K-Means clustering: kmeans()
• K-Nearest Neighbours: knn()
• Regression Trees: rpart()
• Improving trees with PCA: princomp()
• Linear Discriminant Analysis: lda()
• Support Vector Machines: svm()
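As a rough preview, the functions listed above can be called as in the sketch below. It makes several assumptions: the built-in `iris` data stand in for the talk's data set, the recommended packages class, rpart and MASS are installed, and e1071 (which provides `svm()`) is treated as optional. Because the response is a factor, `rpart()` fits a classification tree here, not a regression tree.

```r
library(class)   # knn()
library(rpart)   # rpart()
library(MASS)    # lda()

set.seed(2009)
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]
acc <- function(pred) mean(pred == test$Species)   # test-set accuracy

# K-means: unsupervised, so Species is not used
km <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

# K-nearest neighbours
knn_pred <- knn(train[, 1:4], test[, 1:4], cl = train$Species, k = 5)

# Tree on the raw features
tree <- rpart(Species ~ ., data = train)
tree_pred <- predict(tree, test, type = "class")

# Tree on principal component scores
pca <- princomp(train[, 1:4])
train_pc <- data.frame(Species = train$Species, predict(pca, train[, 1:4]))
test_pc  <- data.frame(predict(pca, test[, 1:4]))
tree_pc  <- rpart(Species ~ ., data = train_pc)
tree_pc_pred <- predict(tree_pc, test_pc, type = "class")

# Linear discriminant analysis
lda_pred <- predict(lda(Species ~ ., data = train), test)$class

results <- c(knn = acc(knn_pred), tree = acc(tree_pred),
             tree_pca = acc(tree_pc_pred), lda = acc(lda_pred))

# Support vector machine, only if e1071 happens to be installed
if (requireNamespace("e1071", quietly = TRUE)) {
  svm_fit <- e1071::svm(Species ~ ., data = train)
  results["svm"] <- acc(predict(svm_fit, test))
}
results
```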
The Problem
[Figure: scatter plot of points from three classes, drawn as the letters A, B and C; x axis from −5 to 10, y axis from −10 to 15.]
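A data set of this general shape can be simulated and plotted with the sketch below. The three Gaussian clouds, their centres and their spreads are guesses chosen to fit the axis ranges of the figure. They are not the values used in the talk.

```r
# Hypothetical stand-in for the talk's data: three Gaussian clouds labelled A, B, C.
set.seed(2)
n <- 500
centres <- rbind(A = c(2, 6), B = c(0, -2), C = c(6, 2))   # guessed centres

d <- do.call(rbind, lapply(rownames(centres), function(cl) {
  data.frame(x = rnorm(n, centres[cl, 1], 2),
             y = rnorm(n, centres[cl, 2], 2),
             class = cl)
}))
d$class <- factor(d$class)

# Draw each point as its class letter
plot(d$x, d$y, type = "n", xlab = "x", ylab = "y")
text(d$x, d$y, labels = as.character(d$class),
     col = as.integer(d$class), cex = 0.6)
```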
Machine Learning More
• MachineLearning CRAN view
http://cran.r-project.org/web/views/MachineLearning.html
• The caret package is a good one.
• Elements of Statistical Learning
http://www-stat.stanford.edu/~tibs/ElemStatLearn/
• Machine Learning (Mitchell)
http://www.cs.cmu.edu/~tom/mlbook.html
• Video Lectures
http://videolectures.net/
Questions?