Paul Luo Li (Carnegie Mellon University), James Herbsleb (Carnegie Mellon University)
PAC Learning, adapted from Tom M. Mitchell, Carnegie Mellon University.
PAC Learning
adapted from
Tom M. Mitchell
Carnegie Mellon University
Learning Issues
Under what conditions is successful learning
… possible?
… assured for a particular learning algorithm?
Sample Complexity
How many training examples are needed
… for a learner to converge (with high probability) to a successful hypothesis?
Computational Complexity
How much computational effort is needed
… for a learner to converge (with high probability) to a successful hypothesis?
The world
X is the sample space
Example: two dice, X = {(1,1), (1,2), …, (6,5), (6,6)}
[Figure: scatter of points in the sample space X]
Weighted world
𝒟 is a probability distribution over X
Example: biased dice, {(1,1; p11), (1,2; p12), …, (6,5; p65), (6,6; p66)}
[Figure: sample space with points weighted by 𝒟]
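A weighted world can be sketched concretely. The weights below are an illustrative assumption (any p_ij summing to 1 would define a valid distribution 𝒟):

```python
import random

# The 36 outcomes of two dice.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

# Assumed weights (hypothetical): outcomes where the first die shows 6
# are twice as likely as the rest.
weights = [2.0 if i == 6 else 1.0 for (i, j) in outcomes]
total = sum(weights)
probs = [w / total for w in weights]          # the p_ij, summing to 1

random.seed(1)
# Draw an i.i.d. sample from the weighted world (X, 𝒟).
sample = random.choices(outcomes, weights=probs, k=10)
```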
An event
E is a subset of X
Example: two dice, X = {(1,1), (1,2), …, (6,5), (6,6)}
[Figure: scatter of points in the sample space X]
An event
E is a subset of X
Example: a pair in two dice, E = {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)}
[Figure: sample space with the points of E highlighted]
A Concept
C is an indicator function of an event E
Example: a pair in two dice, c(x, y) := (x == y)
[Figure: sample space with the event E highlighted]
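The indicator-function view can be made concrete on the finite two-dice world:

```python
# The concept "a pair" as an indicator function over the two-dice sample space.
def c(x, y):
    return x == y          # true exactly on E = {(1,1), (2,2), ..., (6,6)}

X = [(i, j) for i in range(1, 7) for j in range(1, 7)]
event = [p for p in X if c(*p)]        # the event E that c indicates
p_pair = len(event) / len(X)           # Pr[E] under the uniform distribution
```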
A hypothesis
h is an approximation to a concept c
Example: a separating hyperplane
h(x, y) := ½ [1 + sign(a·x + b·y + c)]
[Figure: a separating line drawn over the sample space]
The dataset
D is an i.i.d. sample of m examples, drawn from X according to 𝒟 and labeled by the target concept c:
D = {⟨x_i, c(x_i)⟩}, i = 1, …, m
An Inductive learner
L is an algorithm that uses the data D to produce a hypothesis h ∈ H
Example: the perceptron algorithm
h(x, y) := ½ [1 + sign(a(D)·x + b(D)·y + c(D))]
[Figure: the learned separating line over the sample space]
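A minimal perceptron can be sketched as follows. The training concept here (label 1 iff x + y > 7) is an illustrative, linearly separable assumption, since the "pair" concept is not linearly separable in (x, y):

```python
def predict(a, b, c, x, y):
    # h(x, y) = (1/2) * [1 + sign(a*x + b*y + c)], encoded as a 0/1 label
    return 1 if a * x + b * y + c > 0 else 0

def perceptron(data, epochs=20000):
    """Minimal perceptron: learns (a, b, c) from labeled points.
    On linearly separable data it converges after finitely many updates."""
    a = b = c = 0.0
    for _ in range(epochs):
        updated = False
        for (x, y), label in data:
            if predict(a, b, c, x, y) != label:
                step = 1 if label == 1 else -1   # move the boundary toward the point
                a += step * x
                b += step * y
                c += step
                updated = True
        if not updated:          # a full clean pass: training error is 0
            break
    return a, b, c

# Illustrative separable concept on the two-dice sample space (an assumption):
data = [((i, j), 1 if i + j > 7 else 0)
        for i in range(1, 7) for j in range(1, 7)]
a, b, c = perceptron(data)
```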
Error Measures
Training error of hypothesis h: how often h(x) ≠ c(x) over the training instances,
error_train(h) := (1/m) |{i : h(x_i) ≠ c(x_i)}|
True error of hypothesis h: how often h(x) ≠ c(x) over future instances drawn at random from 𝒟,
error_𝒟(h) := Pr_{x∼𝒟}[h(x) ≠ c(x)]
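Both error measures can be computed directly on the finite two-dice world; the hypothesis h below is a deliberately poor stand-in chosen only for illustration:

```python
import random

X = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def c(x, y): return x == y         # target concept: "a pair"
def h(x, y): return x == y == 6    # hypothesis: only recognizes (6,6)

random.seed(0)
train = random.choices(X, k=20)    # i.i.d. training sample (uniform distribution)
train_error = sum(h(*p) != c(*p) for p in train) / len(train)

# True error is exact here, since X is finite and the distribution is uniform:
# h disagrees with c on the five pairs it misses, (1,1) through (5,5).
true_error = sum(h(*p) != c(*p) for p in X) / len(X)
```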
Learnability
How should we characterize learnability?
A natural demand: bound the number of training examples needed to learn a hypothesis h for which error_𝒟(h) = 0.
This is infeasible: the learner cannot tell which of the (possibly many) hypotheses consistent with D is the target concept, and the randomly drawn training sample may be unrepresentative.
PAC Learnability
Weaken the demands on the learner:
true error bounded by an accuracy parameter ε
failure probability bounded by δ
ε and δ can be made arbitrarily small
Probably Approximately Correct (PAC) Learning
PAC Learnability
C is PAC-learnable by L if, with probability at least (1 − δ), L outputs a hypothesis with true error < ε
after a reasonable number of examples and a reasonable time per example
Reasonable: polynomial in 1/ε, 1/δ, n (the size of the examples), and the encoding length of the target concept
PAC Learnability
Pr[error_𝒟(h) ≤ ε] ≥ 1 − δ
C is PAC-learnable if
each target concept in C can be learned from a polynomial number of training examples
the processing time per example is also polynomially bounded
polynomial in 1/ε, 1/δ, n (the size of the examples), and the encoding length of the target concept c
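The "polynomial number of training examples" has a concrete classical form for a consistent learner over a finite hypothesis space H. The bound m ≥ (1/ε)(ln|H| + ln(1/δ)) is a standard result, not stated on the slides themselves:

```python
import math

def sample_complexity(h_size, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite H to be
    probably (confidence 1 - delta) approximately (error <= epsilon) correct:
    m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

# e.g. |H| = 2**10 hypotheses, 5% error, 95% confidence
m = sample_complexity(2 ** 10, epsilon=0.05, delta=0.05)
```

Note that the bound is polynomial in 1/ε and 1/δ, as the definition requires, and only logarithmic in |H|.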