App Forum 2015 Deciding your next step in application development for Android
Deciding what to try next
description
Transcript of Deciding what to try next
Advice for applying machine learning
Deciding what to try next
Machine Learning
Andrew Ng
Debugging a learning algorithm:Suppose you have implemented regularized linear regression to predict housing prices.
However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?
- Get more training examples- Try smaller sets of features- Try getting additional features- Try adding polynomial features- Try decreasing- Try increasing
Andrew Ng
Machine learning diagnostic:Diagnostic: A test that you can run to gain insight what is/isn’t working with a learning algorithm, and gain guidance as to how best to improve its performance.
Diagnostics can take time to implement, but doing so can be a very good use of your time.
Advice for applying machine learning
Evaluating a hypothesis
Machine Learning
Andrew Ng
Evaluating your hypothesis
Fails to generalize to new examples not in training set.
size
pric
e
size of houseno. of bedroomsno. of floorsage of house
kitchen sizeaverage income in neighborhood
Andrew Ng
Evaluating your hypothesisDataset:
Size Price2104 4001600 3302400 3691416 2323000 5401985 3001534 3151427 1991380 2121494 243
Andrew Ng
Training/testing procedure for linear regression
- Learn parameter from training data (minimizing training error )
- Compute test set error:
Andrew Ng
Training/testing procedure for logistic regression- Learn parameter from training data- Compute test set error:
- Misclassification error (0/1 misclassification error):
Advice for applying machine learning
Model selection and training/validation/test sets
Machine Learning
Andrew Ng
Overfitting example
size
pric
e Once parameterswere fit to some set of data (training set), the error of the parameters as measured on that data (the training error xxxxx) is likely to be lower than the actual generalization error.
Andrew Ng
Model selection1.2.3.
10.
ChooseHow well does the model generalize? Report test set error .Problem: is likely to be an optimistic estimate of generalization error. I.e. our extra parameter ( = degree of polynomial) is fit to test set.
Andrew Ng
Evaluating your hypothesisDataset:
Size Price2104 4001600 3302400 3691416 2323000 5401985 3001534 3151427 1991380 2121494 243
Andrew Ng
Train/validation/test errorTraining error:
Cross Validation error:
Test error:
Andrew Ng
Model selection
1.2.3.
10.
PickEstimate generalization error for test set
Advice for applying machine learning
Diagnosing bias vs. variance
Machine Learning
Andrew Ng
Bias/variance
High bias(underfit)
“Just right” High variance(overfit)
Pric
e
Size
Pric
e
Size
Pric
e
Size
Andrew Ng
Bias/variance
degree of polynomial d
erro
r
Training error:
Cross validation error:
Andrew Ng
Diagnosing bias vs. variance
degree of polynomial d
erro
r
Suppose your learning algorithm is performing less well than you were hoping. ( or is high.) Is it a bias problem or a variance problem?
(cross validation error)
(training error)
Bias (underfit):
Variance (overfit):
Advice for applying machine learning
Regularization and bias/variance
Machine Learning
Andrew Ng
Linear regression with regularization
Large xxHigh bias (underfit)
Intermediate xx“Just right”
Small xxHigh variance (overfit)
Model:Pr
ice
Size
Pric
e
Size
Pric
e
Size
Andrew Ng
Choosing the regularization parameter
Andrew Ng
1. Try2. Try3. Try4. Try5. Try
12. Try
Model:
Choosing the regularization parameter
Pick (say) . Test error:
Andrew Ng
Bias/variance as a function of the regularization parameter
Advice for applying machine learning
Learning curves
Machine Learning
Andrew Ng
Learning curves
(training set size)
erro
r
Andrew Ng
High bias
(training set size)
erro
r
size
pric
e
size
pric
e
If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.
Andrew Ng
High variance
(training set size)
erro
r
size
pric
e
size
pric
e
If a learning algorithm is suffering from high variance, getting more training data is likely to help.
(and small )
Advice for applying machine learning
Deciding what to try next (revisited)
Machine Learning
Andrew Ng
Debugging a learning algorithm:Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis in a new set of houses, you find that it makes unacceptably large errors in its prediction. What should you try next?
- Get more training examples- Try smaller sets of features- Try getting additional features- Try adding polynomial features- Try decreasing- Try increasing
Andrew Ng
Neural networks and overfitting“Small” neural network
(fewer parameters; more prone to underfitting)
“Large” neural network(more parameters; more prone
to overfitting)
Computationally more expensive.
Use regularization ( ) to address overfitting.
Computationally cheaper