Naïve Bayes - University of Texas at Austin · Naïve Bayes for Digits Naïve Bayes: Assume all...
CS343: Artificial Intelligence
Naïve Bayes
Prof. Scott Niekum, The University of Texas at Austin
[These slides based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]
Machine Learning
▪ Up until now: how to use a model to make optimal decisions
▪ Machine learning: how to acquire a model from data / experience
  ▪ Learning parameters (e.g. probabilities)
  ▪ Learning structure (e.g. BN graphs)
  ▪ Learning hidden concepts (e.g. clustering)
▪ Today: model-based classification with Naive Bayes
Classification
Example: Spam Filter
▪ Input: an email
▪ Output: spam/real
▪ Setup:
  ▪ Get a large collection of example emails, each labeled “spam” or “real”
  ▪ Note: someone has to hand label all this data!
  ▪ Want to learn to predict labels of new, future emails
▪ Features: The attributes used to make the real/spam decision
  ▪ Words: FREE!
  ▪ Text Patterns: $dd, CAPS
  ▪ Non-text: SenderInContacts
  ▪ …
Dear Sir.
First, I must solicit your confidence in this transaction, this is by virture of its nature as being utterly confidencial and top secret. …
TO BE REMOVED FROM FUTURE MAILINGS, SIMPLY REPLY TO THIS MESSAGE AND PUT "REMOVE" IN THE SUBJECT.
99 MILLION EMAIL ADDRESSES FOR ONLY $99
Ok, I know this is blatantly OT but I'm beginning to go insane. Had an old Dell Dimension XPS sitting in the corner and decided to put it to use, I know it was working pre being stuck in the corner, but when I plugged it in, hit the power nothing happened.
Example: Digit Recognition
▪ Input: images / pixel grids
▪ Output: a digit 0-9
▪ Setup:
  ▪ Get a large collection of example images, each labeled with a digit
  ▪ Note: someone has to hand label all this data!
  ▪ Want to learn to predict labels of new, future digit images
▪ Features: The attributes used to make the digit decision
  ▪ Pixels: (6,8)=ON
  ▪ Shape Patterns: NumComponents, AspectRatio, NumLoops
  ▪ …
[Figure: example digit images labeled 0, 1, 2, 1, followed by an unlabeled image marked ??]
Other Classification Tasks
▪ Classification: given inputs x, predict labels (classes) y
▪ Examples:
  ▪ Spam detection (input: document, classes: spam / ham)
  ▪ OCR (input: images, classes: characters)
  ▪ Medical diagnosis (input: symptoms, classes: diseases)
  ▪ Automatic essay grading (input: document, classes: grades)
  ▪ Fraud detection (input: account activity, classes: fraud / no fraud)
  ▪ Customer service email routing
  ▪ … many more
▪ Classification is an important commercial technology!
Model-Based Classification
▪ Model-based approach
  ▪ Build a model (e.g. Bayes net) where both the label and features are random variables
  ▪ Instantiate any observed features
  ▪ Query for the distribution of the label conditioned on the features
▪ Challenges
  ▪ What structure should the BN have?
  ▪ How should we learn its parameters?
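Naïve Bayes for Digits
▪ Naïve Bayes: Assume all features are independent effects of the label
▪ Simple digit recognition version:
  ▪ One feature (variable) F_ij for each grid position <i,j>
  ▪ Feature values are on / off, based on whether intensity is more or less than 0.5 in underlying image
  ▪ Each input maps to a feature vector, e.g.
  ▪ Here: lots of features, each is binary valued
▪ Naïve Bayes model: $P(Y \mid F_{0,0} \ldots F_{15,15}) \propto P(Y) \prod_{i,j} P(F_{i,j} \mid Y)$
▪ What do we need to learn?
[Figure: Bayes net with the label Y as the parent of the features F1, F2, …, Fn]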
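The on/off feature mapping described on this slide can be sketched in Python. This is a minimal illustration, assuming the image arrives as a list of rows of intensities in [0, 1]. The name `extract_features` and the choice to treat an intensity of exactly 0.5 as off are assumptions, not something the slides specify.

```python
def extract_features(image, threshold=0.5):
    """Map a grid of pixel intensities to binary features keyed by (i, j)."""
    features = {}
    for i, row in enumerate(image):
        for j, intensity in enumerate(row):
            # Feature F_ij is on when the pixel is brighter than the threshold.
            features[(i, j)] = intensity > threshold
    return features


# A made-up 2x2 "image", just to show the shape of the output.
tiny_image = [
    [0.0, 0.9],
    [0.7, 0.2],
]
features = extract_features(tiny_image)
print(features)
```

Each grid position yields exactly one binary feature, so an image with many pixels produces many features, which is the "lots of features, each is binary valued" situation described above.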
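General Naïve Bayes
▪ A general Naive Bayes model: $P(Y, F_1 \ldots F_n) = P(Y) \prod_i P(F_i \mid Y)$
  ▪ The full joint $P(Y, F_1 \ldots F_n)$ has |Y| x |F|^n values
  ▪ $P(Y)$ has |Y| parameters
  ▪ The $P(F_i \mid Y)$ tables have n x |F| x |Y| parameters
▪ We only have to specify how each feature depends on the class
▪ Total number of parameters is linear in n
▪ Model is very simplistic, but often works anyway
[Figure: Bayes net with the label Y as the parent of the features F1, F2, …, Fn]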
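To make the "linear in n" claim concrete, the counts on this slide can be compared numerically. The sketch below counts table entries (not free parameters) and assumes, purely for illustration, 10 labels and a 16x16 grid of binary features; both function names are hypothetical.

```python
def naive_bayes_table_entries(num_labels, num_feature_values, num_features):
    """Entries in P(Y) plus entries in all the P(F_i | Y) tables."""
    prior_entries = num_labels
    conditional_entries = num_features * num_feature_values * num_labels
    return prior_entries + conditional_entries


def full_joint_entries(num_labels, num_feature_values, num_features):
    """Entries in an unfactored joint table over Y and all features."""
    return num_labels * num_feature_values ** num_features


# Assumed sizes: 10 digit labels, 256 binary pixel features.
nb_entries = naive_bayes_table_entries(10, 2, 256)
joint_entries = full_joint_entries(10, 2, 256)
print(nb_entries)                # grows linearly with the number of features
print(len(str(joint_entries)))   # number of decimal digits in the joint size
```

Doubling the number of features roughly doubles the Naïve Bayes count, while the unfactored joint grows exponentially.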
Inference for Naïve Bayes
▪ Goal: compute posterior distribution over label variable Y
  ▪ Step 1: get joint probability of label and evidence for each label
  ▪ Step 2: sum to get probability of evidence
  ▪ Step 3: normalize by dividing Step 1 by Step 2
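The three inference steps can be sketched directly in Python. This is a minimal illustration under stated assumptions: the prior is a dict from label to probability, each label has one table per feature, and all numbers in the example are made up. The function name `naive_bayes_posterior` is hypothetical.

```python
def naive_bayes_posterior(prior, conditionals, evidence):
    """Return P(Y | evidence) for a Naive Bayes model.

    prior:        {label: P(label)}
    conditionals: {label: [table_1, ..., table_n]}, table_i[value] = P(F_i = value | label)
    evidence:     [f_1, ..., f_n], the observed feature values
    """
    # Step 1: joint probability of each label together with the evidence.
    joint = {}
    for label, p_label in prior.items():
        p = p_label
        for table, value in zip(conditionals[label], evidence):
            p *= table[value]
        joint[label] = p

    # Step 2: sum over labels to get the probability of the evidence.
    p_evidence = sum(joint.values())

    # Step 3: normalize. This divides by zero if every joint entry is zero.
    return {label: p / p_evidence for label, p in joint.items()}


# Made-up two-label, two-feature example.
prior = {"a": 0.5, "b": 0.5}
conditionals = {
    "a": [{True: 0.8, False: 0.2}, {True: 0.5, False: 0.5}],
    "b": [{True: 0.2, False: 0.8}, {True: 0.5, False: 0.5}],
}
posterior = naive_bayes_posterior(prior, conditionals, [True, False])
print(posterior)
```

With many features the product in Step 1 underflows quickly, so practical implementations usually sum log probabilities instead.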
General Naïve Bayes
▪ What do we need in order to use Naïve Bayes?
  ▪ Inference method (we just saw this part)
    ▪ Start with a bunch of probabilities: P(Y) and the P(F_i|Y) tables
    ▪ Use standard inference to compute P(Y|F_1…F_n)
    ▪ Nothing new here
  ▪ Estimates of local conditional probability tables
    ▪ P(Y), the prior over labels
    ▪ P(F_i|Y) for each feature (evidence variable)
    ▪ These probabilities are collectively called the parameters of the model and denoted by θ
    ▪ Up until now, we assumed these appeared by magic, but…
    ▪ …they typically come from training data counts: we’ll look at this soon
Example: Conditional Probabilities
1 : 0.1  2 : 0.1  3 : 0.1  4 : 0.1  5 : 0.1  6 : 0.1  7 : 0.1  8 : 0.1  9 : 0.1  0 : 0.1
1 : 0.01  2 : 0.05  3 : 0.05  4 : 0.30  5 : 0.80  6 : 0.90  7 : 0.05  8 : 0.60  9 : 0.50  0 : 0.80
1 : 0.05  2 : 0.01  3 : 0.90  4 : 0.80  5 : 0.90  6 : 0.90  7 : 0.25  8 : 0.85  9 : 0.60  0 : 0.80
Naïve Bayes for Text
▪ Bag-of-words Naïve Bayes:
  ▪ Features: W_i is the word at position i (word at position i, not the i-th word in the dictionary!)
  ▪ As before: predict label conditioned on feature variables (spam vs. ham)
  ▪ As before: assume features are conditionally independent given label
  ▪ New: each W_i is identically distributed
▪ Generative model: $P(Y, W_1 \ldots W_n) = P(Y) \prod_i P(W_i \mid Y)$
▪ “Tied” distributions and bag-of-words
  ▪ Usually, each variable gets its own conditional probability distribution P(F|Y)
  ▪ In a bag-of-words model
    ▪ Each position is identically distributed
    ▪ All positions share the same conditional probs P(W|Y)
    ▪ Why make this assumption?
  ▪ Called “bag-of-words” because model is insensitive to word order or reordering
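The order-insensitivity of the bag-of-words model follows from every position sharing one table P(W|Y), so the score is a product over words in any order. The sketch below illustrates this, assuming the usual factorization P(Y) times the product of P(W_i|Y). The word probabilities are the spam values for "you", "like", and "to" that appear later in these slides; the function name `log_joint` is hypothetical.

```python
import math


def log_joint(words, label, prior, word_probs):
    """log P(label) + sum of log P(word | label), using one shared table per label."""
    total = math.log(prior[label])
    for word in words:
        total += math.log(word_probs[label][word])
    return total


prior = {"spam": 0.33}
word_probs = {"spam": {"you": 0.00881, "like": 0.00086, "to": 0.01517}}

original = log_joint(["you", "like", "to"], "spam", prior, word_probs)
reordered = log_joint(["to", "you", "like"], "spam", prior, word_probs)

# The two scores agree up to floating-point rounding.
print(math.isclose(original, reordered))
```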
Example: Spam Filtering
▪ Model:
▪ What are the parameters?
▪ Where do these tables come from?
the : 0.0156 to : 0.0153 and : 0.0115 of : 0.0095 you : 0.0093 a : 0.0086 with: 0.0080 from: 0.0075 ...
the : 0.0210 to : 0.0133 of : 0.0119 2002: 0.0110 with: 0.0108 from: 0.0107 and : 0.0105 a : 0.0100 ...
ham : 0.66 spam: 0.33
Spam Example
Word      P(w|spam)   P(w|ham)   Tot Spam   Tot Ham
(prior)   0.33333     0.66666    -1.1       -0.4
Gary      0.00002     0.00021    -11.8      -8.9
would     0.00069     0.00084    -19.1      -16.0
you       0.00881     0.00304    -23.8      -21.8
like      0.00086     0.00083    -30.9      -28.9
to        0.01517     0.01339    -35.1      -33.2
lose      0.00008     0.00002    -44.5      -44.0
weight    0.00016     0.00002    -53.3      -55.0
while     0.00027     0.00027    -61.5      -63.2
you       0.00881     0.00304    -66.2      -69.0
sleep     0.00006     0.00001    -76.0      -80.5
P(spam | w) = 98.9%
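The running totals in this example are consistent with sums of natural-log probabilities, which the following sketch reproduces from the rounded table values. It is an illustration rather than the lecturer's code. Because the table entries are rounded to five decimals, the computed totals and the final posterior can differ slightly from the figures on the slide.

```python
import math

# (word, P(w | spam), P(w | ham)) copied from the spam example table.
rows = [
    ("Gary", 0.00002, 0.00021),
    ("would", 0.00069, 0.00084),
    ("you", 0.00881, 0.00304),
    ("like", 0.00086, 0.00083),
    ("to", 0.01517, 0.01339),
    ("lose", 0.00008, 0.00002),
    ("weight", 0.00016, 0.00002),
    ("while", 0.00027, 0.00027),
    ("you", 0.00881, 0.00304),
    ("sleep", 0.00006, 0.00001),
]

# Start each running total from the log prior.
log_spam = math.log(0.33333)
log_ham = math.log(0.66666)

for word, p_spam, p_ham in rows:
    log_spam += math.log(p_spam)
    log_ham += math.log(p_ham)
    print(f"{word:8s} {log_spam:8.1f} {log_ham:8.1f}")

# Normalize the two joint scores without leaving log space until the end.
posterior_spam = 1.0 / (1.0 + math.exp(log_ham - log_spam))
print(f"P(spam | w) is roughly {posterior_spam:.3f}")
```

Working in log space avoids multiplying many small numbers together, which would underflow for longer emails.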
Training and Testing
Important Concepts
▪ Data: labeled instances, e.g. emails marked spam/ham
  ▪ Training set
  ▪ Held out set
  ▪ Test set
▪ Features: attribute-value pairs which characterize each x
▪ Experimentation cycle
  ▪ Learn parameters (e.g. model probabilities) on training set
  ▪ (Tune hyperparameters on held-out set)
  ▪ Compute accuracy on the test set
  ▪ Very important: never “peek” at the test set!
▪ Evaluation
  ▪ Accuracy: fraction of instances predicted correctly
▪ Overfitting and generalization
  ▪ Want a classifier which does well on test data
  ▪ Overfitting: fitting the training data very closely, but not generalizing well; tuning on held-out data helps to avoid this
  ▪ We’ll investigate overfitting and generalization formally in a few lectures
[Figure: the data divided into Training Data, Held-Out Data, and Test Data]
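The three-way split and the accuracy measure can be sketched as follows. The 80/10/10 proportions, the fixed shuffle seed, and the function names are assumptions chosen for illustration; the slides do not prescribe them.

```python
import random


def split_data(examples, train_frac=0.8, held_out_frac=0.1, seed=0):
    """Shuffle once, then cut into training, held-out, and test portions."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_held_out = int(len(shuffled) * held_out_frac)
    train = shuffled[:n_train]
    held_out = shuffled[n_train:n_train + n_held_out]
    test = shuffled[n_train + n_held_out:]  # everything left over
    return train, held_out, test


def accuracy(predictions, labels):
    """Fraction of instances predicted correctly."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


train, held_out, test = split_data(range(100))
print(len(train), len(held_out), len(test))
print(accuracy(["spam", "ham", "ham"], ["spam", "ham", "spam"]))
```

The test portion is set aside once and only touched for the final accuracy figure, matching the "never peek" rule above.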
Generalization and Overfitting
Overfitting
[Figure: degree 15 polynomial fit; x-axis from 0 to 20, y-axis from -15 to 30]
Example: Overfitting
2 wins!!
Example: Overfitting
▪ Posteriors determined by relative probabilities (odds ratios):
P(W | ham) / P(W | spam):
south-west : inf  nation : inf  morally : inf  nicely : inf  extent : inf  seriously : inf  ...
P(W | spam) / P(W | ham):
screens : inf  minute : inf  guaranteed : inf  $205.00 : inf  delivery : inf  signature : inf  ...
What went wrong here?
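An infinite odds ratio appears whenever a word was seen with one label and never with the other, so the relative-frequency estimate in the denominator is exactly zero. The sketch below shows this with made-up counts; the function name `odds_ratio` and the numbers are assumptions for illustration only.

```python
import math


def odds_ratio(count_a, total_a, count_b, total_b):
    """Ratio of two relative-frequency estimates, P(w | a) / P(w | b)."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    if p_b == 0:
        # Never seen under label b: the ratio is unbounded (or undefined if p_a is 0 too).
        return math.inf if p_a > 0 else math.nan
    return p_a / p_b


# Made-up counts: the word appeared 3 times under one label and never under the other.
print(odds_ratio(3, 1000, 0, 1000))
# Made-up counts where both labels have seen the word.
print(odds_ratio(3, 1000, 1, 1000))
```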
Generalization and Overfitting
▪ Relative frequency parameters will overfit the training data!
  ▪ Just because we never saw a 3 with pixel (15,15) on during training doesn’t mean we won’t see it at test time
  ▪ Unlikely that every occurrence of “minute” is 100% spam
  ▪ Unlikely that every occurrence of “seriously” is 100% ham
  ▪ What about all the words that don’t occur in the training set at all?
  ▪ In general, we can’t go around giving unseen events zero probability
▪ As an extreme case, imagine using the entire email as the only feature
  ▪ Would get the training data perfect (if deterministic labeling)
  ▪ Wouldn’t generalize at all
  ▪ Just making the bag-of-words assumption gives us some generalization, but isn’t enough
▪ To generalize better: we need to smooth or regularize the estimates
Parameter Estimation
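▪ Estimating the distribution of a random variable
▪ Elicitation: ask a human (why is this hard?)
▪ Empirically: use training data (learning!)
  ▪ E.g.: for each outcome x, look at the empirical rate of that value: $P_{ML}(x) = \frac{\text{count}(x)}{\text{total samples}}$
  ▪ This is the estimate that maximizes the likelihood of the data
[Figure: a collection of outcomes labeled r and b, with the example sample r r b]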
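The empirical-rate estimate is just counting. A minimal sketch, assuming the estimate is count(x) divided by the number of samples and using the sample r r b from the slide; the function name is hypothetical.

```python
from collections import Counter


def maximum_likelihood_estimate(samples):
    """Relative frequency of each outcome observed in the samples."""
    counts = Counter(samples)
    total = len(samples)
    return {outcome: count / total for outcome, count in counts.items()}


estimate = maximum_likelihood_estimate(["r", "r", "b"])
print(estimate)
```

Note that any outcome absent from the samples gets no entry at all, which amounts to probability zero. That is the weakness smoothing is meant to address.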
Smoothing
Maximum Likelihood?
▪ Relative frequencies are the maximum likelihood estimates
▪ Another option is to consider the most likely parameter value given the data
Unseen Events
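Laplace Smoothing
▪ Laplace’s estimate:
  ▪ Pretend you saw every outcome once more than you actually did: $P_{LAP}(x) = \frac{c(x) + 1}{N + |X|}$, where c(x) is the count of x, N is the number of samples, and |X| is the number of possible outcomes
  ▪ Can derive this estimate with Dirichlet priors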
r r b
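Laplace Smoothing
▪ Laplace’s estimate (extended):
  ▪ Pretend you saw every outcome k extra times: $P_{LAP,k}(x) = \frac{c(x) + k}{N + k|X|}$
  ▪ What’s Laplace with k = 0?
  ▪ k is the strength of the prior
▪ Laplace for conditionals:
  ▪ Smooth each condition independently: $P_{LAP,k}(x \mid y) = \frac{c(x, y) + k}{c(y) + k|X|}$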
r r b
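Laplace smoothing with strength k can be sketched as below, assuming the standard form (c(x) + k) / (N + k|X|) and applying it to the sample r r b. The function names are hypothetical, and the set of possible outcomes must be supplied explicitly so that unseen outcomes still receive probability mass.

```python
from collections import Counter


def laplace_estimate(samples, outcomes, k=1):
    """Pretend every outcome in `outcomes` was seen k extra times."""
    counts = Counter(samples)
    denominator = len(samples) + k * len(outcomes)
    # With k = 0 and no samples the denominator is zero and this raises.
    return {x: (counts[x] + k) / denominator for x in outcomes}


def laplace_conditional(pairs, outcomes, k=1):
    """Smooth P(x | y) separately for each y, given (x, y) observations."""
    samples_by_label = {}
    for x, y in pairs:
        samples_by_label.setdefault(y, []).append(x)
    return {y: laplace_estimate(xs, outcomes, k) for y, xs in samples_by_label.items()}


sample = ["r", "r", "b"]
for k in (0, 1, 100):
    print(k, laplace_estimate(sample, ["r", "b"], k))
```

With k = 0 this reduces to the relative-frequency estimate, and as k grows the estimate is pulled toward uniform, which is one way to read "k is the strength of the prior".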
Estimation: Linear Interpolation*
▪ In practice, Laplace often performs poorly for P(X|Y):
  ▪ When |X| is very large
  ▪ When |Y| is very large
▪ Another option: linear interpolation
  ▪ Also get the empirical P(X) from the data
  ▪ Make sure the estimate of P(X|Y) isn’t too different from the empirical P(X): $P_{LIN}(x \mid y) = \alpha \hat{P}(x \mid y) + (1 - \alpha) \hat{P}(x)$
  ▪ What if α is 0? 1?
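Linear interpolation can be sketched as a weighted mix of the two empirical distributions, assuming the form α P(x|y) + (1 − α) P(x). The probabilities in the example are made up, and the function name `interpolate` is hypothetical.

```python
def interpolate(p_x_given_y, p_x, alpha):
    """Mix the empirical conditional P(x | y) with the empirical marginal P(x)."""
    return {
        # Outcomes never seen with this y contribute 0 from the conditional.
        x: alpha * p_x_given_y.get(x, 0.0) + (1 - alpha) * p_x[x]
        for x in p_x
    }


# Made-up estimates: "free" was never seen with this label.
p_x_given_y = {"hello": 1.0}
p_x = {"free": 0.2, "hello": 0.8}

print(interpolate(p_x_given_y, p_x, 0.5))
print(interpolate(p_x_given_y, p_x, 1.0))  # pure conditional estimate
print(interpolate(p_x_given_y, p_x, 0.0))  # pure marginal estimate
```

Setting α to 1 recovers the unsmoothed conditional, zeros included, and setting it to 0 ignores the label entirely.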
Spam Filtering: Smoothing
▪ For real classification problems, smoothing is critical
▪ New odds ratios:
helvetica : 11.4 seems : 10.8 group : 10.2 ago : 8.4 areas : 8.3 ...
verdana : 28.8 Credit : 28.4 ORDER : 27.2 <FONT> : 26.9 money : 26.5 ...
Do these make more sense?
Tuning
Tuning on Held-Out Data
▪ Now we’ve got two kinds of unknowns
  ▪ Parameters: the probabilities P(X|Y), P(Y)
  ▪ Hyperparameters: e.g. the amount / type of smoothing to do, k, α
▪ What should we learn where?
  ▪ Learn parameters from training data
  ▪ Tune hyperparameters on different data
    ▪ Why?
  ▪ For each value of the hyperparameters, train and test on the held-out data
  ▪ Choose the best value and do a final test on the test data
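The tuning loop can be sketched generically. Here `train_fn` and `accuracy_fn` are hypothetical callables standing in for whatever training and evaluation code is used, and the held-out accuracies in the usage example are made-up placeholders, not results.

```python
def tune_hyperparameter(candidates, train_fn, accuracy_fn, train_data, held_out_data):
    """Train once per candidate value and keep the one that scores best on held-out data."""
    best_value, best_accuracy = None, -1.0
    for value in candidates:
        model = train_fn(train_data, value)            # parameters come from training data
        held_out_accuracy = accuracy_fn(model, held_out_data)
        if held_out_accuracy > best_accuracy:          # ties keep the earlier candidate
            best_value, best_accuracy = value, held_out_accuracy
    return best_value, best_accuracy


# Stand-in functions so the sketch runs: the "model" is just the value of k,
# and the held-out accuracies are invented for illustration.
fake_held_out_accuracy = {0.1: 0.70, 1: 0.90, 10: 0.80}
best_k, best_acc = tune_hyperparameter(
    [0.1, 1, 10],
    train_fn=lambda data, k: k,
    accuracy_fn=lambda model, data: fake_held_out_accuracy[model],
    train_data=[],
    held_out_data=[],
)
print(best_k, best_acc)
```

The test data never appears in this loop. It is used once, afterwards, with the chosen value.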
Features
Errors, and What to Do
▪ Examples of errors
Dear GlobalSCAPE Customer,
GlobalSCAPE has partnered with ScanSoft to offer you the latest version of OmniPage Pro, for just $99.99* - the regular list price is $499! The most common question we've received about this offer is - Is this genuine? We would like to assure you that this offer is authorized by ScanSoft, is genuine and valid. You can get the . . .
. . . To receive your $30 Amazon.com promotional certificate, click through to
http://www.amazon.com/apparel
and see the prominent link for the $30 offer. All details are there. We hope you enjoyed receiving this message. However, if you'd rather not receive future e-mails announcing new store launches, please click . . .
What to Do About Errors?
▪ Need more features: words aren’t enough!
  ▪ Have you emailed the sender before?
  ▪ Have 1K other people just gotten the same email?
  ▪ Is the sending information consistent?
  ▪ Is the email in ALL CAPS?
  ▪ Do inline URLs point where they say they point?
  ▪ Does the email address you by (your) name?
▪ Can add these information sources as new variables in the model
▪ Next class we’ll talk about classifiers which let you add arbitrary features more easily
Baselines
▪ First step: get a baseline
  ▪ Baselines are very simple “straw man” procedures
  ▪ Help determine how hard the task is
  ▪ Help know what a “good” accuracy is
▪ Weak baseline: most frequent label classifier
  ▪ Gives all test instances whatever label was most common in the training set
  ▪ E.g. for spam filtering, might label everything as ham
  ▪ Accuracy might be very high if the problem is skewed
  ▪ E.g. calling everything “ham” gets 66%, so a classifier that gets 70% isn’t very good…
▪ For real research, usually use previous work as a (strong) baseline
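The most-frequent-label baseline takes only a few lines. A minimal sketch with a made-up label list whose two-thirds ham proportion mirrors the 66% figure above; the function name is hypothetical.

```python
from collections import Counter


def most_frequent_label_classifier(train_labels):
    """Return a classifier that ignores its input and predicts the commonest training label."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda example: most_common_label


# Made-up labels: two thirds ham, one third spam.
train_labels = ["ham", "ham", "spam"] * 10
classify = most_frequent_label_classifier(train_labels)

test_labels = ["ham", "spam", "ham"]
predictions = [classify(None) for _ in test_labels]
baseline_accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(predictions, baseline_accuracy)
```

Any real classifier should be compared against this number, since on skewed data the baseline alone can look respectable.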
Summary
▪ Bayes’ rule lets us do diagnostic queries with causal probabilities
▪ The naïve Bayes assumption takes all features to be independent given the class label
▪ We can build classifiers out of a naïve Bayes model using training data
▪ Smoothing estimates is important in real systems
Next Time: Perceptron!