Learning OpenCV 3 Computer Vision with...

356

Transcript of Learning OpenCV 3 Computer Vision with...

Page 1: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 2: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 3: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

LearningOpenCV3ComputerVisionwithPythonSecondEdition

Page 4: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

TableofContents

LearningOpenCV3ComputerVisionwithPythonSecondEdition

Credits

AbouttheAuthors

AbouttheReviewers

www.PacktPub.com

Supportfiles,eBooks,discountoffers,andmore

Whysubscribe?

FreeaccessforPacktaccountholders

Preface

Whatthisbookcovers

Whatyouneedforthisbook

Whothisbookisfor

Conventions

Readerfeedback

Customersupport

Downloadingtheexamplecode

Errata

Piracy

Questions

1.SettingUpOpenCV

Choosingandusingtherightsetuptools

InstallationonWindows

Usingbinaryinstallers(nosupportfordepthcameras)

UsingCMakeandcompilers

InstallingonOSX

UsingMacPortswithready-madepackages

UsingMacPortswithyourowncustompackages

UsingHomebrewwithready-madepackages(nosupportfordepthcameras)

UsingHomebrewwithyourowncustompackages

Page 5: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

InstallationonUbuntuanditsderivatives

UsingtheUbunturepository(nosupportfordepthcameras)

BuildingOpenCVfromasource

InstallationonotherUnix-likesystems

InstallingtheContribmodules

Runningsamples

Findingdocumentation,help,andupdates

Summary

2.HandlingFiles,Cameras,andGUIs

BasicI/Oscripts

Reading/writinganimagefile

Convertingbetweenanimageandrawbytes

Accessingimagedatawithnumpy.array

Reading/writingavideofile

Capturingcameraframes

Displayingimagesinawindow

Displayingcameraframesinawindow

ProjectCameo(facetrackingandimagemanipulation)

Cameo–anobject-orienteddesign

Abstractingavideostreamwithmanagers.CaptureManager

Abstractingawindowandkeyboardwithmanagers.WindowManager

Applyingeverythingwithcameo.Cameo

Summary

3.ProcessingImageswithOpenCV3

Convertingbetweendifferentcolorspaces

AquicknoteonBGR

TheFourierTransform

Highpassfilter

Lowpassfilter

Creatingmodules

Edgedetection

Page 6: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Customkernels–gettingconvoluted

Modifyingtheapplication

EdgedetectionwithCanny

Contourdetection

Contours–boundingbox,minimumarearectangle,andminimumenclosingcircle

Contours–convexcontoursandtheDouglas-Peuckeralgorithm

Lineandcircledetection

Linedetection

Circledetection

Detectingshapes

Summary

4.DepthEstimationandSegmentation

Creatingmodules

Capturingframesfromadepthcamera

Creatingamaskfromadisparitymap

Maskingacopyoperation

Depthestimationwithanormalcamera

ObjectsegmentationusingtheWatershedandGrabCutalgorithms

ExampleofforegrounddetectionwithGrabCut

ImagesegmentationwiththeWatershedalgorithm

Summary

5.DetectingandRecognizingFaces

ConceptualizingHaarcascades

GettingHaarcascadedata

UsingOpenCVtoperformfacedetection

Performingfacedetectiononastillimage

Performingfacedetectiononavideo

Performingfacerecognition

Generatingthedataforfacerecognition

Recognizingfaces

Preparingthetrainingdata

Page 7: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Loadingthedataandrecognizingfaces

PerforminganEigenfacesrecognition

PerformingfacerecognitionwithFisherfaces

PerformingfacerecognitionwithLBPH

Discardingresultswithconfidencescore

Summary

6.RetrievingImagesandSearchingUsingImageDescriptors

Featuredetectionalgorithms

Definingfeatures

Detectingfeatures–corners

FeatureextractionanddescriptionusingDoGandSIFT

Anatomyofakeypoint

FeatureextractionanddetectionusingFastHessianandSURF

ORBfeaturedetectionandfeaturematching

FAST

BRIEF

Brute-Forcematching

FeaturematchingwithORB

UsingK-NearestNeighborsmatching

FLANN-basedmatching

FLANNmatchingwithhomography

Asampleapplication–tattooforensics

Savingimagedescriptorstofile

Scanningformatches

Summary

7.DetectingandRecognizingObjects

Objectdetectionandrecognitiontechniques

HOGdescriptors

Thescaleissue

Thelocationissue

Imagepyramid

Page 8: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Slidingwindows

Non-maximum(ornon-maxima)suppression

Supportvectormachines

Peopledetection

Creatingandtraininganobjectdetector

Bag-of-words

BOWincomputervision

Thek-meansclustering

Detectingcars

Whatdidwejustdo?

SVMandslidingwindows

Example–cardetectioninascene

Examiningdetector.py

Associatingtrainingdatawithclasses

Dude,where’smycar?

Summary

8.TrackingObjects

Detectingmovingobjects

Basicmotiondetection

Backgroundsubtractors–KNN,MOG2,andGMG

MeanshiftandCAMShift

Colorhistograms

ThecalcHistfunction

ThecalcBackProjectfunction

Insummary

Backtothecode

CAMShift

TheKalmanfilter

Predictandupdate

Anexample

Areal-lifeexample–trackingpedestrians

Page 9: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Theapplicationworkflow

Abriefdigression–functionalversusobject-orientedprogramming

ThePedestrianclass

Themainprogram

Wheredowegofromhere?

Summary

9.NeuralNetworkswithOpenCV–anIntroduction

Artificialneuralnetworks

Neuronsandperceptrons

ThestructureofanANN

Networklayersbyexample

Theinputlayer

Theoutputlayer

Thehiddenlayer

Thelearningalgorithms

ANNsinOpenCV

ANN-imalclassification

Trainingepochs

HandwrittendigitrecognitionwithANNs

MNIST–thehandwrittendigitdatabase

Customizedtrainingdata

Theinitialparameters

Theinputlayer

Thehiddenlayer

Theoutputlayer

Trainingepochs

Otherparameters

Mini-libraries

Themainfile

Possibleimprovementsandpotentialapplications

Improvements

Page 10: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Potentialapplications

Summary

Toboldlygo…

Index

Page 11: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 12: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

LearningOpenCV3ComputerVisionwithPythonSecondEdition

Page 13: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 14: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

LearningOpenCV3ComputerVisionwithPythonSecondEditionCopyright©2015PacktPublishing

Allrightsreserved.Nopartofthisbookmaybereproduced,storedinaretrievalsystem,ortransmittedinanyformorbyanymeans,withoutthepriorwrittenpermissionofthepublisher,exceptinthecaseofbriefquotationsembeddedincriticalarticlesorreviews.

Everyefforthasbeenmadeinthepreparationofthisbooktoensuretheaccuracyoftheinformationpresented.However,theinformationcontainedinthisbookissoldwithoutwarranty,eitherexpressorimplied.Neithertheauthors,norPacktPublishing,anditsdealersanddistributorswillbeheldliableforanydamagescausedorallegedtobecauseddirectlyorindirectlybythisbook.

PacktPublishinghasendeavoredtoprovidetrademarkinformationaboutallofthecompaniesandproductsmentionedinthisbookbytheappropriateuseofcapitals.However,PacktPublishingcannotguaranteetheaccuracyofthisinformation.

Firstpublished:September2015

Productionreference:1240915

PublishedbyPacktPublishingLtd.

LiveryPlace

35LiveryStreet

BirminghamB32PB,UK.

ISBN978-1-78528-384-0

www.packtpub.com

Page 15: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 16: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CreditsAuthors

JoeMinichino

JosephHowse

Reviewers

NandanBanerjee

TianCao

BrandonCastellano

HaojianJin

AdrianRosebrock

CommissioningEditor

AkramHussain

AcquisitionEditors

VivekAnantharaman

PrachiBisht

ContentDevelopmentEditor

RitikaSingh

TechnicalEditors

NovinaKewalramani

ShivaniKiranMistry

CopyEditor

SoniaCheema

ProjectCoordinator

MiltonDsouza

Proofreader

SafisEditing

Indexer

MonicaAjmeraMehta

Graphics

DishaHaria

ProductionCoordinator

Page 17: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ArvindkumarGupta

CoverWork

ArvindkumarGupta

Page 18: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 19: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

AbouttheAuthorsJoeMinichinoisacomputervisionengineerforHooluxMedicalbydayandadeveloperoftheNoSQLdatabaseLokiJSbynight.Onweekends,heisaheavymetalsinger/songwriter.Heisapassionateprogrammerwhoisimmenselycuriousaboutprogramminglanguagesandtechnologiesandconstantlyexperimentswiththem.AtHoolux,JoeleadsthedevelopmentofanAndroidcomputervision-basedadvertisingplatformforthemedicalindustry.

BornandraisedinVarese,Lombardy,Italy,andcomingfromahumanisticbackgroundinphilosophy(atMilan’sUniversitàStatale),Joehasspenthislast11yearslivinginCork,Ireland,whichiswherehebecameacomputersciencegraduateattheCorkInstituteofTechnology.

Iamimmenselygratefultomypartner,Rowena,foralwaysencouragingme,andalsomytwolittledaughtersforinspiringme.Abigthankyoutothecollaboratorsandeditorsofthisbook,especiallyJoeHowse,AdrianRoesbrock,BrandonCastellano,theOpenCVcommunity,andthepeopleatPacktPublishing.

JosephHowselivesinCanada.Duringthewinters,hegrowshisbeard,whilehisfourcatsgrowtheirthickcoatsoffur.Helovescombinghiscatseverydayandsometimes,hiscatsalsopullhisbeard.

HehasbeenwritingforPacktPublishingsince2012.HisbooksincludeOpenCVforSecretAgents,OpenCVBlueprints,AndroidApplicationProgrammingwithOpenCV3,OpenCVComputerVisionwithPython,andPythonGameProgrammingbyExample.

Whenheisnotwritingbooksorgroominghiscats,heprovidesconsulting,training,andsoftwaredevelopmentservicesthroughhiscompany,NummistMedia(http://nummist.com).

Page 20: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 21: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

AbouttheReviewersNandanBanerjeehasabachelor’sdegreeincomputerscienceandamaster’sinroboticsengineering.HestartedworkingwithSamsungElectronicsrightaftergraduation.HeworkedforayearatitsR&DcentreinBangalore.HealsoworkedintheWPI-CMUteamontheBostonDynamics’robot,Atlas,fortheDARPARoboticsChallenge.HeiscurrentlyworkingasaroboticssoftwareengineerinthetechnologyorganizationatiRobotCorporation.Heisanembeddedsystemsandroboticsenthusiastwithaninclinationtowardcomputervisionandmotionplanning.Hehasexperienceinvariouslanguages,includingC,C++,Python,Java,andDelphi.HealsohasasubstantialexperienceinworkingwithROS,OpenRAVE,OpenCV,PCL,OpenGL,CUDAandtheAndroidSDK.

Iwouldliketothanktheauthorandpublisherforcomingoutwiththiswonderfulbook.

TianCaoispursuinghisPhDincomputerscienceattheUniversityofNorthCarolinainChapelHill,USA,andworkingonprojectsrelatedtoimageanalysis,computervision,andmachinelearning.

Idedicatethisworktomyparentsandgirlfriend.

BrandonCastellanoisastudentfromCanadapursuinganMEScinelectricalengineeringattheUniversityofWesternOntario,CityofLondon,Canada.HereceivedhisBEScinthesamesubjectin2012.ThefocusofhisresearchisinparallelprocessingandGPGPU/FPGAoptimizationforreal-timeimplementationsofimageprocessingalgorithms.BrandonalsoworksforEagleVisionSystemsInc.,focusingontheuseofreal-timeimageprocessingforroboticsapplications.

WhilehehasbeenusingOpenCVandC++formorethan5years,hehasalsobeenadvocatingtheuseofPythonfrequentlyinhisresearch,mostnotably,foritsrapidspeedofdevelopment,allowinglow-levelinterfacingwithcomplexsystems.ThisisevidentinhisopensourceprojectshostedonGitHub,forexample,PySceneDetect,whichismostlywritteninPython.Inadditiontoimage/videoprocessing,hehasalsoworkedonimplementationsofthree-dimensionaldisplaysaswellasthesoftwaretoolstosupportthedevelopmentofsuchdisplays.

Inadditiontopostingtechnicalarticlesandtutorialsonhiswebsite(http://www.bcastell.com),heparticipatesinavarietyofbothopenandclosedsourceprojectsandcontributestoGitHubundertheusernameBreakthrough(http://www.github.com/Breakthrough).HeisanactivememberoftheSuperUserandStackOverflowcommunities(underthenameBreakthrough),andcanbecontacteddirectlyviahiswebsite.

Iwouldliketothankallmyfriendsandfamilyfortheirpatienceduringthepastfewyears(especiallymyparents,PeterandLori,andmybrother,Mitchell).Icouldnothaveaccomplishedeverythingwithouttheircontinuedloveandsupport.Ican’teverthankeveryoneenough.

Iwouldalsoliketoextendaspecialthankstoallofthedevelopersthatcontributetoopen

Page 22: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

sourcesoftwarelibraries,specificallyOpenCV,whichhelpbringthedevelopmentofcutting-edgesoftwaretechnologyclosertoallthesoftwaredevelopersaroundtheworld,freeofcost.Iwouldalsoliketothankthosepeoplewhohelpwritedocumentation,submitbugreports,andwritetutorials/books(especiallytheauthorofthisbook!).Theircontributionsarevitaltothesuccessofanyopensourceproject,especiallyonethatisasextensiveandcomplexasOpenCV.

HaojianJinisasoftwareengineer/researcheratYahoo!Labs,Sunnyvale,CA.Helooksprimarilyatbuildingnewsystemsofwhat’spossibleoncommoditymobiledevices(orwithminimumhardwarechanges).Tocreatethingsthatdon’texisttoday,hespendslargechunksofhistimeplayingwithsignalprocessing,computervision,machinelearning,andnaturallanguageprocessingandusingthemininterestingways.Youcanfindmoreabouthimathttp://shift-3.com/

AdrianRosebrockisanauthorandbloggerathttp://www.pyimagesearch.com/.HeholdsaPhDincomputersciencefromtheUniversityofMaryland,BaltimoreCounty,USA,withafocusoncomputervisionandmachinelearning.

HehasconsultedfortheNationalCancerInstitutetodevelopmethodsthatautomaticallypredictbreastcancerriskfactorsusingbreasthistologyimages.Hehasalsoauthoredabook,PracticalPythonandOpenCV(http://pyimg.co/x7ed5),ontheutilizationofPythonandOpenCVtobuildreal-worldcomputervisionapplications.

Page 23: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 24: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

www.PacktPub.com

Page 25: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Supportfiles,eBooks,discountoffers,andmoreForsupportfilesanddownloadsrelatedtoyourbook,pleasevisitwww.PacktPub.com.

DidyouknowthatPacktofferseBookversionsofeverybookpublished,withPDFandePubfilesavailable?YoucanupgradetotheeBookversionatwww.PacktPub.comandasaprintbookcustomer,youareentitledtoadiscountontheeBookcopy.Getintouchwithusat<[email protected]>formoredetails.

Atwww.PacktPub.com,youcanalsoreadacollectionoffreetechnicalarticles,signupforarangeoffreenewslettersandreceiveexclusivediscountsandoffersonPacktbooksandeBooks.

https://www2.packtpub.com/books/subscription/packtlib

DoyouneedinstantsolutionstoyourITquestions?PacktLibisPackt’sonlinedigitalbooklibrary.Here,youcansearch,access,andreadPackt’sentirelibraryofbooks.

Page 26: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Whysubscribe?FullysearchableacrosseverybookpublishedbyPacktCopyandpaste,print,andbookmarkcontentOndemandandaccessibleviaawebbrowser

Page 27: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

FreeaccessforPacktaccountholdersIfyouhaveanaccountwithPacktatwww.PacktPub.com,youcanusethistoaccessPacktLibtodayandview9entirelyfreebooks.Simplyuseyourlogincredentialsforimmediateaccess.

Page 28: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 29: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PrefaceOpenCV3isastate-of-the-artcomputervisionlibrarythatisusedforavarietyofimageandvideoprocessingoperations.Someofthemorespectacularandfuturisticfeatures,suchasfacerecognitionorobjecttracking,areeasilyachievablewithOpenCV3.Learningthebasicconceptsbehindcomputervisionalgorithms,models,andOpenCV’sAPIwillenablethedevelopmentofallsortsofreal-worldapplications,includingsecurityandsurveillancetools.

Startingwithbasicimageprocessingoperations,thisbookwilltakeyouthroughajourneythatexploresadvancedcomputervisionconcepts.Computervisionisarapidlyevolvingsciencewhoseapplicationsintherealworldareexploding,sothisbookwillappealtocomputervisionnovicesaswellasexpertsofthesubjectwhowanttolearnaboutthebrandnewOpenCV3.0.0.

Page 30: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

WhatthisbookcoversChapter1,SettingUpOpenCV,explainshowtosetupOpenCV3withPythonondifferentplatforms.Itwillalsotroubleshootcommonproblems.

Chapter2,HandlingFiles,Cameras,andGUIs,introducesOpenCV’sI/Ofunctionalities.Itwillalsodiscusstheconceptofaprojectandthebeginningsofanobject-orienteddesignforthisproject.

Chapter3,ProcessingImageswithOpenCV3,presentssometechniquesrequiredtoalterimages,suchasdetectingskintoneinanimage,sharpeninganimage,markingcontoursofsubjects,anddetectingcrosswalksusingalinesegmentdetector.

Chapter4,DepthEstimationandSegmentation,showsyouhowtousedatafromadepthcameratoidentifyforegroundandbackgroundregions,suchthatwecanlimitaneffecttoonlytheforegroundorbackground.

Chapter5,DetectingandRecognizingFaces,introducessomeofOpenCV’sfacedetectionfunctionalities,alongwiththedatafilesthatdefineparticulartypesoftrackableobjects.

Chapter6,RetrievingImagesandSearchingUsingImageDescriptors,showshowtodetectthefeaturesofanimagewiththehelpofOpenCVandmakeuseofthemtomatchandsearchforimages.

Chapter7,DetectingandRecognizingObjects,introducestheconceptofdetectingandrecognizingobjects,whichisoneofthemostcommonchallengesincomputervision.

Chapter8,TrackingObjects,exploresthevasttopicofobjecttracking,whichistheprocessoflocatingamovingobjectinamovieorvideofeedwiththehelpofacamera.

Chapter9,NeuralNetworkswithOpenCV–anIntroduction,introducesyoutoArtificialNeuralNetworksinOpenCVandillustratestheirusageinareal-lifeapplication.

Page 31: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 32: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

WhatyouneedforthisbookYousimplyneedarelativelyrecentcomputer,asthefirstchapterwillguideyouthroughtheinstallationofallthenecessarysoftware.Awebcamishighlyrecommended,butnotnecessary.

Page 33: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 34: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

WhothisbookisforThisbookisaimedatprogrammerswithworkingknowledgeofPythonaswellaspeoplewhowanttoexplorethetopicofcomputervisionusingtheOpenCVlibrary.NopreviousexperienceofcomputervisionorOpenCVisrequired.Programmingexperienceisrecommended.

Page 35: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 36: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ConventionsInthisbook,youwillfindanumberoftextstylesthatdistinguishbetweendifferentkindsofinformation.Herearesomeexamplesofthesestylesandanexplanationoftheirmeaning.

Codewordsintext,databasetablenames,foldernames,filenames,fileextensions,pathnames,dummyURLs,userinput,andTwitterhandlesareshownasfollows:“Wecanincludeothercontextsthroughtheuseoftheincludedirective.”

Ablockofcodeissetasfollows:

importcv2

importnumpyasnp

img=cv2.imread('images/chess_board.png')

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

gray=np.float32(gray)

dst=cv2.cornerHarris(gray,2,23,0.04)

Whenwewishtodrawyourattentiontoaparticularpartofacodeblock,therelevantlinesoritemsaresetinbold:

img=cv2.imread('images/chess_board.png')

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

gray=np.float32(gray)

dst=cv2.cornerHarris(gray,2,23,0.04)

Anycommand-lineinputoroutputiswrittenasfollows:

mkdirbuild&&cdbuild

cmakeDCMAKE_BUILD_TYPE=Release-DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modulesDCMAKE_INSTALL_PREFIX=/usr/local..make

Newtermsandimportantwordsareshowninbold.Wordsthatyouseeonthescreen,forexample,inmenusordialogboxes,appearinthetextlikethis:”OnWindowsVista/Windows7/Windows8,clickontheStartmenu.”

NoteWarningsorimportantnotesappearinaboxlikethis.

TipTipsandtricksappearlikethis.

Page 37: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 38: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ReaderfeedbackFeedbackfromourreadersisalwayswelcome.Letusknowwhatyouthinkaboutthisbook—whatyoulikedordisliked.Readerfeedbackisimportantforusasithelpsusdeveloptitlesthatyouwillreallygetthemostoutof.

Tosendusgeneralfeedback,simplye-mail<[email protected]>,andmentionthebook’stitleinthesubjectofyourmessage.

Ifthereisatopicthatyouhaveexpertiseinandyouareinterestedineitherwritingorcontributingtoabook,seeourauthorguideatwww.packtpub.com/authors.

Page 39: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 40: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CustomersupportNowthatyouaretheproudownerofaPacktbook,wehaveanumberofthingstohelpyoutogetthemostfromyourpurchase.

Page 41: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DownloadingtheexamplecodeYoucandownloadtheexamplecodefilesfromyouraccountathttp://www.packtpub.comforallthePacktPublishingbooksyouhavepurchased.Ifyoupurchasedthisbookelsewhere,youcanvisithttp://www.packtpub.com/supportandregistertohavethefilese-maileddirectlytoyou.

Page 42: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ErrataAlthoughwehavetakeneverycaretoensuretheaccuracyofourcontent,mistakesdohappen.Ifyoufindamistakeinoneofourbooks—maybeamistakeinthetextorthecode—wewouldbegratefulifyoucouldreportthistous.Bydoingso,youcansaveotherreadersfromfrustrationandhelpusimprovesubsequentversionsofthisbook.Ifyoufindanyerrata,pleasereportthembyvisitinghttp://www.packtpub.com/submit-errata,selectingyourbook,clickingontheErrataSubmissionFormlink,andenteringthedetailsofyourerrata.Onceyourerrataareverified,yoursubmissionwillbeacceptedandtheerratawillbeuploadedtoourwebsiteoraddedtoanylistofexistingerrataundertheErratasectionofthattitle.

Toviewthepreviouslysubmittederrata,gotohttps://www.packtpub.com/books/content/supportandenterthenameofthebookinthesearchfield.TherequiredinformationwillappearundertheErratasection.

Page 43: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PiracyPiracyofcopyrightedmaterialontheInternetisanongoingproblemacrossallmedia.AtPackt,wetaketheprotectionofourcopyrightandlicensesveryseriously.IfyoucomeacrossanyillegalcopiesofourworksinanyformontheInternet,pleaseprovideuswiththelocationaddressorwebsitenameimmediatelysothatwecanpursuearemedy.

Pleasecontactusat<[email protected]>withalinktothesuspectedpiratedmaterial.

Weappreciateyourhelpinprotectingourauthorsandourabilitytobringyouvaluablecontent.

Page 44: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

QuestionsIfyouhaveaproblemwithanyaspectofthisbook,youcancontactusat<[email protected]>,andwewilldoourbesttoaddresstheproblem.

Page 45: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 46: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter1.SettingUpOpenCVYoupickedupthisbooksoyoumayalreadyhaveanideaofwhatOpenCVis.Maybe,youheardofSci-Fi-soundingfeatures,suchasfacedetection,andgotintrigued.Ifthisisthecase,you’vemadetheperfectchoice.OpenCVstandsforOpenSourceComputerVision.Itisafreecomputervisionlibrarythatallowsyoutomanipulateimagesandvideostoaccomplishavarietyoftasksfromdisplayingthefeedofawebcamtopotentiallyteachingarobottorecognizereal-lifeobjects.

Inthisbook,youwilllearntoleveragetheimmensepotentialofOpenCVwiththePythonprogramminglanguage.Pythonisanelegantlanguagewitharelativelyshallowlearningcurveandverypowerfulfeatures.ThischapterisaquickguidetosettingupPython2.7,OpenCV,andotherrelatedlibraries.Aftersetup,wealsolookatOpenCV’sPythonsamplescriptsanddocumentation.

NoteIfyouwishtoskiptheinstallationprocessandjumprightintoaction,youcandownloadthevirtualmachine(VM)I’vemadeavailableathttp://techfort.github.io/pycv/.

ThisfileiscompatiblewithVirtualBox,afree-to-usevirtualizationapplicationthatletsyoubuildandrunVMs.TheVMI’vebuiltisbasedonUbuntuLinux14.04andhasallthenecessarysoftwareinstalledsothatyoucanstartcodingrightaway.

ThisVMrequiresatleast2GBofRAMtorunsmoothly,somakesurethatyouallocateatleast2(but,ideally,morethan4)GBofRAMtotheVM,whichmeansthatyourhostmachinewillneedatleast6GBofRAMtosustainit.

Thefollowingrelatedlibrariesarecoveredinthischapter:

NumPy:ThislibraryisadependencyofOpenCV’sPythonbindings.Itprovidesnumericcomputingfunctionality,includingefficientarrays.SciPy:ThislibraryisascientificcomputinglibrarythatiscloselyrelatedtoNumPy.ItisnotrequiredbyOpenCV,butitisusefulformanipulatingdatainOpenCVimages.OpenNI:ThislibraryisanoptionaldependencyofOpenCV.Itaddsthesupportforcertaindepthcameras,suchasAsusXtionPRO.SensorKinect:ThislibraryisanOpenNIpluginandoptionaldependencyofOpenCV.ItaddssupportfortheMicrosoftKinectdepthcamera.

Forthisbook’spurposes,OpenNIandSensorKinectcanbeconsideredoptional.TheyareusedthroughoutChapter4,DepthEstimationandSegmentation,butarenotusedintheotherchaptersorappendices.

NoteThisbookfocusesonOpenCV3,thenewmajorreleaseoftheOpenCVlibrary.AlladditionalinformationaboutOpenCVisavailableathttp://opencv.org,anditsdocumentationisavailableathttp://docs.opencv.org/master.

Page 47: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ChoosingandusingtherightsetuptoolsWearefreetochoosevarioussetuptools,dependingonouroperatingsystemandhowmuchconfigurationwewanttodo.Let’stakeanoverviewofthetoolsforWindows,Mac,Ubuntu,andotherUnix-likesystems.

Page 48: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

InstallationonWindowsWindowsdoesnotcomewithPythonpreinstalled.However,installationwizardsareavailableforprecompiledPython,NumPy,SciPy,andOpenCV.Alternatively,wecanbuildfromasource.OpenCV’sbuildsystemusesCMakeforconfigurationandeitherVisualStudioorMinGWforcompilation.

Ifwewantsupportfordepthcameras,includingKinect,weshouldfirstinstallOpenNIandSensorKinect,whichareavailableasprecompiledbinarieswithinstallationwizards.Then,wemustbuildOpenCVfromasource.

NoteTheprecompiledversionofOpenCVdoesnotoffersupportfordepthcameras.

OnWindows,OpenCV2offersbettersupportfor32-bitPythonthan64-bitPython;however,withthemajorityofcomputerssoldtodaybeing64-bitsystems,ourinstructionswillreferto64-bit.Allinstallershave32-bitversionsavailablefromthesamesiteasthe64-bit.

Someofthefollowingstepsrefertoeditingthesystem’sPATHvariable.ThistaskcanbedoneintheEnvironmentVariableswindowofControlPanel.

1. OnWindowsVista/Windows7/Windows8,clickontheStartmenuandlaunchControlPanel.Now,navigatetoSystemandSecurity|System|Advancedsystemsettings.ClickontheEnvironmentVariables…button.

2. OnWindowsXP,clickontheStartmenuandnavigatetoControlPanel|System.SelecttheAdvancedtab.ClickontheEnvironmentVariables…button.

3. Now,underSystemvariables,selectPathandclickontheEdit…button.4. Makechangesasdirected.5. Toapplythechanges,clickonalltheOKbuttons(untilwearebackinthemain

windowofControlPanel).6. Then,logoutandlogbackin(alternatively,reboot).

Usingbinaryinstallers(nosupportfordepthcameras)YoucanchoosetoinstallPythonanditsrelatedlibrariesseparatelyifyouprefer;however,therearePythondistributionsthatcomewithinstallersthatwillsetuptheentireSciPystack(whichincludesPythonandNumPy),whichmakeitverytrivialtosetupthedevelopmentenvironment.

OnesuchdistributionisAnacondaPython(downloadableathttp://09c8d0b2229f813c1b93-c95ac804525aac4b6dba79b00b39d1d3.r79.cf1.rackcdn.com/Anaconda-2.1.0Windows-x86_64.exe).Oncetheinstallerisdownloaded,runitandremembertoaddthepathtotheAnacondainstallationtoyourPATHvariablefollowingtheprecedingprocedure.

HerearethestepstosetupPython7,NumPy,SciPy,andOpenCV:

Page 49: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

1. Downloadandinstallthe32-bitPython2.7.9fromhttps://www.python.org/ftp/python/2.7.9/python-2.7.9.amd64.msi.

2. DownloadandinstallNumPy1.6.2fromhttp://www.lfd.uci.edu/~gohlke/pythonlibs/#numpyhttp://sourceforge.net/projects/numpy/files/NumPy/1.6.2/numpy-1.6.2-win32-superpack-python2.7.exe/download(notethatinstallingNumPyonWindows64-bitisabittrickyduetothelackofa64-bitFortrancompileronWindows,whichNumPydependson.Thebinaryattheprecedinglinkisunofficial).

3. DownloadandinstallSciPy11.0fromhttp://www.lfd.uci.edu/~gohlke/pythonlibs/#scipyhttp://sourceforge.net/projects/scipy/files/scipy/0.11.0/scipy-0.11.0win32-superpack-python2.7.exe/download(thisisthesameasNumPyandthesearecommunityinstallers).

4. Downloadtheself-extractingZIPofOpenCV3.0.0fromhttps://github.com/Itseez/opencv.RunthisZIP,andwhenprompted,enteradestinationfolder,whichwewillrefertoas<unzip_destination>.Asubfolder,<unzip_destination>\opencv,iscreated.

5. Copy<unzip_destination>\opencv\build\python\2.7\cv2.pydtoC:\Python2.7\Lib\site-packages(assumingthatwehadinstalledPython2.7tothedefaultlocation).IfyouinstalledPython2.7withAnaconda,usetheAnacondainstallationfolderinsteadofthedefaultPythoninstallation.Now,thenewPythoninstallationcanfindOpenCV.

6. AfinalstepisnecessaryifwewantPythonscriptstorunusingthenewPythoninstallationbydefault.Editthesystem’sPATHvariableandappend;C:\Python2.7(assumingthatwehadinstalledPython2.7tothedefaultlocation)oryourAnacondainstallationfolder.RemoveanypreviousPythonpaths,suchas;C:\Python2.6.Logoutandlogbackin(alternatively,reboot).

UsingCMakeandcompilersWindowsdoesnotcomewithanycompilersorCMake.Weneedtoinstallthem.Ifwewantsupportfordepthcameras,includingKinect,wealsoneedtoinstallOpenNIandSensorKinect.

Let’sassumethatwehavealreadyinstalled32-bitPython2.7,NumPy,andSciPyeitherfrombinaries(asdescribedpreviously)orfromasource.Now,wecanproceedwithinstallingcompilersandCMake,optionallyinstallingOpenNIandSensorKinect,andthenbuildingOpenCVfromthesource:

1. DownloadandinstallCMake3.1.2fromhttp://www.cmake.org/files/v3.1/cmake-3.1.2-win32-x86.exe.Whenrunningtheinstaller,selecteitherAddCMaketothesystemPATHforallusersorAddCMaketothesystemPATHforcurrentuser.Don’tworryaboutthefactthata64-bitversionofCMakeisnotavailableCMakeisonlyaconfigurationtoolanddoesnotperformanycompilationsitself.Instead,onWindows,itcreatesprojectfilesthatcanbeopenedwithVisualStudio.

2. DownloadandinstallMicrosoftVisualStudio2013(theDesktopeditionifyouareworkingonWindows7)fromhttps://www.visualstudio.com/products/free-developer-offers-vs.aspx?slcid=0x409&type=weborMinGW.

Page 50: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

NotethatyouwillneedtosigninwithyourMicrosoftaccountandifyoudon’thaveone,youcancreateoneonthespot.Installthesoftwareandrebootafterinstallationiscomplete.

ForMinGW,gettheinstallerfromhttp://sourceforge.net/projects/mingw/files/Installer/mingw-get-setup.exe/downloadandhttp://sourceforge.net/projects/mingw/files/OldFiles/mingw-get-inst/mingw-get-inst-20120426/mingw-get-inst-20120426.exe/download.Whenrunningtheinstaller,makesurethatthedestinationpathdoesnotcontainspacesandthattheoptionalC++compilerisincluded.Editthesystem’sPATHvariableandappend;C:\MinGW\bin(assumingthatMinGWisinstalledtothedefaultlocation).Rebootthesystem.

3. Optionally,downloadandinstallOpenNI1.5.4.0fromthelinksprovidedintheGitHubhomepageofOpenNIathttps://github.com/OpenNI/OpenNI.

4. YoucandownloadandinstallSensorKinect0.93fromhttps://github.com/avin2/SensorKinect/blob/unstable/Bin/SensorKinect093-Bin-Win32-v5.1.2.1.msi?raw=true(32-bit).Alternatively,for64-bitPython,downloadthesetupfromhttps://github.com/avin2/SensorKinect/blob/unstable/Bin/SensorKinect093-Bin-Win64-v5.1.2.1.msi?raw=true(64-bit).Notethatthisrepositoryhasbeeninactiveformorethanthreeyears.

5. Downloadtheself-extractingZIPofOpenCV3.0.0fromhttps://github.com/Itseez/opencv.Runtheself-extractingZIP,andwhenprompted,enteranydestinationfolder,whichwewillrefertoas<unzip_destination>.Asubfolder,<unzip_destination>\opencv,isthencreated.

6. OpenCommandPromptandmakeanotherfolderwhereourbuildwillgousingthiscommand:

>mkdir<build_folder>

Changethedirectoryofthebuildfolder:

>cd<build_folder>

7. Now,wearereadytoconfigureourbuild.Tounderstandalltheoptions,wecanreadthecodein<unzip_destination>\opencv\CMakeLists.txt.However,forthisbook’spurposes,weonlyneedtousetheoptionsthatwillgiveusareleasebuildwithPythonbindings,andoptionally,depthcamerasupportviaOpenNIandSensorKinect.

8. OpenCMake(cmake-gui)andspecifythelocationofthesourcecodeofOpenCVandthefolderwhereyouwouldliketobuildthelibrary.ClickonConfigure.Selecttheprojecttobegenerated.Inthiscase,selectVisualStudio12(whichcorrespondstoVisualStudio2013).AfterCMakehasfinishedconfiguringtheproject,itwilloutputalistofbuildoptions.Ifyouseearedbackground,itmeansthatyourprojectmayneedtobereconfigured:CMakemightreportthatithasfailedtofindsomedependencies.ManyofOpenCV’sdependenciesareoptional,sodonotbetooconcernedyet.

Note

Page 51: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Ifthebuildfailstocompleteoryourunintoproblemslater,tryinstallingmissingdependencies(oftenavailableasprebuiltbinaries),andthenrebuildOpenCVfromthisstep.

Youhavetheoptionofselecting/deselectingbuildoptions(accordingtothelibrariesyouhaveinstalledonyourmachine)andclickonConfigureagain,untilyougetaclearbackground(white).

9. Attheendofthisprocess,youcanclickonGenerate,whichwillcreateanOpenCV.slnfileinthefolderyou’vechosenforthebuild.Youcanthennavigateto<build_folder>/OpenCV.slnandopenthefilewithVisualStudio2013,andproceedwithbuildingtheproject,ALL_BUILD.YouwillneedtobuildboththeDebugandReleaseversionsofOpenCV,sogoaheadandbuildthelibraryintheDebugmode,thenselectReleaseandrebuildit(F7isthekeytolaunchthebuild).

10. Atthisstage,youwillhaveabinfolderintheOpenCVbuilddirectory,whichwillcontainallthegenerated.dllfilesthatwillallowyoutoincludeOpenCVinyourprojects.

Alternatively,forMinGW,runthefollowingcommand:

>cmake-D:CMAKE_BUILD_TYPE=RELEASE-D:WITH_OPENNI=ON-G

"MinGWMakefiles"<unzip_destination>\opencv

IfOpenNIisnotinstalled,omit-D:WITH_OPENNI=ON.(Inthiscase,depthcameraswillnotbesupported.)IfOpenNIandSensorKinectareinstalledtonondefaultlocations,modifythecommandtoinclude-D:OPENNI_LIB_DIR=<openni_install_destination>\Lib-D:OPENNI_INCLUDE_DIR=

<openni_install_destination>\Include-

D:OPENNI_PRIME_SENSOR_MODULE_BIN_DIR=

<sensorkinect_install_destination>\Sensor\Bin.

Alternatively,forMinGW,runthiscommand:

>mingw32-make

11. Copy<build_folder>\lib\Release\cv2.pyd(fromaVisualStudiobuild)or<build_folder>\lib\cv2.pyd(fromaMinGWbuild)to<python_installation_folder>\site-packages.

12. Finally,editthesystem’sPATHvariableandappend;<build_folder>/bin/Release(foraVisualStudiobuild)or;<build_folder>/bin(foraMinGWbuild).Rebootyoursystem.

Page 52: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

InstallingonOSXSomeversionsofMacusedtocomewithaversionofPython2.7preinstalledthatwerecustomizedbyAppleforasystem’sinternalneeds.However,thishaschangedandthestandardversionofOSXshipswithastandardinstallationofPython.Onpython.org,youcanalsofindauniversalbinarythatiscompatiblewithboththenewIntelsystemsandthelegacyPowerPC.

NoteYoucanobtainthisinstallerathttps://www.python.org/downloads/release/python-279/(refertotheMacOSX32-bitPPCortheMacOSX64-bitIntellinks).InstallingPythonfromthedownloaded.dmgfilewillsimplyoverwriteyourcurrentsysteminstallationofPython.

ForMac,thereareseveralpossibleapproachesforobtainingstandardPython2.7,NumPy,SciPy,andOpenCV.AllapproachesultimatelyrequireOpenCVtobecompiledfromasourceusingXcodeDeveloperTools.However,dependingontheapproach,thistaskisautomatedforusinvariouswaysbythird-partytools.WewilllookatthesekindsofapproachesusingMacPortsorHomebrew.ThesetoolscanpotentiallydoeverythingthatCMakecan,plustheyhelpusresolvedependenciesandseparateourdevelopmentlibrariesfromsystemlibraries.

TipIrecommendMacPorts,especiallyifyouwanttocompileOpenCVwithdepthcamerasupportviaOpenNIandSensorKinect.Relevantpatchesandbuildscripts,includingsomethatImaintain,areready-madeforMacPorts.Bycontrast,Homebrewdoesnotcurrentlyprovideaready-madesolutiontocompileOpenCVwithdepthcamerasupport.

Beforeproceeding,let’smakesurethattheXcodeDeveloperToolsareproperlysetup:

1. DownloadandinstallXcodefromtheMacAppStoreorhttps://developer.apple.com/xcode/downloads/.Duringinstallation,ifthereisanoptiontoinstallCommandLineTools,selectit.

2. OpenXcodeandacceptthelicenseagreement.3. Afinalstepisnecessaryiftheinstallerdoesnotgiveustheoptiontoinstall

CommandLineTools.NavigatetoXcode|Preferences|Downloads,andclickontheInstallbuttonnexttoCommandLineTools.WaitfortheinstallationtofinishandquitXcode.

Alternatively,youcaninstallXcodecommand-linetoolsbyrunningthefollowingcommand(intheterminal):

$xcode-select–install

Now,wehavetherequiredcompilersforanyapproach.

UsingMacPortswithready-madepackages

Page 53: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

WecanusetheMacPortspackagemanagertohelpussetupPython2.7,NumPy,andOpenCV.MacPortsprovidesterminalcommandsthatautomatetheprocessofdownloading,compiling,andinstallingvariouspiecesofopensourcesoftware(OSS).MacPortsalsoinstallsdependenciesasneeded.Foreachpieceofsoftware,thedependenciesandbuildrecipesaredefinedinaconfigurationfilecalledaPortfile.AMacPortsrepositoryisacollectionofPortfiles.

StartingfromasystemwhereXcodeanditscommand-linetoolsarealreadysetup,thefollowingstepswillgiveusanOpenCVinstallationviaMacPorts:

1. DownloadandinstallMacPortsfromhttp://www.macports.org/install.php.2. IfyouwantsupportfortheKinectdepthcamera,youneedtotellMacPortswhereto

downloadthecustomPortfilesthatIhavewritten.Todoso,edit/opt/local/etc/macports/sources.conf(assumingthatMacPortsisinstalledtothedefaultlocation).Justabovetheline,rsync://rsync.macports.org/release/ports/[default],addthefollowingline:

http://nummist.com/opencv/ports.tar.gz

Savethefile.Now,MacPortsknowsthatithastosearchforPortfilesinmyonlinerepositoryfirst,andthenthedefaultonlinerepository.

3. OpentheterminalandrunthefollowingcommandtoupdateMacPorts:

$sudoportselfupdate

Whenprompted,enteryourpassword.

4. Now(ifweareusingmyrepository),runthefollowingcommandtoinstallOpenCVwithPython2.7bindingsandsupportfordepthcameras,includingKinect:

$sudoportinstallopencv+python27+openni_sensorkinect

Alternatively(withorwithoutmyrepository),runthefollowingcommandtoinstallOpenCVwithPython2.7bindingsandsupportfordepthcameras,excludingKinect:

$sudoportinstallopencv+python27+openni

NoteDependencies,includingPython2.7,NumPy,OpenNI,and(inthefirstexample)SensorKinect,areautomaticallyinstalledaswell.

Byadding+python27tothecommand,wespecifythatwewanttheopencvvariant(buildconfiguration)withPython2.7bindings.Similarly,+openni_sensorkinectspecifiesthevariantwiththebroadestpossiblesupportfordepthcamerasviaOpenNIandSensorKinect.Youmayomit+openni_sensorkinectifyoudonotintendtousedepthcameras,oryoumayreplaceitwith+openniifyoudointendtouseOpenNI-compatibledepthcamerasbutjustnotKinect.Toseethefulllistoftheavailablevariantsbeforeinstalling,wecanenterthefollowingcommand:

$portvariantsopencv

Page 54: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Dependingonourcustomizationneeds,wecanaddothervariantstotheinstallcommand.Forevenmoreflexibility,wecanwriteourownvariants(asdescribedinthenextsection).

5. Also,runthefollowingcommandtoinstallSciPy:

$sudoportinstallpy27-scipy

6. ThePythoninstallation’sexecutableisnamedpython2.7.Ifwewanttolinkthedefaultpythonexecutabletopython2.7,let’salsorunthiscommand:

$sudoportinstallpython_select

$sudoportselectpythonpython27

UsingMacPortswithyourowncustompackagesWithafewextrasteps,wecanchangethewaythatMacPortscompilesOpenCVoranyotherpieceofsoftware.Aspreviouslymentioned,MacPorts’buildrecipesaredefinedinconfigurationfilescalledPortfiles.BycreatingoreditingPortfiles,wecanaccesshighlyconfigurablebuildtools,suchasCMake,whilealsobenefittingfromMacPorts’features,suchasdependencyresolution.

Let’sassumethatwealreadyhaveMacPortsinstalled.Now,wecanconfigureMacPortstousethecustomPortfilesthatwewrite:

1. CreateafoldersomewheretoholdourcustomPortfiles.Wewillrefertothisfolderas<local_repository>.

2. Editthe/opt/local/etc/macports/sources.conffile(assumingthatMacPortsisinstalledtothedefaultlocation).Justabovethersync://rsync.macports.org/release/ports/[default]line,addthisline:

file://<local_repository>

Forexample,if<local_repository>is/Users/Joe/Portfiles,addthefollowingline:

file:///Users/Joe/Portfiles

Notethetripleslashesandsavethefile.Now,MacPortsknowsthatithastosearchforPortfilesin<local_repository>first,andthen,itsdefaultonlinerepository.

3. OpentheterminalandupdateMacPortstoensurethatwehavethelatestPortfilesfromthedefaultrepository:

$sudoportselfupdate

4. Let’scopythedefaultrepository’sopencvPortfileasanexample.Weshouldalsocopythedirectorystructure,whichdetermineshowthepackageiscategorizedbyMacPorts:

$mkdir<local_repository>/graphics/

$cp

/opt/local/var/macports/sources/rsync.macports.org/release/ports/graphi

Page 55: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

cs/opencv<local_repository>/graphics

Alternatively,foranexamplethatincludesKinectsupport,wecoulddownloadmyonlinerepositoryfromhttp://nummist.com/opencv/ports.tar.gz,unzipit,andcopyitsentiregraphicsfolderinto<local_repository>:

$cp<unzip_destination>/graphics<local_repository>

5. Edit<local_repository>/graphics/opencv/Portfile.NotethatthisfilespecifiestheCMakeconfigurationflags,dependencies,andvariants.FordetailsonthePortfileediting,gotohttp://guide.macports.org/#development.

ToseewhichCMakeconfigurationflagsarerelevanttoOpenCV,weneedtolookatitssourcecode.Downloadthesourcecodearchivefromhttps://github.com/Itseez/opencv/archive/3.0.0.zip,unzipittoanylocation,andread<unzip_destination>/OpenCV-3.0.0/CMakeLists.txt.

AftermakinganyeditstothePortfile,saveit.

6. Now,weneedtogenerateanindexfileinourlocalrepositorysothatMacPortscanfindthenewPortfile:

$cd<local_repository>

$portindex

7. Fromnowon,wecantreatourcustomopencvfilejustlikeanyotherMacPortspackage.Forexample,wecaninstallitasfollows:

$sudoportinstallopencv+python27+openni_sensorkinect

Notethatourlocalrepository’sPortfiletakesprecedenceoverthedefaultrepository’sPortfilebecauseoftheorderinwhichtheyarelistedin/opt/local/etc/macports/sources.conf.

UsingHomebrewwithready-madepackages(nosupportfordepthcameras)Homebrewisanotherpackagemanagerthatcanhelpus.Normally,MacPortsandHomebrewshouldnotbeinstalledonthesamemachine.

StartingfromasystemwhereXcodeanditscommand-linetoolsarealreadysetup,thefollowingstepswillgiveusanOpenCVinstallationviaHomebrew:

1. OpentheterminalandrunthefollowingcommandtoinstallHomebrew:

$ruby-e"$(curl-fsSkLraw.github.com/mxcl/homebrew/go)"

2. UnlikeMacPorts,HomebrewdoesnotautomaticallyputitsexecutablesinPATH.Todoso,createoreditthe~/.profilefileandaddthislineatthetopofthecode:

exportPATH=/usr/local/bin:/usr/local/sbin:$PATH

SavethefileandrunthiscommandtorefreshPATH:

Page 56: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

$source~/.profile

NotethatexecutablesinstalledbyHomebrewnowtakeprecedenceoverexecutablesinstalledbythesystem.

3. ForHomebrew’sself-diagnosticreport,runthefollowingcommand:

$brewdoctor

Followanytroubleshootingadviceitgives.

4. Now,updateHomebrew:

$brewupdate

5. RunthefollowingcommandtoinstallPython2.7:

$brewinstallpython

6. Now,wecaninstallNumPy.Homebrew’sselectionofthePythonlibrarypackagesislimited,soweuseaseparatepackagemanagementtoolcalledpip,whichcomeswithHomebrew’sPython:

$pipinstallnumpy

7. SciPycontainssomeFortrancode,soweneedanappropriatecompiler.WecanuseHomebrewtoinstallthegfortrancompiler:

$brewinstallgfortran

Now,wecaninstallSciPy:

$pipinstallscipy

8. ToinstallOpenCVona64-bitsystem(allnewMachardwaresincelate2006),runthefollowingcommand:

$brewinstallopencv

TipDownloadingtheexamplecode

YoucandownloadtheexamplecodefilesforallPacktPublishingbooksthatyouhavepurchasedfromyouraccountathttp://www.packtpub.com.Ifyoupurchasedthisbookelsewhere,youcanvisithttp://www.packtpub.com/supportandregistertohavethefilese-maileddirectlytoyou.

UsingHomebrewwithyourowncustompackagesHomebrewmakesiteasytoeditexistingpackagedefinitions:

$breweditopencv

ThepackagedefinitionsareactuallyscriptsintheRubyprogramminglanguage.TipsoneditingthemcanbefoundontheHomebrewWikipageat

Page 57: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

https://github.com/mxcl/homebrew/wiki/Formula-Cookbook.AscriptmayspecifyMakeorCMakeconfigurationflags,amongotherthings.

ToseewhichCMakeconfigurationflagsarerelevanttoOpenCV,weneedtolookatitssourcecode.Downloadthesourcecodearchivefromhttps://github.com/Itseez/opencv/archive/3.0.0.zip,unzipittoanylocation,andread<unzip_destination>/OpenCV-2.4.3/CMakeLists.txt.

AftermakingeditstotheRubyscript,saveit.

Thecustomizedpackagecanbetreatedasnormal.Forexample,itcanbeinstalledasfollows:

$brewinstallopencv

Page 58: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

InstallationonUbuntuanditsderivativesFirstandforemost,hereisaquicknoteonUbuntu’sversionsofanoperatingsystem:Ubuntuhasa6-monthreleasecycleinwhicheachreleaseiseithera.04ora.10minorversionofamajorversion(14atthetimeofwriting).Everytwoyears,however,Ubuntureleasesaversionclassifiedaslong-termsupport(LTS)whichwillgrantyouafiveyearsupportbyCanonical(thecompanybehindUbuntu).Ifyouworkinanenterpriseenvironment,itiscertainlyadvisabletoinstalloneoftheLTSversions.Thelatestoneavailableis14.04.

UbuntucomeswithPython2.7preinstalled.ThestandardUbunturepositorycontainsOpenCV2.4.9packageswithoutsupportfordepthcameras.Atthetimeofwritingthis,OpenCV3isnotyetavailablethroughtheUbunturepositories,sowewillhavetobuilditfromsource.Fortunately,thevastmajorityofUnix-likeandLinuxsystemscomewithallthenecessarysoftwaretobuildaprojectfromscratchalreadyinstalled.Whenbuiltfromsource,OpenCVcansupportdepthcamerasviaOpenNIandSensorKinect,whichareavailableasprecompiledbinarieswithinstallationscripts.

UsingtheUbunturepository(nosupportfordepthcameras)WecaninstallPythonandallitsnecessarydependenciesusingtheaptpackagemanager,byrunningthefollowingcommands:

>sudoapt-getinstallbuild-essential

>sudoapt-getinstallcmakegitlibgtk2.0-devpkg-configlibavcodecdevlibavformat-devlibswscale-dev

>sudoapt-getinstallpython-devpython-numpylibtbb2libtbb-devlibjpeg-

devlibpng-devlibtiff-devlibjasper-devlibdc1394-22-dev

Equivalently,wecouldhaveusedUbuntuSoftwareCenter,whichistheaptpackagemanager’sgraphicalfrontend.

BuildingOpenCVfromasourceNowthatwehavetheentirePythonstackandcmakeinstalled,wecanbuildOpenCV.First,weneedtodownloadthesourcecodefromhttps://github.com/Itseez/opencv/archive/3.0.0-beta.zip.

Extractthearchiveandmoveitintotheunzippedfolderinaterminal.

Then,runthefollowingcommands:

>mkdirbuild

>cdbuild

>cmake-DCMAKE_BUILD_TYPE=Release-DCMAKE_INSTALL_PREFIX=/usr/local..

>make

>makeinstall

Aftertheinstallationterminates,youmightwanttolookatOpenCV’sPythonsamplesin<opencv_folder>/opencv/samples/pythonand<script_folder>/opencv/samples/python2.

Page 59: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

InstallationonotherUnix-likesystemsTheapproachesforUbuntu(asdescribedpreviously)arelikelytoworkonanyLinuxdistributionderivedfromUbuntu14.04LTSorUbuntu14.10asfollows:

Kubuntu14.04LTSorKubuntu14.10Xubuntu14.04LTSorXubuntu14.10LinuxMint17

OnDebianLinuxanditsderivatives,theaptpackagemanagerworksthesameasonUbuntu,thoughtheavailablepackagesmaydiffer.

OnGentooLinuxanditsderivatives,thePortagepackagemanagerissimilartoMacPorts(asdescribedpreviously),thoughtheavailablepackagesmaydiffer.

OnFreeBSDderivatives,theprocessofinstallationisagainsimilartoMacPorts;infact,MacPortsderivesfromtheportsinstallationsystemadoptedonFreeBSD.ConsulttheremarkableFreeBSDHandbookathttps://www.freebsd.org/doc/handbook/foranoverviewofthesoftwareinstallationprocess.

OnotherUnix-likesystems,thepackagemanagerandavailablepackagesmaydiffer.Consultyourpackagemanager’sdocumentationandsearchforpackageswithopencvintheirnames.RememberthatOpenCVanditsPythonbindingsmightbesplitintomultiplepackages.

Also,lookforanyinstallationnotespublishedbythesystemprovider,therepositorymaintainer,orthecommunity.SinceOpenCVusescameradriversandmediacodecs,gettingallofitsfunctionalitytoworkcanbetrickyonsystemswithpoormultimediasupport.Undersomecircumstances,systempackagesmightneedtobereconfiguredorreinstalledforcompatibility.

IfpackagesareavailableforOpenCV,checktheirversionnumber.OpenCV3orhigherisrecommendedforthisbook’spurposes.Also,checkwhetherthepackagesofferPythonbindingsanddepthcamerasupportviaOpenNIandSensorKinect.Finally,checkwhetheranyoneinthedevelopercommunityhasreportedsuccessorfailureinusingthepackages.

If,instead,wewanttodoacustombuildofOpenCVfromsource,itmightbehelpfultorefertotheinstallationscriptforUbuntu(asdiscussedpreviously)andadaptittothepackagemanagerandpackagesthatarepresentonanothersystem.

Page 60: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 61: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

InstallingtheContribmodulesUnlikewithOpenCV2.4,somemodulesarecontainedinarepositorycalledopencv_contrib,whichisavailableathttps://github.com/Itseez/opencv_contrib.IhighlyrecommendinstallingthesemodulesastheycontainextrafunctionalitiesthatarenotincludedinOpenCV,suchasthefacerecognitionmodule.

Oncedownloaded(eitherthroughziporgit,Irecommendgitsothatyoucankeepuptodatewithasimplegitpullcommand),youcanrerunyourcmakecommandtoincludethebuildingofOpenCVwiththeopencv_contribmodulesasfollows:

cmake-DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules

<opencv_source_directory>

So,ifyou’vefollowedthestandardprocedureandcreatedabuilddirectoryinyourOpenCVdownloadfolder,youshouldrunthefollowingcommand:

mkdirbuild&&cdbuild

cmake-DCMAKE_BUILD_TYPE=Release-DOPENCV_EXTRA_MODULES_PATH=

<opencv_contrib>/modules-DCMAKE_INSTALL_PREFIX=/usr/local..

make

Page 62: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 63: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

RunningsamplesRunningafewsamplescriptsisagoodwaytotestwhetherOpenCViscorrectlysetup.ThesamplesareincludedinOpenCV’ssourcecodearchive.

OnWindows,weshouldhavealreadydownloadedandunzippedOpenCV’sself-extractingZIP.Findthesamplesin<unzip_destination>/opencv/samples.

OnUnix-likesystems,includingMac,downloadthesourcecodearchivefromhttps://github.com/Itseez/opencv/archive/3.0.0.zipandunzipittoanylocation(ifwehavenotalreadydoneso).Findthesamplesin<unzip_destination>/OpenCV-3.0.0/samples.

Someofthesamplescriptsrequirecommand-linearguments.However,thefollowingscripts(amongothers)shouldworkwithoutanyarguments:

python/camera.py:Thisscriptdisplaysawebcamfeed(assumingthatawebcamispluggedin).python/drawing.py:Thisscriptdrawsaseriesofshapes,suchasascreensaver.python2/hist.py:Thisscriptdisplaysaphoto.PressA,B,C,D,orEtoseethevariationsofthephotoalongwithacorrespondinghistogramofcolororgrayscalevalues.python2/opt_flow.py(missingfromtheUbuntupackage):Thisscriptdisplaysawebcamfeedwithasuperimposedvisualizationofanopticalflow(suchasthedirectionofmotion).Forexample,slowlywaveyourhandatthewebcamtoseetheeffect.Press1or2foralternativevisualizations.

Toexitascript,pressEsc(notthewindow’sclosebutton).

IfweencountertheImportError:Nomodulenamedcv2.cvmessage,thenthismeansthatwearerunningthescriptfromaPythoninstallationthatdoesnotknowanythingaboutOpenCV.Therearetwopossibleexplanationsforthis:

SomestepsintheOpenCVinstallationmighthavefailedorbeenmissed.Gobackandreviewthesteps.IfwehavemultiplePythoninstallationsonthemachine,wemightbeusingthewrongversionofPythontolaunchthescript.Forexample,onMac,itmightbethecasethatOpenCVisinstalledforMacPortsPython,butwearerunningthescriptwiththesystem’sPython.Gobackandreviewtheinstallationstepsabouteditingthesystempath.Also,trylaunchingthescriptmanuallyfromthecommandlineusingcommandssuchasthis:

$pythonpython/camera.py

Youcanalsousethefollowingcommand:

$python2.7python/camera.py

AsanotherpossiblemeansofselectingadifferentPythoninstallation,tryeditingthesamplescripttoremovethe#!lines.TheselinesmightexplicitlyassociatethescriptwiththewrongPythoninstallation(forourparticularsetup).

Page 64: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 65: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Findingdocumentation,help,andupdatesOpenCV’sdocumentationcanbefoundonlineathttp://docs.opencv.org/.ThedocumentationincludesacombinedAPIreferenceforOpenCV’snewC++API,itsnewPythonAPI(whichisbasedontheC++API),oldCAPI,anditsoldPythonAPI(whichisbasedontheCAPI).Whenlookingupaclassorfunction,besuretoreadthesectionaboutthenewPythonAPI(thecv2module),andnottheoldPythonAPI(thecvmodule).

ThedocumentationisalsoavailableasseveraldownloadablePDFfiles:

APIreference:Thisdocumentationcanbefoundathttp://docs.opencv.org/modules/refman.htmlTutorials:Thesedocumentscanbefoundathttp://docs.opencv.org/doc/tutorials/tutorials.html(thesetutorialsusetheC++code;foraPythonportofthetutorials’code,seetherepositoryofAbidRahmanK.athttp://goo.gl/EPsD1)

IfyouwritecodeonairplanesorotherplaceswithoutInternetaccess,youwilldefinitelywanttokeepofflinecopiesofthedocumentation.

Ifthedocumentationdoesnotseemtoansweryourquestions,trytalkingtotheOpenCVcommunity.Herearesomesiteswhereyouwillfindhelpfulpeople:

TheOpenCVforum:http://www.answers.opencv.org/questions/DavidMillánEscrivá‘sblog(oneofthisbook’sreviewers):http://blog.damiles.com/AbidRahmanK.‘sblog(oneofthisbook’sreviewers):http://www.opencvpython.blogspot.com/AdrianRosebrock’swebsite(oneofthisbook’sreviewers):http://www.pyimagesearch.com/JoeMinichino’swebsiteforthisbook(authorofthisbook):http://techfort.github.io/pycv/JoeHowse’swebsiteforthisbook(authorofthefirsteditionofthisbook):http://nummist.com/opencv/

Lastly,ifyouareanadvanceduserwhowantstotrynewfeatures,bugfixes,andsamplescriptsfromthelatest(unstable)OpenCVsourcecode,havealookattheproject’srepositoryathttps://github.com/Itseez/opencv/.

Page 66: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 67: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryBynow,weshouldhaveanOpenCVinstallationthatcandoeverythingweneedfortheprojectdescribedinthisbook.Dependingonwhichapproachwetook,wemightalsohaveasetoftoolsandscriptsthatareusabletoreconfigureandrebuildOpenCVforourfutureneeds.

WeknowwheretofindOpenCV’sPythonsamples.Thesesamplescoveredadifferentrangeoffunctionalitiesoutsidethisbook’sscope,buttheyareusefulasadditionallearningaids.

Inthenextchapter,wewillfamiliarizeourselveswiththemostbasicfunctionsoftheOpenCVAPI,namely,displayingimages,videos,capturingvideosthroughawebcam,andhandlingbasickeyboardandmouseinputs.

Page 68: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 69: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter2.HandlingFiles,Cameras,andGUIsInstallingOpenCVandrunningsamplesisfun,butatthisstage,wewanttotryitoutourselves.ThischapterintroducesOpenCV’sI/Ofunctionality.Wealsodiscusstheconceptofaprojectandthebeginningsofanobject-orienteddesignforthisproject,whichwewillfleshoutinsubsequentchapters.

BystartingwithalookattheI/Ocapabilitiesanddesignpatterns,wewillbuildourprojectinthesamewaywewouldmakeasandwich:fromtheoutsidein.Breadslicesandspread,orendpointsandglue,comebeforefillingsoralgorithms.Wechoosethisapproachbecausecomputervisionismostlyextroverted—itcontemplatestherealworldoutsideourcomputer—andwewanttoapplyalloursubsequentalgorithmicworktotherealworldthroughacommoninterface.

Page 70: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

BasicI/OscriptsMostCVapplicationsneedtogetimagesasinput.Mostalsoproduceimagesasoutput.AninteractiveCVapplicationmightrequireacameraasaninputsourceandawindowasanoutputdestination.However,otherpossiblesourcesanddestinationsincludeimagefiles,videofiles,andrawbytes.Forexample,rawbytesmightbetransmittedviaanetworkconnection,ortheymightbegeneratedbyanalgorithmifweincorporateproceduralgraphicsintoourapplication.Let’slookateachofthesepossibilities.

Page 71: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Reading/writinganimagefileOpenCVprovidestheimread()andimwrite()functionsthatsupportvariousfileformatsforstillimages.ThesupportedformatsvarybysystembutshouldalwaysincludetheBMPformat.Typically,PNG,JPEG,andTIFFshouldbeamongthesupportedformatstoo.

Let’sexploretheanatomyoftherepresentationofanimageinPythonandNumPy.

Nomattertheformat,eachpixelhasavalue,butthedifferenceisinhowthepixelisrepresented.Forexample,wecancreateablacksquareimagefromscratchbysimplycreatinga2DNumPyarray:

img=numpy.zeros((3,3),dtype=numpy.uint8)

Ifweprintthisimagetoaconsole,weobtainthefollowingresult:

array([[0,0,0],

[0,0,0],

[0,0,0]],dtype=uint8)

Eachpixelisrepresentedbyasingle8-bitinteger,whichmeansthatthevaluesforeachpixelareinthe0-255range.

Let’snowconvertthisimageintoBlue-green-red(BGR)usingcv2.cvtColor:

img=cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)

Let’sobservehowtheimagehaschanged:

array([[[0,0,0],

[0,0,0],

[0,0,0]],

[[0,0,0],

[0,0,0],

[0,0,0]],

[[0,0,0],

[0,0,0],

[0,0,0]]],dtype=uint8)

Asyoucansee,eachpixelisnowrepresentedbyathree-elementarray,witheachintegerrepresentingtheB,G,andRchannels,respectively.Othercolorspaces,suchasHSV,willberepresentedinthesameway,albeitwithdifferentvalueranges(forexample,thehuevalueoftheHSVcolorspacehasarangeof0-180)anddifferentnumbersofchannels.

Youcancheckthestructureofanimagebyinspectingtheshapeproperty,whichreturnsrows,columns,andthenumberofchannels(ifthereismorethanone).

Considerthisexample:

>>>img=numpy.zeros((3,3),dtype=numpy.uint8)

>>>img.shape

Theprecedingcodewillprint(3,3).IfyouthenconvertedtheimagetoBGR,theshape

Page 72: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

wouldbe(3,3,3),whichindicatesthepresenceofthreechannelsperpixel.

Imagescanbeloadedfromonefileformatandsavedtoanother.Forexample,let’sconvertanimagefromPNGtoJPEG:

importcv2

image=cv2.imread('MyPic.png')

cv2.imwrite('MyPic.jpg',image)

NoteMostoftheOpenCVfunctionalitiesthatweuseareinthecv2module.YoumightcomeacrossotherOpenCVguidesthatinsteadrelyonthecvorcv2.cvmodules,whicharelegacyversions.ThereasonwhythePythonmoduleiscalledcv2isnotbecauseitisaPythonbindingmoduleforOpenCV2.x.x,butbecauseithasintroducedabetterAPI,whichleveragesobject-orientedprogrammingasopposedtothepreviouscvmodule,whichadheredtoamoreproceduralstyleofprogramming.

Bydefault,imread()returnsanimageintheBGRcolorformatevenifthefileusesagrayscaleformat.BGRrepresentsthesamecolorspaceasred-green-blue(RGB),butthebyteorderisreversed.

Optionally,wemayspecifythemodeofimread()tobeoneofthefollowingenumerators:

IMREAD_ANYCOLOR=4

IMREAD_ANYDEPTH=2

IMREAD_COLOR=1

IMREAD_GRAYSCALE=0

IMREAD_LOAD_GDAL=8

IMREAD_UNCHANGED=-1

Forexample,let’sloadaPNGfileasagrayscaleimage(losinganycolorinformationintheprocess),andthen,saveitasagrayscalePNGimage:

importcv2

grayImage=cv2.imread('MyPic.png',cv2.IMREAD_GRAYSCALE)

cv2.imwrite('MyPicGray.png',grayImage)

Toavoidunnecessaryheadaches,useabsolutepathstoyourimages(forexample,C:\Users\Joe\Pictures\MyPic.pngonWindowsor/home/joe/pictures/MyPic.pngonUnix)atleastwhileyou’refamiliarizingyourselfwithOpenCV’sAPI.Thepathofanimage,unlessabsolute,isrelativetothefolderthatcontainsthePythonscript,sointheprecedingexample,MyPic.pngwouldhavetobeinthesamefolderasyourPythonscriptortheimagewon’tbefound.

Regardlessofthemode,imread()discardsanyalphachannel(transparency).Theimwrite()functionrequiresanimagetobeintheBGRorgrayscaleformatwithacertainnumberofbitsperchannelthattheoutputformatcansupport.Forexample,bmprequires8bitsperchannel,whilePNGallowseither8or16bitsperchannel.

Page 73: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ConvertingbetweenanimageandrawbytesConceptually,abyteisanintegerrangingfrom0to255.Inallreal-timegraphicapplicationstoday,apixelistypicallyrepresentedbyonebyteperchannel,thoughotherrepresentationsarealsopossible.

AnOpenCVimageisa2Dor3Darrayofthe.arraytype.An8-bitgrayscaleimageisa2Darraycontainingbytevalues.A24-bitBGRimageisa3Darray,whichalsocontainsbytevalues.Wemayaccessthesevaluesbyusinganexpression,suchasimage[0,0]orimage[0,0,0].Thefirstindexisthepixel’sycoordinateorrow,0beingthetop.Thesecondindexisthepixel’sxcoordinateorcolumn,0beingtheleftmost.Thethirdindex(ifapplicable)representsacolorchannel.

Forexample,inan8-bitgrayscaleimagewithawhitepixelintheupper-leftcorner,image[0,0]is255.Fora24-bitBGRimagewithabluepixelintheupper-leftcorner,image[0,0]is[255,0,0].

NoteAsanalternativetousinganexpression,suchasimage[0,0]orimage[0,0]=128,wemayuseanexpression,suchasimage.item((0,0))orimage.setitem((0,0),128).Thelatterexpressionsaremoreefficientforsingle-pixeloperations.However,aswewillseeinsubsequentchapters,weusuallywanttoperformoperationsonlargeslicesofanimageratherthanonsinglepixels.

Providedthatanimagehas8bitsperchannel,wecancastittoastandardPythonbytearray,whichisone-dimensional:

byteArray=bytearray(image)

Conversely,providedthatbytearraycontainsbytesinanappropriateorder,wecancastandthenreshapeittogetanumpy.arraytypethatisanimage:

grayImage=numpy.array(grayByteArray).reshape(height,width)

bgrImage=numpy.array(bgrByteArray).reshape(height,width,3)

Asamorecompleteexample,let’sconvertbytearray,whichcontainsrandombytestoagrayscaleimageandaBGRimage:

importcv2

importnumpy

importos

#Makeanarrayof120,000randombytes.

randomByteArray=bytearray(os.urandom(120000))

flatNumpyArray=numpy.array(randomByteArray)

#Convertthearraytomakea400x300grayscaleimage.

grayImage=flatNumpyArray.reshape(300,400)

cv2.imwrite('RandomGray.png',grayImage)

#Convertthearraytomakea400x100colorimage.

bgrImage=flatNumpyArray.reshape(100,400,3)

Page 74: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

cv2.imwrite('RandomColor.png',bgrImage)

Afterrunningthisscript,weshouldhaveapairofrandomlygeneratedimages,RandomGray.pngandRandomColor.png,inthescript’sdirectory.

NoteHere,weusePython’sstandardos.urandom()functiontogeneraterandomrawbytes,whichwewillthenconverttoaNumPyarray.NotethatitisalsopossibletogeneratearandomNumPyarraydirectly(andmoreefficiently)usingastatement,suchasnumpy.random.randint(0,256,120000).reshape(300,400).Theonlyreasonweuseos.urandom()istohelpdemonstrateaconversionfromrawbytes.

Page 75: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Accessingimagedatawithnumpy.arrayNowthatyouhaveabetterunderstandingofhowanimageisformed,wecanstartperformingbasicoperationsonit.Weknowthattheeasiest(andmostcommon)waytoloadanimageinOpenCVistousetheimreadfunction.Wealsoknowthatthiswillreturnanimage,whichisreallyanarray(eithera2Dor3Done,dependingontheparametersyoupassedtoimread()).

They.arraystructureiswelloptimizedforarrayoperations,anditallowscertainkindsofbulkmanipulationsthatarenotavailableinaplainPythonlist.Thesekindsof.arraytype-specificoperationscomeinhandyforimagemanipulationsinOpenCV.Let’sexploreimagemanipulationsfromthestartandstepbystepthough,withabasicexample:sayyouwanttomanipulateapixelatthecoordinates,(0,0),ofaBGRimageandturnitintoawhitepixel.

importcv

importnumpyasnp

img=cv.imread('MyPic.png')

img[0,0]=[255,255,255]

Ifyouthenshowedtheimagewithastandardimshow()call,youwillseeawhitedotinthetop-leftcorneroftheimage.Naturally,thisisn’tveryuseful,butitshowswhatcanbeaccomplished.Let’snowleveragetheabilityofnumpy.arraytooperatetransformationstoanarraymuchfasterthanaplainPythonarray.

Let’ssaythatyouwanttochangethebluevalueofaparticularpixel,forexample,thepixelatcoordinates,(150,120).Thenumpy.arraytypeprovidesaveryhandymethod,item(),whichtakesthreeparameters:thex(orleft)position,y(ortop),andtheindexwithinthearrayat(x,y)position(rememberthatinaBGRimage,thedataatacertainpositionisathree-elementarraycontainingtheB,G,andRvaluesinthisorder)andreturnsthevalueattheindexposition.Anotheritemset()methodsetsthevalueofaparticularchannelofaparticularpixeltoaspecifiedvalue(itemset()takestwoarguments:athree-elementtuple(x,y,andindex)andthenewvalue).

Inthisexample,wewillchangethevalueofblueat(150,120)fromitscurrentvalue(127)toanarbitrary255:

importcv

importnumpyasnp

img=cv.imread('MyPic.png')

printimg.item(150,120,0)//printsthecurrentvalueofBforthat

pixel

img.itemset((150,120,0),255)

printimg.item(150,120,0)//prints255

Rememberthatwedothiswithnumpy.arrayfortworeasons:numpy.arrayisanextremelyoptimizedlibraryforthesekindofoperations,andbecauseweobtainmorereadablecodethroughNumPy’selegantmethodsratherthantherawindexaccessofthefirstexample.

Page 76: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Thisparticularcodedoesn’tdomuchinitself,butitdoesopenaworldofpossibilities.Itis,however,advisablethatyouutilizebuilt-infiltersandmethodstomanipulateanentireimage;theaboveapproachisonlysuitableforsmallregionsofinterest.

Now,let’stakealookataverycommonoperation,namely,manipulatingchannels.Sometimes,you’llwanttozero-outallthevaluesofaparticularchannel(B,G,orR).

TipUsingloopstomanipulatethePythonarraysisverycostlyintermsofruntimeandshouldbeavoidedatallcosts.Usingarrayindexingallowsforefficientmanipulationofpixels.Thisisacostlyandslowoperation,especiallyifyoumanipulatevideos,you’llfindyourselfwithajitteryoutput.Thenafeaturecalledindexingcomestotherescue.SettingallG(green)valuesofanimageto0isassimpleasusingthiscode:

importcv

importasnp

img=cv.imread('MyPic.png')

img[:,:,1]=0

Thisisafairlyimpressivepieceofcodeandeasytounderstand.Therelevantlineisthelastone,whichbasicallyinstructstheprogramtotakeallpixelsfromallrowsandcolumnsandsettheresultingvalueatindexoneofthethree-elementarray,representingthecolorofthepixelto0.Ifyoudisplaythisimage,youwillnoticeacompleteabsenceofgreen.

ThereareanumberofinterestingthingswecandobyaccessingrawpixelswithNumPy’sarrayindexing;oneofthemisdefiningregionsofinterests(ROI).Oncetheregionisdefined,wecanperformanumberofoperations,namely,bindingthisregiontoavariable,andthenevendefiningasecondregionandassigningitthevalueofthefirstone(visuallycopyingaportionoftheimageovertoanotherpositionintheimage):

importcv

importnumpyasnp

img=cv.imread('MyPic.png')

my_roi=img[0:100,0:100]

img[300:400,300:400]=my_roi

It’simportanttomakesurethatthetworegionscorrespondintermsofsize.Ifnot,NumPywill(rightly)complainthatthetwoshapesmismatch.

Finally,thereareafewinterestingdetailswecanobtainfromnumpy.array,suchastheimagepropertiesusingthiscode:

importcv

importnumpyasnp

img=cv.imread('MyPic.png')

printimg.shape

printimg.size

printimg.dtype

Thesethreepropertiesareinthisorder:

Page 77: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Shape:NumPyreturnsatuplecontainingthewidth,height,and—iftheimageisincolor—thenumberofchannels.Thisisusefultodebugatypeofimage;iftheimageismonochromaticorgrayscale,itwillnotcontainachannel’svalue.Size:Thispropertyreferstothesizeofanimageinpixels.Datatype:Thispropertyreferstothedatatypeusedforanimage(normallyavariationofanunsignedintegertypeandthebitssupportedbythistype,thatis,uint8).

Allinall,itisstronglyadvisablethatyoufamiliarizeyourselfwithNumPyingeneralandnumpy.arrayinparticularwhenworkingwithOpenCV,asitisthefoundationofanimageprocessingdonewithPython.

Page 78: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Reading/writingavideofileOpenCVprovidestheVideoCaptureandVideoWriterclassesthatsupportvariousvideofileformats.ThesupportedformatsvarybysystembutshouldalwaysincludeanAVI.Viaitsread()method,aVideoCaptureclassmaybepolledfornewframesuntilitreachestheendofitsvideofile.EachframeisanimageinaBGRformat.

Conversely,animagemaybepassedtothewrite()methodoftheVideoWriterclass,whichappendstheimagetoafileinVideoWriter.Let’slookatanexamplethatreadsframesfromoneAVIfileandwritesthemtoanotherwithaYUVencoding:

importcv2

videoCapture=cv2.VideoCapture('MyInputVid.avi')

fps=videoCapture.get(cv2.CAP_PROP_FPS)

size=(int(videoCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),

int(videoCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))

videoWriter=cv2.VideoWriter(

'MyOutputVid.avi',cv2.VideoWriter_fourcc('I','4','2','0'),fps,size)

success,frame=videoCapture.read()

whilesuccess:#Loopuntiltherearenomoreframes.

videoWriter.write(frame)

success,frame=videoCapture.read()

TheargumentstotheVideoWriterclassconstructordeservespecialattention.Avideo’sfilenamemustbespecified.Anypreexistingfilewiththisnameisoverwritten.Avideocodecmustalsobespecified.Theavailablecodecsmayvaryfromsystemtosystem.Thesearetheoptionsthatareincluded:

cv2.VideoWriter_fourcc('I','4','2','0'):ThisoptionisanuncompressedYUVencoding,4:2:0chromasubsampled.Thisencodingiswidelycompatiblebutproduceslargefiles.Thefileextensionshouldbe.avi.cv2.VideoWriter_fourcc('P','I','M','1'):ThisoptionisMPEG-1.Thefileextensionshouldbe.avi.cv2.VideoWriter_fourcc('X','V','I','D'):ThisoptionisMPEG-4andapreferredoptionifyouwanttheresultingvideosizetobeaverage.Thefileextensionshouldbe.avi.cv2.VideoWriter_fourcc('T','H','E','O'):ThisoptionisOggVorbis.Thefileextensionshouldbe.ogv.cv2.VideoWriter_fourcc('F','L','V','1'):ThisoptionisaFlashvideo.Thefileextensionshouldbe.flv.

Aframerateandframesizemustbespecifiedtoo.Sincewearecopyingvideoframesfromanothervideo,thesepropertiescanbereadfromtheget()methodoftheVideoCaptureclass.

Page 79: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CapturingcameraframesAstreamofcameraframesisrepresentedbytheVideoCaptureclasstoo.However,foracamera,weconstructaVideoCaptureclassbypassingthecamera’sdeviceindexinsteadofavideo’sfilename.Let’sconsideranexamplethatcaptures10secondsofvideofromacameraandwritesittoanAVIfile:

importcv2

cameraCapture=cv2.VideoCapture(0)

fps=30#anassumption

size=(int(cameraCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),

int(cameraCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))

videoWriter=cv2.VideoWriter(

'MyOutputVid.avi',cv2.VideoWriter_fourcc('I','4','2','0'),fps,size)

success,frame=cameraCapture.read()

numFramesRemaining=10*fps-1

whilesuccessandnumFramesRemaining>0:

videoWriter.write(frame)

success,frame=cameraCapture.read()

numFramesRemaining-=1

cameraCapture.release()

Unfortunately,theget()methodofaVideoCaptureclassdoesnotreturnanaccuratevalueforthecamera’sframerate;italwaysreturns0.Theofficialdocumentationathttp://docs.opencv.org/modules/highgui/doc/reading_and_writing_images_and_video.htmlreads:

“WhenqueryingapropertythatisnotsupportedbythebackendusedbytheVideoCaptureclass,value0isreturned.”

Thisoccursmostcommonlyonsystemswherethedriveronlysupportsbasicfunctionalities.

ForthepurposeofcreatinganappropriateVideoWriterclassforthecamera,wehavetoeithermakeanassumptionabouttheframerate(aswedidinthecodepreviously)ormeasureitusingatimer.Thelatterapproachisbetterandwewillcoveritlaterinthischapter.

Thenumberofcamerasandtheirorderisofcoursesystem-dependent.Unfortunately,OpenCVdoesnotprovideanymeansofqueryingthenumberofcamerasortheirproperties.IfaninvalidindexisusedtoconstructaVideoCaptureclass,theVideoCaptureclasswillnotyieldanyframes;itsread()methodwillreturn(false,None).AgoodwaytopreventitfromtryingtoretrieveframesfromVideoCapturethatwerenotopenedcorrectlyistousetheVideoCapture.isOpenedmethod,whichreturnsaBoolean.

Theread()methodisinappropriatewhenweneedtosynchronizeasetofcamerasoramultiheadcamera(suchasastereocameraorKinect).Then,weusethegrab()and

Page 80: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

retrieve()methodsinstead.Forasetofcameras,weusethiscode:

success0=cameraCapture0.grab()

success1=cameraCapture1.grab()

ifsuccess0andsuccess1:

frame0=cameraCapture0.retrieve()

frame1=cameraCapture1.retrieve()

Page 81: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DisplayingimagesinawindowOneofthemostbasicoperationsinOpenCVisdisplayinganimage.Thiscanbedonewiththeimshow()function.IfyoucomefromanyotherGUIframeworkbackground,youwouldthinkitsufficienttocallimshow()todisplayanimage.Thisisonlypartiallytrue:theimagewillbedisplayed,andwilldisappearimmediately.Thisisbydesign,toenabletheconstantrefreshingofawindowframewhenworkingwithvideos.Here’saverysimpleexamplecodetodisplayanimage:

importcv2

importnumpyasnp

img=cv2.imread('my-image.png')

cv2.imshow('myimage',img)

cv2.waitKey()

cv2.destroyAllWindows()

Theimshow()functiontakestwoparameters:thenameoftheframeinwhichwewanttodisplaytheimage,andtheimageitself.We’lltalkaboutwaitKey()inmoredetailwhenweexplorethedisplayingofframesinawindow.

TheaptlynameddestroyAllWindows()functiondisposesofallthewindowscreatedbyOpenCV.

Page 82: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DisplayingcameraframesinawindowOpenCVallowsnamedwindowstobecreated,redrawn,anddestroyedusingthenamedWindow(),imshow(),anddestroyWindow()functions.Also,anywindowmaycapturekeyboardinputviathewaitKey()functionandmouseinputviathesetMouseCallback()function.Let’slookatanexamplewhereweshowtheframesofalivecamerainput:

importcv2

clicked=False

defonMouse(event,x,y,flags,param):

globalclicked

ifevent==cv2.EVENT_LBUTTONUP:

clicked=True

cameraCapture=cv2.VideoCapture(0)

cv2.namedWindow('MyWindow')

cv2.setMouseCallback('MyWindow',onMouse)

print'Showingcamerafeed.Clickwindoworpressanykeytostop.'

success,frame=cameraCapture.read()

whilesuccessandcv2.waitKey(1)==-1andnotclicked:

cv2.imshow('MyWindow',frame)

success,frame=cameraCapture.read()

cv2.destroyWindow('MyWindow')

cameraCapture.release()

TheargumentforwaitKey()isanumberofmillisecondstowaitforkeyboardinput.Thereturnvalueiseither-1(meaningthatnokeyhasbeenpressed)oranASCIIkeycode,suchas27forEsc.ForalistofASCIIkeycodes,seehttp://www.asciitable.com/.Also,notethatPythonprovidesastandardfunction,ord(),whichcanconvertacharactertoitsASCIIkeycode.Forexample,ord('a')returns97.

TipOnsomesystems,waitKey()mayreturnavaluethatencodesmorethanjusttheASCIIkeycode.(AbugisknowntooccuronLinuxwhenOpenCVusesGTKasitsbackendGUIlibrary.)Onallsystems,wecanensurethatweextractjusttheASCIIkeycodebyreadingthelastbytefromthereturnvaluelikethis:

keycode=cv2.waitKey(1)

ifkeycode!=-1:

keycode&=0xFF

OpenCV’swindowfunctionsandwaitKey()areinterdependent.OpenCVwindowsareonlyupdatedwhenwaitKey()iscalled,andwaitKey()onlycapturesinputwhenanOpenCVwindowhasfocus.

ThemousecallbackpassedtosetMouseCallback()shouldtakefivearguments,asseeninourcodesample.Thecallback’sparamargumentissetasanoptionalthirdargumentto

Page 83: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

setMouseCallback().Bydefault,itis0.Thecallback’seventargumentisoneofthefollowingactions:

cv2.EVENT_MOUSEMOVE:Thiseventreferstomousemovementcv2.EVENT_LBUTTONDOWN:Thiseventreferstotheleftbuttondowncv2.EVENT_RBUTTONDOWN:Thisreferstotherightbuttondowncv2.EVENT_MBUTTONDOWN:Thisreferstothemiddlebuttondowncv2.EVENT_LBUTTONUP:Thisreferstotheleftbuttonupcv2.EVENT_RBUTTONUP:Thiseventreferstotherightbuttonupcv2.EVENT_MBUTTONUP:Thiseventreferstothemiddlebuttonupcv2.EVENT_LBUTTONDBLCLK:Thiseventreferstotheleftbuttonbeingdouble-clickedcv2.EVENT_RBUTTONDBLCLK:Thisreferstotherightbuttonbeingdouble-clickedcv2.EVENT_MBUTTONDBLCLK:Thisreferstothemiddlebuttonbeingdouble-clicked

Themousecallback’sflagsargumentmaybesomebitwisecombinationofthefollowingevents:

cv2.EVENT_FLAG_LBUTTON:Thiseventreferstotheleftbuttonbeingpressedcv2.EVENT_FLAG_RBUTTON:Thiseventreferstotherightbuttonbeingpressedcv2.EVENT_FLAG_MBUTTON:Thiseventreferstothemiddlebuttonbeingpressedcv2.EVENT_FLAG_CTRLKEY:ThiseventreferstotheCtrlkeybeingpressedcv2.EVENT_FLAG_SHIFTKEY:ThiseventreferstotheShiftkeybeingpressedcv2.EVENT_FLAG_ALTKEY:ThiseventreferstotheAltkeybeingpressed

Unfortunately,OpenCVdoesnotprovideanymeansofhandlingwindowevents.Forexample,wecannotstopourapplicationwhenawindow’sclosebuttonisclicked.DuetoOpenCV’slimitedeventhandlingandGUIcapabilities,manydevelopersprefertointegrateitwithotherapplicationframeworks.Laterinthischapter,wewilldesignanabstractionlayertohelpintegrateOpenCVintoanyapplicationframework.

Page 84: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 85: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ProjectCameo(facetrackingandimagemanipulation)OpenCVisoftenstudiedthroughacookbookapproachthatcoversalotofalgorithmsbutnothingabouthigh-levelapplicationdevelopment.Toanextent,thisapproachisunderstandablebecauseOpenCV’spotentialapplicationsaresodiverse.OpenCVisusedinawidevarietyofapplications:photo/videoeditors,motion-controlledgames,arobot’sAI,orpsychologyexperimentswherewelogparticipants’eyemovements.Acrosssuchdifferentusecases,canwetrulystudyausefulsetofabstractions?

Ibelievewecanandthesoonerwestartcreatingabstractions,thebetter.WewillstructureourstudyofOpenCVaroundasingleapplication,but,ateachstep,wewilldesignacomponentofthisapplicationtobeextensibleandreusable.

Wewilldevelopaninteractiveapplicationthatperformsfacetrackingandimagemanipulationsoncamerainputinrealtime.ThistypeofapplicationcoversabroadrangeofOpenCV’sfunctionalityandchallengesustocreateanefficient,effectiveimplementation.

Specifically,ourapplicationwillperformreal-timefacialmerging.Giventwostreamsofcamerainput(or,optionally,prerecordedvideoinput),theapplicationwillsuperimposefacesfromonestreamontofacesintheother.Filtersanddistortionswillbeappliedtogivethisblendedsceneaunifiedlookandfeel.Usersshouldhavetheexperienceofbeingengagedinaliveperformancewheretheyenteranotherenvironmentandpersona.ThistypeofuserexperienceispopularinamusementparkssuchasDisneyland.

Insuchanapplication,userswouldimmediatelynoticeflaws,suchasalowframerateorinaccuratetracking.Togetthebestresults,wewilltryseveralapproachesusingconventionalimaginganddepthimaging.

WewillcallourapplicationCameo.Acameois(injewelry)asmallportraitofapersonor(infilm)averybriefroleplayedbyacelebrity.

Page 86: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 87: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Cameo–anobject-orienteddesignPythonapplicationscanbewritteninapurelyproceduralstyle.Thisisoftendonewithsmallapplications,suchasourbasicI/Oscripts,discussedpreviously.However,fromnowon,wewilluseanobject-orientedstylebecauseitpromotesmodularityandextensibility.

FromouroverviewofOpenCV’sI/Ofunctionality,weknowthatallimagesaresimilar,regardlessoftheirsourceordestination.Nomatterhowweobtainastreamofimagesorwherewesenditasoutput,wecanapplythesameapplication-specificlogictoeachframeinthisstream.SeparationofI/Ocodeandapplicationcodebecomesespeciallyconvenientinanapplication,suchasCameo,whichusesmultipleI/Ostreams.

WewillcreateclassescalledCaptureManagerandWindowManagerashigh-levelinterfacestoI/Ostreams.OurapplicationcodemayuseCaptureManagertoreadnewframesand,optionally,todispatcheachframetooneormoreoutputs,includingastillimagefile,avideofile,andawindow(viaaWindowManagerclass).AWindowManagerclassletsourapplicationcodehandleawindowandeventsinanobject-orientedstyle.

BothCaptureManagerandWindowManagerareextensible.WecouldmakeimplementationsthatdonotrelyonOpenCVforI/O.Indeed,AppendixA,IntegratingwithPygame,OpenCVComputerVisionwithPython,usesaWindowManagersubclass.

Page 88: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Abstractingavideostreamwithmanagers.CaptureManagerAswehaveseen,OpenCVcancapture,show,andrecordastreamofimagesfromeitheravideofileoracamera,buttherearesomespecialconsiderationsineachcase.OurCaptureManagerclassabstractssomeofthedifferencesandprovidesahigher-levelinterfacetodispatchimagesfromthecapturestreamtooneormoreoutputs—astillimagefile,videofile,orawindow.

ACaptureManagerclassisinitializedwithaVideoCaptureclassandhastheenterFrame()andexitFrame()methodsthatshouldtypicallybecalledoneveryiterationofanapplication’smainloop.BetweenacalltoenterFrame()andexitFrame(),theapplicationmay(anynumberoftimes)setachannelpropertyandgetaframeproperty.Thechannelpropertyisinitially0andonlymultiheadcamerasuseothervalues.Theframepropertyisanimagecorrespondingtothecurrentchannel’sstatewhenenterFrame()wascalled.

ACaptureManagerclassalsohasthewriteImage(),startWritingVideo(),andstopWritingVideo()methodsthatmaybecalledatanytime.ActualfilewritingispostponeduntilexitFrame().Also,duringtheexitFrame()method,theframepropertymaybeshowninawindow,dependingonwhethertheapplicationcodeprovidesaWindowManagerclasseitherasanargumenttotheconstructorofCaptureManagerorbysettingapreviewWindowManagerproperty.

Iftheapplicationcodemanipulatesframe,themanipulationsarereflectedinrecordedfilesandinthewindow.ACaptureManagerclasshasaconstructorargumentandpropertycalledshouldMirrorPreview,whichshouldbeTrueifwewantframetobemirrored(horizontallyflipped)inthewindowbutnotinrecordedfiles.Typically,whenfacingacamera,userspreferlivecamerafeedtobemirrored.

RecallthataVideoWriterclassneedsaframerate,butOpenCVdoesnotprovideanywaytogetanaccurateframerateforacamera.TheCaptureManagerclassworksaroundthislimitationbyusingaframecounterandPython’sstandardtime.time()functiontoestimatetheframerateifnecessary.Thisapproachisnotfoolproof.Dependingonframeratefluctuationsandthesystem-dependentimplementationoftime.time(),theaccuracyoftheestimatemightstillbepoorinsomecases.However,ifwedeploytounknownhardware,itisbetterthanjustassumingthattheuser’scamerahasaparticularframerate.

Let’screateafilecalledmanagers.py,whichwillcontainourimplementationofCaptureManager.Theimplementationturnsouttobequitelong.So,wewilllookatitinseveralpieces.First,let’saddimports,aconstructor,andproperties,asfollows:

importcv2

importnumpy

importtime

classCaptureManager(object):

Page 89: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

def__init__(self,capture,previewWindowManager=None,

shouldMirrorPreview=False):

self.previewWindowManager=previewWindowManager

self.shouldMirrorPreview=shouldMirrorPreview

self._capture=capture

self._channel=0

self._enteredFrame=False

self._frame=None

self._imageFilename=None

self._videoFilename=None

self._videoEncoding=None

self._videoWriter=None

self._startTime=None

self._framesElapsed=long(0)

self._fpsEstimate=None

@property

defchannel(self):

returnself._channel

@channel.setter

defchannel(self,value):

ifself._channel!=value:

self._channel=value

self._frame=None

@property

defframe(self):

ifself._enteredFrameandself._frameisNone:

_,self._frame=self._capture.retrieve()

returnself._frame

@property

defisWritingImage(self):

returnself._imageFilenameisnotNone

@property

defisWritingVideo(self):

returnself._videoFilenameisnotNone

Notethatmostofthemembervariablesarenon-public,asdenotedbytheunderscoreprefixinvariablenames,suchasself._enteredFrame.Thesenonpublicvariablesrelatetothestateofthecurrentframeandanyfile-writingoperations.Asdiscussedpreviously,theapplicationcodeonlyneedstoconfigureafewthings,whichareimplementedasconstructorargumentsandsettablepublicproperties:thecamerachannel,windowmanager,andtheoptiontomirrorthecamerapreview.

ThisbookassumesacertainleveloffamiliaritywithPython;however,ifyouaregetting

Page 90: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

confusedbythose@annotations(forexample,@property),refertothePythondocumentationaboutdecorators,abuilt-infeatureofthelanguagethatallowsthewrappingofafunctionbyanotherfunction,normallyusedtoapplyauser-definedbehaviorinseveralplacesofanapplication(refertohttps://docs.python.org/2/reference/compound_stmts.html#grammar-token-decorator).

NotePythondoesnothavetheconceptofprivatemembervariablesandthesingle/doubleunderscoreprefix(_)isonlyaconvention.

Bythisconvention,inPython,variablesthatareprefixedwithasingleunderscoreshouldbetreatedasprotected(accessedonlywithintheclassanditssubclasses),whilevariablesthatareprefixedwithadoubleunderscoreshouldbetreatedasprivate(accessedonlywithintheclass).

Continuingwithourimplementation,let’saddtheenterFrame()andexitFrame()methodstomanagers.py:

defenterFrame(self):

"""Capturethenextframe,ifany."""

#Butfirst,checkthatanypreviousframewasexited.

assertnotself._enteredFrame,\

'previousenterFrame()hadnomatchingexitFrame()'

ifself._captureisnotNone:

self._enteredFrame=self._capture.grab()

defexitFrame(self):

"""Drawtothewindow.Writetofiles.Releasetheframe."""

#Checkwhetheranygrabbedframeisretrievable.

#Thegettermayretrieveandcachetheframe.

ifself.frameisNone:

self._enteredFrame=False

return

#UpdatetheFPSestimateandrelatedvariables.

ifself._framesElapsed==0:

self._startTime=time.time()

else:

timeElapsed=time.time()-self._startTime

self._fpsEstimate=self._framesElapsed/timeElapsed

self._framesElapsed+=1

#Drawtothewindow,ifany.

ifself.previewWindowManagerisnotNone:

ifself.shouldMirrorPreview:

mirroredFrame=numpy.fliplr(self._frame).copy()

self.previewWindowManager.show(mirroredFrame)

else:

self.previewWindowManager.show(self._frame)

Page 91: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

#Writetotheimagefile,ifany.

ifself.isWritingImage:

cv2.imwrite(self._imageFilename,self._frame)

self._imageFilename=None

#Writetothevideofile,ifany.

self._writeVideoFrame()

#Releasetheframe.

self._frame=None

self._enteredFrame=False

NotethattheimplementationofenterFrame()onlygrabs(synchronizes)aframe,whereasactualretrievalfromachannelispostponedtoasubsequentreadingoftheframevariable.TheimplementationofexitFrame()takestheimagefromthecurrentchannel,estimatesaframerate,showstheimageviathewindowmanager(ifany),andfulfillsanypendingrequeststowritetheimagetofiles.

Severalothermethodsalsopertaintofilewriting.Tofinishourclassimplementation,let’saddtheremainingfile-writingmethodstomanagers.py:

defwriteImage(self,filename):

"""Writethenextexitedframetoanimagefile."""

self._imageFilename=filename

defstartWritingVideo(

self,filename,

encoding=cv2.VideoWriter_fourcc('I','4','2','0')):

"""Startwritingexitedframestoavideofile."""

self._videoFilename=filename

self._videoEncoding=encoding

defstopWritingVideo(self):

"""Stopwritingexitedframestoavideofile."""

self._videoFilename=None

self._videoEncoding=None

self._videoWriter=None

def_writeVideoFrame(self):

ifnotself.isWritingVideo:

return

ifself._videoWriterisNone:

fps=self._capture.get(cv2.CAP_PROP_FPS)

iffps==0.0:

#Thecapture'sFPSisunknownsouseanestimate.

ifself._framesElapsed<20:

#Waituntilmoreframeselapsesothatthe

#estimateismorestable.

return

else:

fps=self._fpsEstimate

size=(int(self._capture.get(

Page 92: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

cv2.CAP_PROP_FRAME_WIDTH)),

int(self._capture.get(

cv2.CAP_PROP_FRAME_HEIGHT)))

self._videoWriter=cv2.VideoWriter(

self._videoFilename,self._videoEncoding,

fps,size)

self._videoWriter.write(self._frame)

ThewriteImage(),startWritingVideo(),andstopWritingVideo()publicmethodssimplyrecordtheparametersforfile-writingoperations,whereastheactualwritingoperationsarepostponedtothenextcallofexitFrame().The_writeVideoFrame()nonpublicmethodcreatesorappendsavideofileinamannerthatshouldbefamiliarfromourearlierscripts.(SeetheReading/writingavideofilesection.)However,insituationswheretheframerateisunknown,weskipsomeframesatthestartofthecapturesessionsothatwehavetimetobuildupanestimateoftheframerate.

AlthoughourcurrentimplementationofCaptureManagerreliesonVideoCapture,wecouldmakeotherimplementationsthatdonotuseOpenCVforinput.Forexample,wecouldmakeasubclassthatisinstantiatedwithasocketconnection,whosebytestreamcouldbeparsedasastreamofimages.Wecouldalsomakeasubclassthatusesathird-partycameralibrarywithdifferenthardwaresupportthanwhatOpenCVprovides.However,forCameo,ourcurrentimplementationissufficient.

Page 93: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Abstractingawindowandkeyboardwithmanagers.WindowManagerAswehaveseen,OpenCVprovidesfunctionsthatcauseawindowtobecreated,destroyed,showanimage,andprocessevents.Ratherthanbeingmethodsofawindowclass,thesefunctionsrequireawindow’snametopassasanargument.Sincethisinterfaceisnotobject-oriented,itisinconsistentwithOpenCV’sgeneralstyle.Also,itisunlikelytobecompatiblewithotherwindoworeventhandlinginterfacesthatwemighteventuallywanttouseinsteadofOpenCV’s.

Forthesakeofobjectorientationandadaptability,weabstractthisfunctionalityintoaWindowManagerclasswiththecreateWindow(),destroyWindow(),show(),andprocessEvents()methods.Asaproperty,aWindowManagerclasshasafunctionobjectcalledkeypressCallback,which(ifnotNone)iscalledfromprocessEvents()inresponsetoanykeypress.ThekeypressCallbackobjectmusttakeasingleargument,suchasanASCIIkeycode.

Let’saddthefollowingimplementationofWindowManagertomanagers.py:

classWindowManager(object):

def__init__(self,windowName,keypressCallback=None):

self.keypressCallback=keypressCallback

self._windowName=windowName

self._isWindowCreated=False

@property

defisWindowCreated(self):

returnself._isWindowCreated

defcreateWindow(self):

cv2.namedWindow(self._windowName)

self._isWindowCreated=True

defshow(self,frame):

cv2.imshow(self._windowName,frame)

defdestroyWindow(self):

cv2.destroyWindow(self._windowName)

self._isWindowCreated=False

defprocessEvents(self):

keycode=cv2.waitKey(1)

ifself.keypressCallbackisnotNoneandkeycode!=-1:

#Discardanynon-ASCIIinfoencodedbyGTK.

keycode&=0xFF

self.keypressCallback(keycode)

Ourcurrentimplementationonlysupportskeyboardevents,whichwillbesufficientfor

Page 94: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Cameo.However,wecouldmodifyWindowManagertosupportmouseeventstoo.Forexample,theclass’sinterfacecouldbeexpandedtoincludeamouseCallbackproperty(andoptionalconstructorargument),butcouldotherwiseremainthesame.WithsomeeventframeworkotherthanOpenCV’s,wecouldsupportadditionaleventtypesinthesamewaybyaddingcallbackproperties.

AppendixA,IntegratingwithPygame,OpenCVComputerVisionwithPython,showsaWindowManagersubclassthatisimplementedwithPygame’swindowhandlingandeventframeworkinsteadofOpenCV’s.ThisimplementationimprovesonthebaseWindowManagerclassbyproperlyhandlingquitevents—forexample,whenauserclicksonawindow’sclosebutton.Potentially,manyothereventtypescanbehandledviaPygametoo.

Page 95: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Applyingeverythingwithcameo.CameoOurapplicationisrepresentedbyaCameoclasswithtwomethods:run()andonKeypress().Oninitialization,aCameoclasscreatesaWindowManagerclasswithonKeypress()asacallback,aswellasaCaptureManagerclassusingacameraandtheWindowManagerclass.Whenrun()iscalled,theapplicationexecutesamainloopinwhichframesandeventsareprocessed.Asaresultofeventprocessing,onKeypress()maybecalled.Thespacebarcausesascreenshottobetaken,Tabcausesascreencast(avideorecording)tostart/stop,andEsccausestheapplicationtoquit.

Inthesamedirectoryasmanagers.py,let’screateafilecalledcameo.pycontainingthefollowingimplementationofCameo:

importcv2

frommanagersimportWindowManager,CaptureManager

classCameo(object):

def__init__(self):

self._windowManager=WindowManager('Cameo',

self.onKeypress)

self._captureManager=CaptureManager(

cv2.VideoCapture(0),self._windowManager,True)

defrun(self):

"""Runthemainloop."""

self._windowManager.createWindow()

whileself._windowManager.isWindowCreated:

self._captureManager.enterFrame()

frame=self._captureManager.frame

#TODO:Filtertheframe(Chapter3).

self._captureManager.exitFrame()

self._windowManager.processEvents()

defonKeypress(self,keycode):

"""Handleakeypress.

space->Takeascreenshot.

tab->Start/stoprecordingascreencast.

escape->Quit.

"""

ifkeycode==32:#space

self._captureManager.writeImage('screenshot.png')

elifkeycode==9:#tab

ifnotself._captureManager.isWritingVideo:

self._captureManager.startWritingVideo(

'screencast.avi')

else:

self._captureManager.stopWritingVideo()

elifkeycode==27:#escape

self._windowManager.destroyWindow()

Page 96: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

if__name__=="__main__":

Cameo().run()

Whenrunningtheapplication,notethatthelivecamerafeedismirrored,whilescreenshotsandscreencastsarenot.Thisistheintendedbehavior,aswepassTrueforshouldMirrorPreviewwheninitializingtheCaptureManagerclass.

Sofar,wedonotmanipulatetheframesinanywayexcepttomirrorthemforpreview.WewillstarttoaddmoreinterestingeffectsinChapter3,FilteringImages.

Page 97: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 98: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryBynow,weshouldhaveanapplicationthatdisplaysacamerafeed,listensforkeyboardinput,and(oncommand)recordsascreenshotorscreencast.Wearereadytoextendtheapplicationbyinsertingsomeimage-filteringcode(Chapter3,FilteringImages)betweenthestartandendofeachframe.Optionally,wearealsoreadytointegrateothercameradriversorapplicationframeworks(AppendixA,IntegratingwithPygame,OpenCVComputerVisionwithPython)besidestheonessupportedbyOpenCV.

WealsonowhavetheknowledgetoprocessimagesandunderstandtheprincipleofimagemanipulationthroughtheNumPyarrays.Thisformstheperfectfoundationtounderstandthenexttopic,filteringimages.

Page 99: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 100: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter3.ProcessingImageswithOpenCV3Soonerorlater,whenworkingwithimages,youwillfindyourselfinneedofalteringimages:beitapplyingartisticfilters,extrapolatingcertainsections,cutting,pasting,orwhateverelseyourmindcanconjure.Thischapterpresentssometechniquestoalterimages,andbytheendofit,youshouldbeabletoperformtasks,suchasdetectingskintoneinanimage,sharpeninganimage,markcontoursofsubjects,anddetectingcrosswalksusingalinesegmentdetector.

Page 101: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ConvertingbetweendifferentcolorspacesThereareliterallyhundredsofmethodsinOpenCVthatpertaintotheconversionofcolorspaces.Ingeneral,threecolorspacesareprevalentinmoderndaycomputervision:gray,BGR,andHue,Saturation,Value(HSV).

Grayisacolorspacethateffectivelyeliminatescolorinformationtranslatingtoshadesofgray:thiscolorspaceisextremelyusefulforintermediateprocessing,suchasfacedetection.BGRistheblue-green-redcolorspace,inwhicheachpixelisathree-elementarray,eachvaluerepresentingtheblue,green,andredcolors:webdeveloperswouldbefamiliarwithasimilardefinitionofcolors,excepttheorderofcolorsisRGB.InHSV,hueisacolortone,saturationistheintensityofacolor,andvaluerepresentsitsdarkness(orbrightnessattheoppositeendofthespectrum).

Page 102: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

AquicknoteonBGRWhenIfirststarteddealingwiththeBGRcolorspace,somethingwasn’taddingup:the[0255255]value(noblue,fullgreen,andfullred)producestheyellowcolor.Ifyouhaveanartisticbackground,youwon’tevenneedtopickuppaintsandbrushestowitnessgreenandredmixintoamuddyshadeofbrown.Thatisbecausethecolormodelusedincomputingiscalledanadditiveanddealswithlights.Lightsbehavedifferentlyfrompaints(whichfollowthesubtractivecolormodel),and—assoftwarerunsoncomputerswhosemediumisamonitorthatemitslight—thecolormodelofreferenceistheadditiveone.

Page 103: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 104: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

TheFourierTransformMuchoftheprocessingyouapplytoimagesandvideosinOpenCVinvolvestheconceptofFourierTransforminsomecapacity.JosephFourierwasan18thcenturyFrenchmathematicianwhodiscoveredandpopularizedmanymathematicalconcepts,andconcentratedhisworkonstudyingthelawsgoverningheat,andinmathematics,allthingswaveform.Inparticular,heobservedthatallwaveformsarejustthesumofsimplesinusoidsofdifferentfrequencies.

Inotherwords,thewaveformsyouobserveallaroundyouarethesumofotherwaveforms.Thisconceptisincrediblyusefulwhenmanipulatingimages,becauseitallowsustoidentifyregionsinimageswhereasignal(suchasimagepixels)changesalot,andregionswherethechangeislessdramatic.Wecanthenarbitrarilymarktheseregionsasnoiseorregionsofinterests,backgroundorforeground,andsoon.Thesearethefrequenciesthatmakeuptheoriginalimage,andwehavethepowertoseparatethemtomakesenseoftheimageandextrapolateinterestingdata.

NoteInanOpenCVcontext,thereareanumberofalgorithmsimplementedthatenableustoprocessimagesandmakesenseofthedatacontainedinthem,andthesearealsoreimplementedinNumPytomakeourlifeeveneasier.NumPyhasaFastFourierTransform(FFT)package,whichcontainsthefft2()method.ThismethodallowsustocomputeDiscreteFourierTransform(DFT)oftheimage.

Let’sexaminethemagnitudespectrumconceptofanimageusingFourierTransform.Themagnitudespectrumofanimageisanotherimage,whichgivesarepresentationoftheoriginalimageintermsofitschanges:thinkofitastakinganimageanddraggingallthebrightestpixelstothecenter.Then,yougraduallyworkyourwayouttotheborderwhereallthedarkestpixelshavebeenpushed.Immediately,youwillbeabletoseehowmanylightanddarkpixelsarecontainedinyourimageandthepercentageoftheirdistribution.

TheconceptofFourierTransformisthebasisofmanyalgorithmsusedforcommonimageprocessingoperations,suchasedgedetectionorlineandshapedetection.

Beforeexaminingtheseindetail,let’stakealookattwoconceptsthat—inconjunctionwiththeFourierTransform—formthefoundationoftheaforementionedprocessingoperations:highpassfiltersandlowpassfilters.

Page 105: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

HighpassfilterAhighpassfilter(HPF)isafilterthatexaminesaregionofanimageandbooststheintensityofcertainpixelsbasedonthedifferenceintheintensitywiththesurroundingpixels.

Take,forexample,thefollowingkernel:

[[0,-0.25,0],

[-0.25,1,-0.25],

[0,-0.25,0]]

NoteAkernelisasetofweightsthatareappliedtoaregioninasourceimagetogenerateasinglepixelinthedestinationimage.Forexample,aksizeof7impliesthat49(7x7)sourcepixelsareconsideredingeneratingeachdestinationpixel.Wecanthinkofakernelasapieceoffrostedglassmovingoverthesourceimageandlettingthroughadiffusedblendofthesource’slight.

Aftercalculatingthesumofdifferencesoftheintensitiesofthecentralpixelcomparedtoalltheimmediateneighbors,theintensityofthecentralpixelwillbeboosted(ornot)ifahighlevelofchangesarefound.Inotherwords,ifapixelstandsoutfromthesurroundingpixels,itwillgetboosted.

Thisisparticularlyeffectiveinedgedetection,whereacommonformofHPFcalledhighboostfilterisused.

Bothhighpassandlowpassfiltersuseapropertycalledradius,whichextendstheareaoftheneighborsinvolvedinthefiltercalculation.

Let’sgothroughanexampleofanHPF:

importcv2

importnumpyasnp

fromscipyimportndimage

kernel_3x3=np.array([[-1,-1,-1],

[-1,8,-1],

[-1,-1,-1]])

kernel_5x5=np.array([[-1,-1,-1,-1,-1],

[-1,1,2,1,-1],

[-1,2,4,2,-1],

[-1,1,2,1,-1],

[-1,-1,-1,-1,-1]])

NoteNotethatbothfilterssumupto0,thereasonforthisisexplainedindetailintheEdgedetectionsection.

img=cv2.imread("../images/color1_small.jpg",0)

Page 106: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

k3=ndimage.convolve(img,kernel_3x3)

k5=ndimage.convolve(img,kernel_5x5)

blurred=cv2.GaussianBlur(img,(11,11),0)

g_hpf=img-blurred

cv2.imshow("3x3",k3)

cv2.imshow("5x5",k5)

cv2.imshow("g_hpf",g_hpf)

cv2.waitKey()

cv2.destroyAllWindows()

Aftertheinitialimports,wedefinea3x3kernelanda5x5kernel,andthenweloadtheimageingrayscale.Normally,themajorityofimageprocessingisdonewithNumPy;however,inthisparticularcase,wewantto“convolve”animagewithagivenkernelandNumPyhappenstoonlyacceptone-dimensionalarrays.

Thisdoesnotmeanthattheconvolutionofdeeparrayscan’tbeachievedwithNumPy,justthatitwouldbeabitcomplex.Instead,ndimage(whichisapartofSciPy,soyoushouldhaveitinstalledaspertheinstructionsinChapter1,SettingUpOpenCV),makesthistrivial,throughitsconvolve()function,whichsupportstheclassicNumPyarraysthatthecv2modulesusetostoreimages.

WeapplytwoHPFswiththetwoconvolutionkernelswedefined.Lastly,wealsoimplementadifferentialmethodofobtainingaHPFbyapplyingalowpassfilterandcalculatingthedifferencewiththeoriginalimage.Youwillnoticethatthethirdmethodactuallyyieldsthebestresult,solet’salsoelaborateonlowpassfilters.

Page 107: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

LowpassfilterIfanHPFbooststheintensityofapixel,givenitsdifferencewithitsneighbors,alowpassfilter(LPF)willsmoothenthepixelifthedifferencewiththesurroundingpixelsislowerthanacertainthreshold.Thisisusedindenoisingandblurring.Forexample,oneofthemostpopularblurring/smootheningfilters,theGaussianblur,isalowpassfilterthatattenuatestheintensityofhighfrequencysignals.

Page 108: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 109: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CreatingmodulesAsinthecaseofourCaptureManagerandWindowManagerclasses,ourfiltersshouldbereusableoutsideCameo.Thus,weshouldseparatethefiltersintotheirownPythonmoduleorfile.

Let’screateafilecalledfilters.pyinthesamedirectoryascameo.py.Weneedthefollowingimportstatementsinfilters.py:

importcv2

importnumpy

importutils

Let’salsocreateafilecalledutils.pyinthesamedirectory.Itshouldcontainthefollowingimportstatements:

importcv2

importnumpy

importscipy.interpolate

Wewillbeaddingfilterfunctionsandclassestofilters.py,whilemoregeneral-purposemathfunctionswillgoinutils.py.

Page 110: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 111: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

EdgedetectionEdgesplayamajorroleinbothhumanandcomputervision.We,ashumans,caneasilyrecognizemanyobjecttypesandtheirposejustbyseeingabacklitsilhouetteoraroughsketch.Indeed,whenartemphasizesedgesandposes,itoftenseemstoconveytheideaofanarchetype,suchasRodin’sTheThinkerorJoeShuster’sSuperman.Software,too,canreasonaboutedges,poses,andarchetypes.Wewilldiscussthesekindsofreasoningsinlaterchapters.

OpenCVprovidesmanyedge-findingfilters,includingLaplacian(),Sobel(),andScharr().Thesefiltersaresupposedtoturnnon-edgeregionstoblackwhileturningedgeregionstowhiteorsaturatedcolors.However,theyarepronetomisidentifyingnoiseasedges.Thisflawcanbemitigatedbyblurringanimagebeforetryingtofinditsedges.OpenCValsoprovidesmanyblurringfilters,includingblur()(simpleaverage),medianBlur(),andGaussianBlur().Theargumentsfortheedge-findingandblurringfiltersvarybutalwaysincludeksize,anoddwholenumberthatrepresentsthewidthandheight(inpixels)ofafilter’skernel.

Forblurring,let’susemedianBlur(),whichiseffectiveinremovingdigitalvideonoise,especiallyincolorimages.Foredge-finding,let’suseLaplacian(),whichproducesboldedgelines,especiallyingrayscaleimages.AfterapplyingmedianBlur(),butbeforeapplyingLaplacian(),weshouldconverttheimagefromBGRtograyscale.

OncewehavetheresultofLaplacian(),wecaninvertittogetblackedgesonawhitebackground.Then,wecannormalizeit(sothatitsvaluesrangefrom0to1)andmultiplyitwiththesourceimagetodarkentheedges.Let’simplementthisapproachinfilters.py:

defstrokeEdges(src,dst,blurKsize=7,edgeKsize=5):

ifblurKsize>=3:

blurredSrc=cv2.medianBlur(src,blurKsize)

graySrc=cv2.cvtColor(blurredSrc,cv2.COLOR_BGR2GRAY)

else:

graySrc=cv2.cvtColor(src,cv2.COLOR_BGR2GRAY)

cv2.Laplacian(graySrc,cv2.CV_8U,graySrc,ksize=edgeKsize)

normalizedInverseAlpha=(1.0/255)*(255-graySrc)

channels=cv2.split(src)

forchannelinchannels:

channel[:]=channel*normalizedInverseAlpha

cv2.merge(channels,dst)

NotethatweallowkernelsizestobespecifiedasargumentsforstrokeEdges().TheblurKsizeargumentisusedasksizeformedianBlur(),whileedgeKsizeisusedasksizeforLaplacian().Withmywebcams,IfindthatablurKsizevalueof7andanedgeKsizevalueof5looksbest.Unfortunately,medianBlur()isexpensivewithalargeksize,suchas7.

TipIfyouencounterperformanceproblemswhenrunningstrokeEdges(),trydecreasingthe

Page 112: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

blurKsizevalue.Toturnoffblur,setittoavaluelessthan3.

Page 113: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 114: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Customkernels–gettingconvolutedAswehavejustseen,manyofOpenCV’spredefinedfiltersuseakernel.Rememberthatakernelisasetofweights,whichdeterminehoweachoutputpixeliscalculatedfromaneighborhoodofinputpixels.Anothertermforakernelisaconvolutionmatrix.Itmixesuporconvolvesthepixelsinaregion.Similarly,akernel-basedfiltermaybecalledaconvolutionfilter.

OpenCVprovidesaveryversatilefilter2D()function,whichappliesanykernelorconvolutionmatrixthatwespecify.Tounderstandhowtousethisfunction,let’sfirstlearntheformatofaconvolutionmatrix.Itisa2Darraywithanoddnumberofrowsandcolumns.Thecentralelementcorrespondstoapixelofinterestandtheotherelementscorrespondtotheneighborsofthispixel.Eachelementcontainsanintegerorfloatingpointvalue,whichisaweightthatgetsappliedtoaninputpixel’svalue.Considerthisexample:

kernel=numpy.array([[-1,-1,-1],

[-1,9,-1],

[-1,-1,-1]])

Here,thepixelofinteresthasaweightof9anditsimmediateneighborseachhaveaweightof-1.Forthepixelofinterest,theoutputcolorwillbeninetimesitsinputcolorminustheinputcolorsofalleightadjacentpixels.Ifthepixelofinterestisalreadyabitdifferentfromitsneighbors,thisdifferencebecomesintensified.Theeffectisthattheimagelookssharperasthecontrastbetweentheneighborsisincreased.

Continuingourexample,wecanapplythisconvolutionmatrixtoasourceanddestinationimage,respectively,asfollows:

cv2.filter2D(src,-1,kernel,dst)

Thesecondargumentspecifiestheper-channeldepthofthedestinationimage(suchascv2.CV_8Ufor8bitsperchannel).Anegativevalue(asusedhere)meansthatthedestinationimagehasthesamedepthasthesourceimage.

NoteForcolorimages,notethatfilter2D()appliesthekernelequallytoeachchannel.Tousedifferentkernelsondifferentchannels,wewouldalsohavetousethesplit()andmerge()functions.

Basedonthissimpleexample,let’saddtwoclassestofilters.py.Oneclass,VConvolutionFilter,willrepresentaconvolutionfilteringeneral.Asubclass,SharpenFilter,willrepresentoursharpeningfilterspecifically.Let’seditfilters.pytoimplementthesetwonewclassesasfollows:

classVConvolutionFilter(object):

"""AfilterthatappliesaconvolutiontoV(orallofBGR)."""

def__init__(self,kernel):

self._kernel=kernel

Page 115: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

defapply(self,src,dst):

"""ApplythefilterwithaBGRorgraysource/destination."""

cv2.filter2D(src,-1,self._kernel,dst)

classSharpenFilter(VConvolutionFilter):

"""Asharpenfilterwitha1-pixelradius."""

def__init__(self):

kernel=numpy.array([[-1,-1,-1],

[-1,9,-1],

[-1,-1,-1]])

VConvolutionFilter.__init__(self,kernel)

Notethattheweightssumupto1.Thisshouldbethecasewheneverwewanttoleavetheimage’soverallbrightnessunchanged.Ifwemodifyasharpeningkernelslightlysothatitsweightssumupto0instead,wehaveanedgedetectionkernelthatturnsedgeswhiteandnon-edgesblack.Forexample,let’saddthefollowingedgedetectionfiltertofilters.py:

classFindEdgesFilter(VConvolutionFilter):

"""Anedge-findingfilterwitha1-pixelradius."""

def__init__(self):

kernel=numpy.array([[-1,-1,-1],

[-1,8,-1],

[-1,-1,-1]])

VConvolutionFilter.__init__(self,kernel)

Next,let’smakeablurfilter.Generally,forablureffect,theweightsshouldsumupto1andshouldbepositivethroughouttheneighborhood.Forexample,wecantakeasimpleaverageoftheneighborhoodasfollows:

classBlurFilter(VConvolutionFilter):

"""Ablurfilterwitha2-pixelradius."""

def__init__(self):

kernel=numpy.array([[0.04,0.04,0.04,0.04,0.04],

[0.04,0.04,0.04,0.04,0.04],

[0.04,0.04,0.04,0.04,0.04],

[0.04,0.04,0.04,0.04,0.04],

[0.04,0.04,0.04,0.04,0.04]])

VConvolutionFilter.__init__(self,kernel)

Oursharpening,edgedetection,andblurfiltersusekernelsthatarehighlysymmetric.Sometimes,though,kernelswithlesssymmetryproduceaninterestingeffect.Let’sconsiderakernelthatblursononeside(withpositiveweights)andsharpensontheother(withnegativeweights).Itwillproducearidgedorembossedeffect.Hereisanimplementationthatwecanaddtofilters.py:

classEmbossFilter(VConvolutionFilter):

"""Anembossfilterwitha1-pixelradius."""

def__init__(self):

kernel=numpy.array([[-2,-1,0],

[-1,1,1],

Page 116: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

[0,1,2]])

VConvolutionFilter.__init__(self,kernel)

Thissetofcustomconvolutionfiltersisverybasic.Indeed,itismorebasicthanOpenCV’sready-madesetoffilters.However,withabitofexperimentation,youshouldbeabletowriteyourownkernelsthatproduceauniquelook.

Page 117: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 118: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ModifyingtheapplicationNowthatwehavehigh-levelfunctionsandclassesforseveralfilters,itistrivialtoapplyanyofthemtothecapturedframesinCameo.Let’seditcameo.pyandaddthelinesthatappearinboldfaceinthefollowingexcerpt:

importcv2

importfilters

frommanagersimportWindowManager,CaptureManager

classCameo(object):

def__init__(self):

self._windowManager=WindowManager('Cameo',

self.onKeypress)

self._captureManager=CaptureManager(

cv2.VideoCapture(0),self._windowManager,True)

self._curveFilter=filters.BGRPortraCurveFilter()

defrun(self):

"""Runthemainloop."""

self._windowManager.createWindow()

whileself._windowManager.isWindowCreated:

self._captureManager.enterFrame()

frame=self._captureManager.frame

filters.strokeEdges(frame,frame)

self._curveFilter.apply(frame,frame)

self._captureManager.exitFrame()

self._windowManager.processEvents()

#...TherestisthesameasinChapter2.

Here,Ihavechosentoapplytwoeffects:strokingtheedgesandemulatingPortrafilmcolors.Feelfreetomodifythecodetoapplyanyfiltersyoulike.

HereisascreenshotfromCameowithstrokededgesandPortra-likecolors:

Page 119: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 120: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 121: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

EdgedetectionwithCannyOpenCValsooffersaveryhandyfunctioncalledCanny(afterthealgorithm’sinventor,JohnF.Canny),whichisverypopularnotonlybecauseofitseffectiveness,butalsothesimplicityofitsimplementationinanOpenCVprogram,asitisaone-liner:

importcv2

importnumpyasnp

img=cv2.imread("../images/statue_small.jpg",0)

cv2.imwrite("canny.jpg",cv2.Canny(img,200,300))

cv2.imshow("canny",cv2.imread("canny.jpg"))

cv2.waitKey()

cv2.destroyAllWindows()

Theresultisaveryclearidentificationoftheedges:

TheCannyedgedetectionalgorithmisquitecomplexbutalsointeresting:it’safive-stepprocessthatdenoisestheimagewithaGaussianfilter,calculatesgradients,appliesnonmaximumsuppression(NMS)onedges,adoublethresholdonallthedetectededgestoeliminatefalsepositives,and,lastly,analyzesalltheedgesandtheirconnectiontoeachothertokeeptherealedgesanddiscardtheweakones.

Page 122: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 123: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ContourdetectionAnothervitaltaskincomputervisioniscontourdetection,notonlybecauseoftheobviousaspectofdetectingcontoursofsubjectscontainedinanimageorvideoframe,butbecauseofthederivativeoperationsconnectedwithidentifyingcontours.

Theseoperationsare,namely,computingboundingpolygons,approximatingshapes,andgenerallycalculatingregionsofinterest,whichconsiderablysimplifyinteractionwithimagedatabecausearectangularregionwithNumPyiseasilydefinedwithanarrayslice.Wewillbeusingthistechniquealotwhenexploringtheconceptofobjectdetection(includingfaces)andobjecttracking.

Let’sgoinorderandfamiliarizeourselveswiththeAPIfirstwithanexample:

importcv2

importnumpyasnp

img=np.zeros((200,200),dtype=np.uint8)

img[50:150,50:150]=255

ret,thresh=cv2.threshold(img,127,255,0)

image,contours,hierarchy=cv2.findContours(thresh,cv2.RETR_TREE,

cv2.CHAIN_APPROX_SIMPLE)

color=cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)

img=cv2.drawContours(color,contours,-1,(0,255,0),2)

cv2.imshow("contours",color)

cv2.waitKey()

cv2.destroyAllWindows()

Firstly,wecreateanemptyblackimagethatis200x200pixelsinsize.Then,weplaceawhitesquareinthecenterofitutilizingndarray’sabilitytoassignvaluesonaslice.

Wethenthresholdtheimage,andcallthefindContours()function.Thisfunctionhasthreeparameters:theinputimage,hierarchytype,andthecontourapproximationmethod.Thereareanumberofaspectsthatareofparticularinterestinthisfunction:

Thefunctionmodifiestheinputimage,soitwouldbeadvisabletouseacopyoftheoriginalimage(forexample,bypassingimg.copy()).Secondly,thehierarchytreereturnedbythefunctionisquiteimportant:cv2.RETR_TREEwillretrievetheentirehierarchyofcontoursintheimage,enablingyoutoestablish“relationships”betweencontours.Ifyouonlywanttoretrievethemostexternalcontours,usecv2.RETR_EXTERNAL.Thisisparticularlyusefulwhenyouwanttoeliminatecontoursthatareentirelycontainedinothercontours(forexample,inavastmajorityofcases,youwon’tneedtodetectanobjectwithinanotherobjectofthesametype).

ThefindContoursfunctionreturnsthreeelements:themodifiedimage,contours,andtheirhierarchy.Weusethecontourstodrawonthecolorversionoftheimage(sothatwecandrawcontoursingreen)andeventuallydisplayit.

Theresultisawhitesquarewithitscontourdrawningreen.Spartan,buteffectivein

Page 124: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

demonstratingtheconcept!Let’smoveontomoremeaningfulexamples.

Page 125: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 126: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Contours–boundingbox,minimumarearectangle,andminimumenclosingcircleFindingthecontoursofasquareisasimpletask;irregular,skewed,androtatedshapesbringthebestoutofthecv2.findContoursutilityfunctionofOpenCV.Let’stakealookatthefollowingimage:

Inareal-lifeapplication,wewouldbemostinterestedindeterminingtheboundingboxofthesubject,itsminimumenclosingrectangle,anditscircle.Thecv2.findContoursfunctioninconjunctionwithafewotherOpenCVutilitiesmakesthisveryeasytoaccomplish:

importcv2

importnumpyasnp

img=cv2.pyrDown(cv2.imread("hammer.jpg",cv2.IMREAD_UNCHANGED))

ret,thresh=cv2.threshold(cv2.cvtColor(img.copy(),cv2.COLOR_BGR2GRAY),

127,255,cv2.THRESH_BINARY)

image,contours,hier=cv2.findContours(thresh,cv2.RETR_EXTERNAL,

cv2.CHAIN_APPROX_SIMPLE)

forcincontours:

#findboundingboxcoordinates

x,y,w,h=cv2.boundingRect(c)

cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)

Page 127: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

#findminimumarea

rect=cv2.minAreaRect(c)

#calculatecoordinatesoftheminimumarearectangle

box=cv2.boxPoints(rect)

#normalizecoordinatestointegers

box=np.int0(box)

#drawcontours

cv2.drawContours(img,[box],0,(0,0,255),3)

#calculatecenterandradiusofminimumenclosingcircle

(x,y),radius=cv2.minEnclosingCircle(c)

#casttointegers

center=(int(x),int(y))

radius=int(radius)

#drawthecircle

img=cv2.circle(img,center,radius,(0,255,0),2)

cv2.drawContours(img,contours,-1,(255,0,0),1)

cv2.imshow("contours",img)

Aftertheinitialimports,weloadtheimage,andthenapplyabinarythresholdonagrayscaleversionoftheoriginalimage.Bydoingthis,weoperateallfind-contourcalculationsonagrayscalecopy,butwedrawontheoriginalsothatwecanutilizecolorinformation.

Firstly,let’scalculateasimpleboundingbox:

x,y,w,h=cv2.boundingRect(c)

Thisisaprettystraightforwardconversionofcontourinformationtothe(x,y)coordinates,plustheheightandwidthoftherectangle.Drawingthisrectangleisaneasytaskandcanbedoneusingthiscode:

cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)

Secondly,let’scalculatetheminimumareaenclosingthesubject:

rect=cv2.minAreaRect(c)

box=cv2.boxPoints(rect)

box=np.int0(box)

Themechanismusedhereisparticularlyinteresting:OpenCVdoesnothaveafunctiontocalculatethecoordinatesoftheminimumrectanglevertexesdirectlyfromthecontourinformation.Instead,wecalculatetheminimumrectanglearea,andthencalculatethevertexesofthisrectangle.Notethatthecalculatedvertexesarefloats,butpixelsareaccessedwithintegers(youcan’taccessa“portion”ofapixel),soweneedtooperatethisconversion.Next,wedrawthebox,whichgivesustheperfectopportunitytointroducethecv2.drawContoursfunction:

cv2.drawContours(img,[box],0,(0,0,255),3)

Firstly,thisfunction—likealldrawingfunctions—modifiestheoriginalimage.Secondly,ittakesanarrayofcontoursinitssecondparameter,soyoucandrawanumberofcontoursinasingleoperation.Therefore,ifyouhaveasinglesetofpointsrepresentinga

Page 128: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

contourpolygon,youneedtowrapthesepointsintoanarray,exactlylikewedidwithourboxintheprecedingexample.Thethirdparameterofthisfunctionspecifiestheindexofthecontoursarraythatwewanttodraw:avalueof-1willdrawallcontours;otherwise,acontouratthespecifiedindexinthecontoursarray(thesecondparameter)willbedrawn.

Mostdrawingfunctionstakethecolorofthedrawinganditsthicknessasthelasttwoparameters.

Thelastboundingcontourwe’regoingtoexamineistheminimumenclosingcircle:

(x,y),radius=cv2.minEnclosingCircle(c)

center=(int(x),int(y))

radius=int(radius)

img=cv2.circle(img,center,radius,(0,255,0),2)

Theonlypeculiarityofthecv2.minEnclosingCirclefunctionisthatitreturnsatwo-elementtuple,ofwhichthefirstelementisatupleitself,representingthecoordinatesofthecircle’scenter,andthesecondelementistheradiusofthiscircle.Afterconvertingallthesevaluestointegers,drawingthecircleisquiteatrivialoperation.

Thefinalresultontheoriginalimagelookslikethis:

Page 129: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 130: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Contours–convexcontoursandtheDouglas-PeuckeralgorithmMostofthetime,whenworkingwithcontours,subjectswillhavethemostdiverseshapes,includingconvexones.Aconvexshapeisonewheretherearetwopointswithinthisshapewhoseconnectinglinegoesoutsidetheperimeteroftheshapeitself.

ThefirstfacilitythatOpenCVofferstocalculatetheapproximateboundingpolygonofashapeiscv2.approxPolyDP.Thisfunctiontakesthreeparameters:

AcontourAnepsilonvaluerepresentingthemaximumdiscrepancybetweentheoriginalcontourandtheapproximatedpolygon(thelowerthevalue,theclosertheapproximatedvaluewillbetotheoriginalcontour)ABooleanflagsignifyingthatthepolygonisclosed

Theepsilonvalueisofvitalimportancetoobtainausefulcontour,solet’sunderstandwhatitrepresents.Anepsilonisthemaximumdifferencebetweentheapproximatedpolygon’sperimeterandtheoriginalcontour’sperimeter.Thelowerthisdifferenceis,themoretheapproximatedpolygonwillbesimilartotheoriginalcontour.

Youmayaskyourselfwhyweneedanapproximatepolygonwhenwehaveacontourthatisalreadyapreciserepresentation.Theanswertothisisthatapolygonisasetofstraightlines,andtheimportanceofbeingabletodefinepolygonsinaregionforfurthermanipulationandprocessingisparamountinmanycomputervisiontasks.

Nowthatweknowwhatanepsilonis,weneedtoobtaincontourperimeterinformationasareferencevalue.Thisisobtainedwiththecv2.arcLengthfunctionofOpenCV:

epsilon=0.01*cv2.arcLength(cnt,True)

approx=cv2.approxPolyDP(cnt,epsilon,True)

Effectively,we’reinstructingOpenCVtocalculateanapproximatedpolygonwhoseperimetercanonlydifferfromtheoriginalcontourinanepsilonratio.

OpenCValsooffersacv2.convexHullfunctiontoobtainprocessedcontourinformationforconvexshapesandthisisastraightforwardone-lineexpression:

hull=cv2.convexHull(cnt)

Let’scombinetheoriginalcontour,approximatedpolygoncontour,andtheconvexhullinoneimagetoobservethedifferencebetweenthem.Tosimplifythings,I’veappliedthecontourstoablackimagesothattheoriginalsubjectisnotvisiblebutitscontoursare:

Page 131: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Asyoucansee,theconvexhullsurroundstheentiresubject,theapproximatedpolygonistheinnermostpolygonshape,andinbetweenthetwoistheoriginalcontour,mainlycomposedofarcs.

Page 132: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 133: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

LineandcircledetectionDetectingedgesandcontoursarenotonlycommonandimportanttasks,theyalsoconstitutethebasisforothercomplexoperations.Linesandshapedetectiongohandinhandwithedgeandcontourdetection,solet’sexaminehowOpenCVimplementsthese.

ThetheorybehindlinesandshapedetectionhasitsfoundationinatechniquecalledtheHoughtransform,inventedbyRichardDudaandPeterHart,whoextended(generalized)theworkdonebyPaulHoughintheearly1960s.

Let’stakealookatOpenCV’sAPIfortheHoughtransforms.

Page 134: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

LinedetectionFirstofall,let’sdetectsomelines,whichisdonewiththeHoughLinesandHoughLinesPfunctions.TheonlydifferencebetweenthetwofunctionsisthatoneusesthestandardHoughtransform,andthesecondusestheprobabilisticHoughtransform(hencePinthename).

Theprobabilisticversionisso-calledbecauseitonlyanalyzesasubsetofpointsandestimatestheprobabilityofthesepointsallbelongingtothesameline.ThisimplementationisanoptimizedversionofthestandardHoughtransform,andinthiscase,it’slesscomputationallyintensiveandexecutesfaster.

Let’stakealookataverysimpleexample:

importcv2

importnumpyasnp

img=cv2.imread('lines.jpg')

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

edges=cv2.Canny(gray,50,120)

minLineLength=20

maxLineGap=5

lines=cv2.HoughLinesP(edges,1,np.pi/180,100,minLineLength,maxLineGap)

forx1,y1,x2,y2inlines[0]:

cv2.line(img,(x1,y1),(x2,y2),(0,255,0),2)

cv2.imshow("edges",edges)

cv2.imshow("lines",img)

cv2.waitKey()

cv2.destroyAllWindows()

Thecrucialpointofthissimplescript—asidefromtheHoughLinesfunctioncall—isthesettingofminimumlinelength(shorterlineswillbediscarded)andthemaximumlinegap,whichisthemaximumsizeofagapinalinebeforethetwosegmentsstartbeingconsideredasseparatelines.

AlsonotethattheHoughLinesfunctiontakesasinglechannelbinaryimage,processedthroughtheCannyedgedetectionfilter.Cannyisnotastrictrequirement,however;animagethat’sbeendenoisedandonlyrepresentsedges,istheidealsourceforaHoughtransform,soyouwillfindthistobeacommonpractice.

TheparametersofHoughLinesPareasfollows:

Theimagewewanttoprocess.Thegeometricalrepresentationsofthelines,rhoandtheta,whichareusually1andnp.pi/180.Thethreshold,whichrepresentsthethresholdbelowwhichalineisdiscarded.TheHoughtransformworkswithasystemofbinsandvotes,witheachbinrepresentingaline,soanylinewithaminimumofthe<threshold>votesisretained,therestdiscarded.MinLineLengthandMaxLineGap,whichwementionedpreviously.

Page 135: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CircledetectionOpenCValsohasafunctionfordetectingcircles,calledHoughCircles.ItworksinaverysimilarfashiontoHoughLines,butwhereminLineLengthandmaxLineGapweretheparameterstodiscardorretainlines,HoughCircleshasaminimumdistancebetweencircles’centers,minimum,andmaximumradiusofthecircles.Here’stheobligatoryexample:

importcv2

importnumpyasnp

planets=cv2.imread('planet_glow.jpg')

gray_img=cv2.cvtColor(planets,cv2.COLOR_BGR2GRAY)

img=cv2.medianBlur(gray_img,5)

cimg=cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)

circles=cv2.HoughCircles(img,cv2.HOUGH_GRADIENT,1,120,

param1=100,param2=30,minRadius=0,maxRadius=0)

circles=np.uint16(np.around(circles))

foriincircles[0,:]:

#drawtheoutercircle

cv2.circle(planets,(i[0],i[1]),i[2],(0,255,0),2)

#drawthecenterofthecircle

cv2.circle(planets,(i[0],i[1]),2,(0,0,255),3)

cv2.imwrite("planets_circles.jpg",planets)

cv2.imshow("HoughCirlces",planets)

cv2.waitKey()

cv2.destroyAllWindows()

Here’savisualrepresentationoftheresult:

Page 136: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 137: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DetectingshapesThedetectionofshapeswiththeHoughtransformislimitedtocircles;however,wealreadyimplicitlyexploreddetectingshapesofanykind,specificallywhenwetalkedaboutapproxPolyDP.Thisfunctionallowstheapproximationofpolygons,soifyourimagecontainspolygons,theywillbequiteaccuratelydetected,combiningtheusageofcv2.findContoursandcv2.approxPolyDP.

Page 138: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 139: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryAtthispoint,youshouldhavegainedagoodunderstandingofcolorspaces,FourierTransform,andtheseveralkindsoffiltersmadeavailablebyOpenCVtoprocessimages.

Youshouldalsobeproficientindetectingedges,lines,circles,andshapesingeneral.Additionally,youshouldbeabletofindcontoursandexploittheinformationtheyprovideaboutthesubjectscontainedinanimage.Theseconceptswillserveastheidealbackgroundtoexplorethetopicsinthenextchapter.

Page 140: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 141: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter4.DepthEstimationandSegmentationThischaptershowsyouhowtousedatafromadepthcameratoidentifyforegroundandbackgroundregions,sothatwecanlimitaneffecttoonlytheforegroundoronlythebackground.Asprerequisites,weneedadepthcamera,suchasMicrosoftKinect,andweneedtobuildOpenCVwithsupportforourdepthcamera.Forbuildinstructions,seeChapter1,SettingUpOpenCV.

We’lldealwithtwomaintopicsinthischapter:depthestimationandsegmentation.Wewillexploredepthestimationwithtwodistinctapproaches:firstly,byusingadepthcamera(aprerequisiteofthefirstpartofthechapter),suchasMicrosoftKinect,andthen,byusingstereoimages,forwhichanormalcamerawillsuffice.ForinstructionsonhowtobuildOpenCVwithsupportfordepthcameras,seeChapter1,SettingUpOpenCV.Thesecondpartofthechapterisaboutsegmentation,thetechniquethatallowsustoextractforegroundobjectsfromanimage.

Page 142: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CreatingmodulesThecodetocaptureandmanipulatedepth-cameradatawillbereusableoutsideCameo.py.So,weshouldseparateitintoanewmodule.Let’screateafilecalleddepth.pyinthesamedirectoryasCameo.py.Weneedthefollowingimportstatementindepth.py:

importnumpy

Wewillalsoneedtomodifyourpreexistingrects.pyfilesothatourcopyoperationscanbelimitedtoanonrectangularsubregionofarectangle.Tosupportthechangeswearegoingtomake,let’saddthefollowingimportstatementstorects.py:

importnumpy

importutils

Finally,thenewversionofourapplicationwillusedepth-relatedfunctionalities.So,let’saddthefollowingimportstatementtoCameo.py:

importdepth

Now,let’sgodeeperintothesubjectofdepth.

Page 143: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 144: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CapturingframesfromadepthcameraBackinChapter2,HandlingFiles,Cameras,andGUIs,wediscussedtheconceptthatacomputercanhavemultiplevideocapturedevicesandeachdevicecanhavemultiplechannels.Supposeagivendeviceisastereocamera.Eachchannelmightcorrespondtoadifferentlensandsensor.Also,eachchannelmightcorrespondtodifferentkindsofdata,suchasanormalcolorimageversusadepthmap.TheC++versionofOpenCVdefinessomeconstantsfortheidentifiersofcertaindevicesandchannels.However,theseconstantsarenotdefinedinthePythonversion.

Toremedythissituation,let’saddthefollowingdefinitionsindepth.py:

#Devices.CAP_OPENNI=900#OpenNI(forMicrosoftKinect)CAP_OPENNI_ASUS=

910#OpenNI(forAsusXtion)

#ChannelsofanOpenNI-compatibledepthgenerator.CAP_OPENNI_DEPTH_MAP=0

#Depthvaluesinmm(16UC1)CAP_OPENNI_POINT_CLOUD_MAP=1#XYZinmeters

(32FC3)CAP_OPENNI_DISPARITY_MAP=2#Disparityinpixels

(8UC1)CAP_OPENNI_DISPARITY_MAP_32F=3#Disparityinpixels

(32FC1)CAP_OPENNI_VALID_DEPTH_MASK=4#8UC1

#ChannelsofanOpenNI-compatibleRGBimagegenerator.CAP_OPENNI_BGR_IMAGE

=5CAP_OPENNI_GRAY_IMAGE=6

Thedepth-relatedchannelsrequiresomeexplanation,asgiveninthefollowinglist:

Adepthmapisagrayscaleimageinwhicheachpixelvalueistheestimateddistancefromthecameratoasurface.Specifically,animagefromtheCAP_OPENNI_DEPTH_MAPchannelgivesthedistanceasafloating-pointnumberofmillimeters.Apointcloudmapisacolorimageinwhicheachcolorcorrespondstoan(x,y,orz)spatialdimension.Specifically,theCAP_OPENNI_POINT_CLOUD_MAPchannelyieldsaBGRimage,whereBisx(blueisright),Gisy(greenisup),andRisz(redisdeep),fromthecamera’sperspective.Thevaluesareinmeters.Adisparitymapisagrayscaleimageinwhicheachpixelvalueisthestereodisparityofasurface.Toconceptualizestereodisparity,let’ssupposeweoverlaytwoimagesofascene,shotfromdifferentviewpoints.Theresultwouldbesimilartoseeingdoubleimages.Forpointsonanypairoftwinobjectsinthescene,wecanmeasurethedistanceinpixels.Thismeasurementisthestereodisparity.Nearbyobjectsexhibitgreaterstereodisparitythanfar-offobjects.Thus,nearbyobjectsappearbrighterinadisparitymap.Avaliddepthmaskshowswhetherthedepthinformationatagivenpixelisbelievedtobevalid(shownbyanonzerovalue)orinvalid(shownbyavalueofzero).Forexample,ifthedepthcameradependsonaninfraredilluminator(aninfraredflash),depthinformationisinvalidinregionsthatareoccluded(shadowed)fromthislight.

Thefollowingscreenshotshowsapointcloudmapofamansittingbehindasculptureofacat:

Page 145: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Thefollowingscreenshothasadisparitymapofamansittingbehindasculptureofacat:

Avaliddepthmaskofamansittingbehindasculptureofacatisshowninthefollowingscreenshot:

Page 146: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 147: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 148: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CreatingamaskfromadisparitymapForthepurposesofCameo,weareinterestedindisparitymapsandvaliddepthmasks.Theycanhelpusrefineourestimatesoffacialregions.

UsingtheFaceTrackerfunctionandanormalcolorimage,wecanobtainrectangularestimatesoffacialregions.Byanalyzingsucharectangularregioninthecorrespondingdisparitymap,wecantellthatsomepixelswithintherectangleareoutliers—toonearortoofartoreallybeapartoftheface.Wecanrefinethefacialregiontoexcludetheseoutliers.However,weshouldonlyapplythistestwherethedataisvalid,asindicatedbythevaliddepthmask.

Let’swriteafunctiontogenerateamaskwhosevaluesare0fortherejectedregionsofthefacialrectangleand1fortheacceptedregions.Thisfunctionshouldtakeadisparitymap,validdepthmask,andarectangleasarguments.Wecanimplementitindepth.pyasfollows:

defcreateMedianMask(disparityMap,validDepthMask,rect=None):

"""Returnamaskselectingthemedianlayer,plusshadows."""

ifrectisnotNone:

x,y,w,h=rect

disparityMap=disparityMap[y:y+h,x:x+w]

validDepthMask=validDepthMask[y:y+h,x:x+w]

median=numpy.median(disparityMap)

returnnumpy.where((validDepthMask==0)|\

(abs(disparityMap-median)<12),

1.0,0.0)

Toidentifyoutliersinthedisparitymap,wefirstfindthemedianusingnumpy.median(),whichtakesanarrayasanargument.Ifthearrayisofanoddlength,median()returnsthevaluethatwouldlieinthemiddleofthearrayifthearrayweresorted.Ifthearrayisofevenlength,median()returnstheaverageofthetwovaluesthatwouldbesortednearesttothemiddleofthearray.

Togenerateamaskbasedonper-pixelBooleanoperations,weusenumpy.where()withthreearguments.Inthefirstargument,where()takesanarraywhoseelementsareevaluatedfortruthorfalsity.Anoutputarrayoflikedimensionsisreturned.Whereveranelementintheinputarrayistrue,thewhere()function’ssecondargumentisassignedtothecorrespondingelementintheoutputarray.Conversely,whereveranelementintheinputarrayisfalse,thewhere()function’sthirdargumentisassignedtothecorrespondingelementintheoutputarray.

Ourimplementationtreatsapixelasanoutlierwhenithasavaliddisparityvaluethatdeviatesfromthemediandisparityvalueby12ormore.I’vechosenthevalueof12justbyexperimentation.FeelfreetotweakthisvaluelaterbasedontheresultsyouencounterwhenrunningCameowithyourparticularcamerasetup.

Page 149: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 150: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

MaskingacopyoperationAspartofthepreviouschapter’swork,wewrotecopyRect()asacopyoperationthatlimitsitselftothegivenrectanglesofasourceanddestinationimage.Now,wewanttoapplyfurtherlimitstothiscopyoperation.Wewanttouseagivenmaskthathasthesamedimensionsasthesourcerectangle.

Weshallcopyonlythosepixelsinthesourcerectanglewherethemask’svalueisnotzero.Otherpixelsshallretaintheiroldvaluesfromthedestinationimage.Thislogic,withanarrayofconditionsandtwoarraysofpossibleoutputvalues,canbeexpressedconciselywiththenumpy.where()functionthatwehaverecentlylearned.

Let’sopenrects.pyandeditcopyRect()toaddanewmaskargument.ThisargumentmaybeNone,inwhichcase,wefallbacktoouroldimplementationofthecopyoperation.Otherwise,wenextensurethatmaskandtheimageshavethesamenumberofchannels.Weassumethatmaskhasonechannelbuttheimagesmayhavethreechannels(BGR).Wecanaddduplicatechannelstomaskusingtherepeat()andreshape()methodsofnumpy.array.

Finally,weperformthecopyoperationusingwhere().Thecompleteimplementationisasfollows:

defcopyRect(src,dst,srcRect,dstRect,mask=None,

interpolation=cv2.INTER_LINEAR):

"""Copypartofthesourcetopartofthedestination."""

x0,y0,w0,h0=srcRect

x1,y1,w1,h1=dstRect

#Resizethecontentsofthesourcesub-rectangle.

#Puttheresultinthedestinationsub-rectangle.

ifmaskisNone:

dst[y1:y1+h1,x1:x1+w1]=\

cv2.resize(src[y0:y0+h0,x0:x0+w0],(w1,h1),

interpolation=interpolation)

else:

ifnotutils.isGray(src):

#Convertthemaskto3channels,liketheimage.

mask=mask.repeat(3).reshape(h0,w0,3)

#Performthecopy,withthemaskapplied.

dst[y1:y1+h1,x1:x1+w1]=\

numpy.where(cv2.resize(mask,(w1,h1),

interpolation=\

cv2.INTER_NEAREST),

cv2.resize(src[y0:y0+h0,x0:x0+w0],(w1,h1),

interpolation=interpolation),

dst[y1:y1+h1,x1:x1+w1])

WealsoneedtomodifyourswapRects()function,whichusescopyRect()toperformacircularswapofalistofrectangularregions.ThemodificationstoswapRects()arequitesimple.Wejustneedtoaddanewmasksargument,whichisalistofmaskswhoseelementsarepassedtotherespectivecopyRect()calls.Ifthevalueofthegivenmasks

Page 151: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

argumentisNone,wepassNonetoeverycopyRect()call.

Thefollowingcodeshowsyouthefullimplementationofthis:

defswapRects(src,dst,rects,masks=None,

interpolation=cv2.INTER_LINEAR):

"""Copythesourcewithtwoormoresub-rectanglesswapped."""

ifdstisnotsrc:

dst[:]=src

numRects=len(rects)

ifnumRects<2:

return

ifmasksisNone:

masks=[None]*numRects

#Copythecontentsofthelastrectangleintotemporarystorage.

x,y,w,h=rects[numRects-1]

temp=src[y:y+h,x:x+w].copy()

#Copythecontentsofeachrectangleintothenext.

i=numRects-2

whilei>=0:

copyRect(src,dst,rects[i],rects[i+1],masks[i],

interpolation)

i-=1

#Copythetemporarilystoredcontentintothefirstrectangle.

copyRect(temp,dst,(0,0,w,h),rects[0],masks[numRects-1],

interpolation)

NotethatthemasksargumentincopyRect()andswapRects()bothdefaulttoNone.Thus,ournewversionsofthesefunctionsarebackwardcompatiblewithourpreviousversionsofCameo.

Page 152: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 153: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DepthestimationwithanormalcameraAdepthcameraisafantasticlittledevicetocaptureimagesandestimatethedistanceofobjectsfromthecameraitself,but,howdoesthedepthcameraretrievedepthinformation?Also,isitpossibletoreproducethesamekindofcalculationswithanormalcamera?

Adepthcamera,suchasMicrosoftKinect,usesatraditionalcameracombinedwithaninfraredsensorthathelpsthecameradifferentiatesimilarobjectsandcalculatetheirdistancefromthecamera.However,noteverybodyhasaccesstoadepthcameraoraKinect,andespeciallywhenyou’rejustlearningOpenCV,you’reprobablynotgoingtoinvestinanexpensivepieceofequipmentuntilyoufeelyourskillsarewell-sharpened,andyourinterestinthesubjectisconfirmed.

Oursetupincludesasimplecamera,whichismostlikelyintegratedinourmachine,orawebcamattachedtoourcomputer.So,weneedtoresorttolessfancymeansofestimatingthedifferenceindistanceofobjectsfromthecamera.

Geometrywillcometotherescueinthiscase,andinparticular,EpipolarGeometry,whichisthegeometryofstereovision.Stereovisionisabranchofcomputervisionthatextractsthree-dimensionalinformationoutoftwodifferentimagesofthesamesubject.

Howdoesepipolargeometrywork?Conceptually,ittracesimaginarylinesfromthecameratoeachobjectintheimage,thendoesthesameonthesecondimage,andcalculatesthedistanceofobjectsbasedontheintersectionofthelinescorrespondingtothesameobject.Hereisarepresentationofthisconcept:

Let’sseehowOpenCVappliesepipolargeometrytocalculateaso-calleddisparitymap,whichisbasicallyarepresentationofthedifferentdepthsdetectedintheimages.Thiswillenableustoextracttheforegroundofapictureanddiscardtherest.

Firstly,weneedtwoimagesofthesamesubjecttakenfromdifferentpointsofview,butpayingattentiontothefactthatthepicturesaretakenatanequaldistancefromtheobject,

Page 154: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

otherwisethecalculationswillfailandthedisparitymapwillbemeaningless.

So,movingontoanexample:

importnumpyasnp

importcv2

defupdate(val=0):

#disparityrangeistunedfor'aloe'imagepair

stereo.setBlockSize(cv2.getTrackbarPos('window_size','disparity'))

stereo.setUniquenessRatio(cv2.getTrackbarPos('uniquenessRatio',

'disparity'))

stereo.setSpeckleWindowSize(cv2.getTrackbarPos('speckleWindowSize',

'disparity'))

stereo.setSpeckleRange(cv2.getTrackbarPos('speckleRange','disparity'))

stereo.setDisp12MaxDiff(cv2.getTrackbarPos('disp12MaxDiff',

'disparity'))

print'computingdisparity…'

disp=stereo.compute(imgL,imgR).astype(np.float32)/16.0

cv2.imshow('left',imgL)

cv2.imshow('disparity',(disp-min_disp)/num_disp)

if__name__=="__main__":

window_size=5

min_disp=16

num_disp=192-min_disp

blockSize=window_size

uniquenessRatio=1

speckleRange=3

speckleWindowSize=3

disp12MaxDiff=200

P1=600

P2=2400

imgL=cv2.imread('images/color1_small.jpg')

imgR=cv2.imread('images/color2_small.jpg')

cv2.namedWindow('disparity')

cv2.createTrackbar('speckleRange','disparity',speckleRange,50,

update)

cv2.createTrackbar('window_size','disparity',window_size,21,update)

cv2.createTrackbar('speckleWindowSize','disparity',speckleWindowSize,

200,update)

cv2.createTrackbar('uniquenessRatio','disparity',uniquenessRatio,50,

update)

cv2.createTrackbar('disp12MaxDiff','disparity',disp12MaxDiff,250,

update)

stereo=cv2.StereoSGBM_create(

minDisparity=min_disp,

numDisparities=num_disp,

blockSize=window_size,

uniquenessRatio=uniquenessRatio,

speckleRange=speckleRange,

speckleWindowSize=speckleWindowSize,

disp12MaxDiff=disp12MaxDiff,

Page 155: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

P1=P1,

P2=P2

)

update()

cv2.waitKey()

Inthisexample,wetaketwoimagesofthesamesubjectandcalculateadisparitymap,showinginbrightercolorsthepointsinthemapthatareclosertothecamera.Theareasmarkedinblackrepresentthedisparities.

Firstofall,weimportnumpyandcv2asusual.

Let’sskipthedefinitionoftheupdatefunctionforasecondandtakealookatthemaincode;theprocessisquitesimple:loadtwoimages,createaStereoSGBMinstance(StereoSGBMstandsforsemiglobalblockmatching,anditisanalgorithmusedforcomputingdisparitymaps),andalsocreateafewtrackbarstoplayaroundwiththeparametersofthealgorithmandcalltheupdatefunction.

TheupdatefunctionappliesthetrackbarvaluestotheStereoSGBMinstance,andthencallsthecomputemethod,whichproducesadisparitymap.Allinall,prettysimple!HereisthefirstimageI’veused:

Thisisthesecondone:

Page 156: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Thereyougo:aniceandquiteeasytointerpretdisparitymap.

Page 157: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

TheparametersusedbyStereoSGBMareasfollows(takenfromtheOpenCVdocumentation):

Parameter Description

minDisparity

Thisparameterreferstotheminimumpossibledisparityvalue.Normally,itiszerobutsometimes,rectificationalgorithmscanshiftimages,sothisparameterneedstobeadjustedaccordingly.

numDisparitiesThisparameterreferstothemaximumdisparityminusminimumdisparity.Theresultantvalueisalwaysgreaterthanzero.Inthecurrentimplementation,thisparametermustbedivisibleby16.

windowSizeThisparameterreferstoamatchedblocksize.Itmustbeanoddnumbergreaterthanorequalto1.Normally,itshouldbesomewhereinthe3-11range.

P1Thisparameterreferstothefirstparametercontrollingthedisparitysmoothness.Seethenextpoint.

P2

Thisparameterreferstothesecondparameterthatcontrolsthedisparitysmoothness.Thelargerthevaluesare,thesmootherthedisparityis.P1isthepenaltyonthedisparitychangebyplusorminus1betweenneighborpixels.P2isthepenaltyonthedisparitychangebymorethan1betweenneighborpixels.ThealgorithmrequiresP2>P1.

Seethestereo_match.cppsamplewheresomereasonablygoodP1andP2valuesareshown(suchas8*number_of_image_channels*windowSize*windowSizeand32*number_of_image_channels*windowSize*windowSize,respectively).

disp12MaxDiffThisparameterreferstothemaximumalloweddifference(inintegerpixelunits)intheleft-rightdisparitycheck.Setittoanonpositivevaluetodisablethecheck.

preFilterCap

Thisparameterreferstothetruncationvalueforprefilteredimagepixels.Thealgorithmfirstcomputesthex-derivativeateachpixelandclipsitsvaluebythe[-preFilterCap,preFilterCap]interval.TheresultantvaluesarepassedtotheBirchfield-Tomasipixelcostfunction.

uniquenessRatio

Thisparameterreferstothemargininpercentagebywhichthebest(minimum)computedcostfunctionvalueshould“win”thesecondbestvaluetoconsiderthefoundmatchtobecorrect.Normally,avaluewithinthe5-15rangeisgoodenough.

speckleWindowSize

Thisparameterreferstothemaximumsizeofsmoothdisparityregionstoconsidertheirnoisespecklesandinvalidate.Setitto0todisablespecklefiltering.Otherwise,setitsomewhereinthe50-200range.

speckleRange

Thisparameterreferstothemaximumdisparityvariationwithineachconnectedcomponent.Ifyoudospecklefiltering,settheparametertoapositivevalue;itwillimplicitlybemultipliedby16.Normally,1or2isgoodenough.

Withtheprecedingscript,you’llbeabletoloadtheimagesandplayaroundwithparametersuntilyou’rehappywiththedisparitymapgeneratedbyStereoSGBM.

Page 158: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 159: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ObjectsegmentationusingtheWatershedandGrabCutalgorithmsCalculatingadisparitymapcanbeveryusefultodetecttheforegroundofanimage,butStereoSGBMisnottheonlyalgorithmavailabletoaccomplishthis,andinfact,StereoSGBMismoreaboutgathering3Dinformationfrom2Dpictures,thananythingelse.GrabCut,however,isaperfecttoolforthispurpose.TheGrabCutalgorithmfollowsaprecisesequenceofsteps:

1. Arectangleincludingthesubject(s)ofthepictureisdefined.2. Thearealyingoutsidetherectangleisautomaticallydefinedasabackground.3. Thedatacontainedinthebackgroundisusedasareferencetodistinguish

backgroundareasfromforegroundareaswithintheuser-definedrectangle.4. AGaussiansMixtureModel(GMM)modelstheforegroundandbackground,and

labelsundefinedpixelsasprobablebackgroundandforegrounds.5. Eachpixelintheimageisvirtuallyconnectedtothesurroundingpixelsthrough

virtualedges,andeachedgegetsaprobabilityofbeingforegroundorbackground,basedonhowsimilaritisincolortothepixelssurroundingit.

6. Eachpixel(ornodeasitisconceptualizedinthealgorithm)isconnectedtoeitheraforegroundorabackgroundnode,whichyoucanpicturelookinglikethis:

7. Afterthenodeshavebeenconnectedtoeitherterminal(backgroundorforeground,alsocalledasourceandsink),theedgesbetweennodesbelongingtodifferentterminalsarecut(thefamouscutpartofthealgorithm),whichenablestheseparationofthepartsoftheimage.Thisgraphadequatelyrepresentsthealgorithm:

Page 160: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 161: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ExampleofforegrounddetectionwithGrabCutLet’slookatanexample.Westartwiththepictureofabeautifulstatueofanangel.

Wewanttograbourangelanddiscardthebackground.Todothis,wewillcreatearelativelyshortscriptthatwillinstantiateGrabCut,operatetheseparation,andthendisplaytheresultingimagesidebysidetotheoriginal.Wewilldothisusingmatplotlib,averyusefulPythonlibrary,whichmakesdisplayingchartsandimagesatrivialtask:

importnumpyasnp

importcv2

frommatplotlibimportpyplotasplt

img=cv2.imread('images/statue_small.jpg')

mask=np.zeros(img.shape[:2],np.uint8)

bgdModel=np.zeros((1,65),np.float64)

fgdModel=np.zeros((1,65),np.float64)

rect=(100,50,421,378)

cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)

mask2=np.where((mask==2)|(mask==0),0,1).astype('uint8')

img=img*mask2[:,:,np.newaxis]

plt.subplot(121),plt.imshow(img)

plt.title("grabcut"),plt.xticks([]),plt.yticks([])

plt.subplot(122),

plt.imshow(cv2.cvtColor(cv2.imread('images/statue_small.jpg'),

cv2.COLOR_BGR2RGB))

plt.title("original"),plt.xticks([]),plt.yticks([])

Page 162: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

plt.show()

Thiscodeisactuallyquitestraightforward.Firstly,weloadtheimagewewanttoprocess,andthenwecreateamaskpopulatedwithzeroswiththesameshapeastheimagewe’veloaded:

importnumpyasnp

importcv2

frommatplotlibimportpyplotasplt

img=cv2.imread('images/statue_small.jpg')

mask=np.zeros(img.shape[:2],np.uint8)

Wethencreatezero-filledforegroundandbackgroundmodels:

bgdModel=np.zeros((1,65),np.float64)

fgdModel=np.zeros((1,65),np.float64)

Wecouldhavepopulatedthesemodelswithdata,butwe’regoingtoinitializetheGrabCutalgorithmwitharectangleidentifyingthesubjectwewanttoisolate.So,backgroundandforegroundmodelsaregoingtobedeterminedbasedontheareasleftoutoftheinitialrectangle.Thisrectangleisdefinedinthenextline:

rect=(100,50,421,378)

Nowtotheinterestingpart!WeruntheGrabCutalgorithmspecifyingtheemptymodelsandmask,andthefactthatwe’regoingtousearectangletoinitializetheoperation:

cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)

You’llalsonoticeanintegerafterfgdModel,whichisthenumberofiterationsthealgorithmisgoingtorunontheimage.Youcanincreasethese,butthereisapointinwhichpixelclassificationswillconverge,andeffectively,you’lljustbeaddingiterationswithoutobtaininganymoreimprovements.

Afterthis,ourmaskwillhavechangedtocontainvaluesbetween0and3.Thevalues,0and2,willbeconvertedintozeros,and1-3intoones,andstoredintomask2,whichwecanthenusetofilteroutallzero-valuepixels(theoreticallyleavingallforegroundpixelsintact):

mask2=np.where((mask==2)|(mask==0),0,1).astype('uint8')

img=img*mask2[:,:,np.newaxis]

Thelastpartofthecodedisplaystheimagessidebyside,andhere’stheresult:

Page 163: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Thisisquiteasatisfactoryresult.You’llnoticethatanareaofbackgroundisleftundertheangel’sarm.Itispossibletoapplytouchstrokestoapplymoreiterations;thetechniqueisquitewellillustratedinthegrabcut.pyfileinsamples/python2ofyourOpenCVinstallation.

Page 164: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ImagesegmentationwiththeWatershedalgorithmFinally,wetakeaquicklookattheWatershedalgorithm.ThealgorithmiscalledWatershed,becauseitsconceptualizationinvolveswater.Imagineareaswithlowdensity(littletonochange)inanimageasvalleys,andareaswithhighdensity(lotsofchange)aspeaks.Startfillingthevalleyswithwatertothepointwherewaterfromtwodifferentvalleysisabouttomerge.Topreventthemergingofwaterfromdifferentvalleys,youbuildabarriertokeepthemseparated.Theresultingbarrieristheimagesegmentation.

AsanItalian,Ilovefood,andoneofthethingsIlovethemostisagoodplateofpastawithapestosauce.Sohere’sapictureofthemostvitalingredientforapesto,basil:

Now,wewanttosegmenttheimagetoseparatethebasilleavesfromthewhitebackground.

Oncemore,weimportnumpy,cv2,andmatplotlib,andthenimportourbasilleaves’image:

importnumpyasnp

importcv2

frommatplotlibimportpyplotasplt

img=cv2.imread('images/basil.jpg')

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

Afterchangingthecolortograyscale,werunathresholdontheimage.Thisoperationhelpsdividingtheimageintwo,blacksandwhites:

ret,thresh=

cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)

Nextup,weremovenoisefromtheimagebyapplyingthemorphologyExtransformation,anoperationthatconsistsofdilatingandthenerodinganimagetoextractfeatures:

kernel=np.ones((3,3),np.uint8)

opening=cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel,iterations=2)

Page 165: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Bydilatingtheresultofthemorphologytransformation,wecanobtainareasoftheimagethataremostcertainlybackground:

sure_bg=cv2.dilate(opening,kernel,iterations=3)

Conversely,wecanobtainsureforegroundareasbyapplyingdistanceTransform.Inpracticalterms,ofalltheareasmostlikelytobeforeground,thefartherawayfromthe“border”withthebackgroundapointis,thehigherthechanceitisforeground.Oncewe’veobtainedthedistanceTransformrepresentationoftheimage,weapplyathresholdtodeterminewithahighlymathematicalprobabilitywhethertheareasareforeground:

dist_transform=cv2.distanceTransform(opening,cv2.DIST_L2,5)

ret,sure_fg=cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)

Atthisstage,wehavesomesureforegroundsandbackgrounds.Now,whatabouttheareasinbetween?Firstofall,weneedtodeterminetheseregions,whichcanbedonebysubtractingthesureforegroundfromthebackground:

sure_fg=np.uint8(sure_fg)

unknown=cv2.subtract(sure_bg,sure_fg)

Nowthatwehavetheseareas,wecanbuildourfamous“barriers”tostopthewaterfrommerging.ThisisdonewiththeconnectedComponentsfunction.WetookaglimpseatthegraphtheorywhenweanalyzedtheGrabCutalgorithm,andconceptualizedanimageasasetofnodesthatareconnectedbyedges.Giventhesureforegroundareas,someofthesenodeswillbeconnectedtogether,butsomewon’t.Thismeansthattheybelongtodifferentwatervalleys,andthereshouldbeabarrierbetweenthem:

ret,markers=cv2.connectedComponents(sure_fg)

Nowweadd1tothebackgroundareasbecauseweonlywantunknownstostayat0:

markers=markers+1

markers[unknown==255]=0

Finally,weopenthegates!Letthewaterfallandourbarriersbedrawninred:

markers=cv2.watershed(img,markers)

img[markers==-1]=[255,0,0]

plt.imshow(img)

plt.show()

Now,let’sshowtheresult:

Page 166: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Needlesstosay,Iamnowhungry!

Page 167: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 168: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryInthischapter,welearnedaboutgatheringthree-dimensionalinformationfrombi-dimensionalinput(avideoframeoranimage).Firstly,weexamineddepthcameras,andthenepipolargeometryandstereoimages,sowearenowabletocalculatedisparitymaps.Finally,welookedatimagesegmentationwithtwoofthemostpopularmethods:GrabCutandWatershed.

ThischapterhasintroducedustotheworldofinterpretinginformationprovidedbyimagesandwearenowreadytoexploreanotherimportantfeatureofOpenCV:featuredescriptorsandkeypointdetection.

Page 169: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 170: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter5.DetectingandRecognizingFacesAmongthemanyreasonsthatmakecomputervisionafascinatingsubjectisthefactthatcomputervisionmakesveryfuturistic-soundingtasksareality.Onesuchfeatureisfacedetection.OpenCVhasabuilt-infacilitytoperformfacedetection,whichhasvirtuallyinfiniteapplicationsintherealworldinallsortsofcontexts,fromsecuritytoentertainment.

ThischapterintroducessomeofOpenCV’sfacedetectionfunctionalities,alongwiththedatafilesthatdefineparticulartypesoftrackableobjects.Specifically,welookatHaarcascadeclassifiers,whichanalyzecontrastbetweenadjacentimageregionstodeterminewhetherornotagivenimageorsubimagematchesaknowntype.WeconsiderhowtocombinemultipleHaarcascadeclassifiersinahierarchy,suchthatoneclassifieridentifiesaparentregion(forourpurposes,aface)andotherclassifiersidentifychildregions(eyes,nose,andmouth).

Wealsotakeadetourintothehumblebutimportantsubjectofrectangles.Bydrawing,copying,andresizingrectangularimageregions,wecanperformsimplemanipulationsonimageregionsthatwearetracking.

Bytheendofthischapter,wewillintegratefacetrackingandrectanglemanipulationsintoCameo.Finally,we’llhavesomeface-to-faceinteraction!

Page 171: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ConceptualizingHaarcascadesWhenwetalkaboutclassifyingobjectsandtrackingtheirlocation,whatexactlyarewehopingtopinpoint?Whatconstitutesarecognizablepartofanobject?

Photographicimages,evenfromawebcam,maycontainalotofdetailforour(human)viewingpleasure.However,imagedetailtendstobeunstablewithrespecttovariationsinlighting,viewingangle,viewingdistance,camerashake,anddigitalnoise.Moreover,evenrealdifferencesinphysicaldetailmightnotinterestusforthepurposeofclassification.Iwastaughtinschoolthatnotwosnowflakeslookalikeunderamicroscope.Fortunately,asaCanadianchild,Ihadalreadylearnedhowtorecognizesnowflakeswithoutamicroscope,asthesimilaritiesaremoreobviousinbulk.

Thus,somemeansofabstractingimagedetailisusefulinproducingstableclassificationandtrackingresults.Theabstractionsarecalledfeatures,whicharesaidtobeextractedfromtheimagedata.Thereshouldbefarfewerfeaturesthanpixels,thoughanypixelmightinfluencemultiplefeatures.ThelevelofsimilaritybetweentwoimagescanbeevaluatedbasedonEuclideandistancesbetweentheimages’correspondingfeatures.

Forexample,distancemightbedefinedintermsofspatialcoordinatesorcolorcoordinates.Haar-likefeaturesareonetypeoffeaturethatisoftenappliedtoreal-timefacetracking.Theywerefirstusedforthispurposeinthepaper,RobustReal-TimeFaceDetection,PaulViolaandMichaelJones,KluwerAcademicPublishers,2001(availableathttp://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/viola04ijcv.pdf).EachHaar-likefeaturedescribesthepatternofcontrastamongadjacentimageregions.Forexample,edges,vertices,andthinlineseachgeneratedistinctivefeatures.

Foranygivenimage,thefeaturesmayvarydependingontheregion’ssize;thismaybecalledthewindowsize.Twoimagesthatdifferonlyinscaleshouldbecapableofyieldingsimilarfeatures,albeitfordifferentwindowsizes.Thus,itisusefultogeneratefeaturesformultiplewindowsizes.Suchacollectionoffeaturesiscalledacascade.WemaysayaHaarcascadeisscale-invariantor,inotherwords,robusttochangesinscale.OpenCVprovidesaclassifierandtrackerforscale-invariantHaarcascadesthatitexpectstobeinacertainfileformat.

Haarcascades,asimplementedinOpenCV,arenotrobusttochangesinrotation.Forexample,anupside-downfaceisnotconsideredsimilartoanuprightfaceandafaceviewedinprofileisnotconsideredsimilartoafaceviewedfromthefront.Amorecomplexandmoreresource-intensiveimplementationcouldimproveHaarcascades’robustnesstorotationbyconsideringmultipletransformationsofimagesaswellasmultiplewindowsizes.However,wewillconfineourselvestotheimplementationinOpenCV.

Page 172: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 173: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

GettingHaarcascadedataOnceyouhaveacopyofthesourcecodeofOpenCV3,youwillfindafolder,data/haarcascades.

ThisfoldercontainsalltheXMLfilesusedbytheOpenCVfacedetectionenginetodetectfacesinstillimages,videos,andcamerafeeds.

Onceyoufindhaarcascades,createadirectoryforyourproject;inthisfolder,createasubfoldercalledcascades,andcopythefollowingfilesfromhaarcascadesintocascades:

haarcascade_profileface.xml

haarcascade_righteye_2splits.xml

haarcascade_russian_plate_number.xml

haarcascade_smile.xml

haarcascade_upperbody.xml

Astheirnamessuggest,thesecascadesarefortrackingfaces,eyes,noses,andmouths.Theyrequireafrontal,uprightviewofthesubject.Wewillusethemlaterwhenbuildingafacedetector.Ifyouarecuriousabouthowthesedatasetsaregenerated,refertoAppendixB,GeneratingHaarCascadesforCustomTargets,OpenCVComputerVisionwithPython.Withalotofpatienceandapowerfulcomputer,youcanmakeyourowncascadesandtrainthemforvarioustypesofobjects.

Page 174: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 175: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

UsingOpenCVtoperformfacedetectionUnlikewhatyoumaythinkfromtheoutset,performingfacedetectiononastillimageoravideofeedisanextremelysimilaroperation.Thelatterisjustthesequentialversionoftheformer:facedetectiononvideosissimplyfacedetectionappliedtoeachframereadintotheprogramfromthecamera.Naturally,awholehostofconceptsareappliedtovideofacedetectionsuchastracking,whichdoesnotapplytostillimages,butit’salwaysgoodtoknowthattheunderlyingtheoryisthesame.

Solet’sgoaheadanddetectsomefaces.

Page 176: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PerformingfacedetectiononastillimageThefirstandmostbasicwaytoperformfacedetectionistoloadanimageanddetectfacesinit.Tomaketheresultvisuallymeaningful,wewilldrawrectanglesaroundfacesontheoriginalimage.

Nowthatyouhavehaarcascadesincludedinyourproject,let’sgoaheadandcreateabasicscripttoperformfacedetection.

importcv2

filename='/path/to/my/pic.jpg'

defdetect(filename):

face_cascade=

cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')

img=cv2.imread(filename)

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

faces=face_cascade.detectMultiScale(gray,1.3,5)

for(x,y,w,h)infaces:

img=cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)

cv2.namedWindow('VikingsDetected!!')

cv2.imshow('VikingsDetected!!',img)

cv2.imwrite('./vikings.jpg',img)

cv2.waitKey(0)

detect(filename)

Let’sgothroughthecode.First,weusetheobligatorycv2import(you’llfindthateveryscriptinthisbookwillstartlikethis,oralmostsimilar).Secondly,wedeclarethedetectfunction.

defdetect(filename):

Withinthisfunction,wedeclareaface_cascadevariable,whichisaCascadeClassifierobjectforfaces,andresponsibleforfacedetection.

face_cascade=

cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')

Wethenloadourfilewithcv2.imread,andconvertittograyscale,becausethat’sthecolorspaceinwhichthefacedetectionhappens.

Thenextstep(face_cascade.detectMultiScale)iswhereweoperatetheactualfacedetection.

img=cv2.imread(filename)

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

faces=face_cascade.detectMultiScale(gray,1.3,5)

TheparameterspassedarescaleFactorandminNeighbors,whichdeterminethepercentagereductionoftheimageateachiterationofthefacedetectionprocess,andtheminimumnumberofneighborsretainedbyeachfacerectangleateachiteration.Thismay

Page 177: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

allseemalittlecomplexinthebeginningbutyoucancheckalltheoptionsoutintheofficialdocumentation.

Thevaluereturnedfromthedetectionoperationisanarrayoftuplesthatrepresentthefacerectangles.Theutilitymethod,cv2.rectangle,allowsustodrawrectanglesatthespecifiedcoordinates(xandyrepresenttheleftandtopcoordinates,wandhrepresentthewidthandheightofthefacerectangle).

Wewilldrawbluerectanglesaroundallthefaceswefindbyloopingthroughthefacesvariable,makingsureweusetheoriginalimagefordrawing,notthegrayversion.

for(x,y,w,h)infaces:

img=cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)

Lastly,wecreateanamedWindowinstanceanddisplaytheresultingprocessedimageinit.Topreventtheimagewindowfromclosingautomatically,weinsertacalltowaitKey,whichclosesthewindowdownatthepressofanykey.

cv2.namedWindow('VikingsDetected!!')

cv2.imshow('VikingsDetected!!',img)

cv2.waitKey(0)

Andtherewego,awholesetofVikingshavebeendetectedinourimage,asshowninthefollowingscreenshot:

Page 178: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PerformingfacedetectiononavideoWenowhaveagoodfoundationtounderstandhowtoperformfacedetectiononastillimage.Asmentionedpreviously,wecanrepeattheprocessontheindividualframesofavideo(beitacamerafeedoravideo)andperformfacedetection.

Thescriptwillperformthefollowingtasks:itwillopenacamerafeed,itwillreadaframe,itwillexaminethatframeforfaces,itwillscanforeyeswithinthefacesdetected,andthenitwilldrawbluerectanglesaroundthefacesandgreenrectanglesaroundtheeyes.

1. Let’screateafilecalledface_detection.pyandstartbyimportingthenecessarymodule:

importcv2

2. Afterthis,wedeclareamethod,detect(),whichwillperformfacedetection.

defdetect():

face_cascade=

cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')

eye_cascade=cv2.CascadeClassifier('./cascades/haarcascade_eye.xml')

camera=cv2.VideoCapture(0)

3. Thefirstthingweneedtodoinsidethedetect()methodistoloadtheHaarcascadefilessothatOpenCVcanoperatefacedetection.Aswecopiedthecascadefilesinthelocalcascades/folder,wecanusearelativepath.Then,weopenaVideoCaptureobject(thecamerafeed).TheVideoCaptureconstructortakesaparameter,whichindicatesthecameratobeused;zeroindicatesthefirstcameraavailable.

while(True):

ret,frame=camera.read()

gray=cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

4. Nextup,wecaptureaframe.Theread()methodreturnstwovalues:aBooleanindicatingthesuccessoftheframereadoperation,andtheframeitself.Wecapturetheframe,andthenweconvertittograyscale.Thisisanecessaryoperation,becausefacedetectioninOpenCVhappensinthegrayscalecolorspace:

faces=face_cascade.detectMultiScale(gray,1.3,5)

5. Muchlikethesinglestillimageexample,wecalldetectMultiScaleonthegrayscaleversionoftheframe.

for(x,y,w,h)infaces:

img=cv2.rectangle(frame,(x,y),(x+w,y+h),(255,0,0),2)

roi_gray=gray[y:y+h,x:x+w]

eyes=eye_cascade.detectMultiScale(roi_gray,1.03,5,0,

(40,40))

Page 179: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

NoteThereareafewadditionalparametersintheeyedetection.Why?ThemethodsignaturefordetectMultiScaletakesanumberofoptionalparameters:inthecaseofdetectingaface,thedefaultoptionsweregoodenoughtodetectfaces.However,eyesareasmallerfeatureoftheface,andself-castingshadowsinmybeardormynoseandrandomshadowsintheframeweretriggeringfalsepositives.

Bylimitingthesearchforeyestoaminimumsizeof40x40pixels,Iwasabletodiscardallfalsepositives.Goaheadandtesttheseparametersuntilyoureachapointatwhichyourapplicationperformsasyouexpecteditto(forexample,youcantryandspecifyamaximumsizeforthefeaturetoo,orincreasethescalefactorandnumberofneighbors).

6. Herewehaveafurtherstepcomparedtothestillimageexample:wecreatearegionofinterestcorrespondingtothefacerectangle,andwithinthisrectangle,weoperate“eyedetection”.Thismakessenseasyouwouldn’twanttogolookingforeyesoutsideaface(well,forhumanbeingsatleast!).

for(ex,ey,ew,eh)ineyes:

cv2.rectangle(img,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)

7. Again,weloopthroughtheresultingeyetuplesanddrawgreenrectanglesaroundthem.

cv2.imshow("camera",frame)

ifcv2.waitKey(1000/12)&0xff==ord("q"):

break

camera.release()

cv2.destroyAllWindows()

if__name__=="__main__":

detect()

8. Finally,weshowtheresultingframeinthewindow.Allbeingwell,ifanyfaceiswithinthefieldofviewofthecamera,youwillhaveabluerectanglearoundtheirfaceandagreenrectanglearoundeacheye,asshowninthisscreenshot:

Page 180: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 181: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PerformingfacerecognitionDetectingfacesisafantasticfeatureofOpenCVandonethatconstitutesthebasisforamoreadvancedoperation:facerecognition.Whatisfacerecognition?It’stheabilityofaprogram,givenanimageoravideofeed,toidentifyaperson.Oneofthewaystoachievethis(andtheapproachadoptedbyOpenCV)isto“train”theprogrambyfeedingitasetofclassifiedpictures(afacialdatabase),andoperatetherecognitionagainstthosepictures.

ThisistheprocessthatOpenCVanditsfacerecognitionmodulefollowtorecognizefaces.

Anotherimportantfeatureofthefacerecognitionmoduleisthateachrecognitionhasaconfidencescore,whichallowsustosetthresholdsinreal-lifeapplicationstolimittheamountoffalsereads.

Let’sstartfromtheverybeginning;tooperatefacerecognition,weneedfacestorecognize.Youcandothisintwoways:supplytheimagesyourselforobtainfreelyavailablefacedatabases.ThereareanumberoffacedatabasesontheInternet:

TheYalefacedatabase(Yalefaces):http://vision.ucsd.edu/content/yale-face-databaseTheAT&T:http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.htmlTheExtendedYaleorYaleB:http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

Tooperatefacerecognitiononthesesamples,youwouldthenhavetorunfacerecognitiononanimagethatcontainsthefaceofoneofthesampledpeople.Thatmaybeaneducationalprocess,butIfoundittobenotassatisfyingasprovidingimagesofmyown.Infact,Iprobablyhadthesamethoughtthatmanypeoplehad:IwonderifIcouldwriteaprogramthatrecognizesmyfacewithacertaindegreeofconfidence.

GeneratingthedataforfacerecognitionSolet’sgoaheadandwriteascriptthatwillgeneratethoseimagesforus.Afewimagescontainingdifferentexpressionsareallthatweneed,butwehavetomakesurethesampleimagesadheretocertaincriteria:

Imageswillbegrayscaleinthe.pgmformatSquareshapeAllthesamesizeimages(Iused200x200;mostfreelyavailablesetsaresmallerthanthat)

Here’sthescriptitself:

importcv2

defgenerate():

face_cascade=

cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')

eye_cascade=cv2.CascadeClassifier('./cascades/haarcascade_eye.xml')

camera=cv2.VideoCapture(0)

Page 182: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

count=0

while(True):

ret,frame=camera.read()

gray=cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

faces=face_cascade.detectMultiScale(gray,1.3,5)

for(x,y,w,h)infaces:

img=cv2.rectangle(frame,(x,y),(x+w,y+h),(255,0,0),2)

f=cv2.resize(gray[y:y+h,x:x+w],(200,200))

cv2.imwrite('./data/at/jm/%s.pgm'%str(count),f)

count+=1

cv2.imshow("camera",frame)

ifcv2.waitKey(1000/12)&0xff==ord("q"):

break

camera.release()

cv2.destroyAllWindows()

if__name__=="__main__":

generate()

Whatisquiteinterestingaboutthisexerciseisthatwearegoingtogeneratesampleimagesbuildingonournewfoundknowledgeofhowtodetectafaceinavideofeed.Effectively,whatwearedoingisdetectingaface,croppingthatregionofthegray-scaledframe,resizingittobe200x200pixels,andsavingitwithanameinaparticularfolder(inmycase,jm;youcanuseyourinitials)inthe.pgmformat.

Iinsertedavariable,count,becauseweneededprogressivenamesfortheimages.Runthescriptforafewseconds,changeexpressionsafewtimes,andcheckthedestinationfolderyouspecifiedinthescript.Youwillfindanumberofimagesofyourface,grayed,resized,andnamedwiththeformat,<count>.pgm.

Let’snowmoveontotryandrecognizeourfaceinavideofeed.Thisshouldbefun!

RecognizingfacesOpenCV3comeswiththreemainmethodsforrecognizingfaces,basedonthreedifferentalgorithms:Eigenfaces,Fisherfaces,andLocalBinaryPatternHistograms(LBPH).Itisbeyondthescopeofthisbooktogetintothenitty-grittyofthetheoreticaldifferencesbetweenthesemethods,butwecangiveahigh-leveloverviewoftheconcepts.

Iwillreferyoutothefollowinglinksforadetaileddescriptionofthealgorithms:

PrincipalComponentAnalysis(PCA):AveryintuitiveintroductionbyJonathonShlensisavailableathttp://arxiv.org/pdf/1404.1100v1.pdf.Thisalgorithmwasinventedin1901byK.Pearson,andtheoriginalpaper,OnLinesandPlanesofClosestFittoSystemsofPointsinSpace,isavailableathttp://stat.smmu.edu.cn/history/pearson1901.pdf.Eigenfaces:Thepaper,EigenfacesforRecognition,M.TurkandA.Pentland,1991,

Page 183: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

isavailableathttp://www.cs.ucsb.edu/~mturk/Papers/jcn.pdf.Fisherfaces:Theseminalpaper,THEUSEOFMULTIPLEMEASUREMENTSINTAXONOMICPROBLEMS,R.A.Fisher,1936,isavailableathttp://onlinelibrary.wiley.com/doi/10.1111/j.1469-1809.1936.tb02137.x/pdf.LocalBinaryPattern:ThefirstpaperdescribingthisalgorithmisPerformanceevaluationoftexturemeasureswithclassificationbasedonKullbackdiscriminationofdistributions,T.Ojala,M.Pietikainen,D.Harwoodandisavailableathttp://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=576366&searchWithin%5B%5D=%22Authors%22%3A.QT.Ojala%2C+T..QT.&newsearch=true

Firstandforemost,allmethodsfollowasimilarprocess;theyalltakeasetofclassifiedobservations(ourfacedatabase,containingnumeroussamplesperindividual),get“trained”onit,performananalysisoffacesdetectedinanimageorvideo,anddeterminetwoelements:whetherthesubjectisidentified,andameasureoftheconfidenceofthesubjectreallybeingidentified,whichiscommonlyknownastheconfidencescore.

EigenfacesperformsasocalledPCA,which—ofallthemathematicalconceptsyouwillhearmentionedinrelationtocomputervision—ispossiblythemostdescriptive.Itbasicallyidentifiesprincipalcomponentsofacertainsetofobservations(again,yourfacedatabase),calculatesthedivergenceofthecurrentobservation(thefacesbeingdetectedinanimageorframe)comparedtothedataset,anditproducesavalue.Thesmallerthevalue,thesmallerthedifferencebetweenfacedatabaseanddetectedface;hence,avalueof0isanexactmatch.

FisherfacesderivesfromPCAandevolvestheconcept,applyingmorecomplexlogic.Whilecomputationallymoreintensive,ittendstoyieldmoreaccurateresultsthanEigenfaces.

LBPHinsteadroughly(again,fromaveryhighlevel)dividesadetectedfaceintosmallcellsandcompareseachcelltothecorrespondingcellinthemodel,producingahistogramofmatchingvaluesforeacharea.Becauseofthisflexibleapproach,LBPHistheonlyfacerecognitionalgorithmthatallowsthemodelsamplefacesandthedetectedfacestobeofdifferentshapeandsize.Ipersonallyfoundthistobethemostaccuratealgorithmgenerallyspeaking,buteachalgorithmhasitsstrengthsandweaknesses.

PreparingthetrainingdataNowthatwehaveourdata,weneedtoloadthesesamplepicturesintoourfacerecognitionalgorithms.Allfacerecognitionalgorithmstaketwoparametersintheirtrain()method:anarrayofimagesandanarrayoflabels.Whatdotheselabelsrepresent?TheyaretheIDsofacertainindividual/facesothatwhenfacerecognitionisperformed,wenotonlyknowthepersonwasrecognizedbutalsowho—amongthemanypeopleavailableinourdatabase—thepersonis.

Todothat,weneedtocreateacomma-separatedvalue(CSV)file,whichwillcontainthepathtoasamplepicturefollowedbytheIDofthatperson.Inmycase,Ihave20picturesgeneratedwiththepreviousscript,inthesubfolder,jm/,ofthefolder,data/at/,whichcontainsallthepicturesofalltheindividuals.

Page 184: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

MyCSVfilethereforelookslikethis:

jm/1.pgm;0

jm/2.pgm;0

jm/3.pgm;0…

jm/20.pgm;0

NoteThedotsareallthemissingnumbers.Thejm/instanceindicatesthesubfolder,andthe0valueattheendistheIDformyface.

OK,atthisstage,wehaveeverythingweneedtoinstructOpenCVtorecognizeourface.

LoadingthedataandrecognizingfacesNextup,weneedtoloadthesetworesources(thearrayofimagesandCSVfile)intothefacerecognitionalgorithm,soitcanbetrainedtorecognizeourface.Todothis,webuildafunctionthatreadstheCSVfileand—foreachlineofthefile—loadstheimageatthecorrespondingpathintotheimagesarrayandtheIDintothelabelsarray.

defread_images(path,sz=None):

c=0

X,y=[],[]

fordirname,dirnames,filenamesinos.walk(path):

forsubdirnameindirnames:

subject_path=os.path.join(dirname,subdirname)

forfilenameinos.listdir(subject_path):

try:

if(filename==".directory"):

continue

filepath=os.path.join(subject_path,filename)

im=cv2.imread(os.path.join(subject_path,filename),

cv2.IMREAD_GRAYSCALE)

#resizetogivensize(ifgiven)

if(szisnotNone):

im=cv2.resize(im,(200,200))

X.append(np.asarray(im,dtype=np.uint8))

y.append(c)

exceptIOError,(errno,strerror):

print"I/Oerror({0}):{1}".format(errno,strerror)

except:

print"Unexpectederror:",sys.exc_info()[0]

raise

c=c+1

return[X,y]

PerforminganEigenfacesrecognitionWe’rereadytotestthefacerecognitionalgorithm.Here’sthescripttoperformit:

Page 185: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

defface_rec():

names=['Joe','Jane','Jack']

iflen(sys.argv)<2:

print"USAGE:facerec_demo.py</path/to/images>

[</path/to/store/images/at>]"

sys.exit()

[X,y]=read_images(sys.argv[1])

y=np.asarray(y,dtype=np.int32)

iflen(sys.argv)==3:

out_dir=sys.argv[2]

model=cv2.face.createEigenFaceRecognizer()

model.train(np.asarray(X),np.asarray(y))

camera=cv2.VideoCapture(0)

face_cascade=

cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')

while(True):

read,img=camera.read()

faces=face_cascade.detectMultiScale(img,1.3,5)

for(x,y,w,h)infaces:

img=cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

roi=gray[x:x+w,y:y+h]

try:

roi=cv2.resize(roi,(200,200),

interpolation=cv2.INTER_LINEAR)

params=model.predict(roi)

print"Label:%s,Confidence:%.2f"%(params[0],params[1])

cv2.putText(img,names[params[0]],(x,y-20),

cv2.FONT_HERSHEY_SIMPLEX,1,255,2)

except:

continue

cv2.imshow("camera",img)

ifcv2.waitKey(1000/12)&0xff==ord("q"):

break

cv2.destroyAllWindows()

Thereareafewlinesthatmaylookabitmysterious,solet’sanalyzethescript.Firstofall,there’sanarrayofnamesdeclared;thosearetheactualnamesoftheindividualpeopleIstoredinmydatabaseoffaces.It’sgreattoidentifyapersonasID0,butprinting'Joe'ontopofafacethat’sbeencorrectlydetectedandrecognizedismuchmoredramatic.

SowheneverthescriptrecognizesanID,wewillprintthecorrespondingnameinthenamesarrayinsteadofanID.

Afterthis,weloadtheimagesasdescribedinthepreviousfunction,createthefacerecognitionmodelwithcv2.createEigenFaceRecognizer(),andtrainitbypassingthetwoarraysofimagesandlabels(IDs).NotethattheEigenfacerecognizertakestwoimportantparametersthatyoucanspecify:thefirstoneisthenumberofprincipalcomponentsyouwanttokeepandthesecondisafloatvaluespecifyingaconfidencethreshold.

Page 186: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Nextup,werepeatasimilarprocesstothefacedetectionoperation.Thistime,though,weextendtheprocessingoftheframesbyalsooperatingfacerecognitiononanyfacethat’sbeendetected.

Thishappensintwosteps:firstly,weresizethedetectedfacetotheexpectedsize(inmycase,sampleswere200x200pixels),andthenwecallthepredict()functionontheresizedregion.

NoteThisisabitofasimplifiedprocess,anditservesthepurposeofenablingyoutohaveabasicapplicationrunningandunderstandtheprocessoffacerecognitioninOpenCV3.Inreality,youwillapplyafewmoreoptimizations,suchascorrectlyaligningandrotatingdetectedfaces,sotheaccuracyoftherecognitionismaximized.

Lastly,weobtaintheresultsoftherecognitionand,justforeffect,wedrawitintheframe:

PerformingfacerecognitionwithFisherfacesWhataboutFisherfaces?Theprocessdoesn’tchangemuch;wesimplyneedtoinstantiateadifferentalgorithm.So,thedeclarationofourmodelvariablewouldlooklikeso:

model=cv2.face.createFisherFaceRecognizer()

FisherfacetakesthesametwoargumentsasEigenfaces:theFisherfacestokeepandtheconfidencethreshold.Faceswithconfidenceabovethisthresholdwillbediscarded.

PerformingfacerecognitionwithLBPHFinally,let’stakeaquicklookattheLBPHalgorithm.Again,theprocessisverysimilar.However,theparameterstakenbythealgorithmfactoryareabitmorecomplexasthey

Page 187: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

indicateinorder:radius,neighbors,grid_x,grid_y,andtheconfidencethreshold.Ifyoudon’tspecifythesevalues,theywillautomaticallybesetto1,8,8,8,and123.0.Themodeldeclarationwilllooklikeso:

model=cv2.face.createLBPHFaceRecognizer()

NoteNotethatwithLBPH,youwon’tneedtoresizeimages,asthedivisioningridsallowscomparingpatternsidentifiedineachcell.

DiscardingresultswithconfidencescoreThepredict()methodreturnsatwo-elementarray:thefirstelementisthelabeloftherecognizedindividualandthesecondistheconfidencescore.Allalgorithmscomewiththeoptionofsettingaconfidencescorethreshold,whichmeasuresthedistanceoftherecognizedfacefromtheoriginalmodel,thereforeascoreof0signifiesanexactmatch.

Theremaybecasesinwhichyouwouldratherretainallrecognitions,andthenapplyfurtherprocessing,soyoucancomeupwithyourownalgorithmstoestimatetheconfidencescoreofarecognition;forexample,ifyouaretryingtoidentifypeopleinavideo,youmaywanttoanalyzetheconfidencescoreinsubsequentframestoestablishwhethertherecognitionwassuccessfulornot.Inthiscase,youcaninspecttheconfidencescoreobtainedbythealgorithmanddrawyourownconclusions.

NoteTheconfidencescorevalueiscompletelydifferentinEigenfaces/FisherfacesandLBPH.EigenfacesandFisherfaceswillproducevalues(roughly)intherange0to20,000,withanyscorebelow4-5,000beingquiteaconfidentrecognition.

LBPHworkssimilarly;however,thereferencevalueforagoodrecognitionisbelow50,andanyvalueabove80isconsideredasalowconfidencescore.

Anormalcustomapproachwouldbetohold-offdrawingarectanglearoundarecognizedfaceuntilwehaveanumberofframeswithasatisfyingarbitraryconfidencescore,butyouhavetotalfreedomtouseOpenCV’sfacerecognitionmoduletotailoryourapplicationtoyourneeds.

Page 188: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 189: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryBynow,youshouldhaveagoodunderstandingofhowfacedetectionandfacerecognitionwork,andhowtoimplementtheminPythonandOpenCV3.

Facedetectionandfacerecognitionareconstantlyevolvingbranchesofcomputervision,withalgorithmsbeingdevelopedcontinuously,andtheywillevolveevenfasterinthenearfuturewiththeemphasisposedonroboticsandtheInternetofthings.

Fornow,theaccuracyofdetectionandrecognitionheavilydependsonthequalityofthetrainingdata,somakesureyouprovideyourapplicationswithhigh-qualityfacedatabasesandyouwillbesatisfiedwiththeresults.

Page 190: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 191: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter6.RetrievingImagesandSearchingUsingImageDescriptorsSimilartothehumaneyesandbrain,OpenCVcandetectthemainfeaturesofanimageandextracttheseintoso-calledimagedescriptors.Thesefeaturescanthenbeusedasadatabase,enablingimage-basedsearches.Moreover,wecanusekeypointstostitchimagestogetherandcomposeabiggerimage(thinkofputtingtogethermanypicturestoforma360degreepanorama).

ThischaptershowsyouhowtodetectfeaturesofanimagewithOpenCVandmakeuseofthemtomatchandsearchimages.Throughoutthechapter,wewilltakesampleimagesanddetecttheirmainfeatures,andthentrytofindasampleimagecontainedinanotherimageusinghomography.

Page 192: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

FeaturedetectionalgorithmsThereareanumberofalgorithmsthatcanbeusedtodetectandextractfeatures,andwewillexploremostofthem.ThemostcommonalgorithmsusedinOpenCVareasfollows:

Harris:ThisalgorithmisusefultodetectcornersSIFT:ThisalgorithmisusefultodetectblobsSURF:ThisalgorithmisusefultodetectblobsFAST:ThisalgorithmisusefultodetectcornersBRIEF:ThisalgorithmisusefultodetectblobsORB:ThisalgorithmstandsforOrientedFASTandRotatedBRIEF

Matchingfeaturescanbeperformedwiththefollowingmethods:

Brute-ForcematchingFLANN-basedmatching

Spatialverificationcanthenbeperformedwithhomography.

Page 193: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DefiningfeaturesWhatisafeatureexactly?Whyisaparticularareaofanimageclassifiableasafeature,whileothersarenot?Broadlyspeaking,afeatureisanareaofinterestintheimagethatisuniqueoreasilyrecognizable.Asyoucanimagine,cornersandhigh-densityareasaregoodfeatures,whilepatternsthatrepeatthemselvesalotorlow-densityareas(suchasabluesky)arenot.Edgesaregoodfeaturesastheytendtodividetworegionsofanimage.Ablob(anareaofanimagethatgreatlydiffersfromitssurroundingareas)isalsoaninterestingfeature.

Mostfeaturedetectionalgorithmsrevolvearoundtheidentificationofcorners,edges,andblobs,withsomealsofocusingontheconceptofaridge,whichyoucanconceptualizeasthesymmetryaxisofanelongatedobject(think,forexample,aboutidentifyingaroadinanimage).

Somealgorithmsarebetteratidentifyingandextractingfeaturesofacertaintype,soit’simportanttoknowwhatyourinputimageissothatyoucanutilizethebesttoolinyourOpenCVbelt.

Detectingfeatures–cornersLet’sstartbyidentifyingcornersbyutilizingCornerHarris,andlet’sdothiswithanexample.IfyoucontinuestudyingOpenCVbeyondthisbook,you’llfindthat—formanyareason—chessboardsareacommonsubjectofanalysisincomputervision,partlybecauseachequeredpatternissuitedtomanytypesoffeaturedetections,andmaybebecausechessisprettypopularamonggeeks.

Here’soursampleimage:

Page 194: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

OpenCVhasaveryhandyutilityfunctioncalledcornerHarris,whichdetectscornersinanimage.Thecodetoillustratethisisincrediblysimple:

importcv2

importnumpyasnp

img=cv2.imread('images/chess_board.png')

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

gray=np.float32(gray)

dst=cv2.cornerHarris(gray,2,23,0.04)

img[dst>0.01*dst.max()]=[0,0,255]

while(True):

cv2.imshow('corners',img)

ifcv2.waitKey(1000/12)&0xff==ord("q"):

break

cv2.destroyAllWindows()

Let’sanalyzethecode:aftertheusualimports,weloadthechessboardimageandturnittograyscalesothatcornerHarriscancomputeit.Then,wecallthecornerHarrisfunction:

dst=cv2.cornerHarris(gray,2,23,0.04)

Themostimportantparameterhereisthethirdone,whichdefinestheapertureoftheSobeloperator.TheSobeloperatorperformsthechangedetectionintherowsandcolumnsofanimagetodetectedges,anditdoesthisusingakernel.TheOpenCVcornerHarrisfunctionusesaSobeloperatorwhoseapertureisdefinedbythisparameter.Inplainenglish,itdefineshowsensitivecornerdetectionis.Itmustbebetween3and31andbeanoddvalue.Atvalue3,allthosediagonallinesintheblacksquaresofthechessboardwillregisterascornerswhentheytouchtheborderofthesquare.At23,onlythecornersofeachsquarewillbedetectedascorners.

Considerthefollowingline:

img[dst>0.01*dst.max()]=[0,0,255]

Here,inplaceswherearedmarkinthecornersisdetected,tweakingthesecondparameterincornerHarriswillchangethis,thatis,thesmallerthevalue,thesmallerthemarksindicatingcorners.

Here’sthefinalresult:

Page 195: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Great,wehavecornerpointsmarked,andtheresultismeaningfulatfirstglance;allthecornersaremarkedinred.

Page 196: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

FeatureextractionanddescriptionusingDoGandSIFTTheprecedingtechnique,whichusescornerHarris,isgreattodetectcornersandhasadistinctadvantagebecausecornersarecorners;theyaredetectedeveniftheimageisrotated.

However,ifwereduce(orincrease)thesizeofanimage,somepartsoftheimagemayloseorevengainacornerquality.

Forexample,lookatthefollowingcornerdetectionsoftheF1ItalianGrandPrixtrack:

Here’sasmallerversionofthesamescreenshot:

Youwillnoticehowcornersarealotmorecondensed;however,wedidn’tonlygaincorners,wealsolostsome!Inparticular,lookattheVarianteAscarichicanethatsquigglesattheendoftheNW/SEstraightpartofthetrack.Inthelargerversionoftheimage,boththeentranceandtheapexofthedoublebendweredetectedascorners.Inthe

Page 197: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

smallerimage,theapexisnotdetectedassuch.Themorewereducetheimage,themorelikelyitisthatwe’regoingtolosetheentrancetothatchicanetoo.

Thislossoffeaturesraisesanissue;weneedanalgorithmthatwillworkregardlessofthescaleoftheimage.EnterSIFT:whileScale-InvariantFeatureTransformmaysoundabitmysterious,nowthatweknowwhatproblemwe’retryingtoresolve,itactuallymakessense.Weneedafunction(atransform)thatwilldetectfeatures(afeaturetransform)andwillnotoutputdifferentresultsdependingonthescaleoftheimage(ascale-invariantfeaturetransform).NotethatSIFTdoesnotdetectkeypoints(whichisdonewithDifferenceofGaussians),butitdescribestheregionsurroundingthembymeansofafeaturevector.

AquickintroductiontoDifferenceofGaussians(DoG)isinorder;wehavealreadytalkedaboutlowpassfiltersandblurringoperations,specificallywiththecv2.GaussianBlur()function.DoGistheresultofdifferentGaussianfiltersappliedtothesameimage.InChapter3,ProcessingImageswithOpenCV3,weappliedthistechniquetocomputeaveryefficientedgedetection,andtheideaisthesame.ThefinalresultofaDoGoperationcontainsareasofinterest(keypoints),whicharethengoingtobedescribedthroughSIFT.

Let’sseehowSIFTbehavesinanimagefullofcornersandfeatures:

Now,thebeautifulpanoramaofVarese(Lombardy,Italy)alsogainsacomputervisionmeaning.Here’sthecodeusedtoobtainthisprocessedimage:

importcv2

importsys

importnumpyasnp

imgpath=sys.argv[1]

Page 198: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

img=cv2.imread(imgpath)

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

sift=cv2.xfeatures2d.SIFT_create()

keypoints,descriptor=sift.detectAndCompute(gray,None)

img=cv2.drawKeypoints(image=img,outImage=img,keypoints=keypoints,

flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINT,color=(51,163,236))

cv2.imshow('sift_keypoints',img)

while(True):

ifcv2.waitKey(1000/12)&0xff==ord("q"):

break

cv2.destroyAllWindows()

Aftertheusualimports,weloadtheimagethatwewanttoprocess.Tomakethisscriptgeneric,wewilltaketheimagepathasacommand-lineargumentusingthesysmoduleofPython:

imgpath=sys.argv[1]

img=cv2.imread(imgpath)

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

Wethenturntheimageintograyscale.Atthisstage,youmayhavegatheredthatmostprocessingalgorithmsinPythonneedagrayscalefeedinordertowork.

ThenextstepistocreateaSIFTobjectandcomputethegrayscaleimage:

sift=cv2.xfeatures2d.SIFT_create()

keypoints,descriptor=sift.detectAndCompute(gray,None)

Thisisaninterestingandimportantprocess;theSIFTobjectusesDoGtodetectkeypointsandcomputesafeaturevectorforthesurroundingregionsofeachkeypoint.Asthenameofthemethodclearlygivesaway,therearetwomainoperationsperformed:detectionandcomputation.Thereturnvalueoftheoperationisatuplecontainingkeypointinformation(keypoints)andthedescriptor.

Finally,weprocessthisimagebydrawingthekeypointsonitanddisplayingitwiththeusualimshowfunction.

NotethatinthedrawKeypointsfunction,wepassaflagthathasavalueof4.Thisisactuallythecv2moduleproperty:

cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINT

Thiscodeenablesthedrawingofcirclesandorientationofeachkeypoint.

AnatomyofakeypointLet’stakeaquicklookatthedefinition,fromtheOpenCVdocumentation,ofthekeypointclass:

pt

size

Page 199: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

angle

response

octave

class_id

Somepropertiesaremoreself-explanatorythanothers,butlet’snottakeanythingforgrantedandgothrougheachone:

Thept(point)propertyindicatesthexandycoordinatesofthekeypointintheimage.Thesizepropertyindicatesthediameterofthefeature.Theanglepropertyindicatestheorientationofthefeatureasshownintheprecedingprocessedimage.Theresponsepropertyindicatesthestrengthofthekeypoint.SomefeaturesareclassifiedbySIFTasstrongerthanothers,andresponseisthepropertyyouwouldchecktoevaluatethestrengthofafeature.Theoctavepropertyindicatesthelayerinthepyramidwherethefeaturewasfound.Tofullyexplainthisproperty,wewouldneedtowriteanentirechapteronit,soIwillonlyintroducethebasicconcept.TheSIFTalgorithmoperatesinasimilarfashiontofacedetectionalgorithmsinthat,itprocessesthesameimagesequentiallybutalterstheparametersofthecomputation.

Forexample,thescaleoftheimageandneighboringpixelsareparametersthatchangeateachiteration(octave)ofthealgorithm.So,theoctavepropertyindicatesthelayeratwhichthekeypointwasdetected.

Finally,theobjectIDistheIDofthekeypoint.

Page 200: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

FeatureextractionanddetectionusingFastHessianandSURFComputervisionisarelativelyrecentbranchofcomputerscienceandmanyalgorithmsandtechniquesareofrecentinvention.SIFTisinfactonly16yearsold,havingbeenpublishedbyDavidLowein1999.

SURFisafeaturedetectionalgorithmpublishedin2006byHerbertBay,whichisseveraltimesfasterthanSIFT,anditispartiallyinspiredbyit.

NoteNotethatbothSIFTandSURFarepatentedalgorithmsand,forthisreason,aremadeavailableinthexfeatures2dmoduleofOpenCV.

ItisnotparticularlyrelevanttothisbooktounderstandhowSURFworksunderthehood,asmuchaswecanuseitinourapplicationsandmakethebestofit.WhatisimportanttounderstandisthatSURFisanOpenCVclassthatoperateskeypointdetectionwiththeFastHessianalgorithmandextractionwithSURF,muchliketheSIFTclassinOpenCVoperatingkeypointdetectionwithDoGandextractionwithSIFT.

Also,thegoodnewsisthatasafeaturedetectionalgorithm,theAPIofSURFdoesnotdifferfromSIFT.Therefore,wecansimplyeditthepreviousscripttodynamicallychooseafeaturedetectionalgorithminsteadofrewritingtheentireprogram.

Asweonlysupporttwoalgorithmsfornow,thereisnoneedtofindaparticularlyelegantsolutiontotheevaluationofthealgorithmtobeusedandwe’llusethesimpleifblocks,asshowninthefollowingcode:

importcv2

importsys

importnumpyasnp

imgpath=sys.argv[1]

img=cv2.imread(imgpath)

alg=sys.argv[2]

deffd(algorithm):

ifalgorithm=="SIFT":

returncv2.xfeatures2d.SIFT_create()

ifalgorithm=="SURF":

returncv2.xfeatures2d.SURF_create(float(sys.argv[3])iflen(sys.argv)

==4else4000)

gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

fd_alg=fd(alg)

keypoints,descriptor=fd_alg.detectAndCompute(gray,None)

img=cv2.drawKeypoints(image=img,outImage=img,keypoints=keypoints,

flags=4,color=(51,163,236))

Page 201: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

cv2.imshow('keypoints',img)

while(True):

ifcv2.waitKey(1000/12)&0xff==ord("q"):

break

cv2.destroyAllWindows()

Here’stheresultusingSURFwithathreshold:

ThisimagehasbeenobtainedbyprocessingitwithaSURFalgorithmusingaHessianthresholdof8000.Tobeprecise,Iranthefollowingcommand:

>pythonfeat_det.pyimages/varese.jpgSURF8000

Thehigherthethreshold,thelessfeaturesidentified,soplayaroundwiththevaluesuntilyoureachanoptimaldetection.Intheprecedingcase,youcanclearlyseehowindividualbuildingsaredetectedasfeatures.

InaprocesssimilartotheoneweadoptedinChapter4,DepthEstimationandSegmentation,whenwewerecalculatingdisparitymaps,try—asanexercise—tocreateatrackbartofeedthevalueoftheHessianthresholdtotheSURFinstance,andseethenumberoffeaturesincreaseanddecreaseinaninverselyproportionalfashion.

Now,let’sexaminecornerdetectionwithFAST,theBRIEFkeypointdescriptor,andORB(whichusesthetwo)andputfeaturedetectiontogooduse.

Page 202: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ORBfeaturedetectionandfeaturematchingIfSIFTisyoung,andSURFyounger,ORBisinitsinfancy.ORBwasfirstpublishedin2011asafastalternativetoSIFTandSURF.

Thealgorithmwaspublishedinthepaper,ORB:anefficientalternativetoSIFTorSURF,andisavailableinthePDFformatathttp://www.vision.cs.chubu.ac.jp/CV-R/pdf/Rublee_iccv2011.pdf.

ORBmixestechniquesusedintheFASTkeypointdetectionandtheBRIEFdescriptor,soitisdefinitelyworthtakingaquicklookatFASTandBRIEFfirst.WewillthentalkaboutBrute-Forcematching—oneofthealgorithmsusedforfeaturematching—andshowanexampleoffeaturematching.

FASTTheFeaturesfromAcceleratedSegmentTest(FAST)algorithmworksinacleverway;itdrawsacirclearoundincluding16pixels.Itthenmarkseachpixelbrighterordarkerthanaparticularthresholdcomparedtothecenterofthecircle.Acornerisdefinedbyidentifyinganumberofcontiguouspixelsmarkedasbrighterordarker.

FASTimplementsahigh-speedtest,whichattemptsatquicklyskippingthewhole16-pixeltest.Tounderstandhowthistestworks,let’stakealookatthisscreenshot:

Asyoucansee,threeoutoffourofthetestpixels(pixelsnumber1,9,5,and13)mustbewithin(orbeyond)thethreshold(and,therefore,markedasbrighterordarker)andonemustbeintheoppositesideofthethreshold.Ifallfouraremarkedasbrighterordarker,ortwoareandtwoarenot,thepixelisnotacandidatecorner.

FASTisanincrediblycleveralgorithm,butnotdevoidofweaknesses,andtocompensatetheseweaknesses,developersanalyzingimagescanimplementamachinelearningapproach,feedingasetofimages(relevanttoyourapplication)tothealgorithmsothatcornerdetectionisoptimized.

Despitethis,FASTdependsonathreshold,sothedeveloper’sinputisalwaysnecessary(unlikeSIFT).

Page 203: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

BRIEFBinaryRobustIndependentElementaryFeatures(BRIEF),ontheotherhand,isnotafeaturedetectionalgorithm,butadescriptor.Wehavenotyetexploredthisconcept,solet’sexplainwhatadescriptoris,andthenlookatBRIEF.

YouwillnoticethatwhenwepreviouslyanalyzedimageswithSIFTandSURF,theheartoftheentireprocessisthecalltothedetectAndComputefunction.Thisfunctionoperatestwodifferentsteps:detectionandcomputation,andtheyreturntwodifferentresultsifcoupledinatuple.

Theresultofdetectionisasetofkeypoints;theresultofthecomputationisthedescriptor.ThismeansthattheOpenCV’sSIFTandSURFclassesarebothdetectorsanddescriptors(although,rememberthattheoriginalalgorithmsarenot!OpenCV’sSIFTisreallyDoGplusSIFTandOpenCV’sSURFisreallyFastHessianplusSURF).

Keypointdescriptorsarearepresentationoftheimagethatservesasthegatewaytofeaturematchingbecauseyoucancomparethekeypointdescriptorsoftwoimagesandfindcommonalities.

BRIEFisoneofthefastestdescriptorscurrentlyavailable.ThetheorybehindBRIEFisactuallyquitecomplicated,butsufficetosaythatBRIEFadoptsaseriesofoptimizationsthatmakeitaverygoodchoiceforfeaturematching.

Brute-ForcematchingABrute-Forcematcherisadescriptormatcherthatcomparestwodescriptorsandgeneratesaresultthatisalistofmatches.Thereasonwhyit’scalledBrute-Forceisthatthereislittleoptimizationinvolvedinthealgorithm;allthefeaturesinthefirstdescriptorarecomparedtothefeaturesintheseconddescriptor,andeachcomparisonisgivenadistancevalueandthebestresultisconsideredamatch.

Thisiswhyit’scalledBrute-Force.Incomputing,theterm,brute-force,isoftenassociatedwithanapproachthatprioritizestheexhaustionofallpossiblecombinations(forexample,allthepossiblecombinationsofcharacterstocrackapassword)oversomecleverandconvolutedalgorithmicallogic.OpenCVprovidesaBFMatcherobjectthatdoesjustthat.

Page 204: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

FeaturematchingwithORBNowthatwehaveageneralideaofwhatFASTandBRIEFare,wecanunderstandwhytheteambehindORB(atthetimecomposedbyEthanRublee,VincentRabaud,KurtKonolige,andGaryR.Bradski)chosethesetwoalgorithmsasafoundationforORB.

Intheirpaper,theauthorsaimatachievingthefollowingresults:

TheadditionofafastandaccurateorientationcomponenttoFASTTheefficientcomputationoforientedBRIEFfeaturesAnalysisofvarianceandcorrelationoforientedBRIEFfeaturesAlearningmethodtodecorrelateBRIEFfeaturesunderrotationalinvariance,leadingtobetterperformanceinnearest-neighborapplications.

Asidefromverytechnicaljargon,themainpointsarequiteclear;ORBaimsatoptimizingandspeedingupoperations,includingtheveryimportantstepofutilizingBRIEFinarotation-awarefashionsothatmatchingisimprovedeveninsituationswhereatrainingimagehasaverydifferentrotationtothequeryimage.

Atthisstage,though,Ibetyouhavehadenoughofthetheoryandwanttosinkyourteethinsomefeaturematching,solet’sgolookatsomecode.

Asanavidlistenerofmusic,thefirstexamplethatcomestomymindistogetthelogoofabandandmatchittooneoftheband’salbums:

importnumpyasnp

importcv2

frommatplotlibimportpyplotasplt

img1=cv2.imread('images/manowar_logo.png',cv2.IMREAD_GRAYSCALE)

img2=cv2.imread('images/manowar_single.jpg',cv2.IMREAD_GRAYSCALE)

orb=cv2.ORB_create()

kp1,des1=orb.detectAndCompute(img1,None)

kp2,des2=orb.detectAndCompute(img2,None)

bf=cv2.BFMatcher(cv2.NORM_HAMMING,crossCheck=True)

matches=bf.match(des1,des2)

matches=sorted(matches,key=lambdax:x.distance)

img3=cv2.drawMatches(img1,kp1,img2,kp2,matches[:40],img2,flags=2)

plt.imshow(img3),plt.show()

Let’snowexaminethiscodestepbystep.

Aftertheusualimports,weloadtwoimages(thequeryimageandthetrainingimage).

Notethatyouhaveprobablyseentheloadingofimageswithasecondparameterwiththevalueof0beingpassed.Thisisbecausecv2.imreadtakesasecondparameterthatcanbeoneofthefollowingflags:

IMREAD_ANYCOLOR=4

IMREAD_ANYDEPTH=2

IMREAD_COLOR=1

IMREAD_GRAYSCALE=0

IMREAD_LOAD_GDAL=8

Page 205: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

IMREAD_UNCHANGED=-1

Asyoucansee,cv2.IMREAD_GRAYSCALEisequalto0,soyoucanpasstheflagitselforitsvalue;theyarethesamething.

Thisistheimagewe’veloaded:

Thisisanotherimagethatwe’veloaded:

Now,weproceedtocreatingtheORBfeaturedetectoranddescriptor:

orb=cv2.ORB_create()

kp1,des1=orb.detectAndCompute(img1,None)

kp2,des2=orb.detectAndCompute(img2,None)

InasimilarfashiontowhatwedidwithSIFTandSURF,wedetectandcomputethe

Page 206: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

keypointsanddescriptorsforbothimages.

Thetheoryatthispointisprettysimple;iteratethroughthedescriptorsanddeterminewhethertheyareamatchornot,andthencalculatethequalityofthismatch(distance)andsortthematchessothatwecandisplaythetopnmatcheswithadegreeofconfidencethattheyare,infact,matchingfeaturesonbothimages.

BFMatcher,asdescribedinBrute-Forcematching,doesthisforus:

bf=cv2.BFMatcher(cv2.NORM_HAMMING,crossCheck=True)

matches=bf.match(des1,des2)

matches=sorted(matches,key=lambdax:x.distance)

Atthisstage,wealreadyhavealltheinformationweneed,butascomputervisionenthusiasts,weplacequiteabitofimportanceonvisuallyrepresentingdata,solet’sdrawthesematchesinamatplotlibchart:

img3=cv2.drawMatches(img1,kp1,img2,kp2,matches[:40],img2,flags=2)

plt.imshow(img3),plt.show()

Theresultisasfollows:

Page 207: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

UsingK-NearestNeighborsmatchingThereareanumberofalgorithmsthatcanbeusedtodetectmatchessothatwecandrawthem.OneofthemisK-NearestNeighbors(KNN).Usingdifferentalgorithmsfordifferenttaskscanbereallybeneficial,becauseeachalgorithmhasstrengthsandweaknesses.Somemaybemoreaccuratethanothers,somemaybefasterorlesscomputationallyexpensive,soit’suptoyoutodecidewhichonetouse,dependingonthetaskathand.

Forexample,ifyouhavehardwareconstraints,youmaychooseanalgorithmthatislesscostly.Ifyou’redevelopingareal-timeapplication,youmaychoosethefastestalgorithm,regardlessofhowheavyitisontheprocessorormemoryusage.

Amongallthemachinelearningalgorithms,KNNisprobablythesimplest,andwhilethetheorybehinditisinteresting,itiswelloutofthescopeofthisbook.Instead,wewillsimplyshowyouhowtouseKNNinyourapplication,whichisnotverydifferentfromtheprecedingexample.

Crucially,thetwopointswherethescriptdifferstoswitchtoKNNareinthewaywecalculatematcheswiththeBrute-Forcematcher,andthewaywedrawthesematches.Theprecedingexample,whichhasbeeneditedtouseKNN,lookslikethis:

importnumpyasnp

importcv2

frommatplotlibimportpyplotasplt

img1=cv2.imread('images/manowar_logo.png',0)

img2=cv2.imread('images/manowar_single.jpg',0)

orb=cv2.ORB_create()

kp1,des1=orb.detectAndCompute(img1,None)

kp2,des2=orb.detectAndCompute(img2,None)

bf=cv2.BFMatcher(cv2.NORM_HAMMING,crossCheck=True)

matches=bf.knnMatch(des1,des2,k=2)

img3=cv2.drawMatchesKnn(img1,kp1,img2,kp2,matches,img2,flags=2)

plt.imshow(img3),plt.show()

Thefinalresultissomewhatsimilartothepreviousone,sowhatisthedifferencebetweenmatchandknnMatch?Thedifferenceisthatmatchreturnsbestmatches,whileKNNreturnskmatches,givingthedevelopertheoptiontofurthermanipulatethematchesobtainedwithknnMatch.

Forexample,youcoulditeratethroughthematchesandapplyaratiotestsothatyoucanfilteroutmatchesthatdonotsatisfyauser-definedcondition.

Page 208: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

FLANN-basedmatchingFinally,wearegoingtotakealookatFastLibraryforApproximateNearestNeighbors(FLANN).TheofficialInternethomeofFLANNisathttp://www.cs.ubc.ca/research/flann/.

LikeORB,FLANNhasamorepermissivelicensethanSIFTorSURF,soyoucanfreelyuseitinyourproject.QuotingthewebsiteofFLANN,

“FLANNisalibraryforperformingfastapproximatenearestneighborsearchesinhighdimensionalspaces.Itcontainsacollectionofalgorithmswefoundtoworkbestfornearestneighborsearchandasystemforautomaticallychoosingthebestalgorithmandoptimumparametersdependingonthedataset.

FLANNiswritteninC++andcontainsbindingsforthefollowinglanguages:C,MATLABandPython.”

Inotherwords,FLANNpossessesaninternalmechanismthatattemptsatemployingthebestalgorithmtoprocessadatasetdependingonthedataitself.FLANNhasbeenproventobe10timestimesfasterthanothernearestneighborssearchsoftware.

FLANNisevenavailableonGitHubathttps://github.com/mariusmuja/flann.Inmyexperience,I’vefoundFLANN-basedmatchingtobeveryaccurateandfastaswellasfriendlytouse.

Let’slookatanexampleofFLANN-basedfeaturematching:

importnumpyasnp

importcv2

frommatplotlibimportpyplotasplt

queryImage=cv2.imread('images/bathory_album.jpg',0)

trainingImage=cv2.imread('images/vinyls.jpg',0)

#createSIFTanddetect/compute

sift=cv2.xfeatures2d.SIFT_create()

kp1,des1=sift.detectAndCompute(queryImage,None)

kp2,des2=sift.detectAndCompute(trainingImage,None)

#FLANNmatcherparameters

FLANN_INDEX_KDTREE=0

indexParams=dict(algorithm=FLANN_INDEX_KDTREE,trees=5)

searchParams=dict(checks=50)#orpassemptydictionary

flann=cv2.FlannBasedMatcher(indexParams,searchParams)

matches=flann.knnMatch(des1,des2,k=2)

#prepareanemptymasktodrawgoodmatches

matchesMask=[[0,0]foriinxrange(len(matches))]

#DavidG.Lowe'sratiotest,populatethemask

fori,(m,n)inenumerate(matches):

Page 209: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ifm.distance<0.7*n.distance:

matchesMask[i]=[1,0]

drawParams=dict(matchColor=(0,255,0),

singlePointColor=(255,0,0),

matchesMask=matchesMask,

flags=0)

resultImage=

cv2.drawMatchesKnn(queryImage,kp1,trainingImage,kp2,matches,None,**drawPara

ms)

plt.imshow(resultImage,),plt.show()

Somepartsoftheprecedingscriptwillbefamiliartoyouatthisstage(importofmodules,imageloading,andcreationofaSIFTobject).

NoteTheinterestingpartisthedeclarationoftheFLANNmatcher,whichfollowsthedocumentationathttp://www.cs.ubc.ca/~mariusm/uploads/FLANN/flann_manual-1.6.pdf.

WefindthattheFLANNmatchertakestwoparameters:anindexParamsobjectandasearchParamsobject.Theseparameters,passedinadictionaryforminPython(andastructinC++),determinethebehavioroftheindexandsearchobjectsusedinternallybyFLANNtocomputethematches.

Inthiscase,wecouldhavechosenbetweenLinearIndex,KTreeIndex,KMeansIndex,CompositeIndex,andAutotuneIndex,andwechoseKTreeIndex.Why?Thisisbecauseit’sasimpleenoughindextoconfigure(onlyrequirestheusertospecifythenumberofkerneldensitytreestobeprocessed;agoodvalueisbetween1and16)andcleverenough(thekd-treesareprocessedinparallel).ThesearchParamsdictionaryonlycontainsonefield(checks)thatspecifiesthenumberoftimesanindextreeshouldbetraversed.Thehigherthevalue,thelongerittakestocomputethematching,butitwillalsobemoreaccurate.

Inreality,alotdependsontheinputthatyoufeedtheprogramwith.I’vefoundthat5kd-treesand50checksalwaysyieldarespectablyaccurateresult,whileonlytakingashorttimetocomplete.

AfterthecreationoftheFLANNmatcherandhavingcreatedthematchesarray,matchesarethenfilteredaccordingtothetestdescribedbyLoweinhispaper,DistinctiveImageFeaturesfromScale-InvariantKeypoints,availableathttps://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf.

Initschapter,Applicationtoobjectrecognition,Loweexplainsthatnotallmatchesare“good”ones,andthatfilteringaccordingtoanarbitrarythresholdwouldnotyieldgoodresultsallthetime.Instead,Dr.Loweexplains,

“Theprobabilitythatamatchiscorrectcanbedeterminedbytakingtheratioofdistancefromtheclosestneighbortothedistanceofthesecondclosest.”

Page 210: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Intheprecedingexample,discardinganyvaluewithadistancegreaterthan0.7willresultinjustafewgoodmatchesbeingfilteredout,whilegettingridofaround90percentoffalsematches.

Let’sunveiltheresultofapracticalexampleofFLANN.ThisisthequeryimagethatI’vefedthescript:

Thisisthetrainingimage:

Here,youmaynoticethattheimagecontainsthequeryimageatposition(5,3)ofthisgrid.

ThisistheFLANNprocessedresult:

Page 211: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Aperfectmatch!!

Page 212: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

FLANNmatchingwithhomographyFirstofall,whatishomography?Let’sreadadefinitionfromtheInternet:

“Arelationbetweentwofigures,suchthattoanypointoftheonecorrespondsoneandbutonepointintheother,andviseversa.Thus,atangentlinerollingonacirclecutstwofixedtangentsofthecircleintwosetsofpointsthatarehomographic.”

Ifyou—likeme—arenonethewiserfromtheprecedingdefinition,youwillprobablyfindthisexplanationabitclearer:homographyisaconditioninwhichtwofiguresfindeachotherwhenoneisaperspectivedistortionoftheother.

Unlikeallthepreviousexamples,let’sfirsttakealookatwhatwewanttoachievesothatwecanfullyunderstandwhathomographyis.Then,we’llgothroughthecode.Here’sthefinalresult:

Asyoucanseefromthescreenshot,wetookasubjectontheleft,correctlyidentifiedintheimageontheright-handside,drewmatchinglinesbetweenkeypoints,andevendrewawhitebordershowingtheperspectivedeformationoftheseedsubjectintheright-handsideoftheimage:

importnumpyasnp

importcv2

frommatplotlibimportpyplotasplt

MIN_MATCH_COUNT=10

img1=cv2.imread('images/bb.jpg',0)

img2=cv2.imread('images/color2_small.jpg',0)

sift=cv2.xfeatures2d.SIFT_create()

kp1,des1=sift.detectAndCompute(img1,None)

kp2,des2=sift.detectAndCompute(img2,None)

FLANN_INDEX_KDTREE=0

index_params=dict(algorithm=FLANN_INDEX_KDTREE,trees=5)

Page 213: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

search_params=dict(checks=50)

flann=cv2.FlannBasedMatcher(index_params,search_params)

matches=flann.knnMatch(des1,des2,k=2)

#storeallthegoodmatchesasperLowe'sratiotest.

good=[]

form,ninmatches:

ifm.distance<0.7*n.distance:

good.append(m)

iflen(good)>MIN_MATCH_COUNT:

src_pts=np.float32([kp1[m.queryIdx].ptformingood

]).reshape(-1,1,2)

dst_pts=np.float32([kp2[m.trainIdx].ptformingood

]).reshape(-1,1,2)

M,mask=cv2.findHomography(src_pts,dst_pts,cv2.RANSAC,5.0)

matchesMask=mask.ravel().tolist()

h,w=img1.shape

pts=np.float32([[0,0],[0,h-1],[w-1,h-1],[w-1,0]]).reshape(-1,1,2)

dst=cv2.perspectiveTransform(pts,M)

img2=cv2.polylines(img2,[np.int32(dst)],True,255,3,cv2.LINE_AA)

else:

print"Notenoughmatchesarefound-%d/%d"%

(len(good),MIN_MATCH_COUNT)

matchesMask=None

draw_params=dict(matchColor=(0,255,0),#drawmatchesingreencolor

singlePointColor=None,

matchesMask=matchesMask,#drawonlyinliers

flags=2)

img3=cv2.drawMatches(img1,kp1,img2,kp2,good,None,**draw_params)

plt.imshow(img3,'gray'),plt.show()

WhencomparedtothepreviousFLANN-basedmatchingexample,theonlydifference(andthisiswherealltheactionhappens)isintheifblock.

Here’swhathappensinthiscodestepbystep:firstly,wemakesurethatwehaveatleastacertainnumberofgoodmatches(theminimumrequiredtocomputeahomographyisfour),whichwewillarbitrarilysetat10(inreallife,youwouldprobablyuseahighervaluethanthis):

iflen(good)>MIN_MATCH_COUNT:

Then,wefindthekeypointsintheoriginalimageandthetrainingimage:

src_pts=np.float32([kp1[m.queryIdx].ptformingood]).reshape(-1,1,2)

dst_pts=np.float32([kp2[m.trainIdx].ptformingood]).reshape(-1,1,2)

Page 214: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Now,wefindthehomography:

M,mask=cv2.findHomography(src_pts,dst_pts,cv2.RANSAC,5.0)

matchesMask=mask.ravel().tolist()

NotethatwecreatematchesMask,whichwillbeusedinthefinaldrawingofthematchessothatonlypointslyingwithinthehomographywillhavematchinglinesdrawn.

Atthisstage,wesimplyhavetocalculatetheperspectivedistortionoftheoriginalobjectintothesecondpicturesothatwecandrawtheborder:

h,w=img1.shape

pts=np.float32([[0,0],[0,h-1],[w-1,h-1],[w-1,0]]).reshape(-1,1,2)

dst=cv2.perspectiveTransform(pts,M)

img2=cv2.polylines(img2,[np.int32(dst)],True,255,3,cv2.LINE_AA)

Andwethenproceedtodrawasperallourpreviousexamples.

Page 215: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Asampleapplication–tattooforensicsLet’sconcludethischapterwithareal-life(orkindof)example.Imaginethatyou’reworkingfortheGothamforensicsdepartmentandyouneedtoidentifyatattoo.Youhavetheoriginalpictureofthetattoo(imaginethiscomingfromaCCTVfootage)belongingtoacriminal,butyoudon’tknowtheidentityoftheperson.However,youpossessadatabaseoftattoos,indexedwiththenameofthepersontowhomthetattoobelongs.

So,let’sdividethetaskintwoparts:saveimagedescriptorstofilesfirst,andthen,scantheseformatchesagainstthepictureweareusingasaqueryimage.

SavingimagedescriptorstofileThefirstthingwewilldoissaveimagedescriptorstoanexternalfile.Thisissothatwedon’thavetorecreatethedescriptorseverytimewewanttoscantwoimagesformatchesandhomography.

Inourapplication,wewillscanafolderforimagesandcreatethecorrespondingdescriptorfilessothatwehavethemreadilyavailableforfuturesearches.

Tocreatedescriptorsandsavethemtoafile,wewilluseaprocesswehaveusedanumberoftimesinthischapter,namelyloadanimage,createafeaturedetector,detect,andcompute:

#generate_descriptors.py

importcv2

importnumpyasnp

fromosimportwalk

fromos.pathimportjoin

importsys

defcreate_descriptors(folder):

files=[]

for(dirpath,dirnames,filenames)inwalk(folder):

files.extend(filenames)

forfinfiles:

save_descriptor(folder,f,cv2.xfeatures2d.SIFT_create())

defsave_descriptor(folder,image_path,feature_detector):

img=cv2.imread(join(folder,image_path),0)

keypoints,descriptors=feature_detector.detectAndCompute(img,None)

descriptor_file=image_path.replace("jpg","npy")

np.save(join(folder,descriptor_file),descriptors)

dir=sys.argv[1]

create_descriptors(dir)

Inthisscript,wepassthefoldernamewhereallourimagesarecontained,andthencreateallthedescriptorfilesinthesamefolder.

NumPyhasaveryhandysave()utility,whichdumpsarraydataintoafileinanoptimizedway.Togeneratethedescriptorsinthefoldercontainingyourscript,runthis

Page 216: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

command:

>pythongenerate_descriptors.py<foldercontainingimages>

NotethatcPickle/picklearemorepopularlibrariesforPythonobjectserialization.However,inthisparticularcontext,wearetryingtolimitourselvestotheusageofOpenCVandPythonwithNumPyandSciPy.

ScanningformatchesNowthatwehavedescriptorssavedtofiles,allweneedtodoistorepeatthehomographyprocessonallthedescriptorsandfindapotentialmatchtoourqueryimage.

Thisistheprocesswewillputinplace:

Loadingaqueryimageandcreatingadescriptorforit(tattoo_seed.jpg)ScanningthefolderwithdescriptorsForeachdescriptor,computingaFLANN-basedmatchIfthenumberofmatchesisbeyondanarbitrarythreshold,includingthefileofpotentialculprits(rememberwe’reinvestigatingacrime)Ofalltheculprits,electingtheonewiththehighestnumberofmatchesasthepotentialsuspect

Let’sinspectthecodetoachievethis:

fromos.pathimportjoin

fromosimportwalk

importnumpyasnp

importcv2

fromsysimportargv

#createanarrayoffilenames

folder=argv[1]

query=cv2.imread(join(folder,"tattoo_seed.jpg"),0)

#createfiles,images,descriptorsglobals

files=[]

images=[]

descriptors=[]

for(dirpath,dirnames,filenames)inwalk(folder):

files.extend(filenames)

forfinfiles:

iff.endswith("npy")andf!="tattoo_seed.npy":

descriptors.append(f)

printdescriptors

#createthesiftdetector

sift=cv2.xfeatures2d.SIFT_create()

query_kp,query_ds=sift.detectAndCompute(query,None)

#createFLANNmatcher

FLANN_INDEX_KDTREE=0

index_params=dict(algorithm=FLANN_INDEX_KDTREE,trees=5)

search_params=dict(checks=50)

flann=cv2.FlannBasedMatcher(index_params,search_params)

Page 217: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

#minimumnumberofmatches

MIN_MATCH_COUNT=10

potential_culprits={}

print">>Initiatingpicturescan…"

fordindescriptors:

print"---------analyzing%sformatches------------"%d

matches=flann.knnMatch(query_ds,np.load(join(folder,d)),k=2)

good=[]

form,ninmatches:

ifm.distance<0.7*n.distance:

good.append(m)

iflen(good)>MIN_MATCH_COUNT:

print"%sisamatch!(%d)"%(d,len(good))

else:

print"%sisnotamatch"%d

potential_culprits[d]=len(good)

max_matches=None

potential_suspect=None

forculprit,matchesinpotential_culprits.iteritems():

ifmax_matches==Noneormatches>max_matches:

max_matches=matches

potential_suspect=culprit

print"potentialsuspectis%s"%potential_suspect.replace("npy",

"").upper()

Isavedthisscriptasscan_for_matches.py.Theonlyelementofnoveltyinthisscriptistheuseofnumpy.load(filename),whichloadsannpyfileintoannparray.

Runningthescriptproducesthefollowingoutput:

>>Initiatingpicturescan…

---------analyzingposion-ivy.npyformatches------------

posion-ivy.npyisnotamatch

---------analyzingbane.npyformatches------------

bane.npyisnotamatch

---------analyzingtwo-face.npyformatches------------

two-face.npyisnotamatch

---------analyzingriddler.npyformatches------------

riddler.npyisnotamatch

---------analyzingpenguin.npyformatches------------

penguin.npyisnotamatch

---------analyzingdr-hurt.npyformatches------------

dr-hurt.npyisamatch!(298)

---------analyzinghush.npyformatches------------

hush.npyisamatch!(301)

potentialsuspectisHUSH.

Ifweweretorepresentthisgraphically,thisiswhatwewouldsee:

Page 218: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 219: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 220: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryInthischapter,welearnedaboutdetectingfeaturesinimagesandextractingthemintodescriptors.WeexploredanumberofalgorithmsavailableinOpenCVtoaccomplishthistask,andthenappliedthemtoreal-lifescenariostounderstandareal-worldapplicationoftheconceptsweexplored.

Wearenowfamiliarwiththeconceptofdetectingfeaturesinanimage(oravideoframe),whichisagoodfoundationforthenextchapter.

Page 221: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 222: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter7.DetectingandRecognizingObjectsThischapterwillintroducetheconceptofdetectingandrecognizingobjects,whichisoneofthemostcommonchallengesincomputervision.You’vecomethisfarinthebook,soatthisstage,you’rewonderinghowfarareyoufrommountingacomputerinyourcarthatwillgiveyouinformationaboutcarsandpeoplesurroundingyouthroughtheuseofacamera.Well,You’renottoofarfromyourgoal,actually.

Inthischapter,wewillexpandontheconceptofobjectdetection,whichweinitiallyexploredwhentalkingaboutrecognizingfaces,andadaptittoallsortsofreal-lifeobjects,notjustfaces.

Page 223: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ObjectdetectionandrecognitiontechniquesWemadeadistinctioninChapter5,DetectingandRecognizingFaces,whichwe’llreiterateforclarity:detectinganobjectistheabilityofaprogramtodetermineifacertainregionofanimagecontainsanunidentifiedobject,andrecognizingistheabilityofaprogramtoidentifythisobject.Recognizingnormallyonlyoccursinareasofinterestwhereanobjecthasbeendetected,forexample,wehaveattemptedtorecognizefacesontheareasofanimagethatcontainedafaceinthefirstplace.

Whenitcomestorecognizinganddetectingobjects,thereareanumberoftechniquesusedincomputervision,whichwe’llbeexamining:

HistogramofOrientedGradientsImagepyramidsSlidingwindows

Unlikefeaturedetectionalgorithms,thesearenotmutuallyexclusivetechniques,rather,theyarecomplimentary.YoucanperformaHistogramofOrientedGradients(HOG)whileapplyingtheslidingwindowstechnique.

So,let’stakealookatHOGfirstandunderstandwhatitis.

Page 224: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

HOGdescriptorsHOGisafeaturedescriptor,soitbelongstothesamefamilyofalgorithms,suchasSIFT,SURF,andORB.

Itisusedinimageandvideoprocessingtodetectobjects.Itsinternalmechanismisreallyclever;animageisdividedintoportionsandagradientforeachportioniscalculated.We’veobservedasimilarapproachwhenwetalkedaboutfacerecognitionthroughLBPH.

HOG,however,calculateshistogramsthatarenotbasedoncolorvalues,rather,theyarebasedongradients.AsHOGisafeaturedescriptor,itiscapableofdeliveringthetypeofinformationthatisvitalforfeaturematchingandobjectdetection/recognition.

BeforedivingintothetechnicaldetailsofhowHOGworks,let’sfirsttakealookathowHOGseestheworld;hereisanimageofatruck:

ThisisitsHOGversion:

Page 225: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Youcaneasilyrecognizethewheelsandthemainstructureofthevehicle.So,whatisHOGseeing?Firstofall,youcanseehowtheimageisdividedintocells;theseare16x16pixelscells.Eachcellcontainsavisualrepresentationofthecalculatedgradientsofcolorineightdirections(N,NW,W,SW,S,SE,E,andNE).

Theseeightvaluescontainedineachcellarethefamoushistograms.Therefore,asinglecellgetsauniquesignature,whichyoucanmentallyvisualizetobesomewhatlikethis:

Theextrapolationofhistogramsintodescriptorsisquiteacomplexprocess.First,localhistogramsforeachcellarecalculated.Thecellsaregroupedintolargerregionscalledblocks.Theseblockscanbemadeofanynumberofcells,butDalalandTriggsfoundthat2x2cellblocksyieldedthebestresultswhenperformingpeopledetection.Ablock-widevectoriscreatedsothatitcanbenormalized,accountingforvariationsinilluminationandshadowing(asinglecellistoosmallaregiontodetectsuchvariations).Thisimprovestheaccuracyofdetectionasitreducestheilluminationandshadowingdifferencebetweenthesampleandtheblockbeingexamined.

Simplycomparingcellsintwoimageswouldnotworkunlesstheimagesareidentical(bothintermsofsizeanddata).

Therearetwomainproblemstoresolve:

LocationScale

ThescaleissueImagine,forexample,ifyoursamplewasadetail(say,abike)extrapolatedfromalargerimage,andyou’retryingtocomparethetwopictures.Youwouldnotobtainthesamegradientsignaturesandthedetectionwouldfail(eventhoughthebikeisinbothpictures).

ThelocationissueOncewe’veresolvedthescaleproblem,wehaveanotherobstacleinourpath:apotentiallydetectableobjectcanbeanywhereintheimage,soweneedtoscantheentireimageinportionstomakesurewecanidentifyareasofinterest,andwithintheseareas,trytodetectobjects.Evenifasampleimageandobjectintheimageareofidenticalsize,thereneedstobeawaytoinstructOpenCVtolocatethisobject.So,therestoftheimageisdiscardedandacomparisonismadeonpotentiallymatchingregions.

Toobviatetheseproblems,weneedtofamiliarizeourselveswiththeconceptsofimagepyramidandslidingwindows.

Imagepyramid

Manyofthealgorithmsusedincomputervisionutilizeaconceptcalledpyramid.

Page 226: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Animagepyramidisamultiscalerepresentationofanimage.Thisdiagramshouldhelpyouunderstandthisconcept:

Amultiscalerepresentationofanimage,oranimagepyramid,helpsyouresolvetheproblemofdetectingobjectsatdifferentscales.Theimportanceofthisconceptiseasilyexplainedthroughreal-lifehardfacts,suchasitisextremelyunlikelythatanobjectwillappearinanimageattheexactscaleitappearedinoursampleimage.

Moreover,youwilllearnthatobjectclassifiers(utilitiesthatallowyoutodetectobjectsinOpenCV)needtraining,andthistrainingisprovidedthroughimagedatabasesmadeupofpositivematchesandnegativematches.Amongthepositives,itisagainunlikelythattheobjectwewanttoidentifywillappearinthesamescalethroughoutthetrainingdataset.

We’vegotit,Joe.Weneedtotakescaleoutoftheequation,sonowlet’sexaminehowanimagepyramidisbuilt.

Animagepyramidisbuiltthroughthefollowingprocess:

1. Takeanimage.2. Resize(smaller)theimageusinganarbitraryscaleparameter.3. Smoothentheimage(usingGaussianblurring).4. Iftheimageislargerthananarbitraryminimumsize,repeattheprocessfromstep1.

Despiteexploringimagepyramids,scaleratio,andminimumsizesonlyatthisstageofthebook,you’vealreadydealtwiththem.IfyourecallChapter5,DetectingandRecognizingFaces,weusedthedetectMultiScalemethodoftheCascadeClassifierobject.

Straightaway,detectMultiScaledoesn’tsoundsoobscureanymore;infact,ithas

Page 227: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

becomeself-explanatory.Thecascadeclassifierobjectattemptsatdetectinganobjectatdifferentscalesofaninputimage.ThesecondpieceofinformationthatshouldbecomemuchcleareristhescaleFactorparameterofthedetectMultiScale()method.Thisparameterrepresentstheratioatwhichtheimagewillberesampledtoasmallersizeateachstepofthepyramid.

ThesmallerthescaleFactorparameter,themorelayersinthepyramid,andtheslowerandmorecomputationallyintensivetheoperationwillbe,although—toanextent—moreaccurateinresults.

So,bynow,youshouldhaveanunderstandingofwhatanimagepyramidis,andwhyitisusedincomputervision.Let’snowmoveontoslidingwindows.

Slidingwindows

Slidingwindowsisatechniqueusedincomputervisionthatconsistsofexaminingtheshiftingportionsofanimage(slidingwindows)andoperatingdetectiononthoseusingimagepyramids.Thisisdonesothatanobjectcanbedetectedatamultiscalelevel.

Slidingwindowsresolveslocationissuesbyscanningsmallerregionsofalargerimage,andthenrepeatingthescanningondifferentscalesofthesameimage.

Withthistechnique,eachimageisdecomposedintoportions,whichallowsdiscardingportionsthatareunlikelytocontainobjects,whiletheremainingportionsareclassified.

Thereisoneproblemthatemergeswiththisapproach,though:overlappingregions.

Let’sexpandalittlebitonthisconcepttoclarifythenatureoftheproblem.Say,you’reoperatingfacedetectiononanimageandareusingslidingwindows.

Eachwindowslidesoffafewpixelsatatime,whichmeansthataslidingwindowhappenstobeapositivematchforthesamefaceinfourdifferentpositions.Naturally,wedon’twanttoreportfourmatches,ratheronlyone;furthermore,we’renotinterestedintheportionoftheimagewithagoodscore,butsimplyintheportionwiththehighestscore.

Here’swherenon-maximumsuppressioncomesintoplay:givenasetofoverlappingregions,wecansuppressalltheregionsthatarenotclassifiedwiththemaximumscore.

Non-maximum(ornon-maxima)suppressionNon-maximum(ornon-maxima)suppressionisatechniquethatsuppressesalltheresultsthatrelatetothesameareaofanimage,whicharenotthemaximumscoreforaparticulararea.Thisisbecausesimilarlycolocatedwindowstendtohavehigherscoresandoverlappingareasaresignificant,butweareonlyinterestedinthewindowwiththebestresult,anddiscardingoverlappingwindowswithlowerscores.

Whenexamininganimagewithslidingwindows,youwanttomakesuretoretainthebestwindowofabunchofwindows,alloverlappingaroundthesamesubject.

Todothis,youdeterminethatallthewindowswithmorethanathreshold,x,incommonwillbethrownintothenon-maximumsuppressionoperation.

Thisisquitecomplex,butit’salsonottheendofthisprocess.Remembertheimage

Page 228: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

pyramid?We’rescanningtheimageatsmallerscalesiterativelytomakesuretodetectobjectsindifferentscales.

Thismeansthatyouwillobtainaseriesofwindowsatdifferentscales,then,computethesizeofawindowobtainedinasmallerscaleasifitweredetectedintheoriginalscale,and,finally,throwthiswindowintotheoriginalmix.

Itdoessoundabitcomplex.Thankfully,we’renotthefirsttocomeacrossthisproblem,whichhasbeenresolvedinseveralways.ThefastestalgorithminmyexperiencewasimplementedbyDr.TomaszMalisiewiczathttp://www.computervisionblog.com/2011/08/blazing-fast-nmsm-from-exemplar-svm.html.TheexampleisinMATLAB,butintheapplicationexample,wewillobviouslyuseaPythonversionofit.

Thegeneralapproachbehindnon-maximumsuppressionisasfollows:

1. Onceanimagepyramidhasbeenconstructed,scantheimagewiththeslidingwindowapproachforobjectdetection.

2. Collectallthecurrentwindowsthathavereturnedapositiveresult(beyondacertainarbitrarythreshold),andtakeawindow,W,withthehighestresponse.

3. EliminateallwindowsthatoverlapWsignificantly.4. Moveontothenextwindowwiththehighestresponseandrepeattheprocessforthe

currentscale.

Whenthisprocessiscomplete,moveupthenextscaleintheimagepyramidandrepeattheprecedingprocess.Tomakesurewindowsarecorrectlyrepresentedattheendoftheentirenon-maximumsuppressionprocess,besuretocomputethewindowsizeinrelationtotheoriginalsizeoftheimage(forexample,ifyoudetectawindowat50percentscaleoftheoriginalsizeinthepyramid,thedetectedwindowwillactuallybefourtimesthesizeintheoriginalimage).

Attheendofthisprocess,youwillhaveasetofmaximumscoredwindows.Optionally,youcancheckforwindowsthatareentirelycontainedinotherwindows(likewedidforthepeopledetectionprocessatthebeginningofthechapter)andeliminatethose.

Now,howdowedeterminethescoreofawindow?Weneedaclassificationsystemthatdetermineswhetheracertainfeatureispresentornotandaconfidencescoreforthisclassification.Thisiswheresupportvectormachines(SVM)comesintoplay.

SupportvectormachinesExplainingindetailwhatanSVMisanddoesisbeyondthescopeofthisbook,butsufficeittosay,SVMisanalgorithmthat—givenlabeledtrainingdata–enablestheclassificationofthisdatabyoutputtinganoptimalhyperplane,which,inplainEnglish,istheoptimalplanethatdividesdifferentlyclassifieddata.Avisualrepresentationwillhelpyouunderstandthis:

Page 229: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Whyisitsohelpfulincomputervisionandobjectdetectioninparticular?Thisisduetothefactthatfindingtheoptimaldivisionlinebetweenpixelsthatbelongtoanobjectandthosethatdon’tisavitalcomponentofobjectdetection.

TheSVMmodelhasbeenaroundsincetheearly1960s;however,thecurrentformofitsimplementationoriginatesina1995paperbyCorinnaCortesandVadimirVapnik,whichisavailableathttp://link.springer.com/article/10.1007/BF00994018.

Nowthatwehaveagoodunderstandingoftheconceptsinvolvedinobjectdetection,wecanstartlookingatafewexamples.Wewillstartfrombuilt-infunctionsandevolveintotrainingourowncustomobjectdetectors.

Page 230: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PeopledetectionOpenCVcomeswithHOGDescriptorthatperformspeopledetection.

Here’saprettystraightforwardexample:

importcv2

importnumpyasnp

defis_inside(o,i):

ox,oy,ow,oh=o

ix,iy,iw,ih=i

returnox>ixandoy>iyandox+ow<ix+iwandoy+oh<iy+ih

defdraw_person(image,person):

x,y,w,h=person

cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,255),2)

img=cv2.imread("../images/people.jpg")

hog=cv2.HOGDescriptor()

hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

found,w=hog.detectMultiScale(img)

found_filtered=[]

forri,rinenumerate(found):

forqi,qinenumerate(found):

ifri!=qiandis_inside(r,q):

break

else:

found_filtered.append(r)

forpersoninfound_filtered:

draw_person(img,person)

cv2.imshow("peopledetection",img)

cv2.waitKey(0)

cv2.destroyAllWindows()

Aftertheusualimports,wedefinetwoverysimplefunctions:is_insideanddraw_person,whichperformtwominimaltasks,namely,determiningwhetherarectangleisfullycontainedinanotherrectangle,anddrawingrectanglesarounddetectedpeople.

WethenloadtheimageandcreateHOGDescriptorthroughaverysimpleandself-explanatorycode:

cv2.HOGDescriptor()

Afterthis,wespecifythatHOGDescriptorwilluseadefaultpeopledetector.

ThisisdonethroughthesetSVMDetector()method,which—afterourintroductiontoSVM—soundslessobscurethanitmayhaveifwehadn’tintroducedSVMs.

Next,weapplydetectMultiScaleontheloadedimage.Interestingly,unlikeallthefacedetectionalgorithms,wedon’tneedtoconverttheoriginalimagetograyscalebefore

Page 231: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

applyinganyformofobjectdetection.

Thedetectionmethodwillreturnanarrayofrectangles,whichwouldbeagoodenoughsourceofinformationforustostartdrawingshapesontheimage.Ifwedidthis,however,youwouldnoticesomethingstrange:someoftherectanglesareentirelycontainedinotherrectangles.Thisclearlyindicatesanerrorindetection,andwecansafelyassumethatarectangleentirelyinsideanotheronecanbediscarded.

Thisispreciselythereasonwhywedefinedanis_insidefunction,andwhyweiteratethroughtheresultofthedetectiontodiscardfalsepositives.

Ifyourunthescriptyourself,youwillseerectanglesaroundpeopleintheimage.

Page 232: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CreatingandtraininganobjectdetectorUsingbuilt-infeaturesmakesiteasytocomeupwithaquickprototypeforanapplication,andwe’reallverygratefultotheOpenCVdevelopersformakinggreatfeatures,suchasfacedetectionorpeopledetectionreadilyavailable(truly,weare).

However,whetheryouareahobbyistoracomputervisionprofessional,it’sunlikelythatyouwillonlydealwithpeopleandfaces.

Moreover,ifyou’relikeme,youwonderhowthepeopledetectorfeaturewascreatedinthefirstplaceandifyoucanimproveit.Furthermore,youmayalsowonderwhetheryoucanapplythesameconceptstodetectthemostdiversetypeofobjects,rangingfromcarstogoblins.

Inanenterpriseenvironment,youmayhavetodealwithveryspecificdetection,suchasregistrationplates,bookcovers,orwhateveryourcompanymaydealwith.

So,thequestionis,howdowecomeupwithourownclassifiers?

TheanswerliesinSVMandbag-of-wordstechnique.

We’vealreadytalkedaboutHOGandSVM,solet’stakeacloserlookatbag-of-words.

Bag-of-wordsBag-of-words(BOW)isaconceptthatwasnotinitiallyintendedforcomputervision,rather,weuseanevolvedversionofthisconceptinthecontextofcomputervision.So,let’sfirsttalkaboutitsbasicversion,which—asyoumayhaveguessed—originallybelongstothefieldoflanguageanalysisandinformationretrieval.

BOWisthetechniquebywhichweassignacountweighttoeachwordinaseriesofdocuments;wethenrerepresentthesedocumentswithvectorsthatrepresentthesesetofcounts.Let’slookatanexample:

Document1:IlikeOpenCVandIlikePythonDocument2:IlikeC++andPythonDocument3:Idon'tlikeartichokes

Thesethreedocumentsallowustobuildadictionary(orcodebook)withthesevalues:

{

I:4,

like:4,

OpenCV:2,

and:2,

Python:2,

C++:1,

dont:1,

artichokes:1

}

Wehaveeightentries.Let’snowrerepresenttheoriginaldocumentsusingeight-entryvectors,eachvectorcontainingallthewordsinthedictionarywithvaluesrepresentingthe

Page 233: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

countforeachterminthedocument.Thevectorrepresentationoftheprecedingthreesentencesisasfollows:

[2,2,1,1,1,0,0,0]

[1,1,0,1,1,1,0,0]

[1,1,0,0,0,0,1,1]

Thiskindofrepresentationofdocumentshasmanyeffectiveapplicationsintherealworld,suchasspamfiltering.

Thesevectorscanbeconceptualizedasahistogramrepresentationofdocumentsorasafeature(thesamewayweextractedfeaturesfromimagesinpreviouschapters),whichcanbeusedtotrainclassifiers.

NowthatwehaveagraspofthebasicconceptofBOWorbagofvisualwords(BOVW)incomputervision,let’sseehowthisappliestotheworldofcomputervision.

BOWincomputervisionWearebynowfamiliarwiththeconceptofimagefeatures.We’veusedfeatureextractors,suchasSIFT,andSURF,toextractfeaturesfromimagessothatwecouldmatchthesefeaturesinanotherimage.

We’vealsofamiliarizedourselveswiththeconceptofcodebook,andweknowaboutSVM,amodelthatcanbefedasetoffeaturesandutilizescomplexalgorithmstoclassifytraindata,andcanpredicttheclassificationofnewdata.

So,theimplementationofaBOWapproachwillinvolvethefollowingsteps:

1. Takeasampledataset.2. Foreachimageinthedataset,extractdescriptors(withSIFT,SURF,andsoon).3. AddeachdescriptortotheBOWtrainer.4. Clusterthedescriptorstokclusters(okay,thissoundsobscure,butbearwithme)

whosecenters(centroids)areourvisualwords.

Atthispoint,wehaveadictionaryofvisualwordsreadytobeused.Asyoucanimagine,alargedatasetwillhelpmakeourdictionaryricherinvisualwords.Uptoanextent,themorewords,thebetter!

Afterthis,wearereadytotestourclassifierandattemptdetection.Thegoodnewsisthattheprocessisverysimilartotheoneoutlinedpreviously:givenatestimage,wecanextractfeaturesandquantizethembasedontheirdistancetothenearestcentroidtoformahistogram.

Basedonthis,wecanattempttorecognizevisualwordsandlocatethemintheimage.Here’savisualrepresentationoftheBOWprocess:

Page 234: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Thisisthepointinthechapterwhenyouhavebuiltanappetiteforapracticalexample,andarerearingtocode.However,beforeproceeding,Ifeelthataquickdigressionintothetheoryofthek-meansclusteringisnecessarysothatyoucanfullyunderstandhowvisualwordsarecreated,andgainabetterunderstandingoftheprocessofobjectdetectionusingBOWandSVM.

Thek-meansclustering

Thek-meansclusteringisamethodofvectorquantizationtoperformdataanalysis.Givenadataset,krepresentsthenumberofclustersinwhichthedatasetisgoingtobedivided.Theterm“means”referstothemathematicalconceptofmean,whichisprettybasic,butforthesakeofclarity,it’swhatpeoplecommonlyrefertoasaverage;whenvisuallyrepresented,themeanofaclusterisitscentroidorthegeometricalcenterofpointsinthecluster.

NoteClusteringreferstothegroupingofpointsinadatasetintoclusters.

OneoftheclasseswewillbeusingtoperformobjectdetectioniscalledBagOfWordsKMeansTrainer;bynow,youshouldabletodeducewhattheresponsibilityofthisclassistocreate:

”kmeans()-basedclasstotrainavisualvocabularyusingthebag-of-wordsapproach”

Page 235: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ThisisaspertheOpenCVdocumentation.

Here’sarepresentationofak-meansclusteringoperationwithfiveclusters:

Afterthislongtheoreticalintroduction,wecanlookatanexample,andstarttrainingourobjectdetector.

Page 236: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 237: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DetectingcarsThereisnovirtuallimittothetypeofobjectsyoucandetectinyourimagesandvideos.However,toobtainanacceptablelevelofaccuracy,youneedasufficientlylargedataset,containingtrainimagesthatareidenticalinsize.

Thiswouldbeatimeconsumingoperationifweweretodoitallbyourselves(whichisentirelypossible).

Wecanavailofready-madedatasets;thereareanumberofthemfreelydownloadablefromvarioussources:

TheUniversityofIllinois:http://l2r.cs.uiuc.edu/~cogcomp/Data/Car/CarData.tar.gzStanfordUniversity:http://ai.stanford.edu/~jkrause/cars/car_dataset.html

NoteNotethattrainingimagesandtestimagesareavailableinseparatefiles.

I’llbeusingtheUIUCdatasetinmyexample,butfeelfreetoexploretheInternetforothertypesofdatasets.

Now,let’stakealookatanexample:

importcv2

importnumpyasnp

fromos.pathimportjoin

datapath="/home/d3athmast3r/dev/python/CarData/TrainImages/"

defpath(cls,i):

return"%s/%s%d.pgm"%(datapath,cls,i+1)

pos,neg="pos-","neg-"

detect=cv2.xfeatures2d.SIFT_create()

extract=cv2.xfeatures2d.SIFT_create()

flann_params=dict(algorithm=1,trees=5)flann=

cv2.FlannBasedMatcher(flann_params,{})

bow_kmeans_trainer=cv2.BOWKMeansTrainer(40)

extract_bow=cv2.BOWImgDescriptorExtractor(extract,flann)

defextract_sift(fn):

im=cv2.imread(fn,0)

returnextract.compute(im,detect.detect(im))[1]

foriinrange(8):

bow_kmeans_trainer.add(extract_sift(path(pos,i)))

bow_kmeans_trainer.add(extract_sift(path(neg,i)))

voc=bow_kmeans_trainer.cluster()

extract_bow.setVocabulary(voc)

Page 238: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

defbow_features(fn):

im=cv2.imread(fn,0)

returnextract_bow.compute(im,detect.detect(im))

traindata,trainlabels=[],[]

foriinrange(20):

traindata.extend(bow_features(path(pos,i)));trainlabels.append(1)

traindata.extend(bow_features(path(neg,i)));trainlabels.append(-1)

svm=cv2.ml.SVM_create()

svm.train(np.array(traindata),cv2.ml.ROW_SAMPLE,np.array(trainlabels))

defpredict(fn):

f=bow_features(fn);

p=svm.predict(f)

printfn,"\t",p[1][0][0]

returnp

car,notcar="/home/d3athmast3r/dev/python/study/images/car.jpg",

"/home/d3athmast3r/dev/python/study/images/bb.jpg"

car_img=cv2.imread(car)

notcar_img=cv2.imread(notcar)

car_predict=predict(car)

not_car_predict=predict(notcar)

font=cv2.FONT_HERSHEY_SIMPLEX

if(car_predict[1][0][0]==1.0):

cv2.putText(car_img,'CarDetected',(10,30),font,1,

(0,255,0),2,cv2.LINE_AA)

if(not_car_predict[1][0][0]==-1.0):

cv2.putText(notcar_img,'CarNotDetected',(10,30),font,1,(0,0,

255),2,cv2.LINE_AA)

cv2.imshow('BOW+SVMSuccess',car_img)

cv2.imshow('BOW+SVMFailure',notcar_img)

cv2.waitKey(0)

cv2.destroyAllWindows()

Page 239: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Whatdidwejustdo?Thisisquitealottoassimilate,solet’sgothroughwhatwe’vedone:

1. Firstofall,ourusualimportsarefollowedbythedeclarationofthebasepathofourtrainingimages.Thiswillcomeinhandytoavoidrewritingthebasepatheverytimeweprocessanimageinaparticularfolderonourcomputer.

2. Afterthis,wedeclareafunction,path:

defpath(cls,i):

return"%s/%s%d.pgm"%(datapath,cls,i+1)

pos,neg="pos-","neg-"

NoteMoreonthepathfunction

Thisfunctionisautilitymethod:giventhenameofaclass(inourcase,wehavetwoclasses,posandneg)andanumericalindex,wereturnthefullpathtoaparticulartestingimage.Ourcardatasetcontainsimagesnamedinthefollowingway:pos-x.pgmandneg-x.pgm,wherexisanumber.

Immediately,youwillfindtheusefulnessofthisfunctionwheniteratingthrougharangeofnumbers(say,20),whichwillallowyoutoloadallimagesfrompos-0.pgmtopos-20.pgm,andthesamegoesforthenegativeclass.

3. Nextup,we’llcreatetwoSIFTinstances:onetoextractkeypoints,theothertoextractfeatures:

detect=cv2.xfeatures2d.SIFT_create()

extract=cv2.xfeatures2d.SIFT_create()

4. WheneveryouseeSIFTinvolved,youcanbeprettysuresomefeaturematchingalgorithmwillbeinvolvedtoo.Inourcase,we’llcreateaninstanceforaFLANNmatcher:

flann_params=dict(algorithm=1,trees=5)flann=

cv2.FlannBasedMatcher(flann_params,{})

NoteNotethatcurrently,theenumvaluesforFLANNaremissingfromthePythonversionofOpenCV3,so,number1,whichispassedasthealgorithmparameter,representstheFLANN_INDEX_KDTREEalgorithm.Isuspectthefinalversionwillbecv2.FLANN_INDEX_KDTREE,whichisalittlemorehelpful.Makesuretochecktheenumvaluesforthecorrectflags.

5. Next,wementiontheBOWtrainer:

bow_kmeans_trainer=cv2.BOWKMeansTrainer(40)

6. ThisBOWtrainerutilizes40clusters.Afterthis,we’llinitializetheBOWextractor.

Page 240: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ThisistheBOWclassthatwillbefedavocabularyofvisualwordsandwilltrytodetecttheminthetestimage:

extract_bow=cv2.BOWImgDescriptorExtractor(extract,flann)

7. ToextracttheSIFTfeaturesfromanimage,webuildautilitymethod,whichtakesthepathtotheimage,readsitingrayscale,andreturnsthedescriptor:

defextract_sift(fn):

im=cv2.imread(fn,0)

returnextract.compute(im,detect.detect(im))[1]

Atthisstage,wehaveeverythingweneedtostarttrainingtheBOWtrainer.

1. Let’sreadeightimagesperclass(eightpositivesandeightnegatives)fromourdataset:

foriinrange(8):

bow_kmeans_trainer.add(extract_sift(path(pos,i)))

bow_kmeans_trainer.add(extract_sift(path(neg,i)))

2. Tocreatethevocabularyofvisualwords,we’llcallthecluster()methodonthetrainer,whichperformsthek-meansclassificationandreturnsthesaidvocabulary.We’llassignthisvocabularytoBOWImgDescriptorExtractorsothatitcanextractdescriptorsfromtestimages:

vocabulary=bow_kmeans_trainer.cluster()

extract_bow.setVocabulary(vocabulary)

3. Inlinewithotherutilityfunctionsdeclaredinthisscript,we’lldeclareafunctionthattakesthepathtoanimageandreturnsthedescriptorascomputedbytheBOWdescriptorextractor:

defbow_features(fn):

im=cv2.imread(fn,0)

returnextract_bow.compute(im,detect.detect(im))

4. Let’screatetwoarraystoaccommodatethetraindataandlabels,andpopulatethemwiththedescriptorsgeneratedbyBOWImgDescriptorExtractor,associatinglabelstothepositiveandnegativeimageswe’refeeding(1standsforapositivematch,-1foranegative):

traindata,trainlabels=[],[]

foriinrange(20):

traindata.extend(bow_features(path(pos,i)));trainlabels.append(1)

traindata.extend(bow_features(path(neg,i)));trainlabels.append(-1)

5. Now,let’screateaninstanceofanSVM:

svm=cv2.ml.SVM_create()

6. Then,trainitbywrappingthetraindataandlabelsintotheNumPyarrays:

svm.train(np.array(traindata),cv2.ml.ROW_SAMPLE,

np.array(trainlabels))

Page 241: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

We’reallsetwithatrainedSVM;allthatislefttodoistofeedtheSVMacoupleofsampleimagesandseehowitbehaves.

1. Let’sfirstdefineanotherutilitymethodtoprinttheresultofourpredictmethodandreturnit:

defpredict(fn):

f=bow_features(fn);

p=svm.predict(f)

printfn,"\t",p[1][0][0]

returnp

2. Let’sdefinetwosampleimagepathsandreadthemastheNumPyarrays:

car,notcar="/home/d3athmast3r/dev/python/study/images/car.jpg",

"/home/d3athmast3r/dev/python/study/images/bb.jpg"

car_img=cv2.imread(car)

notcar_img=cv2.imread(notcar)

3. We’llpasstheseimagestothetrainedSVM,andgettheresultoftheprediction:

car_predict=predict(car)

not_car_predict=predict(notcar)

Naturally,we’rehopingthatthecarimagewillbedetectedasacar(resultofpredict()shouldbe1.0),andthattheotherimagewillnot(resultshouldbe-1.0),sowewillonlyaddtexttotheimagesiftheresultistheexpectedone.

4. Atlast,we’llpresenttheimagesonthescreen,hopingtoseethecorrectcaptiononeach:

font=cv2.FONT_HERSHEY_SIMPLEX

if(car_predict[1][0][0]==1.0):

cv2.putText(car_img,'CarDetected',(10,30),font,1,

(0,255,0),2,cv2.LINE_AA)

if(not_car_predict[1][0][0]==-1.0):

cv2.putText(notcar_img,'CarNotDetected',(10,30),font,1,(0,0,

255),2,cv2.LINE_AA)

cv2.imshow('BOW+SVMSuccess',car_img)

cv2.imshow('BOW+SVMFailure',notcar_img)

cv2.waitKey(0)

cv2.destroyAllWindows()

Theprecedingoperationproducesthefollowingresult:

Page 242: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Italsoresultsinthis:

Page 243: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SVMandslidingwindowsHavingdetectedanobjectisanimpressiveachievement,butnowwewanttopushthistothenextlevelintheseways:

DetectingmultipleobjectsofthesamekindinanimageDeterminingthepositionofadetectedobjectinanimage

Toaccomplishthis,wewillusetheslidingwindowsapproach.Ifit’snotalreadyclearfromthepreviousexplanationoftheconceptofslidingwindows,therationalebehindtheadoptionofthisapproachwillbecomemoreapparentifwetakealookatadiagram:

Observethemovementoftheblock:

1. Wetakearegionoftheimage,classifyit,andthenmoveapredefinedstepsizetotheright-handside.Whenwereachtherightmostendoftheimage,we’llresetthexcoordinateto0andmovedownastep,andrepeattheentireprocess.

2. Ateachstep,we’llperformaclassificationwiththeSVMthatwastrainedwithBOW.

3. KeepatrackofalltheblocksthathavepassedtheSVMpredicttest.4. Whenyou’vefinishedclassifyingtheentireimage,scaletheimagedownandrepeat

theentireslidingwindowsprocess.

Continuerescalingandclassifyinguntilyougettoaminimumsize.

Thisgivesyouthechancetodetectobjectsinseveralregionsoftheimageandatdifferentscales.

Atthisstage,youwillhavecollectedimportantinformationaboutthecontentoftheimage;however,there’saproblem:it’smostlikelythatyouwillendupwithanumberofoverlappingblocksthatgiveyouapositivescore.Thismeansthatyourimagemaycontainoneobjectthatgetsdetectedfourorfivetimes,andifyouweretoreporttheresultofthedetection,yourreportwouldbequiteinaccurate,sohere’swherenon-maximum

Page 244: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

suppressioncomesintoplay.

Example–cardetectioninasceneWearenowreadytoapplyalltheconceptswelearnedsofartoareal-lifeexample,andcreateacardetectorapplicationthatscansanimageanddrawsrectanglesaroundcars.

Let’ssummarizetheprocessbeforedivingintothecode:

1. Obtainatraindataset.2. CreateaBOWtrainerandcreateavisualvocabulary.3. TrainanSVMwiththevocabulary.4. Attemptdetectionusingslidingwindowsonanimagepyramidofatestimage.5. Applynon-maximumsuppressiontooverlappingboxes.6. Outputtheresult.

Let’salsotakealookatthestructureoftheproject,asitisabitmorecomplexthantheclassicstandalonescriptapproachwe’veadopteduntilnow.

Theprojectstructureisasfollows:

├──car_detector

│├──detector.py

│├──__init__.py

│├──non_maximum.py

│├──pyramid.py

│└──sliding_w112661222.indow.py

└──car_sliding_windows.py

Themainprogramisincar_sliding_windows.py,andalltheutilitiesarecontainedinthecar_detectorfolder.Aswe’reusingPython2.7,we’llneedan__init__.pyfileinthefolderforittobedetectedasamodule.

Thefourfilesinthecar_detectormoduleareasfollows:

TheSVMtrainingmodelThenon-maximumsuppressionfunctionTheimagepyramidTheslidingwindowsfunction

Let’sexaminethemonebyone,startingfromtheimagepyramid:

importcv2

defresize(img,scaleFactor):

returncv2.resize(img,(int(img.shape[1]*(1/scaleFactor)),

int(img.shape[0]*(1/scaleFactor))),interpolation=cv2.INTER_AREA)

defpyramid(image,scale=1.5,minSize=(200,80)):

yieldimage

whileTrue:

image=resize(image,scale)

ifimage.shape[0]<minSize[1]orimage.shape[1]<minSize[0]:

Page 245: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

break

yieldimage

Thismodulecontainstwofunctiondefinitions:

ResizetakesanimageandresizesitbyaspecifiedfactorPyramidtakesanimageandreturnsaresizedversionofituntiltheminimumconstraintsofwidthandheightarereached

NoteYouwillnoticethattheimageisnotreturnedwiththereturnkeywordbutwiththeyieldkeyword.Thisisbecausethisfunctionisaso-calledgenerator.Ifyouarenotfamiliarwithgenerators,takealookathttps://wiki.python.org/moin/Generators.

Thiswillallowustoobtainaresizedimagetoprocessinourmainprogram.

Nextupistheslidingwindowsfunction:

defsliding_window(image,stepSize,windowSize):

foryinxrange(0,image.shape[0],stepSize):

forxinxrange(0,image.shape[1],stepSize):

yield(x,y,image[y:y+windowSize[1],x:x+windowSize[0]])

Again,thisisagenerator.Althoughabitdeep-nested,thismechanismisverysimple:givenanimage,returnawindowthatmovesofanarbitrarysizedstepfromtheleftmargintowardstheright,untiltheentirewidthoftheimageiscovered,thengoesbacktotheleftmarginbutdownastep,coveringthewidthoftheimagerepeatedlyuntilthebottomrightcorneroftheimageisreached.Youcanvisualizethisasthesamepatternusedforwritingonapieceofpaper:startfromtheleftmarginandreachtherightmargin,thenmoveontothenextlinefromtheleftmargin.

Thelastutilityisnon-maximumsuppression,whichlookslikethis(Malisiewicz/Rosebrock’scode):

defnon_max_suppression_fast(boxes,overlapThresh):

#iftherearenoboxes,returnanemptylist

iflen(boxes)==0:

return[]

#iftheboundingboxesintegers,convertthemtofloats—

#thisisimportantsincewe'llbedoingabunchofdivisions

ifboxes.dtype.kind=="i":

boxes=boxes.astype("float")

Page 246: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

#initializethelistofpickedindexes

pick=[]

#grabthecoordinatesoftheboundingboxes

x1=boxes[:,0]

y1=boxes[:,1]

x2=boxes[:,2]

y2=boxes[:,3]

scores=boxes[:,4]

#computetheareaoftheboundingboxesandsortthebounding

#boxesbythescore/probabilityoftheboundingbox

area=(x2-x1+1)*(y2-y1+1)

idxs=np.argsort(scores)[::-1]

#keeploopingwhilesomeindexesstillremainintheindexes

#list

whilelen(idxs)>0:

#grabthelastindexintheindexeslistandaddthe

#indexvaluetothelistofpickedindexes

last=len(idxs)-1

i=idxs[last]

pick.append(i)

#findthelargest(x,y)coordinatesforthestartof

#theboundingboxandthesmallest(x,y)coordinates

#fortheendoftheboundingbox

xx1=np.maximum(x1[i],x1[idxs[:last]])

yy1=np.maximum(y1[i],y1[idxs[:last]])

Page 247: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

xx2=np.minimum(x2[i],x2[idxs[:last]])

yy2=np.minimum(y2[i],y2[idxs[:last]])

#computethewidthandheightoftheboundingbox

w=np.maximum(0,xx2-xx1+1)

h=np.maximum(0,yy2-yy1+1)

#computetheratioofoverlap

overlap=(w*h)/area[idxs[:last]]

#deleteallindexesfromtheindexlistthathave

idxs=np.delete(idxs,np.concatenate(([last],

np.where(overlap>overlapThresh)[0])))

#returnonlytheboundingboxesthatwerepickedusingthe

#integerdatatype

returnboxes[pick].astype("int")

Thisfunctionsimplytakesalistofrectanglesandsortsthembytheirscore.Startingfromtheboxwiththehighestscore,iteliminatesallboxesthatoverlapbeyondacertainthresholdbycalculatingtheareaofintersectionanddeterminingwhetheritisgreaterthanacertainthreshold.

Examiningdetector.py

Now,let’sexaminetheheartofthisprogram,whichisdetector.py.Thisabitlongandcomplex;however,everythingshouldappearmuchclearergivenournewfoundfamiliaritywiththeconceptsofBOW,SVM,andfeaturedetection/extraction.

Here’sthecode:

importcv2

importnumpyasnp

datapath="/path/to/CarData/TrainImages/"

SAMPLES=400

defpath(cls,i):

return"%s/%s%d.pgm"%(datapath,cls,i+1)

Page 248: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

defget_flann_matcher():

flann_params=dict(algorithm=1,trees=5)

returncv2.FlannBasedMatcher(flann_params,{})

defget_bow_extractor(extract,flann):

returncv2.BOWImgDescriptorExtractor(extract,flann)

defget_extract_detect():

returncv2.xfeatures2d.SIFT_create(),cv2.xfeatures2d.SIFT_create()

defextract_sift(fn,extractor,detector):

im=cv2.imread(fn,0)

returnextractor.compute(im,detector.detect(im))[1]

defbow_features(img,extractor_bow,detector):

returnextractor_bow.compute(img,detector.detect(img))

defcar_detector():

pos,neg="pos-","neg-"

detect,extract=get_extract_detect()

matcher=get_flann_matcher()

print"buildingBOWKMeansTrainer…"

bow_kmeans_trainer=cv2.BOWKMeansTrainer(1000)

extract_bow=cv2.BOWImgDescriptorExtractor(extract,flann)

print"addingfeaturestotrainer"

foriinrange(SAMPLES):

printi

bow_kmeans_trainer.add(extract_sift(path(pos,i),extract,detect))

bow_kmeans_trainer.add(extract_sift(path(neg,i),extract,detect))

voc=bow_kmeans_trainer.cluster()

extract_bow.setVocabulary(voc)

traindata,trainlabels=[],[]

print"addingtotraindata"

foriinrange(SAMPLES):

printi

traindata.extend(bow_features(cv2.imread(path(pos,i),0),extract_bow,

detect))

trainlabels.append(1)

traindata.extend(bow_features(cv2.imread(path(neg,i),0),extract_bow,

detect))

trainlabels.append(-1)

svm=cv2.ml.SVM_create()

svm.setType(cv2.ml.SVM_C_SVC)

svm.setGamma(0.5)

svm.setC(30)

svm.setKernel(cv2.ml.SVM_RBF)

svm.train(np.array(traindata),cv2.ml.ROW_SAMPLE,np.array(trainlabels))

returnsvm,extract_bow

Let’sgothroughit.First,we’llimportourusualmodules,andthensetapathforthetrainingimages.

Page 249: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Then,we’lldefineanumberofutilityfunctions:

defpath(cls,i):

return"%s/%s%d.pgm"%(datapath,cls,i+1)

Thisfunctionreturnsthepathtoanimagegivenabasepathandaclassname.Inourexample,we’regoingtousetheneg-andpos-classnames,becausethisiswhatthetrainingimagesarecalled(thatis,neg-1.pgm).Thelastargumentisanintegerusedtocomposethefinalpartoftheimagepath.

Next,we’lldefineautilityfunctiontoobtainaFLANNmatcher:

defget_flann_matcher():

flann_params=dict(algorithm=1,trees=5)

returncv2.FlannBasedMatcher(flann_params,{})

Again,it’snotthattheinteger,1,passedasanalgorithmargumentrepresentsFLANN_INDEX_KDTREE.

ThenexttwofunctionsreturntheSIFTfeaturedetectors/extractorsandaBOWtrainer:

defget_bow_extractor(extract,flann):

returncv2.BOWImgDescriptorExtractor(extract,flann)

defget_extract_detect():

returncv2.xfeatures2d.SIFT_create(),cv2.xfeatures2d.SIFT_create()

Thenextutilityisafunctionusedtoreturnfeaturesfromanimage:

defextract_sift(fn,extractor,detector):

im=cv2.imread(fn,0)

returnextractor.compute(im,detector.detect(im))[1]

NoteASIFTdetectordetectsfeatures,whileaSIFTextractorextractsandreturnsthem.

We’llalsodefineasimilarutilityfunctiontoextracttheBOWfeatures:

defbow_features(img,extractor_bow,detector):

returnextractor_bow.compute(img,detector.detect(img))

Inthemaincar_detectorfunction,we’llfirstcreatethenecessaryobjectusedtoperformfeaturedetectionandextraction:

pos,neg="pos-","neg-"

detect,extract=get_extract_detect()

matcher=get_flann_matcher()

bow_kmeans_trainer=cv2.BOWKMeansTrainer(1000)

extract_bow=cv2.BOWImgDescriptorExtractor(extract,flann)

Then,we’lladdfeaturestakenfromtrainingimagestothetrainer:

print"addingfeaturestotrainer"

foriinrange(SAMPLES):

printi

bow_kmeans_trainer.add(extract_sift(path(pos,i),extract,detect))

Page 250: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Foreachclass,we’lladdapositiveimagetothetrainerandanegativeimage.

Afterthis,we’llinstructthetrainertoclusterthedataintokgroups.

Theclustereddataisnowourvocabularyofvisualwords,andwecansettheBOWImgDescriptorExtractorclass’vocabularyinthisway:

vocabulary=bow_kmeans_trainer.cluster()

extract_bow.setVocabulary(vocabulary)

Associatingtrainingdatawithclasses

Withavisualvocabularyready,wecannowassociatetraindatawithclasses.Inourcase,wehavetwoclasses:-1fornegativeresultsand1forpositiveones.

Let’spopulatetwoarrays,traindataandtrainlabels,containingextractedfeaturesandtheircorrespondinglabels.Iteratingthroughthedataset,wecanquicklysetthisupwiththefollowingcode:

traindata,trainlabels=[],[]

print"addingtotraindata"

foriinrange(SAMPLES):

printi

traindata.extend(bow_features(cv2.imread(path(pos,i),0),extract_bow,

detect))

trainlabels.append(1)

traindata.extend(bow_features(cv2.imread(path(neg,i),0),extract_bow,

detect))

trainlabels.append(-1)

Youwillnoticethatateachcycle,we’lladdonepositiveandonenegativeimage,andthenpopulatethelabelswitha1anda-1valuetokeepthedatasynchronizedwiththelabels.

Shouldyouwishtotrainmoreclasses,youcoulddothatbyfollowingthispattern:

traindata,trainlabels=[],[]

print"addingtotraindata"

foriinrange(SAMPLES):

printi

traindata.extend(bow_features(cv2.imread(path(class1,i),0),

extract_bow,detect))

trainlabels.append(1)

traindata.extend(bow_features(cv2.imread(path(class2,i),0),

extract_bow,detect))

trainlabels.append(2)

traindata.extend(bow_features(cv2.imread(path(class3,i),0),

extract_bow,detect))

trainlabels.append(3)

Forexample,youcouldtrainadetectortodetectcarsandpeopleandperformdetectionontheseinanimagecontainingbothcarsandpeople.

Lastly,we’lltraintheSVMwiththefollowingcode:

svm=cv2.ml.SVM_create()

svm.setType(cv2.ml.SVM_C_SVC)

svm.setGamma(0.5)

Page 251: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

svm.setC(30)

svm.setKernel(cv2.ml.SVM_RBF)

svm.train(np.array(traindata),cv2.ml.ROW_SAMPLE,np.array(trainlabels))

returnsvm,extract_bow

TherearetwoparametersinparticularthatI’dliketofocusyourattentionon:

C:Withthisparameter,youcouldconceptualizethestrictnessorseverityoftheclassifier.Thehigherthevalue,thelesschancesofmisclassification,butthetrade-offisthatsomepositiveresultsmaynotbedetected.Ontheotherhand,alowvaluemayover-fit,soyouriskgettingfalsepositives.Kernel:Thisparameterdeterminesthenatureoftheclassifier:SVM_LINEARindicatesalinearhyperplane,which,inpracticalterms,worksverywellforabinaryclassification(thetestsampleeitherbelongstoaclassoritdoesn’t),whileSVM_RBF(radialbasisfunction)separatesdatausingtheGaussianfunctions,whichmeansthatthedataissplitintoseveralkernelsdefinedbythesefunctions.WhentrainingtheSVMtoclassifyformorethantwoclasses,youwillhavetouseRBF.

Finally,we’llpassthetraindataandtrainlabelsarraysintotheSVMtrainmethod,andreturntheSVMandBOWextractorobject.Thisisbecauseinourapplications,wedon’twanttohavetorecreatethevocabularyeverytime,soweexposeitforreuse.

Dude,where’smycar?Wearereadytotestourcardetector!

Let’sfirstcreateasimpleprogramthatloadsanimage,andthenoperatesdetectionusingtheslidingwindowsandimagepyramidtechniques,respectively:

importcv2

importnumpyasnp

fromcar_detector.detectorimportcar_detector,bow_features

fromcar_detector.pyramidimportpyramid

fromcar_detector.non_maximumimportnon_max_suppression_fastasnms

fromcar_detector.sliding_windowimportsliding_window

defin_range(number,test,thresh=0.2):

returnabs(number-test)<thresh

test_image="/path/to/cars.jpg"

svm,extractor=car_detector()

detect=cv2.xfeatures2d.SIFT_create()

w,h=100,40

img=cv2.imread(test_img)

rectangles=[]

counter=1

scaleFactor=1.25

scale=1

font=cv2.FONT_HERSHEY_PLAIN

Page 252: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

forresizedinpyramid(img,scaleFactor):

scale=float(img.shape[1])/float(resized.shape[1])

for(x,y,roi)insliding_window(resized,20,(w,h)):

ifroi.shape[1]!=worroi.shape[0]!=h:

continue

try:

bf=bow_features(roi,extractor,detect)

_,result=svm.predict(bf)

a,res=svm.predict(bf,flags=cv2.ml.STAT_MODEL_RAW_OUTPUT)

print"Class:%d,Score:%f"%(result[0][0],res[0][0])

score=res[0][0]

ifresult[0][0]==1:

ifscore<-1.0:

rx,ry,rx2,ry2=int(x*scale),int(y*scale),int((x+w)*

scale),int((y+h)*scale)

rectangles.append([rx,ry,rx2,ry2,abs(score)])

except:

pass

counter+=1

windows=np.array(rectangles)

boxes=nms(windows,0.25)

for(x,y,x2,y2,score)inboxes:

printx,y,x2,y2,score

cv2.rectangle(img,(int(x),int(y)),(int(x2),int(y2)),(0,255,0),1)

cv2.putText(img,"%f"%score,(int(x),int(y)),font,1,(0,255,0))

cv2.imshow("img",img)

cv2.waitKey(0)

Thenotablepartoftheprogramisthefunctionwithinthepyramid/slidingwindowloop:

bf=bow_features(roi,extractor,detect)

_,result=svm.predict(bf)

a,res=svm.predict(bf,flags=cv2.ml.STAT_MODEL_RAW_OUTPUT)

print"Class:%d,Score:%f"%(result[0][0],res[0][0])

score=res[0][0]

ifresult[0][0]==1:

ifscore<-1.0:

rx,ry,rx2,ry2=int(x*scale),int(y*scale),int((x+w)*

scale),int((y+h)*scale)

rectangles.append([rx,ry,rx2,ry2,abs(score)])

Here,weextractthefeaturesoftheregionofinterest(ROI),whichcorrespondstothecurrentslidingwindow,andthenwecallpredictontheextractedfeatures.Thepredictmethodhasanoptionalparameter,flags,whichreturnsthescoreoftheprediction(containedatthe[0][0]value).

NoteAwordonthescoreoftheprediction:thelowerthevalue,thehighertheconfidencethat

Page 253: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

theclassifiedelementreallybelongstotheclass.

So,we’llsetanarbitrarythresholdof-1.0forclassifiedwindows,andallwindowswithlessthan-1.0aregoingtobetakenasgoodresults.AsyouexperimentwithyourSVMs,youmaytweakthistoyourlikinguntilyoufindagoldenmeanthatassuresbestresults.

Finally,weaddthecomputedcoordinatesoftheslidingwindow(meaning,wemultiplythecurrentcoordinatesbythescaleofthecurrentlayerintheimagepyramidsothatitgetscorrectlyrepresentedinthefinaldrawing)tothearrayofrectangles.

There’sonelastoperationweneedtoperformbeforedrawingourfinalresult:non-maximumsuppression.

WeturntherectanglesarrayintoaNumPyarray(toallowcertainkindofoperationsthatareonlypossiblewithNumPy),andthenapplyNMS:

windows=np.array(rectangles)

boxes=nms(windows,0.25)

Finally,weproceedwithdisplayingallourresults;forthesakeofconvenience,I’vealsoprintedthescoreobtainedforalltheremainingwindows:

Thisisaremarkablyaccurateresult!

AfinalnoteonSVM:youdon’tneedtotrainadetectoreverytimeyouwanttouseit,whichwouldbeextremelyimpractical.Youcanusethefollowingcode:

svm.save('/path/to/serialized/svmxml')

Page 254: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Youcansubsequentlyreloaditwithaloadmethodandfeedittestimagesorframes.

Page 255: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 256: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryInthischapter,wetalkedaboutnumerousobjectdetectionconcepts,suchasHOG,BOW,SVM,andsomeusefultechniques,suchasimagepyramid,slidingwindows,andnon-maximumsuppression.

Weintroducedtheconceptofmachinelearningandexploredthevariousapproachesusedtotrainacustomdetector,includinghowtocreateorobtainatrainingdatasetandclassifydata.Finally,weputthisknowledgetogoodusebycreatingacardetectorfromscratchandverifyingitscorrectfunctioning.

Alltheseconceptsformthefoundationofthenextchapter,inwhichwewillutilizeobjectdetectionandclassificationtechniquesinthecontextofmakingvideos,andlearnhowtotrackobjectstoretaininformationthatcanpotentiallybeusedforbusinessorapplicationpurposes.

Page 257: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 258: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter8.TrackingObjectsInthischapter,wewillexplorethevasttopicofobjecttracking,whichistheprocessoflocatingamovingobjectinamovieorvideofeedfromacamera.Real-timeobjecttrackingisacriticaltaskinmanycomputervisionapplicationssuchassurveillance,perceptualuserinterfaces,augmentedreality,object-basedvideocompression,anddriverassistance.

Trackingobjectscanbeaccomplishedinseveralways,withtheoptimaltechniquebeinglargelydependentonthetaskathand.Wewilllearnhowtoidentifymovingobjectsandtrackthemacrossframes.

Page 259: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

DetectingmovingobjectsThefirsttaskthatneedstobeaccomplishedforustobeabletotrackanythinginavideoistoidentifythoseregionsofavideoframethatcorrespondtomovingobjects.

Therearemanywaystotrackobjectsinavideo,allofthemfulfillingaslightlydifferentpurpose.Forexample,youmaywanttotrackanythingthatmoves,inwhichcasedifferencesbetweenframesaregoingtobeofhelp;youmaywanttotrackahandmovinginavideo,inwhichcaseMeanshiftbasedonthecoloroftheskinisthemostappropriatesolution;youmaywanttotrackaparticularobjectofwhichyouknowtheaspect,inwhichcasetechniquessuchastemplatematchingwillbeofhelp.

Objecttrackingtechniquescangetquitecomplex,let’sexplorethemintheascendingorderofdifficulty,startingfromthesimplesttechnique.

Page 260: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

BasicmotiondetectionThefirstandmostintuitivesolutionistocalculatethedifferencesbetweenframes,orbetweenaframeconsidered“background”andalltheotherframes.

Let’slookatanexampleofthisapproach:

importcv2

importnumpyasnp

camera=cv2.VideoCapture(0)

es=cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(9,4))

kernel=np.ones((5,5),np.uint8)

background=None

while(True):

ret,frame=camera.read()

ifbackgroundisNone:

background=cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

background=cv2.GaussianBlur(background,(21,21),0)

continue

gray_frame=cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

gray_frame=cv2.GaussianBlur(gray_frame,(21,21),0)

diff=cv2.absdiff(background,gray_frame)

diff=cv2.threshold(diff,25,255,cv2.THRESH_BINARY)[1]

diff=cv2.dilate(diff,es,iterations=2)

image,cnts,hierarchy=cv2.findContours(diff.copy(),cv2.RETR_EXTERNAL,

cv2.CHAIN_APPROX_SIMPLE)

forcincnts:

ifcv2.contourArea(c)<1500:

continue

(x,y,w,h)=cv2.boundingRect(c)

cv2.rectangle(frame,(x,y),(x+w,y+h),(0,255,0),2)

cv2.imshow("contours",frame)

cv2.imshow("dif",diff)

ifcv2.waitKey(1000/12)&0xff==ord("q"):

break

cv2.destroyAllWindows()

camera.release()

Afterthenecessaryimports,weopenthevideofeedobtainedfromthedefaultsystemcamera,andwesetthefirstframeasthebackgroundoftheentirefeed.Eachframereadfromthatpointonwardisprocessedtocalculatethedifferencebetweenthebackgroundandtheframeitself.Thisisatrivialoperation:

diff=cv2.threshold(diff,25,255,cv2.THRESH_BINARY)[1]

Beforewegettodothat,though,weneedtoprepareourframeforprocessing.Thefirstthingwedoisconverttheframetograyscaleandbluritabit:

Page 261: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

gray_frame=cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

gray_frame=cv2.GaussianBlur(gray_frame,(21,21),0)

NoteYoumaywonderabouttheblurring:thereasonwhyweblurtheimageisthat,ineachvideofeed,there’sanaturalnoisecomingfromnaturalvibrations,changesinlighting,andthenoisegeneratedbythecameraitself.Wewanttosmooththisnoiseoutsothatitdoesn’tgetdetectedasmotionandconsequentlygettracked.

Nowthatourframeisgrayscaledandsmoothed,wecancalculatethedifferencecomparedtothebackground(whichhasalsobeengrayscaledandsmoothed),andobtainamapofdifferences.Thisisnottheonlyprocessingstep,though.We’realsogoingtoapplyathreshold,soastoobtainablackandwhiteimage,anddilatetheimagesoholesandimperfectionsgetnormalized,likeso:

diff=cv2.absdiff(background,gray_frame)

diff=cv2.threshold(diff,25,255,cv2.THRESH_BINARY)[1]

diff=cv2.dilate(diff,es,iterations=2)

Notethaterodinganddilatingcanalsoactasanoisefilter,muchliketheblurringweapplied,andthatitcanalsobeobtainedinonefunctioncallusingcv2.morphologyEx,weshowbothstepsexplicitlyfortransparencypurposes.Allthatislefttodoatthispointistofindthecontoursofallthewhiteblobsinthecalculateddifferencemap,anddisplaythem.Optionally,weonlydisplaycontoursforrectanglesgreaterthananarbitrarythreshold,sotinymovementsarenotdisplayed.Naturally,thisisuptoyouandyourapplicationneeds.Withaconstantlightingandaverynoiselesscamera,youmaywishtohavenothresholdontheminimumsizeofthecontours.Thisishowwedisplaytherectangles:

image,cnts,hierarchy=cv2.findContours(diff.copy(),cv2.RETR_EXTERNAL,

cv2.CHAIN_APPROX_SIMPLE)

forcincnts:

ifcv2.contourArea(c)<1500:

continue

(x,y,w,h)=cv2.boundingRect(c)

cv2.rectangle(frame,(x,y),(x+w,y+h),(255,255,0),2)

cv2.imshow("contours",frame)

cv2.imshow("dif",diff)

OpenCVofferstwoveryhandyfunctions:

cv2.findContours:Thisfunctioncomputesthecontoursofsubjectsinanimagecv2.boundinRect:Thisfunctioncalculatestheirboundingbox

Sothereyouhaveit,abasicmotiondetectorwithrectanglesaroundsubjects.Thefinalresultissomethinglikethis:

Page 262: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Forsuchasimpletechnique,thisisquiteaccurate.However,thereareafewdrawbacksthatmakethisapproachunsuitableforallbusinessneeds,mostnotablythefactthatyouneedafirst“default”frametosetasabackground.Insituationssuchas—forexample—outdoorcameras,withlightschangingquiteconstantly,thisprocessresultsinaquiteinflexibleapproach,soweneedabitmoreintelligenceintooursystem.That’swherebackgroundsubtractorscomeintoplay.

Page 263: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 264: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Backgroundsubtractors–KNN,MOG2,andGMGOpenCVprovidesaclasscalledBackgroundSubtractor,whichisahandywaytooperateforegroundandbackgroundsegmentation.

ThisworkssimilarlytotheGrabCutalgorithmweanalyzedinChapter3,ProcessingImageswithOpenCV3,however,BackgroundSubtractorisafullyfledgedclasswithaplethoraofmethodsthatnotonlyperformbackgroundsubtraction,butalsoimprovebackgrounddetectionintimethroughmachinelearningandletsyousavetheclassifiertoafile.

TofamiliarizeourselveswithBackgroundSubtractor,let’slookatabasicexample:

importnumpyasnp

importcv2

cap=cv2.VideoCapture')

mog=cv2.createBackgroundSubtractorMOG2()

while(1):

ret,frame=cap.read()

fgmask=mog.apply(frame)

cv2.imshow('frame',fgmask)

ifcv2.waitKey(30)&0xff:

break

cap.release()

cv2.destroyAllWindows()

Let’sgothroughthisinorder.Firstofall,let’stalkaboutthebackgroundsubtractorobject.TherearethreebackgroundsubtractorsavailableinOpenCV3:K-NearestNeighbors(KNN),MixtureofGaussians(MOG2),andGeometricMultigrid(GMG),correspondingtothealgorithmusedtocomputethebackgroundsubtraction.

YoumayrememberthatwealreadyelaboratedonthetopicofforegroundandbackgrounddetectioninChapter5,DepthEstimationandSegmentation,inparticularwhenwetalkedaboutGrabCutandWatershed.

SowhydoweneedtheBackgroundSubtractorclasses?ThemainreasonbehindthisisthatBackgroundSubtractorclassesarespecificallybuiltwithvideoanalysisinmind,whichmeansthattheOpenCVBackgroundSubtractorclasses“learn”somethingabouttheenvironmentwitheveryframe.Forexample,withGMG,youcanspecifythenumberofframesusedtoinitializethevideoanalysis,withthedefaultbeing120(roughly5secondswithaveragecameras).TheconstantaspectabouttheBackgroundSubtractorclassesisthattheyoperateacomparisonbetweenframesandtheystoreahistory,whichallowsthemtoimprovemotionanalysisresultsastimepasses.

Anotherfundamental(andfrankly,quiteamazing)featureoftheBackgroundSubtractor

Page 265: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

classesistheabilitytocomputeshadows.Thisisabsolutelyvitalforanaccuratereadingofvideoframes;bydetectingshadows,youcanexcludeshadowareas(bythresholdingthem)fromtheobjectsyoudetected,andconcentrateontherealfeatures.Italsogreatlyreducestheunwanted“merging”ofobjects.AnimagecomparisonwillgiveyouagoodideaoftheconceptI’mtryingtoillustrate.Here’sasampleofbackgroundsubtractionwithoutshadowdetection:

Here’sanexampleofshadowdetection(withshadowsthresholded):

Page 266: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Notethatshadowdetectionisn’tabsolutelyperfect,butithelpsbringtheobjectcontoursbacktotheobject’soriginalshape.Let’stakealookatareimplementedexampleofmotiondetectionutilizingBackgroundSubtractorKNN:

importcv2

importnumpyasnp

bs=cv2.createBackgroundSubtractorKNN(detectShadows=True)

camera=cv2.VideoCapture("/path/to/movie.flv")

whileTrue:

ret,frame=camera.read()

fgmask=bs.apply(frame)

th=cv2.threshold(fgmask.copy(),244,255,cv2.THRESH_BINARY)[1]

dilated=cv2.dilate(th,cv2.getStructuringElement(cv2.MORPH_ELLIPSE,

(3,3)),iterations=2)

image,contours,hier=cv2.findContours(dilated,cv2.RETR_EXTERNAL,

cv2.CHAIN_APPROX_SIMPLE)

forcincontours:

ifcv2.contourArea(c)>1600:

(x,y,w,h)=cv2.boundingRect(c)

cv2.rectangle(frame,(x,y),(x+w,y+h),(255,255,0),2)

cv2.imshow("mog",fgmask)

cv2.imshow("thresh",th)

cv2.imshow("detection",frame)

ifcv2.waitKey(30)&0xff==27:

break

camera.release()

cv2.destroyAllWindows()

Asaresultoftheaccuracyofthesubtractor,anditsabilitytodetectshadows,weobtainareallyprecisemotiondetection,inwhichevenobjectsthatarenexttoeachotherdon’tgetmergedintoonedetection,asshowninthefollowingscreenshot:

Page 267: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

That’saremarkableresultforfewerthan30linesofcode!

Thecoreoftheentireprogramistheapply()methodofthebackgroundsubtractor;itcomputesaforegroundmask,whichcanbeusedasabasisfortherestoftheprocessing:

fgmask=bs.apply(frame)

th=cv2.threshold(fgmask.copy(),244,255,cv2.THRESH_BINARY)[1]

dilated=cv2.dilate(th,cv2.getStructuringElement(cv2.MORPH_ELLIPSE,

(3,3)),iterations=2)

image,contours,hier=cv2.findContours(dilated,cv2.RETR_EXTERNAL,

cv2.CHAIN_APPROX_SIMPLE)

forcincontours:

ifcv2.contourArea(c)>1600:

(x,y,w,h)=cv2.boundingRect(c)

cv2.rectangle(frame,(x,y),(x+w,y+h),(255,255,0),2)

Onceaforegroundmaskisobtained,wecanapplyathreshold:theforegroundmaskhaswhitevaluesfortheforegroundandgrayforshadows;thus,inthethresholdedimage,allpixelsthatarenotalmostpurewhite(244-255)arebinarizedto0insteadof1.

Fromthere,weproceedwiththesameapproachweadoptedforthebasicmotiondetectionexample:identifyingobjects,detectingcontours,anddrawingthemontheoriginalframe.

Page 268: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

MeanshiftandCAMShiftBackgroundsubtractionisareallyeffectivetechnique,butnottheonlyoneavailabletotrackobjectsinavideo.Meanshiftisanalgorithmthattracksobjectsbyfindingthemaximumdensityofadiscretesampleofaprobabilityfunction(inourcase,aregionofinterestinanimage)andrecalculatingitatthenextframe,whichgivesthealgorithmanindicationofthedirectioninwhichtheobjecthasmoved.

Thiscalculationgetsrepeateduntilthecentroidmatchestheoriginalone,orremainsunalteredevenafterconsecutiveiterationsofthecalculation.Thisfinalmatchingiscalledconvergence.Forreference,thealgorithmwasfirstdescribedinthepaper,Theestimationofthegradientofadensityfunction,withapplicationsinpatternrecognition,FukunagaK.andHoestetlerL.,IEEE,1975,whichisavailableathttp://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1055330&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1055330(notethatthispaperisnotfreefordownload).

Here’savisualrepresentationofthisprocess:

Asidefromthetheory,Meanshiftisveryusefulwhentrackingaparticularregionofinterestinavideo,andthishasaseriesofimplications;forexample,ifyoudon’tknowaprioriwhattheregionyouwanttotrackis,you’regoingtohavetomanagethiscleverlyanddevelopprogramsthatdynamicallystarttracking(andceasetracking)certainareasofthevideo,dependingonarbitrarycriteria.OneexamplecouldbethatyouoperateobjectdetectionwithatrainedSVM,andthenstartusingMeanshifttotrackadetectedobject.

Let’snotmakeourlifecomplicatedfromtheverybeginning,though;let’sfirstgetfamiliarwithMeanshift,andthenuseitinmorecomplexscenarios.

Wewillstartbysimplymarkingaregionofinterestandkeepingtrackofit,likeso:

Page 269: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

importnumpyasnp

importcv2

cap=cv2.VideoCapture(0)

ret,frame=cap.read()

r,h,c,w=10,200,10,200

track_window=(c,r,w,h)

roi=frame[r:r+h,c:c+w]

hsv_roi=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)

mask=cv2.inRange(hsv_roi,np.array((100.,30.,32.)),

np.array((180.,120.,255.)))

roi_hist=cv2.calcHist([hsv_roi],[0],mask,[180],[0,180])

cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)

term_crit=(cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,10,1)

whileTrue:

ret,frame=cap.read()

ifret==True:

hsv=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)

dst=cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)

#applymeanshifttogetthenewlocation

ret,track_window=cv2.meanShift(dst,track_window,term_crit)

#Drawitonimage

x,y,w,h=track_window

img2=cv2.rectangle(frame,(x,y),(x+w,y+h),255,2)

cv2.imshow('img2',img2)

k=cv2.waitKey(60)&0xff

ifk==27:

break

else:

break

cv2.destroyAllWindows()

cap.release()

Intheprecedingcode,IsuppliedtheHSVvaluesfortrackingsomeshadesoflilac,andhere’stheresult:

Page 270: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Ifyouranthecodeonyourmachine,you’dnoticehowtheMeanshiftwindowactuallylooksforthespecifiedcolorrange;ifitdoesn’tfindit,you’lljustseethewindowwobbling(itactuallylooksabitimpatient).Ifanobjectwiththespecifiedcolorrangeentersthewindow,thewindowwillthenstarttrackingit.

Let’sexaminethecodesothatwecanfullyunderstandhowMeanshiftperformsthistrackingoperation.

Page 271: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ColorhistogramsBeforeshowingthecodefortheprecedingexample,though,hereisanot-so-briefdigressiononcolorhistogramsandthetwoveryimportantbuilt-infunctionsofOpenCV:calcHistandcalcBackProject.

Thefunction,calcHist,calculatescolorhistogramsofanimage,sothenextlogicalstepistoexplaintheconceptofcolorhistograms.Acolorhistogramisarepresentationofthecolordistributionofanimage.Onthexaxisoftherepresentation,wehavecolorvalues,andontheyaxis,wehavethenumberofpixelscorrespondingtothecolorvalues.

Let’slookatavisualrepresentationofthisconcept,hopingtheadage,“apicturespeaksathousandwords”,willapplyinthisinstancetoo:

Thepictureshowsarepresentationofacolorhistogramwithonecolumnpervaluefrom0to180(notethatOpenCVusesHvalues0-180.Othersystemsmayuse0-360or0-255).

AsidefromMeanshift,colorhistogramsareusedforanumberofdifferentandusefulimageandvideoprocessingoperations.

ThecalcHistfunctionThecalcHist()functioninOpenCVhasthefollowingPythonsignature:

calcHist(...)

calcHist(images,channels,mask,histSize,ranges[,hist[,

accumulate]])->hist

Thedescriptionoftheparameters(astakenfromtheofficialOpenCVdocumentation)areasfollows:

Parameter Description

imagesThisparameteristhesourcearrays.Theyallshouldhavethesamedepth,CV_8UorCV_32F,andthesamesize.Eachofthemcanhaveanarbitrarynumberofchannels.

channels Thisparameteristhelistofthedimschannelsusedtocomputethehistogram.

maskThisparameteristheoptionalmask.Ifthematrixisnotempty,itmustbean8-bitarrayofthesamesizeasimages[i].Thenonzeromaskelementsmarkthearrayelementscountedinthehistogram.

histSize Thisparameteristhearrayofhistogramsizesineachdimension.

Page 272: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ranges Thisparameteristhearrayofthedimsarraysofthehistogrambinboundariesineachdimension.

hist Thisparameteristheoutputhistogram,whichisadenseorsparsedims(dimensional)array.

accumulate

Thisparameteristheaccumulationflag.Ifitisset,thehistogramisnotclearedinthebeginningwhenitisallocated.Thisfeatureenablesyoutocomputeasinglehistogramfromseveralsetsofarrays,ortoupdatethehistogramintime.

Inourexample,wecalculatethehistogramsoftheregionofinterestlikeso:

roi_hist=cv2.calcHist([hsv_roi],[0],mask,[180],[0,180])

ThiscanbeinterpretedasthecalculationofcolorhistogramsforanarrayofimagescontainingonlytheregionofinterestintheHSVspace.Inthisregion,wecomputeonlytheimagevaluescorrespondingtothemaskvaluesnotequalto0,with18histogramcolumns,andwitheachhistogramhaving0asthelowerboundaryand180astheupperboundary.

Thisisratherconvolutedtodescribebut,onceyouhavefamiliarizedyourselfwiththeconceptofahistogram,thepiecesofthepuzzleshouldclickintoplace.

ThecalcBackProjectfunctionTheotherfunctionthatcoversavitalroleintheMeanshiftalgorithm(butnotonlythis)iscalcBackProject,whichisshortforhistogrambackprojection(calculation).Ahistogrambackprojectionissocalledbecauseittakesahistogramandprojectsitbackontoanimage,withtheresultbeingtheprobabilitythateachpixelwillbelongtotheimagethatgeneratedthehistograminthefirstplace.Therefore,calcBackProjectgivesaprobabilityestimationthatacertainimageisequalorsimilartoamodelimage(fromwhichtheoriginalhistogramwasgenerated).

Again,ifyouthoughtcalcHistwasabitconvoluted,calcBackProjectisprobablyevenmorecomplex!

InsummaryThecalcHistfunctionextractsacolorhistogramfromanimage,givingastatisticalrepresentationofthecolorsinanimage,andcalcBackProjecthelpsincalculatingtheprobabilityofeachpixelofanimagebelongingtotheoriginalimage.

Page 273: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

BacktothecodeLet’sgetbacktoourexample.Firstourusualimports,andthenwemarktheinitialregionofinterest:

cap=cv2.VideoCapture(0)

ret,frame=cap.read()

r,h,c,w=10,200,10,200

track_window=(c,r,w,h)

Then,weextractandconverttheROItoHSVcolorspace:

roi=frame[r:r+h,c:c+w]

hsv_roi=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)

Now,wecreateamasktoincludeallpixelsoftheROIwithHSVvaluesbetweenthelowerandupperbounds:

mask=cv2.inRange(hsv_roi,np.array((100.,30.,32.)),

np.array((180.,120.,255.)))

Next,wecalculatethehistogramsoftheROI:

roi_hist=cv2.calcHist([hsv_roi],[0],mask,[180],[0,180])

cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)

Afterthehistogramsarecalculated,thevaluesarenormalizedtobeincludedwithintherange0-255.

Meanshiftperformsanumberofiterationsbeforereachingconvergence;however,thisconvergenceisnotassured.So,OpenCVallowsustopassso-calledterminationcriteria,whichisawaytospecifythebehaviorofMeanshiftwithregardtoterminatingtheseriesofcalculations:

term_crit=(cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,10,1)

Inthisparticularcase,we’respecifyingabehaviorthatinstructsMeanshifttostopcalculatingthecentroidshiftafterteniterationsorifthecentroidhasmovedatleast1pixel.Thatfirstflag(EPSorCRITERIA_COUNT)indicateswe’regoingtouseeitherofthetwocriteria(countor“epsilon”,meaningtheminimummovement).

Nowthatwehaveahistogramcalculated,andterminationcriteriaforMeanshift,wecanstartourusualinfiniteloop,grabthecurrentframefromthecamera,andstartprocessingit.ThefirstthingwedoisswitchtoHSVcolorspace:

ifret==True:

hsv=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)

NowthatwehaveanHSVarray,wecanoperatethelongawaitedhistogrambackprojection:

dst=cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)

TheresultofcalcBackProjectisamatrix.Ifyouprintedittoconsole,itlooksmoreorlesslikethis:

Page 274: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

[[000…,000]

[000…,000]

[000…,000]

...,

[0020…,000]

[78200…,000]

[25513720…,000]]

Eachpixelisrepresentedwithitsprobability.

ThismatrixcanthefinallybepassedintoMeanshift,togetherwiththetrackwindowandtheterminationcriteriaasoutlinedbythePythonsignatureofcv2.meanShift:

meanShift(...)

meanShift(probImage,window,criteria)->retval,window

Sohereitis:

ret,track_window=cv2.meanShift(dst,track_window,term_crit)

Finally,wecalculatethenewcoordinatesofthewindow,drawarectangletodisplayitintheframe,andthenshowit:

x,y,w,h=track_window

img2=cv2.rectangle(frame,(x,y),(x+w,y+h),255,2)

cv2.imshow('img2',img2)

That’sit.Youshouldbynowhaveagoodideaofcolorhistograms,backprojections,andMeanshift.However,thereremainsoneissuetoberesolvedwiththeprecedingprogram:thesizeofthewindowdoesnotchangewiththesizeoftheobjectintheframesbeingtracked.

Oneoftheauthoritiesincomputervisionandauthoroftheseminalbook,LearningOpenCV,GaryBradski,O’Reilly,publishedapaperin1988toimprovetheaccuracyofMeanshift,anddescribedanewalgorithmcalledContinuouslyAdaptiveMeanshift(CAMShift),whichisverysimilartoMeanshiftbutalsoadaptsthesizeofthetrackwindowwhenMeanshiftreachesconvergence.

Page 275: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 276: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CAMShiftWhileCAMShiftaddscomplexitytoMeanshift,theimplementationofthetheprecedingprogramusingCAMShiftissurprisingly(ornot?)similartotheMeanshiftexample,withthemaindifferencebeingthat,afterthecalltoCamShift,therectangleisdrawnwithaparticularrotationthatfollowstherotationoftheobjectbeingtracked.

Here’sthecodereimplementedwithCAMShift:

importnumpyasnp

importcv2

cap=cv2.VideoCapture(0)

#takefirstframeofthevideo

ret,frame=cap.read()

#setupinitiallocationofwindow

r,h,c,w=300,200,400,300#simplyhardcodedthevalues

track_window=(c,r,w,h)

roi=frame[r:r+h,c:c+w]

hsv_roi=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)

mask=cv2.inRange(hsv_roi,np.array((100.,30.,32.)),

np.array((180.,120.,255.)))

roi_hist=cv2.calcHist([hsv_roi],[0],mask,[180],[0,180])

cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)

term_crit=(cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,10,1)

while(1):

ret,frame=cap.read()

ifret==True:

hsv=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)

dst=cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)

ret,track_window=cv2.CamShift(dst,track_window,term_crit)

pts=cv2.boxPoints(ret)

pts=np.int0(pts)

img2=cv2.polylines(frame,[pts],True,255,2)

cv2.imshow('img2',img2)

k=cv2.waitKey(60)&0xff

ifk==27:

break

else:

break

cv2.destroyAllWindows()

cap.release()

ThedifferencebetweentheCAMShiftcodeandtheMeanshiftoneliesinthesefourlines:

Page 277: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ret,track_window=cv2.CamShift(dst,track_window,term_crit)

pts=cv2.boxPoints(ret)

pts=np.int0(pts)

img2=cv2.polylines(frame,[pts],True,255,2)

ThemethodsignatureofCamShiftisidenticaltoMeanshift.

TheboxPointsfunctionfindstheverticesofarotatedrectangle,whilethepolylinesfunctiondrawsthelinesoftherectangleontheframe.

Bynow,youshouldbefamiliarwiththethreeapproachesweadoptedfortrackingobjects:basicmotiondetection,Meanshift,andCAMShift.

Let’snowexploreanothertechnique:theKalmanfilter.

Page 278: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 279: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

TheKalmanfilterTheKalmanfilterisanalgorithmmainly(butnotonly)developedbyRudolfKalmaninthelate1950s,andhasfoundpracticalapplicationinmanyfields,particularlynavigationsystemsforallsortsofvehiclesfromnuclearsubmarinestoaircrafts.

TheKalmanfilteroperatesrecursivelyonstreamsofnoisyinputdata(whichincomputervisionisnormallyavideofeed)toproduceastatisticallyoptimalestimateoftheunderlyingsystemstate(thepositioninsidethevideo).

Let’stakeaquickexampletoconceptualizetheKalmanfilterandtranslatethepreceding(purposelybroadandgeneric)definitionintoplainerEnglish.Thinkofasmallredballonatable,andimagineyouhaveacamerapointingatthescene.Youidentifytheballasthesubjecttobetracked,andflickitwithyourfingers.Theballwillstartrollingonthetable,followingthelawsofmotionwe’refamiliarwith.

Iftheballisrollingataspeedof1meterpersecond(1m/s)inaparticulardirection,youdon’tneedtheKalmanfiltertoestimatewheretheballwillbein1second’stime:itwillbe1meteraway.TheKalmanfilterappliestheselawstopredictanobject’spositioninthecurrentvideoframebasedonobservationsgatheredinthepreviousframes.Naturally,theKalmanfiltercannotknowaboutapencilonthetabledeflectingthecourseoftheball,butitcanadjustforthiskindofunforeseeableevent.

Page 280: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PredictandupdateFromtheprecedingdescription,wegatherthattheKalmanfilteralgorithmisdividedintotwophases:

Predict:Inthefirstphase,theKalmanfilterusesthecovariancecalculateduptothecurrentpointintimetoestimatetheobject’snewpositionUpdate:Inthesecondphase,itrecordstheobject’spositionandadjuststhecovarianceforthenextcycleofcalculations

Thisadjustmentis—inOpenCVterms—acorrection,hencetheAPIoftheKalmanFilterclassinthePythonbindingsofOpenCVisasfollows:

classKalmanFilter(__builtin__.object)

|Methodsdefinedhere:

|

|__repr__(...)

|x.__repr__()<==>repr(x)

|

|correct(...)

|correct(measurement)->retval

|

|predict(...)

|predict([,control])->retval

Wecandeducethat,inourprograms,wewillcallpredict()toestimatethepositionofanobject,andcorrect()toinstructtheKalmanfiltertoadjustitscalculations.

Page 281: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

AnexampleUltimately,wewillaimtousetheKalmanfilterincombinationwithCAMShifttoobtainthehighestdegreeofaccuracyandperformance.However,beforewegointosuchlevelsofcomplexity,let’sanalyzeasimpleexample,specificallyonethatseemstobeverycommonontheWebwhenitcomestotheKalmanfilterandOpenCV:mousetracking.

Inthefollowingexample,wewilldrawanemptyframeandtwolines:onecorrespondingtotheactualmovementofthemouse,andtheothercorrespondingtotheKalmanfilterprediction.Here’sthecode:

importcv2

importnumpyasnp

frame=np.zeros((800,800,3),np.uint8)

last_measurement=current_measurement=np.array((2,1),np.float32)

last_prediction=current_prediction=np.zeros((2,1),np.float32)

defmousemove(event,x,y,s,p):

globalframe,current_measurement,measurements,last_measurement,

current_prediction,last_prediction

last_prediction=current_prediction

last_measurement=current_measurement

current_measurement=np.array([[np.float32(x)],[np.float32(y)]])

kalman.correct(current_measurement)

current_prediction=kalman.predict()

lmx,lmy=last_measurement[0],last_measurement[1]

cmx,cmy=current_measurement[0],current_measurement[1]

lpx,lpy=last_prediction[0],last_prediction[1]

cpx,cpy=current_prediction[0],current_prediction[1]

cv2.line(frame,(lmx,lmy),(cmx,cmy),(0,100,0))

cv2.line(frame,(lpx,lpy),(cpx,cpy),(0,0,200))

cv2.namedWindow("kalman_tracker")

cv2.setMouseCallback("kalman_tracker",mousemove)

kalman=cv2.KalmanFilter(4,2)

kalman.measurementMatrix=np.array([[1,0,0,0],[0,1,0,0]],np.float32)

kalman.transitionMatrix=np.array([[1,0,1,0],[0,1,0,1],[0,0,1,0],

[0,0,0,1]],np.float32)

kalman.processNoiseCov=np.array([[1,0,0,0],[0,1,0,0],[0,0,1,0],

[0,0,0,1]],np.float32)*0.03

whileTrue:

cv2.imshow("kalman_tracker",frame)

if(cv2.waitKey(30)&0xFF)==27:

break

cv2.destroyAllWindows()

Asusual,let’sanalyzeitstepbystep.Afterthepackagesimport,wecreateanemptyframe,ofsize800x800,andtheninitializethearraysthatwilltakethecoordinatesofthemeasurementsandpredictionsofthemousemovements:

Page 282: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

frame=np.zeros((800,800,3),np.uint8)

last_measurement=current_measurement=np.array((2,1),np.float32)

last_prediction=current_prediction=np.zeros((2,1),np.float32)

Then,wedeclarethemousemoveCallbackfunction,whichisgoingtohandlethedrawingofthetracking.Themechanismisquitesimple;westorethelastmeasurementsandlastprediction,correcttheKalmanwiththecurrentmeasurement,calculatetheKalmanprediction,andfinallydrawtwolines,fromthelastmeasurementtothecurrentandfromthelastpredictiontothecurrent:

defmousemove(event,x,y,s,p):

globalframe,current_measurement,measurements,last_measurement,

current_prediction,last_prediction

last_prediction=current_prediction

last_measurement=current_measurement

current_measurement=np.array([[np.float32(x)],[np.float32(y)]])

kalman.correct(current_measurement)

current_prediction=kalman.predict()

lmx,lmy=last_measurement[0],last_measurement[1]

cmx,cmy=current_measurement[0],current_measurement[1]

lpx,lpy=last_prediction[0],last_prediction[1]

cpx,cpy=current_prediction[0],current_prediction[1]

cv2.line(frame,(lmx,lmy),(cmx,cmy),(0,100,0))

cv2.line(frame,(lpx,lpy),(cpx,cpy),(0,0,200))

ThenextstepistoinitializethewindowandsettheCallbackfunction.OpenCVhandlesmouseeventswiththesetMouseCallbackfunction;specificeventsmustbehandledusingthefirstparameteroftheCallback(event)functionthatdetermineswhatkindofeventhasbeentriggered(click,move,andsoon):

cv2.namedWindow("kalman_tracker")

cv2.setMouseCallback("kalman_tracker",mousemove)

Nowwe’rereadytocreatetheKalmanfilter:

kalman=cv2.KalmanFilter(4,2)

kalman.measurementMatrix=np.array([[1,0,0,0],[0,1,0,0]],np.float32)

kalman.transitionMatrix=np.array([[1,0,1,0],[0,1,0,1],[0,0,1,0],

[0,0,0,1]],np.float32)

kalman.processNoiseCov=np.array([[1,0,0,0],[0,1,0,0],[0,0,1,0],

[0,0,0,1]],np.float32)*0.03

TheKalmanfilterclasstakesoptionalparametersinitsconstructor(fromtheOpenCVdocumentation):

dynamParams:ThisparameterstatesthedimensionalityofthestateMeasureParams:ThisparameterstatesthedimensionalityofthemeasurementControlParams:Thisparameterstatesthedimensionalityofthecontrolvector.type:ThisparameterstatesthetypeofthecreatedmatricesthatshouldbeCV_32ForCV_64F

Ifoundtheprecedingparameters(bothfortheconstructorandtheKalmanproperties)toworkverywell.

Page 283: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Fromthispointon,theprogramisstraightforward;everymousemovementtriggersaKalmanprediction,boththeactualpositionofthemouseandtheKalmanpredictionaredrawnintheframe,whichiscontinuouslydisplayed.Ifyoumoveyourmousearound,you’llnoticethat,ifyoumakeasuddenturnathighspeed,thepredictionlinewillhaveawidertrajectory,whichisconsistentwiththemomentumofthemousemovementatthetime.Here’sasampleresult:

Page 284: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Areal-lifeexample–trackingpedestriansUptothispoint,wehavefamiliarizedourselveswiththeconceptsofmotiondetection,objectdetection,andobjecttracking,soIimagineyouareanxioustoputthisnewfoundknowledgetogooduseinareal-lifescenario.Let’sdojustthatbyexaminingthevideofeedofasurveillancecameraandtrackingpedestriansinit.

Firstofall,weneedasamplevideo;ifyoudownloadtheOpenCVsource,youwillfindtheperfectvideofileforthispurposein<opencv_dir>/samples/data/768x576.avi.

Nowthatwehavetheperfectassettoanalyze,let’sstartbuildingtheapplication.

TheapplicationworkflowTheapplicationwilladheretothefollowinglogic:

1. Examinethefirstframe.2. Examinethefollowingframesandperformbackgroundsubtractiontoidentify

pedestriansinthesceneatthestartofthescene.3. EstablishanROIperpedestrian,anduseKalman/CAMShifttotrackgivinganIDto

eachpedestrian.4. Examinethenextframesfornewpedestriansenteringthescene.

Ifthiswereareal-worldapplication,youwouldprobablystorepedestrianinformationtoobtaininformationsuchastheaveragepermanenceofapedestrianinthesceneandmostlikelyroutes.However,thisisallbeyondtheremitofthisexampleapplication.

Inareal-worldapplication,youwouldmakesuretoidentifynewpedestriansenteringthescene,butfornow,we’llfocusontrackingthoseobjectsthatareinthesceneatthestartofthevideo,utilizingtheCAMShiftandKalmanfilteralgorithms.

Youwillfindthecodeforthisapplicationinchapter8/surveillance_demo/ofthecoderepository.

Abriefdigression–functionalversusobject-orientedprogrammingAlthoughmostprogrammersareeitherfamiliar(orworkonaconstantbasis)withObject-orientedProgramming(OOP),Ihavefoundthat,themoretheyearspass,themoreIpreferFunctionalProgramming(FP)solutions.

Forthosenotfamiliarwiththeterminology,FPisaprogrammingparadigmadoptedbymanylanguagesthattreatsprogramsastheevaluationofmathematicalfunctions,allowsfunctionstoreturnfunctions,andpermitsfunctionsasargumentsinafunction.ThestrengthofFPdoesnotonlyresideinwhatitcando,butalsoinwhatitcanavoid,oraimsatavoidingside-effectsandchangingstates.Ifthetopicoffunctionalprogramminghassparkedaninterest,makesuretocheckoutlanguagessuchasHaskell,Clojure,orML.

NoteWhatisaside-effectinprogrammingterms?Youcandefineasideeffectasanyfunction

Page 285: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

thatchangesanyvaluethatdoesnotdependonthefunction’sinput.Python,alongwithmanyotherlanguages,issusceptibletocausingside-effectsbecause—muchlike,forexample,JavaScript—itallowsaccesstoglobalvariables(andsometimesthisaccesstoglobalvariablescanbeaccidental!).

Anothermajorissueencounteredwithlanguagesthatarenotpurelyfunctionalisthefactthatafunction’sresultwillchangeovertime,dependingonthestateofthevariablesinvolved.Ifafunctiontakesanobjectasanargument—forexample—andthecomputationreliesontheinternalstateofthatobject,thefunctionwillreturndifferentresultsaccordingtothechangesintheobject’sstate.Thisissomethingthatverytypicallyhappensinlanguages,suchasCandC++,infunctionswhereoneormoreoftheargumentsarereferencestoobjects.

Whythisdigression?BecausesofarIhaveillustratedconceptsusingmostlyfunctions;Ididnotshyawayfromaccessingglobalvariableswherethiswasthesimplestandmostrobustapproach.However,thenextprogramwewillexaminewillcontainOOP.SowhydoIchoosetoadoptOOPwhileadvocatingFP?BecauseOpenCVhasquiteanopinionatedapproach,whichmakesithardtoimplementaprogramwithapurelyfunctionalorobject-orientedapproach.

Forexample,anydrawingfunction,suchascv2.rectangleandcv2.circle,modifiestheargumentpassedintoit.Thisapproachcontravenesoneofthecardinalrulesoffunctionalprogramming,whichistoavoidside-effectsandchangingstates.

Outofcuriosity,youcould—inPython—redeclaretheAPIofthesedrawingfunctionsinawaythatismoreFP-friendly.Forexample,youcouldrewritecv2.rectanglelikethis:

defdrawRect(frame,topLeft,bottomRight,color,thickness,fill=

cv2.LINE_AA):

newframe=frame.copy()

cv2.rectangle(newframe,topLeft,bottomRight,color,thickness,fill)

returnnewframe

Thisapproach—whilecomputationallymoreexpensiveduetothecopy()operation—allowstheexplicitreassignmentofaframe,likeso:

frame=camera.read()

frame=drawRect(frame,(0,0),(10,10),(0,255,0),1)

Toconcludethisdigression,Iwillreiterateabeliefveryoftenmentionedinallprogrammingforumsandresources:thereisnosuchthingasthebestlanguageorparadigm,onlythebesttoolforthejobinhand.

Solet’sgetbacktoourprogramandexploretheimplementationofasurveillanceapplication,trackingmovingobjectsinavideo.

Page 286: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ThePedestrianclassThemainrationalebehindthecreationofaPedestrianclassisthenatureoftheKalmanfilter.TheKalmanfiltercanpredictthepositionofanobjectbasedonhistoricalobservationsandcorrectthepredictionbasedontheactualdata,butitcanonlydothatforoneobject.

Asaconsequence,weneedoneKalmanfilterperobjecttracked.

SothePedestrianclasswillactasaholderforaKalmanfilter,acolorhistogram(calculatedonthefirstdetectionoftheobjectandusedasareferenceforthesubsequentframes),andinformationabouttheregionofinterest,whichwillbeusedbytheCAMShiftalgorithm(thetrack_windowparameter).

Furthermore,westoretheIDofeachpedestrianforsomefancyreal-timeinfo.

Let’stakealookatthePedestrianclass:

classPedestrian():

"""Pedestrianclass

eachpedestrianiscomposedofaROI,anIDandaKalmanfilter

sowecreateaPedestrianclasstoholdtheobjectstate

"""

def__init__(self,id,frame,track_window):

"""initthepedestrianobjectwithtrackwindowcoordinates"""

#setuptheroi

self.id=int(id)

x,y,w,h=track_window

self.track_window=track_window

self.roi=cv2.cvtColor(frame[y:y+h,x:x+w],cv2.COLOR_BGR2HSV)

roi_hist=cv2.calcHist([self.roi],[0],None,[16],[0,180])

self.roi_hist=cv2.normalize(roi_hist,roi_hist,0,255,

cv2.NORM_MINMAX)

#setupthekalman

self.kalman=cv2.KalmanFilter(4,2)

self.kalman.measurementMatrix=np.array([[1,0,0,0],

[0,1,0,0]],np.float32)

self.kalman.transitionMatrix=np.array([[1,0,1,0],[0,1,0,1],[0,0,1,0],

[0,0,0,1]],np.float32)

self.kalman.processNoiseCov=np.array([[1,0,0,0],[0,1,0,0],[0,0,1,0],

[0,0,0,1]],np.float32)*0.03

self.measurement=np.array((2,1),np.float32)

self.prediction=np.zeros((2,1),np.float32)

self.term_crit=(cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,10,

1)

self.center=None

self.update(frame)

def__del__(self):

print"Pedestrian%ddestroyed"%self.id

defupdate(self,frame):

Page 287: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

#print"updating%d"%self.id

hsv=cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)

back_project=cv2.calcBackProject([hsv],[0],self.roi_hist,[0,180],1)

ifargs.get("algorithm")=="c":

ret,self.track_window=cv2.CamShift(back_project,

self.track_window,self.term_crit)

pts=cv2.boxPoints(ret)

pts=np.int0(pts)

self.center=center(pts)

cv2.polylines(frame,[pts],True,255,1)

ifnotargs.get("algorithm")orargs.get("algorithm")=="m":

ret,self.track_window=cv2.meanShift(back_project,

self.track_window,self.term_crit)

x,y,w,h=self.track_window

self.center=center([[x,y],[x+w,y],[x,y+h],[x+w,y+h]])

cv2.rectangle(frame,(x,y),(x+w,y+h),(255,255,0),1)

self.kalman.correct(self.center)

prediction=self.kalman.predict()

cv2.circle(frame,(int(prediction[0]),int(prediction[1])),4,(0,255,

0),-1)

#fakeshadow

cv2.putText(frame,"ID:%d->%s"%(self.id,self.center),(11,

(self.id+1)*25+1),

font,0.6,

(0,0,0),

1,

cv2.LINE_AA)

#actualinfo

cv2.putText(frame,"ID:%d->%s"%(self.id,self.center),(10,

(self.id+1)*25),

font,0.6,

(0,255,0),

1,

cv2.LINE_AA)

Atthecoreoftheprogramliesthebackgroundsubtractorobject,whichletsusidentifyregionsofinterestcorrespondingtomovingobjects.

Whentheprogramstarts,wetakeeachoftheseregionsandinstantiateaPedestrianclass,passingtheID(asimplecounter),andtheframeandtrackwindowcoordinates(sowecanextracttheRegionofInterest(ROI),and,fromthis,theHSVhistogramoftheROI).

Theconstructorfunction(__init__inPython)ismoreorlessanaggregationofallthepreviousconcepts:givenanROI,wecalculateitshistogram,setupaKalmanfilter,andassociateittoaproperty(self.kalman)oftheobject.

Intheupdatemethod,wepassthecurrentframeandconvertittoHSVsothatwecancalculatethebackprojectionofthepedestrian’sHSVhistogram.

WethenuseeitherCAMShiftorMeanshift(dependingontheargumentpassed;Meanshiftisthedefaultifnoargumentsarepassed)totrackthemovementofthepedestrian,andcorrecttheKalmanfilterforthatpedestrianwiththeactualposition.

Page 288: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

WealsodrawbothCAMShift/Meanshift(withasurroundingrectangle)andKalman(withadot),soyoucanobserveKalmanandCAMShift/Meanshiftgonearlyhandinhand,exceptforsuddenmovementsthatcauseKalmantohavetoreadjust.

Lastly,weprintsomepedestrianinformationonthetop-leftcorneroftheimage.

Page 289: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ThemainprogramNowthatwehaveaPedestrianclassholdingallspecificinformationforeachobject,let’stakealookatthemainfunctionintheprogram.

First,weloadavideo(itcouldbeawebcam),andthenweinitializeabackgroundsubtractor,setting20framesastheframesaffectingthebackgroundmodel:

history=20

bs=cv2.createBackgroundSubtractorKNN(detectShadows=True)

bs.setHistory(history)

Wealsocreatethemaindisplaywindow,andthensetupapedestriansdictionaryandafirstFrameflag,whichwe’regoingtousetoallowafewframesforthebackgroundsubtractortobuildhistory,soitcanbetteridentifymovingobjects.Tohelpwiththis,wealsosetupaframecounter:

cv2.namedWindow("surveillance")

pedestrians={}

firstFrame=True

frames=0

Nowwestarttheloop.Wereadcameraframes(orvideoframes)onebyone:

whileTrue:

print"--------------------FRAME%d--------------------"%frames

grabbed,frane=camera.read()

if(grabbedisFalse):

print"failedtograbframe."

break

ret,frame=camera.read()

WeletBackgroundSubtractorKNNbuildthehistoryforthebackgroundmodel,sowedon’tactuallyprocessthefirst20frames;weonlypassthemintothesubtractor:

fgmask=bs.apply(frame)

#thisisjusttoletthebackgroundsubtractorbuildabitofhistory

ifframes<history:

frames+=1

continue

Thenweprocesstheframewiththeapproachexplainedearlierinthechapter,byapplyingaprocessofdilationanderosionontheforegroundmasksoastoobtaineasilyidentifiableblobsandtheirboundingboxes.Theseareobviouslymovingobjectsintheframe:

th=cv2.threshold(fgmask.copy(),127,255,cv2.THRESH_BINARY)[1]

th=cv2.erode(th,cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3)),

iterations=2)

dilated=cv2.dilate(th,cv2.getStructuringElement(cv2.MORPH_ELLIPSE,

(8,3)),iterations=2)

image,contours,hier=cv2.findContours(dilated,cv2.RETR_EXTERNAL,

cv2.CHAIN_APPROX_SIMPLE)

Oncethecontoursareidentified,weinstantiateonepedestrianpercontourforthefirst

Page 290: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

frameonly(notethatIsetaminimumareaforthecontourtofurtherdenoiseourdetection):

counter=0

forcincontours:

ifcv2.contourArea(c)>500:

(x,y,w,h)=cv2.boundingRect(c)

cv2.rectangle(frame,(x,y),(x+w,y+h),(0,255,0),1)

#onlycreatepedestriansinthefirstframe,thenjustfollowthe

onesyouhave

iffirstFrameisTrue:

pedestrians[counter]=Pedestrian(counter,frame,(x,y,w,h))

counter+=1

Then,foreachpedestriandetected,weperformanupdatemethodpassingthecurrentframe,whichisneededinitsoriginalcolorspace,becausethepedestrianobjectsareresponsiblefordrawingtheirowninformation(textandMeanshift/CAMShiftrectangles,andKalmanfiltertracking):

fori,pinpedestrians.iteritems():

p.update(frame)

WesetthefirstFrameflagtoFalse,sowedon’tinstantiateanymorepedestrians;wejustkeeptrackoftheoneswehave:

firstFrame=False

frames+=1

Finally,weshowtheresultinthedisplaywindow.TheprogramcanbeexitedbypressingtheEsckey:

cv2.imshow("surveillance",frame)

ifcv2.waitKey(110)&0xff==27:

break

if__name__=="__main__":

main()

Thereyouhaveit:CAMShift/MeanshiftworkingintandemwiththeKalmanfiltertotrackmovingobjects.Allbeingwell,youshouldobtainaresultsimilartothis:

Inthisscreenshot,thebluerectangleistheCAMShiftdetectionandthegreenrectangleistheKalmanfilterpredictionwithitscenteratthebluecircle.

Wheredowegofromhere?

Page 291: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Thisprogramconstitutesabasisforyourapplicationdomain’sneeds.Therearemanyimprovementsthatcanbemadebuildingontheprogramabovetosuitanapplication’sadditionalrequirements.Considerthefollowingexamples:

YoucoulddestroyapedestrianobjectifKalmanpredictsitspositiontobeoutsidetheframeYoucouldcheckwhethereachdetectedmovingobjectiscorrespondingtoexistingpedestrianinstances,andifnot,createaninstanceforitYoucouldtrainanSVMandoperateclassificationoneachmovingobjecttoestablishwhetherornotthemovingobjectisofthenatureyouintendtotrack(forinstance,adogmightenterthescenebutyourapplicationrequirestoonlytrackhumans)

Whateveryourneeds,hopefullythischapterwillhaveprovidedyouwiththenecessaryknowledgetobuildapplicationsthatsatisfyyourrequirements.

Page 292: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 293: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryThischapterexploredthevastandcomplextopicofvideoanalysisandtrackingobjects.

Welearnedaboutvideobackgroundsubtractionwithabasicmotiondetectiontechniquethatcalculatesframedifferences,andthenmovedtomorecomplexandefficienttoolssuchasBackgroundSubtractor.

Wethenexploredtwoveryimportantvideoanalysisalgorithms:MeanshiftandCAMShift.Inthecourseofthis,wetalkedindetailaboutcolorhistogramsandbackprojections.WealsofamiliarizedourselveswiththeKalmanfilter,anditsusefulnessinacomputervisioncontext.Finally,weputallourknowledgetogetherinasamplesurveillanceapplication,whichtracksmovingobjectsinavideo.

NowthatourfoundationinOpenCVandmachinelearningissolidifying,wearereadytotackleartificialneuralnetworksanddivedeeperintoartificialintelligencewithOpenCVandPythoninthenextchapter.

Page 294: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 295: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Chapter9.NeuralNetworkswithOpenCV–anIntroductionMachinelearningisabranchofartificialintelligence,onethatdealsspecificallywithalgorithmsthatenableamachinetorecognizepatternsandtrendsindataandsuccessfullymakepredictionsandclassifications.

ManyofthealgorithmsandtechniquesusedbyOpenCVtoaccomplishsomeofthemoreadvancedtasksincomputervisionaredirectlyrelatedtoartificialintelligenceandmachinelearning.

ThischapterwillintroduceyoutotheMachineLearningconceptsinOpenCVsuchasartificialneuralnetworks.Thisisagentleintroductionthatbarelyscratchesthesurfaceofavastworld,thatofMachineLearning,whichiscontinuouslyevolving.

Page 296: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ArtificialneuralnetworksLet’sstartbydefiningArtificialNeuralNetworks(ANN)withanumberoflogicalsteps,ratherthanaclassicmonolithicsentenceusingobscurejargonwithanevenmoreobscuremeaning.

Firstofall,anANNisastatisticalmodel.Whatisastatisticalmodel?Astatisticalmodelisapairofelements,namelythespaceS(asetofobservations)andtheprobabilityP,wherePisadistributionthatapproximatesS(inotherwords,afunctionthatwouldgenerateasetofobservationsthatisverysimilartoS).

IliketothinkofPintwoways:asasimplificationofacomplexscenario,andasthefunctionthatgeneratedSinthefirstplace,orattheveryleastasetofobservationsverysimilartoS.

SoANNsaremodelsthattakeacomplexreality,simplifyit,anddeduceafunctionto(approximately)representstatisticalobservationsonewouldexpectfromthatreality,inmathematicalform.

ThenextstepinourjourneytowardscomprehendingANNsistounderstandhowanANNimprovesontheconceptofasimplestatisticalmodel.

Whatifthefunctionthatgeneratedthedatasetislikelytotakealargeamountof(unknown)inputs?

TheapproachthatANNstakeistodelegateworktoanumberofneurons,nodes,orunits,eachofwhichiscapableof“approximating”thefunctionthatcreatedtheinputs.Approximationismathematicallythedefinitionofasimplerfunctionthatapproximatesamorecomplexfunction,whichenablesustodefineerrors(relativetotheapplicationdomain).Furthermore,foraccuracy’ssake,anetworkisgenerallyrecognizedtobeneuraliftheneuronsorunitsarecapableofapproximatinganonlinearfunction.

Let’stakeacloserlookatneurons.

Page 297: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

NeuronsandperceptronsTheperceptronisaconceptthatdatesbacktothe1950s,and(toputitsimply)aperceptronisafunctionthattakesanumberofinputsandproducesasinglevalue.

Eachoftheinputshasanassociatedweightthatsignifiestheimportanceoftheinputinthefunction.Thesigmoidfunctionproducesasinglevalue:

Asigmoidfunctionisatermthatindicatesthatthefunctionproduceseithera0or1value.Thediscriminantisathresholdvalue;iftheweightedsumoftheinputsisgreaterthanacertainthreshold,theperceptronproducesabinaryclassificationof1,otherwise0.

Howaretheseweightsdetermined,andwhatdotheyrepresent?

Neuronsareinterconnectedtoeachother,andeachneuron’ssetofweights(thesearejustnumericalparameters)definesthestrengthoftheconnectiontootherneurons.Theseweightsare“adaptive”,meaningtheychangeintimeaccordingtoalearningalgorithm.

Page 298: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 299: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ThestructureofanANNHere’savisualrepresentationofaneuralnetwork:

Asyoucanseefromthefigure,therearethreedistinctlayersinaneuralnetwork:Inputlayer,Hiddenlayer(ormiddle),andOutputlayer.

Therecanbemorethanonehiddenlayer;however,onehiddenlayerwouldbeenoughtoresolvethemajorityofreal-lifeproblems.

Page 300: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

NetworklayersbyexampleHowdowedeterminethenetwork’stopology,andhowmanyneuronstocreateforeachlayer?Let’smakethisdeterminationlayerbylayer.

TheinputlayerTheinputlayerdefinesthenumberofinputsintothenetwork.Forexample,let’ssayyouwanttocreateanANN,whichwillhelpyoudeterminewhatanimalyou’relookingatgivenadescriptionofitsattributes.Let’sfixtheseattributestoweight,length,andteeth.That’sasetofthreeattributes;ournetworkwillneedtocontainthreeinputnodes.

TheoutputlayerTheoutputlayerisequaltothenumberofclassesweidentified.Continuingwiththeprecedingexampleofananimalclassificationnetwork,wewillarbitrarilysettheoutputlayerto4,becauseweknowwe’regoingtodealwiththefollowinganimals:dog,condor,dolphin,anddragon.Ifwefeedindataforananimalthatisnotinoneofthesecategories,thenetworkwillreturntheclassmostlikelytoresemblethisunclassifiedanimal.

ThehiddenlayerThehiddenlayercontainsperceptrons.Asmentioned,thevastmajorityofproblemsonlyrequireonehiddenlayer;mathematicallyspeaking,thereisnoverifiedreasontohavemorethantwohiddenlayers.Wewill,therefore,sticktoonehiddenlayerandworkwiththat.

Thereareanumberofrulesofthumbtodeterminethenumberofneuronscontainedinthehiddenlayer,butthereisnohard-and-fastrule.Theempiricalwayisyourfriendinthisparticularcircumstance:testyournetworkwithdifferentsettings,andchoosetheonethatfitsbest.

ThesearesomeofthemostcommonrulesusedwhenbuildinganANN:

Thenumberofhiddenneuronsshouldbebetweenthesizeoftheinputlayerandthesizeoftheoutputlayer.Ifthedifferencebetweentheinputlayersizeandtheoutputlayerislarge,itismyexperiencethatahiddenlayersizemuchclosertotheoutputlayerispreferable.Forrelativelysmallinputlayers,thenumberofhiddenneuronsistwo-thirdsthesizeoftheinputlayer,plusthesizeoftheoutputlayer,orlessthantwicethesizeoftheinputlayer.

Oneveryimportantfactortokeepinmindisoverfitting.Overfittingoccurswhenthere’ssuchaninordinateamountofinformationcontainedinthehiddenlayer(forexample,adisproportionateamountofneuronsinthelayer)comparedtotheinformationprovidedbythetrainingdatathatclassificationisnotverymeaningful.

Thelargerthehiddenlayer,themoretraininginformationisrequiredforthenetworktobetrainedproperly.And,needlesstosay,thisisgoingtolengthenthetimerequiredbythenetworktoproperlytrain.

Page 301: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

So,followingthesecondrulesofthumbillustratedearlier,ournetworkwillhaveahiddenlayerofsize8,justbecauseafterafewrunsofthenetwork,Ifoundittoyieldthebestresults.Asasidenote,theempiricalapproachisverymuchencouragedintheworldofANNs.Thebestnetworktopologyisrelatedtothetypeofdatafedtothenetwork,sodon’trefrainfromtestingANNsinatrial-and-errorfashion.

Insummary,ournetworkhasthefollowingsizes:

Input:3Hidden:8Output:4

Thelearningalgorithms

ThereareanumberoflearningalgorithmsusedbyANNs,butwecanidentifythreemajorones:

Supervisedlearning:Withthisalgorithm,wewanttoobtainafunctionfromtheANN,whichdescribesthedatawelabeled.Weknow,apriori,thenatureofthisdata,andwedelegatetotheANNtheprocessoffindingafunctionthatdescribesthedata.Unsupervisedlearning:Thisalgorithmdiffersfromsupervisedlearning;inthis,thedataisunlabeled.Thisimpliesthatwedon’thavetoselectandlabeldata,butitalsomeanstheANNhasalotmoreworktodo.Theclassificationofthedataisusuallyobtainedthroughtechniquessuchas(butnotonly)clustering,whichweexploredinChapter7,DetectingandRecognizingObjects.Reinforcementlearning:Reinforcementlearningisalittlemorecomplex.Asystemreceivesaninput;adecision-makingmechanismdeterminesanaction,whichisperformedandscored(success/failureandgradesinbetween);andfinallytheinputandtheactionarepairedwiththeirscore,sothesystemlearnstorepeatorchangetheactiontobeperformedforacertaininputorstate.

NowthatwehaveageneralideaofwhatANNsare,let’sseehowOpenCVimplementsthem,andhowtoputthemtogooduse.Finally,we’llworkourwayuptoafullblownapplication,inwhichwewillattempttorecognizehandwrittendigits.

Page 302: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 303: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ANNsinOpenCVUnsurprisingly,ANNsresideinthemlmoduleofOpenCV.

Let’sexamineadummyexample,asagentleintroductiontoANNs:

importcv2

importnumpyasnp

ann=cv2.ml.ANN_MLP_create()

ann.setLayerSizes(np.array([9,5,9],dtype=np.uint8))

ann.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)

ann.train(np.array([[1.2,1.3,1.9,2.2,2.3,2.9,3.0,3.2,3.3]],

dtype=np.float32),

cv2.ml.ROW_SAMPLE,

np.array([[0,0,0,0,0,1,0,0,0]],dtype=np.float32))

printann.predict(np.array([[1.4,1.5,1.2,2.,2.5,2.8,3.,3.1,3.8]],

dtype=np.float32))

First,wecreateanANN:

ann=cv2.ml.ANN_MLP_create()

YoumaywonderabouttheMLPacronyminthefunctionname;itstandsformultilayerperceptron.Bynow,youshouldknowwhataperceptronis.

Aftercreatingthenetwork,weneedtosetitstopology:

ann.setLayerSizes(np.array([9,5,9],dtype=np.uint8))

ann.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)

ThelayersizesaredefinedbytheNumPyarraythatispassedintothesetLayerSizesmethod.Thefirstelementsetsthesizeoftheinputlayer,thelastelementsetsthesizeoftheoutputlayer,andallintermediaryelementsdefinethesizeofthehiddenlayers.

Wethensetthetrainmethodtobebackpropagation.Therearetwochoiceshere:BACKPROPandRPROP.

BothBACKPROPandRPROParebackpropagationalgorithms—insimpleterms,algorithmsthathaveaneffectonweightsbasedonerrorsinclassification.

Thesetwoalgorithmsworkinthecontextofsupervisedlearning,whichiswhatweareusingintheexample.Howcanwetellthisparticulardetail?Lookatthenextstatement:

ann.train(np.array([[1.2,1.3,1.9,2.2,2.3,2.9,3.0,3.2,3.3]],

dtype=np.float32),

cv2.ml.ROW_SAMPLE,

np.array([[0,0,0,0,0,1,0,0,0]],dtype=np.float32))

Youshouldnoticeanumberofdetails.Themethodlooksextremelysimilartothetrain()methodofsupportvectormachine.Themethodcontainsthreeparameters:samples,layout,andresponses.Onlysamplesistherequiredparameter;theothertwoareoptional.

Page 304: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Thisrevealsthefollowinginformation:

First,ANN,likeSVM,isanOpenCVStatModel(statisticalmodel);trainandpredictarethemethodsinheritedfromthebaseStatModelclass.Second,astatisticalmodeltrainedwithonlysamplesisadoptinganunsupervisedlearningalgorithm.Ifweprovidelayoutandresponses,we’reinasupervisedlearningcontext.

Aswe’reusingANNs,wecanspecifythetypeofbackpropagationalgorithmwe’regoingtouse(BACKPROPorRPROP),because—aswesaid—we’reinasupervisedlearningenvironment.

Sowhatisbackpropagation?Backpropagationisatwo-phasealgorithmthatcalculatestheerrorofpredictionsandupdatesinbothdirectionsofthenetwork(theinputandoutputlayers);itthenupdatestheneuronweightsaccordingly.

Let’straintheANN;aswespecifiedaninputlayerofsize9,weneedtoprovide9inputs,and9outputstoreflectthesizeoftheoutputlayer:

ann.train(np.array([[1.2,1.3,1.9,2.2,2.3,2.9,3.0,3.2,3.3]],

dtype=np.float32),

cv2.ml.ROW_SAMPLE,

np.array([[0,0,0,0,0,1,0,0,0]],dtype=np.float32))

Thestructureoftheresponseissimplyanarrayofzeros,witha1valueinthepositionindicatingtheclasswewanttoassociatetheinputwith.Inourprecedingexample,weindicatedthatthespecifiedinputarraycorrespondstoclass5(classesarezero-indexed)ofclasses0to8.

Lastly,weperformclassification:

printann.predict(np.array([[1.4,1.5,1.2,2.,2.5,2.8,3.,3.1,3.8]],

dtype=np.float32))

Thiswillyieldthefollowingresult:

(5.0,array([[-0.06419383,-0.13360272,-0.1681568,-0.18708915,

0.0970564,

0.89237726,0.05093023,0.17537238,0.13388439]],dtype=float32))

Thismeansthattheprovidedinputwasclassifiedasbelongingtoclass5.Thisisonlyadummyexampleandtheclassificationisprettymeaningless;however,thenetworkbehavedcorrectly.Inthiscode,weonlyprovidedonetrainingrecordforclass5,sothenetworkclassifiedanewinputasbelongingtoclass5.

Asyoumayhaveguessed,theoutputofapredictionisatuple,withthefirstvaluebeingtheclassandthesecondbeinganarraycontainingtheprobabilitiesforeachclass.Thepredictedclasswillhavethehighestvalue.Let’smoveontoaslightlymoreusefulexample:animalclassification.

Page 305: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ANN-imalclassificationPickingupfromwhereweleftoff,let’sillustrateaverysimpleexampleofanANNthatattemptstoclassifyanimalsbasedontheirstatistics(weight,length,andteeth).Myintentistodescribeamockreal-lifescenariotoimproveourunderstandingofANNsbeforewestartapplyingittocomputervisionand,specifically,OpenCV:

importcv2

importnumpyasnp

fromrandomimportrandint

animals_net=cv2.ml.ANN_MLP_create()

animals_net.setTrainMethod(cv2.ml.ANN_MLP_RPROP|

cv2.ml.ANN_MLP_UPDATE_WEIGHTS)

animals_net.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)

animals_net.setLayerSizes(np.array([3,8,4]))

animals_net.setTermCriteria((cv2.TERM_CRITERIA_EPS|

cv2.TERM_CRITERIA_COUNT,10,1))

"""Inputarrays

weight,length,teeth

"""

"""Outputarrays

dog,eagle,dolphinanddragon

"""

defdog_sample():

return[randint(5,20),1,randint(38,42)]

defdog_class():

return[1,0,0,0]

defcondor_sample():

return[randint(3,13),3,0]

defcondor_class():

return[0,1,0,0]

defdolphin_sample():

return[randint(30,190),randint(5,15),randint(80,100)]

defdolphin_class():

return[0,0,1,0]

defdragon_sample():

return[randint(1200,1800),randint(15,40),randint(110,180)]

defdragon_class():

return[0,0,0,1]

SAMPLES=5000

forxinrange(0,SAMPLES):

print"Samples%d/%d"%(x,SAMPLES)

Page 306: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

animals_net.train(np.array([dog_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([dog_class()],dtype=np.float32))

animals_net.train(np.array([condor_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([condor_class()],dtype=np.float32))

animals_net.train(np.array([dolphin_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([dolphin_class()],dtype=np.float32))

animals_net.train(np.array([dragon_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([dragon_class()],dtype=np.float32))

printanimals_net.predict(np.array([dog_sample()],dtype=np.float32))

printanimals_net.predict(np.array([condor_sample()],dtype=np.float32))

printanimals_net.predict(np.array([dragon_sample()],dtype=np.float32))

Thereareagoodfewdifferencesbetweenthisexampleandthedummyexample,solet’sexaminetheminorder.

First,theusualimports.Then,weimportrandint,justbecausewewanttogeneratesomerelativelyrandomdata:

importcv2

importnumpyasnp

fromrandomimportrandint

Then,wecreatetheANN.Thistime,wespecifythetrainmethodtoberesilientbackpropagation(animprovedversionofbackpropagation)andtheactivationfunctiontobeasigmoidfunction:

animals_net=cv2.ml.ANN_MLP_create()

animals_net.setTrainMethod(cv2.ml.ANN_MLP_RPROP|

cv2.ml.ANN_MLP_UPDATE_WEIGHTS)

animals_net.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)

animals_net.setLayerSizes(np.array([3,8,4]))

Also,wespecifytheterminationcriteriasimilarlytothewaywedidintheCAMShiftalgorithminthepreviouschapter:

animals_net.setTermCriteria((cv2.TERM_CRITERIA_EPS|

cv2.TERM_CRITERIA_COUNT,10,1))

Nowweneedsomedata.We’renotreallysomuchinterestedinrepresentinganimalsaccurately,asrequiringabunchofrecordstobeusedastrainingdata.Sowebasicallydefinefoursamplecreationfunctionsandfourclassificationfunctionsthatwillhelpustrainthenetwork:

"""Inputarrays

weight,length,teeth

"""

"""Outputarrays

dog,eagle,dolphinanddragon

"""

defdog_sample():

return[randint(5,20),1,randint(38,42)]

Page 307: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

defdog_class():

return[1,0,0,0]

defcondor_sample():

return[randint(3,13),3,0]

defcondor_class():

return[0,1,0,0]

defdolphin_sample():

return[randint(30,190),randint(5,15),randint(80,100)]

defdolphin_class():

return[0,0,1,0]

defdragon_sample():

return[randint(1200,1800),randint(15,40),randint(110,180)]

defdragon_class():

return[0,0,0,1]

Let’sproceedwiththecreationofourfakeanimaldata;we’llcreate5,000samplesperclass:

SAMPLES=5000

forxinrange(0,SAMPLES):

print"Samples%d/%d"%(x,SAMPLES)

animals_net.train(np.array([dog_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([dog_class()],dtype=np.float32))

animals_net.train(np.array([condor_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([condor_class()],dtype=np.float32))

animals_net.train(np.array([dolphin_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([dolphin_class()],dtype=np.float32))

animals_net.train(np.array([dragon_sample()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([dragon_class()],dtype=np.float32))

Intheend,weprinttheresultsthatyieldthefollowingcode:

(1.0,array([[1.49817729,1.60551953,-1.56444871,-0.04313202]],

dtype=float32))

(1.0,array([[1.49817729,1.60551953,-1.56444871,-0.04313202]],

dtype=float32))

(3.0,array([[-1.54576635,-1.68725526,1.6469276,2.23223686]],

dtype=float32))

Fromtheseresults,wededucethefollowing:

Thenetworkgottwooutofthreesamplescorrect,whichisnotperfectbutservesasagoodexampletoillustratetheimportanceofalltheelementsinvolvedinbuildingandtraininganANN.Thesizeoftheinputlayerisveryimportanttocreatediversificationbetweenthedifferentclasses.Inourcase,weonlyhadthreestatisticsandthereisarelativeoverlappinginfeatures.Thesizeofthehiddenlayerneedstobetested.Youwillfindthatincreasingneuronsmayimproveaccuracytoapoint,andthenitwilloverfit,unlessyoustartcompensatingwithenormousamountsofdata:thenumberoftrainingrecords.

Page 308: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Definitely,avoidhavingtoofewrecordsorfeedingalotofidenticalrecordsastheANNwon’tlearnmuchfromthem.

Page 309: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

TrainingepochsAnotherimportantconceptintrainingANNsistheideaofepochs.Atrainingepochisaniterationthroughthetrainingdata,afterwhichthedataistestedforclassification.MostANNstrainoverseveralepochs;you’llfindthatsomeofthemostcommonexamplesofANNs,classifyinghandwrittendigits,willhavethetrainingdataiteratedseveralhundredtimes.

IpersonallysuggestyouspendalotoftimeplayingwithANNsandthenumberofepochs,untilyoureachconvergence,whichmeansthatfurtheriterationswillnolongerimprove(atleastnotnoticeably)theaccuracyoftheresults.

Theprecedingexamplecanbemodifiedasfollowstoleverageepochs:

defrecord(sample,classification):

return(np.array([sample],dtype=np.float32),np.array([classification],

dtype=np.float32))

records=[]

RECORDS=5000

forxinrange(0,RECORDS):

records.append(record(dog_sample(),dog_class()))

records.append(record(condor_sample(),condor_class()))

records.append(record(dolphin_sample(),dolphin_class()))

records.append(record(dragon_sample(),dragon_class()))

EPOCHS=5

foreinrange(0,EPOCHS):

print"Epoch%d:"%e

fort,cinrecords:

animals_net.train(t,cv2.ml.ROW_SAMPLE,c)

Then,dosometests,startingwiththedogclass:

dog_results=0

forxinrange(0,100):

clas=int(animals_net.predict(np.array([dog_sample()],

dtype=np.float32))[0])

print"class:%d"%clas

if(clas)==0:

dog_results+=1

Repeatoverallclassesandoutputtheresults:

print"Dogaccuracy:%f"%(dog_results)

print"condoraccuracy:%f"%(condor_results)

print"dolphinaccuracy:%f"%(dolphin_results)

print"dragonaccuracy:%f"%(dragon_results)

Finally,weobtainthefollowingresults:

Dogaccuracy:100.000000%

condoraccuracy:0.000000%

dolphinaccuracy:0.000000%

dragonaccuracy:92.000000%

Page 310: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Considerthefactthatwe’reonlyplayingwithtoy/fakedataandthesizeoftrainingdata/trainingiterations;thisteachesusquitealot.WecandiagnosetheANNasoverfittingtowardscertainclasses,soit’simportanttoimprovethequalityofthedatayoufeedintothetrainingprocess.

Allthatsaid,timeforareal-lifeexample:handwrittendigitrecognition.

Page 311: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 312: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

HandwrittendigitrecognitionwithANNsTheworldofMachineLearningisvastandmostlyunexplored,andANNsarebutoneofthemanyconceptsrelatedtoMachineLearning,whichisoneofthemanysubdisciplinesofArtificialIntelligence.Forthepurposeofthischapter,wewillonlybeexploringtheconceptofANNsinthecontextofOpenCV.ItisbynomeansanexhaustivetreatiseonthesubjectofArtificialIntelligence.

Ultimately,we’reinterestedinseeingANNsworkintherealworld.Solet’sgoaheadandmakeithappen.

Page 313: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

MNIST–thehandwrittendigitdatabaseOneofthemostpopularresourcesontheWebforthetrainingofclassifiersdealingwithOCRandhandwrittencharacterrecognitionistheMNISTdatabase,publiclyavailableathttp://yann.lecun.com/exdb/mnist/.

Thisparticulardatabaseisafreelyavailableresourcetokick-startthecreationofaprogramthatutilizesANNstorecognizehandwrittendigits.

Page 314: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CustomizedtrainingdataItisalwayspossibletobuildyourowntrainingdata.Itwilltakealittlebitofpatiencebutit’sfairlyeasy;collectavastnumberofhandwrittendigitsandcreateimagescontainingasingledigit,makingsurealltheimagesarethesamesizeandingrayscale.

Afterthis,youwillhavetocreateamechanismthatkeepsatrainingsampleinsyncwiththeexpectedclassification.

Page 315: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

TheinitialparametersLet’stakealookattheindividuallayersinthenetwork:

InputlayerHiddenlayerOutputlayer

TheinputlayerSincewe’regoingtoutilizetheMNISTdatabase,theinputlayerwillhaveasizeof784inputnodes:that’sbecauseMNISTsamplesare28x28pixelimages,whichmeans784pixels.

ThehiddenlayerAswehaveseen,there’snohard-and-fastruleforthesizeofthehiddenlayer,I’vefound—throughseveralattempts—that50to60nodesyieldsthebestresultwhilenotnecessitatinganinordinateamountoftrainingdata.

Youcanincreasethesizeofthehiddenlayerwiththeamountofdata,butbeyondacertainpoint,therewillbenoadvantagetothat;youwillalsohavetobepreparedforyournetworktotakehourstotrain(themorehiddenneurons,thelongerittakestotrainthenetwork).

TheoutputlayerTheoutputlayerwillhaveasizeof10.Thisshouldnotbeasurpriseaswewanttoclassify10digits(0-9).

Page 316: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

TrainingepochsWewillinitiallyusetheentiresetofthetraindatafromMNIST,whichconsistsofover60,000handwrittenimages,halfofwhichwerewrittenbyUSgovernmentemployees,andtheotherhalfbyhigh-schoolstudents.That’salotofdata,sowewon’tneedmorethanoneepochtoachieveanacceptablyhighaccuracyondetection.

Fromthereon,itisuptoyoutotrainthenetworkiterativelyonthesametraindata,andmysuggestionisthatyouuseanaccuracytest,andfindtheepochatwhichtheaccuracy“peaks”.Bydoingso,youwillhaveaprecisemeasurementofthehighestpossibleaccuracyachievedbyyournetworkgivenitscurrentconfiguration.

Page 317: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

OtherparametersWewilluseasigmoidactivationfunction,ResilientBackPropagation(RPROP),andextendtheterminationcriteriaforeachcalculationto20iterationsinsteadof10,likewedidforeveryotheroperationinthisbookthatinvolvedcv2.TermCriteria.

NoteImportantnotesontraindataandANNslibraries

ExploringtheInternetforsources,IfoundanamazingarticlebyMichaelNielsenathttp://neuralnetworksanddeeplearning.com/chap1.html,whichillustrateshowtowriteanANNlibraryfromscratch,andthecodeforthislibraryisfreelyavailableonGitHubathttps://github.com/mnielsen/neural-networks-and-deep-learning;thisisthesourcecodeforabook,NeuralNetworksandDeepLearning,byMichaelNielsen.

Inthedatafolder,youwillfindapicklefile,signifyingdatathathasbeensavedtodiskthroughthepopularPythonlibrary,cPickle,whichmakesloadingandsavingthePythondataatrivialtask.

ThispicklefileisacPicklelibrary-serializedversionoftheMNISTdataand,asitissousefulandreadytoworkwith,Istronglysuggestyouusethat.NothingstopsyoufromloadingtheMNISTdatasetbuttheprocessofdeserializingthetrainingdataisquitetediousand—strictlyspeaking—outsidetheremitofthisbook.

Second,IwouldliketopointoutthatOpenCVisnottheonlyPythonlibrarythatallowsyoutouseANNs,notbyanystretchoftheimagination.TheWebisfullofalternativesthatIstronglyencourageyoutotryout,mostnotablyPyBrain,alibrarycalledLasagna(which—asanItalian—Ifindexceptionallyattractive)andmanycustom-writtenimplementations,suchastheaforementionedMichaelNielsen’simplementation.

Enoughintroductorydetails,though.Let’sgetgoing.

Page 318: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Mini-librariesSettingupanANNinOpenCVisnotdifficult,butyouwillalmostdefinitelyfindyourselftrainingyournetworkcountlesstimes,insearchofthatelusivepercentagepointthatbooststheaccuracyofyourresults.

Toautomatethisasmuchaspossible,wewillbuildamini-librarythatwrapstheOpenCV’snativeimplementationofANNsandletsusrerunandretrainthenetworkeasily.

Here’sanexampleofawrapperlibrary:

importcv2

importcPickle

importnumpyasnp

importgzip

defload_data():

mnist=gzip.open('./data/mnist.pkl.gz','rb')

training_data,classification_data,test_data=cPickle.load(mnist)

mnist.close()

return(training_data,classification_data,test_data)

defwrap_data():

tr_d,va_d,te_d=load_data()

training_inputs=[np.reshape(x,(784,1))forxintr_d[0]]

training_results=[vectorized_result(y)foryintr_d[1]]

training_data=zip(training_inputs,training_results)

validation_inputs=[np.reshape(x,(784,1))forxinva_d[0]]

validation_data=zip(validation_inputs,va_d[1])

test_inputs=[np.reshape(x,(784,1))forxinte_d[0]]

test_data=zip(test_inputs,te_d[1])

return(training_data,validation_data,test_data)

defvectorized_result(j):

e=np.zeros((10,1))

e[j]=1.0

returne

defcreate_ANN(hidden=20):

ann=cv2.ml.ANN_MLP_create()

ann.setLayerSizes(np.array([784,hidden,10]))

ann.setTrainMethod(cv2.ml.ANN_MLP_RPROP)

ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)

ann.setTermCriteria((cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,

20,1))

returnann

deftrain(ann,samples=10000,epochs=1):

tr,val,test=wrap_data()

forxinxrange(epochs):

counter=0

forimgintr:

Page 319: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

if(counter>samples):

break

if(counter%1000==0):

print"Epoch%d:Trained%d/%d"%(x,counter,samples)

counter+=1

data,digit=img

ann.train(np.array([data.ravel()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([digit.ravel()],dtype=np.float32))

print"Epoch%dcomplete"%x

returnann,test

deftest(ann,test_data):

sample=np.array(test_data[0][0].ravel(),dtype=np.float32).reshape(28,

28)

cv2.imshow("sample",sample)

cv2.waitKey()

printann.predict(np.array([test_data[0][0].ravel()],dtype=np.float32))

defpredict(ann,sample):

resized=sample.copy()

rows,cols=resized.shape

if(rows!=28orcols!=28)androws*cols>0:

resized=cv2.resize(resized,(28,28),interpolation=

cv2.INTER_CUBIC)

returnann.predict(np.array([resized.ravel()],dtype=np.float32))

Let’sexamineitinorder.First,theload_data,wrap_data,andvectorized_resultfunctionsareincludedinMichaelNielsen’scodeforloadingthepicklefile.

It’sarelativelystraightforwardloadingofapicklefile.Mostnotably,though,theloadeddatahasbeensplitintothetrainandtestdata.Bothtrainandtestdataarearrayscontainingtwo-elementtuples:thefirstoneisthedataitself;thesecondoneistheexpectedclassification.SowecanusethetraindatatotraintheANNandthetestdatatoevaluateitsaccuracy.

Thevectorized_resultfunctionisaverycleverfunctionthat—givenanexpectedclassification—createsa10-elementarrayofzeros,settingasingle1fortheexpectedresult.This10-elementarray,youmayhaveguessed,willbeusedasaclassificationfortheoutputlayer.

ThefirstANN-relatedfunctioniscreate_ANN:

defcreate_ANN(hidden=20):

ann=cv2.ml.ANN_MLP_create()

ann.setLayerSizes(np.array([784,hidden,10]))

ann.setTrainMethod(cv2.ml.ANN_MLP_RPROP)

ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)

ann.setTermCriteria((cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,

20,1))

returnann

ThisfunctioncreatesanANNspecificallygearedtowardshandwrittendigitrecognitionwithMNIST,byspecifyinglayersizesasillustratedintheInitialparameterssection.

Page 320: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Wenowneedatrainingfunction:

deftrain(ann,samples=10000,epochs=1):

tr,val,test=wrap_data()

forxinxrange(epochs):

counter=0

forimgintr:

if(counter>samples):

break

if(counter%1000==0):

print"Epoch%d:Trained%d/%d"%(x,counter,samples)

counter+=1

data,digit=img

ann.train(np.array([data.ravel()],dtype=np.float32),

cv2.ml.ROW_SAMPLE,np.array([digit.ravel()],dtype=np.float32))

print"Epoch%dcomplete"%x

returnann,test

Again,thisisquitesimple:givenanumberofsamplesandtrainingepochs,weloadthedata,andtheniteratethroughthesamplesanx-number-of-epochstimes.

Theimportantsectionofthisfunctionisthedeconstructionofthesingletrainingrecordintothetraindataandanexpectedclassification,whichisthenpassedintotheANN.

Todoso,weutilizethenumpyarrayfunction,ravel(),whichtakesanarrayofanyshapeand“flattens”itintoasingle-rowarray.So,forexample,considerthisarray:

data=[[1,2,3],[4,5,6],[7,8,9]]

Theprecedingarrayonce“raveled”,becomesthefollowingarray:

[1,2,3,4,5,6,7,8,9]

ThisistheformatthatOpenCV’sANNexpectsdatatolooklikeinitstrain()method.

Finally,wereturnboththenetworkandtestdata.Wecouldhavejustreturnedthedata,buthavingthetestdataathandforaccuracycheckingisquiteuseful.

Thelastfunctionweneedisapredict()functiontowrapANN’sownpredict()method:

defpredict(ann,sample):

resized=sample.copy()

rows,cols=resized.shape

if(rows!=28orcols!=28)androws*cols>0:

resized=cv2.resize(resized,(28,28),interpolation=

cv2.INTER_CUBIC)

returnann.predict(np.array([resized.ravel()],dtype=np.float32))

ThisfunctiontakesanANNandasampleimage;itoperatesaminimumof“sanitization”bymakingsuretheshapeofthedataisasexpectedandresizingitifit’snot,andthenravelingitforasuccessfulprediction.

Page 321: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ThefileIcreatedalsocontainsatestfunctiontoverifythatthenetworkworksanditdisplaysthesampleprovidedforclassification.

Page 322: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ThemainfileThiswholechapterhasbeenanintroductoryjourneyleadingustothispoint.Infact,manyofthetechniqueswe’regoingtousearefrompreviouschapters,soinawaytheentirebookhasledustothispoint.Solet’sputallourknowledgetogooduse.

Let’stakeaninitiallookatthefile,andthendecomposeitforabetterunderstanding:

importcv2

importnumpyasnp

importdigits_annasANN

definside(r1,r2):

x1,y1,w1,h1=r1

x2,y2,w2,h2=r2

if(x1>x2)and(y1>y2)and(x1+w1<x2+w2)and(y1+h1<y2+h2):

returnTrue

else:

returnFalse

defwrap_digit(rect):

x,y,w,h=rect

padding=5

hcenter=x+w/2

vcenter=y+h/2

if(h>w):

w=h

x=hcenter-(w/2)

else:

h=w

y=vcenter-(h/2)

return(x-padding,y-padding,w+padding,h+padding)

ann,test_data=ANN.train(ANN.create_ANN(56),20000)

font=cv2.FONT_HERSHEY_SIMPLEX

path="./images/numbers.jpg"

img=cv2.imread(path,cv2.IMREAD_UNCHANGED)

bw=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

bw=cv2.GaussianBlur(bw,(7,7),0)

ret,thbw=cv2.threshold(bw,127,255,cv2.THRESH_BINARY_INV)

thbw=cv2.erode(thbw,np.ones((2,2),np.uint8),iterations=2)

image,cntrs,hier=cv2.findContours(thbw.copy(),cv2.RETR_TREE,

cv2.CHAIN_APPROX_SIMPLE)

rectangles=[]

forcincntrs:

r=x,y,w,h=cv2.boundingRect(c)

a=cv2.contourArea(c)

b=(img.shape[0]-3)*(img.shape[1]-3)

is_inside=False

forqinrectangles:

ifinside(r,q):

Page 323: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

is_inside=True

break

ifnotis_inside:

ifnota==b:

rectangles.append(r)

forrinrectangles:

x,y,w,h=wrap_digit(r)

cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)

roi=thbw[y:y+h,x:x+w]

try:

digit_class=int(ANN.predict(ann,roi.copy())[0])

except:

continue

cv2.putText(img,"%d"%digit_class,(x,y-1),font,1,(0,255,0))

cv2.imshow("thbw",thbw)

cv2.imshow("contours",img)

cv2.imwrite("sample.jpg",img)

cv2.waitKey()

Aftertheinitialusualimports,weimportthemini-librarywecreated,whichisstoredindigits_ann.py.

Ifinditgoodpracticetodefinefunctionsatthetopofthefile,solet’sexaminethose.Theinside()functiondetermineswhetherarectangleisentirelycontainedinanotherrectangle:

definside(r1,r2):

x1,y1,w1,h1=r1

x2,y2,w2,h2=r2

if(x1>x2)and(y1>y2)and(x1+w1<x2+w2)and(y1+h1<y2+h2):

returnTrue

else:

returnFalse

Thewrap_digit()functiontakesarectanglethatsurroundsadigit,turnsitintoasquare,andcentersitonthedigititself,with5-pointpaddingtomakesurethedigitisentirelycontainedinit:

defwrap_digit(rect):

x,y,w,h=rect

padding=5

hcenter=x+w/2

vcenter=y+h/2

if(h>w):

w=h

x=hcenter-(w/2)

else:

h=w

y=vcenter-(h/2)

return(x-padding,y-padding,w+padding,h+padding)

Thepointofthisfunctionwillbecomeclearerlateron;let’snotdwellonittoomuchatthe

Page 324: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

moment.

Now,let’screatethenetwork.Wewilluse58hiddennodes,andtrainover20,000samples:

ann,test_data=ANN.train(ANN.create_ANN(58),20000)

Thisisgoodenoughforapreliminarytesttokeepthetrainingtimedowntoaminuteortwo(dependingontheprocessingpowerofyourmachine).Theidealistousethefullsetoftrainingdata(50,000),anditeratethroughitseveraltimes,untilsomeconvergenceisreached(aswediscussedearlier,theaccuracy“peak”).Youwoulddothisbycallingthefollowingfunction:

ann,test_data=ANN.train(ANN.create_ANN(100),50000,30)

Wecannowpreparethedatatotest.Todothat,we’regoingtoloadanimage,andcleanupalittle:

path="./images/numbers.jpg"

img=cv2.imread(path,cv2.IMREAD_UNCHANGED)

bw=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

bw=cv2.GaussianBlur(bw,(7,7),0)

Nowthatwehaveagrayscalesmoothedimage,wecanapplyathresholdandsomemorphologyoperationstomakesurethenumbersareproperlystandingoutfromthebackgroundandrelativelycleanedupforirregularities,whichmightthrowthepredictionoperationoff:

ret,thbw=cv2.threshold(bw,127,255,cv2.THRESH_BINARY_INV)

thbw=cv2.erode(thbw,np.ones((2,2),np.uint8),iterations=2)

NoteNotethethresholdflag,whichisforaninversebinarythreshold:asthesamplesoftheMNISTdatabasearewhiteonblack(andnotblackonwhite),weturntheimageintoablackbackgroundwithwhitenumbers.

Afterthemorphologyoperation,weneedtoidentifyandseparateeachnumberinthepicture.Todothis,wefirstidentifythecontoursintheimage:

image,cntrs,hier=cv2.findContours(thbw.copy(),cv2.RETR_TREE,

cv2.CHAIN_APPROX_SIMPLE)

Then,weiteratethroughthecontours,anddiscardalltherectanglesthatareentirelycontainedinotherrectangles;weonlyappendtothelistofgoodrectanglestheonesthatarenotcontainedinotherrectanglesandarealsonotaswideastheimageitself.Insomeofthetests,findContoursyieldedtheentireimageasacontouritself,whichmeantnootherrectanglepassedtheinsidetest:

rectangles=[]

forcincntrs:

r=x,y,w,h=cv2.boundingRect(c)

a=cv2.contourArea(c)

Page 325: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

b=(img.shape[0]-3)*(img.shape[1]-3)

is_inside=False

forqinrectangles:

ifinside(r,q):

is_inside=True

break

ifnotis_inside:

ifnota==b:

rectangles.append(r)

Nowthatwehavealistofgoodrectangles,wecaniteratethroughthemanddefinearegionofinterestforeachoftherectanglesweidentified:

forrinrectangles:

x,y,w,h=wrap_digit(r)

Thisiswherethewrap_digit()functionwedefinedatthebeginningoftheprogramcomesintoplay:weneedtopassasquareregionofinteresttothepredictorfunction;ifwesimplyresizedarectangleintoasquare,we’druinourtestdata.

Youmaywonderwhythinkofthenumberone.Arectanglesurroundingthenumberonewouldbeverynarrow,especiallyifithasbeendrawnwithouttoomuchofaleantoeitherside.Ifyousimplyresizedittoasquare,youwould“fatten”thenumberoneinsuchawaythatnearlytheentiresquarewouldturnblack,renderingthepredictionimpossible.Instead,wewanttocreateasquarearoundtheidentifiednumber,whichisexactlywhatwrap_digit()does.

Thisapproachisquick-and-dirty;itallowsustodrawasquarearoundanumberandsimultaneouslypassthatsquareasaregionofinterestfortheprediction.Apurerapproachwouldbetotaketheoriginalrectangleand“center”itintoasquarenumpyarraywithrowsandcolumnsequaltothelargerofthetwodimensionsoftheoriginalrectangle.Thereasonforthisisyouwillnoticethatsomeofthesquarewillincludetinybitsofadjacentnumbers,whichcanthrowthepredictionoff.Withasquarecreatedfromanp.zeros()function,noimpuritieswillbeaccidentallydraggedin:

cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)

roi=thbw[y:y+h,x:x+w]

try:

digit_class=int(ANN.predict(ann,roi.copy())[0])

except:

continue

cv2.putText(img,"%d"%digit_class,(x,y-1),font,1,(0,255,0))

Oncethepredictionforthesquareregioniscomplete,wedrawitontheoriginalimage:

cv2.imshow("thbw",thbw)

cv2.imshow("contours",img)

cv2.imwrite("sample.jpg",img)

cv2.waitKey()

Andthat’sit!Thefinalresultwilllooksimilartothis:

Page 326: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 327: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 328: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PossibleimprovementsandpotentialapplicationsWehaveillustratedhowtobuildanANN,feedittrainingdata,anduseitforclassification.Thereareanumberofaspectswecanimprove,dependingonthetaskathand,andanumberofpotentialapplicationsofournew-foundknowledge.

Page 329: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

ImprovementsThereareanumberofimprovementsthatcanbeappliedtothisapproach,someofwhichwehavealreadydiscussed:

Forexample,youcouldenlargeyourdatasetanditeratemoretimes,untilaperformancepeakisreachedYoucouldalsoexperimentwiththeseveralactivationfunctions(cv2.ml.ANN_MLP_SIGMOID_SYMisnottheonlyone;thereisalsocv2.ml.ANN_MLP_IDENTITYandcv2.ml.ANN_MLP_GAUSSIAN)Youcouldutilizedifferenttrainingflags(cv2.ml.ANN_MLP_UPDATE_WEIGHTS,cv2.ml.ANN_MLP_NO_INPUT_SCALE,cv2.ml.ANN_MLP_NO_OUTPUT_SCALE),andtrainingmethods(backpropagationorresilientbackpropagation)

Asidefromthat,bearinmindoneofthemantrasofsoftwaredevelopment:thereisnosinglebesttechnology,thereisonlythebesttoolforthejobathand.So,carefulanalysisoftheapplicationrequirementswillleadyoutothebestchoicesofparameters.Forexample,noteveryonedrawsdigitsthesameway.Infact,youwillevenfindthatsomecountriesdrawnumbersinaslightlydifferentway.

TheMNISTdatabasewascompiledintheUS,inwhichthenumbersevenisdrawnlikethecharacter7.Butyouwillfindthatthenumber7inEuropeisoftendrawnwithasmallhorizontallinehalfwaythroughthediagonalportionofthenumber,whichwasintroducedtodistinguishitfromthenumber1.

NoteForamoredetailedoverviewofregionalhandwritingvariations,checktheWikipediaarticleonthesubject,whichisagoodintroduction,availableathttps://en.wikipedia.org/wiki/Regional_handwriting_variation.

ThismeanstheMNISTdatabasehaslimitedaccuracywhenappliedtoEuropeanhandwriting;somenumberswillbeclassifiedmoreaccuratelythanothers.Soyoumayendupcreatingyourowndataset.Inalmostallcircumstances,itispreferabletoutilizethetraindatathat’srelevantandbelongstothecurrentapplicationdomain.

Finally,rememberthatonceyou’rehappywiththeaccuracyofyournetwork,youcanalwayssaveitandreloaditlater,soitcanbeutilizedinthird-partyapplicationswithouthavingtotraintheANNeverytime.

PotentialapplicationsTheprecedingprogramisonlythefoundationofahandwritingrecognitionapplication.Straightaway,youcanquicklyextendtheearlierapproachtovideosanddetecthandwrittendigitsinreal-time,oryoucouldtrainyourANNtorecognizetheentirealphabetforafull-blownOCRsystem.

Carregistrationplatedetectionseemslikeanobviousextensionofthelessonslearnedtothispoint,anditshouldbeaneveneasierdomaintoworkwith,asregistrationplatesuseconsistentcharacters.

Page 330: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Also,foryourownedificationorbusinesspurposes,youmaytrytobuildaclassifierwithANNsandplainSVMs(withfeaturedetectorssuchasSIFT)andseehowtheybenchmark.

Page 331: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.
Page 332: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

SummaryInthischapter,wescratchedthesurfaceofthevastandfascinatingworldofANNs,focusingonOpenCV’simplementationofit.WelearnedaboutthestructureofANNs,andhowtodesignanetworktopologybasedonapplicationrequirements.

Finally,weutilizedvariousconceptsthatweexploredinthepreviouschapterstobuildahandwrittendigitrecognitionapplication.

Page 333: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Toboldlygo…IhopeyouenjoyedthejourneythroughthePythonbindingsforOpenCV3.AlthoughcoveringOpenCV3wouldtakeaseriesofbooks,weexploredveryfascinatingandfuturisticconcepts,andIencourageyoutogetintouchandletmeandtheOpenCVcommunityknowwhatyournextgroundbreakingcomputer-vision-basedprojectis!

Page 334: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

IndexA

additiveabout/AquicknoteonBGR

applicationmodifying/Modifyingtheapplication

artificialneuralnetworks(ANN)about/Artificialneuralnetworksstatisticalmodel/Artificialneuralnetworksperceptrons/Neuronsandperceptronsstructure/ThestructureofanANNnetworklayers,byexample/NetworklayersbyexampleinOpenCV/ANNsinOpenCVANN-imalclassification/ANN-imalclassificationtrainingepochs/Trainingepochsdigitrecognition,handwritten/HandwrittendigitrecognitionwithANNshandwrittendigitrecognition/HandwrittendigitrecognitionwithANNsURL/Otherparametersimprovements/Improvementspotentialapplications/Potentialapplications

AT&TURL/Performingfacerecognition

Page 335: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Bbackgroundsubtractors

about/Backgroundsubtractors–KNN,MOG2,andGMGcode/Backtothecode

Bagofwords(BOW)about/Bag-of-words,BOWincomputervisionincomputervision/BOWincomputervisionkmeansclustering/Thek-meansclustering

BGRabout/AquicknoteonBGR

BinaryRobustIndependentElementaryFeatures(BRIEF)algorithm/BRIEFBlue-green-red(BGR)/Reading/writinganimagefileBRIEF/FeaturedetectionalgorithmsBrute-Forcematching/Brute-Forcematching

Page 336: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

CcalcBackProjectfunction

about/ThecalcBackProjectfunctioncalcHistfunction/ThecalcHistfunctionCameo

about/ProjectCameo(facetrackingandimagemanipulation),Cameo–anobject-orienteddesignvideostreamabstracting,managers.CaptureManagerused/Abstractingavideostreamwithmanagers.CaptureManagerwindowabstracting,managers.WindowManagerused/Abstractingawindowandkeyboardwithmanagers.WindowManagerkeyboardabstracting,managers.WindowManagerused/Abstractingawindowandkeyboardwithmanagers.WindowManagerclass/Applyingeverythingwithcameo.Cameo

cameraframescapturing/Capturingcameraframesdisplaying,inwindow/Displayingcameraframesinawindow

CAMShiftabout/Backtothecode,CAMShift

Cannyused,foredgedetection/EdgedetectionwithCanny

cardetectionabout/Detectingcars,Whatdidwejustdo?slidingwindows/SVMandslidingwindowsSVM/SVMandslidingwindowsinscene/Example–cardetectioninascenetesting/Dude,where’smycar?

cardetection,inscenedetector.py,examining/Examiningdetector.pytrainingdata,associatingwithclasses/Associatingtrainingdatawithclasses

cascade/ConceptualizingHaarcascadeschannels

depthmap/Capturingframesfromadepthcamerapointcloudmap/Capturingframesfromadepthcameradisparitymap/Capturingframesfromadepthcameravaliddepthmask/Capturingframesfromadepthcamera

circledetectionabout/Lineandcircledetection,Circledetection

clustering/Thek-meansclusteringcolorhistograms

about/ColorhistogramscalcHist/ThecalcHistfunctioncalcBackProject/ThecalcBackProjectfunction

Page 337: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

insummary/Insummarycolorspaces

converting,between/Convertingbetweendifferentcolorspacescomma-separatedvalue(CSV)file/Preparingthetrainingdatacontours

detection/Contourdetectionboundingbox/Contours–boundingbox,minimumarearectangle,andminimumenclosingcircleminimumenclosingcircle/Contours–boundingbox,minimumarearectangle,andminimumenclosingcircleminimumarearectangle/Contours–boundingbox,minimumarearectangle,andminimumenclosingcircleconvexcontours/Contours–convexcontoursandtheDouglas-Peuckeralgorithmdouglas-peuckeralgorithm/Contours–convexcontoursandtheDouglas-Peuckeralgorithm

Contribmodulesinstalling/InstallingtheContribmodules

convergenceURL/MeanshiftandCAMShift

convexcontours/Contours–convexcontoursandtheDouglas-Peuckeralgorithmconvolutionmatrix

about/Customkernels–gettingconvolutedcopyoperation

masking/Maskingacopyoperation

Page 338: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Ddepthcamera

modules,creating/Creatingmodulesframes,capturing/Capturingframesfromadepthcamera

depthestimationwithnormalcamera/Depthestimationwithanormalcamera

depthmapabout/Capturingframesfromadepthcamera

DifferenceofGaussians(DoG)/FeatureextractionanddescriptionusingDoGandSIFTDiscreteFourierTransform(DFT)/TheFourierTransformdisparitymap/Capturingframesfromadepthcamera

mask,creatingfrom/Creatingamaskfromadisparitymapdouglas-peuckeralgorithm/Contours–convexcontoursandtheDouglas-Peuckeralgorithm

Page 339: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Eedge

detecting/Edgedetectiondetection,withCanny/EdgedetectionwithCanny

Eigenfacesabout/RecognizingfacesURL/Recognizingfacesrecognition,performing/PerforminganEigenfacesrecognition

ExtendedYaleorYaleBURL/Performingfacerecognition

Page 340: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Ffacedetection

performing,OpneCVused/UsingOpenCVtoperformfacedetectionperforming,onstillimage/Performingfacedetectiononastillimageperforming,onvideo/Performingfacedetectiononavideoperforming/Performingfacerecognitiondata,generating/Generatingthedataforfacerecognitionfaces,recognizing/Recognizingfaces,Loadingthedataandrecognizingfacestrainingdata,preparing/Preparingthetrainingdatadata,loading/LoadingthedataandrecognizingfacesEigenfacesrecognition,performing/PerforminganEigenfacesrecognitionfacerecognition,performingwithFisherfaces/PerformingfacerecognitionwithFisherfacesfacerecognition,performingwithLBPH/PerformingfacerecognitionwithLBPHresults,discardingwithconfidencescore/Discardingresultswithconfidencescore

FAST/FeaturedetectionalgorithmsFastFourierTransform(FFT)/TheFourierTransformFastLibraryforApproximateNearestNeighbors(FLANN)/FLANN-basedmatching

matching,withhomography/FLANNmatchingwithhomographyfeaturedetectionalgorithms

about/Featuredetectionalgorithmsfeatures,defining/Definingfeaturescorners/Detectingfeatures–cornersDoGused/FeatureextractionanddescriptionusingDoGandSIFTSIFTused/FeatureextractionanddescriptionusingDoGandSIFTkeypoint,anatomy/AnatomyofakeypointFastHessianused/FeatureextractionanddetectionusingFastHessianandSURFSURFused/FeatureextractionanddetectionusingFastHessianandSURFORBfeaturedetection/ORBfeaturedetectionandfeaturematchingFeaturesfromAcceleratedSegmentTest(FAST)algorithm/FASTBinaryRobustIndependentElementaryFeatures(BRIEF)algorithm/BRIEFBrute-Forcematching/Brute-Forcematchingfeaturematching,withORB/FeaturematchingwithORBK-NearestNeighborsmatching,using/UsingK-NearestNeighborsmatchingFastLibraryforApproximateNearestNeighbors(FLANN)/FLANN-basedmatchingFLANNmatching,withhomography/FLANNmatchingwithhomography

features/ConceptualizingHaarcascadesFeaturesfromAcceleratedSegmentTest(FAST)algorithm/FAST

Page 341: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Fisherfacesabout/RecognizingfacesURL/Recognizingfacesfacerecognition,performingwith/PerformingfacerecognitionwithFisherfaces

FourierTransformabout/TheFourierTransformhighpassfilter(HPF)/Highpassfilterlowpassfilter(LPF)/Lowpassfilter

framescapturing,fromdepthcamera/Capturingframesfromadepthcamera

functionalprogramming(FP)solutions/Abriefdigression–functionalversusobject-orientedprogramming

Page 342: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Ggeometricmultigrid(GMG)/Backgroundsubtractors–KNN,MOG2,andGMGGrabCutalgorithm

used,forobjectsegmentation/ObjectsegmentationusingtheWatershedandGrabCutalgorithmsforegrounddetection,example/ExampleofforegrounddetectionwithGrabCut

Page 343: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

HHaarcascades

conceptualizing/ConceptualizingHaarcascadesdata,getting/GettingHaarcascadedata

handwrittendigitrecognitionwithANNs/HandwrittendigitrecognitionwithANNsMNISTdatabase/MNIST–thehandwrittendigitdatabasetrainingdata,customized/Customizedtrainingdatainitialparameters/Theinitialparameterstrainingepochs/Trainingepochsotherparameters/Otherparametersmini-libraries/Mini-librariesmainfile/Themainfile

Harris/Featuredetectionalgorithmshiddenlayer

about/Thehiddenlayer,Thehiddenlayerhighpassfilter(HPF)/Highpassfilterhistogrambackprojection/ThecalcBackProjectfunctionHistogramofOrientedGradients(HOG)

about/HOGdescriptorsscaleissue/Thescaleissuelocationissue/Thelocationissueimagepyramid/Imagepyramidslidingwindows/Slidingwindowsnonmaximum(ornonmaxima)suppression/Non-maximum(ornon-maxima)suppressionsupportvectormachines/Supportvectormachines

Hue,Saturation,Value(HSV)about/Convertingbetweendifferentcolorspaces

Page 344: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

II/Oscripts

basic/BasicI/Oscriptsimagefile,reading/Reading/writinganimagefileimagefile,writing/Reading/writinganimagefileimageandrawbytes,conversions/Convertingbetweenanimageandrawbytesimagedata,accessingwithnumpy.array/Accessingimagedatawithnumpy.arrayvideofile,writing/Reading/writingavideofilevideofile,reading/Reading/writingavideofilecameraframes,capturing/Capturingcameraframesimages,displayinginwindow/Displayingimagesinawindowcameraframes,displayinginwindow/Displayingcameraframesinawindow

imageabout/Convertingbetweenanimageandrawbytesdisplaying,inwindow/Displayingimagesinawindow

imagedataaccessing,numpy.arrayused/Accessingimagedatawithnumpy.array

imagedescriptorssaving,tofile/Savingimagedescriptorstofilematches,scanningfor/Scanningformatches

imagefilereading/Reading/writinganimagefilewriting/Reading/writinganimagefile

imagesegmentationwithWatershedalgorithm/ImagesegmentationwiththeWatershedalgorithm

inputlayer/Theinputlayer,Theinputlayerinsummary

about/Insummary

Page 345: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

KK-NearestNeighbors(KNN)/UsingK-NearestNeighborsmatching,Backgroundsubtractors–KNN,MOG2,andGMGKalmanfilter

about/TheKalmanfilter,Wheredowegofromhere?predictphase/Predictandupdateupdatephase/Predictandupdateexample/Anexamplereal-lifeexample/Areal-lifeexample–trackingpedestriansPedestrianclass/ThePedestrianclassmainprogram/Themainprogram

Kalmanfilter,real-lifeexampleabout/Areal-lifeexample–trackingpedestriansapplicationworkflow/Theapplicationworkflowfunctional,versusobject-orientedprogramming/Abriefdigression–functionalversusobject-orientedprogramming

kernelsabout/Highpassfiltercustomkernels/Customkernels–gettingconvoluted

keyboardabstracting,managers.WindowManagerused/Abstractingawindowandkeyboardwithmanagers.WindowManager

kmeansclustering/Thek-meansclustering

Page 346: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

LLasagna/Otherparameterslearningalgorithms

about/Thelearningalgorithmssupervisedlearning/Thelearningalgorithmsunsupervisedlearning/Thelearningalgorithmsreinforcementlearning/Thelearningalgorithms

linesdetectionabout/Lineandcircledetection,Linedetection

LocalBinaryPatternURL/Recognizingfaces

LocalBinaryPatternHistogramsfacerecognition,performingwith/PerformingfacerecognitionwithLBPH

LocalBinaryPatternHistograms(LBPH)about/Recognizingfaces

lowpassfilter(LPF)/Lowpassfilter

Page 347: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Mmagnitudespectrum/TheFourierTransformmask

creating,fromdisparitymap/CreatingamaskfromadisparitymapMeanshift

about/MeanshiftandCAMShiftmixtureofGaussians(MOG2)/Backgroundsubtractors–KNN,MOG2,andGMGMNIST

about/MNIST–thehandwrittendigitdatabaseURL/MNIST–thehandwrittendigitdatabase

modulescreating/Creatingmodules

multilayerperceptron/ANNsinOpenCV

Page 348: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Nnetworklayers

byexample/Networklayersbyexampleinputlayer/Theinputlayeroutputlayer/Theoutputlayerhiddenlayer/Thehiddenlayer

nonmaximum(ornonmaxima)suppression/Non-maximum(ornon-maxima)suppressionnonmaximumsuppression(NMS)/EdgedetectionwithCannynumpy.array

used,foraccessingimagedata/Accessingimagedatawithnumpy.array

Page 349: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Oobjectdetection

about/Objectdetectionandrecognitiontechniquesobjectdetection,techniques

HistogramofOrientedGradients(HOG)/HOGdescriptorspeopledetection/Peopledetectioncreating/Creatingandtraininganobjectdetectortraining/CreatingandtraininganobjectdetectorBagofwords(BOW)/Bag-of-words

objectsmovingobjects,detecting/Detectingmovingobjectsmotiondetection/Basicmotiondetection

objectsegmentationWatershedalgorithmused/ObjectsegmentationusingtheWatershedandGrabCutalgorithmsGrabCutalgorithmused/ObjectsegmentationusingtheWatershedandGrabCutalgorithms

OpenSourceComputerVision(OCV)building,fromsource/BuildingOpenCVfromasourcedocumentation,URL/Findingdocumentation,help,andupdatescommunity,URL/Findingdocumentation,help,andupdates

OrientedFASTandRotatedBRIEF(ORB)/Featuredetectionalgorithmsfeaturedetectionandfeaturematching/ORBfeaturedetectionandfeaturematching

OSXinstallationon/InstallingonOSXMacPorts,usingwithready-madepackages/UsingMacPortswithready-madepackagesMacPorts,usingwithowncustompackages/UsingMacPortswithyourowncustompackagesHomebrew,usingwithready-madepackages/UsingHomebrewwithready-madepackages(nosupportfordepthcameras)Homebrew,usingwithowncustompackages/UsingHomebrewwithyourowncustompackages

outputlayer/Theoutputlayer,Theoutputlayeroverfitting/Thehiddenlayer

Page 350: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

PPedestrianclass

about/ThePedestrianclassperceptrons

about/Neuronsandperceptronspointcloudmap/CapturingframesfromadepthcameraPrincipalComponentAnalysis(PCA)

URL/RecognizingfacesPyBrain/Otherparameterspyramid/Imagepyramid

Page 351: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Rrawbytes

about/Convertingbetweenanimageandrawbytesred-green-blue(RGB)/Reading/writinganimagefileregionalhandwritingvariations

URL/Improvementsregionofinterest(ROI)/ThePedestrianclassregionsofinterests(ROI)/Accessingimagedatawithnumpy.arrayreinforcementlearning/Thelearningalgorithmsresilientbackpropagation(RPROP)/Otherparametersridge/Definingfeatures

Page 352: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Ssamples

running/RunningsamplesScale-InvariantFeatureTransform(SIFT)/FeatureextractionanddescriptionusingDoGandSIFTsemiglobalblockmatching/Depthestimationwithanormalcamerasetuptools

selecting/Choosingandusingtherightsetuptoolsinstallation,onWindows/InstallationonWindowsinstallation,onOSX/InstallingonOSXinstallation,onUbuntuandderivatives/InstallationonUbuntuanditsderivativesinstallation,onUnix-likesystems/InstallationonotherUnix-likesystems

shapesdetecting/Detectingshapes

SIFT/Featuredetectionalgorithmssigmoidfunction/Neuronsandperceptronsslidingwindows/SlidingwindowsStanfordUniversity

URL/Detectingcarsstatisticalmodel/Artificialneuralnetworks,ANNsinOpenCVStereoSGBMparameters

minDisparity/DepthestimationwithanormalcameranumDisparities/DepthestimationwithanormalcamerawindowSize/DepthestimationwithanormalcameraP1/DepthestimationwithanormalcameraP2/Depthestimationwithanormalcameradisp12MaxDiff/DepthestimationwithanormalcamerapreFilterCap/DepthestimationwithanormalcamerauniquenessRatio/DepthestimationwithanormalcameraspeckleWindowSize/DepthestimationwithanormalcameraspeckleRange/Depthestimationwithanormalcamera

subtractivecolormodelabout/AquicknoteonBGR

supervisedlearning/Thelearningalgorithmssupportvectormachines(SVM)/Non-maximum(ornon-maxima)suppression

URL/SupportvectormachinesSURF/Featuredetectionalgorithms

Page 353: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

UUbuntu

repository,using/UsingtheUbunturepository(nosupportfordepthcameras)UniversityofIllinois

URL/DetectingcarsUnix-likesystems

installationon/InstallationonotherUnix-likesystemsunsupervisedlearning/Thelearningalgorithms

Page 354: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

Vvaliddepthmask/CapturingframesfromadepthcameraVarianteAscari/FeatureextractionanddescriptionusingDoGandSIFTvideofile

reading/Reading/writingavideofilewriting/Reading/writingavideofile

videostreamabstracting,managers.CaptureManagerused/Abstractingavideostreamwithmanagers.CaptureManager

Page 355: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

WWatershedalgorithm

used,forobjectsegmentation/ObjectsegmentationusingtheWatershedandGrabCutalgorithmsused,forimagesegmentation/ImagesegmentationwiththeWatershedalgorithm

windowabstracting,managers.WindowManagerused/Abstractingawindowandkeyboardwithmanagers.WindowManager

Windowsinstallationon/InstallationonWindowsbinaryinstallers,using/Usingbinaryinstallers(nosupportfordepthcameras)CMake,using/UsingCMakeandcompilerscompilers,using/UsingCMakeandcompilers

windowsize/ConceptualizingHaarcascades

Page 356: Learning OpenCV 3 Computer Vision with Pythonread.pudn.com/downloads800/ebook/3158042/Learning... · Disha Haria Production Coordinator. Arvindkumar Gupta Cover Work Arvindkumar Gupta.

YYalefacedatabase(Yalefaces)

URL/Performingfacerecognition