CS 231A Section: Computer Vision Libraries...

Section 6- Computer Vision Libraries

CS231ASection:ComputerVisionLibrariesOverview

AmirSadeghianMay2017

1Amir Sadeghian


Overview

• Opencv• DeepLearningFrameworks

• Caffe• Torch• Tensorflow

Amir Sadeghian 2


OtherCVlibraries

• Vlfeat: AnOpensource librarywithpopular computervisionalgorithmsspecializing inimageunderstanding andlocalfeaturesextractionandmatching.

• scikit-learn:AnopensourcePythonlibrary thatimplementsarangeofmachine learning, preprocessing, cross-validationandvisualizationalgorithms.

• PCL: Astandalone, largescale,openprojectfor2D/3Dimageandpointcloudprocessing.

• SLAMframeworks(bundler, visualsfm,meshlab):Applicationsfor3Dreconstructionusingstructurefrommotion (SFM).

• Librariesforspecifictasks:e.g.trackinglibraries,Detectionlibraries…

Amir Sadeghian 3


OpenCV

Amir Sadeghian 4


IntroductiontoOpenCV

• Opensourcecomputervisionandmachinelearninglibrary• Containsimplementationsofalargenumberofvisionalgorithms• WrittennativelyinC++,alsohasC,Python,Java,andMATLABinterfaces

• SupportsWindows,Linux,MacOSX,Android,andiOS

Amir Sadeghian 5


Installation

• Downloadfromhttp://opencv.org andcompilefromsource• Windows:RunexecutabledownloadedfromOpenCV website• MacOSX:InstallthroughMacPorts,easy_install,…• Linux:Installthroughthepackagemanager(e.g.yum,apt)butmakesuretheversionissufficientlyup-to-dateforyourneeds

Amir Sadeghian 6


BasicOpenCV Structures

• Point,Point2f- 2DPoint

• Size - 2Dsizestructure

• Rect - 2Drectangleobject

• RotatedRect - Rect objectwithangle

• Mat - imageobject

Amir Sadeghian 7


Point

Amir Sadeghian 8

• 2DPointObject- int x,y;

• SampleFunctions

- Point.dot(<Point>)- computesdotproduct

- Point.inside(<Rect>)- returnstrueifpointisinside

Mathoperators,youmayuse-Pointoperator+

-Pointoperator+=

-Pointoperator-

-Pointoperator-=

-Pointoperator*

-Pointoperator*=

-booloperator==

-booloperator!=

-doublenorm


Size

• 2DSizeStructure- int width,height;

• Functions- Size.area()- returns(width*height)

Amir Sadeghian 9

Rect

• 2DRectangleStructure- int x,y,width,height;

• Functions

- Rect.tl()- returntopleftpoint

- Rect.br()- returnbottomrightpoint


cv::Mat

• TheprimarydatastructureinOpenCV istheMatobject.Itstoresimagesandtheircomponents.

• Mainitems

• rows,cols- lengthandwidth(int)

• channels- 1:grayscale,3:BGR

• depth:CV_<depth>C<num chan>

• Seethemanualsformoreinformation

Amir Sadeghian 10


cv::Mat- Functions

• Mat.at<datatype>(row,col)[channel]- returnspointer toimagelocation• Mat.channels()- returnsthenumberofchannels• Mat.clone()- returnsadeepcopyoftheimage• Mat.create(rows,cols,TYPE)- re-allocatesnewmemorytomatrix• Mat.cross(<Mat>)- computescrossproductoftwomatricies• Mat.depth()- returnsdatatypeofmatrix• Mat.dot(<Mat>)- computesthedotproductoftwomatrices

Amir Sadeghian 11


Pixeltypes

• PixelTypes showshowtheimageisrepresentedindata

• BGR- Thedefaultcolorofimread().Normal3channelcolor

• HSV- Hueiscolor,Saturationisamount,Valueislightness.3channels

• GRAYSCALE- Grayvalues,Singlechannel• OpenCV requiresthatimagesbeinBGRorGrayscaleinordertobeshownorsaved.Otherwise,undesirableeffectsmayappear.

Amir Sadeghian 12


ImageNormalizationandThresholding

• Normalizationremapsarangeofpixelvaluestoanotherrangeofpixelvalues

• voidnormalize(InputArray src,OutputArray dst,...)

• OpenCV providesageneralpurposemethodforthresholdinganimage

• doublethreshold(InputArray src,OutputArraydst,double thresh,double maxval,int type)

• Specifythresholding schemespecifiedbythetypevariable

Amir Sadeghian 13


ImageSmoothing

• Reducesthesharpnessofedgesandsmoothsoutdetailsinanimage

• OpenCV implementsseveralofthemostcommonlyusedmethods

• voidGaussianBlur(InputArray src,OutputArray dst,...)

• voidmedianBlur(InputArray src,OutputArray dst,...)

• Otherfunctionsincludegenericconvolution,separableconvolution,dilate,anderode.

Amir Sadeghian 14


ImageSmoothing:Code

Amir Sadeghian 15


ImageSmoothing:SampleImage

Amir Sadeghian 16

Original GaussianBlur MedianBlur


EdgeDetection

• OpenCV implementsanumberofoperatorstohelpdetectedgesinanimage

• SobelOperator• void cv::Sobel(image in, image out, CV_DEPTH, dx, dy);

• Scharr Operator• void cv::Scharr(image in, image out, CV_DEPTH, dx, dy);

• LaplacianOperator• void cv::Laplacian( image in, image out, CV_DEPTH);

• OpenCV alsoimplementsmulti-stageedgedetectionalgorithmssuchasCannyedgedetection

• Tip:Ifyourimageisnoisy,thenedgedetectionwilloftenexaggeratethenoise

• Sometimessmoothingtheimagebeforerunningedgedetectiongivesbetterresults

Amir Sadeghian 17


EdgeDetection:Code

Amir Sadeghian 18


EdgeDetection:SampleResults

Amir Sadeghian 19


FaceDetection:Viola-Jones

} Robustandfast} IntroducedbyPaulViolaandMichaelJones

} http://research.microsoft.com/~viola/Pubs/Detect/violaJones_CVPR2001.pdf

} Haar-likeFeatures

Amir Sadeghian 20


AndManyMore…

• ObjectTrackingusingOpenCV• HandwrittenDigitsClassification:AnOpenCV(C++/Python)Tutorial

• EyeDetectorusingOpenCV• ImageRecognitionandObjectDetection• HeadPoseEstimationusingOpenCV• ConfiguringQtforOpenCV• …

Amir Sadeghian 21


DeepLearningFrameworks

Amir Sadeghian 22


DeepLearningFrameworks

• Caffe• Torch/PyTorch

• NYU• scientificcomputingframeworkinLua• supportedbyFacebook

• TensorFlow• Google• Python

• Theano/Pylearn2• U.Montreal• Python• symboliccomputationandautomaticdifferentiation

• MatConvNet• OxfordU.• DeepLearninginMATLAB

Amir Sadeghian 23


FrameworkComparison

• Morealikethandifferent• Allexpressdeepmodels• Allareopen-source(contributions differ)• Mostincludescripting forhackingandprototyping

• Nostrictwinners,experimentandchoosetheframeworkthatbestfitsyourwork

Amir Sadeghian 24


Caffe:Overview

• WhatisCaffe?• Training/Finetuning asimplemodel• DeepdiveintoCaffe!

Amir Sadeghian 25


WhatisCaffe?

• A deeplearningframework

• Openframework,models,andworkedexamplesfordeeplearning

• 4000+citations,250+contributors,11,000+forks

• Focushasbeenvision,butbranchingout:sequences, reinforcementlearning,speech+text

Amir Sadeghian 26


Caffe

• PureC++/CUDAarchitecturefordeeplearning• commandline,Python,MATLABinterfaces

• Fast,well-testedcode• Tools,referencemodels,demos,andrecipes• SwitchbetweenCPUandGPU

• Caffe::set_mode(Caffe::GPU);

Amir Sadeghian 27


Installation

• http://caffe.berkeleyvision.org/installation.html •• CUDA,OPENCV• BLAS(BasicLinearAlgebraSubprograms):operationslikematrixmultiplication,matrixaddition,bothimplementationforCPU(cBLAS)andGPU(cuBLAS).providedbyMKL(INTEL),ATLAS,openBLAS,etc.

• Boost:ac++ library.>Usesomeofitsmathfunctionsandshared_pointer.

• glog,gflags providelogging&commandlineutilities.>Essentialfordebugging.

• leveldb,lmdb:databaseio foryourprogram.>Needtoknowthisforpreparingyourowndata.

• protobuf:anefficientandflexiblewaytodefinedatastructure.>Needtoknowthisfordefiningnewlayers.

Amir Sadeghian 28


Caffe Tutorial

• Nets,Layers,andBlobs:theanatomyofaCaffe model.• Forward/Backward:theessentialcomputationsoflayeredcompositionalmodels.

• Loss:thetasktobelearnedisdefinedbytheloss.• Solver:thesolvercoordinatesmodeloptimization.• Interfaces:commandline,Python,andMATLABCaffe.• Data:howtocaffeinatedataformodelinput.

http:/caffe.berkeleyvision.org/tutorial/

Amir Sadeghian 29


Caffe

• Blob:StorageandCommunicationofData• DatablobsareNxCxHxW

• Net:Containsallthelayersinthenetworks• Performs forward/backwardpassthrough theentirenetwork

• Solver:Usedtosettraining/testingparameters• Numberof iterations,backpropagationmethod,etc..

Amir Sadeghian 30


Training:Step1

• Createalenet_train.prototxt

• DataLayers

• OperationalLayers

• LossLayers

Amir Sadeghian 31


NetworkDefinition(train.prototxt)

Amir Sadeghian 32



Amir Sadeghian 33



Amir Sadeghian 34


Training:Step2

• Createalenet_solver.prototxt

Amir Sadeghian 35


Solver(solver.prototxt)

Amir Sadeghian 36


Training:Step2

• SomedetailsonSGDparameters

Amir Sadeghian 37


Training:Step3

• #trainLeNet• caffe train-solverexamples/mnist/lenet_solver.prototxt

• #trainonGPU2• caffe train-solverexamples/mnist/lenet_solver.prototxt -gpu 2

• #resumetrainingfromthehalf-waypointsnapshot• caffe train-solverexamples/mnist/lenet_solver.prototxt -snapshotexamples/mnist/lenet_iter_5000.solverstate

Amir Sadeghian 38


NetworkDefinition(test.prototxt)

Amir Sadeghian 39


NetworkDefinition(test.prototxt)

Amir Sadeghian 40


PyCaffe (TraininginPython)

• Addcaffe pythondirectorytopathandimportcaffe

Amir Sadeghian 41


UseNetSpec todefinelayers

Amir Sadeghian 42


Definesolverandtrainnetwork

Amir Sadeghian 43


AccessNetdata

Amir Sadeghian 44


PyCaffe (TestinginPython)

Amir Sadeghian 45


OpenModelCollection

• TheCaffe ModelZoo• opencollectionofdeepmodelstoshareinnovation

• VGGILSVRC14+Devilmodelsinthezoo• Network-in-Network/CCCPmodelinthezoo

• MITPlacesscenerecognitionmodelinthezoo• Helpreproduce research• Bundled toolsforloadingandpublishingmodels

• ShareYourModels!withyourcitation+licenseofcourse

Amir Sadeghian 46


ReferenceModels

Caffe offersthe

• Modeldefinitions

• Optimizationsettings

• Pre-trainedweightssoyoucanstartrightaway.

Amir Sadeghian 47

Alexnet: Imagenet Classification


WhentoFine-tune?

• Agoodfirststep!• Morerobustoptimization• good initializationhelps• Needslessdata• Fasterlearning

• State-of-the-art resultsin• recognition• detection• segmentation

Amir Sadeghian 48


Fine-tuningTricks

• Learnthelastlayerfirst• Caffe layershavelocallearning rates:blobs_lr• Freezeallbutthelastlayerforfastoptimizationandavoidingearlydivergence.

• Stopifgoodenough, orkeepfine-tuning

• Reducethelearningrate• Dropthesolverlearning rateby10x,100x–• Preservetheinitialization frompre-trainingandavoidthrashing

Amir Sadeghian 49


CNNTrainingtips

• Beforerunningfinal/longtraining• Makesureyoucanoverfit onasmalltrainingset• Makesureyour lossdecreasesoverfirstseveraliterations• Otherwiseadjustparameteruntilitdoes,especiallylearning rate

• Separatetrain/val/testdata

Amir Sadeghian 50


AnyQuestions?

51Amir Sadeghian

CS 231A Section: Computer Vision Libraries...

Documents

Transcript of CS 231A Section: Computer Vision Libraries...