CS 231A Section: Computer Vision Libraries...

51
Section 6- Computer Vision Libraries CS 231A Section: Computer Vision Libraries Overview Amir Sadeghian May 2017 1 Amir Sadeghian

Transcript of CS 231A Section: Computer Vision Libraries...

Section 6- Computer Vision Libraries

CS231ASection:ComputerVisionLibrariesOverview

AmirSadeghianMay2017

1Amir Sadeghian

Section 6- Computer Vision Libraries

Overview

• Opencv• DeepLearningFrameworks

• Caffe• Torch• Tensorflow

Amir Sadeghian 2

Section 6- Computer Vision Libraries

OtherCVlibraries

• Vlfeat: AnOpensource librarywithpopular computervisionalgorithmsspecializing inimageunderstanding andlocalfeaturesextractionandmatching.

• scikit-learn:AnopensourcePythonlibrary thatimplementsarangeofmachine learning, preprocessing, cross-validationandvisualizationalgorithms.

• PCL: Astandalone, largescale,openprojectfor2D/3Dimageandpointcloudprocessing.

• SLAMframeworks(bundler, visualsfm,meshlab):Applicationsfor3Dreconstructionusingstructurefrommotion (SFM).

• Librariesforspecifictasks:e.g.trackinglibraries,Detectionlibraries…

Amir Sadeghian 3

Section 6- Computer Vision Libraries

OpenCV

Amir Sadeghian 4

Section 6- Computer Vision Libraries

IntroductiontoOpenCV

• Opensourcecomputervisionandmachinelearninglibrary• Containsimplementationsofalargenumberofvisionalgorithms• WrittennativelyinC++,alsohasC,Python,Java,andMATLABinterfaces

• SupportsWindows,Linux,MacOSX,Android,andiOS

Amir Sadeghian 5

Section 6- Computer Vision Libraries

Installation

• Downloadfromhttp://opencv.org andcompilefromsource• Windows:RunexecutabledownloadedfromOpenCV website• MacOSX:InstallthroughMacPorts,easy_install,…• Linux:Installthroughthepackagemanager(e.g.yum,apt)butmakesuretheversionissufficientlyup-to-dateforyourneeds

Amir Sadeghian 6

Section 6- Computer Vision Libraries

BasicOpenCV Structures

• Point,Point2f- 2DPoint

• Size - 2Dsizestructure

• Rect - 2Drectangleobject

• RotatedRect - Rect objectwithangle

• Mat - imageobject

Amir Sadeghian 7

Section 6- Computer Vision Libraries

Point

Amir Sadeghian 8

• 2DPointObject- int x,y;

• SampleFunctions

- Point.dot(<Point>)- computesdotproduct

- Point.inside(<Rect>)- returnstrueifpointisinside

Mathoperators,youmayuse-Pointoperator+

-Pointoperator+=

-Pointoperator-

-Pointoperator-=

-Pointoperator*

-Pointoperator*=

-booloperator==

-booloperator!=

-doublenorm

Section 6- Computer Vision Libraries

Size

• 2DSizeStructure- int width,height;

• Functions- Size.area()- returns(width*height)

Amir Sadeghian 9

Rect

• 2DRectangleStructure- int x,y,width,height;

• Functions

- Rect.tl()- returntopleftpoint

- Rect.br()- returnbottomrightpoint

Section 6- Computer Vision Libraries

cv::Mat

• TheprimarydatastructureinOpenCV istheMatobject.Itstoresimagesandtheircomponents.

• Mainitems

• rows,cols- lengthandwidth(int)

• channels- 1:grayscale,3:BGR

• depth:CV_<depth>C<num chan>

• Seethemanualsformoreinformation

Amir Sadeghian 10

Section 6- Computer Vision Libraries

cv::Mat- Functions

• Mat.at<datatype>(row,col)[channel]- returnspointer toimagelocation• Mat.channels()- returnsthenumberofchannels• Mat.clone()- returnsadeepcopyoftheimage• Mat.create(rows,cols,TYPE)- re-allocatesnewmemorytomatrix• Mat.cross(<Mat>)- computescrossproductoftwomatricies• Mat.depth()- returnsdatatypeofmatrix• Mat.dot(<Mat>)- computesthedotproductoftwomatrices

Amir Sadeghian 11

Section 6- Computer Vision Libraries

Pixeltypes

• PixelTypes showshowtheimageisrepresentedindata

• BGR- Thedefaultcolorofimread().Normal3channelcolor

• HSV- Hueiscolor,Saturationisamount,Valueislightness.3channels

• GRAYSCALE- Grayvalues,Singlechannel• OpenCV requiresthatimagesbeinBGRorGrayscaleinordertobeshownorsaved.Otherwise,undesirableeffectsmayappear.

Amir Sadeghian 12

Section 6- Computer Vision Libraries

ImageNormalizationandThresholding

• Normalizationremapsarangeofpixelvaluestoanotherrangeofpixelvalues

• voidnormalize(InputArray src,OutputArray dst,...)

• OpenCV providesageneralpurposemethodforthresholdinganimage

• doublethreshold(InputArray src,OutputArraydst,double thresh,double maxval,int type)

• Specifythresholding schemespecifiedbythetypevariable

Amir Sadeghian 13

Section 6- Computer Vision Libraries

ImageSmoothing

• Reducesthesharpnessofedgesandsmoothsoutdetailsinanimage

• OpenCV implementsseveralofthemostcommonlyusedmethods

• voidGaussianBlur(InputArray src,OutputArray dst,...)

• voidmedianBlur(InputArray src,OutputArray dst,...)

• Otherfunctionsincludegenericconvolution,separableconvolution,dilate,anderode.

Amir Sadeghian 14

Section 6- Computer Vision Libraries

ImageSmoothing:Code

Amir Sadeghian 15

Section 6- Computer Vision Libraries

ImageSmoothing:SampleImage

Amir Sadeghian 16

Original GaussianBlur MedianBlur

Section 6- Computer Vision Libraries

EdgeDetection

• OpenCV implementsanumberofoperatorstohelpdetectedgesinanimage

• SobelOperator• void cv::Sobel(image in, image out, CV_DEPTH, dx, dy);

• Scharr Operator• void cv::Scharr(image in, image out, CV_DEPTH, dx, dy);

• LaplacianOperator• void cv::Laplacian( image in, image out, CV_DEPTH);

• OpenCV alsoimplementsmulti-stageedgedetectionalgorithmssuchasCannyedgedetection

• Tip:Ifyourimageisnoisy,thenedgedetectionwilloftenexaggeratethenoise

• Sometimessmoothingtheimagebeforerunningedgedetectiongivesbetterresults

Amir Sadeghian 17

Section 6- Computer Vision Libraries

EdgeDetection:Code

Amir Sadeghian 18

Section 6- Computer Vision Libraries

EdgeDetection:SampleResults

Amir Sadeghian 19

Section 6- Computer Vision Libraries

FaceDetection:Viola-Jones

} Robustandfast} IntroducedbyPaulViolaandMichaelJones

} http://research.microsoft.com/~viola/Pubs/Detect/violaJones_CVPR2001.pdf

} Haar-likeFeatures

Amir Sadeghian 20

Section 6- Computer Vision Libraries

AndManyMore…

• ObjectTrackingusingOpenCV• HandwrittenDigitsClassification:AnOpenCV(C++/Python)Tutorial

• EyeDetectorusingOpenCV• ImageRecognitionandObjectDetection• HeadPoseEstimationusingOpenCV• ConfiguringQtforOpenCV• …

Amir Sadeghian 21

Section 6- Computer Vision Libraries

DeepLearningFrameworks

Amir Sadeghian 22

Section 6- Computer Vision Libraries

DeepLearningFrameworks

• Caffe• Torch/PyTorch

• NYU• scientificcomputingframeworkinLua• supportedbyFacebook

• TensorFlow• Google• Python

• Theano/Pylearn2• U.Montreal• Python• symboliccomputationandautomaticdifferentiation

• MatConvNet• OxfordU.• DeepLearninginMATLAB

Amir Sadeghian 23

Section 6- Computer Vision Libraries

FrameworkComparison

• Morealikethandifferent• Allexpressdeepmodels• Allareopen-source(contributions differ)• Mostincludescripting forhackingandprototyping

• Nostrictwinners,experimentandchoosetheframeworkthatbestfitsyourwork

Amir Sadeghian 24

Section 6- Computer Vision Libraries

Caffe:Overview

• WhatisCaffe?• Training/Finetuning asimplemodel• DeepdiveintoCaffe!

Amir Sadeghian 25

Section 6- Computer Vision Libraries

WhatisCaffe?

• A deeplearningframework

• Openframework,models,andworkedexamplesfordeeplearning

• 4000+citations,250+contributors,11,000+forks

• Focushasbeenvision,butbranchingout:sequences, reinforcementlearning,speech+text

Amir Sadeghian 26

Section 6- Computer Vision Libraries

Caffe

• PureC++/CUDAarchitecturefordeeplearning• commandline,Python,MATLABinterfaces

• Fast,well-testedcode• Tools,referencemodels,demos,andrecipes• SwitchbetweenCPUandGPU

• Caffe::set_mode(Caffe::GPU);

Amir Sadeghian 27

Section 6- Computer Vision Libraries

Installation

• http://caffe.berkeleyvision.org/installation.html •• CUDA,OPENCV• BLAS(BasicLinearAlgebraSubprograms):operationslikematrixmultiplication,matrixaddition,bothimplementationforCPU(cBLAS)andGPU(cuBLAS).providedbyMKL(INTEL),ATLAS,openBLAS,etc.

• Boost:ac++ library.>Usesomeofitsmathfunctionsandshared_pointer.

• glog,gflags providelogging&commandlineutilities.>Essentialfordebugging.

• leveldb,lmdb:databaseio foryourprogram.>Needtoknowthisforpreparingyourowndata.

• protobuf:anefficientandflexiblewaytodefinedatastructure.>Needtoknowthisfordefiningnewlayers.

Amir Sadeghian 28

Section 6- Computer Vision Libraries

Caffe Tutorial

• Nets,Layers,andBlobs:theanatomyofaCaffe model.• Forward/Backward:theessentialcomputationsoflayeredcompositionalmodels.

• Loss:thetasktobelearnedisdefinedbytheloss.• Solver:thesolvercoordinatesmodeloptimization.• Interfaces:commandline,Python,andMATLABCaffe.• Data:howtocaffeinatedataformodelinput.

http:/caffe.berkeleyvision.org/tutorial/

Amir Sadeghian 29

Section 6- Computer Vision Libraries

Caffe

• Blob:StorageandCommunicationofData• DatablobsareNxCxHxW

• Net:Containsallthelayersinthenetworks• Performs forward/backwardpassthrough theentirenetwork

• Solver:Usedtosettraining/testingparameters• Numberof iterations,backpropagationmethod,etc..

Amir Sadeghian 30

Section 6- Computer Vision Libraries

Training:Step1

• Createalenet_train.prototxt

• DataLayers

• OperationalLayers

• LossLayers

Amir Sadeghian 31

Section 6- Computer Vision Libraries

NetworkDefinition(train.prototxt)

Amir Sadeghian 32

Section 6- Computer Vision Libraries

NetworkDefinition(train.prototxt)

Amir Sadeghian 33

Section 6- Computer Vision Libraries

NetworkDefinition(train.prototxt)

Amir Sadeghian 34

Section 6- Computer Vision Libraries

Training:Step2

• Createalenet_solver.prototxt

Amir Sadeghian 35

Section 6- Computer Vision Libraries

Solver(solver.prototxt)

Amir Sadeghian 36

Section 6- Computer Vision Libraries

Training:Step2

• SomedetailsonSGDparameters

Amir Sadeghian 37

Section 6- Computer Vision Libraries

Training:Step3

• #trainLeNet• caffe train-solverexamples/mnist/lenet_solver.prototxt

• #trainonGPU2• caffe train-solverexamples/mnist/lenet_solver.prototxt -gpu 2

• #resumetrainingfromthehalf-waypointsnapshot• caffe train-solverexamples/mnist/lenet_solver.prototxt -snapshotexamples/mnist/lenet_iter_5000.solverstate

Amir Sadeghian 38

Section 6- Computer Vision Libraries

NetworkDefinition(test.prototxt)

Amir Sadeghian 39

Section 6- Computer Vision Libraries

NetworkDefinition(test.prototxt)

Amir Sadeghian 40

Section 6- Computer Vision Libraries

PyCaffe (TraininginPython)

• Addcaffe pythondirectorytopathandimportcaffe

Amir Sadeghian 41

Section 6- Computer Vision Libraries

UseNetSpec todefinelayers

Amir Sadeghian 42

Section 6- Computer Vision Libraries

Definesolverandtrainnetwork

Amir Sadeghian 43

Section 6- Computer Vision Libraries

AccessNetdata

Amir Sadeghian 44

Section 6- Computer Vision Libraries

PyCaffe (TestinginPython)

Amir Sadeghian 45

Section 6- Computer Vision Libraries

OpenModelCollection

• TheCaffe ModelZoo• opencollectionofdeepmodelstoshareinnovation

• VGGILSVRC14+Devilmodelsinthezoo• Network-in-Network/CCCPmodelinthezoo

• MITPlacesscenerecognitionmodelinthezoo• Helpreproduce research• Bundled toolsforloadingandpublishingmodels

• ShareYourModels!withyourcitation+licenseofcourse

Amir Sadeghian 46

Section 6- Computer Vision Libraries

ReferenceModels

Caffe offersthe

• Modeldefinitions

• Optimizationsettings

• Pre-trainedweightssoyoucanstartrightaway.

Amir Sadeghian 47

Alexnet: Imagenet Classification

Section 6- Computer Vision Libraries

WhentoFine-tune?

• Agoodfirststep!• Morerobustoptimization• good initializationhelps• Needslessdata• Fasterlearning

• State-of-the-art resultsin• recognition• detection• segmentation

Amir Sadeghian 48

Section 6- Computer Vision Libraries

Fine-tuningTricks

• Learnthelastlayerfirst• Caffe layershavelocallearning rates:blobs_lr• Freezeallbutthelastlayerforfastoptimizationandavoidingearlydivergence.

• Stopifgoodenough, orkeepfine-tuning

• Reducethelearningrate• Dropthesolverlearning rateby10x,100x–• Preservetheinitialization frompre-trainingandavoidthrashing

Amir Sadeghian 49

Section 6- Computer Vision Libraries

CNNTrainingtips

• Beforerunningfinal/longtraining• Makesureyoucanoverfit onasmalltrainingset• Makesureyour lossdecreasesoverfirstseveraliterations• Otherwiseadjustparameteruntilitdoes,especiallylearning rate

• Separatetrain/val/testdata

Amir Sadeghian 50

Section 6- Computer Vision Libraries

AnyQuestions?

51Amir Sadeghian