Learning Spatiotemporal Features with 3D Convolutional Networks Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

Page 1: Learning Spatiotemporal Features with 3D Convolutional ...faculty.iitmandi.ac.in/~aditya/cs671/cs671_2017/data/Lect23.pdf · Learning Spatiotemporal Features with 3D Convolutional

Learning Spatiotemporal Features with 3D Convolutional Networks

Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

Effective Video Descriptor

• Generic
  – Can represent different types of video
• Compact
  – Processing, storage
• Efficient
  – Computation
• Simple
  – Implementation

3D Convolution and Pooling

• 3D convolution models temporal information better than 2D convolution.
  – 2D CONV: performed only spatially, so temporal information is lost.
  – 3D CONV: performed spatio-temporally, so temporal information is preserved.
• The same applies to pooling.

2D Convolution on 1-ch Input

• Result: 2D image.

2D Convolution on n-ch Input

• Result: 2D image.

3D Convolution on n-ch Input

• Result: volume.
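The difference between the last three slides can be checked with a small pure-Python sketch. It is an illustration only, not code from the C3D release: the function names, the all-ones inputs and the "valid" (no padding, stride 1) setting are assumptions. A 2D convolution over an n-channel input sums over the channels and returns one 2D map, while a 3D convolution also slides along time and returns a volume.

```python
def ones(*dims):
    """Build a nested list of 1.0 values with the given dimensions."""
    if len(dims) == 1:
        return [1.0] * dims[0]
    return [ones(*dims[1:]) for _ in range(dims[0])]


def conv2d_multichannel(x, w):
    """Valid 2D convolution summed over channels.

    x: [C][H][W] input, w: [C][kh][kw] kernel. Returns one [H'][W'] map.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(w[0]), len(w[0][0])
    out = []
    for i in range(H - kh + 1):
        row = []
        for j in range(W - kw + 1):
            s = 0.0
            for c in range(C):
                for a in range(kh):
                    for b in range(kw):
                        s += x[c][i + a][j + b] * w[c][a][b]
            row.append(s)
        out.append(row)
    return out


def conv3d_multichannel(x, w):
    """Valid 3D convolution summed over channels.

    x: [C][T][H][W] input, w: [C][kt][kh][kw] kernel. Returns a [T'][H'][W'] volume.
    """
    C, T, H, W = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    kt, kh, kw = len(w[0]), len(w[0][0]), len(w[0][0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                s = 0.0
                for c in range(C):
                    for d in range(kt):
                        for a in range(kh):
                            for b in range(kw):
                                s += x[c][t + d][i + a][j + b] * w[c][d][a][b]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out


# 2D case: 4 frames of 5x5 pixels stacked as 4 channels, one 4x3x3 kernel.
map_2d = conv2d_multichannel(ones(4, 5, 5), ones(4, 3, 3))

# 3D case: 1 channel, 4 frames of 5x5 pixels, one 1x3x3x3 kernel.
volume_3d = conv3d_multichannel(ones(1, 4, 5, 5), ones(1, 3, 3, 3))

print(len(map_2d), len(map_2d[0]))                           # spatial size only
print(len(volume_3d), len(volume_3d[0]), len(volume_3d[0][0]))  # time, height, width
```

In the 2D case the four frames collapse into a single 3x3 map, so the output has no time axis left. In the 3D case the output keeps a temporal dimension of length 2.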

Identify Best Architecture for 3D ConvNets (on UCF101)

• Common network settings
  – All video frames are resized to 128x171.
  – Videos are split into non-overlapping 16-frame clips.
  – Input: 3x16x128x171.
  – 5 convolution and pooling layers
  – 2 fully connected layers
  – Softmax loss layer to predict action labels
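The clip-splitting step in these settings can be sketched as follows. This is a minimal illustration, not the original preprocessing code: the helper names are hypothetical, and dropping trailing frames that do not fill a whole clip is an assumption the slides do not state.

```python
CLIP_LEN = 16                 # frames per clip (from the slide)
FRAME_H, FRAME_W = 128, 171   # frame size after resizing (from the slide)
CHANNELS = 3                  # RGB


def split_into_clips(num_frames, clip_len=CLIP_LEN):
    """Return (start, end) frame index pairs for non-overlapping clips.

    Assumption: leftover frames that cannot fill a whole clip are dropped.
    """
    return [(start, start + clip_len)
            for start in range(0, num_frames - clip_len + 1, clip_len)]


def clip_shape(clip_len=CLIP_LEN):
    """Shape of one network input: channels x frames x height x width."""
    return (CHANNELS, clip_len, FRAME_H, FRAME_W)


clips = split_into_clips(100)   # a hypothetical 100-frame video
print(clips)
print(clip_shape())
```

For a 100-frame video this gives six clips and leaves the last four frames unused. Each clip has the 3x16x128x171 shape listed above.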

Identify Best Architecture for 3D ConvNets (on UCF101)

• Varying network architecture
  – Homogeneous temporal depth
    • Kernel temporal depth d = 1, 3, 5, 7
  – Varying temporal depth
    • Increasing: 3-3-5-5-7
    • Decreasing: 7-5-5-3-3
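One way to compare these settings is to ask how many input frames a single output unit can see after the five convolution layers. The sketch below is only a rough illustration: it assumes stride-1 convolutions and ignores pooling, and the decreasing pattern is taken as 7-5-5-3-3 (one depth per convolution layer). The dictionary and function names are hypothetical.

```python
# Kernel temporal depth of each of the five convolution layers.
CONFIGS = {
    "depth-1": [1, 1, 1, 1, 1],
    "depth-3": [3, 3, 3, 3, 3],
    "depth-5": [5, 5, 5, 5, 5],
    "depth-7": [7, 7, 7, 7, 7],
    "increasing": [3, 3, 5, 5, 7],
    "decreasing": [7, 5, 5, 3, 3],
}


def temporal_receptive_field(depths):
    """Frames seen by one output unit after stacked stride-1 convolutions.

    Each layer with temporal depth d widens the receptive field by d - 1.
    Pooling is ignored here (an assumption made to keep the sketch short).
    """
    field = 1
    for d in depths:
        field += d - 1
    return field


fields = {name: temporal_receptive_field(d) for name, d in CONFIGS.items()}
for name, field in fields.items():
    print(f"{name:>10}: {field} frames")
```

Under these assumptions, depth-1 never looks beyond a single frame, so it behaves like a 2D convolution applied to each frame separately. The increasing and decreasing patterns reach the same temporal extent and differ only in where the longer kernels sit.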

3D Convolution Kernel Temporal Depth Search

Spatiotemporal Feature Learning

• Best network architecture
  – With 3x3x3 kernels
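The slide's architecture figure did not survive in this transcript, so the sketch below traces feature-map sizes through a commonly cited C3D layout. The layer list, the 112x112 crop and the pooling sizes are all assumptions rather than values taken from the slides: eight 3x3x3 convolutions with padding 1, pool1 at 1x2x2, the other pools at 2x2x2, and ceil-rounded pooling.

```python
import math


def conv_out(size, kernel=3, stride=1, pad=1):
    """Output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel):
    """Output length of pooling with stride == kernel, rounding up (assumed)."""
    return math.ceil((size - kernel) / kernel) + 1


# (name, kind, argument): output channels for conv, (kt, kh, kw) for pool.
# This layer list is an assumption, not something the slides spell out.
LAYERS = [
    ("conv1a", "conv", 64),  ("pool1", "pool", (1, 2, 2)),
    ("conv2a", "conv", 128), ("pool2", "pool", (2, 2, 2)),
    ("conv3a", "conv", 256), ("conv3b", "conv", 256), ("pool3", "pool", (2, 2, 2)),
    ("conv4a", "conv", 512), ("conv4b", "conv", 512), ("pool4", "pool", (2, 2, 2)),
    ("conv5a", "conv", 512), ("conv5b", "conv", 512), ("pool5", "pool", (2, 2, 2)),
]


def trace(frames=16, height=112, width=112, channels=3):
    """Return (name, channels, frames, height, width) after every layer."""
    rows = []
    t, h, w = frames, height, width
    for name, kind, arg in LAYERS:
        if kind == "conv":
            channels = arg
            t, h, w = conv_out(t), conv_out(h), conv_out(w)
        else:
            kt, kh, kw = arg
            t, h, w = pool_out(t, kt), pool_out(h, kh), pool_out(w, kw)
        rows.append((name, channels, t, h, w))
    return rows


rows = trace()
for name, c, t, h, w in rows:
    print(f"{name:>7}: {c} x {t} x {h} x {w}")

_, c, t, h, w = rows[-1]
flattened = c * t * h * w   # size of the vector fed to the first FC layer
print("flattened:", flattened)
```

Because pool1 does not pool over time, the temporal axis is only halved four times, so the 16 input frames are not merged too early.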

Spatiotemporal Feature Learning

• Dataset for training
  – Sports-1M dataset
    • Largest video classification benchmark
    • 1.1 million sports videos
    • 487 categories

Sports-1M Classification Results

C3D Video Descriptor

• The C3D model can be used as a feature extractor for various video analysis tasks.
  – Action recognition
  – Action similarity
  – Scene and object recognition
• Uses fc6 activations
  – 4096 dimensions
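A video usually yields several clips, so the per-clip fc6 vectors have to be combined into one descriptor. The sketch below averages them and L2-normalizes the result. That pooling scheme is an assumption here (the slide only names fc6 and its size), the function name is hypothetical, and the two constant vectors stand in for real activations.

```python
import math

FC6_DIM = 4096  # descriptor size from the slide


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the mean.

    clip_features: list of equal-length lists of floats, one per clip.
    """
    if not clip_features:
        raise ValueError("need at least one clip feature vector")
    dim = len(clip_features[0])
    n = len(clip_features)
    mean = [sum(f[i] for f in clip_features) / n for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    # Leave an all-zero mean untouched to avoid dividing by zero.
    return [v / norm for v in mean] if norm > 0 else mean


# Placeholder activations for two clips of the same video (not real data).
clip_a = [1.0] * FC6_DIM
clip_b = [3.0] * FC6_DIM

descriptor = video_descriptor([clip_a, clip_b])
print(len(descriptor))
print(math.sqrt(sum(v * v for v in descriptor)))  # length of the descriptor
```

The result is a single fixed-length vector per video that can be passed to a simple classifier such as a linear SVM.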

Action Recognition

• Dataset: UCF101
  – 13,320 videos
  – 101 human action categories

Action Similarity Labeling

• Dataset: ASLAN
  – 3,631 videos
  – 432 action classes

Scene and Object Recognition

• Dataset: YUPENN
  – 420 videos
  – 14 scene categories
• Dataset: Maryland
  – 130 videos
  – 13 scene categories

Why C3D Features?

• Generic
• Compact
• Efficient
• Simple

Visualisation using the t-SNE method:

L. van der Maaten and G. Hinton. Visualizing data using t-SNE. JMLR

What Does C3D Learn?

Visualised using the deconvolution method in M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In ECCV, 2014.

Useful Links

• http://vlg.cs.dartmouth.edu/c3d/
• https://github.com/facebook/C3D

Tools and software required:

- keras
- tensorflow
- ffmpeg (compiled from source)
- opencv (compiled from source)

Thank you