
Getting Started with TensorFlow

Table of Contents

Getting Started with TensorFlow
Credits
About the Author
About the Reviewer
www.PacktPub.com
  eBooks, discount offers, and more
  Why subscribe?
Preface
  What this book covers
  What you need for this book
  Who this book is for
  Conventions
  Reader feedback
  Customer support
    Downloading the example code
    Downloading the color images of this book
    Errata
    Piracy
    Questions
1. TensorFlow – Basic Concepts
  Machine learning and deep learning basics
    Supervised learning
    Unsupervised learning
    Deep learning
  TensorFlow – A general overview
  Python basics
    Syntax
    Data types
    Strings
    Control flow
    Functions
    Classes
    Exceptions
    Importing a library
  Installing TensorFlow
    Installing on Mac or Linux distributions
    Installing on Windows
    Installation from source
    Testing your TensorFlow installation
  First working session
  Data Flow Graphs
  TensorFlow programming model
  How to use TensorBoard
  Summary
2. Doing Math with TensorFlow
  The tensor data structure
    One-dimensional tensors
    Two-dimensional tensors
      Tensor handling
    Three-dimensional tensors
  Handling tensors with TensorFlow
    Prepare the input data
  Complex numbers and fractals
    Prepare the data for Mandelbrot's set
    Build and execute the Data Flow Graph for Mandelbrot's set
    Visualize the result for Mandelbrot's set
    Prepare the data for Julia's set
    Build and execute the Data Flow Graph for Julia's set
    Visualize the result
  Computing gradients
  Random numbers
    Uniform distribution
    Normal distribution
    Generating random numbers with seeds
    Monte Carlo's method
  Solving partial differential equations
    Initial condition
    Model building
    Graph execution
    Computational function used
  Summary
3. Starting with Machine Learning
  The linear regression algorithm
    Data model
    Cost functions and gradient descent
    Testing the model
  The MNIST dataset
    Downloading and preparing the data
  Classifiers
    The nearest neighbor algorithm
    Building the training set
    Cost function and optimization
    Testing and algorithm evaluation
  Data clustering
    The k-means algorithm
    Building the training set
    Cost functions and optimization
    Testing and algorithm evaluation
  Summary
4. Introducing Neural Networks
  What are artificial neural networks?
  Neural network architectures
  Single Layer Perceptron
    The logistic regression
    TensorFlow implementation
      Building the model
      Launch the session
      Test evaluation
      Source code
  Multi Layer Perceptron
    Multi Layer Perceptron classification
      Build the model
      Launch the session
      Source code
    Multi Layer Perceptron function approximation
      Build the model
      Launch the session
  Summary
5. Deep Learning
  Deep learning techniques
  Convolutional neural networks
    CNN architecture
    TensorFlow implementation of a CNN
      Initialization step
      First convolutional layer
      Second convolutional layer
      Densely connected layer
      Readout layer
      Testing and training the model
      Launching the session
      Source code
  Recurrent neural networks
    RNN architecture
    LSTM networks
    NLP with TensorFlow
      Download the data
      Building the model
      Running the code
  Summary
6. GPU Programming and Serving with TensorFlow
  GPU programming
  TensorFlow Serving
    How to install TensorFlow Serving
      Bazel
      gRPC
      TensorFlow Serving dependencies
      Install Serving
    How to use TensorFlow Serving
      Training and exporting the TensorFlow model
      Running a session
      Loading and exporting a TensorFlow model
      Test the server
  Summary

Getting Started with TensorFlow

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2016

Production reference: 1190716

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78646-857-4

www.packtpub.com

Credits

Author
Giancarlo Zaccone

Copy Editor
Alpha Singh

Reviewer
Jayani Withanawasam

Project Coordinator
Shweta H Birwatkar

Commissioning Editor
Veena Pagare

Proofreader
Safis Editing

Acquisition Editor
Vinay Argekar

Indexer
Mariammal Chettiyar

Content Development Editor
Sumeet Sawant

Production Coordinator
Nilesh Mohite

Technical Editor
Deepti Tuscano

Cover Work
Nilesh Mohite

About the Author

Giancarlo Zaccone has more than 10 years of experience managing research projects in both the scientific and industrial domains. He worked as a researcher at the C.N.R., the National Research Council, where he was involved in projects related to parallel numerical computing and scientific visualization.

Currently, he is a senior software engineer at a consulting company, developing and maintaining software systems for space and defence applications.

Giancarlo holds a master's degree in physics from the Federico II University of Naples and a second-level postgraduate master's degree in scientific computing from La Sapienza University of Rome.

He is also the author of the Packt book Python Parallel Programming Cookbook.

You can contact him at https://it.linkedin.com/in/giancarlozaccone

About the Reviewer

Jayani Withanawasam is a senior software engineer in the Research and Development team at Zaizi Asia. She is the author of the book Apache Mahout Essentials, on scalable machine learning. She was a speaker at Alfresco Summit 2014 in London, where her talk covered applications of machine learning techniques in smart enterprise content management (ECM) solutions. She presented her research "Content Extraction and Context Inference based Information Retrieval" at the Women in Machine Learning (WiML) 2015 workshop, which was co-located with the Neural Information Processing Systems (NIPS) 2015 conference in Montreal, Canada.

Jayani is currently pursuing an MSc in Artificial Intelligence at the University of Moratuwa, Sri Lanka. She has strong research interests in machine learning and computer vision.

You can contact her at https://lk.linkedin.com/in/jayaniwithanawasam

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Preface

TensorFlow is an open source software library used to implement machine learning and deep learning systems.

Behind these two names are hidden a series of powerful algorithms that share a common challenge: to allow a computer to learn how to automatically recognize complex patterns and make the smartest decisions possible.

Machine learning algorithms are supervised or unsupervised; simplifying as much as possible, we can say that the biggest difference is that in supervised learning the programmer instructs the computer how to do something, whereas in unsupervised learning the computer learns all by itself.

Deep learning is instead a newer area of machine learning research, introduced with the objective of moving machine learning closer to artificial intelligence goals. This means that deep learning algorithms try to operate like the human brain.

With the aim of conducting research in these fascinating areas, the Google team developed TensorFlow, which is the subject of this book.

To introduce TensorFlow's programming features, we have used the Python programming language. Python is fun and easy to use; it is a true general-purpose language and is quickly becoming a must-have tool in the arsenal of any self-respecting programmer.

It is not the aim of this book to completely describe all TensorFlow objects and methods; instead, we will introduce the important system concepts and lead you up the learning curve as fast and efficiently as we can. Each chapter of the book presents a different aspect of TensorFlow, accompanied by several programming examples that reflect typical issues of machine and deep learning.

Although it is large and complex, TensorFlow is designed to be easy to use once you learn about its basic design and programming methodology.

The purpose of Getting Started with TensorFlow is to help you do just that.

Enjoy reading!

What this book covers

Chapter 1, TensorFlow – Basic Concepts, contains general information on the structure of TensorFlow and the issues for which it was developed. It also provides basic programming guidelines for the Python language and a first TensorFlow working session after the installation procedure. The chapter ends with a description of TensorBoard, a powerful tool for optimization and debugging.

Chapter 2, Doing Math with TensorFlow, describes the mathematical processing abilities of TensorFlow. It covers programming examples from basic algebra up to partial differential equations. Also, the basic data structure in TensorFlow, the tensor, is explained.

Chapter 3, Starting with Machine Learning, introduces some machine learning models. We start by implementing the linear regression algorithm, which is concerned with modeling relationships between data. The main focus of the chapter is on solving two basic problems in machine learning: classification, that is, how to assign each new input to one of the possible given categories; and data clustering, which is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.

Chapter 4, Introducing Neural Networks, provides a quick and detailed introduction to neural networks. These are mathematical models that represent the interconnection between elements, the artificial neurons. They are mathematical constructs that to some extent mimic the properties of living neurons. Neural networks build the foundation on which the architecture of deep learning algorithms rests. Two basic types of neural nets are then implemented: the Single Layer Perceptron and the Multi Layer Perceptron, both for classification problems.

Chapter 5, Deep Learning, gives an overview of deep learning algorithms. Only in recent years has deep learning collected a large number of results considered unthinkable a few years ago. We'll show how to implement two fundamental deep learning architectures, convolutional neural networks (CNN) and recurrent neural networks (RNN), for image recognition and speech translation problems respectively.

Chapter 6, GPU Programming and Serving with TensorFlow, shows the TensorFlow facilities for GPU computing and introduces TensorFlow Serving, a high-performance open source serving system for machine learning models, designed for production environments and optimized for TensorFlow.

What you need for this book

All the examples have been implemented using Python version 2.7 on an Ubuntu Linux 64-bit machine, together with the TensorFlow library version 0.7.1.

You will also need the following Python modules (preferably the latest versions):

Pip
Bazel
Matplotlib
NumPy
Pandas

Who this book is for

The reader should have a basic knowledge of programming and math concepts and, at the same time, want to be introduced to the topics of machine and deep learning. After reading this book, you will be able to master TensorFlow's features to build powerful applications.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The instructions for flow control are if, for, and while."

Any command-line input or output is written as follows:

>>> myvar = 3
>>> myvar += 2
>>> myvar
5
>>> myvar -= 1
>>> myvar
4

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "The shortcuts in this book are based on the Mac OS X 10.5+ scheme."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book - what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail feedback@packtpub.com, and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Getting-Started-with-TensorFlow. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from http://www.packtpub.com/sites/default/files/downloads/GettingStartedwithTensorFlow_ColorImages.pdf

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books - maybe a mistake in the text or the code - we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at questions@packtpub.com, and we will do our best to address the problem.

Chapter 1. TensorFlow – Basic Concepts

In this chapter, we'll cover the following topics:

Machine learning and deep learning basics
TensorFlow – A general overview
Python basics
Installing TensorFlow
First working session
Data Flow Graphs
TensorFlow programming model
How to use TensorBoard

Machine learning and deep learning basics

Machine learning is a branch of artificial intelligence, and more specifically of computer science, which deals with the study of systems and algorithms that can learn from data, synthesizing new knowledge from it.

The word learn intuitively suggests that a system based on machine learning may, on the basis of the observation of previously processed data, improve its knowledge in order to achieve better results in the future, or provide output closer to the desired output for that particular system.

The ability of a program or a system based on machine learning to improve its performance in a particular task, thanks to past experience, is strongly linked to its ability to recognize patterns in the data. This theme, called pattern recognition, is therefore of vital importance and of increasing interest in the context of artificial intelligence; it is the basis of all machine learning techniques.

The training of a machine learning system can be done in different ways:

Supervised learning
Unsupervised learning

Supervised learning

Supervised learning is the most common form of machine learning. With supervised learning, a set of examples, the training set, is submitted as input to the system during the training phase, where each example is labeled with the respective desired output value. For example, let's consider a classification problem, where the system must attribute some experimental observations to one of N different classes already known. In this problem, the training set is presented as a sequence of pairs of the type {(X1, Y1), ..., (Xn, Yn)}, where Xi are the input vectors (feature vectors) and Yi represents the desired class for the corresponding input vector. Most supervised learning algorithms share one characteristic: training is performed by minimizing a particular loss function (cost function), which represents the output error with respect to the desired output.

The cost function most used for this type of training calculates the mean squared error between the desired output and the one supplied by the system. After training, the accuracy of the model is measured on a set of examples disjoint from the training set, the so-called validation set.
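To make the mean squared error concrete, the following minimal sketch (not one of the book's examples; it assumes the TensorFlow 0.x API used throughout this book) computes the cost between a vector of desired outputs and a vector of predictions:

import tensorflow as tf

# Desired outputs (labels) and the outputs supplied by the system
desired = tf.constant([1.0, 0.0, 1.0, 1.0])
predicted = tf.constant([0.9, 0.2, 0.8, 0.6])

# Mean squared error: the average of the squared differences
mse = tf.reduce_mean(tf.square(desired - predicted))

with tf.Session() as session:
    print(session.run(mse))  # 0.0625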

Supervised learning workflow

In this phase, the model's generalization capability is verified: we test whether the output is correct for an input that was not used during the training phase.

Unsupervised learning

In unsupervised learning, the training examples provided to the system are not labeled with the class they belong to. The system, therefore, develops and organizes the data, looking for common characteristics among them, and changing them based on its internal knowledge.

Unsupervised learning algorithms are particularly used in clustering problems, in which a number of input examples are present, you do not know the class a priori, and you do not even know what the possible classes are, or how numerous they are. This is a clear case where you cannot use supervised learning, because you do not know a priori the number of classes.

Unsupervised learning workflow

Deep learning

Deep learning techniques represent a remarkable step forward taken by machine learning in recent decades, having provided results never seen before in many applications, such as image and speech recognition or Natural Language Processing (NLP). There are several reasons that led to deep learning being developed and placed at the center of the field of machine learning only in recent decades. One reason, perhaps the main one, is surely represented by progress in hardware, with the availability of new processors, such as graphics processing units (GPUs), which have greatly reduced the time needed for training networks, lowering it by a factor of 10 or 20. Another reason is certainly the ever more numerous datasets on which to train a system, needed to train architectures of a certain depth and with a high dimensionality for the input data.

Deep learning workflow

Deep learning is based on the way the human brain processes information and learns, responding to external stimuli. It consists of a machine learning model at several levels of representation, in which the deeper levels take as input the outputs of the previous levels, transforming them and always abstracting more. Each level in this hypothetical model corresponds to a different area of the cerebral cortex: when the brain receives images, it processes them through various stages such as edge detection and form perception, that is, from a primitive representation level to the most complex. For example, in an image classification problem, each block gradually extracts the features, at various levels of abstraction, taking as input data already processed by means of filtering operations.

TensorFlow – A general overview

TensorFlow (https://www.tensorflow.org/) is a software library, developed by the Google Brain Team within Google's Machine Learning Intelligence research organization, for the purposes of conducting machine learning and deep neural network research. TensorFlow combines computational algebra with compilation optimization techniques, easing the calculation of many mathematical expressions where the problem is the time required to perform the computation.

The main features include:

Defining, optimizing, and efficiently calculating mathematical expressions involving multi-dimensional arrays (tensors).
Programming support for deep neural networks and machine learning techniques.
Transparent use of GPU computing, automating the management and optimization of the memory and of the data used. You can write the same code and run it either on CPUs or GPUs. More specifically, TensorFlow will figure out which parts of the computation should be moved to the GPU (see the sketch after this list).
High scalability of computation across machines and huge data sets.
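As a quick illustration of this transparent device placement, the following is a minimal sketch (not from the book; it assumes a GPU-enabled TensorFlow build and the 0.x API used in this book) that pins one multiplication to the first GPU, while everything outside the block is placed automatically:

import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0])
b = tf.constant([4.0, 5.0, 6.0])

# Explicitly request the first GPU for this operation;
# without tf.device, TensorFlow decides the placement itself
with tf.device('/gpu:0'):
    c = a * b

# log_device_placement prints where each operation actually runs
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
    print(session.run(c))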

TensorFlow homepage

TensorFlow is available with Python and C++ support, and we shall use Python 2.7 for learning, as indeed the Python API is better supported and much easier to learn. The Python installation depends on your system; the download page (https://www.python.org/downloads/) contains all the information needed for its installation. In the next section, we explain very briefly the main features of the Python language, with some programming examples.

Python basics

Python is a strongly typed and dynamically typed language (data types exist but it is not necessary to declare them explicitly), case-sensitive (var and VAR are two different variables), and object-oriented (everything in Python is an object).

Syntax

In Python, a line terminator is not required, and blocks are specified with indentation. Indent to begin a block and remove indentation to conclude it, that's all. Instructions that require an indented block end with a colon (:). Comments begin with the hash sign (#) and are single-line. Strings on multiple lines are used for multi-line comments. Assignments are accomplished with the equal sign (=). For equality tests we use the double equal (==) symbol. You can increase and decrease a value by using += and -= followed by the addend. This works with many data types, including strings. You can assign and use multiple variables on the same line.

Following are some examples:

>>> myvar = 3
>>> myvar += 2
>>> myvar
5
>>> myvar -= 1
>>> myvar
4

"""This is a comment"""

>>> mystring = "Hello"
>>> mystring += " world."
>>> print mystring
Hello world.

The following code swaps two variables in one line:

>>> myvar, mystring = mystring, myvar

Data types

The most significant structures in Python are lists, tuples, and dictionaries. Sets have been integrated into Python since version 2.5 (for previous versions, they are available in the sets library). Lists are similar to one-dimensional arrays, but you can create lists that contain other lists. Dictionaries are arrays that contain pairs of keys and values (hash tables), and tuples are immutable one-dimensional objects. In Python, arrays can be of any type, so you can mix integers, strings, and so on in your lists/dictionaries and tuples. The index of the first object in any type of array is always zero. Negative indices are allowed and count from the end of the array; -1 is the last element. Variables can refer to functions.

>>> example = [1, ["list1", "list2"], ("one", "tuple")]
>>> mylist = ["Element 1", 2, 3.14]
>>> mylist[0]
"Element 1"
>>> mylist[-1]
3.14
>>> mydict = {"Key 1": "Val 1", 2: 3, "pi": 3.14}
>>> mydict["pi"]
3.14
>>> mytuple = (1, 2, 3)
>>> myfunc = len
>>> print myfunc(mylist)
3

You can get an array range using a colon (:). Not specifying the starting index of the range implies the first element; not indicating the final index implies the last element. Negative indices count from the last element (-1 is the last element). Then run the following commands:

>>> mylist = ["first element", 2, 3.14]
>>> print mylist[:]
['first element', 2, 3.1400000000000001]
>>> print mylist[0:2]
['first element', 2]
>>> print mylist[-3:-1]
['first element', 2]
>>> print mylist[1:]
[2, 3.14]

Strings

Python strings are indicated either with a single quotation mark (') or a double one ("), and you are allowed to use one notation within a string delimited by the other ("He said 'hello'." is valid). Strings of multiple lines are enclosed in triple (double or single) quotes ("""). Python supports unicode; just use the syntax u"This is a unicode string". To insert values into a string, use the % operator (modulo) and a tuple. Each % is replaced by a tuple element, from left to right, and you are also permitted to use a dictionary for the replacements.

>>>print"Nome:%s\nNumber:%s\nString:%s"%(myclass.nome,3,3*"-

")

Name:Poromenos

Number:3

String:---

strString="""thisisastring

onmultiplelines."""

>>>print"This%(verbo)sun%(name)s."%{"name":"test","verb":

"is"}

Thisisatest.

Control flow

The instructions for flow control are if, for, and while. There is no switch control flow; in its place, we use if. The for control flow is used to enumerate the members of a list. To get a list of numbers, you use range(number).

rangelist = range(10)
>>> print rangelist
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Let's check if number is one of the numbers in the tuple:

for number in rangelist:
    if number in (3, 4, 7, 9):
        # "break" ends the for instruction without running the else clause
        break
    else:
        # "continue" continues with the next iteration of the loop
        continue
else:
    # this is an optional "else",
    # executed only if the loop is not interrupted with "break"
    pass  # it does nothing

if rangelist[1] == 2:
    print "the second element (lists are 0-based) is 2"
elif rangelist[1] == 3:
    print "the second element is 3"
else:
    print "I don't know"

while rangelist[1] == 1:
    pass

Functions

Functions are declared with the keyword def. Any optional arguments must be declared after those that are mandatory and must have a value assigned. When calling functions by naming their arguments, you must also pass a value. Functions can return a tuple (tuple unpacking enables the return of multiple values). Lambda functions are in-line. Parameters are passed by reference, but immutable types (tuples, integers, strings, and so on) cannot be changed in the function. This happens because only the position of the element in memory is passed, and assigning another object to the variable results in the loss of the earlier object reference.

For example:

# equal to a def f(x): return x + 1
funzionevar = lambda x: x + 1
>>> print funzionevar(1)
2

def passing_example(my_list, my_int):
    my_list.append("new element")
    my_int = 4
    return my_list, my_int

>>> input_my_list = [1, 2, 3]
>>> input_my_int = 10
>>> print passing_example(input_my_list, input_my_int)
([1, 2, 3, 'new element'], 10)
>>> input_my_list
[1, 2, 3, 'new element']
>>> input_my_int
10

Classes

Python supports multiple inheritance of classes. Variables and private methods are declared by convention (it is not a rule of the language) by preceding them with two underscores (__). We can assign attributes (properties) to arbitrary instances of a class.

The following is an example:

class Myclass:
    common = 10
    def __init__(self):
        self.myvariable = 3
    def myfunc(self, arg1, arg2):
        return self.myvariable

# We create an instance of the class
>>> instance = Myclass()
>>> instance.myfunc(1, 2)
3

# This variable is shared by all instances
>>> instance2 = Myclass()
>>> instance.common
10
>>> instance2.common
10

# Note here how we use the class name
# instead of the instance
>>> Myclass.common = 30
>>> instance.common
30
>>> instance2.common
30

# This does not update the variable in the class;
# instead, it assigns a new object to the variable
# of the first instance
>>> instance.common = 10
>>> instance.common
10
>>> instance2.common
30

>>> Myclass.common = 50
# The value is not changed for "instance", because "common" is now
# an instance variable there, while "instance2" still sees the class variable
>>> instance.common
10
>>> instance2.common
50

# This class inherits from Myclass. Multiple inheritance
# is declared like this:
# class AnotherClass(Myclass1, Myclass2, MyclassN)
class AnotherClass(Myclass):
    # The "self" argument is passed automatically
    # and makes reference to the instance of the class, so you can set
    # instance variables as above, but from inside the class
    def __init__(self, arg1):
        self.myvariable = 3
        print arg1

>>> instance = AnotherClass("hello")
hello
>>> instance.myfunc(1, 2)
3

# This class does not have a member (property) .test, but
# we can add one to an instance whenever we want. Note that
# .test will be a member of only one instance
>>> instance.test = 10
>>> instance.test
10

Exceptions

Exceptions in Python are handled with try-except [exception_name] blocks:

def my_func():
    try:
        # Division by zero causes an exception
        10 / 0
    except ZeroDivisionError:
        print "Oops, error"
    else:
        # no exception, let's proceed
        pass
    finally:
        # This code is executed when the block
        # try..except is already executed and all exceptions
        # were handled, even if there is a new
        # exception directly in the block
        print "finish"

>>> my_func()
Oops, error
finish

Importing a library

External libraries are imported with import [libraryname]. You can also use the form from [libraryname] import [funcname] to import individual features. Here's an example:

import random
from time import clock

randomint = random.randint(1, 100)
>>> print randomint
64

Installing TensorFlow

The TensorFlow Python API supports Python 2.7 and Python 3.3+. The GPU version (Linux only) requires the Cuda Toolkit >= 7.0 and cuDNN >= v2.

When working in a Python environment, it is recommended you use virtualenv. It will isolate your Python configuration for different projects; using virtualenv will not overwrite existing versions of Python packages required by TensorFlow.

Installing on Mac or Linux distributions

The following are the steps to install TensorFlow on Mac and Linux systems:

1. First, install pip and virtualenv (optional) if they are not already installed:

For Ubuntu/Linux 64-bit:

$ sudo apt-get install python-pip python-dev python-virtualenv

For Mac OS X:

$ sudo easy_install pip
$ sudo pip install --upgrade virtualenv

2. Then you can create a virtual environment. The following command creates a virtual environment in the ~/tensorflow directory:

$ virtualenv --system-site-packages ~/tensorflow

3. The next step is to activate the virtual environment as follows:

$ source ~/tensorflow/bin/activate.csh
(tensorflow)$

4. Henceforth, the name of the environment we're working in precedes the command line. Once activated, pip is used to install TensorFlow within it.

For Ubuntu/Linux 64-bit, CPU:

(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

For Mac OS X, CPU:

(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.5.0-py2-none-any.whl

If you want to use your GPU card with TensorFlow, then install another package. I recommend you visit the official documentation to see if your GPU meets the specifications required to support TensorFlow.

Note

To enable your GPU with TensorFlow, you can refer to https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#optional-linux-enable-gpu-support for a complete description.

Finally, when you've finished, you must deactivate the virtual environment:

(tensorflow)$ deactivate

Note

Given the introductory nature of this book, I suggest the reader visit the download and setup TensorFlow page at https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#download-and-setup to find more information about other ways to install TensorFlow.

Installing on Windows

If you can't get a Linux-based system, you can install Ubuntu on a virtual machine; just use a free application called VirtualBox, which lets you create a virtual PC on Windows and install Ubuntu in it. So you can try the operating system without creating partitions or dealing with cumbersome procedures.

Note

After installing VirtualBox, you can install Ubuntu (www.ubuntu.com) and then follow the installation for Linux machines to install TensorFlow.

Installation from source

It may happen that the pip installation causes problems, particularly when using the visualization tool TensorBoard (see https://github.com/tensorflow/tensorflow/issues/530). To fix this problem, I suggest you build and install TensorFlow starting from the source files, through the following steps:

1. Clone the TensorFlow repository:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow

2. Install Bazel (dependencies and installer), following the instructions at http://bazel.io/docs/install.html.

3. Run the Bazel installer:

chmod +x bazel-version-installer-os.sh
./bazel-version-installer-os.sh --user

4. Install the Python dependencies:

sudo apt-get install python-numpy swig python-dev

5. Configure (GPU or no GPU?) your installation in the downloaded TensorFlow repository:

./configure

6. Create your own TensorFlow pip package using bazel:

bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

7. To build with GPU support, use bazel build -c opt --config=cuda followed again by //tensorflow/tools/pip_package:build_pip_package.

8. Finally, install TensorFlow; the name of the .whl file will depend on your platform:

pip install /tmp/tensorflow_pkg/tensorflow-0.7.1-py2-none-linux_x86_64.whl

9. Good luck!

Note

Please refer to https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#installation-for-linux for further information.

Testing your TensorFlow installation

Open a terminal and type the following lines of code:

>>> import tensorflow as tf
>>> hello = tf.constant("Hello TensorFlow!")
>>> sess = tf.Session()

To verify your installation, just type:

>>> print(sess.run(hello))

You should have the following output:

Hello TensorFlow!
>>>

First working session

Finally, it is time to move from theory to practice. I will use the Python 2.7 IDE to write all the examples. To get an initial idea of how to use TensorFlow, open the Python editor and write the following lines of code:

x = 1
y = x + 9
print(y)

import tensorflow as tf
x = tf.constant(1, name='x')
y = tf.Variable(x + 9, name='y')
print(y)

As you can easily understand, in the first three lines the constant x, set equal to 1, is added to 9 to set the new value of the variable y, and then the end result of the variable y is printed on the screen.

In the last four lines, we have translated the first three lines according to the TensorFlow library.

If we run the program, we have the following output:

10
<tensorflow.python.ops.variables.Variable object at 0x7f30ccbf9190>

The TensorFlow translation of the first three lines of the program example produces a different result. Let's analyze them:

1. The following statement should never be missed if you want to use the TensorFlow library. It tells us that we are importing the library and calling it tf:

import tensorflow as tf

2. We create a constant value called x, with a value equal to one:

x = tf.constant(1, name='x')

3. Then we create a variable called y. This variable is defined with the simple equation y = x + 9:

y = tf.Variable(x + 9, name='y')

4. Finally, print out the result:

print(y)

So how do we explain the different result? The difference lies in the variable definition. In fact, the variable y doesn't represent the current value of x + 9; instead it means: when the variable y is computed, take the value of the constant x and add 9 to it. This is the reason why the value of y has never been carried out. In the next section, I'll try to fix it.

So we open the Python IDE and enter the following lines:
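import tensorflow as tf

x = tf.constant(1, name='x')
y = tf.Variable(x + 9, name='y')
model = tf.initialize_all_variables()

with tf.Session() as session:
    session.run(model)
    print(session.run(y))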

Running the preceding code, the output result is finally as follows:

10

We have removed the print instruction for y's object, but we have initialized the model variables:

model = tf.initialize_all_variables()

And, mostly, we have created a session for computing values. In the next step, we run the model created previously, and finally run just the variable y and print out its current value:

with tf.Session() as session:
    session.run(model)
    print(session.run(y))

This is the magic trick that permits the correct result. In this fundamental step, the execution graph, called the Data Flow Graph, is created in the session, with all the dependencies between the variables. The y variable depends on the variable x, and that value is transformed by adding 9 to it. The value is not computed until the session is executed.

This last example introduced another important feature in TensorFlow, the Data Flow Graph.

Data Flow Graphs

A machine learning application is the result of the repeated computation of complex mathematical expressions. In TensorFlow, a computation is described using a Data Flow Graph, where each node in the graph represents the instance of a mathematical operation (multiply, add, divide, and so on), and each edge is a multi-dimensional data set (a tensor) on which the operations are performed.

TensorFlow supports these constructs and these operators. Let's see in detail how nodes and edges are managed by TensorFlow:

Node: In TensorFlow, each node represents the instantiation of an operation. Each operation has zero or more inputs and zero or more outputs.
Edges: In TensorFlow, there are two types of edge:

Normal Edges: They are carriers of data structures (tensors), where an output of one operation (from one node) becomes the input for another operation.
Special Edges: These edges are not data carriers between the output of a node (operator) and the input of another node. A special edge indicates a control dependency between two nodes. Let's suppose we have two nodes A and B and a special edge connecting A to B; it means that B will start its operation only when the operation in A ends. Special edges are used in the Data Flow Graph to set the happens-before relationship between operations on the tensors (a minimal sketch of such a dependency follows this list).
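As an illustration of a special edge, the following minimal sketch (not from the book; it assumes the TensorFlow 0.x API used throughout) forces an assignment to run only after another operation has completed, using tf.control_dependencies:

import tensorflow as tf

a = tf.Variable(1.0, name='a')
b = tf.Variable(0.0, name='b')

increment_a = tf.assign(a, a + 1.0)

# The special edge: updating b happens only after increment_a has run
with tf.control_dependencies([increment_a]):
    update_b = tf.assign(b, a * 10.0)

with tf.Session() as session:
    session.run(tf.initialize_all_variables())
    print(session.run(update_b))  # a is incremented first, so b becomes 20.0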

Let's explore some features of the Data Flow Graph in greater detail:

Operation: This represents an abstract computation, such as adding or multiplying matrices. An operation manages tensors, and it can be polymorphic: the same operation can manipulate different tensor element types. For example, the addition of two int32 tensors, the addition of two float tensors, and so on.
Kernel: This represents the concrete implementation of an operation. A kernel defines the implementation of the operation on a particular device. For example, an add matrix operation can have a CPU implementation and a GPU one.
Session: When the client program has to establish communication with the TensorFlow runtime system, a session must be created. As soon as the session is created for a client, an initial graph is created, and it is empty. It has two fundamental methods:

session.extend: In a computation, the user can extend the execution graph, requesting to add more operations (nodes) and edges (data).
session.run: Using TensorFlow, sessions are created with some graphs, and these full graphs are executed to get some outputs, or sometimes subgraphs are executed thousands/millions of times using run invocations. Basically, the method runs the execution graph to provide outputs.

Features in the Data Flow Graph

TensorFlow programming model

Adopting a Data Flow Graph as the execution model, you divide the data flow design (graph building and data flow) from its execution (on CPUs, GPU cards, or a combination), using a single programming interface that hides all the complexities. It also defines what the programming model should be like in TensorFlow.

Let's consider the simple problem of multiplying two integers, namely a and b.

The following are the steps required for this simple problem (the complete program is assembled after these steps):

1. Define and initialize the variables. Each variable should define the state of a current execution. After importing the TensorFlow module in Python:

import tensorflow as tf

2. We define the variables a and b involved in the computation. These are defined via a more basic structure, called the placeholder:

a = tf.placeholder("int32")
b = tf.placeholder("int32")

3. A placeholder allows us to create our operations and to build our computation graph without needing the data.

4. Then we use these variables as inputs for TensorFlow's function mul:

y = tf.mul(a, b)

This function will return the result of the multiplication of the input integers a and b.

5. Manage the execution flow; this means that we must build a session:

sess = tf.Session()

6. Visualize the results. We run our model on the variables a and b, feeding data into the Data Flow Graph through the placeholders previously defined:

print sess.run(y, feed_dict={a: 2, b: 5})
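Putting the steps together, a minimal sketch of the whole program (using the TensorFlow 0.x API of this book) looks like this:

import tensorflow as tf

# Placeholders let us build the graph without the data
a = tf.placeholder("int32")
b = tf.placeholder("int32")

# The multiplication node of the Data Flow Graph
y = tf.mul(a, b)

# The data is fed in only at execution time
sess = tf.Session()
print sess.run(y, feed_dict={a: 2, b: 5})  # prints 10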

How to use TensorBoard

TensorBoard is a visualization tool, devoted to analyzing the Data Flow Graph and also to better understanding machine learning models. It can graphically view different types of statistics about the parameters and details of any part of a computation graph. It often happens that a graph of computation can be very complex. A deep neural network can have up to 36,000 nodes. For this reason, TensorBoard collapses nodes into high-level blocks, highlighting the groups with identical structures. Doing so allows a better analysis of the graph, focusing only on the core sections of the computation graph. Also, the visualization process is interactive; the user can pan, zoom, and expand the nodes to display the details.

The following figure shows a neural network model with TensorBoard:

A TensorBoard visualization example

TensorBoard's algorithms collapse nodes into high-level blocks and highlight groups with the same structures, while also separating out high-degree nodes. The visualization tool is also interactive: the users can pan, zoom in, expand, and collapse the nodes.

TensorBoard is equally useful in the development and tuning of a machine learning model. For this reason, TensorFlow lets you insert so-called summary operations into the graph. These summary operations monitor changing values (during the execution of a computation), which are written to a log file. Then TensorBoard is configured to watch this log file with summary information and display how this information changes over time.
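As a small illustration of a summary operation (a sketch of my own, assuming the TensorFlow 0.x API of this book, where scalar summaries are created with tf.scalar_summary), the following monitors a value as it changes across steps:

import tensorflow as tf

value = tf.Variable(0.0, name="value")
increment = tf.assign(value, value + 1.0)

# The summary operation records "value" at each step
tf.scalar_summary("value", value)
merged = tf.merge_all_summaries()

with tf.Session() as session:
    writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)
    session.run(tf.initialize_all_variables())
    for step in range(10):
        session.run(increment)
        summary_str = session.run(merged)
        writer.add_summary(summary_str, step)  # TensorBoard plots value vs. step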

Let's consider a basic example to understand the usage of TensorBoard. We have the following example:

import tensorflow as tf

a = tf.constant(10, name="a")
b = tf.constant(90, name="b")
y = tf.Variable(a + b * 2, name="y")
model = tf.initialize_all_variables()

with tf.Session() as session:
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)
    session.run(model)
    print(session.run(y))

That gives the following result:

190

Let's look at the session management. The first instruction to consider is as follows:

merged = tf.merge_all_summaries()

This instruction merges all the summaries collected in the default graph.

Then we create a SummaryWriter. It will write all the summaries (in this case, the execution graph) obtained from the code's execution into the /tmp/tensorflowlogs directory:

writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)

Finally, we run the model and so build the Data Flow Graph:

session.run(model)
print(session.run(y))

The use of TensorBoard is very simple. Let's open a terminal and enter the following:

$ tensorboard --logdir=/tmp/tensorflowlogs

A message such as the following should appear:

Starting TensorBoard on port 6006

Then, by opening a web browser, we should display the Data Flow Graph with auxiliary nodes:

Data Flow Graph display with TensorBoard

Now we will be able to explore the Data Flow Graph:

Explore the Data Flow Graph display with TensorBoard

TensorBoard uses special icons for constants and summary nodes. To summarize, we report in the next figure the table of node symbols displayed:

Node symbols in TensorBoard

Summary

In this chapter, we introduced the main topics: machine learning and deep learning. While machine learning explores the study and construction of algorithms that can learn from, and make predictions on, data, deep learning is based precisely on the way the human brain processes information and learns, responding to external stimuli.

In this vast scientific research and practical application area, we can firmly place the TensorFlow software library, developed by Google's research group for artificial intelligence (the Google Brain Project) and released as open source software on November 9, 2015.

After electing the Python programming language as the development tool for our examples and applications, we saw how to install and compile the library, and then carried out a first working session. This allowed us to introduce the execution model of TensorFlow and the Data Flow Graph. It led us to define what our programming model should be.

The chapter ended with an example of how to use an important tool for debugging machine learning applications: TensorBoard.

In the next chapter, we will continue our journey into the TensorFlow library, with the intention of showing its versatility. Starting from the fundamental concept, tensors, we will see how to use the library for purely math applications.

Chapter 2. Doing Math with TensorFlow

In this chapter, we will cover the following topics:

The tensor data structure
Handling tensors with TensorFlow
Complex numbers and fractals
Computing derivatives
Random numbers
Solving partial differential equations

The tensor data structure

Tensors are the basic data structures in TensorFlow. As we have already said, they represent the connecting edges in a Data Flow Graph. A tensor simply identifies a multidimensional array or list.

It can be identified by three parameters, rank, shape, and type (the sketch after this list shows how to inspect each one):

rank: Each tensor is described by a unit of dimensionality called rank. It identifies the number of dimensions of the tensor. For this reason, a rank is also known as the order or n-dimensions of a tensor (for example, a rank 2 tensor is a matrix and a rank 1 tensor is a vector).
shape: The shape of a tensor is the number of rows and columns it has.
type: It is the data type assigned to the tensor's elements.
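These three attributes can be inspected directly in TensorFlow; the following minimal sketch (not from the book, assuming the 0.x API used here) does so for a rank 2 tensor:

import tensorflow as tf

matrix = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

with tf.Session() as session:
    print(session.run(tf.rank(matrix)))   # 2: it is a matrix
    print(session.run(tf.shape(matrix)))  # [2 3]: two rows, three columns
    print(matrix.dtype)                   # <dtype: 'float32'>: the element type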

Now, let's get comfortable with this fundamental data structure. To build a tensor, we can:

Build an n-dimensional array; for example, by using the NumPy library
Convert the n-dimensional array into a TensorFlow tensor

Once we obtain the tensor, we can handle it using the TensorFlow operators. The following figure provides a visual explanation of the concepts introduced:

Visualization of multidimensional tensors

One-dimensional tensors

To build a one-dimensional tensor, we use the NumPy array(s) command, where s is a Python list:

>>> import numpy as np
>>> tensor_1d = np.array([1.3, 1, 4.0, 23.99])

Unlike in a Python list, the commas between the elements are not shown:

>>> print tensor_1d
[  1.3    1.     4.    23.99]

The indexing is the same as for Python lists. The first element has position 0, the third element has position 2, and so on:

>>> print tensor_1d[0]
1.3
>>> print tensor_1d[2]
4.0

Finally, you can view the basic attributes of the tensor, such as its rank:

>>> tensor_1d.ndim
1

The tuple of the tensor's dimensions is as follows:

>>> tensor_1d.shape
(4L,)

The tensor's shape has just four values in a row.

The data type in the tensor:

>>> tensor_1d.dtype
dtype('float64')

Now, let's see how to convert a NumPy array into a TensorFlow tensor:

import tensorflow as tf

The TensorFlow function tf.convert_to_tensor converts Python objects of various types to tensor objects. It accepts tensor objects, NumPy arrays, Python lists, and Python scalars:

tf_tensor = tf.convert_to_tensor(tensor_1d, dtype=tf.float64)

Running the session, we can visualize the tensor and its elements as follows:

with tf.Session() as sess:
    print sess.run(tf_tensor)
    print sess.run(tf_tensor[0])
    print sess.run(tf_tensor[2])

That gives the following results:

>>>
[  1.3    1.     4.    23.99]
1.3
4.0
>>>

Two-dimensional tensors

To create a two-dimensional tensor, or matrix, we again use array(s), but s will be a sequence of arrays:

>>> import numpy as np
>>> tensor_2d = np.array([(1, 2, 3, 4), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)])
>>> print tensor_2d
[[ 1  2  3  4]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
>>>

A value in tensor_2d is identified by the expression tensor_2d[row, col], where row is the row position and col is the column position:

>>> tensor_2d[3][3]
15

You can also use the slice operator : to extract a submatrix:

>>> tensor_2d[0:2, 0:2]
array([[1, 2],
       [4, 5]])

In this case, we extracted a 2×2 submatrix, containing rows 0 and 1, and columns 0 and 1 of tensor_2d. TensorFlow has its own slice operator. In the next subsection, we will see how to use it.

Tensor handling

Let's see how we can apply some more complex operations to these data structures. Consider the following code:

1. Import the libraries:

import tensorflow as tf
import numpy as np

2. Let's build two integer arrays. These represent two 3×3 matrices:

matrix1 = np.array([(2, 2, 2), (2, 2, 2), (2, 2, 2)], dtype='int32')
matrix2 = np.array([(1, 1, 1), (1, 1, 1), (1, 1, 1)], dtype='int32')

3. Visualize them:

print "matrix1 ="
print matrix1
print "matrix2 ="
print matrix2

4. To use these matrices in our TensorFlow environment, they must be transformed into a tensor data structure:

matrix1 = tf.constant(matrix1)
matrix2 = tf.constant(matrix2)

5. We used the TensorFlow constant operator to perform the transformation.

6. The matrices are ready to be manipulated with TensorFlow operators. In this case, we calculate a matrix multiplication and a matrix sum:

matrix_product = tf.matmul(matrix1, matrix2)
matrix_sum = tf.add(matrix1, matrix2)

7. The following matrix will be used to compute a matrix determinant:

matrix_3 = np.array([(2, 7, 2), (1, 4, 2), (9, 0, 2)], dtype='float32')
print "matrix3 ="
print matrix_3

matrix_det = tf.matrix_determinant(matrix_3)

8. It's time to create our graph and run the session, with the tensors and operators created:

with tf.Session() as sess:
    result1 = sess.run(matrix_product)
    result2 = sess.run(matrix_sum)
    result3 = sess.run(matrix_det)

9. The results will be printed out by running the following commands:

print "matrix1 * matrix2 ="
print result1
print "matrix1 + matrix2 ="
print result2
print "matrix3 determinant result ="
print result3

The following figure shows the results after running the code:

TensorFlow provides numerous math operations on tensors. The following table summarizes them:

TensorFlow operator   Description
tf.add                Returns the sum
tf.sub                Returns the subtraction
tf.mul                Returns the multiplication
tf.div                Returns the division
tf.mod                Returns the modulo
tf.abs                Returns the absolute value
tf.neg                Returns the negative value
tf.sign               Returns the sign
tf.inv                Returns the inverse
tf.square             Returns the square
tf.round              Returns the nearest integer
tf.sqrt               Returns the square root
tf.pow                Returns the power
tf.exp                Returns the exponential
tf.log                Returns the logarithm
tf.maximum            Returns the maximum
tf.minimum            Returns the minimum
tf.cos                Returns the cosine
tf.sin                Returns the sine

Three-dimensional tensors

The following commands build a three-dimensional tensor:

>>> import numpy as np
>>> tensor_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> print tensor_3d
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
>>>

The three-dimensional tensor created is a 2x2x2 matrix:

>>> tensor_3d.shape
(2L, 2L, 2L)

To retrieve an element from a three-dimensional tensor, we use an expression of the following form:

tensor_3d[plane, row, col]

following these settings:

Matrix 3×3 representation

So all four elements in the first plane are identified by the value of the variable plane equal to zero:

>>> tensor_3d[0, 0, 0]
1
>>> tensor_3d[0, 0, 1]
2
>>> tensor_3d[0, 1, 0]
3
>>> tensor_3d[0, 1, 1]
4

Three-dimensional tensors allow us to introduce the next topic, linked to the manipulation of images, but more generally they introduce us to simple transformations on tensors.

Handling tensors with TensorFlow

TensorFlow is designed to handle tensors of all sizes, and offers operators that can be used to manipulate them. In this example, in order to see some array manipulations, we are going to work with a digital image. As you probably know, a color digital image is an MxNx3 size matrix (a third-order tensor), whose components correspond to the red, green, and blue components in the image (RGB space). This means that each feature in the rectangular box of the RGB image is specified by three coordinates, i, j, and k.

The RGB tensor

The first thing I want to show you is how to load an image, and then how to extract a sub-image from the original, using the TensorFlow slice operator.

Prepare the input data

Using the imread command in matplotlib, we import a digital image in a standard color format (JPG, BMP, TIF):

import matplotlib.image as mp_image
filename = "packt.jpeg"
input_image = mp_image.imread(filename)

Now we can see the rank and the shape of the tensor:

print 'input dim = {}'.format(input_image.ndim)
print 'input shape = {}'.format(input_image.shape)

You'll see the output, which is (80, 144, 3). This means the image is 80 pixels high, 144 pixels wide, and 3 colors deep.

Finally, using matplotlib, it is possible to visualize the imported image:

import matplotlib.pyplot as plt
plt.imshow(input_image)
plt.show()

The starting image

In this example, slice is a two-dimensional segment of the starting image, where each pixel has the RGB components, so we need a placeholder to store all the values of the slice:

import tensorflow as tf
my_image = tf.placeholder("uint8", [None, None, 3])

For the last dimension, we'll need only three values. Then we use the TensorFlow operator slice to create a sub-image:

slice = tf.slice(my_image, [10, 0, 0], [16, -1, -1])

The last step is to build a TensorFlow working session:

with tf.Session() as session:
    result = session.run(slice, feed_dict={my_image: input_image})
    print(result.shape)

plt.imshow(result)
plt.show()

The resulting shape is then as the following image shows:

The input image after the slice

In this next example, we will perform a geometric transformation of the input image, using the transpose operator:

import tensorflow as tf

We associate the input image with a variable we call x:

x = tf.Variable(input_image, name='x')

We then initialize our model:

model = tf.initialize_all_variables()

Next, we build up the session in which we run our code:

with tf.Session() as session:

To perform the transpose of our matrix, we use the transpose function of TensorFlow. This method performs a swap between axes 0 and 1 of the input matrix, while the z axis is left unchanged:

    x = tf.transpose(x, perm=[1, 0, 2])
    session.run(model)
    result = session.run(x)

plt.imshow(result)
plt.show()

The result is the following:

The transposed image

Complex numbers and fractals

First of all, let's look at how Python handles complex numbers. It is a simple matter. For example, to set x = 5 + 4j in Python, we write the following:

>>> x = 5. + 4j

This means that x is equal to 5+4j.

At the same time, you can write the following:

>>> x = complex(5, 4)
>>> x
(5+4j)

We also note the following:

Python uses j to denote √-1, instead of the i used in math. If you put a number before the j, Python will consider it an imaginary number; otherwise, it is a variable. This means that if you want to write the imaginary number i, you must write 1j rather than j.

To get the real and imaginary parts of a Python complex number, you can use the following:

>>> x.real
5.0
>>> x.imag
4.0
>>>

We turn now to our problem, namely how to display fractals with TensorFlow. The Mandelbrot set is one of the most famous fractals. A fractal is a geometric object whose structure is repeated at different scales. Fractals are very common in nature; an example is the coast of Great Britain.

The Mandelbrot set is defined for the complex numbers c for which the following succession is bounded:

Z(n+1) = Z(n)² + c, where Z(0) = 0

The set is named after its creator, Benoît Mandelbrot, a Polish-born mathematician famous for his work on fractals. However, he was able to give a shape or graphic representation to the Mandelbrot set only with the help of computer programming. In 1985, he published in Scientific American the first algorithm to calculate the Mandelbrot set. The algorithm (for each complex point Z) is:

1. Z has an initial value equal to 0, Z(0) = 0.
2. Choose the complex number c as the current point. In the Cartesian plane, the abscissa axis (horizontal line) represents the real part, while the axis of ordinates (vertical line) represents the imaginary part of c.
3. Iterate Z(n+1) = Z(n)² + c; stop when the modulus of Z(n) is larger than the maximum radius.

Now we will see, through simple steps, how to translate this algorithm using TensorFlow.

Prepare the data for Mandelbrot's set

Import the necessary libraries for our example:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

We build a complex grid that will contain our Mandelbrot set. The region of the complex plane is between -2 and +1 on the real axis and between -1.3j and +1.3j on the imaginary axis. Each pixel location in the image will represent a different complex value, z:

Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
Z = X + 1j * Y
c = tf.constant(Z.astype(np.complex64))

Then we define the TensorFlow data structures, that is, the tensors that contain all the data to be included in the calculation. We define two variables. The first is the one on which we will carry out our iteration. It has the same dimensions as the complex grid, but it is declared as a variable, that is, its values will change in the course of the calculation:

zs = tf.Variable(c)

The next variable is initialized to zero. It also has the same size as the variable zs:

ns = tf.Variable(tf.zeros_like(c, tf.float32))

Build and execute the Data Flow Graph for Mandelbrot's set

Instead of introducing a session, we instantiate an InteractiveSession():

sess = tf.InteractiveSession()

It requires, as we shall see, the Tensor.eval() and Operation.run() methods. Then we initialize all the variables involved through the run() method:

tf.initialize_all_variables().run()

Start the iteration:

zs_ = zs * zs + c

Define the stop condition of the iteration:

not_diverged = tf.complex_abs(zs_) < 4

Then we use the group operator, which groups multiple operations:

step = tf.group(zs.assign(zs_), \
                ns.assign_add(tf.cast(not_diverged, tf.float32)))

The first operation is the step iteration Z(n+1) = Z(n)² + c to create a new value.

The second operation adds 1 to the corresponding element of ns for every point that has not yet diverged. When this op finishes, all ops in its input have finished. This operator has no output.

Then we run the operator for two hundred steps:

for i in range(200): step.run()

Visualize the result for Mandelbrot's set

The result will be the tensor ns.eval(). Using matplotlib, let's visualize the result:

plt.imshow(ns.eval())
plt.show()

The Mandelbrot set

Of course, the Mandelbrot set is not the only fractal we can visualize. Julia sets are fractals that have been named after Gaston Maurice Julia for his work in this field. Their building process is very similar to that used for the Mandelbrot set.

Prepare the data for Julia's set

Let's define the output complex plane. It is between -2 and +2 on the real axis and between -2j and +2j on the imaginary axis:

Y, X = np.mgrid[-2:2:0.005, -2:2:0.005]

And the current point location:

Z = X + 1j * Y

The definition of the Julia set requires redefining Z as a constant tensor:

Z = tf.constant(Z.astype("complex64"))

Thus, the input tensors supporting our calculation are as follows:

zs = tf.Variable(Z)
ns = tf.Variable(tf.zeros_like(Z, "float32"))

Build and execute the Data Flow Graph for Julia's set

As in the previous example, we create our own interactive session:

sess = tf.InteractiveSession()

We then initialize the input tensors:

tf.initialize_all_variables().run()

To compute the new values of the Julia set, we will use the iterative formula Z(n+1) = Z(n)² - c, where the initial point c will be equal to the imaginary number 0.75i:

c = complex(0.0, 0.75)
zs_ = zs * zs - c

The grouping operator and the stop iteration condition will be the same as in the Mandelbrot computation:

not_diverged = tf.complex_abs(zs_) < 4
step = tf.group(zs.assign(zs_), \
                ns.assign_add(tf.cast(not_diverged, "float32")))

Finally, we run the operator for two hundred steps:

for i in range(200): step.run()

Visualize the result

To visualize the result, run the following commands:

plt.imshow(ns.eval())
plt.show()

The Julia set

Computing gradients

TensorFlow has functions to solve other, more complex tasks. For example, we will use a mathematical operator that calculates the derivative of y with respect to its parameter x. For this purpose, we use the tf.gradients() function.

Let us consider the math function y = 2x². We want to compute the gradient dy/dx at x = 1 (since dy/dx = 4x, at x = 1 the gradient is 4). The following is the code to compute this gradient:

1. First, import the TensorFlow library:

import tensorflow as tf

2. The x variable is the independent variable of the function:

x = tf.placeholder(tf.float32)

3. Let's build the function:

y = 2 * x * x

4. Finally, we call the tf.gradients() function with y and x as arguments:

var_grad = tf.gradients(y, x)

5. To evaluate the gradient, we must build a session:

with tf.Session() as session:

6. The gradient will be evaluated on the variable x = 1:

    var_grad_val = session.run(var_grad, feed_dict={x: 1})

7. The var_grad_val value is the result, to be printed:

    print(var_grad_val)

8. That gives the following result:

>>
[4.0]
>>

Random numbers

The generation of random numbers is essential in machine learning and in training algorithms. When random numbers are generated by a computer, they are generated by a Pseudo Random Number Generator (PRNG). The term pseudo comes from the fact that the computer only executes a deterministic, logically programmed sequence of instructions, which can merely simulate randomness. Despite this logical limitation, computers are very efficient at generating random numbers. TensorFlow provides operators to create random tensors with different distributions.

Uniform distribution

Generally, when we need to work with random numbers, we want values that repeat with approximately the same frequency, uniformly distributed. The TensorFlow operator random_uniform provides values between minval and maxval, all with the same probability. Its signature is as follows:

random_uniform(shape, minval, maxval, dtype, seed, name)

We import the TensorFlow library and matplotlib to display the results:

import tensorflow as tf
import matplotlib.pyplot as plt

The uniform variable is a one-dimensional tensor of 100 elements, containing values ranging from 0 to 1, distributed with the same probability:

uniform = tf.random_uniform([100], minval=0, maxval=1, dtype=tf.float32)

In a session, we evaluate the tensor uniform, using the eval() operator:

with tf.Session() as session:
    print uniform.eval()
    plt.hist(uniform.eval(), normed=True)
    plt.show()

As you can see, all the intermediate values between 0 and 1 have approximately the same frequency. This behavior is called uniform distribution. The result of the execution is as follows:

Uniform distribution

Normal distribution

In some specific cases, you may instead need to generate random numbers that cluster around a central value. In this case, we use the normal distribution of random numbers, also called the Gaussian distribution, in which values close to the mean (here 0) are the most likely to be drawn, while values towards the margins of the range have a very low chance of being drawn; the spread is controlled by the standard deviation. The following is the implementation with TensorFlow:

import tensorflow as tf
import matplotlib.pyplot as plt

norm = tf.random_normal([100], mean=0, stddev=2)

with tf.Session() as session:
    plt.hist(norm.eval(), normed=True)
    plt.show()

We created a 1d tensor of shape [100] consisting of random normal values, with mean equal to 0 and standard deviation equal to 2, using the operator tf.random_normal. The following is the result:

Normal distribution

Generating random numbers with seeds

We recall that our sequence is pseudo-random, because the values are calculated using a deterministic algorithm and probability plays no real role. The seed is just a starting point for the sequence, and if you start from the same seed you will end up with the same sequence. This is very useful, for example, to debug your code: when you are searching for an error in a program, you must be able to reproduce the problem, even though every run would otherwise be different.

Consider the following example where we have two uniform distributions:

uniform_with_seed = tf.random_uniform([1], seed=1)
uniform_without_seed = tf.random_uniform([1])

In the first uniform distribution, we set seed=1. This means that, repeatedly evaluating the two distributions, the first uniform distribution will always generate the same sequence of values:

print("First Run")

with tf.Session() as first_session:
    print("uniform with (seed=1) = {}"\
        .format(first_session.run(uniform_with_seed)))
    print("uniform with (seed=1) = {}"\
        .format(first_session.run(uniform_with_seed)))
    print("uniform without seed = {}"\
        .format(first_session.run(uniform_without_seed)))
    print("uniform without seed = {}"\
        .format(first_session.run(uniform_without_seed)))

print("Second Run")

with tf.Session() as second_session:
    print("uniform with (seed=1) = {}"\
        .format(second_session.run(uniform_with_seed)))
    print("uniform with (seed=1) = {}"\
        .format(second_session.run(uniform_with_seed)))
    print("uniform without seed = {}"\
        .format(second_session.run(uniform_without_seed)))
    print("uniform without seed = {}"\
        .format(second_session.run(uniform_without_seed)))

As you can see, this is the end result. The uniform distribution with seed=1 always gives the same sequence of values:

>>>
First Run
uniform with (seed=1) = [0.23903739]
uniform with (seed=1) = [0.22267115]
uniform without seed = [0.92157185]
uniform without seed = [0.43226039]
Second Run
uniform with (seed=1) = [0.23903739]
uniform with (seed=1) = [0.22267115]
uniform without seed = [0.50188708]
uniform without seed = [0.21324408]
>>>

Monte Carlo's method

We end the section on random numbers with a brief note about the Monte Carlo method. It is a numerical probabilistic method widely used in high-performance scientific computing. In our example, we will calculate the value of π:

import tensorflow as tf
import matplotlib.pyplot as plt

trials = 100
hits = 0

Generate pseudo-random points inside the square [-1,1]×[-1,1], using the random_uniform function:

x = tf.random_uniform([1], minval=-1, maxval=1, dtype=tf.float32)
y = tf.random_uniform([1], minval=-1, maxval=1, dtype=tf.float32)
pi = []

Start the session:

sess = tf.Session()

Inside the session, we calculate the value of π: the area of the unit circle is π and that of the enclosing square is 4. The ratio between the number of points falling inside the circle and the total number of generated points must therefore converge (very slowly) to π/4, so we count how many points satisfy the circle inequality x² + y² < 1 and multiply by 4:

with sess.as_default():
    for i in range(1, trials):
        for j in range(1, trials):
            if x.eval()**2 + y.eval()**2 < 1:
                hits = hits + 1
        pi.append((4*float(hits)/i)/trials)

plt.plot(pi)
plt.show()

The figure shows the convergence of the estimate to the π value as the number of trials grows
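For comparison, the same estimate can be computed in one shot with a vectorized NumPy sketch (an illustration added here; it bypasses TensorFlow entirely):

import numpy as np

n = 100000
# n pseudo-random points in the square [-1,1]x[-1,1]
points = np.random.uniform(-1.0, 1.0, size=(n, 2))
# fraction of points falling inside the unit circle, times 4
inside = (points ** 2).sum(axis=1) < 1.0
print(4.0 * inside.mean())  # prints a value close to 3.14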

Solving partial differential equations

A partial differential equation (PDE) is a differential equation involving partial derivatives of an unknown function of several independent variables. PDEs are commonly used to formulate and solve major physical problems in various fields, from quantum mechanics to financial markets. In this section, we take the example from https://www.TensorFlow.org/versions/r0.8/tutorials/pdes/index.html, showing the use of TensorFlow in a two-dimensional PDE solution that models the surface of a square pond with a few raindrops landing on it. The effect will be to produce two-dimensional waves on the pond itself. We won't concentrate on the computational aspects of the problem, as this is beyond the scope of this book; instead we will focus on using TensorFlow to define the problem.

The starting point is to import these fundamental libraries:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

Initial condition

First we have to define the dimensions of the problem. Let's imagine that our pond is a 500x500 square:

N = 500

The following two-dimensional tensor is the pond at time t = 0, that is, the initial condition of our problem:

u_init = np.zeros([N, N], dtype=np.float32)

We add 40 random raindrops on it:

for n in range(40):
    a, b = np.random.randint(0, N, 2)
    u_init[a, b] = np.random.uniform()

np.random.randint(0, N, 2) is a NumPy function that returns an array of two random integers between 0 and N, used here as the coordinates of a raindrop.

Using matplotlib, we can show the initial square pond (note that at this point only the NumPy array u_init exists; the TensorFlow variable U is defined later):

plt.imshow(u_init)
plt.show()

Zooming in on the pond in its initial condition: the colored dots represent the fallen raindrops

Then we define the following tensor:

ut_init = np.zeros([N, N], dtype=np.float32)

It holds the temporal evolution of the pond: at the final time t = t_end it will contain the final state of the pond.

Model building

We must define some fundamental parameters (using TensorFlow placeholders), starting with the time step of the simulation:

eps = tf.placeholder(tf.float32, shape=())

We must also define a physical parameter of the model, namely the damping coefficient:

damping = tf.placeholder(tf.float32, shape=())

Then we redefine our starting tensors as TensorFlow variables, since their values will change over the course of the simulation:

U = tf.Variable(u_init)
Ut = tf.Variable(ut_init)

Finally, we build our PDE model. It represents the evolution in time of the pond after the raindrops have fallen:

U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)

As you can see, we introduced the laplace(U) function to resolve the PDE (it will be described in the last part of this section).

Using the TensorFlow group operator, we define how our pond should evolve in time:

step = tf.group(
    U.assign(U_),
    Ut.assign(Ut_))

Let's recall that the group operator groups multiple operations as a single one.
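To make the update rule concrete, here is a minimal NumPy sketch of a single time step (an illustration added here; it uses a plain 5-point Laplacian with periodic wrap-around boundaries via np.roll, rather than the 9-point kernel defined at the end of this section):

import numpy as np

def numpy_step(U, Ut, eps=0.03, damping=0.04):
    # 5-point Laplacian with periodic (wrap-around) boundaries
    lap = (np.roll(U, 1, axis=0) + np.roll(U, -1, axis=0) +
           np.roll(U, 1, axis=1) + np.roll(U, -1, axis=1) - 4.0 * U)
    # same update rule as the TensorFlow graph above
    U_new = U + eps * Ut
    Ut_new = Ut + eps * (lap - damping * Ut)
    return U_new, Ut_new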

Graph execution

In our session we will see the evolution in time of the pond over 1000 steps, where each time step is equal to 0.03 s, while the damping coefficient is set equal to 0.04.

Let's initialize the TensorFlow variables:

tf.initialize_all_variables().run()

Then we run the simulation (clear_output, which refreshes the figure between frames, comes from IPython.display and is useful when running in a notebook):

for i in range(1000):
    step.run({eps: 0.03, damping: 0.04})
    if i % 50 == 0:
        clear_output()
        plt.imshow(U.eval())
        plt.show()

Every 50 steps the simulation result will be displayed as follows:

The pond after 400 simulation steps

Computational function used

Let's now see what the laplace(U) function and the ancillary functions used look like:

def make_kernel(a):
    a = np.asarray(a)
    a = a.reshape(list(a.shape) + [1, 1])
    return tf.constant(a, dtype=tf.float32)

def simple_conv(x, k):
    x = tf.expand_dims(tf.expand_dims(x, 0), -1)
    y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
    return y[0, :, :, 0]

def laplace(x):
    laplace_k = make_kernel([[0.5, 1.0, 0.5],
                             [1.0, -6., 1.0],
                             [0.5, 1.0, 0.5]])
    return simple_conv(x, laplace_k)

These functions describe the physics of the model, that is, how the wave is created and propagates in the pond. We will not go into the details of these functions, the understanding of which is beyond the scope of this book.

The following figure shows the waves on the pond after the raindrops have fallen.

Zooming in on the pond

Summary

In this chapter, we looked at some of the mathematical potential of TensorFlow. Starting from the fundamental definition of a tensor, the basic data structure for any type of computation, we saw with some examples how to handle these data structures using TensorFlow's math operators. Using complex numbers, we explored the world of fractals. Then we introduced the concept of random numbers; these are in fact used in machine learning for model development and testing. The chapter ended with an example of defining and solving a mathematical problem using partial differential equations.

In the next chapter, we'll finally start to see TensorFlow in action in the field for which it was developed: machine learning, solving complex problems such as classification and data clustering.

Chapter 3. Starting with Machine Learning

In this chapter, we will cover the following topics:

Linear regression
The MNIST dataset
Classifiers
The nearest neighbor algorithm
Data clustering
The k-means algorithm

The linear regression algorithm

In this section, we begin our exploration of machine learning techniques with the linear regression algorithm. Our goal is to build a model by which to predict the values of a dependent variable from the values of one or more independent variables.

The relationship between these two variables is linear; that is, if y is the dependent variable and x the independent one, then the linear relationship between the two variables will look like this: y = Ax + b.

The linear regression algorithm adapts to a great variety of situations; for its versatility, it is used extensively in the field of applied sciences, for example, biology and economics.

Furthermore, the implementation of this algorithm allows us to introduce in a totally clear and understandable way the two important concepts of machine learning: the cost function and the gradient descent algorithm.

Data model

The first crucial step is to build our data model. We mentioned earlier that the relationship between our variables is linear, that is: y = Ax + b, where A and b are constants. To test our algorithm, we need data points in a two-dimensional space.

We start by importing the Python library NumPy:

import numpy as np

Then we define the number of points we want to draw:

number_of_points = 500

We initialize the following two lists:

x_point = []
y_point = []

These lists will contain the generated points.

We then set the two constants that will appear in the linear relation of y with x:

a = 0.22
b = 0.78

Via NumPy's random.normal function, we generate the random points around the regression equation y = 0.22x + 0.78:

for i in range(number_of_points):
    x = np.random.normal(0.0, 0.5)
    y = a*x + b + np.random.normal(0.0, 0.1)
    x_point.append([x])
    y_point.append([y])

Finally, view the generated points with matplotlib:

import matplotlib.pyplot as plt

plt.plot(x_point, y_point, 'o', label='Input Data')
plt.legend()
plt.show()

Linear regression: the data model

Cost functions and gradient descent

The machine learning algorithm that we want to implement with TensorFlow must predict values of y as a function of the x data, according to our data model. The linear regression algorithm will determine the values of the constants A and b (fixed for our data model), which are then the true unknowns of the problem.

The first step is to import the tensorflow library:

import tensorflow as tf

Then define the A and b unknowns, using the TensorFlow tf.Variable:

A = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

The unknown factor A was initialized using a random value between -1 and 1, while the variable b is initially set to zero:

b = tf.Variable(tf.zeros([1]))

So we write the linear relationship that binds y to x:

y = A * x_point + b

Now we introduce the cost function: a function that takes the pair of values A and b to be determined and returns a value estimating how good those parameters are. In this example, our cost function is the mean squared error:

cost_function = tf.reduce_mean(tf.square(y - y_point))

It provides an estimate of the variability of the measures, or more precisely, of the dispersion of values around the average value; a small value of this function corresponds to a better estimate for the unknown parameters A and b.
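To see the mean squared error in numbers, consider this tiny NumPy sketch (an illustration added here, separate from the TensorFlow graph):

import numpy as np

y_model = np.array([1.0, 2.0, 3.0])   # values predicted by the line
y_data = np.array([1.1, 1.9, 3.2])    # observed values
# mean of the squared residuals, as in tf.reduce_mean(tf.square(y - y_point))
print(np.mean((y_model - y_data) ** 2))  # prints ~0.02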

To minimize cost_function, we use the gradient descent optimization algorithm. Given a mathematical function of several variables, gradient descent allows us to find a local minimum of this function. The technique is as follows:

Evaluate, at an arbitrary first point of the function's domain, the function itself and its gradient. The gradient indicates the direction in which the function tends to a minimum.
Select a second point in the direction indicated by the gradient. If the function for this second point has a value lower than the value calculated at the first point, the descent can continue.

You can refer to the following figure for a visual explanation of the algorithm:

The gradient descent algorithm

We also remark that gradient descent only finds a local minimum of the function, but it can also be used in the search for a global minimum, by randomly choosing a new start point once a local minimum has been found and repeating the process many times. If the number of minima of the function is limited, and there is a very high number of attempts, then there is a good chance that sooner or later the global minimum will be identified.
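The following minimal sketch (an illustration added here, in plain Python) performs these steps on the one-variable function f(x) = (x − 3)², whose gradient is 2(x − 3); the iterates slide down towards the minimum at x = 3:

learning_rate = 0.1
x = 0.0                            # arbitrary first point of the domain
for step in range(50):
    grad = 2 * (x - 3)             # gradient of f(x) = (x - 3)**2
    x = x - learning_rate * grad   # move against the gradient
print(x)                           # close to 3.0, the minimum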

Using TensorFlow, the application of this algorithm is very simple. The instructions are as follows:

optimizer = tf.train.GradientDescentOptimizer(0.5)

Here 0.5 is the learning rate of the algorithm.

The learning rate determines how fast or slow we move towards the optimal weights. If it is very large, we skip past the optimal solution, and if it is too small, we need too many iterations to converge to the best values.

An intermediate value (0.5) is provided here, but it must be tuned in order to improve the performance of the entire procedure.

We define train as the result of applying the optimizer to the cost_function, through its minimize function:

train = optimizer.minimize(cost_function)

Testing the model

Now we can test the gradient descent algorithm on the data model we created earlier. As usual, we have to initialize all the variables:

model = tf.initialize_all_variables()

So we build our iteration (20 computation steps), allowing us to determine the best values of A and b, which define the line that best fits the data model. Instantiate the evaluation graph:

with tf.Session() as session:

We perform the simulation on our model:

    session.run(model)
    for step in range(0, 21):

For each iteration, we execute the optimization step:

        session.run(train)

Every five steps, we print our pattern of dots:

        if (step % 5) == 0:
            plt.plot(x_point, y_point, 'o',
                     label='step = {}'
                     .format(step))

And the straight lines are obtained by the following command:

            plt.plot(x_point,
                     session.run(A) *
                     x_point +
                     session.run(b))

plt.legend()
plt.show()

The following figure shows the convergence of the implemented algorithm:

Linear regression: start computation (step = 0)

After just five steps, we can already see (in the next figure) a substantial improvement in the fit of the line:

Linear regression: situation after 5 computation steps

The following (and final) figure shows the definitive result after 20 steps. We can see the efficiency of the algorithm used, with the straight line fitting almost perfectly across the cloud of points.

Linear regression: final result

Finally, to further our understanding, we report the complete code:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

number_of_points = 200
x_point = []
y_point = []
a = 0.22
b = 0.78

for i in range(number_of_points):
    x = np.random.normal(0.0, 0.5)
    y = a*x + b + np.random.normal(0.0, 0.1)
    x_point.append([x])
    y_point.append([y])

plt.plot(x_point, y_point, 'o', label='Input Data')
plt.legend()
plt.show()

A = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
B = tf.Variable(tf.zeros([1]))
y = A * x_point + B

cost_function = tf.reduce_mean(tf.square(y - y_point))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(cost_function)

model = tf.initialize_all_variables()

with tf.Session() as session:
    session.run(model)
    for step in range(0, 21):
        session.run(train)
        if (step % 5) == 0:
            plt.plot(x_point, y_point, 'o',
                     label='step = {}'
                     .format(step))
            plt.plot(x_point,
                     session.run(A) *
                     x_point +
                     session.run(B))

plt.legend()
plt.show()

The MNIST dataset

The MNIST dataset (available at http://yann.lecun.com/exdb/mnist/) is widely used for training and testing in the field of machine learning, and we will use it in the examples of this book. It contains black and white images of handwritten digits from 0 to 9.

The dataset is divided into two groups: 60,000 images to train the model and an additional 10,000 to test it. The original images, in black and white, were normalized to fit into a box of size 28×28 pixels and centered by calculating the center of mass of the pixels. The following figure represents how the digits are represented in the MNIST dataset:

MNIST digit sampling

Each MNIST data point is an array of numbers describing how dark each pixel is. For example, for the following digit (the digit 1), we could have:

Pixel representation of the digit 1

Downloading and preparing the data

The following code imports the MNIST data files that we are going to classify. I am using a script from Google that can be downloaded from:

https://github.com/tensorflow/tensorflow/blob/r0.7/tensorflow/examples/tutorials/mnist/input_data.py

This must be run in the same folder where the files are located.

Now we will show how to load and display the data:

import input_data
import numpy as np
import matplotlib.pyplot as plt

Using input_data, we load the datasets:

mnist_images = input_data.read_data_sets\
    ("MNIST_data/",\
    one_hot=False)

train.next_batch(10) returns the first 10 images:

pixels, real_values = mnist_images.train.next_batch(10)

It also returns two lists: the matrix of the pixels loaded and the list that contains the real values loaded:

print "list of values loaded", real_values

example_to_visualize = 5

print "element N° " + str(example_to_visualize + 1)\
    + " of the list plotted"

>>
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
list of values loaded [7 3 4 6 1 8 1 0 9 8]
element N° 6 of the list plotted
>>

While displaying an element, we can use matplotlib, as follows:

image = pixels[example_to_visualize,:]
image = np.reshape(image, [28, 28])
plt.imshow(image)
plt.show()

Here is the result:

An MNIST image of the number eight

Classifiers

In the context of machine learning, the term classification identifies an algorithmic procedure that assigns each new input datum (instance) to one of the possible categories (classes). If we consider only two classes, we talk about binary classification; otherwise we have a multi-class classification.

Classification falls into the supervised learning category, which permits us to classify new instances based on the so-called training set. The basic steps to follow to resolve a supervised classification problem are as follows:

1. Build the training examples in order to represent the actual context and application on which to accomplish the classification.
2. Choose the classifier and the corresponding algorithm implementation.
3. Train the algorithm on the training set and set any control parameters through validation.
4. Evaluate the accuracy and performance of the classifier by applying it to a set of new instances (the test set).

The nearest neighbor algorithm

The K-nearest neighbor (KNN) is a supervised learning algorithm for both classification and regression. It is a system that assigns the class of the sample tested according to its distance from the objects stored in memory.

The distance d is defined as the Euclidean distance between two points:

d(x, y) = √( (x₁ − y₁)² + ... + (xₙ − yₙ)² )

Here n is the dimension of the space. The advantage of this method of classification is the ability to classify objects whose classes are not linearly separable. It is a stable classifier, given that small perturbations of the training data do not significantly affect the results obtained. The most obvious disadvantage, however, is that it does not provide a true mathematical model; instead, every new classification must be carried out by adding the new datum to all the initial instances and repeating the calculation procedure for the selected K value.

Moreover, it requires a fairly high amount of data to make realistic predictions and is sensitive to the noise of the analyzed data.
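Before the TensorFlow version, a toy nearest-neighbor classification in NumPy may help fix the idea (an illustration added here; it uses the Euclidean distance above, while the implementation below uses the L1 distance between pixel vectors):

import numpy as np

train = np.array([[0.0, 0.0], [5.0, 5.0]])  # two stored samples
labels = np.array([0, 1])                   # their classes
test = np.array([4.0, 4.5])                 # sample to classify

# Euclidean distance from the test point to every stored sample
d = np.sqrt(((train - test) ** 2).sum(axis=1))
print(labels[np.argmin(d)])  # prints 1, the class of the nearest sample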

In the next example, we will implement the KNN algorithm using the MNIST dataset.

Building the training set

Let's start with the libraries needed for the simulation:

import numpy as np
import tensorflow as tf
import input_data

To construct the data model for the training set, use the input_data.read_data_sets function, introduced earlier:

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

In our example, the training phase will consist of 100 MNIST images:

train_pixels, train_list_values = mnist.train.next_batch(100)

While we test our algorithm on 10 images:

test_pixels, test_list_of_values = mnist.test.next_batch(10)

Finally, we define the tensors train_pixel_tensor and test_pixel_tensor we use to construct our classifier:

train_pixel_tensor = tf.placeholder\
    ("float", [None, 784])
test_pixel_tensor = tf.placeholder\
    ("float", [784])

Cost function and optimization

The cost function is the distance in terms of pixels (note that this implementation actually uses the L1 distance, the sum of the absolute pixel differences):

distance = tf.reduce_sum\
    (tf.abs\
    (tf.add(train_pixel_tensor,\
    tf.neg(test_pixel_tensor))),\
    reduction_indices=1)

The tf.reduce_sum function computes the sum of elements across the dimensions of a tensor. For example (from the TensorFlow online manual):

# 'x' is [[1, 1, 1]
#         [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6

Finally, to minimize the distance function, we use arg_min, which returns the index with the smallest distance (the nearest neighbor):

pred = tf.arg_min(distance, 0)

Testing and algorithm evaluation

Accuracy is a parameter that helps us to compute the final result of the classifier:

accuracy = 0

Initialize the variables:

init = tf.initialize_all_variables()

Start the simulation:

with tf.Session() as sess:
    sess.run(init)
    for i in range(len(test_list_of_values)):

Then we evaluate the nearest neighbor index, using the pred function defined earlier:

        nn_index = sess.run(pred,\
            feed_dict={train_pixel_tensor: train_pixels,\
            test_pixel_tensor: test_pixels[i,:]})

Finally, we find the nearest neighbor class label and compare it to its true label:

        print "Test N°", i, "Predicted Class: ",\
            np.argmax(train_list_values[nn_index]),\
            "True Class: ", np.argmax(test_list_of_values[i])
        if np.argmax(train_list_values[nn_index])\
            == np.argmax(test_list_of_values[i]):

Then we count and report the accuracy of the classifier:

            accuracy += 1./len(test_pixels)

    print "Result = ", accuracy

As we can see, almost every element of the test set is correctly classified. The result of the simulation shows the predicted class against the real class, and finally the total accuracy of the simulation is reported:

>>>
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Test N° 0 Predicted Class: 7 True Class: 7
Test N° 1 Predicted Class: 2 True Class: 2
Test N° 2 Predicted Class: 1 True Class: 1
Test N° 3 Predicted Class: 0 True Class: 0
Test N° 4 Predicted Class: 4 True Class: 4
Test N° 5 Predicted Class: 1 True Class: 1
Test N° 6 Predicted Class: 4 True Class: 4
Test N° 7 Predicted Class: 9 True Class: 9
Test N° 8 Predicted Class: 6 True Class: 5
Test N° 9 Predicted Class: 9 True Class: 9
Result = 0.9
>>>

The result is not 100% accurate; the reason lies in a wrong evaluation of test number 8: instead of 5, the classifier predicted 6.

Finally, we report the complete code for the KNN classification:

import numpy as np
import tensorflow as tf
import input_data

# Build the Training Set
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
train_pixels, train_list_values = mnist.train.next_batch(100)
test_pixels, test_list_of_values = mnist.test.next_batch(10)

train_pixel_tensor = tf.placeholder\
    ("float", [None, 784])
test_pixel_tensor = tf.placeholder\
    ("float", [784])

# Cost Function and distance optimization
distance = tf.reduce_sum\
    (tf.abs\
    (tf.add(train_pixel_tensor,\
    tf.neg(test_pixel_tensor))),\
    reduction_indices=1)

pred = tf.arg_min(distance, 0)

# Testing and algorithm evaluation
accuracy = 0.
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    for i in range(len(test_list_of_values)):
        nn_index = sess.run(pred,\
            feed_dict={train_pixel_tensor: train_pixels,\
            test_pixel_tensor: test_pixels[i,:]})
        print "Test N°", i, "Predicted Class: ",\
            np.argmax(train_list_values[nn_index]),\
            "True Class: ", np.argmax(test_list_of_values[i])
        if np.argmax(train_list_values[nn_index])\
            == np.argmax(test_list_of_values[i]):
            accuracy += 1./len(test_pixels)
    print "Result = ", accuracy

Data clustering

A clustering problem consists in the selection and grouping of homogeneous items from a set of initial data. To solve this problem, we must:

Identify a resemblance measure between elements
Find out whether there are subsets of elements that are similar according to the measure chosen

The algorithm determines which elements form a cluster and what degree of similarity unites them within the cluster.

Clustering algorithms fall into the unsupervised methods, because we do not assume any prior information on the structures and characteristics of the clusters.

The k-means algorithm

One of the most common and simple clustering algorithms is k-means, which subdivides a group of objects into k partitions on the basis of their attributes. Each cluster is identified by a point called the centroid, the average of the cluster's points.

The algorithm follows an iterative procedure:

1. Randomly select K points as the initial centroids.
2. Form K clusters by assigning each point to the closest centroid.
3. Recompute the centroid of each cluster.
4. Repeat steps 2 and 3 until the centroids don't change.

The popularity of k-means comes from its convergence speed and its ease of implementation. In terms of the quality of the solutions, the algorithm does not guarantee achieving the global optimum. The quality of the final solution depends largely on the initial set of clusters and may, in practice, be much worse than the global optimum. Since the algorithm is extremely fast, you can apply it several times and choose the most satisfying solution among those produced. Another disadvantage of the algorithm is that it requires you to choose the number of clusters (k) to find.

If the data is not naturally partitioned, you will end up getting strange results. Furthermore, the algorithm works well only when there are identifiable spherical clusters in the data. A compact NumPy sketch of the procedure is shown below.
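Here is that compact NumPy sketch of the iterative procedure (an illustration added here; it assumes no cluster ever becomes empty, and the TensorFlow version follows in this section):

import numpy as np

def kmeans(points, k, steps=100):
    # step 1: pick k random points as the initial centroids
    centroids = points[np.random.choice(len(points), k, replace=False)]
    for _ in range(steps):
        # assign every point to its closest centroid
        d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assignments = d.argmin(axis=1)
        # recompute each centroid as the mean of its cluster
        centroids = np.array([points[assignments == j].mean(axis=0)
                              for j in range(k)])
    return centroids, assignments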

Let us now see how to implement k-means with the TensorFlow library.

Building the training set

Import all the libraries necessary for our simulation:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import pandas as pd

Note

Pandas is an open source library providing easy-to-use data structures and data analysis tools for the Python programming language. To install it, type the following command:

sudo pip install pandas

We must define the parameters of our problem. The total number of points that we want to cluster is 1000:

num_vectors = 1000

The number of partitions we want to achieve:

num_clusters = 4

We set the number of computational steps of the k-means algorithm:

num_steps = 100

We initialize the input data structures:

x_values = []
y_values = []
vector_values = []

The training set is a random set of points, which is why we use NumPy's random.normal function, allowing us to build the x_values and y_values vectors:

for i in xrange(num_vectors):
    if np.random.random() > 0.5:
        x_values.append(np.random.normal(0.4, 0.7))
        y_values.append(np.random.normal(0.2, 0.8))
    else:
        x_values.append(np.random.normal(0.6, 0.4))
        y_values.append(np.random.normal(0.8, 0.5))

We use the Python zip function to obtain the complete list of vector_values:

vector_values = zip(x_values, y_values)

Then vector_values is converted into a constant, usable by TensorFlow:

vectors = tf.constant(vector_values)

We can see our training set for the clustering algorithm with the following commands:

plt.plot(x_values, y_values, 'o', label='Input Data')
plt.legend()
plt.show()

The training set for k-means

After randomly building the training set, we have to generate the (k = 4) initial centroids. We first shuffle the point indices using tf.random_shuffle:

n_samples = tf.shape(vector_values)[0]
random_indices = tf.random_shuffle(tf.range(0, n_samples))

By adopting this procedure, we are able to pick four random indices:

begin = [0,]
size = [num_clusters,]
size[0] = num_clusters

These hold the indices of our initial centroids:

centroid_indices = tf.slice(random_indices, begin, size)
centroids = tf.Variable(tf.gather\
    (vector_values, centroid_indices))

Cost functions and optimization

The cost function we want to minimize for this problem is again based on the squared Euclidean distance between each point and its centroid:

d(x, c)² = (x₁ − c₁)² + (x₂ − c₂)²

In order to manage the tensors defined previously, vectors and centroids, we use the TensorFlow function expand_dims, which inserts a dimension into the shape of each of the two arguments:

expanded_vectors = tf.expand_dims(vectors, 0)
expanded_centroids = tf.expand_dims(centroids, 1)

This function allows us to standardize the shapes of the two tensors, in order to evaluate their difference with the tf.sub method:

vectors_subtration = tf.sub(expanded_vectors, expanded_centroids)

Finally, we build the euclidean_distances cost function, using the tf.reduce_sum function, which computes the sum of elements across one dimension of a tensor, while the tf.square function computes the square of the vectors_subtration tensor element-wise:

euclidean_distances = tf.reduce_sum(tf.square\
    (vectors_subtration), 2)

assignments = tf.to_int32(tf.argmin(euclidean_distances, 0))

Here assignments holds, for each vector, the index of the centroid with the smallest distance across the tensor euclidean_distances. Let us now turn to the optimization phase, the purpose of which is to improve the choice of centroids, on which the construction of the clusters depends. We partition the vectors (our training set) into num_clusters tensors, using the indices from assignments.

The following code takes the nearest indices for each sample and grabs them out as separate groups using tf.dynamic_partition:

partitions = tf.dynamic_partition\
    (vectors, assignments, num_clusters)

Finally, we update the centroids, using tf.reduce_mean on each single group to find its average, forming its new centroid:

update_centroids = tf.concat(0,\
    [tf.expand_dims\
    (tf.reduce_mean(partition, 0), 0)\
    for partition in partitions])

To form the update_centroids tensor, we use tf.concat to concatenate the per-cluster means into a single tensor.

Testing and algorithm evaluation

It's time to test and evaluate the algorithm. The first procedure is to initialize all the variables and instantiate the evaluation graph:

init_op = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init_op)

Now we start the computation:

for step in xrange(num_steps):
    _, centroid_values, assignment_values =\
        sess.run([update_centroids,\
                  centroids,\
                  assignments])

To display the result, we implement the following function:

display_partition(x_values, y_values, assignment_values)

This takes the x_values and y_values vectors of the training set, and the assignment_values vector, to draw the clusters.

The code for this visualization function is as follows:

def display_partition(x_values, y_values, assignment_values):
    labels = []
    colors = ["red", "blue", "green", "yellow"]
    for i in xrange(len(assignment_values)):
        labels.append(colors[(assignment_values[i])])
    color = labels
    df = pd.DataFrame\
        (dict(x=x_values, y=y_values, color=labels))
    fig, ax = plt.subplots()
    ax.scatter(df['x'], df['y'], c=df['color'])
    plt.show()

It associates each cluster with its color by means of the following data structure:

colors = ["red", "blue", "green", "yellow"]

It then draws the points through the scatter function of matplotlib:

ax.scatter(df['x'], df['y'], c=df['color'])

Let's display the result:

Final result of the k-means algorithm

Here is the complete code of the k-means algorithm:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

def display_partition(x_values, y_values, assignment_values):
    labels = []
    colors = ["red", "blue", "green", "yellow"]
    for i in xrange(len(assignment_values)):
        labels.append(colors[(assignment_values[i])])
    color = labels
    df = pd.DataFrame\
        (dict(x=x_values, y=y_values, color=labels))
    fig, ax = plt.subplots()
    ax.scatter(df['x'], df['y'], c=df['color'])
    plt.show()

num_vectors = 2000
num_clusters = 4
n_samples_per_cluster = 500
num_steps = 1000

x_values = []
y_values = []
vector_values = []

# CREATE RANDOM DATA
for i in xrange(num_vectors):
    if np.random.random() > 0.5:
        x_values.append(np.random.normal(0.4, 0.7))
        y_values.append(np.random.normal(0.2, 0.8))
    else:
        x_values.append(np.random.normal(0.6, 0.4))
        y_values.append(np.random.normal(0.8, 0.5))

vector_values = zip(x_values, y_values)
vectors = tf.constant(vector_values)

n_samples = tf.shape(vector_values)[0]
random_indices = tf.random_shuffle(tf.range(0, n_samples))

begin = [0,]
size = [num_clusters,]
size[0] = num_clusters

centroid_indices = tf.slice(random_indices, begin, size)
centroids = tf.Variable(tf.gather(vector_values, centroid_indices))

expanded_vectors = tf.expand_dims(vectors, 0)
expanded_centroids = tf.expand_dims(centroids, 1)

vectors_subtration = tf.sub(expanded_vectors, expanded_centroids)

euclidean_distances = tf.reduce_sum(tf.square(vectors_subtration), 2)
assignments = tf.to_int32(tf.argmin(euclidean_distances, 0))

# Example of tf.dynamic_partition (from the TensorFlow docs):
#   partitions = [0, 0, 1, 1, 0]; num_partitions = 2
#   data = [10, 20, 30, 40, 50]
#   outputs[0] = [10, 20, 50]
#   outputs[1] = [30, 40]
partitions = tf.dynamic_partition(vectors, assignments, num_clusters)

update_centroids = tf.concat(0, [tf.expand_dims\
    (tf.reduce_mean(partition, 0), 0)\
    for partition in partitions])

init_op = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init_op)

for step in xrange(num_steps):
    _, centroid_values, assignment_values =\
        sess.run([update_centroids,\
                  centroids,\
                  assignments])

display_partition(x_values, y_values, assignment_values)

plt.plot(x_values, y_values, 'o', label='Input Data')
plt.legend()
plt.show()

Summary

In this chapter, we began to explore the potential of TensorFlow for some typical problems in machine learning. With the linear regression algorithm, the important concepts of the cost function and optimization using gradient descent were explained. We then described the MNIST dataset of handwritten digits. We also implemented a multiclass classifier using the nearest neighbor algorithm, which falls into the supervised learning category of machine learning. Then the chapter concluded with an example of unsupervised learning, by implementing the k-means algorithm for solving a data clustering problem.

In the next chapter, we will introduce neural networks. These are mathematical models that represent the interconnection between elements defined as artificial neurons, namely mathematical constructs that mimic the properties of living neurons.

We'll also implement some neural network learning models using TensorFlow.

Chapter 4. Introducing Neural Networks

In this chapter, we will cover the following topics:

What are artificial neural networks?
Single Layer Perceptron
Logistic regression
Multi Layer Perceptron
Multi Layer Perceptron classification
Multi Layer Perceptron function approximation

What are artificial neural networks?

An artificial neural network (ANN) is an information processing system whose operating mechanism is inspired by biological neural circuits. Thanks to their characteristics, neural networks are the protagonists of a real revolution in machine learning systems and, more specifically, in the context of artificial intelligence. An ANN possesses many simple processing units variously connected to each other, according to various architectures. If we look at the schema of an ANN reported later, it can be seen that the hidden units communicate with the external layer, both in input and output, while the input and output units communicate only with the hidden layer of the network.

Each unit or node simulates the role of the neuron in biological neural networks. Each node, called an artificial neuron, has a very simple operation: it becomes active if the total quantity of signal that it receives exceeds its activation threshold, defined by the so-called activation function. If a node becomes active, it emits a signal that is transmitted along the transmission channels up to the other units to which it is connected. Each connection point acts as a filter that converts the message into an inhibitory or excitatory signal, increasing or decreasing its intensity according to the connection's individual characteristics. The connection points simulate the biological synapses and have the fundamental function of weighing the intensity of the transmitted signals, by multiplying them by the weights whose values depend on the connection itself.

ANN schematic diagram

Neural network architectures

The way the nodes are connected, the total number of layers (that is, the levels of nodes between input and output), and the number of neurons per layer all define the architecture of a neural network. For example, in multilayer networks (we introduce these in the second part of this chapter), one can identify the artificial neurons of the layers such that:

Each neuron is connected with all those of the next layer
There are no connections between neurons belonging to the same layer
The number of layers and of neurons per layer depends on the problem to be solved

Now we start our exploration of neural network models, introducing the most simple neural network model: the Single Layer Perceptron, also called Rosenblatt's Perceptron.

Single Layer Perceptron

The Single Layer Perceptron was the first neural network model, proposed in 1958 by Frank Rosenblatt. In this model, the content of the local memory of the neuron consists of a vector of weights, W = (w1, w2, ..., wn). The computation is performed over the input vector X = (x1, x2, ..., xn), each element of which is multiplied by the corresponding element of the vector of weights; the value thus produced (that is, a weighted sum) becomes the input of an activation function. This function returns 1 if the result is greater than a certain threshold, otherwise it returns -1. In the following figure, the activation function is the so-called sign function:

sign(x) = +1 if x > 0, −1 otherwise

It is possible to use other activation functions, preferably non-linear ones (such as the sigmoid function, which we will see in the next section). The learning procedure of the net is iterative: at each learning cycle (called an epoch), it slightly modifies the synaptic weights by using a selected set called a training set. At each cycle, the weights must be modified to minimize a cost function, which is specific to the problem under consideration. Finally, when the perceptron has been trained on the training set, it will be tested on other inputs (the test set) in order to verify its capacity for generalization. A toy sketch of this learning rule follows the figure.

Schema of Rosenblatt's Perceptron
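Here is that toy sketch of the perceptron learning rule, on the logical AND function with ±1 targets (an illustration added here; the weights are bumped by the target times the input only when the prediction is wrong):

import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # inputs
t = np.array([-1, -1, -1, 1])        # AND targets in {-1, +1}
w = np.zeros(2)
b = 0.0
for epoch in range(10):
    for x_i, t_i in zip(X, t):
        y_i = 1 if np.dot(x_i, w) + b > 0 else -1  # sign activation
        if y_i != t_i:                             # update on mistakes only
            w = w + t_i * x_i
            b = b + t_i
print(w)  # a separating weight vector for AND
print(b)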

Let us now see how to implement a single layer neural network for an image classification problem using TensorFlow.

The logistic regression

This algorithm has nothing to do with the canonical linear regression we saw in Chapter 3, Starting with Machine Learning; rather, it is an algorithm that allows us to solve supervised classification problems. In fact, to estimate the dependent variable, we now make use of the so-called logistic function or sigmoid, σ(x) = 1 / (1 + e^(−x)). It is precisely because of this function that we call this algorithm logistic regression. The sigmoid function has the following pattern:

Sigmoid function
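A quick way to reproduce this pattern yourself is the following small NumPy/matplotlib sketch (an illustration added here):

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-6, 6, 200)
sigmoid = 1.0 / (1.0 + np.exp(-x))  # values strictly between 0 and 1
plt.plot(x, sigmoid)
plt.show()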

As we can see, the dependent variable takes values strictly between 0 and 1, which is precisely what serves us. In the case of logistic regression, we want our function to tell us what the probability is of belonging to a particular class. We recall again that supervised learning by a neural network is configured as an iterative process of optimization of the weights; these are modified on the basis of the network's performance on the training set. Indeed, the aim is to minimize the loss function, which indicates the degree to which the behavior of the network deviates from the desired one. The performance of the network is then verified on a test set, consisting of images other than those it was trained on.

The basic steps of training that we're going to implement are as follows:

The weights are initialized with random values at the beginning of the training.
For each element of the training set, the error is calculated, that is, the difference between the desired output and the actual output. This error is used to adjust the weights.
The process is repeated, resubmitting to the network, in a random order, all the examples of the training set until the error made on the entire training set is less than a certain threshold, or until the maximum number of iterations is reached.

Let us now see in detail how to implement logistic regression with TensorFlow. The problem we want to solve is to classify images from the MNIST dataset, which, as explained in Chapter 3, Starting with Machine Learning, is a database of handwritten digits.

TensorFlow implementation

To implement the model, we need to perform the following steps:

1. First of all, we have to import all the necessary libraries:

import input_data
import tensorflow as tf
import matplotlib.pyplot as plt

2. We use the input_data.read_data_sets function, introduced in Chapter 3, Starting with Machine Learning, in the MNIST dataset section, to upload the images for our problem:

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

3. Then we set the total number of epochs for the training phase:

training_epochs = 25

4. We must also define other parameters that are necessary to build the model:

learning_rate = 0.01
batch_size = 100
display_step = 1

5. Now we move on to the construction of the model.

Building the model

Define x as the input tensor; it represents an MNIST data image, of size 28x28 = 784 pixels:

x = tf.placeholder("float", [None, 784])

We recall that our problem consists of assigning a probability value to each of the possible classes of membership (the digits from 0 to 9). At the end of this calculation, we will obtain a probability distribution, which tells us how confident we are in our prediction.

So the output we're going to get will be an output tensor with 10 probabilities, each one corresponding to a digit (of course, the sum of the probabilities must be one):

y = tf.placeholder("float", [None, 10])

To assign probabilities to each image, we will use the so-called softmax activation function.

The softmax function is specified in two main steps (a small NumPy sketch follows this list):

Calculate the evidence that a certain image belongs to a particular class
Convert the evidence into probabilities of belonging to each of the 10 possible classes
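Here is that small NumPy sketch of the softmax itself (an illustration added here; the model uses TensorFlow's tf.nn.softmax instead):

import numpy as np

def softmax(evidence):
    e = np.exp(evidence - evidence.max())  # shift for numerical stability
    return e / e.sum()                     # probabilities summing to one

print(softmax(np.array([2.0, 1.0, 0.1])))  # e.g. ~[0.66 0.24 0.10]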

To evaluate the evidence, we first define the weights tensor W:

W = tf.Variable(tf.zeros([784, 10]))

For a given image, we can evaluate the evidence for each class i by simply multiplying the input tensor x with the tensor W. Using TensorFlow, we should have something like the following:

evidence = tf.matmul(x, W)

In general, models include an extra parameter representing the bias, which indicates a certain degree of uncertainty. In our case, the final formula for the evidence is as follows:

evidence = tf.matmul(x, W) + b

It means that for every class i (from 0 to 9) we have a column Wi of 784 weight elements (28 × 28), where each element j is multiplied by the corresponding component j of the input image, and the corresponding bias element bi is then added.

So to define the evidence, we must define the following tensor of biases:

b = tf.Variable(tf.zeros([10]))

The second step is to finally use the softmax function to obtain the output vector of probabilities, namely activation:

activation = tf.nn.softmax(tf.matmul(x, W) + b)

TensorFlow's tf.nn.softmax function provides a probability-based output from the input evidence tensor. Once we have implemented the model, we can specify the necessary code to find the weights W and the biases b of the network through the iterative training algorithm. In each iteration, the training algorithm takes the training data, applies the neural network, and compares the result with the expected one.

Note

TensorFlow provides many other activation functions. See https://www.tensorflow.org/versions/r0.8/api_docs/index.html for better references.

In order to train our model and know when we have a good one, we must define how to measure its accuracy. Our goal is to try to get values of the parameters W and b that minimize the value of the metric that indicates how bad the model is.

Different metrics calculate the degree of error between the desired output and the training data outputs. A common measure of error is the mean squared error (or the squared Euclidean distance). However, there are some research findings that suggest using other metrics for a neural network like this.

In this example, we use the so-called cross-entropy error function. It is defined as:

cross_entropy = y * tf.log(activation)

In order to minimize cross_entropy, we can use the following combination of tf.reduce_mean and tf.reduce_sum to build the cost function:

cost = tf.reduce_mean\
    (-tf.reduce_sum\
    (cross_entropy, reduction_indices=1))
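To see what this cost measures, consider a single example in NumPy (an illustration added here): with a one-hot label, the sum collapses to minus the log of the probability assigned to the correct class, so confident correct predictions cost little and confident wrong ones cost a lot.

import numpy as np

y_true = np.array([0., 0., 1.])        # one-hot label: the class is '2'
y_pred = np.array([0.1, 0.2, 0.7])     # softmax output of the model
print(-np.sum(y_true * np.log(y_pred)))  # -log(0.7) ~ 0.357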

Then we must minimize it using the gradient descent optimization algorithm:

optimizer = tf.train.GradientDescentOptimizer\
    (learning_rate).minimize(cost)

Just a few lines of code to build a neural net model!

Launch the session

It's time to build the session and launch our neural net model.

We create the following lists to visualize the training session:

avg_set = []
epoch_set = []

Then we initialize the TensorFlow variables:

init = tf.initialize_all_variables()

Start the session:

with tf.Session() as sess:
    sess.run(init)

As explained, each epoch is a training cycle:

    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)

Then we loop over all the batches:

        for i in range(total_batch):
            batch_xs, batch_ys =\
                mnist.train.next_batch(batch_size)

Fit the training using the batch data:

            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})

Compute the average loss, running the cost function with the given image values (x) and the real output (y):

            avg_cost += sess.run\
                (cost, feed_dict={x: batch_xs,\
                y: batch_ys})/total_batch

During the computation, we display a log per epoch step:

        if epoch % display_step == 0:
            print "Epoch:",\
                '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)

    print "Training phase finished"

Let's get the accuracy of our model. The prediction is correct if the index with the highest activation value is the same as in the real digit vector; the mean of correct_prediction gives us the accuracy. We need to run the accuracy function with our test set (mnist.test).

We use the test images and labels for x and y:

correct_prediction = tf.equal\
    (tf.argmax(activation, 1),\
    tf.argmax(y, 1))

accuracy = tf.reduce_mean\
    (tf.cast(correct_prediction, "float"))

print "Model accuracy:", accuracy.eval({x: mnist.test.images,\
    y: mnist.test.labels})

Test evaluation

We previously showed the training phase, and for each epoch we printed the relative cost function:

Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ======================= RESTART ============================
>>>
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Epoch: 0001 cost= 1.174406662
Epoch: 0002 cost= 0.661956009
Epoch: 0003 cost= 0.550468774
Epoch: 0004 cost= 0.496588717
Epoch: 0005 cost= 0.463674555
Epoch: 0006 cost= 0.440907706
Epoch: 0007 cost= 0.423837747
Epoch: 0008 cost= 0.410590841
Epoch: 0009 cost= 0.399881751
Epoch: 0010 cost= 0.390916621
Epoch: 0011 cost= 0.383320325
Epoch: 0012 cost= 0.376767031
Epoch: 0013 cost= 0.371007620
Epoch: 0014 cost= 0.365922904
Epoch: 0015 cost= 0.361327561
Epoch: 0016 cost= 0.357258660
Epoch: 0017 cost= 0.353508228
Epoch: 0018 cost= 0.350164634
Epoch: 0019 cost= 0.347015593
Epoch: 0020 cost= 0.344140861
Epoch: 0021 cost= 0.341420144
Epoch: 0022 cost= 0.338980592
Epoch: 0023 cost= 0.336655581
Epoch: 0024 cost= 0.334488012
Epoch: 0025 cost= 0.332488823
Training phase finished

As you can see, during the training phase the cost function is minimized. At the end of the test, we show how accurate the implemented model is:

Model Accuracy: 0.9475
>>>

Finally, using the following lines of code, we can visualize the training phase of the net:

plt.plot(epoch_set, avg_set, 'o',\
    label='Logistic Regression Training phase')
plt.ylabel('cost')
plt.xlabel('epoch')
plt.legend()
plt.show()

Training phase in logistic regression

Source code

# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf
import matplotlib.pyplot as plt

# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])

# Create model
# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
activation = tf.nn.softmax(tf.matmul(x, W) + b)

# Minimize error using cross entropy
cross_entropy = y * tf.log(activation)
cost = tf.reduce_mean\
    (-tf.reduce_sum\
    (cross_entropy, reduction_indices=1))

optimizer = tf.train.\
    GradientDescentOptimizer(learning_rate).minimize(cost)

# Plot settings
avg_set = []
epoch_set = []

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys =\
                mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer,\
                feed_dict={x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost, feed_dict=\
                {x: batch_xs,\
                y: batch_ys})/total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)

    print "Training phase finished"

    plt.plot(epoch_set, avg_set, 'o',\
        label='Logistic Regression Training phase')
    plt.ylabel('cost')
    plt.xlabel('epoch')
    plt.legend()
    plt.show()

    # Test model
    correct_prediction = tf.equal\
        (tf.argmax(activation, 1),\
        tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print "Model accuracy:", accuracy.eval({x: mnist.test.images,\
        y: mnist.test.labels})

Multi Layer Perceptron

A more complex and efficient architecture is that of the Multi Layer Perceptron (MLP). It is substantially formed from multiple layers of perceptrons, and is therefore characterized by the presence of at least one hidden layer, that is, a layer connected neither to the inputs nor to the outputs of the network:

The MLP architecture

A network of this type is typically trained using supervised learning, according to the principles outlined in the previous paragraph. In particular, a typical learning algorithm for MLP networks is the so-called backpropagation algorithm.

Note

The backpropagation algorithm is a learning algorithm for neural networks. It compares the output value of the system with the desired value. On the basis of the difference thus calculated (namely, the error), the algorithm modifies the synaptic weights of the neural network, progressively converging the set of output values towards the desired ones.

It is important to note that in MLP networks, although you don't know the desired outputs of the neurons of the hidden layers of the network, it is always possible to apply a supervised learning method based on the minimization of an error function via the application of gradient descent techniques.

In the following example, we show the implementation of an MLP for an image classification problem (MNIST).

Multi Layer Perceptron classification

Import the necessary libraries:

import input_data
import tensorflow as tf
import matplotlib.pyplot as plt

Load the images to classify:

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

Fix some parameters for the MLP model.

The learning rate of the net:

learning_rate = 0.001

The epochs:

training_epochs = 20

The batch size (the number of images processed at a time):

batch_size = 100
display_step = 1

The number of neurons for the first layer:

n_hidden_1 = 256

The number of neurons for the second layer:

n_hidden_2 = 256

The size of the input (each image has 784 pixels):

n_input = 784 # MNIST data input (img shape: 28*28)

The number of output classes:

n_classes = 10

It should be noted that while, for a given application, the input and output sizes are perfectly defined, there are no strict criteria for how to define the number of hidden layers and the number of neurons in each layer.

Every choice must be based on experience with similar applications, as in our case:

When increasing the number of hidden layers, we should also increase the size of the training set and the number of connections to be updated during the learning phase. This results in an increase in the training time.
Also, if there are too many neurons in the hidden layer, not only are there more weights to be updated, but the network also has a tendency to learn too much from the training examples set, resulting in a poor generalization ability. But if the hidden neurons are too few, the network is not able to learn even with the training set.

Build the model

The input layer is the x tensor [1 × 784], which represents the image to classify:

x = tf.placeholder("float", [None, n_input])

The output tensor y is sized by the number of classes:

y = tf.placeholder("float", [None, n_classes])

In the middle, we have two hidden layers. The first layer is constituted by the h tensor of weights, whose size is [784 × 256], where 256 is the total number of nodes in the layer:

h = tf.Variable(tf.random_normal([n_input, n_hidden_1]))

For layer 1, we then have to define the respective biases tensor:

bias_layer_1 = tf.Variable(tf.random_normal([n_hidden_1]))

Each neuron receives the pixels of the input image to be classified, combined with the h_ij weight connections and added to the respective values of the biases tensor:

layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, h), bias_layer_1))

It sends its output to the neurons of the next layer through the activation function. It must be said that the activation functions could differ from one neuron to another, but in practice we adopt a common choice for all the neurons, typically of the sigmoidal type. Sometimes the output neurons are equipped with a linear activation function. It is interesting to note that the activation functions of the neurons in the hidden layers cannot be linear because, in this case, the MLP network would be equivalent to a two-layer network and therefore no longer of the MLP type. The second layer must perform the same steps as the first.

The second intermediate layer is represented by the weights tensor of shape [256 × 256]:

w = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))

With the tensor of biases:

bias_layer_2 = tf.Variable(tf.random_normal([n_hidden_2]))

Each neuron in this second layer receives inputs from the neurons of layer 1, combined with the weight w_ij connections and added to the respective biases of layer 2:

layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, w), bias_layer_2))

It sends its output to the next layer, namely the output layer:

output = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
bias_output = tf.Variable(tf.random_normal([n_classes]))
output_layer = tf.matmul(layer_2, output) + bias_output

The output layer receives as input the 256 stimuli coming from layer 2, which are converted into the respective class probabilities for each digit.

As for the logistic regression, we then define the cost function:

cost = tf.reduce_mean\
    (tf.nn.softmax_cross_entropy_with_logits\
    (output_layer, y))

The TensorFlow function tf.nn.softmax_cross_entropy_with_logits computes the cost for a softmax layer. It is only used during training. The logits are the unnormalized log probabilities output by the model (the values output before the softmax normalization is applied to them).

The corresponding optimizer that minimizes the cost function is:

optimizer = tf.train.AdamOptimizer\
    (learning_rate=learning_rate).minimize(cost)

tf.train.AdamOptimizer uses Kingma and Ba's Adam algorithm to control the learning rate. Adam offers several advantages over the simple tf.train.GradientDescentOptimizer. In fact, it uses a larger effective step size, and the algorithm will converge to this step size without fine tuning.

A simple tf.train.GradientDescentOptimizer could equally be used in your MLP, but would require more hyperparameter tuning before it could converge as quickly.

Note

TensorFlow provides an optimizer base class to compute gradients for a loss and apply gradients to variables. This class defines the API to add ops to train a model. You never use this class directly, but instead instantiate one of its subclasses. See https://www.tensorflow.org/versions/r0.8/api_docs/python/train.html#Optimizer to see the optimizers implemented.

Launch the session

The following are the steps to launch the session:

1. Plot settings:

avg_set = []
epoch_set = []

2. Initialize the variables:

init = tf.initialize_all_variables()

3. Launch the graph:

with tf.Session() as sess:
    sess.run(init)

4. Define the training cycle:

    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)

5. Loop over all the batches:

        for i in range(total_batch):
            batch_xs, batch_ys =\
                mnist.train.next_batch(batch_size)

6. Fit the training using the batch data:

            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})

7. Compute the average loss, and display a log per epoch step:

            avg_cost += sess.run(cost, feed_dict={x: batch_xs,\
                y: batch_ys})/total_batch

        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)

    print "Training phase finished"

8. With these lines of code, we plot the training phase:

plt.plot(epoch_set, avg_set, 'o', label='MLP Training phase')
plt.ylabel('cost')
plt.xlabel('epoch')
plt.legend()
plt.show()

9. Finally, we can test the MLP model:

correct_prediction = tf.equal(tf.argmax(output_layer, 1),\
    tf.argmax(y, 1))

evaluating its accuracy:

accuracy = tf.reduce_mean(tf.cast(correct_prediction,
    "float"))

print "Model Accuracy:", accuracy.eval({x:
    mnist.test.images,\
    y: mnist.test.labels})

10. Here is the output result after 20 epochs:

Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ========================== RESTART ==============================
>>>
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Epoch: 0001 cost= 1.723947845
Epoch: 0002 cost= 0.539266024
Epoch: 0003 cost= 0.362600502
Epoch: 0004 cost= 0.266637279
Epoch: 0005 cost= 0.205345784
Epoch: 0006 cost= 0.159139332
Epoch: 0007 cost= 0.125232637
Epoch: 0008 cost= 0.098572041
Epoch: 0009 cost= 0.077509963
Epoch: 0010 cost= 0.061127526
Epoch: 0011 cost= 0.048033808
Epoch: 0012 cost= 0.037297983
Epoch: 0013 cost= 0.028884999
Epoch: 0014 cost= 0.022818390
Epoch: 0015 cost= 0.017447586
Epoch: 0016 cost= 0.013652348
Epoch: 0017 cost= 0.010417282
Epoch: 0018 cost= 0.008079228
Epoch: 0019 cost= 0.006203546
Epoch: 0020 cost= 0.004961207
Training phase finished
Model Accuracy: 0.9775
>>>

We show the training phase in the following figure:

Training phase in Multi Layer Perceptron

Source code

# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf
import matplotlib.pyplot as plt

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 100
display_step = 1

# Network Parameters
n_hidden_1 = 256 # 1st layer num features
n_hidden_2 = 256 # 2nd layer num features
n_input = 784 # MNIST data input (img shape: 28*28)
n_classes = 10 # MNIST total classes (0-9 digits)

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])

# weights layer 1
h = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
# bias layer 1
bias_layer_1 = tf.Variable(tf.random_normal([n_hidden_1]))
# layer 1
layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, h), bias_layer_1))

# weights layer 2
w = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))
# bias layer 2
bias_layer_2 = tf.Variable(tf.random_normal([n_hidden_2]))
# layer 2
layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, w), bias_layer_2))

# weights output layer
output = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
# bias output layer
bias_output = tf.Variable(tf.random_normal([n_classes]))
# output layer
output_layer = tf.matmul(layer_2, output) + bias_output

# cost function
cost = tf.reduce_mean\
    (tf.nn.softmax_cross_entropy_with_logits(output_layer, y))
# optimizer
optimizer = tf.train.AdamOptimizer\
    (learning_rate=learning_rate).minimize(cost)

# Plot settings
avg_set = []
epoch_set = []

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost,\
                feed_dict={x: batch_xs,\
                y: batch_ys})/total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)

    print "Training phase finished"

    plt.plot(epoch_set, avg_set, 'o', label='MLP Training phase')
    plt.ylabel('cost')
    plt.xlabel('epoch')
    plt.legend()
    plt.show()

    # Test model
    correct_prediction = tf.equal(tf.argmax(output_layer, 1),\
        tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print "Model Accuracy:", accuracy.eval({x: mnist.test.images,\
        y: mnist.test.labels})

Multi Layer Perceptron function approximation

In the following example, we implement an MLP network that will be able to learn the trend of an arbitrary function f(x). In the training phase the network will have to learn from a known set of points, that is x and f(x), while in the test phase the network will deduce the values of f(x) only from the x values.

This very simple network will be built with a single hidden layer.

Import the necessary libraries:

import tensorflow as tf
import numpy as np
import math, random
import matplotlib.pyplot as plt

We build the data model. The function to be learned will follow the trend of the cosine function, evaluated for 1000 points, to which we add a very small random error (noise) to reproduce a real case:

NUM_points = 1000
np.random.seed(NUM_points)
function_to_learn = lambda x: np.cos(x) + \
                    0.1 * np.random.randn(*x.shape)

Our MLP network will be formed by a hidden layer of 10 neurons:

layer_1_neurons = 10

The network learns 100 points at a time, for a total of 1500 learning cycles (epochs):

batch_size = 100
NUM_EPOCHS = 1500

Finally, we construct the training set and the test set:

# all_x contains all the points
all_x = np.float32(np.random.uniform \
        (-2 * math.pi, 2 * math.pi, \
        (1, NUM_points))).T
np.random.shuffle(all_x)
train_size = int(900)

The first 900 points are in the training set:

x_training = all_x[:train_size]
y_training = function_to_learn(x_training)

The last 100 will be in the validation set:

x_validation = all_x[train_size:]
y_validation = function_to_learn(x_validation)

Using matplotlib, we display these sets:

plt.figure(1)
plt.scatter(x_training, y_training, c='blue', label='train')
plt.scatter(x_validation, y_validation, c='red', label='validation')
plt.legend()
plt.show()

Training and validation set

Build the model

First, we create the placeholders for the input tensor (X) and the output tensor (Y):

X = tf.placeholder(tf.float32, [None, 1], name="X")
Y = tf.placeholder(tf.float32, [None, 1], name="Y")

Then we build the hidden layer of [1 x 10] dimensions:

w_h = tf.Variable(tf.random_uniform([1, layer_1_neurons], \
                  minval=-1, maxval=1, \
                  dtype=tf.float32))
b_h = tf.Variable(tf.zeros([1, layer_1_neurons], \
                  dtype=tf.float32))

It receives the input value from the X input tensor, combined through the w_h weight connections and added to the respective biases of layer 1:

h = tf.nn.sigmoid(tf.matmul(X, w_h) + b_h)

The output layer is a [10 x 1] tensor:

w_o = tf.Variable(tf.random_uniform([layer_1_neurons, 1], \
                  minval=-1, maxval=1, \
                  dtype=tf.float32))
b_o = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))

Each neuron in this second layer receives inputs from the neurons of layer 1, combined through the w_o weight connections and added to the respective biases of the output layer:

model = tf.matmul(h, w_o) + b_o

We then define our optimizer for the newly defined model:

train_op = tf.train.AdamOptimizer().minimize \
           (tf.nn.l2_loss(model - Y))

We also note that in this case, the cost function adopted is the following:

tf.nn.l2_loss(model - Y)

The tf.nn.l2_loss function is a TensorFlow function that computes half the L2 norm of a tensor without the sqrt, that is, the output of the preceding function is as follows:

output = sum((model - Y) ** 2) / 2

The tf.nn.l2_loss function can be a viable cost function for our example.
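
As a quick check of this formula, the following minimal sketch compares tf.nn.l2_loss on a toy constant with the explicit sum of squares divided by two (the values are made up for illustration):

import numpy as np
import tensorflow as tf

t = np.array([1.0, -2.0, 3.0], dtype=np.float32)
with tf.Session() as check_sess:
    # half the squared L2 norm: (1 + 4 + 9) / 2
    print check_sess.run(tf.nn.l2_loss(tf.constant(t)))  # 7.0
print np.sum(t ** 2) / 2  # 7.0, the same value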

Launch the session

Let's build the evaluation graph:

sess = tf.Session()
sess.run(tf.initialize_all_variables())

Now we can launch the learning session:

errors = []
for i in range(NUM_EPOCHS):
    for start, end in zip(range(0, len(x_training), batch_size), \
                          range(batch_size, \
                                len(x_training), batch_size)):
        sess.run(train_op, feed_dict={X: x_training[start:end], \
                                      Y: y_training[start:end]})
    cost = sess.run(tf.nn.l2_loss(model - y_validation), \
                    feed_dict={X: x_validation})
    errors.append(cost)
    if i % 100 == 0: print "epoch %d, cost = %g" % (i, cost)

Running this network for the 1500 epochs (the last log line is printed at epoch 1400), we'll see the error progressively reducing and eventually converging:

Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ======================= RESTART ============================
>>>
epoch 0, cost = 55.9286
epoch 100, cost = 22.0084
epoch 200, cost = 18.033
epoch 300, cost = 14.0481
epoch 400, cost = 9.74721
epoch 500, cost = 5.83419
epoch 600, cost = 3.05434
epoch 700, cost = 1.53706
epoch 800, cost = 0.91719
epoch 900, cost = 0.726675
epoch 1000, cost = 0.668316
epoch 1100, cost = 0.633737
epoch 1200, cost = 0.608306
epoch 1300, cost = 0.590429
epoch 1400, cost = 0.574602
>>>

The following lines of code allow us to display how the cost changes in the running epochs:

plt.plot(errors, label='MLP Function Approximation')
plt.xlabel('epochs')
plt.ylabel('cost')
plt.legend()
plt.show()

Training phase in Multi Layer Perceptron

Summary

In this chapter, we introduced artificial neural networks. An artificial neuron is a mathematical model that to some extent mimics the properties of a living neuron. Each neuron of the network performs a very simple operation: it becomes active if the total amount of signal it receives exceeds its activation threshold. The learning process is typically supervised: the neural net uses a training set to infer the relationship between the input and the corresponding output, while the learning algorithm modifies the weights of the net in order to minimize a cost function that represents the forecast error on the training set. If the training is successful, the neural net will be able to make forecasts even where the output is not known a priori. In this chapter we implemented, using TensorFlow, some examples involving neural networks. We saw neural nets used to solve classification and regression problems, such as the logistic regression algorithm in a classification problem using Rosenblatt's Perceptron. At the end of the chapter, we introduced the Multi Layer Perceptron architecture, which we saw in action first in the implementation of an image classifier, then as a simulator of mathematical functions.

In the next chapter, we finally introduce deep learning models; we will examine and implement more complex neural network architectures, such as the convolutional neural network and the recurrent neural network.

Chapter 5. Deep Learning

In this chapter, we will cover the following topics:

Deep learning techniques
Convolutional neural network (CNN)
CNN architecture
TensorFlow implementation of a CNN
Recurrent neural network (RNN)
RNN architecture
Natural Language Processing with TensorFlow

Deep learning techniques

Deep learning techniques are a crucial step forward taken by machine learning researchers in recent decades, having provided successful results never seen before in many applications, such as image recognition and speech recognition.

There are several reasons that led to deep learning being developed and placed at the center of attention in the scope of machine learning. One of these reasons is the progress in hardware, with the availability of new processors, such as graphics processing units (GPUs), which have greatly reduced the time needed for training networks, lowering it by a factor of 10 or 20.

Another reason is certainly the increasing ease of finding ever larger datasets on which to train a system, needed to train architectures of a certain depth and with high dimensionality of the input data. Deep learning consists of a set of methods that allow a system to obtain a hierarchical representation of the data on multiple levels. This is achieved by combining simple (non-linear) units, each of which transforms the representation at its own level, starting from the input level, into a representation at a higher, slightly more abstract level. With a sufficient number of these transformations, considerably complex input-output functions can be learned.

With reference to a classification problem, for example, the highest levels of representation highlight the aspects of the input data that are relevant for the classification, suppressing the ones that have no effect on the classification purposes.

Hierarchical feature extraction in an image classification system

The preceding scheme describes the features of an image classification system (a face recognizer): each block gradually extracts the features of the input image, processing data already pre-processed by the previous blocks, extracting increasingly complex features of the input image, and thus building the hierarchical data representation that characterizes a deep learning-based system.

A possible representation of the hierarchy of features could be as follows:

pixel --> edge --> texture --> motif --> part --> object

In a text recognition problem, instead, the hierarchical representation can be structured as follows:

character --> word --> word group --> clause --> sentence --> story

A deep learning architecture is, therefore, a multi-level architecture, consisting of simple units, all subject to training, many of which carry non-linear transformations. Each unit transforms its input to improve its selectivity, amplifying only the aspects relevant for classification purposes, and its invariance, namely its propensity to ignore the irrelevant and negligible aspects.

With multiple levels of non-linear transformations, therefore, with a depth approximately between 5 and 20 levels, a deep learning system can learn and implement extremely intricate and complex functions, simultaneously very sensitive to the smallest relevant details and extremely insensitive and indifferent to large variations of irrelevant aspects of the input data, which can be, in the case of object recognition: the image's background, brightness, or the position of the represented object.

The following sections will illustrate, with the aid of TensorFlow, two important types of deep neural networks: the convolutional neural networks (CNNs), mainly addressed to classification problems, and then the recurrent neural networks (RNNs), targeting Natural Language Processing (NLP) issues.

Convolutional neural networks

Convolutional neural networks (CNNs) are a particular type of deep learning-oriented neural network that have achieved excellent results in many practical applications, in particular object recognition in images.

In fact, CNNs are designed to process data represented in the form of multiple arrays, for example, color images, representable by means of three two-dimensional arrays containing the pixels' color intensity. The substantial difference between CNNs and ordinary neural networks is that the former operate directly on the images while the latter work on features extracted from them. The input of a CNN, therefore, unlike that of an ordinary neural network, will be two-dimensional, and the features will be the pixels of the input image.

CNNs are the dominant approach for almost all recognition problems. The spectacular performance offered by networks of this type has in fact prompted the biggest companies in technology, such as Google and Facebook, to invest in research and development projects for networks of this kind, and to develop and distribute image recognition products based on CNNs.

CNN architecture

CNNs use three basic ideas: local receptive fields, convolution, and pooling.

In convolutional networks, we consider input as something similar to what is shown in the following figure:

Input neurons

One of the concepts behind CNNs is local connectivity. CNNs, in fact, exploit spatial correlations that may exist within the input data. Each neuron of the first subsequent layer connects only to some of the input neurons. This region is called the local receptive field. In the following figure, it is represented by the black 5x5 square that converges to a hidden neuron:

From input to hidden neurons

The hidden neuron, of course, will only process the input data inside its receptive field, not registering changes outside of it. However, it is easy to see that, by superimposing several locally connected layers, going up the levels you will have units that process more and more global data compared to the input, in accordance with the basic principle of deep learning: to bring the representation to an ever growing level of abstraction.

Note

The reason for the local connectivity resides in the fact that in data in array form, such as images, the values are often highly correlated, forming distinct groups of data that can be easily identified.

Each connection learns a weight (so each hidden neuron gets 5x5 = 25 of them), and the hidden neuron learns a single overall bias at the same time; then we connect the local regions to individual neurons by performing a shift each time, as in the following figures:

The convolution operation

This operation is called convolution. Doing so, if we have an image of 28x28 inputs and 5x5 regions, we will get 24x24 neurons in the hidden layer. We said that each neuron has a bias and 5x5 weights connected to the region: we will use these same weights and biases for all 24x24 neurons. This means that all the neurons in the first hidden layer will recognize the same features, just placed differently in the input image. For this reason, the map of connections from the input layer to the hidden feature map is called shared weights and its bias is called shared bias, since they are in fact shared.

Obviously, we need to recognize more than one map of features in an image, so a complete convolutional layer is made of multiple feature maps.

Multiple feature maps

In the preceding figure, we see three feature maps; of course, their number can increase in practice, and convolutional layers with even 20 or 40 feature maps are used. A great advantage of the sharing of weights and biases is the significant reduction of the parameters involved in a convolutional network. Considering our example, for each feature map we need 25 weights (5x5) and a (shared) bias; that is 26 parameters in total. Assuming we have 20 feature maps, we will have 520 parameters to be defined. With a fully connected network, with 784 input neurons and, for example, 30 hidden layer neurons, we would need 784x30 weights plus 30 biases, reaching a total of 23,550 parameters.
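
The following few lines of plain Python just verify this arithmetic (nothing here is part of the network code):

receptive_field = 5 * 5            # weights per feature map
per_map = receptive_field + 1      # plus one shared bias -> 26
print 20 * per_map                 # 520 parameters for 20 feature maps
print 784 * 30 + 30                # 23550 parameters, fully connected case
print 28 - 5 + 1                   # 24 -> the 24x24 hidden layer size
print (28 - 5 + 1) / 2             # 12 -> the 12x12 size after 2x2 pooling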

The difference is evident. Convolutional networks also use pooling layers, which are layers positioned immediately after the convolutional layers; these simplify the output information of the previous (convolutional) layer. A pooling layer takes the feature maps coming out of the convolutional layer and prepares a condensed feature map. For example, we can say that the pooling layer could summarize, in each of its units, a 2x2 region of neurons of the previous layer.

This technique is called pooling and can be summarized with the following scheme:

The pooling operation helps to simplify the information from one layer to the next

Obviously, we usually have more feature maps, and we apply max pooling to each of them separately.

From the input layer to the second hidden layer

So we have three feature maps of size 24x24 for the first hidden layer, and the second hidden layer will be of size 12x12, since we are assuming that every unit summarizes a 2x2 region.

Combining these three ideas, we form a complete convolutional network. Its architecture can be displayed as follows:

A CNN's architectural schema

Let's summarize: there are 28x28 input neurons followed by a convolutional layer with a 5x5 local receptive field and 3 feature maps. We obtain as a result a hidden layer of 3x24x24 neurons. Then 2x2 max pooling is applied on the 3 regions of feature maps, getting a 3x12x12 hidden layer. The last layer is fully connected: it connects all the neurons of the max-pooling layer to all 10 output neurons, useful to recognize the corresponding output.

This network will then be trained by gradient descent and the backpropagation algorithm.

TensorFlow implementation of a CNN

In the following example, we will see the CNN in action on an image classification problem. We want to show the process of building a CNN: what steps to execute, what reasoning is needed to properly dimension the entire network, and of course how to implement it with TensorFlow.

Initialization step

1. Load and prepare the MNIST data:

import tensorflow as tf
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

2. Define all the CNN parameters:

learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10

3. MNIST data input (each shape is a 28x28 array of pixels):

n_input = 784

4. The MNIST total classes (0-9 digits):

n_classes = 10

5. To reduce overfitting, we apply the dropout technique. This term refers to dropping out units (hidden, input, and output) in a neural network. Deciding which neurons to eliminate is random; one way is to apply a probability, as we shall see in our code. For this reason, we define the following parameter (to be tuned); a small demonstration of the operation follows these steps:

dropout = 0.75

6. Define the placeholders for the input graph. The x placeholder contains the MNIST data input (exactly 784 pixels):

x = tf.placeholder(tf.float32, [None, n_input])

7. Then we reshape the input images to a 4D tensor, using the TensorFlow reshape operator:

_X = tf.reshape(x, shape=[-1, 28, 28, 1])

The second and third dimensions correspond to the width and height of the image, while the last dimension is the total number of color channels (in our case 1).

So we can display our input image as a two-dimensional tensor, of size 28x28:

The input tensor for our problem

The output tensor will contain the output probability for each digit to classify:

y = tf.placeholder(tf.float32, [None, n_classes])
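
As anticipated in step 5, here is a minimal sketch of what tf.nn.dropout does (the input vector is a made-up toy value): each element is kept with probability keep_prob and, if kept, scaled by 1/keep_prob, so that the expected sum is unchanged:

import tensorflow as tf

toy = tf.constant([[1., 2., 3., 4.]])
dropped = tf.nn.dropout(toy, keep_prob=0.5)
with tf.Session() as demo_sess:
    # surviving entries are scaled by 2; the zeroed positions vary per run
    print demo_sess.run(dropped)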

First convolutional layer

Each neuron of the hidden layer is connected to a small subset of the input tensor, of dimension 5x5. With no padding this would give a 24x24 hidden layer; as we will see shortly, we use 'SAME' padding, so the convolutional output keeps the 28x28 size of the input. We also define and initialize the tensors of shared weights and shared bias:

wc1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
bc1 = tf.Variable(tf.random_normal([32]))

Recall that in order to recognize an image, we need more than one map of features. The number 32 is just the number of feature maps we are considering for this first layer. In our case, the convolutional layer is composed of 32 feature maps.

The next step is the construction of the first convolution layer, conv1:

conv1 = conv2d(_X, wc1, bc1)

Here, conv2d is the following function:

def conv2d(img, w, b):
    return tf.nn.relu(tf.nn.bias_add \
                      (tf.nn.conv2d(img, w, \
                                    strides=[1, 1, 1, 1], \
                                    padding='SAME'), b))

For this purpose, we used the TensorFlow tf.nn.conv2d function. It computes a 2D convolution from the input tensor and the tensor of shared weights; the result of this operation is then added to the bc1 bias matrix. tf.nn.relu is the ReLU function (Rectified Linear Unit), the usual activation function in the hidden layers of a deep neural network, and we apply it to the return value of the convolution function. The padding value is 'SAME', which indicates that the output tensor will have the same size as the input tensor.
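
To make the effect of the padding concrete, this small sketch (with dummy zero tensors, not the real MNIST data) compares the statically inferred output shapes for 'SAME' and 'VALID' padding:

import tensorflow as tf

img = tf.zeros([1, 28, 28, 1])
w5 = tf.zeros([5, 5, 1, 1])
same = tf.nn.conv2d(img, w5, strides=[1, 1, 1, 1], padding='SAME')
valid = tf.nn.conv2d(img, w5, strides=[1, 1, 1, 1], padding='VALID')
print same.get_shape()   # (1, 28, 28, 1): size preserved
print valid.get_shape()  # (1, 24, 24, 1): the 24x24 of the previous section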

One way to represent the convolutional layer, namely conv1, is as follows:

The first hidden layer

After the convolution operation, we impose the pooling step, which simplifies the output information of the previously created convolutional layer.

In our example, let's take a 2x2 region of the convolution layer, and we will summarize the information at each point in the pooling layer:

conv1 = max_pool(conv1, k=2)

Here, for the pooling operation, we have implemented the following function:

def max_pool(img, k):
    return tf.nn.max_pool(img, \
                          ksize=[1, k, k, 1], \
                          strides=[1, k, k, 1], \
                          padding='SAME')

The tf.nn.max_pool function performs the max pooling on the input. Of course, we apply max pooling for each convolutional layer, and there will be many layers of pooling and convolution. At the end of the pooling phase, we'll have a 14x14x32 convolutional hidden layer (the 2x2 pooling halves each of the 28x28 dimensions).

The next figure shows the CNN layers after the pooling and convolution operation:

The CNN after the first convolution and pooling operations

The last operation is to reduce overfitting by applying the tf.nn.dropout TensorFlow operator on the convolutional layer. To do this, we create a placeholder for the probability (keep_prob) that a neuron's output is kept during dropout:

keep_prob = tf.placeholder(tf.float32)
conv1 = tf.nn.dropout(conv1, keep_prob)

Second convolutional layer

For the second hidden layer, we must apply the same operations as the first layer, and so we define and initialize the tensors of shared weights and shared bias:

wc2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
bc2 = tf.Variable(tf.random_normal([64]))

As you can note, this second hidden layer will have 64 features for a 5x5 window, while the number of input channels is given by the first convolutional layer. We next apply a second convolutional layer to the conv1 tensor, but this time we apply 64 sets of 5x5 filters, each to the 32 conv1 layers:

conv2 = conv2d(conv1, wc2, bc2)

It gives us 64 14x14 arrays, which we reduce with max pooling to 64 7x7 arrays:

conv2 = max_pool(conv2, k=2)

Finally, we again use the dropout operation:

conv2 = tf.nn.dropout(conv2, keep_prob)

The resulting layer is a 7x7x64 tensor: we started from the 14x14 pooled maps, applied a 5x5 sliding window with stride 1 and 'SAME' padding (which preserves the 14x14 size), and then halved each dimension with the 2x2 max pooling.

Building the second hidden layer

Densely connected layer

In this step, we build a densely connected layer that we use to process the entire image. The weight and bias tensors are as follows:

wd1 = tf.Variable(tf.random_normal([7*7*64, 1024]))
bd1 = tf.Variable(tf.random_normal([1024]))

As you can note, this layer will be formed by 1024 neurons.

Then we reshape the tensor from the second convolutional layer into a batch of vectors:

dense1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])

We multiply this tensor by the weight matrix, wd1, add the bias tensor, bd1, and apply a ReLU operation:

dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, wd1), bd1))

We complete this layer by again using the dropout operator:

dense1 = tf.nn.dropout(dense1, keep_prob)

Readout layer

The last layer defines the wout and bout tensors:

wout = tf.Variable(tf.random_normal([1024, n_classes]))
bout = tf.Variable(tf.random_normal([n_classes]))

Before applying the softmax function, we must calculate the evidence that the image belongs to a certain class:

pred = tf.add(tf.matmul(dense1, wout), bout)

Testing and training the model

The evidence must be converted into probabilities for each of the 10 possible classes (the method is identical to what we saw in Chapter 4, Introducing Neural Networks). So we define the cost function, which evaluates the quality of our model, by applying the softmax function:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))

And its optimizer, using the TensorFlow AdamOptimizer function:

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

The following tensors will serve in the evaluation phase of the model:

correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

Launching the session

Initialize the variables:

init = tf.initialize_all_variables()

Build the evaluation graph:

with tf.Session() as sess:
    sess.run(init)
    step = 1

Let's train the net until training_iters:

    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)

Fit training using the batch data:

        sess.run(optimizer, feed_dict={x: batch_xs, \
                                       y: batch_ys, \
                                       keep_prob: dropout})
        if step % display_step == 0:

Calculate the accuracy:

            acc = sess.run(accuracy, feed_dict={x: batch_xs, \
                                                y: batch_ys, \
                                                keep_prob: 1.})

Calculate the loss:

            loss = sess.run(cost, feed_dict={x: batch_xs, \
                                             y: batch_ys, \
                                             keep_prob: 1.})
            print "Iter " + str(step * batch_size) + \
                  ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + \
                  ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"

We print the accuracy for the first 256 MNIST test images:

    print "Testing Accuracy:", \
          sess.run(accuracy, \
                   feed_dict={x: mnist.test.images[:256], \
                              y: mnist.test.labels[:256], \
                              keep_prob: 1.})

Running the code, we have the following output:

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Iter 1280, Minibatch Loss= 27900.769531, Training Accuracy= 0.17188
Iter 2560, Minibatch Loss= 17168.949219, Training Accuracy= 0.21094
Iter 3840, Minibatch Loss= 15000.724609, Training Accuracy= 0.41406
Iter 5120, Minibatch Loss= 8000.896484, Training Accuracy= 0.49219
Iter 6400, Minibatch Loss= 4587.275391, Training Accuracy= 0.61719
Iter 7680, Minibatch Loss= 5949.988281, Training Accuracy= 0.69531
Iter 8960, Minibatch Loss= 4932.690430, Training Accuracy= 0.70312
Iter 10240, Minibatch Loss= 5066.223633, Training Accuracy= 0.70312
....................
Iter 81920, Minibatch Loss= 442.895020, Training Accuracy= 0.93750
Iter 83200, Minibatch Loss= 273.936676, Training Accuracy= 0.93750
Iter 84480, Minibatch Loss= 1169.810303, Training Accuracy= 0.89062
Iter 85760, Minibatch Loss= 737.561157, Training Accuracy= 0.90625
Iter 87040, Minibatch Loss= 583.576965, Training Accuracy= 0.89844
Iter 88320, Minibatch Loss= 375.274475, Training Accuracy= 0.93750
Iter 89600, Minibatch Loss= 183.815613, Training Accuracy= 0.94531
Iter 90880, Minibatch Loss= 410.157867, Training Accuracy= 0.89844
Iter 92160, Minibatch Loss= 895.187683, Training Accuracy= 0.84375
Iter 93440, Minibatch Loss= 819.893555, Training Accuracy= 0.89062
Iter 94720, Minibatch Loss= 460.179779, Training Accuracy= 0.90625
Iter 96000, Minibatch Loss= 514.344482, Training Accuracy= 0.87500
Iter 97280, Minibatch Loss= 507.836975, Training Accuracy= 0.89844
Iter 98560, Minibatch Loss= 353.565735, Training Accuracy= 0.92188
Iter 99840, Minibatch Loss= 195.138626, Training Accuracy= 0.93750
Optimization Finished!
Testing Accuracy: 0.921875

It provides an accuracy of about 92% on the first 256 test images. Obviously, this does not represent the state of the art, because the purpose of the example is just to see how to build a CNN. The model can be further refined to give better results.

Source code

# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf

# Parameters
learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10

# Network Parameters
n_input = 784 # MNIST data input (img shape: 28*28)
n_classes = 10 # MNIST total classes (0-9 digits)
dropout = 0.75 # Dropout, probability to keep units

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
# dropout (keep probability)
keep_prob = tf.placeholder(tf.float32)

# Create model
def conv2d(img, w, b):
    return tf.nn.relu(tf.nn.bias_add \
                      (tf.nn.conv2d(img, w, \
                                    strides=[1, 1, 1, 1], \
                                    padding='SAME'), b))

def max_pool(img, k):
    return tf.nn.max_pool(img, \
                          ksize=[1, k, k, 1], \
                          strides=[1, k, k, 1], \
                          padding='SAME')

# Store layers weight & bias
# 5x5 conv, 1 input, 32 outputs
wc1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
bc1 = tf.Variable(tf.random_normal([32]))
# 5x5 conv, 32 inputs, 64 outputs
wc2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
bc2 = tf.Variable(tf.random_normal([64]))
# fully connected, 7*7*64 inputs, 1024 outputs
wd1 = tf.Variable(tf.random_normal([7*7*64, 1024]))
# 1024 inputs, 10 outputs (class prediction)
wout = tf.Variable(tf.random_normal([1024, n_classes]))
bd1 = tf.Variable(tf.random_normal([1024]))
bout = tf.Variable(tf.random_normal([n_classes]))

# Construct model
_X = tf.reshape(x, shape=[-1, 28, 28, 1])

# Convolution Layer
conv1 = conv2d(_X, wc1, bc1)
# Max Pooling (down-sampling)
conv1 = max_pool(conv1, k=2)
# Apply Dropout
conv1 = tf.nn.dropout(conv1, keep_prob)

# Convolution Layer
conv2 = conv2d(conv1, wc2, bc2)
# Max Pooling (down-sampling)
conv2 = max_pool(conv2, k=2)
# Apply Dropout
conv2 = tf.nn.dropout(conv2, keep_prob)

# Fully connected layer
# Reshape conv2 output to fit dense layer input
dense1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])
# Relu activation
dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, wd1), bd1))
# Apply Dropout
dense1 = tf.nn.dropout(dense1, keep_prob)

# Output, class prediction
pred = tf.add(tf.matmul(dense1, wout), bout)

# Define loss and optimizer
cost = tf.reduce_mean \
       (tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = \
    tf.train.AdamOptimizer \
    (learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        # Fit training using batch data
        sess.run(optimizer, feed_dict={x: batch_xs, \
                                       y: batch_ys, \
                                       keep_prob: dropout})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_xs, \
                                                y: batch_ys, \
                                                keep_prob: 1.})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_xs, \
                                             y: batch_ys, \
                                             keep_prob: 1.})
            print "Iter " + str(step * batch_size) + \
                  ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + \
                  ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"
    # Calculate accuracy for 256 mnist test images
    print "Testing Accuracy:", \
          sess.run(accuracy, \
                   feed_dict={x: mnist.test.images[:256], \
                              y: mnist.test.labels[:256], \
                              keep_prob: 1.})

Recurrent neural networks

Another deep learning-oriented architecture is that of the so-called recurrent neural networks (RNNs). The basic idea of RNNs is to make use of the sequential nature of the input. In ordinary neural networks, we typically assume that each input and output is independent of all the others. For many types of problems, however, this assumption does not hold. For example, if you want to predict the next word of a phrase, it is certainly important to know the words that precede it. These neural nets are called recurrent because they perform the same computations for all the elements of a sequence of inputs, and the output for each element depends, in addition to the current input, on all the previous computations.

RNN architecture

RNNs process a sequential input one item at a time, maintaining a sort of updated state vector that contains information about all the past elements of the sequence. In general, an RNN has a shape of the following type:

RNN architecture schema

The preceding figure shows an RNN together with its unfolded version, making the network structure explicit for the whole sequence of inputs, at each instant of time. It becomes clear that, differently from typical multi-level neural networks, which use different parameters at each level, an RNN always uses the same parameters, denominated U, V, and W (see the previous figure). Furthermore, an RNN performs the same computation at each instant, on successive elements of the same input sequence. Sharing the same parameters strongly reduces the number of parameters that the network must learn during the training phase, thus also improving the training time.

It is also evident how you can train networks of this type: because the parameters are shared at every instant of time, the gradient calculated for each output depends not only on the current computation but also on the previous ones. For example, to calculate the gradient at time t = 4, it is necessary to backpropagate the gradient through the three previous instants of time and then sum the gradients thus obtained. Also, the entire input sequence is typically considered to be a single element of the training set.

However, the training of this type of network suffers from the so-called vanishing/exploding gradient problem; the gradients, computed and backpropagated, tend to increase or decrease at each instant of time and then, after a certain number of instants of time, diverge to infinity or converge to zero.
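
A minimal numpy sketch can illustrate the phenomenon: backpropagating through time repeatedly multiplies the gradient by the same recurrent matrix, so its norm shrinks or blows up depending on that matrix's dominant factor (the matrices here are toy diagonal examples, chosen only for illustration):

import numpy as np

grad = np.ones(10)
W_small = 0.5 * np.eye(10)   # factor < 1 at each time step
W_large = 1.5 * np.eye(10)   # factor > 1 at each time step
g_small, g_large = grad.copy(), grad.copy()
for t in range(20):          # backpropagate through 20 instants of time
    g_small = W_small.dot(g_small)
    g_large = W_large.dot(g_large)
print np.linalg.norm(g_small)  # ~3e-06: vanishing
print np.linalg.norm(g_large)  # ~1e+04: exploding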

Let us now examine how an RNN operates. Xt is the network input at instant t, which could be, for example, a vector that represents a word of a sentence, while St is the state vector of the net. It can be considered a sort of memory of the system, which contains information on all the previous elements of the input sequence. The state vector at instant t is evaluated starting from the current input (time t) and the state evaluated at the previous instant (time t-1) through the U and W parameters:

St = f(U * Xt + W * St-1)

The function f is a non-linear function such as the rectified linear unit (ReLU), while Ot is the output at instant t, calculated using the V parameter.

The output will depend on the type of problem for which the network is used. For example, if you want to predict the next word of a sentence, it could be a probability vector with respect to each word in the vocabulary of the system.
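
The following minimal numpy sketch implements one step of this recurrence; the sizes and the random U, W, and V matrices are toy values chosen only for illustration:

import numpy as np

def rnn_step(x_t, s_prev, U, W, V):
    # St = f(U Xt + W St-1), with f = tanh here
    s_t = np.tanh(U.dot(x_t) + W.dot(s_prev))
    # Ot: for word prediction, a softmax over the vocabulary
    scores = V.dot(s_t)
    o_t = np.exp(scores) / np.sum(np.exp(scores))
    return s_t, o_t

n_input, n_state, n_vocab = 4, 3, 5          # toy sizes
U = 0.1 * np.random.randn(n_state, n_input)
W = 0.1 * np.random.randn(n_state, n_state)
V = 0.1 * np.random.randn(n_vocab, n_state)
s = np.zeros(n_state)                        # initial state
for x_t in np.eye(n_input):                  # a toy one-hot input sequence
    s, o = rnn_step(x_t, s, U, W, V)         # same U, W, V at every instant
print o  # probability vector over the toy vocabulary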

LSTM networks

Long Short-Term Memory (LSTM) networks are an extension of the basic model of RNN architectures. The main idea is to improve the network, providing it with an explicit memory. LSTM networks, in fact, despite not having an essentially different architecture from RNNs, are equipped with special hidden units, called memory cells, whose behavior is to remember the previous input for a long time.

An LSTM unit

The LSTM unit has three gates and four input weights, xt (from the data to the input and the three gates), while ht is the output of the unit.

An LSTM block contains gates that determine whether an input is significant enough to be saved. This block is formed by four units:

Input gate: Allows the value to be input into the structure
Forget gate: Eliminates the values contained in the structure
Output gate: Determines when the unit will output the values trapped in the structure
Cell: Enables or disables the memory cell

In the next example, we will see a TensorFlow implementation of an LSTM network in a language processing problem.

NLP with TensorFlow

RNNs have proved to have excellent performance in problems such as predicting the next character in a text or, similarly, predicting the next word in a sentence. However, they are also used for more complex problems, such as Machine Translation. In this case, the network has as input a sequence of words in a source language, while you want to output the corresponding sequence of words in a target language. Finally, another application of great importance in which RNNs are widely used is speech recognition. In the following, we will develop a computational model that can predict the next word in a text based on the sequence of the preceding words. To measure the accuracy of the model, we will use the Penn Tree Bank (PTB) dataset, which is the benchmark used to measure the precision of these models.

This example refers to the files that you find in the /rnn/ptb directory of your TensorFlow distribution. It comprises the following two files:

ptb_word_lm.py: The code to train a language model on the PTB dataset
reader.py: The code to read the dataset

Unlike previous examples, we will present only the pseudocode of the implemented procedure, in order to understand the main ideas behind the construction of the model without getting bogged down in unnecessary implementation details. The source code is quite long, and an explanation of the code line by line would be too cumbersome.

Note

See https://www.tensorflow.org/versions/r0.8/tutorials/recurrent/index.html for other references.

Download the data

You can download the data from the webpage http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz and then extract the data folder. The dataset is preprocessed and contains 10,000 different words, including the end-of-sentence marker and a special symbol (<unk>) for rare words. In reader.py, we convert all of them to unique integer identifiers to make it easy for the neural network to process; a minimal sketch of this conversion is shown after the extraction command below.

To extract a .tgz file with tar, you need to use the following:

tar -xvzf /path/to/yourfile.tgz
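
Here is the minimal sketch of the word-to-id conversion, assuming a whitespace-tokenized text; the toy sentence and the build_vocab helper are illustrative, not the actual reader.py code:

import collections

def build_vocab(words):
    counter = collections.Counter(words)
    # more frequent words get smaller integer identifiers
    pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
    return dict((word, i) for i, (word, _) in enumerate(pairs))

words = "the cat sat on the mat <eos>".split()
word_to_id = build_vocab(words)
print [word_to_id[w] for w in words]  # [0, 2, 5, 4, 0, 3, 1]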

Building the model

This model implements an architecture of the RNN using LSTM. In fact, it extends the RNN architecture by including memory units that allow saving information regarding long-term temporal dependencies.

The TensorFlow library allows you to create an LSTM through the following command:

lstm = rnn_cell.BasicLSTMCell(size)

Here size is the number of units of the LSTM. The LSTM memory is initialized to zero:

state = tf.zeros([batch_size, lstm.state_size])

In the course of the computation, after each word is examined the state value is updated with the output value. The following is the pseudocode of the implemented steps:

loss = 0.0
for current_batch_of_words in words_in_dataset:
    output, state = lstm(current_batch_of_words, state)

output is then used to make predictions of the next word:

    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)

The loss function minimizes the average negative log probability of the target words; it is the TensorFlow function:

tf.nn.seq2seq.sequence_loss_by_example

It computes the average per-word perplexity; its value measures the accuracy of the model (lower values correspond to better performance) and will be monitored throughout the training process.
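
Perplexity is simply the exponential of the average negative log probability assigned to the target words, as this tiny sketch shows (the probabilities are made-up values):

import numpy as np

target_probs = np.array([0.2, 0.5, 0.1, 0.4])  # hypothetical model outputs
perplexity = np.exp(-np.mean(np.log(target_probs)))
print perplexity  # ~3.98; a uniform guess over 10,000 words would give 10000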

Running the code

The implemented model supports three types of configurations: small, medium, and large. The difference between them is in the size of the LSTMs and the set of hyperparameters used for training. The larger the model, the better results it should get. The small model should be able to reach perplexity below 120 on the test set and the large one below 80, though it might take several hours to train.

To execute the model, simply type the following:

python ptb_word_lm.py --data_path=/tmp/simple-examples/data/ --model small

In /tmp/simple-examples/data/, you must have downloaded the data from the PTB dataset.

The following list shows the run after 8 hours of training (13 epochs for a small configuration):

Epoch: 1 Learning rate: 1.000
0.004 perplexity: 5263.762 speed: 391 wps
0.104 perplexity: 837.607 speed: 429 wps
0.204 perplexity: 617.207 speed: 442 wps
0.304 perplexity: 498.160 speed: 438 wps
0.404 perplexity: 430.516 speed: 436 wps
0.504 perplexity: 386.339 speed: 427 wps
0.604 perplexity: 348.393 speed: 431 wps
0.703 perplexity: 322.351 speed: 432 wps
0.803 perplexity: 301.630 speed: 431 wps
0.903 perplexity: 282.417 speed: 434 wps
Epoch: 1 Train Perplexity: 268.124
Epoch: 1 Valid Perplexity: 180.210
Epoch: 2 Learning rate: 1.000
0.004 perplexity: 209.082 speed: 448 wps
0.104 perplexity: 150.589 speed: 437 wps
0.204 perplexity: 157.965 speed: 436 wps
0.304 perplexity: 152.896 speed: 453 wps
0.404 perplexity: 150.299 speed: 458 wps
0.504 perplexity: 147.984 speed: 462 wps
0.604 perplexity: 143.367 speed: 462 wps
0.703 perplexity: 141.246 speed: 446 wps
0.803 perplexity: 139.299 speed: 436 wps
0.903 perplexity: 135.632 speed: 435 wps
Epoch: 2 Train Perplexity: 133.576
Epoch: 2 Valid Perplexity: 143.072
............................................................
Epoch: 12 Learning rate: 0.008
0.004 perplexity: 57.011 speed: 347 wps
0.104 perplexity: 41.305 speed: 356 wps
0.204 perplexity: 45.136 speed: 356 wps
0.304 perplexity: 43.386 speed: 357 wps
0.404 perplexity: 42.624 speed: 358 wps
0.504 perplexity: 41.980 speed: 358 wps
0.604 perplexity: 40.549 speed: 357 wps
0.703 perplexity: 39.943 speed: 357 wps
0.803 perplexity: 39.287 speed: 358 wps
0.903 perplexity: 37.949 speed: 359 wps
Epoch: 12 Train Perplexity: 37.125
Epoch: 12 Valid Perplexity: 123.571
Epoch: 13 Learning rate: 0.004
0.004 perplexity: 56.576 speed: 365 wps
0.104 perplexity: 40.989 speed: 358 wps
0.204 perplexity: 44.809 speed: 358 wps
0.304 perplexity: 43.082 speed: 356 wps
0.404 perplexity: 42.332 speed: 356 wps
0.504 perplexity: 41.694 speed: 356 wps
0.604 perplexity: 40.275 speed: 357 wps
0.703 perplexity: 39.673 speed: 356 wps
0.803 perplexity: 39.021 speed: 356 wps
0.903 perplexity: 37.690 speed: 356 wps
Epoch: 13 Train Perplexity: 36.869
Epoch: 13 Valid Perplexity: 123.358
Test Perplexity: 117.171

As you can see, the perplexity becomes lower after each epoch.

Summary

In this chapter, we gave an overview of deep learning techniques, examining two of the deep learning architectures in use, CNNs and RNNs. Through the TensorFlow library, we developed a convolutional neural network architecture for an image classification problem. The last part of the chapter was devoted to RNNs, where we described TensorFlow's tutorial for RNNs, in which an LSTM network is built to predict the next word in an English sentence.

The next chapter shows the TensorFlow facilities for GPU computing and introduces TensorFlow Serving, a high performance, open source serving system for machine learning models, designed for production environments and optimized for TensorFlow.

Chapter 6. GPU Programming and Serving with TensorFlow

In this chapter, we will cover the following topics:

GPU programming
TensorFlow Serving:
How to install TensorFlow Serving
How to use TensorFlow Serving
How to load and export a TensorFlow model

GPU programming

In Chapter 5, Deep Learning, where we trained a recurrent neural network (RNN) for an NLP application, we could see that deep learning applications can be computationally intensive. However, you can reduce the training time by using parallel programming techniques through a graphics processing unit (GPU). In fact, the computational resources of modern graphics units make them able to execute code portions in parallel, ensuring high performance.

The GPU programming model is a programming strategy that consists of offloading work from the CPU to a GPU to accelerate the execution of a variety of applications. The range of applications of this strategy is very large and is growing day by day; GPUs are currently able to reduce the execution time of applications across different platforms, from cars to mobile phones, and from tablets to drones and robots.

The following diagram shows how the GPU programming model works. In the application, there are calls that tell the CPU to hand specific parts of the code over to the GPU and let it run them to get high execution speed. The reason such parts run faster on the GPU lies in the speed provided by the GPU architecture: a GPU has many Streaming Multiprocessors (SMPs), each having many computational cores. These cores are capable of performing ALU and other operations in a Single Instruction Multiple Thread (SIMT) fashion, which reduces the execution time drastically.

In the GPU programming model, some pieces of code are executed sequentially on the CPU, and some parts are executed in parallel by the GPU

TensorFlow possesses capabilities that let you take advantage of this programming model (if you have an NVIDIA GPU); the package version that supports GPUs requires Cuda Toolkit 7.0 and CUDNN 6.5 V2.

Note

For the installation of the Cuda environment, we suggest referring to the Cuda installation page: http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/#axzz49w1XvzNj

TensorFlow refers to these devices in the following way:

/cpu:0: To reference the server CPU
/gpu:0: The server GPU, if there is only one
/gpu:1: The second server GPU, and so on

To find out which device is assigned to our operations and tensors, we need to create the session with the log_device_placement option set to True.

Consider the following example.

We create a computational graph; a and b will be two matrices:

a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3],
                name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2],
                name='b')

In c we put the matrix multiplication of these two input tensors:

c = tf.matmul(a, b)

Then we build a session with log_device_placement set to True:

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Finally, we launch the session:

print sess.run(c)

You should see the following output:

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]

If you would like a particular operation to run on a device of your choice instead of what's automatically selected for you, you can use tf.device to create a device context, so that all the operations within that context will have the same device assignment.

Let's create the same computational graph using the tf.device instruction:

with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3],
                    name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2],
                    name='b')
c = tf.matmul(a, b)

Again, we build the session graph and launch it:

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print sess.run(c)

You will see that now a and b are assigned to cpu:0:

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]

If you have more than one GPU, you can select one directly via tf.device; in addition, setting allow_soft_placement to True in the configuration options when creating the session lets TensorFlow automatically fall back to a supported device when the specified one is unavailable.
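
A minimal sketch of this configuration follows; /gpu:1 is a hypothetical second GPU, and with allow_soft_placement enabled the code still runs on machines that lack it:

import tensorflow as tf

config = tf.ConfigProto(allow_soft_placement=True,
                        log_device_placement=True)
with tf.device('/gpu:1'):          # hypothetical second GPU
    a = tf.constant([1.0, 2.0], name='a')
    b = tf.constant([3.0, 4.0], name='b')
    c = a + b
sess = tf.Session(config=config)
print sess.run(c)  # falls back to an available device if /gpu:1 is missing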

TensorFlow Serving

Serving is a TensorFlow package that has been developed to take machine learning models into production systems. It means that a developer can use TensorFlow Serving's APIs to build a server to serve the implemented model.

The served model will be able to make inferences and predictions each time on data presented by its clients, allowing the developer to improve the model.

To communicate with the serving system, the clients use a high performance, open source remote procedure call (RPC) interface developed by Google, called gRPC.

The typical pipeline (see the following figure) is that training data is fed to the learner, which outputs a model. After being validated, it is ready to be deployed to the TensorFlow Serving system. It is quite common to launch and iterate on our model over time, as new data becomes available, or as you improve the model.

TensorFlow Serving pipeline

How to install TensorFlow Serving

To compile and use TensorFlow Serving, you need to set up some prerequisites.

Bazel

TensorFlow Serving requires Bazel 0.2.0 (http://www.bazel.io/) or higher. Download bazel-0.2.0-installer-linux-x86_64.sh.

Note

Bazel is a tool that automates software builds and tests. Supported build tasks include running compilers and linkers to produce executable programs and libraries, and assembling deployable packages.

Run the following commands:

chmod +x bazel-0.2.0-installer-linux-x86_64.sh
./bazel-0.2.0-installer-linux-x86_64.sh --user

Finally, set up your environment. Export this in your ~/.bashrc file:

export PATH="$PATH:$HOME/bin"

gRPC

Our tutorials use gRPC (0.13 or higher) as our RPC framework.

Note

You can find other references at https://github.com/grpc.

TensorFlow Serving dependencies

To install the TensorFlow Serving dependencies, execute the following:

sudo apt-get update && sudo apt-get install -y \
     build-essential \
     curl \
     git \
     libfreetype6-dev \
     libpng12-dev \
     libzmq3-dev \
     pkg-config \
     python-dev \
     python-numpy \
     python-pip \
     software-properties-common \
     swig \
     zip \
     zlib1g-dev

Then configure TensorFlow by running the following command:

cd tensorflow
./configure
cd ..

Install Serving

Use Git to clone the repository:

git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving

The --recurse-submodules option is required to fetch TensorFlow, gRPC, and the other libraries that TensorFlow Serving depends on. To build TensorFlow Serving, you must use Bazel:

bazel build tensorflow_serving/...

The binaries will be placed in the bazel-bin directory, and can be run using a command such as the following:

bazel-bin/tensorflow_serving/example/mnist_inference

Finally, you can test the installation by executing the following command:

bazel test tensorflow_serving/...

How to use TensorFlow Serving

In this tutorial, we will show how to export a trained TensorFlow model and build a server to serve the exported model. The implemented model is a Softmax Regression model for handwritten image classification (MNIST data).

The code consists of two parts:

A Python file (mnist_export.py) that trains and exports the model
A C++ file (mnist_inference.cc) that loads the exported model and runs a gRPC service to serve it

In the following sections, we report the basic steps to use TensorFlow Serving. For other references, you can view https://tensorflow.github.io/serving/serving_basic.

Training and exporting the TensorFlow model

As you can see in mnist_export.py, the training is done the same way as in the MNIST beginners tutorial; refer to the following link:

https://www.tensorflow.org/versions/r0.9/tutorials/mnist/beginners/index.html

The TensorFlow graph is launched in the TensorFlow session sess, with the input tensor (image) as x and the output tensor (Softmax score) as y. Then we use the TensorFlow Serving exporter to export the model; it builds a snapshot of the trained model so that it can be loaded later for inference. Let's now see the main functions used to export a trained model.

Import the exporter to serialize the model:

from tensorflow_serving.session_bundle import exporter

Then you must define saver, using the TensorFlow function tf.train.Saver with the sharded parameter equal to True:

saver = tf.train.Saver(sharded=True)

saver is used to serialize graph variable values to the model export so that they can be properly restored later.

The next step is to define model_exporter:

model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature \
            (input_tensor=x, scores_tensor=y)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)

model_exporter takes the following two arguments:

sess.graph.as_graph_def() is the protobuf of the graph. Exporting will serialize the protobuf to the model export so that the TensorFlow graph can be properly restored later.
default_graph_signature=signature specifies a model export signature. The signature specifies what type of model is being exported, and the input/output tensors to bind to when running inference. In this case, you use exporter.classification_signature to specify that the model is a classification model.

Finally, we create our export:

model_exporter.export(export_path, tf.constant \
                      (FLAGS.export_version), sess)

model_exporter.export takes the following arguments:

export_path is the path of the export directory. Export will create the directory if it does not exist.
tf.constant(FLAGS.export_version) is a tensor that specifies the version of the model. You should specify a larger integer value when exporting a newer version of the same model. Each version will be exported to a different sub-directory under the given path.
sess is the TensorFlow session that holds the trained model you are exporting.

Running a session

To export the model, first clear the export directory:

$> rm -rf /tmp/mnist_model

Then, using bazel, build the mnist_export example:

$> bazel build //tensorflow_serving/example:mnist_export

Finally, you can run the following example:

$> bazel-bin/tensorflow_serving/example/mnist_export /tmp/mnist_model
Training model...
Done training!
Exporting trained model to /tmp/mnist_model
Done exporting!

Looking in the export directory, we should have a sub-directory for each exported version of the model:

$> ls /tmp/mnist_model
00000001

The corresponding sub-directory has the default value of 1, because we specified tf.constant(FLAGS.export_version) as the model version earlier, and FLAGS.export_version has the default value of 1.

Each version sub-directory contains the following files:

export.meta is the serialized tensorflow::MetaGraphDef of the model. It includes the graph definition of the model, as well as metadata of the model, such as signatures.
export-?????-of-????? are files that hold the serialized variables of the graph.

$> ls /tmp/mnist_model/00000001
checkpoint export-00000-of-00001 export.meta

Loading and exporting a TensorFlow model

The C++ code for loading the exported TensorFlow model is in the main() function in mnist_inference.cc. Here we report an excerpt; we do not consider the parameters for batching. If you want to adjust the maximum batch size, timeout threshold, or the number of background threads used for batched inference, you can do so by setting more values in BatchingParameters:

int main(int argc, char** argv)
{
  SessionBundleConfig session_bundle_config;
  ... Here batching parameters
  std::unique_ptr<SessionBundleFactory> bundle_factory;
  TF_QCHECK_OK(
      SessionBundleFactory::Create(session_bundle_config,
                                   &bundle_factory));
  std::unique_ptr<SessionBundle> bundle(new SessionBundle);
  TF_QCHECK_OK(bundle_factory->CreateSessionBundle(bundle_path,
                                                   &bundle));
  ......
  RunServer(FLAGS_port, std::move(bundle));
  return 0;
}

SessionBundle is a component of TensorFlow Serving. Let's consider the include file SessionBundle.h:

struct SessionBundle {
  std::unique_ptr<tensorflow::Session> session;
  tensorflow::MetaGraphDef meta_graph_def;
};

The session parameter is a TensorFlow session that has the original graph with the necessary variables properly restored.

SessionBundleFactory::CreateSessionBundle() loads the exported TensorFlow model from bundle_path and creates a SessionBundle object for running inference with the model.

RunServer brings up a gRPC server that exports a single Classify() API.

Each inference request will be processed in the following steps:

1. Verify the input. The server expects exactly one MNIST-format image for each inference request.
2. Transform the input to an inference input tensor and create an output tensor placeholder.
3. Run inference.

To run an inference, you must type the following commands:

$> bazel build //tensorflow_serving/example:mnist_inference
$> bazel-bin/tensorflow_serving/example/mnist_inference --port=9000 /tmp/mnist_model/00000001

Test the server

To test the server, we use the mnist_client.py utility (https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_client.py).

This client downloads MNIST test data, sends it as requests to the server, and calculates the inference error rate.

To run it, type the following commands:

$> bazel build //tensorflow_serving/example:mnist_client
$> bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=localhost:9000
Inference error rate: 10.5%

The result confirms that the server loads and runs the trained model successfully. In fact, a 10.5% inference error rate on 1,000 images corresponds to about 89.5% accuracy for the trained Softmax model.

Summary

We described two important features of TensorFlow in this chapter. First was the possibility of using the programming model known as GPU computing, with which it becomes possible to speed up code (for example, the training phase of a neural network). The second part of the chapter was devoted to describing the TensorFlow Serving framework. It is a high performance, open source serving system for machine learning models, designed for production environments and optimized for TensorFlow. This powerful framework can run multiple large-scale models that change over time, based on real-world data, enabling a more efficient use of GPU resources and allowing developers to improve their own machine learning models.