
Getting Started with TensorFlow

Table of Contents

Getting Started with TensorFlow
Credits
About the Author
About the Reviewer
www.PacktPub.com
  eBooks, discount offers, and more
  Why subscribe?
Preface
  What this book covers
  What you need for this book
  Who this book is for
  Conventions
  Reader feedback
  Customer support
    Downloading the example code
    Downloading the color images of this book
    Errata
    Piracy
    Questions
1. TensorFlow – Basic Concepts
  Machine learning and deep learning basics
    Supervised learning
    Unsupervised learning
    Deep learning
  TensorFlow – A general overview
  Python basics
    Syntax
    Data types
    Strings
    Control flow
    Functions
    Classes
    Exceptions
    Importing a library
  Installing TensorFlow
    Installing on Mac or Linux distributions
    Installing on Windows
    Installation from source
    Testing your TensorFlow installation
  First working session
  Data Flow Graphs
  TensorFlow programming model
  How to use TensorBoard
  Summary
2. Doing Math with TensorFlow
  The tensor data structure
    One-dimensional tensors
    Two-dimensional tensors
      Tensor handling
    Three-dimensional tensors
  Handling tensors with TensorFlow
    Prepare the input data
  Complex numbers and fractals
    Prepare the data for Mandelbrot's set
    Build and execute the Data Flow Graph for Mandelbrot's set
    Visualize the result for Mandelbrot's set
    Prepare the data for Julia's set
    Build and execute the Data Flow Graph for Julia's set
    Visualize the result
  Computing gradients
  Random numbers
    Uniform distribution
    Normal distribution
    Generating random numbers with seeds
    Monte Carlo's method
  Solving partial differential equations
    Initial condition
    Model building
    Graph execution
    Computational function used
  Summary
3. Starting with Machine Learning
  The linear regression algorithm
    Data model
    Cost functions and gradient descent
    Testing the model
  The MNIST dataset
    Downloading and preparing the data
  Classifiers
    The nearest neighbor algorithm
    Building the training set
    Cost function and optimization
    Testing and algorithm evaluation
  Data clustering
    The k-means algorithm
    Building the training set
    Cost functions and optimization
    Testing and algorithm evaluation
  Summary
4. Introducing Neural Networks
  What are artificial neural networks?
  Neural network architectures
  Single Layer Perceptron
    The logistic regression
    TensorFlow implementation
      Building the model
      Launch the session
      Test evaluation
      Source code
  Multi Layer Perceptron
    Multi Layer Perceptron classification
      Build the model
      Launch the session
      Source code
    Multi Layer Perceptron function approximation
      Build the model
      Launch the session
  Summary
5. Deep Learning
  Deep learning techniques
  Convolutional neural networks
    CNN architecture
    TensorFlow implementation of a CNN
      Initialization step
      First convolutional layer
      Second convolutional layer
      Densely connected layer
      Readout layer
      Testing and training the model
      Launching the session
      Source code
  Recurrent neural networks
    RNN architecture
    LSTM networks
    NLP with TensorFlow
      Download the data
      Building the model
      Running the code
  Summary
6. GPU Programming and Serving with TensorFlow
  GPU programming
  TensorFlow Serving
    How to install TensorFlow Serving
      Bazel
      gRPC
      TensorFlow Serving dependencies
      Install Serving
    How to use TensorFlow Serving
      Training and exporting the TensorFlow model
      Running a session
      Loading and exporting a TensorFlow model
      Test the server
  Summary

Getting Started with TensorFlow

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2016

Production reference: 1190716

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78646-857-4

www.packtpub.com

Credits

Author
Giancarlo Zaccone

Copy Editor
Alpha Singh

Reviewer
Jayani Withanawasam

Project Coordinator
Shweta H Birwatkar

Commissioning Editor
Veena Pagare

Proofreader
Safis Editing

Acquisition Editor
Vinay Argekar

Indexer
Mariammal Chettiyar

Content Development Editor
Sumeet Sawant

Production Coordinator
Nilesh Mohite

Technical Editor
Deepti Tuscano

Cover Work
Nilesh Mohite

About the Author

Giancarlo Zaccone has more than 10 years of experience managing research projects in both the scientific and industrial domains. He worked as a researcher at the C.N.R., the National Research Council, where he was involved in projects related to parallel numerical computing and scientific visualization.

Currently, he is a senior software engineer at a consulting company, developing and maintaining software systems for space and defence applications.

Giancarlo holds a master's degree in physics from the Federico II University of Naples and a second-level postgraduate master's degree in scientific computing from La Sapienza University of Rome.

He is also the author of the Packt book Python Parallel Programming Cookbook.

You can contact him at https://it.linkedin.com/in/giancarlozaccone

About the Reviewer

Jayani Withanawasam is a senior software engineer in the Research and Development team at Zaizi Asia. She is the author of the book Apache Mahout Essentials, on scalable machine learning. She was a speaker at Alfresco Summit 2014 in London, where her talk covered applications of machine learning techniques in smart enterprise content management (ECM) solutions. She presented her research "Content Extraction and Context Inference based Information Retrieval" at the Women in Machine Learning (WiML) 2015 workshop, which was co-located with the Neural Information Processing Systems (NIPS) 2015 conference in Montreal, Canada.

Jayani is currently pursuing an MSc in Artificial Intelligence at the University of Moratuwa, Sri Lanka. She has strong research interests in machine learning and computer vision.

You can contact her at https://lk.linkedin.com/in/jayaniwithanawasam

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Preface

TensorFlow is an open source software library used to implement machine learning and deep learning systems.

Behind these two names are hidden a series of powerful algorithms that share a common challenge: to allow a computer to learn how to automatically recognize complex patterns and make the smartest decisions possible.

Machine learning algorithms are supervised or unsupervised; simplifying as much as possible, we can say that the biggest difference is that in supervised learning the programmer instructs the computer how to do something, whereas in unsupervised learning the computer learns all by itself.

Deep learning is instead a newer area of machine learning research, introduced with the objective of moving machine learning closer to artificial intelligence goals. This means that deep learning algorithms try to operate like the human brain.

With the aim of conducting research in these fascinating areas, the Google team developed TensorFlow, which is the subject of this book.

To introduce TensorFlow's programming features, we have used the Python programming language. Python is fun and easy to use; it is a true general-purpose language and is quickly becoming a must-have tool in the arsenal of any self-respecting programmer.

It is not the aim of this book to completely describe all TensorFlow objects and methods; instead, we will introduce the important system concepts and lead you up the learning curve as fast and efficiently as we can. Each chapter of the book presents a different aspect of TensorFlow, accompanied by several programming examples that reflect typical issues of machine and deep learning.

Although it is large and complex, TensorFlow is designed to be easy to use once you learn about its basic design and programming methodology.

The purpose of Getting Started with TensorFlow is to help you do just that.

Enjoy reading!

What this book covers

Chapter 1, TensorFlow – Basic Concepts, contains general information on the structure of TensorFlow and the issues for which it was developed. It also provides basic programming guidelines for the Python language and a first TensorFlow working session after the installation procedure. The chapter ends with a description of TensorBoard, a powerful tool for optimization and debugging.

Chapter 2, Doing Math with TensorFlow, describes the mathematical processing abilities of TensorFlow. It covers programming examples from basic algebra up to partial differential equations. Also, the basic data structure in TensorFlow, the tensor, is explained.

Chapter 3, Starting with Machine Learning, introduces some machine learning models. We start by implementing the linear regression algorithm, which is concerned with modeling relationships between data. The main focus of the chapter is on solving two basic problems in machine learning: classification, that is, how to assign each new input to one of the possible given categories; and data clustering, which is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.

Chapter 4, Introducing Neural Networks, provides a quick and detailed introduction to neural networks. These are mathematical models that represent the interconnection between elements, the artificial neurons. They are mathematical constructs that to some extent mimic the properties of living neurons. Neural networks build the foundation on which the architecture of deep learning algorithms rests. Two basic types of neural nets are then implemented: the Single Layer Perceptron and the Multi Layer Perceptron, both for classification problems.

Chapter 5, Deep Learning, gives an overview of deep learning algorithms. Only in recent years has deep learning collected a large number of results considered unthinkable a few years ago. We'll show how to implement two fundamental deep learning architectures, convolutional neural networks (CNN) and recurrent neural networks (RNN), for image recognition and speech translation problems respectively.

Chapter 6, GPU Programming and Serving with TensorFlow, shows the TensorFlow facilities for GPU computing and introduces TensorFlow Serving, a high-performance open source serving system for machine learning models, designed for production environments and optimized for TensorFlow.

What you need for this book

All the examples have been implemented using Python version 2.7 on an Ubuntu Linux 64-bit machine, together with the TensorFlow library version 0.7.1.

You will also need the following Python modules (preferably the latest versions):

Pip
Bazel
Matplotlib
NumPy
Pandas

Who this book is for

The reader should have a basic knowledge of programming and math concepts and, at the same time, want to be introduced to the topics of machine and deep learning. After reading this book, you will be able to master TensorFlow's features to build powerful applications.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The instructions for flow control are if, for, and while."

Any command-line input or output is written as follows:

>>> myvar = 3
>>> myvar += 2
>>> myvar
5
>>> myvar -= 1
>>> myvar
4

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "The shortcuts in this book are based on the Mac OS X 10.5+ scheme."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book - what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail feedback@packtpub.com, and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Getting-Started-with-TensorFlow. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from http://www.packtpub.com/sites/default/files/downloads/GettingStartedwithTensorFlow_ColorImages.pdf

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books - maybe a mistake in the text or the code - we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at questions@packtpub.com, and we will do our best to address the problem.

Chapter 1. TensorFlow – Basic Concepts

In this chapter, we'll cover the following topics:

Machine learning and deep learning basics
TensorFlow – A general overview
Python basics
Installing TensorFlow
First working session
Data Flow Graphs
TensorFlow programming model
How to use TensorBoard

Machine learning and deep learning basics

Machine learning is a branch of artificial intelligence, and more specifically of computer science, which deals with the study of systems and algorithms that can learn from data, synthesizing new knowledge from it.

The word learn intuitively suggests that a system based on machine learning may, on the basis of the observation of previously processed data, improve its knowledge in order to achieve better results in the future, or provide output closer to the desired output for that particular system.

The ability of a program or a system based on machine learning to improve its performance in a particular task, thanks to past experience, is strongly linked to its ability to recognize patterns in the data. This theme, called pattern recognition, is therefore of vital importance and of increasing interest in the context of artificial intelligence; it is the basis of all machine learning techniques.

The training of a machine learning system can be done in different ways:

Supervised learning
Unsupervised learning

Supervised learning

Supervised learning is the most common form of machine learning. With supervised learning, a set of examples, the training set, is submitted as input to the system during the training phase, where each example is labeled with the respective desired output value. For example, let's consider a classification problem, where the system must attribute some experimental observations to one of N different classes already known. In this problem, the training set is presented as a sequence of pairs of the type {(X1, Y1), ..., (Xn, Yn)}, where Xi are the input vectors (feature vectors) and Yi represents the desired class for the corresponding input vector. Most supervised learning algorithms share one characteristic: training is performed by minimizing a particular loss function (cost function), which represents the output error with respect to the desired output.

The cost function most used for this type of training calculates the mean squared error between the desired output and the one supplied by the system. After training, the accuracy of the model is measured on a set of examples disjoint from the training set, the so-called validation set.
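To make the mean squared error concrete, the following minimal sketch (not one of the book's examples; it assumes the TensorFlow 0.x API used throughout this book) computes the cost between a vector of desired outputs and a vector of predictions:

import tensorflow as tf

# Desired outputs (labels) and the outputs supplied by the system
desired = tf.constant([1.0, 0.0, 1.0, 1.0])
predicted = tf.constant([0.9, 0.2, 0.8, 0.6])

# Mean squared error: the average of the squared differences
mse = tf.reduce_mean(tf.square(desired - predicted))

with tf.Session() as session:
    print(session.run(mse))  # 0.0625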

Supervised learning workflow

In this phase, the model's generalization capability is verified: we test whether the output is correct for an input that was not used during the training phase.

Unsupervised learning

In unsupervised learning, the training examples provided to the system are not labeled with the class they belong to. The system, therefore, develops and organizes the data, looking for common characteristics among them, and changing them based on its internal knowledge.

Unsupervised learning algorithms are particularly used in clustering problems, in which a number of input examples are present, you do not know the class a priori, and you do not even know what the possible classes are, or how numerous they are. This is a clear case where you cannot use supervised learning, because you do not know a priori the number of classes.

Unsupervised learning workflow

Deep learning

Deep learning techniques represent a remarkable step forward taken by machine learning in recent decades, having provided results never seen before in many applications, such as image and speech recognition or Natural Language Processing (NLP). There are several reasons that led to deep learning being developed and placed at the center of the field of machine learning only in recent decades. One reason, perhaps the main one, is surely represented by progress in hardware, with the availability of new processors, such as graphics processing units (GPUs), which have greatly reduced the time needed for training networks, lowering it by a factor of 10 or 20. Another reason is certainly the ever more numerous datasets on which to train a system, needed to train architectures of a certain depth and with a high dimensionality for the input data.

Deep learning workflow

Deep learning is based on the way the human brain processes information and learns, responding to external stimuli. It consists of a machine learning model at several levels of representation, in which the deeper levels take as input the outputs of the previous levels, transforming them and always abstracting more. Each level in this hypothetical model corresponds to a different area of the cerebral cortex: when the brain receives images, it processes them through various stages such as edge detection and form perception, that is, from a primitive representation level to the most complex. For example, in an image classification problem, each block gradually extracts the features, at various levels of abstraction, taking as input data already processed by means of filtering operations.

TensorFlow – A general overview

TensorFlow (https://www.tensorflow.org/) is a software library, developed by the Google Brain Team within Google's Machine Learning Intelligence research organization, for the purposes of conducting machine learning and deep neural network research. TensorFlow combines computational algebra with compilation optimization techniques, easing the calculation of many mathematical expressions where the problem is the time required to perform the computation.

The main features include:

Defining, optimizing, and efficiently calculating mathematical expressions involving multi-dimensional arrays (tensors).
Programming support for deep neural networks and machine learning techniques.
Transparent use of GPU computing, automating the management and optimization of the memory and of the data used. You can write the same code and run it either on CPUs or GPUs. More specifically, TensorFlow will figure out which parts of the computation should be moved to the GPU (see the sketch after this list).
High scalability of computation across machines and huge data sets.
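As a quick illustration of this transparent device placement, the following is a minimal sketch (not from the book; it assumes a GPU-enabled TensorFlow build and the 0.x API used in this book) that pins one multiplication to the first GPU, while everything outside the block is placed automatically:

import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0])
b = tf.constant([4.0, 5.0, 6.0])

# Explicitly request the first GPU for this operation;
# without tf.device, TensorFlow decides the placement itself
with tf.device('/gpu:0'):
    c = a * b

# log_device_placement prints where each operation actually runs
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
    print(session.run(c))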

TensorFlow homepage

TensorFlow is available with Python and C++ support, and we shall use Python 2.7 for learning, as indeed the Python API is better supported and much easier to learn. The Python installation depends on your system; the download page (https://www.python.org/downloads/) contains all the information needed for its installation. In the next section, we explain very briefly the main features of the Python language, with some programming examples.

Python basics

Python is a strongly typed and dynamically typed language (data types exist but it is not necessary to declare them explicitly), case-sensitive (var and VAR are two different variables), and object-oriented (everything in Python is an object).

Syntax

In Python, a line terminator is not required, and blocks are specified with indentation. Indent to begin a block and remove indentation to conclude it, that's all. Instructions that require an indented block end with a colon (:). Comments begin with the hash sign (#) and are single-line. Strings on multiple lines are used for multi-line comments. Assignments are accomplished with the equal sign (=). For equality tests we use the double equal (==) symbol. You can increase and decrease a value by using += and -= followed by the addend. This works with many data types, including strings. You can assign and use multiple variables on the same line.

Following are some examples:

>>> myvar = 3
>>> myvar += 2
>>> myvar
5
>>> myvar -= 1
>>> myvar
4

"""This is a comment"""

>>> mystring = "Hello"
>>> mystring += " world."
>>> print mystring
Hello world.

The following code swaps two variables in one line:

>>> myvar, mystring = mystring, myvar

Data types

The most significant structures in Python are lists, tuples, and dictionaries. Sets have been integrated into Python since version 2.5 (for previous versions, they are available in the sets library). Lists are similar to one-dimensional arrays, but you can create lists that contain other lists. Dictionaries are arrays that contain pairs of keys and values (hash tables), and tuples are immutable one-dimensional objects. In Python, arrays can be of any type, so you can mix integers, strings, and so on in your lists/dictionaries and tuples. The index of the first object in any type of array is always zero. Negative indices are allowed and count from the end of the array; -1 is the last element. Variables can refer to functions.

>>> example = [1, ["list1", "list2"], ("one", "tuple")]
>>> mylist = ["Element 1", 2, 3.14]
>>> mylist[0]
"Element 1"
>>> mylist[-1]
3.14
>>> mydict = {"Key 1": "Val 1", 2: 3, "pi": 3.14}
>>> mydict["pi"]
3.14
>>> mytuple = (1, 2, 3)
>>> myfunc = len
>>> print myfunc(mylist)
3

You can get an array range using a colon (:). Not specifying the starting index of the range implies the first element; not indicating the final index implies the last element. Negative indices count from the last element (-1 is the last element). Then run the following commands:

>>> mylist = ["first element", 2, 3.14]
>>> print mylist[:]
['first element', 2, 3.1400000000000001]
>>> print mylist[0:2]
['first element', 2]
>>> print mylist[-3:-1]
['first element', 2]
>>> print mylist[1:]
[2, 3.14]

Strings

Python strings are indicated either with a single quotation mark (') or a double one ("), and you are allowed to use one notation within a string delimited by the other ("He said 'hello'." is valid). Strings of multiple lines are enclosed in triple (double or single) quotes ("""). Python supports unicode; just use the syntax u"This is a unicode string". To insert values into a string, use the % operator (modulo) and a tuple. Each % is replaced by a tuple element, from left to right, and you are also permitted to use a dictionary for the replacements.

>>>print"Nome:%s\nNumber:%s\nString:%s"%(myclass.nome,3,3*"-

")

Name:Poromenos

Number:3

String:---

strString="""thisisastring

onmultiplelines."""

>>>print"This%(verbo)sun%(name)s."%{"name":"test","verb":

"is"}

Thisisatest.

Control flow

The instructions for flow control are if, for, and while. There is no switch control flow; in its place, we use if. The for control flow is used to enumerate the members of a list. To get a list of numbers, you use range(number).

rangelist = range(10)
>>> print rangelist
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Let's check if number is one of the numbers in the tuple:

for number in rangelist:
    if number in (3, 4, 7, 9):
        # "break" ends the for instruction without running the else clause
        break
    else:
        # "continue" continues with the next iteration of the loop
        continue
else:
    # this is an optional "else",
    # executed only if the loop is not interrupted with "break"
    pass  # it does nothing

if rangelist[1] == 2:
    print "the second element (lists are 0-based) is 2"
elif rangelist[1] == 3:
    print "the second element is 3"
else:
    print "I don't know"

while rangelist[1] == 1:
    pass

Functions

Functions are declared with the keyword def. Any optional arguments must be declared after those that are mandatory and must have a value assigned. When calling functions by naming their arguments, you must also pass a value. Functions can return a tuple (tuple unpacking enables the return of multiple values). Lambda functions are in-line. Parameters are passed by reference, but immutable types (tuples, integers, strings, and so on) cannot be changed in the function. This happens because only the position of the element in memory is passed, and assigning another object to the variable results in the loss of the earlier object reference.

For example:

# equal to a def f(x): return x + 1
funzionevar = lambda x: x + 1
>>> print funzionevar(1)
2

def passing_example(my_list, my_int):
    my_list.append("new element")
    my_int = 4
    return my_list, my_int

>>> input_my_list = [1, 2, 3]
>>> input_my_int = 10
>>> print passing_example(input_my_list, input_my_int)
([1, 2, 3, 'new element'], 10)
>>> input_my_list
[1, 2, 3, 'new element']
>>> input_my_int
10

Classes

Python supports multiple inheritance of classes. Variables and private methods are declared by convention (it is not a rule of the language) by preceding them with two underscores (__). We can assign attributes (properties) to arbitrary instances of a class.

The following is an example:

class Myclass:
    common = 10
    def __init__(self):
        self.myvariable = 3
    def myfunc(self, arg1, arg2):
        return self.myvariable

# We create an instance of the class
>>> instance = Myclass()
>>> instance.myfunc(1, 2)
3

# This variable is shared by all instances
>>> instance2 = Myclass()
>>> instance.common
10
>>> instance2.common
10

# Note here how we use the class name
# instead of the instance
>>> Myclass.common = 30
>>> instance.common
30
>>> instance2.common
30

# This does not update the variable in the class;
# instead, it assigns a new object to the variable
# of the first instance
>>> instance.common = 10
>>> instance.common
10
>>> instance2.common
30

>>> Myclass.common = 50
# The value is not changed for "instance", because "common" is now
# an instance variable there, while "instance2" still sees the class variable
>>> instance.common
10
>>> instance2.common
50

# This class inherits from Myclass. Multiple inheritance
# is declared like this:
# class AnotherClass(Myclass1, Myclass2, MyclassN)
class AnotherClass(Myclass):
    # The "self" argument is passed automatically
    # and makes reference to the instance of the class, so you can set
    # instance variables as above, but from inside the class
    def __init__(self, arg1):
        self.myvariable = 3
        print arg1

>>> instance = AnotherClass("hello")
hello
>>> instance.myfunc(1, 2)
3

# This class does not have a member (property) .test, but
# we can add one to an instance whenever we want. Note that
# .test will be a member of only one instance
>>> instance.test = 10
>>> instance.test
10

Exceptions

Exceptions in Python are handled with try-except [exception_name] blocks:

def my_func():
    try:
        # Division by zero causes an exception
        10 / 0
    except ZeroDivisionError:
        print "Oops, error"
    else:
        # no exception, let's proceed
        pass
    finally:
        # This code is executed when the block
        # try..except is already executed and all exceptions
        # were handled, even if there is a new
        # exception directly in the block
        print "finish"

>>> my_func()
Oops, error
finish

Importing a library

External libraries are imported with import [libraryname]. You can also use the form from [libraryname] import [funcname] to import individual features. Here's an example:

import random
from time import clock

randomint = random.randint(1, 100)
>>> print randomint
64

Installing TensorFlow

The TensorFlow Python API supports Python 2.7 and Python 3.3+. The GPU version (Linux only) requires the Cuda Toolkit >= 7.0 and cuDNN >= v2.

When working in a Python environment, it is recommended you use virtualenv. It will isolate your Python configuration for different projects; using virtualenv will not overwrite existing versions of Python packages required by TensorFlow.

Installing on Mac or Linux distributions

The following are the steps to install TensorFlow on Mac and Linux systems:

1. First, install pip and virtualenv (optional) if they are not already installed:

For Ubuntu/Linux 64-bit:

$ sudo apt-get install python-pip python-dev python-virtualenv

For Mac OS X:

$ sudo easy_install pip
$ sudo pip install --upgrade virtualenv

2. Then you can create a virtual environment. The following command creates a virtual environment in the ~/tensorflow directory:

$ virtualenv --system-site-packages ~/tensorflow

3. The next step is to activate the virtual environment as follows:

$ source ~/tensorflow/bin/activate.csh
(tensorflow)$

4. Henceforth, the name of the environment we're working in precedes the command line. Once activated, pip is used to install TensorFlow within it.

For Ubuntu/Linux 64-bit, CPU:

(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

For Mac OS X, CPU:

(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.5.0-py2-none-any.whl

If you want to use your GPU card with TensorFlow, then install another package. I recommend you visit the official documentation to see if your GPU meets the specifications required to support TensorFlow.

Note

To enable your GPU with TensorFlow, you can refer to https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#optional-linux-enable-gpu-support for a complete description.

Finally, when you've finished, you must deactivate the virtual environment:

(tensorflow)$ deactivate

Note

Given the introductory nature of this book, I suggest the reader visit the download and setup TensorFlow page at https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#download-and-setup to find more information about other ways to install TensorFlow.

Installing on Windows

If you can't get a Linux-based system, you can install Ubuntu on a virtual machine; just use a free application called VirtualBox, which lets you create a virtual PC on Windows and install Ubuntu in it. So you can try the operating system without creating partitions or dealing with cumbersome procedures.

Note

After installing VirtualBox, you can install Ubuntu (www.ubuntu.com) and then follow the installation for Linux machines to install TensorFlow.

Installation from source

It may happen that the pip installation causes problems, particularly when using the visualization tool TensorBoard (see https://github.com/tensorflow/tensorflow/issues/530). To fix this problem, I suggest you build and install TensorFlow starting from the source files, through the following steps:

1. Clone the TensorFlow repository:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow

2. Install Bazel (dependencies and installer), following the instructions at http://bazel.io/docs/install.html.

3. Run the Bazel installer:

chmod +x bazel-version-installer-os.sh
./bazel-version-installer-os.sh --user

4. Install the Python dependencies:

sudo apt-get install python-numpy swig python-dev

5. Configure (GPU or no GPU?) your installation in the downloaded TensorFlow repository:

./configure

6. Create your own TensorFlow pip package using bazel:

bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

7. To build with GPU support, use bazel build -c opt --config=cuda followed again by //tensorflow/tools/pip_package:build_pip_package.

8. Finally, install TensorFlow; the name of the .whl file will depend on your platform:

pip install /tmp/tensorflow_pkg/tensorflow-0.7.1-py2-none-linux_x86_64.whl

9. Good luck!

Note

Please refer to https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#installation-for-linux for further information.

Testing your TensorFlow installation

Open a terminal and type the following lines of code:

>>> import tensorflow as tf
>>> hello = tf.constant("Hello TensorFlow!")
>>> sess = tf.Session()

To verify your installation, just type:

>>> print(sess.run(hello))

You should have the following output:

Hello TensorFlow!
>>>

First working session

Finally, it is time to move from theory to practice. I will use the Python 2.7 IDE to write all the examples. To get an initial idea of how to use TensorFlow, open the Python editor and write the following lines of code:

x = 1
y = x + 9
print(y)

import tensorflow as tf
x = tf.constant(1, name='x')
y = tf.Variable(x + 9, name='y')
print(y)

As you can easily understand, in the first three lines the constant x, set equal to 1, is added to 9 to set the new value of the variable y, and then the end result of the variable y is printed on the screen.

In the last four lines, we have translated the first three lines according to the TensorFlow library.

If we run the program, we have the following output:

10
<tensorflow.python.ops.variables.Variable object at 0x7f30ccbf9190>

The TensorFlow translation of the first three lines of the program example produces a different result. Let's analyze them:

1. The following statement should never be missed if you want to use the TensorFlow library. It tells us that we are importing the library and calling it tf:

import tensorflow as tf

2. We create a constant value called x, with a value equal to one:

x = tf.constant(1, name='x')

3. Then we create a variable called y. This variable is defined with the simple equation y = x + 9:

y = tf.Variable(x + 9, name='y')

4. Finally, print out the result:

print(y)

So how do we explain the different result? The difference lies in the variable definition. In fact, the variable y doesn't represent the current value of x + 9; instead it means: when the variable y is computed, take the value of the constant x and add 9 to it. This is the reason why the value of y has never been carried out. In the next section, I'll try to fix it.

So we open the Python IDE and enter the following lines:
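import tensorflow as tf

x = tf.constant(1, name='x')
y = tf.Variable(x + 9, name='y')
model = tf.initialize_all_variables()

with tf.Session() as session:
    session.run(model)
    print(session.run(y))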

Running the preceding code, the output result is finally as follows:

10

We have removed the print instruction for y's object, but we have initialized the model variables:

model = tf.initialize_all_variables()

And, mostly, we have created a session for computing values. In the next step, we run the model created previously, and finally run just the variable y and print out its current value:

with tf.Session() as session:
    session.run(model)
    print(session.run(y))

This is the magic trick that permits the correct result. In this fundamental step, the execution graph, called the Data Flow Graph, is created in the session, with all the dependencies between the variables. The y variable depends on the variable x, and that value is transformed by adding 9 to it. The value is not computed until the session is executed.

This last example introduced another important feature in TensorFlow, the Data Flow Graph.

Data Flow Graphs

A machine learning application is the result of the repeated computation of complex mathematical expressions. In TensorFlow, a computation is described using a Data Flow Graph, where each node in the graph represents the instance of a mathematical operation (multiply, add, divide, and so on), and each edge is a multi-dimensional data set (a tensor) on which the operations are performed.

TensorFlow supports these constructs and these operators. Let's see in detail how nodes and edges are managed by TensorFlow:

Node: In TensorFlow, each node represents the instantiation of an operation. Each operation has zero or more inputs and zero or more outputs.
Edges: In TensorFlow, there are two types of edge:

Normal Edges: They are carriers of data structures (tensors), where an output of one operation (from one node) becomes the input for another operation.
Special Edges: These edges are not data carriers between the output of a node (operator) and the input of another node. A special edge indicates a control dependency between two nodes. Let's suppose we have two nodes A and B and a special edge connecting A to B; it means that B will start its operation only when the operation in A ends. Special edges are used in the Data Flow Graph to set the happens-before relationship between operations on the tensors (a minimal sketch of such a dependency follows this list).
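As an illustration of a special edge, the following minimal sketch (not from the book; it assumes the TensorFlow 0.x API used throughout) forces an assignment to run only after another operation has completed, using tf.control_dependencies:

import tensorflow as tf

a = tf.Variable(1.0, name='a')
b = tf.Variable(0.0, name='b')

increment_a = tf.assign(a, a + 1.0)

# The special edge: updating b happens only after increment_a has run
with tf.control_dependencies([increment_a]):
    update_b = tf.assign(b, a * 10.0)

with tf.Session() as session:
    session.run(tf.initialize_all_variables())
    print(session.run(update_b))  # a is incremented first, so b becomes 20.0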

Let's explore some features of the Data Flow Graph in greater detail:

Operation: This represents an abstract computation, such as adding or multiplying matrices. An operation manages tensors, and it can be polymorphic: the same operation can manipulate different tensor element types. For example, the addition of two int32 tensors, the addition of two float tensors, and so on.
Kernel: This represents the concrete implementation of an operation. A kernel defines the implementation of the operation on a particular device. For example, an add matrix operation can have a CPU implementation and a GPU one.
Session: When the client program has to establish communication with the TensorFlow runtime system, a session must be created. As soon as the session is created for a client, an initial graph is created, and it is empty. It has two fundamental methods:

session.extend: In a computation, the user can extend the execution graph, requesting to add more operations (nodes) and edges (data).
session.run: Using TensorFlow, sessions are created with some graphs, and these full graphs are executed to get some outputs, or sometimes subgraphs are executed thousands/millions of times using run invocations. Basically, the method runs the execution graph to provide outputs.

Features in the Data Flow Graph

TensorFlow programming model

Adopting a Data Flow Graph as the execution model, you divide the data flow design (graph building and data flow) from its execution (on CPUs, GPU cards, or a combination), using a single programming interface that hides all the complexities. It also defines what the programming model should be like in TensorFlow.

Let's consider the simple problem of multiplying two integers, namely a and b.

The following are the steps required for this simple problem (the complete program is assembled after these steps):

1. Define and initialize the variables. Each variable should define the state of a current execution. After importing the TensorFlow module in Python:

import tensorflow as tf

2. We define the variables a and b involved in the computation. These are defined via a more basic structure, called the placeholder:

a = tf.placeholder("int32")
b = tf.placeholder("int32")

3. A placeholder allows us to create our operations and to build our computation graph without needing the data.

4. Then we use these variables as inputs for TensorFlow's function mul:

y = tf.mul(a, b)

This function will return the result of the multiplication of the input integers a and b.

5. Manage the execution flow; this means that we must build a session:

sess = tf.Session()

6. Visualize the results. We run our model on the variables a and b, feeding data into the Data Flow Graph through the placeholders previously defined:

print sess.run(y, feed_dict={a: 2, b: 5})
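Putting the steps together, a minimal sketch of the whole program (using the TensorFlow 0.x API of this book) looks like this:

import tensorflow as tf

# Placeholders let us build the graph without the data
a = tf.placeholder("int32")
b = tf.placeholder("int32")

# The multiplication node of the Data Flow Graph
y = tf.mul(a, b)

# The data is fed in only at execution time
sess = tf.Session()
print sess.run(y, feed_dict={a: 2, b: 5})  # prints 10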

How to use TensorBoard

TensorBoard is a visualization tool, devoted to analyzing the Data Flow Graph and also to better understanding machine learning models. It can graphically view different types of statistics about the parameters and details of any part of a computation graph. It often happens that a graph of computation can be very complex. A deep neural network can have up to 36,000 nodes. For this reason, TensorBoard collapses nodes into high-level blocks, highlighting the groups with identical structures. Doing so allows a better analysis of the graph, focusing only on the core sections of the computation graph. Also, the visualization process is interactive; the user can pan, zoom, and expand the nodes to display the details.

The following figure shows a neural network model with TensorBoard:

A TensorBoard visualization example

TensorBoard's algorithms collapse nodes into high-level blocks and highlight groups with the same structures, while also separating out high-degree nodes. The visualization tool is also interactive: the users can pan, zoom in, expand, and collapse the nodes.

TensorBoard is equally useful in the development and tuning of a machine learning model. For this reason, TensorFlow lets you insert so-called summary operations into the graph. These summary operations monitor changing values (during the execution of a computation), which are written to a log file. Then TensorBoard is configured to watch this log file with summary information and display how this information changes over time.
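As a small illustration of a summary operation (a sketch of my own, assuming the TensorFlow 0.x API of this book, where scalar summaries are created with tf.scalar_summary), the following monitors a value as it changes across steps:

import tensorflow as tf

value = tf.Variable(0.0, name="value")
increment = tf.assign(value, value + 1.0)

# The summary operation records "value" at each step
tf.scalar_summary("value", value)
merged = tf.merge_all_summaries()

with tf.Session() as session:
    writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)
    session.run(tf.initialize_all_variables())
    for step in range(10):
        session.run(increment)
        summary_str = session.run(merged)
        writer.add_summary(summary_str, step)  # TensorBoard plots value vs. step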

Let's consider a basic example to understand the usage of TensorBoard. We have the following example:

import tensorflow as tf

a = tf.constant(10, name="a")
b = tf.constant(90, name="b")
y = tf.Variable(a + b * 2, name="y")
model = tf.initialize_all_variables()

with tf.Session() as session:
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)
    session.run(model)
    print(session.run(y))

That gives the following result:

190

Let's look at the session management. The first instruction to consider is as follows:

merged = tf.merge_all_summaries()

This instruction merges all the summaries collected in the default graph.

Then we create a SummaryWriter. It will write all the summaries (in this case, the execution graph) obtained from the code's execution into the /tmp/tensorflowlogs directory:

writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)

Finally, we run the model and so build the Data Flow Graph:

session.run(model)
print(session.run(y))

The use of TensorBoard is very simple. Let's open a terminal and enter the following:

$ tensorboard --logdir=/tmp/tensorflowlogs

A message such as the following should appear:

Starting TensorBoard on port 6006

Then, by opening a web browser, we should display the Data Flow Graph with auxiliary nodes:

Data Flow Graph display with TensorBoard

Now we will be able to explore the Data Flow Graph:

Explore the Data Flow Graph display with TensorBoard

TensorBoard uses special icons for constants and summary nodes. To summarize, we report in the next figure the table of node symbols displayed:

Node symbols in TensorBoard

Summary

In this chapter, we introduced the main topics: machine learning and deep learning. While machine learning explores the study and construction of algorithms that can learn from, and make predictions on, data, deep learning is based precisely on the way the human brain processes information and learns, responding to external stimuli.

In this vast scientific research and practical application area, we can firmly place the TensorFlow software library, developed by Google's research group for artificial intelligence (the Google Brain Project) and released as open source software on November 9, 2015.

After electing the Python programming language as the development tool for our examples and applications, we saw how to install and compile the library, and then carried out a first working session. This allowed us to introduce the execution model of TensorFlow and the Data Flow Graph. It led us to define what our programming model should be.

The chapter ended with an example of how to use an important tool for debugging machine learning applications: TensorBoard.

In the next chapter, we will continue our journey into the TensorFlow library, with the intention of showing its versatility. Starting from the fundamental concept, tensors, we will see how to use the library for purely math applications.

Chapter 2. Doing Math with TensorFlow

In this chapter, we will cover the following topics:

The tensor data structure
Handling tensors with TensorFlow
Complex numbers and fractals
Computing derivatives
Random numbers
Solving partial differential equations

The tensor data structure

Tensors are the basic data structures in TensorFlow. As we have already said, they represent the connecting edges in a Data Flow Graph. A tensor simply identifies a multidimensional array or list.

It can be identified by three parameters, rank, shape, and type (the sketch after this list shows how to inspect each one):

rank: Each tensor is described by a unit of dimensionality called rank. It identifies the number of dimensions of the tensor. For this reason, a rank is also known as the order or n-dimensions of a tensor (for example, a rank 2 tensor is a matrix and a rank 1 tensor is a vector).
shape: The shape of a tensor is the number of rows and columns it has.
type: It is the data type assigned to the tensor's elements.
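These three attributes can be inspected directly in TensorFlow; the following minimal sketch (not from the book, assuming the 0.x API used here) does so for a rank 2 tensor:

import tensorflow as tf

matrix = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

with tf.Session() as session:
    print(session.run(tf.rank(matrix)))   # 2: it is a matrix
    print(session.run(tf.shape(matrix)))  # [2 3]: two rows, three columns
    print(matrix.dtype)                   # <dtype: 'float32'>: the element type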

Now, let's get comfortable with this fundamental data structure. To build a tensor, we can:

Build an n-dimensional array; for example, by using the NumPy library
Convert the n-dimensional array into a TensorFlow tensor

Once we obtain the tensor, we can handle it using the TensorFlow operators. The following figure provides a visual explanation of the concepts introduced:

Visualization of multidimensional tensors

One-dimensional tensors

To build a one-dimensional tensor, we use the NumPy array(s) command, where s is a Python list:

>>> import numpy as np
>>> tensor_1d = np.array([1.3, 1, 4.0, 23.99])

Unlike in a Python list, the commas between the elements are not shown:

>>> print tensor_1d
[  1.3    1.     4.    23.99]

The indexing is the same as for Python lists. The first element has position 0, the third element has position 2, and so on:

>>> print tensor_1d[0]
1.3
>>> print tensor_1d[2]
4.0

Finally, you can view the basic attributes of the tensor, such as its rank:

>>> tensor_1d.ndim
1

The tuple of the tensor's dimensions is as follows:

>>> tensor_1d.shape
(4L,)

The tensor's shape has just four values in a row.

The data type in the tensor:

>>> tensor_1d.dtype
dtype('float64')

Now, let's see how to convert a NumPy array into a TensorFlow tensor:

import tensorflow as tf

The TensorFlow function tf.convert_to_tensor converts Python objects of various types to tensor objects. It accepts tensor objects, NumPy arrays, Python lists, and Python scalars:

tf_tensor = tf.convert_to_tensor(tensor_1d, dtype=tf.float64)

Running the session, we can visualize the tensor and its elements as follows:

with tf.Session() as sess:
    print sess.run(tf_tensor)
    print sess.run(tf_tensor[0])
    print sess.run(tf_tensor[2])

That gives the following results:

>>>
[  1.3    1.     4.    23.99]
1.3
4.0
>>>

Two-dimensional tensors

To create a two-dimensional tensor, or matrix, we again use array(s), but s will be a sequence of arrays:

>>> import numpy as np
>>> tensor_2d = np.array([(1, 2, 3, 4), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)])
>>> print tensor_2d
[[ 1  2  3  4]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
>>>

A value in tensor_2d is identified by the expression tensor_2d[row, col], where row is the row position and col is the column position:

>>> tensor_2d[3][3]
15

You can also use the slice operator : to extract a submatrix:

>>> tensor_2d[0:2, 0:2]
array([[1, 2],
       [4, 5]])

In this case, we extracted a 2×2 submatrix, containing rows 0 and 1, and columns 0 and 1 of tensor_2d. TensorFlow has its own slice operator. In the next subsection, we will see how to use it.

Tensor handling

Let's see how we can apply some more complex operations to these data structures. Consider the following code:

1. Import the libraries:

import tensorflow as tf
import numpy as np

2. Let's build two integer arrays. These represent two 3×3 matrices:

matrix1 = np.array([(2, 2, 2), (2, 2, 2), (2, 2, 2)], dtype='int32')
matrix2 = np.array([(1, 1, 1), (1, 1, 1), (1, 1, 1)], dtype='int32')

3. Visualize them:

print "matrix1 ="
print matrix1
print "matrix2 ="
print matrix2

4. To use these matrices in our TensorFlow environment, they must be transformed into a tensor data structure:

matrix1 = tf.constant(matrix1)
matrix2 = tf.constant(matrix2)

5. We used the TensorFlow constant operator to perform the transformation.

6. The matrices are ready to be manipulated with TensorFlow operators. In this case, we calculate a matrix multiplication and a matrix sum:

matrix_product = tf.matmul(matrix1, matrix2)
matrix_sum = tf.add(matrix1, matrix2)

7. The following matrix will be used to compute a matrix determinant:

matrix_3 = np.array([(2, 7, 2), (1, 4, 2), (9, 0, 2)], dtype='float32')
print "matrix3 ="
print matrix_3

matrix_det = tf.matrix_determinant(matrix_3)

8. It's time to create our graph and run the session, with the tensors and operators created:

with tf.Session() as sess:
    result1 = sess.run(matrix_product)
    result2 = sess.run(matrix_sum)
    result3 = sess.run(matrix_det)

9. The results will be printed out by running the following commands:

print "matrix1 * matrix2 ="
print result1
print "matrix1 + matrix2 ="
print result2
print "matrix3 determinant result ="
print result3

The following figure shows the results after running the code:

TensorFlow provides numerous math operations on tensors. The following table summarizes them:

TensorFlow operator   Description
tf.add                Returns the sum
tf.sub                Returns the subtraction
tf.mul                Returns the multiplication
tf.div                Returns the division
tf.mod                Returns the modulo
tf.abs                Returns the absolute value
tf.neg                Returns the negative value
tf.sign               Returns the sign
tf.inv                Returns the inverse
tf.square             Returns the square
tf.round              Returns the nearest integer
tf.sqrt               Returns the square root
tf.pow                Returns the power
tf.exp                Returns the exponential
tf.log                Returns the logarithm
tf.maximum            Returns the maximum
tf.minimum            Returns the minimum
tf.cos                Returns the cosine
tf.sin                Returns the sine

Three-dimensional tensors

The following commands build a three-dimensional tensor:

>>> import numpy as np
>>> tensor_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> print tensor_3d
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
>>>

The three-dimensional tensor created is a 2x2x2 matrix:

>>> tensor_3d.shape
(2L, 2L, 2L)

To retrieve an element from a three-dimensional tensor, we use an expression of the following form:

tensor_3d[plane, row, col]

following these settings:

Matrix 3×3 representation

So all four elements in the first plane are identified by the value of the variable plane equal to zero:

>>> tensor_3d[0, 0, 0]
1
>>> tensor_3d[0, 0, 1]
2
>>> tensor_3d[0, 1, 0]
3
>>> tensor_3d[0, 1, 1]
4

Three-dimensional tensors allow us to introduce the next topic, linked to the manipulation of images, but more generally they introduce us to simple transformations on tensors.

Handling tensors with TensorFlow

TensorFlow is designed to handle tensors of all sizes, and offers operators that can be used to manipulate them. In this example, in order to see some array manipulations, we are going to work with a digital image. As you probably know, a color digital image is an MxNx3 size matrix (a third-order tensor), whose components correspond to the red, green, and blue components in the image (RGB space). This means that each feature in the rectangular box of the RGB image is specified by three coordinates, i, j, and k.

The RGB tensor

The first thing I want to show you is how to load an image, and then how to extract a sub-image from the original, using the TensorFlow slice operator.

Prepare the input data

Using the imread command in matplotlib, we import a digital image in a standard color format (JPG, BMP, TIF):

import matplotlib.image as mp_image
filename = "packt.jpeg"
input_image = mp_image.imread(filename)

Now we can see the rank and the shape of the tensor:

print 'input dim = {}'.format(input_image.ndim)
print 'input shape = {}'.format(input_image.shape)

You'll see the output, which is (80, 144, 3). This means the image is 80 pixels high, 144 pixels wide, and 3 colors deep.

Finally, using matplotlib, it is possible to visualize the imported image:

import matplotlib.pyplot as plt
plt.imshow(input_image)
plt.show()

The starting image

In this example, slice is a two-dimensional segment of the starting image, where each pixel has the RGB components, so we need a placeholder to store all the values of the slice:

import tensorflow as tf
my_image = tf.placeholder("uint8", [None, None, 3])

For the last dimension, we'll need only three values. Then we use the TensorFlow operator slice to create a sub-image:

slice = tf.slice(my_image, [10, 0, 0], [16, -1, -1])

The last step is to build a TensorFlow working session:

with tf.Session() as session:
    result = session.run(slice, feed_dict={my_image: input_image})
    print(result.shape)

plt.imshow(result)
plt.show()

The resulting shape is then as the following image shows:

The input image after the slice

In this next example, we will perform a geometric transformation of the input image, using the transpose operator:

import tensorflow as tf

We associate the input image with a variable we call x:

x = tf.Variable(input_image, name='x')

We then initialize our model:

model = tf.initialize_all_variables()

Next, we build up the session in which we run our code:

with tf.Session() as session:

To perform the transpose of our matrix, we use the transpose function of TensorFlow. This method performs a swap between axes 0 and 1 of the input matrix, while the z axis is left unchanged:

    x = tf.transpose(x, perm=[1, 0, 2])
    session.run(model)
    result = session.run(x)

plt.imshow(result)
plt.show()

The result is the following:

The transposed image

Complex numbers and fractals

First of all, let's look at how Python handles complex numbers. It is a simple matter. For example, to set x = 5 + 4j in Python, we write the following:

>>> x = 5. + 4j

This means that x is equal to 5+4j.

At the same time, you can write the following:

>>> x = complex(5, 4)
>>> x
(5+4j)

We also note the following:

Python uses j to denote √-1, instead of the i used in math. If you put a number before the j, Python will consider it an imaginary number; otherwise, it is a variable. This means that if you want to write the imaginary number i, you must write 1j rather than j.

To get the real and imaginary parts of a Python complex number, you can use the following:

>>> x.real
5.0
>>> x.imag
4.0
>>>

We turn now to our problem, namely how to display fractals with TensorFlow. The Mandelbrot set is one of the most famous fractals. A fractal is a geometric object whose structure is repeated at different scales. Fractals are very common in nature; an example is the coast of Great Britain.

The Mandelbrot set is defined for the complex numbers c for which the following succession is bounded:

Z(n+1) = Z(n)² + c, where Z(0) = 0

The set is named after its creator, Benoît Mandelbrot, a Polish-born mathematician famous for his work on fractals. However, he was able to give a shape or graphic representation to the Mandelbrot set only with the help of computer programming. In 1985, he published in Scientific American the first algorithm to calculate the Mandelbrot set. The algorithm (for each complex point Z) is:

1. Z has an initial value equal to 0, Z(0) = 0.
2. Choose the complex number c as the current point. In the Cartesian plane, the abscissa axis (horizontal line) represents the real part, while the axis of ordinates (vertical line) represents the imaginary part of c.
3. Iterate Z(n+1) = Z(n)² + c; stop when the modulus of Z(n) is larger than the maximum radius.

Now we will see, through simple steps, how to translate this algorithm using TensorFlow.

Prepare the data for Mandelbrot's set

Import the necessary libraries for our example:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

We build a complex grid that will contain our Mandelbrot set. The region of the complex plane is between -2 and +1 on the real axis and between -1.3j and +1.3j on the imaginary axis. Each pixel location in the image will represent a different complex value, z:

Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
Z = X + 1j * Y
c = tf.constant(Z.astype(np.complex64))

Then we define the TensorFlow data structures, that is, the tensors that contain all the data to be included in the calculation. We define two variables. The first is the one on which we will carry out our iteration. It has the same dimensions as the complex grid, but it is declared as a variable, that is, its values will change in the course of the calculation:

zs = tf.Variable(c)

The next variable is initialized to zero. It also has the same size as the variable zs:

ns = tf.Variable(tf.zeros_like(c, tf.float32))

Build and execute the Data Flow Graph for Mandelbrot's set

Instead of introducing a session, we instantiate an InteractiveSession():

sess = tf.InteractiveSession()

It requires, as we shall see, the Tensor.eval() and Operation.run() methods. Then we initialize all the variables involved through the run() method:

tf.initialize_all_variables().run()

Start the iteration:

zs_ = zs * zs + c

Define the stop condition of the iteration:

not_diverged = tf.complex_abs(zs_) < 4

Then we use the group operator, which groups multiple operations:

step = tf.group(zs.assign(zs_), \
                ns.assign_add(tf.cast(not_diverged, tf.float32)))

The first operation is the step iteration Z(n+1) = Z(n)² + c to create a new value.

The second operation adds 1 to the corresponding element of ns for every point that has not yet diverged. When this op finishes, all ops in its input have finished. This operator has no output.

Then we run the operator for two hundred steps:

for i in range(200): step.run()

Visualize the result for Mandelbrot's set

The result will be the tensor ns.eval(). Using matplotlib, let's visualize the result:

plt.imshow(ns.eval())
plt.show()

The Mandelbrot set

Of course, the Mandelbrot set is not the only fractal we can visualize. Julia sets are fractals that have been named after Gaston Maurice Julia for his work in this field. Their building process is very similar to that used for the Mandelbrot set.

Prepare the data for Julia's set

Let's define the output complex plane. It is between -2 and +2 on the real axis and between -2j and +2j on the imaginary axis:

Y, X = np.mgrid[-2:2:0.005, -2:2:0.005]

And the current point location:

Z = X + 1j * Y

The definition of the Julia set requires redefining Z as a constant tensor:

Z = tf.constant(Z.astype("complex64"))

Thus, the input tensors supporting our calculation are as follows:

zs = tf.Variable(Z)
ns = tf.Variable(tf.zeros_like(Z, "float32"))

Build and execute the Data Flow Graph for Julia's set

As in the previous example, we create our own interactive session:

sess = tf.InteractiveSession()

We then initialize the input tensors:

tf.initialize_all_variables().run()

To compute the new values of the Julia set, we will use the iterative formula Z(n+1) = Z(n)² - c, where the initial point c will be equal to the imaginary number 0.75i:

c = complex(0.0, 0.75)
zs_ = zs * zs - c

The grouping operator and the stop iteration condition will be the same as in the Mandelbrot computation:

not_diverged = tf.complex_abs(zs_) < 4
step = tf.group(zs.assign(zs_), \
                ns.assign_add(tf.cast(not_diverged, "float32")))

Finally, we run the operator for two hundred steps:

for i in range(200): step.run()

Visualize the result

To visualize the result, run the following commands:

plt.imshow(ns.eval())
plt.show()

The Julia set

Computing gradients

TensorFlow has functions to solve other, more complex tasks. For example, we will use a mathematical operator that calculates the derivative of y with respect to its parameter x. For this purpose, we use the tf.gradients() function.

Let us consider the math function y = 2x². We want to compute the gradient dy/dx at x = 1 (since dy/dx = 4x, at x = 1 the gradient is 4). The following is the code to compute this gradient:

1. First, import the TensorFlow library:

import tensorflow as tf

2. The x variable is the independent variable of the function:

x = tf.placeholder(tf.float32)

3. Let's build the function:

y = 2 * x * x

4. Finally, we call the tf.gradients() function with y and x as arguments:

var_grad = tf.gradients(y, x)

5. To evaluate the gradient, we must build a session:

with tf.Session() as session:

6. The gradient will be evaluated on the variable x = 1:

    var_grad_val = session.run(var_grad, feed_dict={x: 1})

7. The var_grad_val value is the result, to be printed:

    print(var_grad_val)

8. That gives the following result:

>>
[4.0]
>>

Random numbers

The generation of random numbers is essential in machine learning and in training algorithms. When random numbers are generated by a computer, they are generated by a Pseudo Random Number Generator (PRNG). The term pseudo comes from the fact that the computer only executes a deterministic, logically programmed sequence of instructions, which can merely simulate randomness. Despite this logical limitation, computers are very efficient at generating random numbers. TensorFlow provides operators to create random tensors with different distributions.

Uniform distribution

Generally, when we need to work with random numbers, we want values that repeat with approximately the same frequency, uniformly distributed. The TensorFlow operator random_uniform provides values between minval and maxval, all with the same probability. Its signature is as follows:

random_uniform(shape, minval, maxval, dtype, seed, name)

We import the TensorFlow library and matplotlib to display the results:

import tensorflow as tf
import matplotlib.pyplot as plt

The uniform variable is a one-dimensional tensor of 100 elements, containing values ranging from 0 to 1, distributed with the same probability:

uniform = tf.random_uniform([100], minval=0, maxval=1, dtype=tf.float32)

In a session, we evaluate the tensor uniform, using the eval() operator:

with tf.Session() as session:
    print uniform.eval()
    plt.hist(uniform.eval(), normed=True)
    plt.show()

As you can see, all the intermediate values between 0 and 1 have approximately the same frequency. This behavior is called uniform distribution. The result of the execution is as follows:

Uniform distribution

Normal distribution

In some specific cases, you may instead need to generate random numbers that cluster around a central value. In this case, we use the normal distribution of random numbers, also called the Gaussian distribution, in which values close to the mean (here 0) are the most likely to be drawn, while values towards the margins of the range have a very low chance of being drawn; the spread is controlled by the standard deviation. The following is the implementation with TensorFlow:

import tensorflow as tf
import matplotlib.pyplot as plt

norm = tf.random_normal([100], mean=0, stddev=2)

with tf.Session() as session:
    plt.hist(norm.eval(), normed=True)
    plt.show()

We created a 1d tensor of shape [100] consisting of random normal values, with mean equal to 0 and standard deviation equal to 2, using the operator tf.random_normal. The following is the result:

Normal distribution

Generating random numbers with seeds

We recall that our sequence is pseudo-random, because the values are calculated using a deterministic algorithm and probability plays no real role. The seed is just a starting point for the sequence, and if you start from the same seed you will end up with the same sequence. This is very useful, for example, to debug your code: when you are searching for an error in a program, you must be able to reproduce the problem, even though every run would otherwise be different.

Consider the following example where we have two uniform distributions:

uniform_with_seed = tf.random_uniform([1], seed=1)
uniform_without_seed = tf.random_uniform([1])

In the first uniform distribution, we set seed=1. This means that, repeatedly evaluating the two distributions, the first uniform distribution will always generate the same sequence of values:

print("First Run")

with tf.Session() as first_session:
    print("uniform with (seed=1) = {}"\
        .format(first_session.run(uniform_with_seed)))
    print("uniform with (seed=1) = {}"\
        .format(first_session.run(uniform_with_seed)))
    print("uniform without seed = {}"\
        .format(first_session.run(uniform_without_seed)))
    print("uniform without seed = {}"\
        .format(first_session.run(uniform_without_seed)))

print("Second Run")

with tf.Session() as second_session:
    print("uniform with (seed=1) = {}"\
        .format(second_session.run(uniform_with_seed)))
    print("uniform with (seed=1) = {}"\
        .format(second_session.run(uniform_with_seed)))
    print("uniform without seed = {}"\
        .format(second_session.run(uniform_without_seed)))
    print("uniform without seed = {}"\
        .format(second_session.run(uniform_without_seed)))

As you can see, this is the end result. The uniform distribution with seed=1 always gives the same sequence of values:

>>>
First Run
uniform with (seed=1) = [0.23903739]
uniform with (seed=1) = [0.22267115]
uniform without seed = [0.92157185]
uniform without seed = [0.43226039]
Second Run
uniform with (seed=1) = [0.23903739]
uniform with (seed=1) = [0.22267115]
uniform without seed = [0.50188708]
uniform without seed = [0.21324408]
>>>

Monte Carlo's method

We end the section on random numbers with a brief note about the Monte Carlo method. It is a numerical probabilistic method widely used in high-performance scientific computing. In our example, we will calculate the value of π:

import tensorflow as tf
import matplotlib.pyplot as plt

trials = 100
hits = 0

Generate pseudo-random points inside the square [-1,1]×[-1,1], using the random_uniform function:

x = tf.random_uniform([1], minval=-1, maxval=1, dtype=tf.float32)
y = tf.random_uniform([1], minval=-1, maxval=1, dtype=tf.float32)
pi = []

Start the session:

sess = tf.Session()

Inside the session, we calculate the value of π: the area of the unit circle is π and that of the enclosing square is 4. The ratio between the number of points falling inside the circle and the total number of generated points must therefore converge (very slowly) to π/4, so we count how many points satisfy the circle inequality x² + y² < 1 and multiply by 4:

with sess.as_default():
    for i in range(1, trials):
        for j in range(1, trials):
            if x.eval()**2 + y.eval()**2 < 1:
                hits = hits + 1
        pi.append((4*float(hits)/i)/trials)

plt.plot(pi)
plt.show()

The figure shows the convergence of the estimate to the π value as the number of trials grows
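For comparison, the same estimate can be computed in one shot with a vectorized NumPy sketch (an illustration added here; it bypasses TensorFlow entirely):

import numpy as np

n = 100000
# n pseudo-random points in the square [-1,1]x[-1,1]
points = np.random.uniform(-1.0, 1.0, size=(n, 2))
# fraction of points falling inside the unit circle, times 4
inside = (points ** 2).sum(axis=1) < 1.0
print(4.0 * inside.mean())  # prints a value close to 3.14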

Solving partial differential equations

A partial differential equation (PDE) is a differential equation involving partial derivatives of an unknown function of several independent variables. PDEs are commonly used to formulate and solve major physical problems in various fields, from quantum mechanics to financial markets. In this section, we take the example from https://www.TensorFlow.org/versions/r0.8/tutorials/pdes/index.html, showing the use of TensorFlow in a two-dimensional PDE solution that models the surface of a square pond with a few raindrops landing on it. The effect will be to produce two-dimensional waves on the pond itself. We won't concentrate on the computational aspects of the problem, as this is beyond the scope of this book; instead we will focus on using TensorFlow to define the problem.

The starting point is to import these fundamental libraries:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

Initial condition

First we have to define the dimensions of the problem. Let's imagine that our pond is a 500x500 square:

N = 500

The following two-dimensional tensor is the pond at time t = 0, that is, the initial condition of our problem:

u_init = np.zeros([N, N], dtype=np.float32)

We add 40 random raindrops on it:

for n in range(40):
    a, b = np.random.randint(0, N, 2)
    u_init[a, b] = np.random.uniform()

np.random.randint(0, N, 2) is a NumPy function that returns an array of two random integers between 0 and N, used here as the coordinates of a raindrop.

Using matplotlib, we can show the initial square pond (note that at this point only the NumPy array u_init exists; the TensorFlow variable U is defined later):

plt.imshow(u_init)
plt.show()

Zooming in on the pond in its initial condition: the colored dots represent the fallen raindrops

Then we define the following tensor:

ut_init = np.zeros([N, N], dtype=np.float32)

It holds the temporal evolution of the pond: at the final time t = t_end it will contain the final state of the pond.

Model building

We must define some fundamental parameters (using TensorFlow placeholders), starting with the time step of the simulation:

eps = tf.placeholder(tf.float32, shape=())

We must also define a physical parameter of the model, namely the damping coefficient:

damping = tf.placeholder(tf.float32, shape=())

Then we redefine our starting tensors as TensorFlow variables, since their values will change over the course of the simulation:

U = tf.Variable(u_init)
Ut = tf.Variable(ut_init)

Finally, we build our PDE model. It represents the evolution in time of the pond after the raindrops have fallen:

U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)

As you can see, we introduced the laplace(U) function to resolve the PDE (it will be described in the last part of this section).

Using the TensorFlow group operator, we define how our pond should evolve in time:

step = tf.group(
    U.assign(U_),
    Ut.assign(Ut_))

Let's recall that the group operator groups multiple operations as a single one.
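To make the update rule concrete, here is a minimal NumPy sketch of a single time step (an illustration added here; it uses a plain 5-point Laplacian with periodic wrap-around boundaries via np.roll, rather than the 9-point kernel defined at the end of this section):

import numpy as np

def numpy_step(U, Ut, eps=0.03, damping=0.04):
    # 5-point Laplacian with periodic (wrap-around) boundaries
    lap = (np.roll(U, 1, axis=0) + np.roll(U, -1, axis=0) +
           np.roll(U, 1, axis=1) + np.roll(U, -1, axis=1) - 4.0 * U)
    # same update rule as the TensorFlow graph above
    U_new = U + eps * Ut
    Ut_new = Ut + eps * (lap - damping * Ut)
    return U_new, Ut_new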

Graph execution

In our session we will see the evolution in time of the pond over 1000 steps, where each time step is equal to 0.03 s, while the damping coefficient is set equal to 0.04.

Let's initialize the TensorFlow variables:

tf.initialize_all_variables().run()

Then we run the simulation (clear_output, which refreshes the figure between frames, comes from IPython.display and is useful when running in a notebook):

for i in range(1000):
    step.run({eps: 0.03, damping: 0.04})
    if i % 50 == 0:
        clear_output()
        plt.imshow(U.eval())
        plt.show()

Every 50 steps the simulation result will be displayed as follows:

The pond after 400 simulation steps

Computational function used

Let's now see what the laplace(U) function and the ancillary functions used look like:

def make_kernel(a):
    a = np.asarray(a)
    a = a.reshape(list(a.shape) + [1, 1])
    return tf.constant(a, dtype=tf.float32)

def simple_conv(x, k):
    x = tf.expand_dims(tf.expand_dims(x, 0), -1)
    y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
    return y[0, :, :, 0]

def laplace(x):
    laplace_k = make_kernel([[0.5, 1.0, 0.5],
                             [1.0, -6., 1.0],
                             [0.5, 1.0, 0.5]])
    return simple_conv(x, laplace_k)

These functions describe the physics of the model, that is, how the wave is created and propagates in the pond. We will not go into the details of these functions, the understanding of which is beyond the scope of this book.

The following figure shows the waves on the pond after the raindrops have fallen.

Zooming in on the pond

Summary

In this chapter, we looked at some of the mathematical potential of TensorFlow. Starting from the fundamental definition of a tensor, the basic data structure for any type of computation, we saw with some examples how to handle these data structures using TensorFlow's math operators. Using complex numbers, we explored the world of fractals. Then we introduced the concept of random numbers; these are in fact used in machine learning for model development and testing. The chapter ended with an example of defining and solving a mathematical problem using partial differential equations.

In the next chapter, we'll finally start to see TensorFlow in action in the field for which it was developed: machine learning, solving complex problems such as classification and data clustering.

Chapter 3. Starting with Machine Learning

In this chapter, we will cover the following topics:

Linear regression
The MNIST dataset
Classifiers
The nearest neighbor algorithm
Data clustering
The k-means algorithm

The linear regression algorithm

In this section, we begin our exploration of machine learning techniques with the linear regression algorithm. Our goal is to build a model by which to predict the values of a dependent variable from the values of one or more independent variables.

The relationship between these two variables is linear; that is, if y is the dependent variable and x the independent one, then the linear relationship between the two variables will look like this: y = Ax + b.

The linear regression algorithm adapts to a great variety of situations; for its versatility, it is used extensively in the field of applied sciences, for example, biology and economics.

Furthermore, the implementation of this algorithm allows us to introduce in a totally clear and understandable way the two important concepts of machine learning: the cost function and the gradient descent algorithm.

Data model

The first crucial step is to build our data model. We mentioned earlier that the relationship between our variables is linear, that is: y = Ax + b, where A and b are constants. To test our algorithm, we need data points in a two-dimensional space.

We start by importing the Python library NumPy:

import numpy as np

Then we define the number of points we want to draw:

number_of_points = 500

We initialize the following two lists:

x_point = []
y_point = []

These lists will contain the generated points.

We then set the two constants that will appear in the linear relation of y with x:

a = 0.22
b = 0.78

Via NumPy's random.normal function, we generate the random points around the regression equation y = 0.22x + 0.78:

for i in range(number_of_points):
    x = np.random.normal(0.0, 0.5)
    y = a*x + b + np.random.normal(0.0, 0.1)
    x_point.append([x])
    y_point.append([y])

Finally, view the generated points with matplotlib:

import matplotlib.pyplot as plt

plt.plot(x_point, y_point, 'o', label='Input Data')
plt.legend()
plt.show()

Linear regression: the data model

Cost functions and gradient descent

The machine learning algorithm that we want to implement with TensorFlow must predict values of y as a function of the x data, according to our data model. The linear regression algorithm will determine the values of the constants A and b (fixed for our data model), which are then the true unknowns of the problem.

The first step is to import the tensorflow library:

import tensorflow as tf

Then define the A and b unknowns, using the TensorFlow tf.Variable:

A = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

The unknown factor A was initialized using a random value between -1 and 1, while the variable b is initially set to zero:

b = tf.Variable(tf.zeros([1]))

So we write the linear relationship that binds y to x:

y = A * x_point + b

Now we introduce the cost function: a function that takes the pair of values A and b to be determined and returns a value estimating how good those parameters are. In this example, our cost function is the mean squared error:

cost_function = tf.reduce_mean(tf.square(y - y_point))

It provides an estimate of the variability of the measures, or more precisely, of the dispersion of values around the average value; a small value of this function corresponds to a better estimate for the unknown parameters A and b.
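To see the mean squared error in numbers, consider this tiny NumPy sketch (an illustration added here, separate from the TensorFlow graph):

import numpy as np

y_model = np.array([1.0, 2.0, 3.0])   # values predicted by the line
y_data = np.array([1.1, 1.9, 3.2])    # observed values
# mean of the squared residuals, as in tf.reduce_mean(tf.square(y - y_point))
print(np.mean((y_model - y_data) ** 2))  # prints ~0.02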

To minimize cost_function, we use the gradient descent optimization algorithm. Given a mathematical function of several variables, gradient descent allows us to find a local minimum of this function. The technique is as follows:

Evaluate, at an arbitrary first point of the function's domain, the function itself and its gradient. The gradient indicates the direction in which the function tends to a minimum.
Select a second point in the direction indicated by the gradient. If the function for this second point has a value lower than the value calculated at the first point, the descent can continue.

You can refer to the following figure for a visual explanation of the algorithm:

The gradient descent algorithm

We also remark that gradient descent only finds a local minimum of the function, but it can also be used in the search for a global minimum, by randomly choosing a new start point once a local minimum has been found and repeating the process many times. If the number of minima of the function is limited, and there is a very high number of attempts, then there is a good chance that sooner or later the global minimum will be identified.
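The following minimal sketch (an illustration added here, in plain Python) performs these steps on the one-variable function f(x) = (x − 3)², whose gradient is 2(x − 3); the iterates slide down towards the minimum at x = 3:

learning_rate = 0.1
x = 0.0                            # arbitrary first point of the domain
for step in range(50):
    grad = 2 * (x - 3)             # gradient of f(x) = (x - 3)**2
    x = x - learning_rate * grad   # move against the gradient
print(x)                           # close to 3.0, the minimum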

Using TensorFlow, the application of this algorithm is very simple. The instructions are as follows:

optimizer = tf.train.GradientDescentOptimizer(0.5)

Here 0.5 is the learning rate of the algorithm.

The learning rate determines how fast or slow we move towards the optimal weights. If it is very large, we skip past the optimal solution, and if it is too small, we need too many iterations to converge to the best values.

An intermediate value (0.5) is provided here, but it must be tuned in order to improve the performance of the entire procedure.

We define train as the result of applying the optimizer to the cost_function, through its minimize function:

train = optimizer.minimize(cost_function)

Testing the model

Now we can test the gradient descent algorithm on the data model we created earlier. As usual, we have to initialize all the variables:

model = tf.initialize_all_variables()

So we build our iteration (20 computation steps), allowing us to determine the best values of A and b, which define the line that best fits the data model. Instantiate the evaluation graph:

with tf.Session() as session:

We perform the simulation on our model:

    session.run(model)
    for step in range(0, 21):

For each iteration, we execute the optimization step:

        session.run(train)

Every five steps, we print our pattern of dots:

        if (step % 5) == 0:
            plt.plot(x_point, y_point, 'o',
                     label='step = {}'
                     .format(step))

And the straight lines are obtained by the following command:

            plt.plot(x_point,
                     session.run(A) *
                     x_point +
                     session.run(b))

plt.legend()
plt.show()

The following figure shows the convergence of the implemented algorithm:

Linear regression: start computation (step = 0)

After just five steps, we can already see (in the next figure) a substantial improvement in the fit of the line:

Linear regression: situation after 5 computation steps

The following (and final) figure shows the definitive result after 20 steps. We can see the efficiency of the algorithm used, with the straight line fitting almost perfectly across the cloud of points.

Linear regression: final result

Finally, to further our understanding, we report the complete code:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

number_of_points = 200
x_point = []
y_point = []
a = 0.22
b = 0.78

for i in range(number_of_points):
    x = np.random.normal(0.0, 0.5)
    y = a*x + b + np.random.normal(0.0, 0.1)
    x_point.append([x])
    y_point.append([y])

plt.plot(x_point, y_point, 'o', label='Input Data')
plt.legend()
plt.show()

A = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
B = tf.Variable(tf.zeros([1]))
y = A * x_point + B

cost_function = tf.reduce_mean(tf.square(y - y_point))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(cost_function)

model = tf.initialize_all_variables()

with tf.Session() as session:
    session.run(model)
    for step in range(0, 21):
        session.run(train)
        if (step % 5) == 0:
            plt.plot(x_point, y_point, 'o',
                     label='step = {}'
                     .format(step))
            plt.plot(x_point,
                     session.run(A) *
                     x_point +
                     session.run(B))

plt.legend()
plt.show()

The MNIST dataset

The MNIST dataset (available at http://yann.lecun.com/exdb/mnist/) is widely used for training and testing in the field of machine learning, and we will use it in the examples of this book. It contains black and white images of handwritten digits from 0 to 9.

The dataset is divided into two groups: 60,000 images to train the model and an additional 10,000 to test it. The original images, in black and white, were normalized to fit into a box of size 28×28 pixels and centered by calculating the center of mass of the pixels. The following figure represents how the digits are represented in the MNIST dataset:

MNIST digit sampling

Each MNIST data point is an array of numbers describing how dark each pixel is. For example, for the following digit (the digit 1), we could have:

Pixel representation of the digit 1

Downloading and preparing the data

The following code imports the MNIST data files that we are going to classify. I am using a script from Google that can be downloaded from:

https://github.com/tensorflow/tensorflow/blob/r0.7/tensorflow/examples/tutorials/mnist/input_data.py

This must be run in the same folder where the files are located.

Now we will show how to load and display the data:

import input_data
import numpy as np
import matplotlib.pyplot as plt

Using input_data, we load the datasets:

mnist_images = input_data.read_data_sets\
    ("MNIST_data/",\
    one_hot=False)

train.next_batch(10) returns the first 10 images:

pixels, real_values = mnist_images.train.next_batch(10)

It also returns two lists: the matrix of the pixels loaded and the list that contains the real values loaded:

print "list of values loaded", real_values

example_to_visualize = 5

print "element N° " + str(example_to_visualize + 1)\
    + " of the list plotted"

>>
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
list of values loaded [7 3 4 6 1 8 1 0 9 8]
element N° 6 of the list plotted
>>

While displaying an element, we can use matplotlib, as follows:

image = pixels[example_to_visualize,:]
image = np.reshape(image, [28, 28])
plt.imshow(image)
plt.show()

Here is the result:

An MNIST image of the number eight

Classifiers

In the context of machine learning, the term classification identifies an algorithmic procedure that assigns each new input datum (instance) to one of the possible categories (classes). If we consider only two classes, we talk about binary classification; otherwise we have a multi-class classification.

Classification falls into the supervised learning category, which permits us to classify new instances based on the so-called training set. The basic steps to follow to resolve a supervised classification problem are as follows:

1. Build the training examples in order to represent the actual context and application on which to accomplish the classification.
2. Choose the classifier and the corresponding algorithm implementation.
3. Train the algorithm on the training set and set any control parameters through validation.
4. Evaluate the accuracy and performance of the classifier by applying it to a set of new instances (the test set).

The nearest neighbor algorithm

The K-nearest neighbor (KNN) is a supervised learning algorithm for both classification and regression. It is a system that assigns the class of the sample tested according to its distance from the objects stored in memory.

The distance d is defined as the Euclidean distance between two points:

d(x, y) = √( (x₁ − y₁)² + ... + (xₙ − yₙ)² )

Here n is the dimension of the space. The advantage of this method of classification is the ability to classify objects whose classes are not linearly separable. It is a stable classifier, given that small perturbations of the training data do not significantly affect the results obtained. The most obvious disadvantage, however, is that it does not provide a true mathematical model; instead, every new classification must be carried out by adding the new datum to all the initial instances and repeating the calculation procedure for the selected K value.

Moreover, it requires a fairly high amount of data to make realistic predictions and is sensitive to the noise of the analyzed data.
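Before the TensorFlow version, a toy nearest-neighbor classification in NumPy may help fix the idea (an illustration added here; it uses the Euclidean distance above, while the implementation below uses the L1 distance between pixel vectors):

import numpy as np

train = np.array([[0.0, 0.0], [5.0, 5.0]])  # two stored samples
labels = np.array([0, 1])                   # their classes
test = np.array([4.0, 4.5])                 # sample to classify

# Euclidean distance from the test point to every stored sample
d = np.sqrt(((train - test) ** 2).sum(axis=1))
print(labels[np.argmin(d)])  # prints 1, the class of the nearest sample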

In the next example, we will implement the KNN algorithm using the MNIST dataset.

Building the training set

Let's start with the libraries needed for the simulation:

import numpy as np
import tensorflow as tf
import input_data

To construct the data model for the training set, use the input_data.read_data_sets function, introduced earlier:

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

In our example, the training phase will consist of 100 MNIST images:

train_pixels, train_list_values = mnist.train.next_batch(100)

While we test our algorithm on 10 images:

test_pixels, test_list_of_values = mnist.test.next_batch(10)

Finally, we define the tensors train_pixel_tensor and test_pixel_tensor we use to construct our classifier:

train_pixel_tensor = tf.placeholder\
    ("float", [None, 784])
test_pixel_tensor = tf.placeholder\
    ("float", [784])

Cost function and optimization

The cost function is the distance in terms of pixels (note that this implementation actually uses the L1 distance, the sum of the absolute pixel differences):

distance = tf.reduce_sum\
    (tf.abs\
    (tf.add(train_pixel_tensor,\
    tf.neg(test_pixel_tensor))),\
    reduction_indices=1)

The tf.reduce_sum function computes the sum of elements across the dimensions of a tensor. For example (from the TensorFlow online manual):

# 'x' is [[1, 1, 1]
#         [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6

Finally, to minimize the distance function, we use arg_min, which returns the index with the smallest distance (the nearest neighbor):

pred = tf.arg_min(distance, 0)

Testing and algorithm evaluation

Accuracy is a parameter that helps us to compute the final result of the classifier:

accuracy = 0

Initialize the variables:

init = tf.initialize_all_variables()

Start the simulation:

with tf.Session() as sess:
    sess.run(init)
    for i in range(len(test_list_of_values)):

Then we evaluate the nearest neighbor index, using the pred function defined earlier:

        nn_index = sess.run(pred,\
            feed_dict={train_pixel_tensor: train_pixels,\
            test_pixel_tensor: test_pixels[i,:]})

Finally, we find the nearest neighbor class label and compare it to its true label:

        print "Test N°", i, "Predicted Class: ",\
            np.argmax(train_list_values[nn_index]),\
            "True Class: ", np.argmax(test_list_of_values[i])
        if np.argmax(train_list_values[nn_index])\
            == np.argmax(test_list_of_values[i]):

Then we count and report the accuracy of the classifier:

            accuracy += 1./len(test_pixels)

    print "Result = ", accuracy

As we can see, almost every element of the test set is correctly classified. The result of the simulation shows the predicted class against the real class, and finally the total accuracy of the simulation is reported:

>>>
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Test N° 0 Predicted Class: 7 True Class: 7
Test N° 1 Predicted Class: 2 True Class: 2
Test N° 2 Predicted Class: 1 True Class: 1
Test N° 3 Predicted Class: 0 True Class: 0
Test N° 4 Predicted Class: 4 True Class: 4
Test N° 5 Predicted Class: 1 True Class: 1
Test N° 6 Predicted Class: 4 True Class: 4
Test N° 7 Predicted Class: 9 True Class: 9
Test N° 8 Predicted Class: 6 True Class: 5
Test N° 9 Predicted Class: 9 True Class: 9
Result = 0.9
>>>

The result is not 100% accurate; the reason lies in a wrong evaluation of test number 8: instead of 5, the classifier predicted 6.

Finally, we report the complete code for the KNN classification:

import numpy as np
import tensorflow as tf
import input_data

# Build the Training Set
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
train_pixels, train_list_values = mnist.train.next_batch(100)
test_pixels, test_list_of_values = mnist.test.next_batch(10)

train_pixel_tensor = tf.placeholder\
    ("float", [None, 784])
test_pixel_tensor = tf.placeholder\
    ("float", [784])

# Cost Function and distance optimization
distance = tf.reduce_sum\
    (tf.abs\
    (tf.add(train_pixel_tensor,\
    tf.neg(test_pixel_tensor))),\
    reduction_indices=1)

pred = tf.arg_min(distance, 0)

# Testing and algorithm evaluation
accuracy = 0.
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    for i in range(len(test_list_of_values)):
        nn_index = sess.run(pred,\
            feed_dict={train_pixel_tensor: train_pixels,\
            test_pixel_tensor: test_pixels[i,:]})
        print "Test N°", i, "Predicted Class: ",\
            np.argmax(train_list_values[nn_index]),\
            "True Class: ", np.argmax(test_list_of_values[i])
        if np.argmax(train_list_values[nn_index])\
            == np.argmax(test_list_of_values[i]):
            accuracy += 1./len(test_pixels)
    print "Result = ", accuracy

Data clustering

A clustering problem consists in the selection and grouping of homogeneous items from a set of initial data. To solve this problem, we must:

Identify a resemblance measure between elements
Find out whether there are subsets of elements that are similar according to the measure chosen

The algorithm determines which elements form a cluster and what degree of similarity unites them within the cluster.

Clustering algorithms fall into the unsupervised methods, because we do not assume any prior information on the structures and characteristics of the clusters.

The k-means algorithm

One of the most common and simple clustering algorithms is k-means, which subdivides a group of objects into k partitions on the basis of their attributes. Each cluster is identified by a point called the centroid, the average of the cluster's points.

The algorithm follows an iterative procedure:

1. Randomly select K points as the initial centroids.
2. Form K clusters by assigning each point to the closest centroid.
3. Recompute the centroid of each cluster.
4. Repeat steps 2 and 3 until the centroids don't change.

The popularity of k-means comes from its convergence speed and its ease of implementation. In terms of the quality of the solutions, the algorithm does not guarantee achieving the global optimum. The quality of the final solution depends largely on the initial set of clusters and may, in practice, be much worse than the global optimum. Since the algorithm is extremely fast, you can apply it several times and choose the most satisfying solution among those produced. Another disadvantage of the algorithm is that it requires you to choose the number of clusters (k) to find.

If the data is not naturally partitioned, you will end up getting strange results. Furthermore, the algorithm works well only when there are identifiable spherical clusters in the data. A compact NumPy sketch of the procedure is shown below.
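Here is that compact NumPy sketch of the iterative procedure (an illustration added here; it assumes no cluster ever becomes empty, and the TensorFlow version follows in this section):

import numpy as np

def kmeans(points, k, steps=100):
    # step 1: pick k random points as the initial centroids
    centroids = points[np.random.choice(len(points), k, replace=False)]
    for _ in range(steps):
        # assign every point to its closest centroid
        d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assignments = d.argmin(axis=1)
        # recompute each centroid as the mean of its cluster
        centroids = np.array([points[assignments == j].mean(axis=0)
                              for j in range(k)])
    return centroids, assignments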

Let us now see how to implement k-means with the TensorFlow library.

Building the training set

Import all the libraries necessary for our simulation:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import pandas as pd

Note

Pandas is an open source library providing easy-to-use data structures and data analysis tools for the Python programming language. To install it, type the following command:

sudo pip install pandas

We must define the parameters of our problem. The total number of points that we want to cluster is 1000:

num_vectors = 1000

The number of partitions we want to achieve:

num_clusters = 4

We set the number of computational steps of the k-means algorithm:

num_steps = 100

We initialize the input data structures:

x_values = []
y_values = []
vector_values = []

The training set is a random set of points, which is why we use NumPy's random.normal function, allowing us to build the x_values and y_values vectors:

for i in xrange(num_vectors):
    if np.random.random() > 0.5:
        x_values.append(np.random.normal(0.4, 0.7))
        y_values.append(np.random.normal(0.2, 0.8))
    else:
        x_values.append(np.random.normal(0.6, 0.4))
        y_values.append(np.random.normal(0.8, 0.5))

We use the Python zip function to obtain the complete list of vector_values:

vector_values = zip(x_values, y_values)

Then vector_values is converted into a constant, usable by TensorFlow:

vectors = tf.constant(vector_values)

We can see our training set for the clustering algorithm with the following commands:

plt.plot(x_values, y_values, 'o', label='Input Data')
plt.legend()
plt.show()

The training set for k-means

After randomly building the training set, we have to generate the (k = 4) initial centroids. We first shuffle the point indices using tf.random_shuffle:

n_samples = tf.shape(vector_values)[0]
random_indices = tf.random_shuffle(tf.range(0, n_samples))

By adopting this procedure, we are able to pick four random indices:

begin = [0,]
size = [num_clusters,]
size[0] = num_clusters

These hold the indices of our initial centroids:

centroid_indices = tf.slice(random_indices, begin, size)
centroids = tf.Variable(tf.gather\
    (vector_values, centroid_indices))

Cost functions and optimization

The cost function we want to minimize for this problem is again based on the squared Euclidean distance between each point and its centroid:

d(x, c)² = (x₁ − c₁)² + (x₂ − c₂)²

In order to manage the tensors defined previously, vectors and centroids, we use the TensorFlow function expand_dims, which inserts a dimension into the shape of each of the two arguments:

expanded_vectors = tf.expand_dims(vectors, 0)
expanded_centroids = tf.expand_dims(centroids, 1)

This function allows us to standardize the shapes of the two tensors, in order to evaluate their difference with the tf.sub method:

vectors_subtration = tf.sub(expanded_vectors, expanded_centroids)

Finally, we build the euclidean_distances cost function, using the tf.reduce_sum function, which computes the sum of elements across one dimension of a tensor, while the tf.square function computes the square of the vectors_subtration tensor element-wise:

euclidean_distances = tf.reduce_sum(tf.square\
    (vectors_subtration), 2)

assignments = tf.to_int32(tf.argmin(euclidean_distances, 0))

Here assignments holds, for each vector, the index of the centroid with the smallest distance across the tensor euclidean_distances. Let us now turn to the optimization phase, the purpose of which is to improve the choice of centroids, on which the construction of the clusters depends. We partition the vectors (our training set) into num_clusters tensors, using the indices from assignments.

The following code takes the nearest indices for each sample and grabs them out as separate groups using tf.dynamic_partition:

partitions = tf.dynamic_partition\
    (vectors, assignments, num_clusters)

Finally, we update the centroids, using tf.reduce_mean on each single group to find its average, forming its new centroid:

update_centroids = tf.concat(0,\
    [tf.expand_dims\
    (tf.reduce_mean(partition, 0), 0)\
    for partition in partitions])

To form the update_centroids tensor, we use tf.concat to concatenate the per-cluster means into a single tensor.

Testing and algorithm evaluation

It's time to test and evaluate the algorithm. The first procedure is to initialize all the variables and instantiate the evaluation graph:

init_op = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init_op)

Now we start the computation:

for step in xrange(num_steps):
    _, centroid_values, assignment_values =\
        sess.run([update_centroids,\
                  centroids,\
                  assignments])

To display the result, we implement the following function:

display_partition(x_values, y_values, assignment_values)

This takes the x_values and y_values vectors of the training set, and the assignment_values vector, to draw the clusters.

The code for this visualization function is as follows:

def display_partition(x_values, y_values, assignment_values):
    labels = []
    colors = ["red", "blue", "green", "yellow"]
    for i in xrange(len(assignment_values)):
        labels.append(colors[(assignment_values[i])])
    color = labels
    df = pd.DataFrame\
        (dict(x=x_values, y=y_values, color=labels))
    fig, ax = plt.subplots()
    ax.scatter(df['x'], df['y'], c=df['color'])
    plt.show()

It associates each cluster with its color by means of the following data structure:

colors = ["red", "blue", "green", "yellow"]

It then draws the points through the scatter function of matplotlib:

ax.scatter(df['x'], df['y'], c=df['color'])

Let's display the result:

Final result of the k-means algorithm

Here is the complete code of the k-means algorithm:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

def display_partition(x_values, y_values, assignment_values):
    labels = []
    colors = ["red", "blue", "green", "yellow"]
    for i in xrange(len(assignment_values)):
        labels.append(colors[(assignment_values[i])])
    color = labels
    df = pd.DataFrame\
        (dict(x=x_values, y=y_values, color=labels))
    fig, ax = plt.subplots()
    ax.scatter(df['x'], df['y'], c=df['color'])
    plt.show()

num_vectors = 2000
num_clusters = 4
n_samples_per_cluster = 500
num_steps = 1000

x_values = []
y_values = []
vector_values = []

# CREATE RANDOM DATA
for i in xrange(num_vectors):
    if np.random.random() > 0.5:
        x_values.append(np.random.normal(0.4, 0.7))
        y_values.append(np.random.normal(0.2, 0.8))
    else:
        x_values.append(np.random.normal(0.6, 0.4))
        y_values.append(np.random.normal(0.8, 0.5))

vector_values = zip(x_values, y_values)
vectors = tf.constant(vector_values)

n_samples = tf.shape(vector_values)[0]
random_indices = tf.random_shuffle(tf.range(0, n_samples))

begin = [0,]
size = [num_clusters,]
size[0] = num_clusters

centroid_indices = tf.slice(random_indices, begin, size)
centroids = tf.Variable(tf.gather(vector_values, centroid_indices))

expanded_vectors = tf.expand_dims(vectors, 0)
expanded_centroids = tf.expand_dims(centroids, 1)

vectors_subtration = tf.sub(expanded_vectors, expanded_centroids)

euclidean_distances = tf.reduce_sum(tf.square(vectors_subtration), 2)
assignments = tf.to_int32(tf.argmin(euclidean_distances, 0))

# Example of tf.dynamic_partition (from the TensorFlow docs):
#   partitions = [0, 0, 1, 1, 0]; num_partitions = 2
#   data = [10, 20, 30, 40, 50]
#   outputs[0] = [10, 20, 50]
#   outputs[1] = [30, 40]
partitions = tf.dynamic_partition(vectors, assignments, num_clusters)

update_centroids = tf.concat(0, [tf.expand_dims\
    (tf.reduce_mean(partition, 0), 0)\
    for partition in partitions])

init_op = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init_op)

for step in xrange(num_steps):
    _, centroid_values, assignment_values =\
        sess.run([update_centroids,\
                  centroids,\
                  assignments])

display_partition(x_values, y_values, assignment_values)

plt.plot(x_values, y_values, 'o', label='Input Data')
plt.legend()
plt.show()

Summary

In this chapter, we began to explore the potential of TensorFlow for some typical problems in machine learning. With the linear regression algorithm, the important concepts of the cost function and optimization using gradient descent were explained. We then described the MNIST dataset of handwritten digits. We also implemented a multiclass classifier using the nearest neighbor algorithm, which falls into the supervised learning category of machine learning. Then the chapter concluded with an example of unsupervised learning, by implementing the k-means algorithm for solving a data clustering problem.

In the next chapter, we will introduce neural networks. These are mathematical models that represent the interconnection between elements defined as artificial neurons, namely mathematical constructs that mimic the properties of living neurons.

We'll also implement some neural network learning models using TensorFlow.

Chapter 4. Introducing Neural Networks

In this chapter, we will cover the following topics:

What are artificial neural networks?
Single Layer Perceptron
Logistic regression
Multi Layer Perceptron
Multi Layer Perceptron classification
Multi Layer Perceptron function approximation

What are artificial neural networks?

An artificial neural network (ANN) is an information processing system whose operating mechanism is inspired by biological neural circuits. Thanks to their characteristics, neural networks are the protagonists of a real revolution in machine learning systems and, more specifically, in the context of artificial intelligence. An ANN possesses many simple processing units variously connected to each other, according to various architectures. If we look at the schema of an ANN reported later, it can be seen that the hidden units communicate with the external layer, both in input and output, while the input and output units communicate only with the hidden layer of the network.

Each unit or node simulates the role of the neuron in biological neural networks. Each node, called an artificial neuron, has a very simple operation: it becomes active if the total quantity of signal that it receives exceeds its activation threshold, defined by the so-called activation function. If a node becomes active, it emits a signal that is transmitted along the transmission channels up to the other units to which it is connected. Each connection point acts as a filter that converts the message into an inhibitory or excitatory signal, increasing or decreasing its intensity according to the connection's individual characteristics. The connection points simulate the biological synapses and have the fundamental function of weighing the intensity of the transmitted signals, by multiplying them by the weights whose values depend on the connection itself.

ANN schematic diagram

Neural network architectures

The way the nodes are connected, the total number of layers (that is, the levels of nodes between input and output), and the number of neurons per layer all define the architecture of a neural network. For example, in multilayer networks (we introduce these in the second part of this chapter), one can identify the artificial neurons of the layers such that:

Each neuron is connected with all those of the next layer
There are no connections between neurons belonging to the same layer
The number of layers and of neurons per layer depends on the problem to be solved

Now we start our exploration of neural network models, introducing the most simple neural network model: the Single Layer Perceptron, also called Rosenblatt's Perceptron.

Single Layer Perceptron

The Single Layer Perceptron was the first neural network model, proposed in 1958 by Frank Rosenblatt. In this model, the content of the local memory of the neuron consists of a vector of weights, W = (w1, w2, ..., wn). The computation is performed over the input vector X = (x1, x2, ..., xn), each element of which is multiplied by the corresponding element of the vector of weights; the value thus produced (that is, a weighted sum) becomes the input of an activation function. This function returns 1 if the result is greater than a certain threshold, otherwise it returns -1. In the following figure, the activation function is the so-called sign function:

sign(x) = +1 if x > 0, −1 otherwise

It is possible to use other activation functions, preferably non-linear ones (such as the sigmoid function, which we will see in the next section). The learning procedure of the net is iterative: at each learning cycle (called an epoch), it slightly modifies the synaptic weights by using a selected set called a training set. At each cycle, the weights must be modified to minimize a cost function, which is specific to the problem under consideration. Finally, when the perceptron has been trained on the training set, it will be tested on other inputs (the test set) in order to verify its capacity for generalization. A toy sketch of this learning rule follows the figure.

Schema of Rosenblatt's Perceptron
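Here is that toy sketch of the perceptron learning rule, on the logical AND function with ±1 targets (an illustration added here; the weights are bumped by the target times the input only when the prediction is wrong):

import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # inputs
t = np.array([-1, -1, -1, 1])        # AND targets in {-1, +1}
w = np.zeros(2)
b = 0.0
for epoch in range(10):
    for x_i, t_i in zip(X, t):
        y_i = 1 if np.dot(x_i, w) + b > 0 else -1  # sign activation
        if y_i != t_i:                             # update on mistakes only
            w = w + t_i * x_i
            b = b + t_i
print(w)  # a separating weight vector for AND
print(b)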

Let us now see how to implement a single layer neural network for an image classification problem using TensorFlow.

The logistic regression

This algorithm has nothing to do with the canonical linear regression we saw in Chapter 3, Starting with Machine Learning; rather, it is an algorithm that allows us to solve supervised classification problems. In fact, to estimate the dependent variable, we now make use of the so-called logistic function or sigmoid, σ(x) = 1 / (1 + e^(−x)). It is precisely because of this function that we call this algorithm logistic regression. The sigmoid function has the following pattern:

Sigmoid function
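A quick way to reproduce this pattern yourself is the following small NumPy/matplotlib sketch (an illustration added here):

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-6, 6, 200)
sigmoid = 1.0 / (1.0 + np.exp(-x))  # values strictly between 0 and 1
plt.plot(x, sigmoid)
plt.show()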

As we can see, the dependent variable takes values strictly between 0 and 1, which is precisely what serves us. In the case of logistic regression, we want our function to tell us what the probability is of belonging to a particular class. We recall again that supervised learning by a neural network is configured as an iterative process of optimization of the weights; these are modified on the basis of the network's performance on the training set. Indeed, the aim is to minimize the loss function, which indicates the degree to which the behavior of the network deviates from the desired one. The performance of the network is then verified on a test set, consisting of images other than those it was trained on.

The basic steps of training that we're going to implement are as follows:

The weights are initialized with random values at the beginning of the training.
For each element of the training set, the error is calculated, that is, the difference between the desired output and the actual output. This error is used to adjust the weights.
The process is repeated, resubmitting to the network, in a random order, all the examples of the training set until the error made on the entire training set is less than a certain threshold, or until the maximum number of iterations is reached.

Let us now see in detail how to implement logistic regression with TensorFlow. The problem we want to solve is to classify images from the MNIST dataset, which, as explained in Chapter 3, Starting with Machine Learning, is a database of handwritten digits.

TensorFlow implementation

To implement the model, we need to perform the following steps:

1. First of all, we have to import all the necessary libraries:

import input_data
import tensorflow as tf
import matplotlib.pyplot as plt

2. We use the input_data.read_data_sets function, introduced in Chapter 3, Starting with Machine Learning, in the MNIST dataset section, to upload the images for our problem:

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

3. Then we set the total number of epochs for the training phase:

training_epochs = 25

4. We must also define other parameters that are necessary to build the model:

learning_rate = 0.01
batch_size = 100
display_step = 1

5. Now we move on to the construction of the model.

Building the model

Define x as the input tensor; it represents an MNIST data image, of size 28x28 = 784 pixels:

x = tf.placeholder("float", [None, 784])

We recall that our problem consists of assigning a probability value to each of the possible classes of membership (the digits from 0 to 9). At the end of this calculation, we will obtain a probability distribution, which tells us how confident we are in our prediction.

So the output we're going to get will be an output tensor with 10 probabilities, each one corresponding to a digit (of course, the sum of the probabilities must be one):

y = tf.placeholder("float", [None, 10])

To assign probabilities to each image, we will use the so-called softmax activation function.

The softmax function is specified in two main steps (a small NumPy sketch follows this list):

Calculate the evidence that a certain image belongs to a particular class
Convert the evidence into probabilities of belonging to each of the 10 possible classes
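Here is that small NumPy sketch of the softmax itself (an illustration added here; the model uses TensorFlow's tf.nn.softmax instead):

import numpy as np

def softmax(evidence):
    e = np.exp(evidence - evidence.max())  # shift for numerical stability
    return e / e.sum()                     # probabilities summing to one

print(softmax(np.array([2.0, 1.0, 0.1])))  # e.g. ~[0.66 0.24 0.10]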

To evaluate the evidence, we first define the weights tensor W:

W = tf.Variable(tf.zeros([784, 10]))

For a given image, we can evaluate the evidence for each class i by simply multiplying the input tensor x with the tensor W. Using TensorFlow, we should have something like the following:

evidence = tf.matmul(x, W)

In general, models include an extra parameter representing the bias, which indicates a certain degree of uncertainty. In our case, the final formula for the evidence is as follows:

evidence = tf.matmul(x, W) + b

It means that for every class i (from 0 to 9) we have a column Wi of 784 weight elements (28 × 28), where each element j is multiplied by the corresponding component j of the input image, and the corresponding bias element bi is then added.

So to define the evidence, we must define the following tensor of biases:

b = tf.Variable(tf.zeros([10]))

The second step is to finally use the softmax function to obtain the output vector of probabilities, namely activation:

activation = tf.nn.softmax(tf.matmul(x, W) + b)

TensorFlow's tf.nn.softmax function provides a probability-based output from the input evidence tensor. Once we have implemented the model, we can specify the necessary code to find the weights W and the biases b of the network through the iterative training algorithm. In each iteration, the training algorithm takes the training data, applies the neural network, and compares the result with the expected one.

Note

TensorFlow provides many other activation functions. See https://www.tensorflow.org/versions/r0.8/api_docs/index.html for better references.

In order to train our model and know when we have a good one, we must define how to measure its accuracy. Our goal is to try to get values of the parameters W and b that minimize the value of the metric that indicates how bad the model is.

Different metrics calculate the degree of error between the desired output and the training data outputs. A common measure of error is the mean squared error (or the squared Euclidean distance). However, there are some research findings that suggest using other metrics for a neural network like this.

In this example, we use the so-called cross-entropy error function. It is defined as:

cross_entropy = y * tf.log(activation)

In order to minimize cross_entropy, we can use the following combination of tf.reduce_mean and tf.reduce_sum to build the cost function:

cost = tf.reduce_mean\
    (-tf.reduce_sum\
    (cross_entropy, reduction_indices=1))
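To see what this cost measures, consider a single example in NumPy (an illustration added here): with a one-hot label, the sum collapses to minus the log of the probability assigned to the correct class, so confident correct predictions cost little and confident wrong ones cost a lot.

import numpy as np

y_true = np.array([0., 0., 1.])        # one-hot label: the class is '2'
y_pred = np.array([0.1, 0.2, 0.7])     # softmax output of the model
print(-np.sum(y_true * np.log(y_pred)))  # -log(0.7) ~ 0.357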

Then we must minimize it using the gradient descent optimization algorithm:

optimizer = tf.train.GradientDescentOptimizer\
    (learning_rate).minimize(cost)

Just a few lines of code to build a neural net model!

Launch the session

It's time to build the session and launch our neural net model.

We create the following lists to visualize the training session:

avg_set = []
epoch_set = []

Then we initialize the TensorFlow variables:

init = tf.initialize_all_variables()

Start the session:

with tf.Session() as sess:
    sess.run(init)

As explained, each epoch is a training cycle:

    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)

Then we loop over all the batches:

        for i in range(total_batch):
            batch_xs, batch_ys =\
                mnist.train.next_batch(batch_size)

Fit the training using the batch data:

            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})

Compute the average loss, running the cost function with the given image values (x) and the real output (y):

            avg_cost += sess.run\
                (cost, feed_dict={x: batch_xs,\
                y: batch_ys})/total_batch

During the computation, we display a log per epoch step:

        if epoch % display_step == 0:
            print "Epoch:",\
                '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)

    print "Training phase finished"

Let's get the accuracy of our model. The prediction is correct if the index with the highest activation value is the same as in the real digit vector; the mean of correct_prediction gives us the accuracy. We need to run the accuracy function with our test set (mnist.test).

We use the test images and labels for x and y:

correct_prediction = tf.equal\
    (tf.argmax(activation, 1),\
    tf.argmax(y, 1))

accuracy = tf.reduce_mean\
    (tf.cast(correct_prediction, "float"))

print "Model accuracy:", accuracy.eval({x: mnist.test.images,\
    y: mnist.test.labels})

Test evaluation

We previously showed the training phase, and for each epoch we printed the relative cost function:

Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ======================= RESTART ============================
>>>
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Epoch: 0001 cost= 1.174406662
Epoch: 0002 cost= 0.661956009
Epoch: 0003 cost= 0.550468774
Epoch: 0004 cost= 0.496588717
Epoch: 0005 cost= 0.463674555
Epoch: 0006 cost= 0.440907706
Epoch: 0007 cost= 0.423837747
Epoch: 0008 cost= 0.410590841
Epoch: 0009 cost= 0.399881751
Epoch: 0010 cost= 0.390916621
Epoch: 0011 cost= 0.383320325
Epoch: 0012 cost= 0.376767031
Epoch: 0013 cost= 0.371007620
Epoch: 0014 cost= 0.365922904
Epoch: 0015 cost= 0.361327561
Epoch: 0016 cost= 0.357258660
Epoch: 0017 cost= 0.353508228
Epoch: 0018 cost= 0.350164634
Epoch: 0019 cost= 0.347015593
Epoch: 0020 cost= 0.344140861
Epoch: 0021 cost= 0.341420144
Epoch: 0022 cost= 0.338980592
Epoch: 0023 cost= 0.336655581
Epoch: 0024 cost= 0.334488012
Epoch: 0025 cost= 0.332488823
Training phase finished

As you can see, during the training phase the cost function is minimized. At the end of the test, we show how accurate the implemented model is:

Model Accuracy: 0.9475
>>>

Finally, using the following lines of code, we can visualize the training phase of the net:

plt.plot(epoch_set, avg_set, 'o',\
    label='Logistic Regression Training phase')
plt.ylabel('cost')
plt.xlabel('epoch')
plt.legend()
plt.show()

Training phase in logistic regression

Source code

# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf
import matplotlib.pyplot as plt

# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])

# Create model
# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
activation = tf.nn.softmax(tf.matmul(x, W) + b)

# Minimize error using cross entropy
cross_entropy = y * tf.log(activation)
cost = tf.reduce_mean\
    (-tf.reduce_sum\
    (cross_entropy, reduction_indices=1))

optimizer = tf.train.\
    GradientDescentOptimizer(learning_rate).minimize(cost)

# Plot settings
avg_set = []
epoch_set = []

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys =\
                mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer,\
                feed_dict={x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost, feed_dict=\
                {x: batch_xs,\
                y: batch_ys})/total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)

    print "Training phase finished"

    plt.plot(epoch_set, avg_set, 'o',\
        label='Logistic Regression Training phase')
    plt.ylabel('cost')
    plt.xlabel('epoch')
    plt.legend()
    plt.show()

    # Test model
    correct_prediction = tf.equal\
        (tf.argmax(activation, 1),\
        tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print "Model accuracy:", accuracy.eval({x: mnist.test.images,\
        y: mnist.test.labels})

Multi Layer Perceptron

A more complex and efficient architecture is that of the Multi Layer Perceptron (MLP). It is substantially formed from multiple layers of perceptrons, and is therefore characterized by the presence of at least one hidden layer, that is, a layer connected neither to the inputs nor to the outputs of the network:

The MLP architecture

A network of this type is typically trained using supervised learning, according to the principles outlined in the previous paragraph. In particular, a typical learning algorithm for MLP networks is the so-called backpropagation algorithm.

Note

The backpropagation algorithm is a learning algorithm for neural networks. It compares the output value of the system with the desired value. On the basis of the difference thus calculated (namely, the error), the algorithm modifies the synaptic weights of the neural network, progressively converging the set of output values towards the desired ones.

It is important to note that in MLP networks, although you don't know the desired outputs of the neurons of the hidden layers of the network, it is always possible to apply a supervised learning method based on the minimization of an error function via the application of gradient descent techniques.

In the following example, we show the implementation of an MLP for an image classification problem (MNIST).

Multi Layer Perceptron classification

Import the necessary libraries:

import input_data
import tensorflow as tf
import matplotlib.pyplot as plt

Load the images to classify:

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

Fix some parameters for the MLP model.

The learning rate of the net:

learning_rate = 0.001

The epochs:

training_epochs = 20

The batch size (the number of images processed at a time):

batch_size = 100
display_step = 1

The number of neurons for the first layer:

n_hidden_1 = 256

The number of neurons for the second layer:

n_hidden_2 = 256

The size of the input (each image has 784 pixels):

n_input = 784 # MNIST data input (img shape: 28*28)

The number of output classes:

n_classes = 10

It should be noted that while, for a given application, the input and output sizes are perfectly defined, there are no strict criteria for how to define the number of hidden layers and the number of neurons in each layer.

Every choice must be based on experience with similar applications, as in our case:

When increasing the number of hidden layers, we should also increase the size of the training set and the number of connections to be updated during the learning phase. This results in an increase in the training time.
Also, if there are too many neurons in the hidden layer, not only are there more weights to be updated, but the network also has a tendency to learn too much from the training examples set, resulting in a poor generalization ability. But if the hidden neurons are too few, the network is not able to learn even with the training set.

Build the model

The input layer is the x tensor [1 × 784], which represents the image to classify:

x = tf.placeholder("float", [None, n_input])

The output tensor y is sized by the number of classes:

y = tf.placeholder("float", [None, n_classes])

In the middle, we have two hidden layers. The first layer is constituted by the h tensor of weights, whose size is [784 × 256], where 256 is the total number of nodes in the layer:

h = tf.Variable(tf.random_normal([n_input, n_hidden_1]))

For layer 1, we then have to define the respective biases tensor:

bias_layer_1 = tf.Variable(tf.random_normal([n_hidden_1]))

Each neuron receives the pixels of the input image to be classified, combined with the h_ij weight connections and added to the respective values of the biases tensor:

layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, h), bias_layer_1))

It sends its output to the neurons of the next layer through the activation function. It must be said that the activation functions could differ from one neuron to another, but in practice we adopt a common choice for all the neurons, typically of the sigmoidal type. Sometimes the output neurons are equipped with a linear activation function. It is interesting to note that the activation functions of the neurons in the hidden layers cannot be linear because, in this case, the MLP network would be equivalent to a two-layer network and therefore no longer of the MLP type. The second layer must perform the same steps as the first.

The second intermediate layer is represented by the weights tensor of shape [256 × 256]:

w = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))

With the tensor of biases:

bias_layer_2 = tf.Variable(tf.random_normal([n_hidden_2]))

Each neuron in this second layer receives inputs from the neurons of layer 1, combined with the weight w_ij connections and added to the respective biases of layer 2:

layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, w), bias_layer_2))

It sends its output to the next layer, namely the output layer:

output = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
bias_output = tf.Variable(tf.random_normal([n_classes]))
output_layer = tf.matmul(layer_2, output) + bias_output

The output layer receives as input the 256 stimuli coming from layer 2, which are converted into the respective class probabilities for each digit.

As for the logistic regression, we then define the cost function:

cost = tf.reduce_mean\
    (tf.nn.softmax_cross_entropy_with_logits\
    (output_layer, y))

The TensorFlow function tf.nn.softmax_cross_entropy_with_logits computes the cost for a softmax layer. It is only used during training. The logits are the unnormalized log probabilities output by the model (the values output before the softmax normalization is applied to them).

The corresponding optimizer that minimizes the cost function is:

optimizer = tf.train.AdamOptimizer\
    (learning_rate=learning_rate).minimize(cost)

tf.train.AdamOptimizer uses Kingma and Ba's Adam algorithm to control the learning rate. Adam offers several advantages over the simple tf.train.GradientDescentOptimizer. In fact, it uses a larger effective step size, and the algorithm will converge to this step size without fine tuning.

A simple tf.train.GradientDescentOptimizer could equally be used in your MLP, but would require more hyperparameter tuning before it could converge as quickly.

Note

TensorFlow provides an optimizer base class to compute gradients for a loss and apply gradients to variables. This class defines the API to add ops to train a model. You never use this class directly, but instead instantiate one of its subclasses. See https://www.tensorflow.org/versions/r0.8/api_docs/python/train.html#Optimizer to see the optimizers implemented.

Launch the session

The following are the steps to launch the session:

1. Plot settings:

avg_set = []
epoch_set = []

2. Initialize the variables:

init = tf.initialize_all_variables()

3. Launch the graph:

with tf.Session() as sess:
    sess.run(init)

4. Define the training cycle:

    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)

5. Loop over all the batches:

        for i in range(total_batch):
            batch_xs, batch_ys =\
                mnist.train.next_batch(batch_size)

6. Fit the training using the batch data:

            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})

7. Compute the average loss, and display a log per epoch step:

            avg_cost += sess.run(cost, feed_dict={x: batch_xs,\
                y: batch_ys})/total_batch

        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)

    print "Training phase finished"

8. With these lines of code, we plot the training phase:

plt.plot(epoch_set, avg_set, 'o', label='MLP Training phase')
plt.ylabel('cost')
plt.xlabel('epoch')
plt.legend()
plt.show()

9. Finally, we can test the MLP model:

correct_prediction = tf.equal(tf.argmax(output_layer, 1),\
    tf.argmax(y, 1))

evaluating its accuracy:

accuracy = tf.reduce_mean(tf.cast(correct_prediction,
    "float"))

print "Model Accuracy:", accuracy.eval({x:
    mnist.test.images,\
    y: mnist.test.labels})

10. Here is the output result after 20 epochs:

Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ========================== RESTART ==============================
>>>
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Epoch: 0001 cost= 1.723947845
Epoch: 0002 cost= 0.539266024
Epoch: 0003 cost= 0.362600502
Epoch: 0004 cost= 0.266637279
Epoch: 0005 cost= 0.205345784
Epoch: 0006 cost= 0.159139332
Epoch: 0007 cost= 0.125232637
Epoch: 0008 cost= 0.098572041
Epoch: 0009 cost= 0.077509963
Epoch: 0010 cost= 0.061127526
Epoch: 0011 cost= 0.048033808
Epoch: 0012 cost= 0.037297983
Epoch: 0013 cost= 0.028884999
Epoch: 0014 cost= 0.022818390
Epoch: 0015 cost= 0.017447586
Epoch: 0016 cost= 0.013652348
Epoch: 0017 cost= 0.010417282
Epoch: 0018 cost= 0.008079228
Epoch: 0019 cost= 0.006203546
Epoch: 0020 cost= 0.004961207
Training phase finished
Model Accuracy: 0.9775
>>>

We show the training phase in the following figure:

Training phase in Multi Layer Perceptron

Source code

# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf
import matplotlib.pyplot as plt

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 100
display_step = 1

# Network Parameters
n_hidden_1 = 256 # 1st layer num features
n_hidden_2 = 256 # 2nd layer num features
n_input = 784 # MNIST data input (img shape: 28*28)
n_classes = 10 # MNIST total classes (0-9 digits)

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])

# weights layer 1
h = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
# bias layer 1
bias_layer_1 = tf.Variable(tf.random_normal([n_hidden_1]))
# layer 1
layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, h), bias_layer_1))

# weights layer 2
w = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))
# bias layer 2
bias_layer_2 = tf.Variable(tf.random_normal([n_hidden_2]))
# layer 2
layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, w), bias_layer_2))

# weights output layer
output = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
# bias output layer
bias_output = tf.Variable(tf.random_normal([n_classes]))
# output layer
output_layer = tf.matmul(layer_2, output) + bias_output

# cost function
cost = tf.reduce_mean\
    (tf.nn.softmax_cross_entropy_with_logits(output_layer, y))
# optimizer
optimizer = tf.train.AdamOptimizer\
    (learning_rate=learning_rate).minimize(cost)

# Plot settings
avg_set = []
epoch_set = []

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost,\
                feed_dict={x: batch_xs,\
                y: batch_ys})/total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)

    print "Training phase finished"

    plt.plot(epoch_set, avg_set, 'o', label='MLP Training phase')
    plt.ylabel('cost')
    plt.xlabel('epoch')
    plt.legend()
    plt.show()

    # Test model
    correct_prediction = tf.equal(tf.argmax(output_layer, 1),\
        tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print "Model Accuracy:", accuracy.eval({x: mnist.test.images,\
        y: mnist.test.labels})

Multi Layer Perceptron function approximation

In the following example, we implement an MLP network that will be able to learn the trend of an arbitrary function f(x). In the training phase the network will have to learn from a known set of points, that is x and f(x), while in the test phase the network will deduce the values of f(x) only from the x values.

This very simple network will be built with a single hidden layer.

Import the necessary libraries:

import tensorflow as tf
import numpy as np
import math, random
import matplotlib.pyplot as plt

We build the data model. The function to be learned will follow the trend of the cosine function, evaluated for 1000 points, to which we add a very small random error (noise) to reproduce a real case:

NUM_points = 1000
np.random.seed(NUM_points)
function_to_learn = lambda x: np.cos(x) + \
                    0.1 * np.random.randn(*x.shape)

Our MLP network will be formed by a hidden layer of 10 neurons:

layer_1_neurons = 10

The network learns 100 points at a time, for a total of 1500 learning cycles (epochs):

batch_size = 100
NUM_EPOCHS = 1500

Finally, we construct the training set and the test set:

# all_x contains all the points
all_x = np.float32(np.random.uniform \
        (-2 * math.pi, 2 * math.pi, \
        (1, NUM_points))).T
np.random.shuffle(all_x)
train_size = int(900)

The first 900 points are in the training set:

x_training = all_x[:train_size]
y_training = function_to_learn(x_training)

The last 100 will be in the validation set:

x_validation = all_x[train_size:]
y_validation = function_to_learn(x_validation)

Using matplotlib, we display these sets:

plt.figure(1)
plt.scatter(x_training, y_training, c='blue', label='train')
plt.scatter(x_validation, y_validation, c='red', label='validation')
plt.legend()
plt.show()

Training and validation set

Build the model

First, we create the placeholders for the input tensor (X) and the output tensor (Y):

X = tf.placeholder(tf.float32, [None, 1], name="X")
Y = tf.placeholder(tf.float32, [None, 1], name="Y")

Then we build the hidden layer of [1 x 10] dimensions:

w_h = tf.Variable(tf.random_uniform([1, layer_1_neurons], \
                  minval=-1, maxval=1, \
                  dtype=tf.float32))
b_h = tf.Variable(tf.zeros([1, layer_1_neurons], \
                  dtype=tf.float32))

It receives the input value from the X input tensor, combined through the w_h weight connections and added to the respective biases of layer 1:

h = tf.nn.sigmoid(tf.matmul(X, w_h) + b_h)

The output layer is a [10 x 1] tensor:

w_o = tf.Variable(tf.random_uniform([layer_1_neurons, 1], \
                  minval=-1, maxval=1, \
                  dtype=tf.float32))
b_o = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))

Each neuron in this second layer receives inputs from the neurons of layer 1, combined through the w_o weight connections and added to the respective biases of the output layer:

model = tf.matmul(h, w_o) + b_o

We then define our optimizer for the newly defined model:

train_op = tf.train.AdamOptimizer().minimize \
           (tf.nn.l2_loss(model - Y))

We also note that in this case, the cost function adopted is the following:

tf.nn.l2_loss(model - Y)

The tf.nn.l2_loss function is a TensorFlow function that computes half the L2 norm of a tensor without the sqrt, that is, the output of the preceding function is as follows:

output = sum((model - Y) ** 2) / 2

The tf.nn.l2_loss function can be a viable cost function for our example.
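
As a quick check of this formula, the following minimal sketch compares tf.nn.l2_loss on a toy constant with the explicit sum of squares divided by two (the values are made up for illustration):

import numpy as np
import tensorflow as tf

t = np.array([1.0, -2.0, 3.0], dtype=np.float32)
with tf.Session() as check_sess:
    # half the squared L2 norm: (1 + 4 + 9) / 2
    print check_sess.run(tf.nn.l2_loss(tf.constant(t)))  # 7.0
print np.sum(t ** 2) / 2  # 7.0, the same value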

Launch the session

Let's build the evaluation graph:

sess = tf.Session()
sess.run(tf.initialize_all_variables())

Now we can launch the learning session:

errors = []
for i in range(NUM_EPOCHS):
    for start, end in zip(range(0, len(x_training), batch_size), \
                          range(batch_size, \
                                len(x_training), batch_size)):
        sess.run(train_op, feed_dict={X: x_training[start:end], \
                                      Y: y_training[start:end]})
    cost = sess.run(tf.nn.l2_loss(model - y_validation), \
                    feed_dict={X: x_validation})
    errors.append(cost)
    if i % 100 == 0: print "epoch %d, cost = %g" % (i, cost)

Running this network for the 1500 epochs (the last log line is printed at epoch 1400), we'll see the error progressively reducing and eventually converging:

Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> ======================= RESTART ============================
>>>
epoch 0, cost = 55.9286
epoch 100, cost = 22.0084
epoch 200, cost = 18.033
epoch 300, cost = 14.0481
epoch 400, cost = 9.74721
epoch 500, cost = 5.83419
epoch 600, cost = 3.05434
epoch 700, cost = 1.53706
epoch 800, cost = 0.91719
epoch 900, cost = 0.726675
epoch 1000, cost = 0.668316
epoch 1100, cost = 0.633737
epoch 1200, cost = 0.608306
epoch 1300, cost = 0.590429
epoch 1400, cost = 0.574602
>>>

The following lines of code allow us to display how the cost changes in the running epochs:

plt.plot(errors, label='MLP Function Approximation')
plt.xlabel('epochs')
plt.ylabel('cost')
plt.legend()
plt.show()

Training phase in Multi Layer Perceptron

Summary

In this chapter, we introduced artificial neural networks. An artificial neuron is a mathematical model that to some extent mimics the properties of a living neuron. Each neuron of the network performs a very simple operation: it becomes active if the total amount of signal it receives exceeds its activation threshold. The learning process is typically supervised: the neural net uses a training set to infer the relationship between the input and the corresponding output, while the learning algorithm modifies the weights of the net in order to minimize a cost function that represents the forecast error on the training set. If the training is successful, the neural net will be able to make forecasts even where the output is not known a priori. In this chapter we implemented, using TensorFlow, some examples involving neural networks. We saw neural nets used to solve classification and regression problems, such as the logistic regression algorithm in a classification problem using Rosenblatt's Perceptron. At the end of the chapter, we introduced the Multi Layer Perceptron architecture, which we saw in action first in the implementation of an image classifier, then as a simulator of mathematical functions.

In the next chapter, we finally introduce deep learning models; we will examine and implement more complex neural network architectures, such as the convolutional neural network and the recurrent neural network.

Chapter 5. Deep Learning

In this chapter, we will cover the following topics:

Deep learning techniques
Convolutional neural network (CNN)
CNN architecture
TensorFlow implementation of a CNN
Recurrent neural network (RNN)
RNN architecture
Natural Language Processing with TensorFlow

Deep learning techniques

Deep learning techniques are a crucial step forward taken by machine learning researchers in recent decades, having provided successful results never seen before in many applications, such as image recognition and speech recognition.

There are several reasons that led to deep learning being developed and placed at the center of attention in the scope of machine learning. One of these reasons is the progress in hardware, with the availability of new processors, such as graphics processing units (GPUs), which have greatly reduced the time needed for training networks, lowering it by a factor of 10 or 20.

Another reason is certainly the increasing ease of finding ever larger datasets on which to train a system, needed to train architectures of a certain depth and with high dimensionality of the input data. Deep learning consists of a set of methods that allow a system to obtain a hierarchical representation of the data on multiple levels. This is achieved by combining simple (non-linear) units, each of which transforms the representation at its own level, starting from the input level, into a representation at a higher, slightly more abstract level. With a sufficient number of these transformations, considerably complex input-output functions can be learned.

With reference to a classification problem, for example, the highest levels of representation highlight the aspects of the input data that are relevant for the classification, suppressing the ones that have no effect on the classification purposes.

Hierarchical feature extraction in an image classification system

The preceding scheme describes the features of an image classification system (a face recognizer): each block gradually extracts the features of the input image, processing data already pre-processed by the previous blocks, extracting increasingly complex features of the input image, and thus building the hierarchical data representation that characterizes a deep learning-based system.

A possible representation of the hierarchy of features could be as follows:

pixel --> edge --> texture --> motif --> part --> object

In a text recognition problem, instead, the hierarchical representation can be structured as follows:

character --> word --> word group --> clause --> sentence --> story

A deep learning architecture is, therefore, a multi-level architecture, consisting of simple units, all subject to training, many of which carry non-linear transformations. Each unit transforms its input to improve its selectivity, amplifying only the aspects relevant for classification purposes, and its invariance, namely its propensity to ignore the irrelevant and negligible aspects.

With multiple levels of non-linear transformations, therefore, with a depth approximately between 5 and 20 levels, a deep learning system can learn and implement extremely intricate and complex functions, simultaneously very sensitive to the smallest relevant details and extremely insensitive and indifferent to large variations of irrelevant aspects of the input data, which can be, in the case of object recognition: the image's background, brightness, or the position of the represented object.

The following sections will illustrate, with the aid of TensorFlow, two important types of deep neural networks: the convolutional neural networks (CNNs), mainly addressed to classification problems, and then the recurrent neural networks (RNNs), targeting Natural Language Processing (NLP) issues.

Convolutional neural networks

Convolutional neural networks (CNNs) are a particular type of deep learning-oriented neural network that have achieved excellent results in many practical applications, in particular object recognition in images.

In fact, CNNs are designed to process data represented in the form of multiple arrays, for example, color images, representable by means of three two-dimensional arrays containing the pixels' color intensity. The substantial difference between CNNs and ordinary neural networks is that the former operate directly on the images while the latter work on features extracted from them. The input of a CNN, therefore, unlike that of an ordinary neural network, will be two-dimensional, and the features will be the pixels of the input image.

CNNs are the dominant approach for almost all recognition problems. The spectacular performance offered by networks of this type has in fact prompted the biggest companies in technology, such as Google and Facebook, to invest in research and development projects for networks of this kind, and to develop and distribute image recognition products based on CNNs.

CNN architecture

CNNs use three basic ideas: local receptive fields, convolution, and pooling.

In convolutional networks, we consider input as something similar to what is shown in the following figure:

Input neurons

One of the concepts behind CNNs is local connectivity. CNNs, in fact, exploit spatial correlations that may exist within the input data. Each neuron of the first subsequent layer connects only to some of the input neurons. This region is called the local receptive field. In the following figure, it is represented by the black 5x5 square that converges to a hidden neuron:

From input to hidden neurons

The hidden neuron, of course, will only process the input data inside its receptive field, not registering changes outside of it. However, it is easy to see that, by superimposing several locally connected layers, going up the levels you will have units that process more and more global data compared to the input, in accordance with the basic principle of deep learning: to bring the representation to an ever growing level of abstraction.

Note

The reason for the local connectivity resides in the fact that in data in array form, such as images, the values are often highly correlated, forming distinct groups of data that can be easily identified.

Each connection learns a weight (so each hidden neuron gets 5x5 = 25 of them), and the hidden neuron learns a single overall bias at the same time; then we connect the local regions to individual neurons by performing a shift each time, as in the following figures:

The convolution operation

This operation is called convolution. Doing so, if we have an image of 28x28 inputs and 5x5 regions, we will get 24x24 neurons in the hidden layer. We said that each neuron has a bias and 5x5 weights connected to the region: we will use these same weights and biases for all 24x24 neurons. This means that all the neurons in the first hidden layer will recognize the same features, just placed differently in the input image. For this reason, the map of connections from the input layer to the hidden feature map is called shared weights and its bias is called shared bias, since they are in fact shared.

Obviously, we need to recognize more than one map of features in an image, so a complete convolutional layer is made of multiple feature maps.

Multiple feature maps

In the preceding figure, we see three feature maps; of course, their number can increase in practice, and convolutional layers with even 20 or 40 feature maps are used. A great advantage of the sharing of weights and biases is the significant reduction of the parameters involved in a convolutional network. Considering our example, for each feature map we need 25 weights (5x5) and a (shared) bias; that is 26 parameters in total. Assuming we have 20 feature maps, we will have 520 parameters to be defined. With a fully connected network, with 784 input neurons and, for example, 30 hidden layer neurons, we would need 784x30 weights plus 30 biases, reaching a total of 23,550 parameters.
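
The following few lines of plain Python just verify this arithmetic (nothing here is part of the network code):

receptive_field = 5 * 5            # weights per feature map
per_map = receptive_field + 1      # plus one shared bias -> 26
print 20 * per_map                 # 520 parameters for 20 feature maps
print 784 * 30 + 30                # 23550 parameters, fully connected case
print 28 - 5 + 1                   # 24 -> the 24x24 hidden layer size
print (28 - 5 + 1) / 2             # 12 -> the 12x12 size after 2x2 pooling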

The difference is evident. Convolutional networks also use pooling layers, which are layers positioned immediately after the convolutional layers; these simplify the output information of the previous (convolutional) layer. A pooling layer takes the feature maps coming out of the convolutional layer and prepares a condensed feature map. For example, we can say that the pooling layer could summarize, in each of its units, a 2x2 region of neurons of the previous layer.

This technique is called pooling and can be summarized with the following scheme:

The pooling operation helps to simplify the information from one layer to the next

Obviously, we usually have more feature maps, and we apply max pooling to each of them separately.

From the input layer to the second hidden layer

So we have three feature maps of size 24x24 for the first hidden layer, and the second hidden layer will be of size 12x12, since we are assuming that every unit summarizes a 2x2 region.

Combining these three ideas, we form a complete convolutional network. Its architecture can be displayed as follows:

A CNN's architectural schema

Let's summarize: there are 28x28 input neurons followed by a convolutional layer with a 5x5 local receptive field and 3 feature maps. We obtain as a result a hidden layer of 3x24x24 neurons. Then 2x2 max pooling is applied on the 3 regions of feature maps, getting a 3x12x12 hidden layer. The last layer is fully connected: it connects all the neurons of the max-pooling layer to all 10 output neurons, useful to recognize the corresponding output.

This network will then be trained by gradient descent and the backpropagation algorithm.

TensorFlow implementation of a CNN

In the following example, we will see the CNN in action on an image classification problem. We want to show the process of building a CNN: what steps to execute, what reasoning is needed to properly dimension the entire network, and of course how to implement it with TensorFlow.

Initialization step

1. Load and prepare the MNIST data:

import tensorflow as tf
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

2. Define all the CNN parameters:

learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10

3. MNIST data input (each shape is a 28x28 array of pixels):

n_input = 784

4. The MNIST total classes (0-9 digits):

n_classes = 10

5. To reduce overfitting, we apply the dropout technique. This term refers to dropping out units (hidden, input, and output) in a neural network. Deciding which neurons to eliminate is random; one way is to apply a probability, as we shall see in our code. For this reason, we define the following parameter (to be tuned); a small demonstration of the operation follows these steps:

dropout = 0.75

6. Define the placeholders for the input graph. The x placeholder contains the MNIST data input (exactly 784 pixels):

x = tf.placeholder(tf.float32, [None, n_input])

7. Then we reshape the input images to a 4D tensor, using the TensorFlow reshape operator:

_X = tf.reshape(x, shape=[-1, 28, 28, 1])

The second and third dimensions correspond to the width and height of the image, while the last dimension is the total number of color channels (in our case 1).

So we can display our input image as a two-dimensional tensor, of size 28x28:

The input tensor for our problem

The output tensor will contain the output probability for each digit to classify:

y = tf.placeholder(tf.float32, [None, n_classes])
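
As anticipated in step 5, here is a minimal sketch of what tf.nn.dropout does (the input vector is a made-up toy value): each element is kept with probability keep_prob and, if kept, scaled by 1/keep_prob, so that the expected sum is unchanged:

import tensorflow as tf

toy = tf.constant([[1., 2., 3., 4.]])
dropped = tf.nn.dropout(toy, keep_prob=0.5)
with tf.Session() as demo_sess:
    # surviving entries are scaled by 2; the zeroed positions vary per run
    print demo_sess.run(dropped)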

First convolutional layer

Each neuron of the hidden layer is connected to a small subset of the input tensor, of dimension 5x5. With no padding this would give a 24x24 hidden layer; as we will see shortly, we use 'SAME' padding, so the convolutional output keeps the 28x28 size of the input. We also define and initialize the tensors of shared weights and shared bias:

wc1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
bc1 = tf.Variable(tf.random_normal([32]))

Recall that in order to recognize an image, we need more than one map of features. The number 32 is just the number of feature maps we are considering for this first layer. In our case, the convolutional layer is composed of 32 feature maps.

The next step is the construction of the first convolution layer, conv1:

conv1 = conv2d(_X, wc1, bc1)

Here, conv2d is the following function:

def conv2d(img, w, b):
    return tf.nn.relu(tf.nn.bias_add \
                      (tf.nn.conv2d(img, w, \
                                    strides=[1, 1, 1, 1], \
                                    padding='SAME'), b))

For this purpose, we used the TensorFlow tf.nn.conv2d function. It computes a 2D convolution from the input tensor and the tensor of shared weights; the result of this operation is then added to the bc1 bias matrix. tf.nn.relu is the ReLU function (Rectified Linear Unit), the usual activation function in the hidden layers of a deep neural network, and we apply it to the return value of the convolution function. The padding value is 'SAME', which indicates that the output tensor will have the same size as the input tensor.
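
To make the effect of the padding concrete, this small sketch (with dummy zero tensors, not the real MNIST data) compares the statically inferred output shapes for 'SAME' and 'VALID' padding:

import tensorflow as tf

img = tf.zeros([1, 28, 28, 1])
w5 = tf.zeros([5, 5, 1, 1])
same = tf.nn.conv2d(img, w5, strides=[1, 1, 1, 1], padding='SAME')
valid = tf.nn.conv2d(img, w5, strides=[1, 1, 1, 1], padding='VALID')
print same.get_shape()   # (1, 28, 28, 1): size preserved
print valid.get_shape()  # (1, 24, 24, 1): the 24x24 of the previous section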

One way to represent the convolutional layer, namely conv1, is as follows:

The first hidden layer

After the convolution operation, we impose the pooling step, which simplifies the output information of the previously created convolutional layer.

In our example, let's take a 2x2 region of the convolution layer, and we will summarize the information at each point in the pooling layer:

conv1 = max_pool(conv1, k=2)

Here, for the pooling operation, we have implemented the following function:

def max_pool(img, k):
    return tf.nn.max_pool(img, \
                          ksize=[1, k, k, 1], \
                          strides=[1, k, k, 1], \
                          padding='SAME')

The tf.nn.max_pool function performs the max pooling on the input. Of course, we apply max pooling for each convolutional layer, and there will be many layers of pooling and convolution. At the end of the pooling phase, we'll have a 14x14x32 convolutional hidden layer (the 2x2 pooling halves each of the 28x28 dimensions).

The next figure shows the CNN layers after the pooling and convolution operation:

The CNN after the first convolution and pooling operations

The last operation is to reduce overfitting by applying the tf.nn.dropout TensorFlow operator on the convolutional layer. To do this, we create a placeholder for the probability (keep_prob) that a neuron's output is kept during dropout:

keep_prob = tf.placeholder(tf.float32)
conv1 = tf.nn.dropout(conv1, keep_prob)

Second convolutional layer

For the second hidden layer, we must apply the same operations as the first layer, and so we define and initialize the tensors of shared weights and shared bias:

wc2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
bc2 = tf.Variable(tf.random_normal([64]))

As you can note, this second hidden layer will have 64 features for a 5x5 window, while the number of input channels is given by the first convolutional layer. We next apply a second convolutional layer to the conv1 tensor, but this time we apply 64 sets of 5x5 filters, each to the 32 conv1 layers:

conv2 = conv2d(conv1, wc2, bc2)

It gives us 64 14x14 arrays, which we reduce with max pooling to 64 7x7 arrays:

conv2 = max_pool(conv2, k=2)

Finally, we again use the dropout operation:

conv2 = tf.nn.dropout(conv2, keep_prob)

The resulting layer is a 7x7x64 tensor: we started from the 14x14 pooled maps, applied a 5x5 sliding window with stride 1 and 'SAME' padding (which preserves the 14x14 size), and then halved each dimension with the 2x2 max pooling.

Building the second hidden layer

Densely connected layer

In this step, we build a densely connected layer that we use to process the entire image. The weight and bias tensors are as follows:

wd1 = tf.Variable(tf.random_normal([7*7*64, 1024]))
bd1 = tf.Variable(tf.random_normal([1024]))

As you can note, this layer will be formed by 1024 neurons.

Then we reshape the tensor from the second convolutional layer into a batch of vectors:

dense1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])

We multiply this tensor by the weight matrix, wd1, add the bias tensor, bd1, and apply a ReLU operation:

dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, wd1), bd1))

We complete this layer by again using the dropout operator:

dense1 = tf.nn.dropout(dense1, keep_prob)

Readout layer

The last layer defines the wout and bout tensors:

wout = tf.Variable(tf.random_normal([1024, n_classes]))
bout = tf.Variable(tf.random_normal([n_classes]))

Before applying the softmax function, we must calculate the evidence that the image belongs to a certain class:

pred = tf.add(tf.matmul(dense1, wout), bout)

Testing and training the model

The evidence must be converted into probabilities for each of the 10 possible classes (the method is identical to what we saw in Chapter 4, Introducing Neural Networks). So we define the cost function, which evaluates the quality of our model, by applying the softmax function:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))

And its optimizer, using the TensorFlow AdamOptimizer function:

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

The following tensors will serve in the evaluation phase of the model:

correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

Launching the session

Initialize the variables:

init = tf.initialize_all_variables()

Build the evaluation graph:

with tf.Session() as sess:
    sess.run(init)
    step = 1

Let's train the net until training_iters:

    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)

Fit training using the batch data:

        sess.run(optimizer, feed_dict={x: batch_xs, \
                                       y: batch_ys, \
                                       keep_prob: dropout})
        if step % display_step == 0:

Calculate the accuracy:

            acc = sess.run(accuracy, feed_dict={x: batch_xs, \
                                                y: batch_ys, \
                                                keep_prob: 1.})

Calculate the loss:

            loss = sess.run(cost, feed_dict={x: batch_xs, \
                                             y: batch_ys, \
                                             keep_prob: 1.})
            print "Iter " + str(step * batch_size) + \
                  ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + \
                  ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"

We print the accuracy for the first 256 MNIST test images:

    print "Testing Accuracy:", \
          sess.run(accuracy, \
                   feed_dict={x: mnist.test.images[:256], \
                              y: mnist.test.labels[:256], \
                              keep_prob: 1.})

Running the code, we have the following output:

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Iter 1280, Minibatch Loss= 27900.769531, Training Accuracy= 0.17188
Iter 2560, Minibatch Loss= 17168.949219, Training Accuracy= 0.21094
Iter 3840, Minibatch Loss= 15000.724609, Training Accuracy= 0.41406
Iter 5120, Minibatch Loss= 8000.896484, Training Accuracy= 0.49219
Iter 6400, Minibatch Loss= 4587.275391, Training Accuracy= 0.61719
Iter 7680, Minibatch Loss= 5949.988281, Training Accuracy= 0.69531
Iter 8960, Minibatch Loss= 4932.690430, Training Accuracy= 0.70312
Iter 10240, Minibatch Loss= 5066.223633, Training Accuracy= 0.70312
....................
Iter 81920, Minibatch Loss= 442.895020, Training Accuracy= 0.93750
Iter 83200, Minibatch Loss= 273.936676, Training Accuracy= 0.93750
Iter 84480, Minibatch Loss= 1169.810303, Training Accuracy= 0.89062
Iter 85760, Minibatch Loss= 737.561157, Training Accuracy= 0.90625
Iter 87040, Minibatch Loss= 583.576965, Training Accuracy= 0.89844
Iter 88320, Minibatch Loss= 375.274475, Training Accuracy= 0.93750
Iter 89600, Minibatch Loss= 183.815613, Training Accuracy= 0.94531
Iter 90880, Minibatch Loss= 410.157867, Training Accuracy= 0.89844
Iter 92160, Minibatch Loss= 895.187683, Training Accuracy= 0.84375
Iter 93440, Minibatch Loss= 819.893555, Training Accuracy= 0.89062
Iter 94720, Minibatch Loss= 460.179779, Training Accuracy= 0.90625
Iter 96000, Minibatch Loss= 514.344482, Training Accuracy= 0.87500
Iter 97280, Minibatch Loss= 507.836975, Training Accuracy= 0.89844
Iter 98560, Minibatch Loss= 353.565735, Training Accuracy= 0.92188
Iter 99840, Minibatch Loss= 195.138626, Training Accuracy= 0.93750
Optimization Finished!
Testing Accuracy: 0.921875

It provides an accuracy of about 92% on the first 256 test images. Obviously, this does not represent the state of the art, because the purpose of the example is just to see how to build a CNN. The model can be further refined to give better results.

Source code

# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf

# Parameters
learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10

# Network Parameters
n_input = 784 # MNIST data input (img shape: 28*28)
n_classes = 10 # MNIST total classes (0-9 digits)
dropout = 0.75 # Dropout, probability to keep units

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
# dropout (keep probability)
keep_prob = tf.placeholder(tf.float32)

# Create model
def conv2d(img, w, b):
    return tf.nn.relu(tf.nn.bias_add \
                      (tf.nn.conv2d(img, w, \
                                    strides=[1, 1, 1, 1], \
                                    padding='SAME'), b))

def max_pool(img, k):
    return tf.nn.max_pool(img, \
                          ksize=[1, k, k, 1], \
                          strides=[1, k, k, 1], \
                          padding='SAME')

# Store layers weight & bias
# 5x5 conv, 1 input, 32 outputs
wc1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
bc1 = tf.Variable(tf.random_normal([32]))
# 5x5 conv, 32 inputs, 64 outputs
wc2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
bc2 = tf.Variable(tf.random_normal([64]))
# fully connected, 7*7*64 inputs, 1024 outputs
wd1 = tf.Variable(tf.random_normal([7*7*64, 1024]))
# 1024 inputs, 10 outputs (class prediction)
wout = tf.Variable(tf.random_normal([1024, n_classes]))
bd1 = tf.Variable(tf.random_normal([1024]))
bout = tf.Variable(tf.random_normal([n_classes]))

# Construct model
_X = tf.reshape(x, shape=[-1, 28, 28, 1])

# Convolution Layer
conv1 = conv2d(_X, wc1, bc1)
# Max Pooling (down-sampling)
conv1 = max_pool(conv1, k=2)
# Apply Dropout
conv1 = tf.nn.dropout(conv1, keep_prob)

# Convolution Layer
conv2 = conv2d(conv1, wc2, bc2)
# Max Pooling (down-sampling)
conv2 = max_pool(conv2, k=2)
# Apply Dropout
conv2 = tf.nn.dropout(conv2, keep_prob)

# Fully connected layer
# Reshape conv2 output to fit dense layer input
dense1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])
# Relu activation
dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, wd1), bd1))
# Apply Dropout
dense1 = tf.nn.dropout(dense1, keep_prob)

# Output, class prediction
pred = tf.add(tf.matmul(dense1, wout), bout)

# Define loss and optimizer
cost = tf.reduce_mean \
       (tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = \
    tf.train.AdamOptimizer \
    (learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        # Fit training using batch data
        sess.run(optimizer, feed_dict={x: batch_xs, \
                                       y: batch_ys, \
                                       keep_prob: dropout})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_xs, \
                                                y: batch_ys, \
                                                keep_prob: 1.})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_xs, \
                                             y: batch_ys, \
                                             keep_prob: 1.})
            print "Iter " + str(step * batch_size) + \
                  ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + \
                  ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"
    # Calculate accuracy for 256 mnist test images
    print "Testing Accuracy:", \
          sess.run(accuracy, \
                   feed_dict={x: mnist.test.images[:256], \
                              y: mnist.test.labels[:256], \
                              keep_prob: 1.})

Recurrent neural networks

Another deep learning-oriented architecture is that of the so-called recurrent neural networks (RNNs). The basic idea of RNNs is to make use of the sequential nature of the input. In ordinary neural networks, we typically assume that each input and output is independent of all the others. For many types of problems, however, this assumption does not hold. For example, if you want to predict the next word of a phrase, it is certainly important to know the words that precede it. These neural nets are called recurrent because they perform the same computations for all the elements of a sequence of inputs, and the output for each element depends, in addition to the current input, on all the previous computations.

RNN architecture

RNNs process a sequential input one item at a time, maintaining a sort of updated state vector that contains information about all the past elements of the sequence. In general, an RNN has a shape of the following type:

RNN architecture schema

The preceding figure shows an RNN together with its unfolded version, making the network structure explicit for the whole sequence of inputs, at each instant of time. It becomes clear that, differently from typical multi-level neural networks, which use different parameters at each level, an RNN always uses the same parameters, denominated U, V, and W (see the previous figure). Furthermore, an RNN performs the same computation at each instant, on successive elements of the same input sequence. Sharing the same parameters strongly reduces the number of parameters that the network must learn during the training phase, thus also improving the training time.

It is also evident how you can train networks of this type: because the parameters are shared at every instant of time, the gradient calculated for each output depends not only on the current computation but also on the previous ones. For example, to calculate the gradient at time t = 4, it is necessary to backpropagate the gradient through the three previous instants of time and then sum the gradients thus obtained. Also, the entire input sequence is typically considered to be a single element of the training set.

However, the training of this type of network suffers from the so-called vanishing/exploding gradient problem; the gradients, computed and backpropagated, tend to increase or decrease at each instant of time and then, after a certain number of instants of time, diverge to infinity or converge to zero.
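
A minimal numpy sketch can illustrate the phenomenon: backpropagating through time repeatedly multiplies the gradient by the same recurrent matrix, so its norm shrinks or blows up depending on that matrix's dominant factor (the matrices here are toy diagonal examples, chosen only for illustration):

import numpy as np

grad = np.ones(10)
W_small = 0.5 * np.eye(10)   # factor < 1 at each time step
W_large = 1.5 * np.eye(10)   # factor > 1 at each time step
g_small, g_large = grad.copy(), grad.copy()
for t in range(20):          # backpropagate through 20 instants of time
    g_small = W_small.dot(g_small)
    g_large = W_large.dot(g_large)
print np.linalg.norm(g_small)  # ~3e-06: vanishing
print np.linalg.norm(g_large)  # ~1e+04: exploding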

Let us now examine how an RNN operates. Xt is the network input at instant t, which could be, for example, a vector that represents a word of a sentence, while St is the state vector of the net. It can be considered a sort of memory of the system, which contains information on all the previous elements of the input sequence. The state vector at instant t is evaluated starting from the current input (time t) and the state evaluated at the previous instant (time t-1) through the U and W parameters:

St = f(U * Xt + W * St-1)

The function f is a non-linear function such as the rectified linear unit (ReLU), while Ot is the output at instant t, calculated using the V parameter.

The output will depend on the type of problem for which the network is used. For example, if you want to predict the next word of a sentence, it could be a probability vector with respect to each word in the vocabulary of the system.
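
The following minimal numpy sketch implements one step of this recurrence; the sizes and the random U, W, and V matrices are toy values chosen only for illustration:

import numpy as np

def rnn_step(x_t, s_prev, U, W, V):
    # St = f(U Xt + W St-1), with f = tanh here
    s_t = np.tanh(U.dot(x_t) + W.dot(s_prev))
    # Ot: for word prediction, a softmax over the vocabulary
    scores = V.dot(s_t)
    o_t = np.exp(scores) / np.sum(np.exp(scores))
    return s_t, o_t

n_input, n_state, n_vocab = 4, 3, 5          # toy sizes
U = 0.1 * np.random.randn(n_state, n_input)
W = 0.1 * np.random.randn(n_state, n_state)
V = 0.1 * np.random.randn(n_vocab, n_state)
s = np.zeros(n_state)                        # initial state
for x_t in np.eye(n_input):                  # a toy one-hot input sequence
    s, o = rnn_step(x_t, s, U, W, V)         # same U, W, V at every instant
print o  # probability vector over the toy vocabulary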

LSTM networks

Long Short-Term Memory (LSTM) networks are an extension of the basic model of RNN architectures. The main idea is to improve the network, providing it with an explicit memory. LSTM networks, in fact, despite not having an essentially different architecture from RNNs, are equipped with special hidden units, called memory cells, whose behavior is to remember the previous input for a long time.

An LSTM unit

The LSTM unit has three gates and four input weights, xt (from the data to the input and the three gates), while ht is the output of the unit.

An LSTM block contains gates that determine whether an input is significant enough to be saved. This block is formed by four units:

Input gate: Allows the value to be input into the structure
Forget gate: Eliminates the values contained in the structure
Output gate: Determines when the unit will output the values trapped in the structure
Cell: Enables or disables the memory cell

In the next example, we will see a TensorFlow implementation of an LSTM network in a language processing problem.

NLP with TensorFlow

RNNs have proved to have excellent performance in problems such as predicting the next character in a text or, similarly, predicting the next word in a sentence. However, they are also used for more complex problems, such as Machine Translation. In this case, the network has as input a sequence of words in a source language, while you want to output the corresponding sequence of words in a target language. Finally, another application of great importance in which RNNs are widely used is speech recognition. In the following, we will develop a computational model that can predict the next word in a text based on the sequence of the preceding words. To measure the accuracy of the model, we will use the Penn Tree Bank (PTB) dataset, which is the benchmark used to measure the precision of these models.

This example refers to the files that you find in the /rnn/ptb directory of your TensorFlow distribution. It comprises the following two files:

ptb_word_lm.py: The code to train a language model on the PTB dataset
reader.py: The code to read the dataset

Unlike previous examples, we will present only the pseudocode of the implemented procedure, in order to understand the main ideas behind the construction of the model without getting bogged down in unnecessary implementation details. The source code is quite long, and an explanation of the code line by line would be too cumbersome.

Note

See https://www.tensorflow.org/versions/r0.8/tutorials/recurrent/index.html for other references.

Download the data

You can download the data from the webpage http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz and then extract the data folder. The dataset is preprocessed and contains 10,000 different words, including the end-of-sentence marker and a special symbol (<unk>) for rare words. In reader.py, we convert all of them to unique integer identifiers to make it easy for the neural network to process; a minimal sketch of this conversion is shown after the extraction command below.

To extract a .tgz file with tar, you need to use the following:

tar -xvzf /path/to/yourfile.tgz
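
Here is the minimal sketch of the word-to-id conversion, assuming a whitespace-tokenized text; the toy sentence and the build_vocab helper are illustrative, not the actual reader.py code:

import collections

def build_vocab(words):
    counter = collections.Counter(words)
    # more frequent words get smaller integer identifiers
    pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
    return dict((word, i) for i, (word, _) in enumerate(pairs))

words = "the cat sat on the mat <eos>".split()
word_to_id = build_vocab(words)
print [word_to_id[w] for w in words]  # [0, 2, 5, 4, 0, 3, 1]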

Building the model

This model implements an architecture of the RNN using LSTM. In fact, it extends the RNN architecture by including memory units that allow saving information regarding long-term temporal dependencies.

The TensorFlow library allows you to create an LSTM through the following command:

lstm = rnn_cell.BasicLSTMCell(size)

Here size is the number of units of the LSTM. The LSTM memory is initialized to zero:

state = tf.zeros([batch_size, lstm.state_size])

In the course of the computation, after each word is examined the state value is updated with the output value. The following is the pseudocode of the implemented steps:

loss = 0.0
for current_batch_of_words in words_in_dataset:
    output, state = lstm(current_batch_of_words, state)

output is then used to make predictions of the next word:

    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)

The loss function minimizes the average negative log probability of the target words; it is the TensorFlow function:

tf.nn.seq2seq.sequence_loss_by_example

It computes the average per-word perplexity; its value measures the accuracy of the model (lower values correspond to better performance) and will be monitored throughout the training process.
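
Perplexity is simply the exponential of the average negative log probability assigned to the target words, as this tiny sketch shows (the probabilities are made-up values):

import numpy as np

target_probs = np.array([0.2, 0.5, 0.1, 0.4])  # hypothetical model outputs
perplexity = np.exp(-np.mean(np.log(target_probs)))
print perplexity  # ~3.98; a uniform guess over 10,000 words would give 10000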

Running the code

The implemented model supports three types of configurations: small, medium, and large. The difference between them is in the size of the LSTMs and the set of hyperparameters used for training. The larger the model, the better results it should get. The small model should be able to reach perplexity below 120 on the test set and the large one below 80, though it might take several hours to train.

To execute the model, simply type the following:

python ptb_word_lm.py --data_path=/tmp/simple-examples/data/ --model small

In /tmp/simple-examples/data/, you must have downloaded the data from the PTB dataset.

The following list shows the run after 8 hours of training (13 epochs for a small configuration):

Epoch: 1 Learning rate: 1.000
0.004 perplexity: 5263.762 speed: 391 wps
0.104 perplexity: 837.607 speed: 429 wps
0.204 perplexity: 617.207 speed: 442 wps
0.304 perplexity: 498.160 speed: 438 wps
0.404 perplexity: 430.516 speed: 436 wps
0.504 perplexity: 386.339 speed: 427 wps
0.604 perplexity: 348.393 speed: 431 wps
0.703 perplexity: 322.351 speed: 432 wps
0.803 perplexity: 301.630 speed: 431 wps
0.903 perplexity: 282.417 speed: 434 wps
Epoch: 1 Train Perplexity: 268.124
Epoch: 1 Valid Perplexity: 180.210
Epoch: 2 Learning rate: 1.000
0.004 perplexity: 209.082 speed: 448 wps
0.104 perplexity: 150.589 speed: 437 wps
0.204 perplexity: 157.965 speed: 436 wps
0.304 perplexity: 152.896 speed: 453 wps
0.404 perplexity: 150.299 speed: 458 wps
0.504 perplexity: 147.984 speed: 462 wps
0.604 perplexity: 143.367 speed: 462 wps
0.703 perplexity: 141.246 speed: 446 wps
0.803 perplexity: 139.299 speed: 436 wps
0.903 perplexity: 135.632 speed: 435 wps
Epoch: 2 Train Perplexity: 133.576
Epoch: 2 Valid Perplexity: 143.072
............................................................
Epoch: 12 Learning rate: 0.008
0.004 perplexity: 57.011 speed: 347 wps
0.104 perplexity: 41.305 speed: 356 wps
0.204 perplexity: 45.136 speed: 356 wps
0.304 perplexity: 43.386 speed: 357 wps
0.404 perplexity: 42.624 speed: 358 wps
0.504 perplexity: 41.980 speed: 358 wps
0.604 perplexity: 40.549 speed: 357 wps
0.703 perplexity: 39.943 speed: 357 wps
0.803 perplexity: 39.287 speed: 358 wps
0.903 perplexity: 37.949 speed: 359 wps
Epoch: 12 Train Perplexity: 37.125
Epoch: 12 Valid Perplexity: 123.571
Epoch: 13 Learning rate: 0.004
0.004 perplexity: 56.576 speed: 365 wps
0.104 perplexity: 40.989 speed: 358 wps
0.204 perplexity: 44.809 speed: 358 wps
0.304 perplexity: 43.082 speed: 356 wps
0.404 perplexity: 42.332 speed: 356 wps
0.504 perplexity: 41.694 speed: 356 wps
0.604 perplexity: 40.275 speed: 357 wps
0.703 perplexity: 39.673 speed: 356 wps
0.803 perplexity: 39.021 speed: 356 wps
0.903 perplexity: 37.690 speed: 356 wps
Epoch: 13 Train Perplexity: 36.869
Epoch: 13 Valid Perplexity: 123.358
Test Perplexity: 117.171

As you can see, the perplexity becomes lower after each epoch.

Summary

In this chapter, we gave an overview of deep learning techniques, examining two of the deep learning architectures in use, CNNs and RNNs. Through the TensorFlow library, we developed a convolutional neural network architecture for an image classification problem. The last part of the chapter was devoted to RNNs, where we described TensorFlow's tutorial for RNNs, in which an LSTM network is built to predict the next word in an English sentence.

The next chapter shows the TensorFlow facilities for GPU computing and introduces TensorFlow Serving, a high performance, open source serving system for machine learning models, designed for production environments and optimized for TensorFlow.

Chapter 6. GPU Programming and Serving with TensorFlow

In this chapter, we will cover the following topics:

GPU programming
TensorFlow Serving:
How to install TensorFlow Serving
How to use TensorFlow Serving
How to load and export a TensorFlow model

GPU programming

In Chapter 5, Deep Learning, where we trained a recurrent neural network (RNN) for an NLP application, we could see that deep learning applications can be computationally intensive. However, you can reduce the training time by using parallel programming techniques through a graphics processing unit (GPU). In fact, the computational resources of modern graphics units make them able to execute code portions in parallel, ensuring high performance.

The GPU programming model is a programming strategy that consists of offloading work from the CPU to a GPU to accelerate the execution of a variety of applications. The range of applications of this strategy is very large and is growing day by day; GPUs are currently able to reduce the execution time of applications across different platforms, from cars to mobile phones, and from tablets to drones and robots.

The following diagram shows how the GPU programming model works. In the application, there are calls that tell the CPU to hand specific parts of the code over to the GPU and let it run them to get high execution speed. The reason such parts run faster on the GPU lies in the speed provided by the GPU architecture: a GPU has many Streaming Multiprocessors (SMPs), each having many computational cores. These cores are capable of performing ALU and other operations in a Single Instruction Multiple Thread (SIMT) fashion, which reduces the execution time drastically.

In the GPU programming model, some pieces of code are executed sequentially on the CPU, and some parts are executed in parallel by the GPU

TensorFlow possesses capabilities that let you take advantage of this programming model (if you have an NVIDIA GPU); the package version that supports GPUs requires Cuda Toolkit 7.0 and CUDNN 6.5 V2.

Note

For the installation of the Cuda environment, we suggest referring to the Cuda installation page: http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/#axzz49w1XvzNj

TensorFlow refers to these devices in the following way:

/cpu:0: To reference the server CPU
/gpu:0: The server GPU, if there is only one
/gpu:1: The second server GPU, and so on

To find out which device is assigned to our operations and tensors, we need to create the session with the log_device_placement option set to True.

Consider the following example.

We create a computational graph; a and b will be two matrices:

a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3],
                name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2],
                name='b')

In c we put the matrix multiplication of these two input tensors:

c = tf.matmul(a, b)

Then we build a session with log_device_placement set to True:

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Finally, we launch the session:

print sess.run(c)

You should see the following output:

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]

If you would like a particular operation to run on a device of your choice instead of what's automatically selected for you, you can use tf.device to create a device context, so that all the operations within that context will have the same device assignment.

Let's create the same computational graph using the tf.device instruction:

with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3],
                    name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2],
                    name='b')
c = tf.matmul(a, b)

Again, we build the session graph and launch it:

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print sess.run(c)

You will see that now a and b are assigned to cpu:0:

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]

If you have more than one GPU, you can select one directly via tf.device; in addition, setting allow_soft_placement to True in the configuration options when creating the session lets TensorFlow automatically fall back to a supported device when the specified one is unavailable.
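
A minimal sketch of this configuration follows; /gpu:1 is a hypothetical second GPU, and with allow_soft_placement enabled the code still runs on machines that lack it:

import tensorflow as tf

config = tf.ConfigProto(allow_soft_placement=True,
                        log_device_placement=True)
with tf.device('/gpu:1'):          # hypothetical second GPU
    a = tf.constant([1.0, 2.0], name='a')
    b = tf.constant([3.0, 4.0], name='b')
    c = a + b
sess = tf.Session(config=config)
print sess.run(c)  # falls back to an available device if /gpu:1 is missing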

TensorFlow Serving

Serving is a TensorFlow package that has been developed to take machine learning models into production systems. It means that a developer can use TensorFlow Serving's APIs to build a server to serve the implemented model.

The served model will be able to make inferences and predictions each time on data presented by its clients, allowing the developer to improve the model.

To communicate with the serving system, the clients use a high performance, open source remote procedure call (RPC) interface developed by Google, called gRPC.

The typical pipeline (see the following figure) is that training data is fed to the learner, which outputs a model. After being validated, it is ready to be deployed to the TensorFlow Serving system. It is quite common to launch and iterate on our model over time, as new data becomes available, or as you improve the model.

TensorFlow Serving pipeline

How to install TensorFlow Serving

To compile and use TensorFlow Serving, you need to set up some prerequisites.

Bazel

TensorFlow Serving requires Bazel 0.2.0 (http://www.bazel.io/) or higher. Download bazel-0.2.0-installer-linux-x86_64.sh.

Note

Bazel is a tool that automates software builds and tests. Supported build tasks include running compilers and linkers to produce executable programs and libraries, and assembling deployable packages.

Run the following commands:

chmod +x bazel-0.2.0-installer-linux-x86_64.sh
./bazel-0.2.0-installer-linux-x86_64.sh --user

Finally, set up your environment. Export this in your ~/.bashrc file:

export PATH="$PATH:$HOME/bin"

gRPC

Our tutorials use gRPC (0.13 or higher) as our RPC framework.

Note

You can find other references at https://github.com/grpc.

TensorFlow Serving dependencies

To install the TensorFlow Serving dependencies, execute the following:

sudo apt-get update && sudo apt-get install -y \
     build-essential \
     curl \
     git \
     libfreetype6-dev \
     libpng12-dev \
     libzmq3-dev \
     pkg-config \
     python-dev \
     python-numpy \
     python-pip \
     software-properties-common \
     swig \
     zip \
     zlib1g-dev

Then configure TensorFlow by running the following command:

cd tensorflow
./configure
cd ..

Install Serving

Use Git to clone the repository:

git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving

The --recurse-submodules option is required to fetch TensorFlow, gRPC, and the other libraries that TensorFlow Serving depends on. To build TensorFlow Serving, you must use Bazel:

bazel build tensorflow_serving/...

The binaries will be placed in the bazel-bin directory, and can be run using a command such as the following:

bazel-bin/tensorflow_serving/example/mnist_inference

Finally, you can test the installation by executing the following command:

bazel test tensorflow_serving/...

How to use TensorFlow Serving

In this tutorial, we will show how to export a trained TensorFlow model and build a server to serve the exported model. The implemented model is a Softmax Regression model for handwritten image classification (MNIST data).

The code consists of two parts:

A Python file (mnist_export.py) that trains and exports the model
A C++ file (mnist_inference.cc) that loads the exported model and runs a gRPC service to serve it

In the following sections, we report the basic steps to use TensorFlow Serving. For other references, you can view https://tensorflow.github.io/serving/serving_basic.

Training and exporting the TensorFlow model

As you can see in mnist_export.py, the training is done the same way as in the MNIST beginners tutorial; refer to the following link:

https://www.tensorflow.org/versions/r0.9/tutorials/mnist/beginners/index.html

The TensorFlow graph is launched in the TensorFlow session sess, with the input tensor (image) as x and the output tensor (Softmax score) as y. Then we use the TensorFlow Serving exporter to export the model; it builds a snapshot of the trained model so that it can be loaded later for inference. Let's now see the main functions used to export a trained model.

Import the exporter to serialize the model:

from tensorflow_serving.session_bundle import exporter

Then you must define saver, using the TensorFlow function tf.train.Saver with the sharded parameter equal to True:

saver = tf.train.Saver(sharded=True)

saver is used to serialize graph variable values to the model export so that they can be properly restored later.

The next step is to define model_exporter:

model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature \
            (input_tensor=x, scores_tensor=y)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)

model_exporter takes the following two arguments:

sess.graph.as_graph_def() is the protobuf of the graph. Exporting will serialize the protobuf to the model export so that the TensorFlow graph can be properly restored later.
default_graph_signature=signature specifies a model export signature. The signature specifies what type of model is being exported, and the input/output tensors to bind to when running inference. In this case, you use exporter.classification_signature to specify that the model is a classification model.

Finally, we create our export:

model_exporter.export(export_path, tf.constant \
                      (FLAGS.export_version), sess)

model_exporter.export takes the following arguments:

export_path is the path of the export directory. Export will create the directory if it does not exist.
tf.constant(FLAGS.export_version) is a tensor that specifies the version of the model. You should specify a larger integer value when exporting a newer version of the same model. Each version will be exported to a different sub-directory under the given path.
sess is the TensorFlow session that holds the trained model you are exporting.

Running a session

To export the model, first clear the export directory:

$> rm -rf /tmp/mnist_model

Then, using bazel, build the mnist_export example:

$> bazel build //tensorflow_serving/example:mnist_export

Finally, you can run the following example:

$> bazel-bin/tensorflow_serving/example/mnist_export /tmp/mnist_model
Training model...
Done training!
Exporting trained model to /tmp/mnist_model
Done exporting!

Looking in the export directory, we should have a sub-directory for each exported version of the model:

$> ls /tmp/mnist_model
00000001

The corresponding sub-directory has the default value of 1, because we specified tf.constant(FLAGS.export_version) as the model version earlier, and FLAGS.export_version has the default value of 1.

Each version sub-directory contains the following files:

export.meta is the serialized tensorflow::MetaGraphDef of the model. It includes the graph definition of the model, as well as metadata of the model, such as signatures.
export-?????-of-????? are files that hold the serialized variables of the graph.

$> ls /tmp/mnist_model/00000001
checkpoint export-00000-of-00001 export.meta

Loading and exporting a TensorFlow model

The C++ code for loading the exported TensorFlow model is in the main() function in mnist_inference.cc. Here we report an excerpt; we do not consider the parameters for batching. If you want to adjust the maximum batch size, timeout threshold, or the number of background threads used for batched inference, you can do so by setting more values in BatchingParameters:

int main(int argc, char** argv)
{
  SessionBundleConfig session_bundle_config;
  ... Here batching parameters
  std::unique_ptr<SessionBundleFactory> bundle_factory;
  TF_QCHECK_OK(
      SessionBundleFactory::Create(session_bundle_config,
                                   &bundle_factory));
  std::unique_ptr<SessionBundle> bundle(new SessionBundle);
  TF_QCHECK_OK(bundle_factory->CreateSessionBundle(bundle_path,
                                                   &bundle));
  ......
  RunServer(FLAGS_port, std::move(bundle));
  return 0;
}

SessionBundle is a component of TensorFlow Serving. Let's consider the include file SessionBundle.h:

struct SessionBundle {
  std::unique_ptr<tensorflow::Session> session;
  tensorflow::MetaGraphDef meta_graph_def;
};

The session parameter is a TensorFlow session that has the original graph with the necessary variables properly restored.

SessionBundleFactory::CreateSessionBundle() loads the exported TensorFlow model from bundle_path and creates a SessionBundle object for running inference with the model.

RunServer brings up a gRPC server that exports a single Classify() API.

Each inference request will be processed in the following steps:

1. Verify the input. The server expects exactly one MNIST-format image for each inference request.
2. Transform the input to an inference input tensor and create an output tensor placeholder.
3. Run inference.

To run an inference, you must type the following commands:

$> bazel build //tensorflow_serving/example:mnist_inference
$> bazel-bin/tensorflow_serving/example/mnist_inference --port=9000 /tmp/mnist_model/00000001

Test the server

To test the server, we use the mnist_client.py utility (https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_client.py).

This client downloads MNIST test data, sends it as requests to the server, and calculates the inference error rate.

To run it, type the following commands:

$> bazel build //tensorflow_serving/example:mnist_client
$> bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=localhost:9000
Inference error rate: 10.5%

The result confirms that the server loads and runs the trained model successfully. In fact, a 10.5% inference error rate on 1,000 images corresponds to about 89.5% accuracy for the trained Softmax model.

Summary

We described two important features of TensorFlow in this chapter. First was the possibility of using the programming model known as GPU computing, with which it becomes possible to speed up code (for example, the training phase of a neural network). The second part of the chapter was devoted to describing the TensorFlow Serving framework. It is a high performance, open source serving system for machine learning models, designed for production environments and optimized for TensorFlow. This powerful framework can run multiple large-scale models that change over time, based on real-world data, enabling a more efficient use of GPU resources and allowing developers to improve their own machine learning models.