Just Enough Python - Cloudera · PDF file§ Explana-on of how Python syntax is represented...
Transcript of Just Enough Python - Cloudera · PDF file§ Explana-on of how Python syntax is represented...
JustEnoughPython
201511a
Introduc5onChapter1
01-3©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
CourseChapters
§ Introduc.on
§ Introduc5ontoPython
§ Variables
§ Collec5ons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
01-4©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
ChapterTopics
Introduc.on
§ AboutthisCourse
§ AboutCloudera
§ CourseLogis5cs
§ Introduc5ons
01-5©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
Duringthiscourse,youwilllearn
§ “JustEnough”PythonProgramming– “JustEnough”meanstoenableasolidfounda5onforHands-On
ExercisesinClouderaDeveloperclasses
– Not“proficientasaPythonprogrammer”(Pythonista,Pythoneer)
CourseObjec5ves
01-6©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
§ Studentsshouldpossessthefollowingprerequisitesforthiscourse– InterestinoneofCloudera’sdeveloper-orientedcourses– Someprogrammingexperience
– Nospecificlanguageorlevelofexperience– Familiaritywithobject-orientedprogrammingconcepts
– Basicskillsandvocabulary§ Thefollowingarenotrequired(andarenotcoveredinthiscourse)
– ExperiencewithClouderaproducts– ExperiencewithHadooporSpark– Experiencewithdataanaly5cs
AudienceBackground
01-7©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
ChapterTopics
Introduc.on
§ AboutthisCourse
§ AboutCloudera
§ CourseLogis5cs
§ Introduc5ons
01-8©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
§ TheleaderinApacheHadoop-basedsoQwareandservices
§ FoundedbyHadoopexpertsfromFacebook,Yahoo,Google,andOracle
§ Providessupport,consul.ng,training,andcer.fica.onforHadoopusers
§ StaffincludescommiZerstovirtuallyallHadoopprojects
§ ManyauthorsofindustrystandardbooksonApacheHadoopprojects
AboutCloudera(1)
01-9©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
§ OurcustomersincludemanykeyusersofHadoop
§ Weofferseveralpublictrainingcourses,suchas– ClouderaDeveloperTrainingforApacheSparkandHadoop– ClouderaAdministratorTrainingforApacheHadoop– ClouderaDataAnalystTraining:UsingPig,Hive,andImpalawithHadoop– DesigningandBuildingBigDataApplicaCons– DataScienceatScaleUsingSparkandHadoop– ClouderaTrainingforApacheHBase
§ On-siteandcustomizedtrainingisalsoavailable
AboutCloudera(2)
01-10©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
CDH(Cloudera’sDistribu.onincludingApacheHadoop)
CDH
§ 100%opensource,enterprise-readydistribu.onofHadoopandrelatedprojects
§ Themostcomplete,tested,andwidely-deployeddistribu.onofHadoop
§ IntegratesallthekeyHadoopecosystemprojects
§ AvailableasRPMsandUbuntu,Debian,orSuSEpackages,orasatarball
BATCH
PROCESSING
MapReduce,Hive,Pig
ANALYTICSQL
Impala
SEARCHENGINE
ClouderaSearch
MACHINELEARNING
Spark,
MapReduce,Mahout
STREAM
PROCESSING
Spark
3rdPARTYAPPS
Partners
WORKLOADMANAGEMENT(YARN)
STORAGEFORANYTYPEOFDATAUNIFIED,ELASTIC,RESILIENT,SECURE(Sentry)
FilesystemHDFS
OnlineNoSQLHBase
DATAINTEGRATION(Sqoop,Flume,NFS)
METAD
ATA
Engines
01-11©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
§ ClouderaExpress– Completelyfreeto
downloadanduse
§ ThebestwaytogetstartedwithHadoop
§ IncludesCDH
§ IncludesClouderaManager– End-to-endadministra5onfor
Hadoop
– Deploy,manage,and
monitoryourcluster
ClouderaExpress
01-12©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
§ ClouderaEnterprise– Subscrip5onproductincludingCDHandClouderaManager
§ Includessupport
§ IncludesextraClouderaManagerfeatures– Configura5onhistoryandrollbacks– Rollingupdates– LDAPintegra5on– SNMPsupport
– Automateddisasterrecovery
§ Extendcapabili.eswithClouderaNavigatorsubscrip.on– Eventaudi5ng,metadatataggingcapabili5es,lineageexplora5on
– AvailableinboththeClouderaEnterpriseFlexandDataHubedi5ons
ClouderaEnterprise
01-13©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
ChapterTopics
Introduc.on
§ AboutthisCourse
§ AboutCloudera
§ CourseLogis.cs
§ Introduc5ons
01-14©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
§ Classstartandfinish.mes
§ Lunch
§ Breaks
§ Restrooms
§ Wi-Fiaccess
§ Virtualmachines
Logis5cs
Yourinstructorwillgiveyoudetailsonhowtoaccessthecoursematerialsandexerciseinstruc.onsfortheclass
01-15©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
ChapterTopics
Introduc.on
§ AboutthisCourse
§ AboutCloudera
§ CourseLogis5cs
§ Introduc.ons
01-16©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriEenconsentfromCloudera.
§ Aboutyourinstructor
§ Aboutyou– Wheredoyouwork?Whatdoyoudothere?
– DoyouhaveexperiencewithUNIXorLinux?– Whatprogramminglanguageshaveyouused?
– Howmuchprogrammingexperiencedoyouhave?
– Whatdoyouexpecttogainfromthiscourse?
Introduc5ons
Introduc)ontoPythonChapter2
02-2©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
CourseChapters
§ Introduc)on
§ Introduc-ontoPython
§ Variables
§ Collec)ons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
02-3©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
Inthischapteryouwilllearn
§ ThehistoryofPythonandwhatmakesPythonspecialamonglanguages
§ Thepurposeandscopeoftheclass,andwhatpartsofPythonwillbediscussedandwhattopicsarebeyondtheaimsofthisclass
§ Explana-onofhowPythonsyntaxisrepresentedintheclassslides
§ Introduc-ontotheHands-OnExerciseenvironmentincludingIPython
Introduc)ontoPython
02-4©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
ChapterTopics
Introduc-ontoPython
§ PythonBackgroundInforma-on
§ Scope
§ Exercises
§ Essen)alPoints
§ Hands-OnExercise:UsingIPython
§ Hands-OnExercise:LoudacreMobile
02-5©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Generalpurposehigh-levelprogramminglanguage– Mul)-paradigm(func)onal,object-oriented,procedural,logical)
§ StartedbyGuidovanRossumin1989– Version1.0releasedin1994– TransferredtothePythonSoYwareFounda)onin2001
§ Designphilosophy– Readability
– ExplicitisbeBerthanimplicit– SimpleisbeBerthancomplex,complexbeBerthancomplicated
– Fewerlinesofcode,easiercoding– Dynamictyping– Automa)cmemorymanagement– Comprehensivelibrary
Python
02-6©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
ChapterTopics
Introduc-ontoPython
§ PythonBackgroundInforma)on
§ Scope
§ Exercises
§ Essen)alPoints
§ Hands-OnExercise:UsingIPython
§ Hands-OnExercise:LoudacreMobile
02-7©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Python2.6/2.7(not3.x)
§ Variables– integer,float,boolean,string
§ Collec-ons– list[],tuple(),set{},frozenset,dictionary
§ FlowControl– Codeblocks,if,if else,if elif,else – while,for in range,for in list,try
§ ProgramStructure– Func)ons,anonymousfunc)ons,generators
§ WorkingwithLibraries– import,fromimport,sys,math
InScope
02-8©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Randomnumbers
§ Sta-s-calordate/-mefunc-ons
§ Advancedstringmanipula-on
§ Serializa-on,JSON,orXML
§ Timefunc-onsor-mers
§ GUIsorWebapplica-ons
§ DatabaseI/OornetworkI/O
§ Pythonconcurrencyandprocessmanagement
§ Codedebuggingorprototypingfacili-es
§ Applica-onpackaginganddeployment
OutofScope
02-9©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
ChapterTopics
Introduc-ontoPython
§ PythonBackgroundInforma)on
§ Scope
§ Exercises
§ Essen)alPoints
§ Hands-OnExercise:UsingIPython
§ Hands-OnExercise:LoudacreMobile
02-10©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ ThePythonshellisenteredbytypingpython – Thedefaultshell– Doesnotinterfacewiththehostfilesystem,nopathcomple)on– Doesnotprovidecommandcomple)on– IsREPL:Read–Evaluate–Print–Loop
– Whenyouenteranexpression,Pythonimmediatelyevaluatesit,assignsthevaluetoanimplicitvariable,andprintsittotheconsole– REPLisexcellentforinterac)vecodeexplora)on
PythonShell–REPL
$ python >>> a = 123 >>> 123 >>> type(a) >>> type <'int'>
02-11©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ TheIPythonshellisenteredbytypingipython – IPythonisaseparateprojectfromPython– Providesprogrammingfeaturesoverthedefaultshell– Filesysteminterac)on,ls,pathnamecomple)on(tab)– Commandcomple)onandsugges)ons(tab-tab)– IntegratedPythonhelp/documenta)onwithhelp() – Runfilesfromwithintheshellwithrun filename – Interrogateobjectswith?object – Editfileswith%ed – AYeredi)ngtheprogram,ifyouexitusing[Esc]wq,theprogramwillberunThisistheshellwewilluseinclass
IPythonShell
$ ipython In [1]: a = 123 In [2]: print a 123 In [3]: type(a) Out[3]: int
ExampleofthelookoftheIPythonshell
02-12©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Keywordsandsyntaxinbold
§ Developer-suppliednamesandvariablesinitalics
§ Developer-providedcodeingray
§ Call-outsinitalicbluetext
Forma`ngConven)onsofDocumenta)on
def name(parameters): code-block return variable
Optional return variable
02-13©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Codeexamplesareshownonapalebluebackground
§ Thecodeyouenterisdisplayedwithoutanypromptinblacktype
§ Thesystemresponseisshownwitha>promptandinbluetype
Forma`ngConven)onsofCodeExamples
s = "Titanic 4000" print s > Titanic 4000
months={1:'JAN',2:'FEB',3:'MAR',4:'APR',5:'MAY', \ 6:'JUN',7:'JUL',8:'AUG',9:'SEP',10:'OCT', \ 11:'NOV',12:'DEC'}
Line continuation
Pythonhasa79-characterlinelimit.Con-nuelinesexplicitlywithasinglebackslash.
02-14©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Thissymbol,intheupperrightcorneroftheslide,meansthatthetechniqueisfrequentlyusedinotherClouderaclasses
UsedInOtherClouderaTraining
02-15©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Loudacreisamobilephonecarrier– Theyhaveprovideduswithrealis)cdevicelogdatafromtheirmobilephoneswhichwewillbeusinginourexercises– Loudacreisafic)onalcompany
§ Aboutthelogdata– Every)meaphonehasacatastrophicerrorrequiringasoYorwarmrestart,thephonereportsdevicestatusinforma)onandsendsittothecentralsystemwhereitiscollectedforlateranalysis
LoudacreMobile
mobileudacreoL
02-16©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Usethebuilt-infileobjectanditsmethods– Methods:open(),read(),readline(),andclose() – Filemodes:r=read,w=write,a=append– Fileformat:b=binary,t=text(default)
BasicFileI/O
line =' ' file = open('loudacre.log','rt') while True: line = file.readline() if not line: break print line file.close()
02-17©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Usethebuilt-inprint()func-on– Basicprin)ngofvariables:print var1, var2, var3 – PrintformaBedstring:
– print formatstring %(var1, var2, var3) – Forma`ng:%d=integer,%r=real(float),%s=string
BasicPrin)ngandKeyboardInput
s = "Sorrento F41L" print s formatstring = "Device temperature %d to %r celsius" print formatstring %(24, 31.24) > Sorrento F41L > Device temperature 24 to 31.24 Celsius
02-18©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Usetheraw_input()func-on
BasicKeyboardInput
s = raw_input("Enter: ") print s > Enter: > Enter: Titanic 4000 > Titanic 4000
02-19©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
ChapterTopics
Introduc-ontoPython
§ PythonBackgroundInforma)on
§ Scope
§ Exercises
§ Essen-alPoints
§ Hands-OnExercise:UsingIPython
§ Hands-OnExercise:LoudacreMobile
02-20©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Backgroundinforma-onaboutPython
§ Scopeandversioncoverage
§ Symbolismandsyntaxusedinslides
§ Introduc-ontoexercises,IPythonshell,andLoudacre
Essen)alPoints
02-21©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
ChapterTopics
Introduc-ontoPython
§ PythonBackgroundInforma)on
§ Scope
§ Exercises
§ Essen)alPoints
§ Hands-OnExercise:UsingIPython
§ Hands-OnExercise:LoudacreMobile
02-22©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ StartusingIPythonshell– Usetheshellinterac)vely(REPL)– TrysomesimplePython– UsetheIPythonshelltoexecuteOScommands,pwdandls – Trytheintegratedhelp()system– Editandrunaprogramusing%ed
Hands-OnExercise:UsingIPython
02-23©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
ChapterTopics
Introduc-ontoPython
§ PythonBackgroundInforma)on
§ Scope
§ Exercises
§ Essen)alPoints
§ Hands-OnExercise:UsingIPython
§ Hands-OnExercise:LoudacreMobile
02-24©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriBenconsentfromCloudera.
§ Introduc-ontotheloudacre.logdata– EditandrunafileI/Oprogram– Readeachlineofloudacre.logandprintit
Hands-OnExercise:LoudacreMobile
VariablesChapter3
03-2©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
CourseChapters
§ IntroducEon
§ IntroducEontoPython
§ Variables
§ CollecEons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
03-3©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
Variables
Inthischapteryouwilllearn
§ Howtocreatenewvariableswithoutexplicitlydeclaringthem,usingPython'sdynamictyping
§ HowtodisAnguishbetweenMutableandImmutableproperAesofvariables
§ WhatisthebasicusageandmanipulaAonofnumericalvariables,booleans,andstrings?
§ HowtobuildsophisAcatedstringtransformaAonsusingchaining,slices,andconcatenaAontechniques
03-4©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
ChapterTopics
Variables
§ PythonVariables
§ Numerical
§ Boolean
§ String
§ EssenEalPoints
§ Hands-OnExercise:Variables
03-5©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ BasicI/O– raw_input() – print()
§ VariableNames– IniEalle@er,le@ersandnumbers,casesensiEve,underscorebutnootherspecialcharacters– Reservedwordsofthelanguage,keywords,prohibited– __(doubleunderscore)reservedforsomebuilt-inobjectmethods– ConvenEon:
– IniEalcapsforclasses,iniEallowercaseforinstances
Variables
yourVar = raw_input('yourPromptHere') print(yourVar)
Optional prompt
03-6©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ BasicVariables– Integers– Floats– Strings– Booleans
PreviewofVariables
§ CompoundVariables– Lists– Tuples– Sets– Frozensets– DicEonaries
type(yourVar)
AlmosteverythinginPythonissomekindofobjectorvariable.
ThereareafewaddiAonaltypesavailableinPythonsuchasa“bytearray”thatareoutsidethescopeofthiscourse.
03-7©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ DynamicallyTyped– Inotherlanguagesyouneedtodeclarevariabletype– InPythontypeisestablishedonfirstuse
§ MutableversusImmutable– Immutable–valuecannotbechangeda\eriniEalassignment– OnlytwocollecEons(TupleandFrozenset)areimmutable
§ Re-assignable– Pythonre-evaluatestypewitheveryassignment
– Insomeotherlanguagesa@emptstousethesamevariablenamewithadifferenttypewillcauseanerror
ProperEesofPythonVariables
abc = 10 abc = 'hello'
Pythondoesn'tcare!
Now abc is an integer
Now abc is a string
03-8©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ TypeisestablishedbytheiniAalvalueassignedtoavariablename– Thetypeisinferredbycontextclues– YoumustgiveavariableaniniEalvalueorPythonwillnotrecognizethevariable– NoexplicittypedeclaraEon– Typecanbechangedwheneveranassignmentismade
§ PythonsupportsmulApleassignmentsyntax
DynamicTypinginPython
abc = 10 abc = 1.57
ab = bc = 7 ab, bc, cd = 8, 10, 43
transi1ve
distribu1ve
Now abc is an integer
Now abc is a float
ab and bc are both assigned 7
ab is 8, bc is 10, and cd is 43
03-9©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ Pythonprovidesautomatedmemorymanagement– Youdon'thaveto“allocate”andrememberto“free”memorytopreventamemoryleak
§ GarbageCollecAon– Whenavariableisnolongerneeded,thememoryisreleased– Scopeandreferencecountersdeterminewhetheravariableisneeded
AutomatedMemoryManagement
03-10©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ BasicI/Oissimple,inPythonyoudon'tneedalotofboilerplatecode
§ TherearebasicvariablesandcollecAonsbuiltfromthebasicvariables
§ Understandingthevariabletypes(especiallycollecAons)makesPythonmakesenseandiscrucialtowriAng“Pythonic”code
§ Youdon'tdeclareavariable,youiniAalizeit– Pythoninferstypebycontext
§ Pythonre-evaluatesandpotenAallychangesthevariabletypeoneachassignment.– type(var)tellsyouavariable'scurrenttypevs.print(var) whichtellsyouit'svalue
§ PythonautomaAcallyrecyclesmemorywhenitisnolongerbeingused– Youdon'thavetofree()variables
GeneralPointsaboutPythonVariables
03-11©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
ChapterTopics
Variables
§ PythonVariables
§ Numerical
§ Boolean
§ String
§ EssenEalPoints
§ Hands-OnExercise:Variables
03-12©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ PrecedenceandOperaAons– FollowsmathemaEcaloperaEons
– Uses**insteadof^forexponenEaEon– Supportsreflexiveoperators
– Doesnotsupportincrement/decrement(ab++,ab--)– Providesfloordivision,acompaniontothemodulusoperator
NumericalVariablesandArithmeEc
ab = 1 + 2 * 3 print(ab) > 7
7 // 5 = 1 7 % 5 = 2
floordivisionoperator
modulus
ab += bc à ab = ab + bc
The number of whole times 5 will “go into” 7
The partial remainder (i.e. 2/7ths remains)
03-13©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ Howshoulddivisionbeimplementedinadynamictypinglanguage?– TwointegeroperandscouldreturnafloaEngpointresult(unlikeinC)
§ TrueDivision(ThewaydivisionishandledinmathemaAcs)– Example:3/2 = 1.5 and 1/4 = 0.25
§ ClassicalDivision(ThewayClanguageimplementsdivision)– Ifbothoperandsareintegers:3/2 = 1 and 1/4 = 0 (truncaEon)– FloaEngexample:3.0/2.0 = 1.5 and1/4 = 0.25
§ Python2.2àClassicalDivision– Ifbothoperandsareintegers,resultsaretruncated– FloorDivisionOperator‘//’–Example:3.0//2.0 = 1 and1//4 = 0
DivisionOperaEons
03-14©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ Numbers– ImplicittransformaEon:inttofloatandfloattoint – Pythonhandlesmixingofintandfloatvariables…asexpected
– ExplicittransformaEon(casEng): – int(float_var), float(int_var)
NumericalInteracEonsbetweenIntandFloat
Dev_temp_celsius = 45 Fahrenheit = 9.0/5.0 * Dev_temp_celsius + 32 type(Fahrenheit) > type <'float'>
Dev_temp = 37 fTemp = float(Dev_temp) print fTemp > 37.0
Latitude = 33.19135811 iLat = int(Latitude) print iLat > 33
intmixedintandfloat
?
03-15©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ Strings– Thereisnobin,oct,orhextypeinPython– hex(num),oct(num),bin(num)allreturntypestrstrings
– Stringsarenotimplicitlyconvertedtonumericvariables– ExplicittransformaEonfuncEonsprovided
– str(),hex(),oct(),bin()– int(),float()←worksappropriatelyforstringtype
NumericalInteracEonswithStrings
ab = "123.45" bc = float(ab) print bc > 123.45
ab = "123.45" bc = int(ab) > <Error>
print(hex(13552)) > "0x34f0"
Unique_Device_ID="ff375011-34f0-4758-bade-e68cea787115"
The string must be in the proper format
03-16©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
NumberstoStringTransformaEonExamples
a = bin(13552) type(a) > <type 'str'> print a > '0b11010011110000'
a = oct(13552) type(a) > <type 'str'> print a > '032360'
a = hex(13552) type(a) > <type 'str'> print a > '0x34f0'
print(0x79 + 1) > 122
Python recognizes the notation and converts it to a decimal integer before performing the math operations
03-17©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ FloaAngPointMathAccuracy– AccuracyisadapEvetotheplamormonwhichPythonisrunning– Warning!Don'tcodewithrelianceonfloaEngpointaccuracy
§ Ifyouneedbe=ercontrols,takealookatthemathmodule
§ TheMathLibrary– Themathlibraryisnotbuilt-inasitisinotherlanguages,itmustbeloaded– CallingsemanEcs
MathLibrary
import math
yourVar = math.function(yourParameters) Usually returns a floating point
Function of your choice
03-18©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
ChapterTopics
Variables
§ PythonVariables
§ Numerical
§ Boolean
§ String
§ EssenEalPoints
§ Hands-OnExercise:Variables
03-19©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ boolvariablesareusedtocontrolprogramflow– branching,condiEonalexecuEon,looping…moreinthechapteronflowcontrol…
§ boolvariablesarecreatedbyassigningaTrueorFalsevalue– TrueandFalsearecasesensi1veandarenotdelimited
Booleans
GPS_status = True GPS_status = 'True' GPS_status = true <Error>
CreatesaboolwiththevalueTrue
CreatesastringofcharactersT-r-u-e
CausesanerrorbecausevariablenamedtruehasnotbeeniniEalizedSameforTRUE
03-20©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ InequaliAes– <, >, <=, >= – InPython,=> and=< aresyntaxerrors
§ EqualiAes– ==, != – Both<> and!= arevalid– Youcanmixtypes,so1 == 1.0 resultsinTrue
§ Logic– and, or, not
§ IdenAAes– ComparestheidenEtyid(var) toseeifsymbolicnamesrefertothesameexactobjectinmemory– is,is not
ComparaEves
a = b =7 print id(a) > 4298189096 print id(b) > 4298189096 print a is b > True
a and b are two different names for the same object in memory
03-21©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
ChapterTopics
Variables
§ PythonVariables
§ Numerical
§ Boolean
§ String
§ EssenEalPoints
§ Hands-OnExercise:Variables
03-22©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ Delimiters– Single-quoteordouble-quote– Thestartandenddelimitermustbethesamecharacter– Curlyquotesandslantedquotesarenotvaliddelimiters
§ ComposiAon– Usetheescapesequence(backslash)forliteralcharacters– \t =tab,\n =newline, \b=backspace,\r =carriagereturn
– Usether(raw)prefixforaliteralstring– Usedouble-escapetoindicateasinglebackslash
Strings
ab = "Python\text" à Python ext
ab = r'Python\text' à Python\text
ab = 'Python\\text' à Python\text
\t became a tab
03-23©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ ConcatenaAon– Usethe+plusoperator– join(sequence)
StringManipulaEon:ConcatenaEon
Style, Model = 'Sorrento', 'F10L' Device_Name_and_Number = Style + " " + Model print Device_Name_and_Number > Sorrento F10L StyMod = Style.join(Model) print StyMod > FSorrento1Sorrento0SorrentoL
03-24©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ Slice– AsubstringconsisEngofanindividualcharacteriscalleda“slice”– string[offset] – Theoffsetiszero-based,sothefirstcharacterisstr[0]
§ SliceRange– AsubstringconsisEngofmulEplecharacters– string[start:end] – Warning!upperboundisnon-inclusive
– The“end”characterisnotincludedinthesubstring
StringManipulaEon:Slices
device = 'MeeToo 4.1' print device[1:5] > eeTo
M e e o o 4 T
0 1 2 3 4 5 6 7
e e o T
b .
81
9
03-25©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
StringTransformaEon
CaseManipulaAon
SpaceManipulaAon
JusAficaAon
EdiAng
string.title() string.capitalize() string.lower() string.upper() string.swapcase()
string.strip(chars) string.rstrip(chars) string.lstrip(chars) string.expandtabs(tabsize)
string.ljust(width,fillchar) string.rjust(width,fillchar) string.center(width,fillchar)
string.split('delimiter') string.partition('char') string.rpartition('char') string.splitlines(integer) string.replace(str1,str2)
03-26©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ Methodsarecalledfromtheobjectusingthedotoperator,asinmostobjectorientedprogramminglanguages
§ YoucanchainmethodsinPython.Intheexampleshown,theresultsofthetitle()methodarepassedtotheswapcase()method
ChainingExample
device = "titanic 2300" > titanic 2300 device.title() > 'Titanic 2300' device.swapcase() > 'TITANIC 2300' device.swapcase().title() > 'Titanic 2300'
03-27©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
StringInterrogaEon
Format
ConsAtuAon
string.isalpha() string.isdigit() string.islower() string.isspace() string.istitle() string.isupper()
string1.startswith(string2) string1.endswith(string2) string1.count(string2) string1.find(string2) len(string)
03-28©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
ChapterTopics
Variables
§ PythonVariables
§ Numerical
§ Boolean
§ String
§ EssenAalPoints
§ Hands-OnExercise:Variables
03-29©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ PythondoesalotofthingsautomaAcallyforyouwithvariables– Thatmakesthecodecleanerandeasiertoread– Butyouneedtoknowwhatitisdoingforyou,oritiseasytogetconfused–onereasonwearespendingsomuch1meonvariables– Thebasictypesareinteger,float,boolean,andstring
§ IntandFloatbehaveasyou'dexpect– mathisan“addin,”somathfuncEonsdon'tworkbydefault
§ Boolean– TrueandFalse
§ String– Slicestr[x],str[start:end] ß“end”isnotinclusive
EssenEalPoints
03-30©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
ChapterTopics
Variables
§ PythonVariables
§ Numerical
§ Boolean
§ String
§ EssenEalPoints
§ Hands-OnExercise:Variables
03-31©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
§ InteracAveExploraAon– ExploreNumericalvariables– Operators,Operators,CasEng,MathFuncEons,Booleans
§ Program– Writeaprogramtoextractfieldsfromasingledatarecord
Hands-OnExercise:Variables
Collec&onsChapter4
04-2©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
CourseChapters
§ Introduc&on
§ Introduc&ontoPython
§ Variables
§ Collec+ons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
04-3©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Inthischapteryouwilllearn
§ Essen+alinforma+onaboutPython'spowerfulcollec+ontypes
§ Whatarethedifferentproper+esofeachcollec+ontype,andsugges+ons
aboutwhentouseonetypeoranother
§ Howtocreate,manage,interrogate,andupdatecollec+ons
Collec&ons
04-4©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
Collec+ons
§ Lists
§ Tuples
§ Sets
§ Dic&onaries
§ Essen&alPoints
§ Hands-OnExercise:Collec&ons
04-5©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Collec+ontypesarepowerfulandcaninfluenceprogramdesign.
Collec&ons
Type Descrip+on
Mutable
Ordered
Unique
Paired
List Aserialcollec&onofobjects. X
list.sort()àorderedcollec&on X X
Tuple Animmutablecollec&onofobjects
Set Anunorderedcollec&onofuniqueobjects X X
sorted(set)àorderedandunique X X X
Frozenset Animmutableorderedcollec&onofuniqueobjects X
Dictionary Anorderedcollec&onofuniquekey-valueobjectpairs X X X
04-6©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Collec&onsSyntax
list[element1, element2, element3] tuple(element1, element2, element3) set{element1, element2, element3} dictionary{key1:value1, key2:value2}
Collec+ons
The colon between key and value literals differentiates the initialization of a dictionary from initialization of a set.
04-7©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
• Alistisaserialcollec+onofelements
• Elementsdonothavetobeofthesametype• Elementscanbeothercollec&ons(nested)• Defaultquali&es–listsareunorderedandnon-unique• Alistcanbecreatedbyini&alizingavariablewithcommaseparated
elementssurroundedbysquarebrackets[]
Lists
List
0 1 2 3 4 5 6 7[ ]
record['iFruit 3A',27,42,52,17,'enabled','connected',37.02549]
04-8©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ element in list – ReturnsTrueiftheelementisinthelist
§ element not in list – ReturnsTrueiftheelementisnotinthelist
§ n * list – Copiesthelistn&mesandconcatenatesthecopies
ListOpera&ons
devices = ['iFruit','Sorrento'] devices = 2 * devices print devices > ['iFruit','Sorrento','iFruit','Sorrento']
04-9©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ list.append(element) – Appendsanelementa]erthefinalelementinalist
§ list.insert(offset, element) – Insertsanelementintoalista]ertheoffset
§ list.remove(element) – Removesthefirstoccurrenceofelementfromthelist
§ list.pop() – Returnsthelastelementinthelistandremovesit
ListModifica&on
models = ['Sorrento','iFruit','Titanic'] models.remove('iFruit') print models > ['Sorrento', 'Titanic']
04-10©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ list3 = list1 + list2 – Returnsanewlistcontainingtheelementsfromlist1andlist2 – Doesnotchangetheoriginallists
§ list1.extend(list2)
§ list1 += list2
§ list1 = list1 + list2 – Concatenatestheelementsfromlist2ontolist1– Copiesallelementsinlist2totheendoflist1– Overwritesthecontentsoflist1
WorkingwithMul&pleLists
04-11©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
append()Versusextend()forCollec&ons
mods1 = ['Sorrento','iFruit','Titanic'] mods2 = ['Ronin','MeToo'] mods1.append(mods2) print mods1 > ['Sorrento','iFruit','Titanic',['Ronin','MeToo']]
mods1 = ['Sorrento','iFruit','Titanic'] mods2 = ['Ronin','MeToo'] mods1.extend(mods2) print mods1 > ['Sorrento','iFruit','Titanic','Ronin','MeToo']
The last item is a list. This is a common mistake!
04-12©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Theslicenota+onworksthesamewith
collec+onsasitdidwithstrings
§ No+cethatthefirstelementis0,soq[0] à 1,andq[5]à6
§ Slicerange– Thelowerboundisincludedintheslice,buttheupperboundisnot.
§ Rangeincrement
– Selectsasubsetofelementsinalist,skippingsome
§ Slicesappliestoallcollec+ontypes
SlicesofaList
q = [1,2,3,4,5,6,7,8,9] q[1:9:3] > [2, 5, 8]
q = [1,2,3,4,5,6,7,8,9] q[2:5] > [3, 4, 5]
q = [1,2,3,4,5,6,7,8,9] q[5] > 6
04-13©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ del list – Deletestheen&relist
§ del list[offset] – Deletestheelementatthegivenoffset
§ del list[start:end] – Deletestherangeofelementsfromstarttoend-1– endisnotincludedintheelementsdeleted
§ del list[start:end:increment] – Deletestheelementswithintherange
Pythondelkeyword
q = [1,2,3,4,5,6,7,8,9] del q[1:9:2] > q = [1,3,4,7,9]
q = [1,2,3,4,5,6,7,8,9] del q[3] > q = [1,2,3,5,6,7,8,9]
q = [1,2,3,4,5,6,7,8,9] del q[2:6] > q = [1,2,7,8,9]
0 1 2 3 4 5 6 7 8
q = [1,2,3,4,5,6,7,8,9] del q print q > < Error>
04-14©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ len(list) – Thenumberofelementsinthelist
§ list.count(element) – Countsthenumberofiden&calelementsinalist
§ list.min() – Returnstheminimumvalueelementinthelist
§ list.max() – Returnsthemaximumvalueelementinthelist
§ list.index(element) – Returnstheoffsetofthefirstoccurrenceofelementinthelist
ListInterroga&on
q= ['a','b','c','b','c','c'] q.count('c') > 3 print len(q) > 6
04-15©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ list.sort() – Sortsthelistofelementsinascendingorderinplace– Changestheorderofthelistitself– Theoriginalorderislost
§ sorted(list) – Returnsasortedcopy
§ list.reverse() – Reversesthecurrentorderofelementsinthelistinplace– Changestheorderofthelistitself– Thepriororderislost
ListTransforma&on
q = [4,5,3,2,1,7,9,8] q.sort() print q > [1,2,3,4,5,7,8,9]
q = [4,5,3,2,1,7,9,8] q.reverse() print q > [8,9,7,1,2,3,5,4]
Common error! a = list.sort() will set a to an empty list
04-16©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ list = string.split(delimiter) – Createalistfromadelimitedstring– Thesplit()methodisusedfrequentlyinClouderaclasses
§ mystring = delimiter.join(list) – Methodforconver&ngalisttoastring– Notethatthisisamethodofthedelimiter…looksawkward– workswithanydelimiterstring,includingtheemptystring("")
ConversionbetweenStringsandLists
ab = '1,2,3,4,5' print ab > 1,2,3,4,5 mylist = ab.split(',') print mylist > ['1','2','3','4','5'] cd = "-".join(mylist) print cd > 1-2-3-4-5
04-17©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Usethereadlines()method
– Readsallthelinesinthefiletoend-of-file(EOF)– Returnseachlineasaseparateelementinalist
ListFileI/O
lines = [ ] file = open('example.txt','r') lines = file.readlines() file.close() print lines
04-18©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
devlog = [ ] file = open('loudacre.log','r') devlog = file.readlines() file.close() print devlog[75] > '2014-03-15:10:10:21, Titanic 1000, f1c9e8fe-6235-4fd9-afd2-057922d50f58, 81, 63, 26, 40, 12, 0, TRUE, enabled, enabled, 35.45767939, -117.5590077'
§ Thiscodereadstheen+reloudacre.logfile
§ Eachlineisstoredinoneelementofalist
ExampleCode
04-19©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
onerecord = devlog[75].split(',') print onerecord > ['2014-03-15:10:10:21', > 'Titanic 1000', > 'f1c9e8fe-6235-4fd9-afd2-057922d50f58', > '81', > '63', > '26', > '40', > '12', > '0', > 'TRUE', > 'enabled', > 'enabled', > '35.45767939', > '-117.5590077\r\n']
§ Thiscodeusessplit()tostoreeachfieldasastringintoanelementof
alist
Crea&ngaListbySplidngaString
04-20©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
Collec+ons
§ Lists
§ Tuples
§ Sets
§ Dic&onaries
§ Essen&alPoints
§ Hands-OnExercise:Collec&ons
04-21©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Originatedfromtheabstrac+onofthecommoncoun+ngsequence…
triple,quadruple,pentuple,septuple,octuple,…n-tupleàtuple
§ Themathema+calconceptofaTupleis“asequenceofelements”
– Inmath(1,2,3,4,5)isapentuple–fiveelements
§ InPython,atupleisanimmutablecollec+on.Ithasallthesame
opera+onandinterroga+onmethodsasalist,butnoneofthe
manipula+onortransforma+onmethods
§ Thevaluesareassignedwhenthetupleiscreatedandcannotbechangedotherthanthroughre-assignment
§ Tryingtocallamethodthatchangesthecons+tu+onofatuplewillresult
inanerror
§ Createatuplewiththeparenthesis()nota+on.
Tuples
04-22©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Mutable
– Performingafunc&ononanobjectmaychangethecontentoftheobject– Theiden&tyoftheobjectisunchanged
§ Immutable
– Performingafunc&oncannotchangethecontentoftheobject– Ifchangedoesoccur,theresul&ngobjecthasanewiden&ty
PythonMutableVersusImmutable–Lists
a = 25 id(a) > 4298188664 a += 1 id(a) > 4298188640
wifiA = ['25','34','enabled'] id(wifiA) > 4347030920 wifiB = ['connected'] id(wifiB) > 4347029048 wifiA += wifiB print wifiA > ['25','34','enabled','connected'] id(wifiA) > 4347030920
Python lists are considered mutable – the object ID remains the same Python integers are considered immutable – the object id changes for the
variable named a
04-23©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Youcan'tmodifytheorderormembershipofatupleonceitisassigned
ImmutableTuple–NoChangeMethods
t = (1,2,3,4) type(t) > <type 'tuple'> t.append(5) > AttributeError: 'tuple' object has no attribute 'append' t.extend(t) > AttributeError: 'tuple' object has no attribute 'extend' t.sort() > AttributeError: 'tuple' object has no attribute 'sort' del t[2] > TypeError: 'tuple' object doesn't support element deletion
04-24©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Youcanre-assignthevalueofanen+retuple,butitcreatesanewobject
ImmutableTuple
t = (1,2,3,4) print t > (1,2,3,4) print id(t) > 4300627208 t = t + t print t > (1,2,3,4,1,2,3,4) print id(t) > 4297732776
mylist = [1,2,3,4] print mylist > [1,2,3,4] print id(mylist) > 4300881792 mylist.extend(mylist) print mylist > [1,2,3,4,1,2,3,4] print id(mylist) > 4300881792
Tuple List
04-25©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Listscommonlycontainelementsofthe
sametype,althoughthisisnotarule
§ Tuplescommonlycontainelementsof
differenttypes,torepresentarecord
§ Atuplecanbereassigned.Cas+ngatupleintoalist,modifyingit,andthen
Cas+ngitbackintoatupleiscalled
unfreezingandfreezing
Cas&ngBetweenTuplesandLists
t = (39,18,29,1) type(t) > <type 'tuple'> print t > (39,18,29,1) mylist = list(t) print mylist > [39,18,29,1] list.append(5) t = tuple(mylist) print t > (39,18,29,1,5)
04-26©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
Collec+ons
§ Lists
§ Tuples
§ Sets
§ Dic&onaries
§ Essen&alPoints
§ Hands-OnExercise:Collec&ons
04-27©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Setsareunorderedandunique
§ Asetisrepresentedbycommaseparatedelements
– In2.6,requiredsyntaxisset([ … ]) – In2.7,asetcanbespecifiedusingcurlybraces{ }
§ Youcancreateanewsortedsetwiththesorted()func+on
Sets
ifruit = set(['1','5','5','3A','3','4','2','3A','4A','5']) print ifruit > set(['1', '3', '2', '5', '4', '3A', '4A'])
ifruit = sorted(ifruit) print ifruit > set(['1', '2', '3', '3A', '4', '4A', '5'])
04-28©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Youcancreateasetfromatupleoralistusingset()
§ Cas+ngwithsetsworksthesameasbetweenlistsandtuples
Cas&ngSets
mylist = [5,3,4,1,2,5,3,5] s1 = set(mylist) print s1 > set([1,2,3,4,5])
mytuple = (5,3,4,1,2,5,3,5) s2 = set(mytuple) print s2 > set([1,2,3,4,5])
04-29©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ set1.union(set2) – Alltheelementsinbothsets– Symbolicnota&onalsoworks:set1 | set2
§ set1.intersection(set2) – Onlyelementsthatareinbothsets,andnotinonlyone– Symbolicnota&onalsoworks:set1 & set2
SetTheoryMethodsandOperators
s1 = {1,2,3,4,5} s2 = {3,4,5,6,7} s1.union(s2) > set([1, 2, 3, 4, 5, 6, 7]) s1.intersection(s2) > set([3, 4, 5]) print s1 & s2 > set([3, 4, 5])
04-30©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ set1.difference(set2) – Providesthedifferencebetweenthesets– Theelementsthatareineitheroneortheotherset,butarenotintheintersec&on– Symbolicnota&on:set1 – set2
§ set1.symmetric_difference(set2) – Symbolicnota&on:set1 ^ set2 – Returnsthesetofallelementsthatarenotintheintersec&on
SetTheoryMethodsandOperators
s1 = {1,2,3,4,5} s2 = {3,4,5,6,7} s1.difference(s2) > set([1, 2]) s2.difference(s1) > set([6, 7]) s1 ^ s2 > set([1, 2, 6, 7])
04-31©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ set1.isdisjoint(set2) – ReturnsTrueifbothsetssharenoelements
§ set1.issubset(set2) – ReturnsTrueifeveryelementinset1isinset2– Symbolicnota&on:set1 <= set2
§ set1.issuperset(set2)
§ ReturnsTrueifeveryelementinset2isinset1– Symbolicnota&on:set1 >= set2
§ set2 > set1 – Propersubset– ReturnsTrueifset2containsallelementsofset1,andoneormoreaddi&onalelements
SetInterroga&onMethods
04-32©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Setshavetheirownmanipula+onmethods
§ set.add() – Addsanelementtotheset
§ set.update()
§ set.copy()
§ set.pop()
§ set.discard()
SetManipula&onMethods
04-33©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Youcancreateafrozensetfromasetusingfrozenset()
§ Afrozensetisimmutable
– Tupleistolistasfrozensetistoset
frozenset
s1 = {1, 2, 3, 4, 5} f1 = frozenset(s1) type(f1) > < type 'frozenset' > print(f1) > frozenset([1,2,3,4,5]) f1.add(6) > AttributeError: 'frozenset' object has no attribute 'add'
04-34©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
Collec+ons
§ Lists
§ Tuples
§ Sets
§ Dic+onaries
§ Essen&alPoints
§ Hands-OnExercise:Collec&ons
04-35©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Adic+onaryisasetcomposedofkey-valuepairs
§ Defineusingsetnota+on–curlybraceswithpairsseparatedbycommas
§ Pairsaredefinedusingacolon–key:value– Keysandvaluescanbeofanybasictype
Dic&onaries
modelnumbers = {'F40L':'Sorrento', 'F41L':'Sorrento', '2500':'Titanic', '3000':'Titanic', '3A':'iFruit', '4':'iFruit'} type(modelnumbers) > < type 'dict' > print modelnumbers['3A'] > 'iFruit' print modelnumbers['2500'] > 'Titanic'
04-36©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Whenyouexplicitlycastadic+onarytoanyothercollec+on,yougetthe
keys,notthevalues
§ Usethevalues()methodtogetthevalues
Dic&onariesandLists
modellist = list(modelnumbers) print modellist > ['F41L', 'F40L', '4', '3A', '2500', '3000'] print modelnumbers.values() > ['Sorrento', 'Sorrento', 'iFruit', 'iFruit', 'Titanic', 'Titanic']
04-37©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ dict[key] – Returnsthevalueassociatedwiththekey
§ dict[key] = newvalue – Overwritestheoldvaluewiththenewvalueforthatkey
§ dict1.update(dict2) – Addsorupdateskey-valuepairsindict1usingcontentsofdict2
§ len(dict) – Providesthenumberofkey-valuepairsinthedic&onary
§ del dict[key] – Removestheentryiden&fiedbythespecifiedkeyfromthedic&onary
§ dict.pop(key) – Returnsthevalueandremovesthekey-valuepairfromthedic&onary
Dic&onaryMethods
04-38©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
Collec+ons
§ Lists
§ Tuples
§ Sets
§ Dic&onaries
§ Essen+alPoints
§ Hands-OnExercise:Collec&ons
04-39©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Part1:What'syourPredic&on?
a = [1, 3, 5, 4, 7, 4] b = ('jan','feb','mar','apr','may') c = {3, 5, 7, 5, 7, 24, 3, 'big'} d = {1:'iFruit', 2:'Ronin', 3:'Sorrento','T':'Titanic'} e = [a, b, c, d] f = (a, b, c, d) print e print f type(e) type(f)
What type of collection will e and f be ?
04-40©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
print e > [[1, 3, 5, 4, 7, 4], ('jan', 'feb', 'mar', 'apr', 'may'), set([24, 'big', 3, 5, 7]), {1: 'iFruit', 2: 'Ronin', 3: 'Sorrento', 'T': 'Titanic'}] print f > ([1, 3, 5, 4, 7, 4], ('jan', 'feb', 'mar', 'apr', 'may'), set([24, 'big', 3, 5, 7]), {1: 'iFruit', 2: 'Ronin', 3: 'Sorrento', 'T': 'Titanic'}) type(e) > <type 'list'> type(f) > <type 'tuple'>
Part1:Results
04-41©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Part2:What'syourPredic&on?
# Old values # a = [1, 3, 5, 4, 7, 4] b = ('jan','feb','mar','apr','may') # New values # a = {5, 4, 3, 2, 1} b = ('jun','jul','aug','sep','oct') print e print f
Will the new values or the old values be displayed when e and f are printed?
04-42©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
print e > [[1, 3, 5, 4, 7, 4], ('jan', 'feb', 'mar', 'apr', 'may'), set([24, 'big', 3, 5, 7]), {1: 'iFruit', 2: 'Ronin', 3: 'Sorrento', 'T': 'Titanic'}] print f > ([1, 3, 5, 4, 7, 4], ('jan', 'feb', 'mar', 'apr', 'may'), set([24, 'big', 3, 5, 7]), {1: 'iFruit', 2: 'Ronin', 3: 'Sorrento', 'T': 'Titanic'})
Part2:Results
The list e and the tuple f still contain the original versions of a and b.
Take-away:Collec+onsareconstructedbyduplica+on,notbyreference.
04-43©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Part3:What'syourPredic&on?
a = [1, 2, 3, 4, 5] b = [a, a, a, a, a] c = [b, b, b, b, b] d = [c, c, c, c, c] print d[3][3][3][3]
04-44©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Part3:Results
a = [1, 2, 3, 4, 5] b = [a, a, a, a, a] c = [b, b, b, b, b] d = [c, c, c, c, c] print d[3][3][3][3] > 4
Part of the Python philosophy is “complex” without being “complicated”
Take-away:Youcancreateverycomplexnestedobjectstoorganizedata.
04-45©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Collec+ons– Mayconsistofelementsofdifferenttypes– Cancontainothercollec&onsaselements(nested)
§ Lists []
§ Tuples ()
§ Sets {}
§ Frozensets –immutablesets
§ Dictionaries {key:value}
Essen&alPoints
04-46©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
Collec+ons
§ Lists
§ Tuples
§ Sets
§ Dic&onaries
§ Essen&alPoints
§ Hands-OnExercise:Collec+ons
04-47©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Part1– Processafileline-by-lineintoanestedlist
§ Part2– Readanen&refileintoalist
Hands-OnExercises:Collec&ons
FlowControlChapter5
05-2©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
CourseChapters
§ IntroducEon
§ IntroducEontoPython
§ Variables
§ CollecEons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
05-3©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Inthischapteryouwilllearn
§ Howtoloopandperformrepe==veopera=ons
§ HowtoiterateoverindividualitemsinaCollec=on
§ Howtomakeablockofcodecondi=onallyexecute
§ Howtoiden=fyandhandleexcep=ons
FlowControl
05-4©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
FlowControl
§ CodeBlocks
§ RepeEEveExecuEon
§ IteraEveExecuEon
§ CondiEonalExecuEon
§ TentaEveExecuEon(ExcepEonHandling)
§ EssenEalPoints
§ Hands-OnExercise:FlowControl
05-5©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ CodeblocksinPythonarerepresentedbyindenta=on– Notcurlybraces– Endoflinesequence(EOLS),notsemicolon
§ Statementsinablockmustbeindentediden=cally– IfyourindentaEonisoff–evenbyaspace–thenitisanerror
CodeBlocks
a,b = 7, 12 if 1 < 9 : print a else : print b: > 7
a,b = 7, 12 if 1 < 9 : print a else : print b: > IndentationError: expected an indented block
05-6©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Intheexampleontheright,else:isnolongerassociatedwithacorrespondingifbecausethelineprint 'hi'endedthescopeoftheifstatement.
§ Below,onespacetoomuchisalsoanerror.
CodeBlockMistakes
a,b = 7, 12 if 1 < 9 : print a print 'hi' else : print b > Error line 5 > else : > | > SyntaxError: invalid syntax
a,b = 7, 12 if 1 < 9 : print a print 'hi' > Error line 4 > IndentationError: unexpected indent
05-7©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
FlowControl
§ CodeBlocks
§ Repe==veExecu=on
§ IteraEveExecuEon
§ CondiEonalExecuEon
§ TentaEveExecuEon(ExcepEonHandling)
§ EssenEalPoints
§ Hands-OnExercise:FlowControl
05-8©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
When test is False …continue
while test: code-block
§ while test:
§ Iden=fiesacodeblocktobeexecutedrepeatedlywhiletestisTrue
§ iftestisneverFalse,itisaninfiniteloop– IfyouaccidentallyenteraninfiniteloopinthePythonshell,breakwithCTRL-DorCTRL-C
CondiEonalLooping
a = 1 while a < 10: print a a = a + 1 print("complete") Note: variable used in test
must be initialized before use
Code to execute repeatedly while test is True
05-9©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
count = 0 while count < len(devlog): print count, len(devlog[count]) print devlog[count] count += 1
§ Thiscodeprintseachrecordofmobilephonedata
ExampleCode
05-10©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
CountableLooping
for c in range(5): print c > 0 > 1 > 2 > 3 > 4
for c in range(3,8): print c > 3 > 4 > 5 > 6 > 7
for c in range(0,12,3): print c > 0 > 3 > 6 > 9
Countfrom0toN-1
Countfromstarttoend-1
starttoend-1withincrement
for var in range(start,end,inc): code-block
Note: counting variable did not need to be separately initialized before use.
05-11©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ModifyingLoopOperaEons
§ break – Endsaloop
§ continue – SkipsoneiteraEonofaloop
for i in range(100): if(i > 4): break print i > 0 > 1 > 2 > 3 > 4
for i in range(10): if(i==5): continue print(i) > 0 > 1 > 2 > 3 > 4 > 6 > 7 > 8 > 9
05-12©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
FlowControl
§ CodeBlocks
§ RepeEEveExecuEon
§ Itera=veExecu=on
§ CondiEonalExecuEon
§ TentaEveExecuEon(ExcepEonHandling)
§ EssenEalPoints
§ Hands-OnExercise:FlowControl
05-13©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ for each_element in list: – Executeablockofcodeforeachelementinalist, tuple, set, frozenset,ordictionary
§ Returnsvariableoftypeoftheelement
IteraEon
mylist = [1,2,3,4,5,6] for each in mylist: print each > 1 > 2 > 3 > 4 > 5 > 6 type(each) > <type 'int'>
for each_element in collection : code-block
Code to execute for each element in collection
05-14©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Exampleofitera=ngoverkeysandvaluessimultaneouslyinaDic=onary
KeyandValueIteraEon
modelnumbers = {'F40L':'Sorrento', 'F41L':'Sorrento', '2500':'Titanic', '3000':'Titanic', '3A':'iFruit', '4':'iFruit'} for key,value in modelnumbers.items(): print key,value > F41L Sorrento > F40L Sorrento > 4 iFruit > 3A iFruit > 2500 Titanic > 3000 Titanic
05-15©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Pythonprovidesatechniquetocreateanewlistfromanexis=ngcollec=onbyplacinganiteratorwithinbrackets– Eachelementisprocessedtocreatethenewlist
ListComprehension
devicetemp=[30,33,54,34,29,49] Ftemp = [elem*9/5+32 for elem in devicetemp] print Ftemp > [86, 91, 129, 93, 84, 120]
newlist = [code for each_element in collection]
05-16©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ for each_element in enumerate(list): – Returnsatwo-valuetuplewheretheelementatindex0istheenumeraEon,andtheelementatposiEon1istheitemcontent
IteraEonwithEnumeraEon
mylist = ['Ronin','MeToo','iFruit'] for each in enumerate(mylist): print each > (0, 'Ronin') > (1, 'MeToo') > (2, 'iFruit') type(each) > <type 'tuple'> len(each) > 2
05-17©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ for each_element in zip(list1,list2): – Returnsatwo-valuetuplewherethe0elementcomesfromthefirstlist,andthe1elementisthecorrespondingelementfromthesecondlist
CorrespondingIteraEonwithZip
mylist1 =['Ronin','MeToo','iFruit','Sorrento'] mylist2 =['s1','4.0','3A','F41L'] for each in zip(mylist1, mylist2): print each > ('Ronin','s1') > ('MeToo','4.0') > ('iFruit','3A') > ('Sorrento','F41L') type(each) > <type 'tuple'> len(each) > 2
05-18©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
MulEpleListIteraEonwithzip()andListComprehension
x = ['x1','x2','x3'] y = ['y1','y2'] for a,b in zip(x,y): print a,b > x1 y2 > x2 y2
x = ['x1','x2','x3'] y = ['y1','y2'] for a,b in[(a,b) for a in x for b in y]: print a,b > x1 y1 > x1 y2 > x2 y1 > x2 y2 > x3 y1 > x3 y2
Multiple list iteration without the need for explicit nested for loops or counters
05-19©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
FlowControl
§ CodeBlocks
§ RepeEEveExecuEon
§ IteraEveExecuEon
§ Condi=onalExecu=on
§ TentaEveExecuEon(ExcepEonHandling)
§ EssenEalPoints
§ Hands-OnExercise:FlowControl
05-20©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ if test:
§ Makeablockofcodecondi=onal– IfFalse,conEnueonnextlineofcodefollowingtheblock
§ test– Boolean
– TrueorFalse – Comparator
– Operatorsdiscussedinchapteronvariables– ==, !=, <, > – <=, >= – is, is not
– Endswithacolon
CondiEonalTesEngwithIf
if test: code-block
CodetoexecuteiftestisTrue
Otherwise…con=nue
# Device not to exceed 120 F # 120 F = 48.889 C # devicetemp=[30,33,54,34,29,49] maxtemp = 48.8889 for eachtemp in devicetemp: if eachtemp > maxtemp: print eachtemp > 54 > 49
05-21©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ AddsanalternateblockofcodetoexecuteiftestisFalse
§ else: iden=fiesalternateblockofcode– else:keywordisalignedwithif– requiresindentedblockbeneath
§ Eithercodeblock1or2willbeexecuted,thenprocessingwillcon=nue
AlternaEveCondiEonswithElse
CodetoexecuteiftestisTrue
CodetoexecuteiftestisFalse
device = "Titanic 2000" if device[8:13] == "2000": print "Model recognized" else : print "Model unknown" > Model recognized
if test: code-block-1 else: code-block-2
05-22©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Createsalternatetests
§ elif test2:iden=fiesanalternatetesttoperformiftest1isFalse– IfTrueexecutesfollowingcodeblock
Elif
Codetoexecuteiftest1isTrue
CodetoexecuteifalltestsareFalse
Iftest1isFalse,trytest2
if test1: code-block-1 elif test2: code-block-2 else: code-block-n
05-23©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Pythondoesnothaveaswitch()case:construct
§ Usemul=pleelif's– Else…If…
§ Iftest1isFalse,trytest2,iftest2isFalse,trytest3,iftest3isFalsetrytest4.....<else:>..ifalltheprevioustestswereFalse…thendothis
StackingElif's
Codetoexecuteiftest1isTrue
CodetoexecuteifalltestsareFalse
if test1: code-block-1 elif test2: code-block-2 elif test3: code-block-3 elif test4: code-block-4 elif test5: code-block-5 elif test6: code-block-6 elif test7: code-block-7 else: code-block-n
05-24©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Titanicphonesareadver=sedbybrandnameratherthanmodelnumber
§ Adver=singnamesarebasedontypesofsailingships
§ Exampleshowsuseofif…eliftorecognizemodelnumberandprintcorrespondingmarketname
ExampleofElifs
# Get make and model from keyboard device = raw_input('Make Model :') # Recognize model number if device[8:13] == "1000": mod = 'Clipper' elif device[8:13]== "1100": mod = 'Schooner' elif device[8:13]== "2000": mod = 'Sloop' elif device[8:13]== "2100": mod = 'Caravel' elif device[8:13]== "2200": mod = 'Cutter' else : mod = device[8:13] # Prepare output string mod = device[0:8]+mod print mod
05-25©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Pythonsupportsasingle-linecondi=onalstatement– alsocalledan“inlinefunc*on”
Single-lineIf
return-true if(test) else return-false
a = 5 'low' if a <= 5 else 'high' > 'low'
a = 5 a <= 5 ? 'low' : 'high' > Syntax Error
# test ? return-true : return-false
Python does not support the ternary operator (the conditional operator) that is common in many other languages
ExampleofaTernaryOperator
05-26©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
FlowControl
§ CodeBlocks
§ RepeEEveExecuEon
§ IteraEveExecuEon
§ CondiEonalExecuEon
§ Tenta=veExecu=on(Excep=onHandling)
§ EssenEalPoints
§ Hands-OnExercise:FlowControl
05-27©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Excep=onHandling– Theopportunitytoexecutecodeintheeventofanerrorratherthanqui_ng
§ Errors– NameError – ValueError – IndexError – SyntaxError
TentaEveExecuEon
Codetotestforerror
Handler
OpEonalcoderunswhethererrorwasdetectedornot
try: code-block-1 except: error code-block-2 finally: code-block-3
05-28©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ErrorHandling
mylist =['Ronin','MeToo','iFruit','Sorrento'] print len(mylist) toofar = len(mylist)+1 index = 0 for index in range(0,toofar): print mylist[index] > Ronin > MeToo > iFruit > Sorrento > IndexError : List out of range
Index out of bounds
05-29©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ErrorHandling
mylist =['Ronin','MeToo','iFruit','Sorrento'] print len(mylist) toofar = len(mylist)+1 index = 0 try: for index in range(0,toofar): print mylist[index] except IndexError: print ">> tragedy averted" finally: print ">> continuing" > Ronin > MeToo > iFruit > Sorrento > >> tragedy averted > >> continuing
Code block must be indented
05-30©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
FlowControl
§ CodeBlocks
§ RepeEEveExecuEon
§ IteraEveExecuEon
§ CondiEonalExecuEon
§ TentaEveExecuEon(ExcepEonHandling)
§ Essen=alPoints
§ Hands-OnExercise:FlowControl
05-31©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Codeblocks– CodeflowinPythonisdependentonindentedcodeblocks
§ Condi=onal(True/False)execu=onusingif,elif,andelse – Stackelif'stosimulateswitch()…case:
§ whileisusedincondi=onal(True/False)looping
§ forisusedforrepe==on– for in range() – for element in collection – for element in enumerate(collection) – for element in zip(collection1, collection2)
§ tryisusedforexcep=onhanding
EssenEalPoints
05-32©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
FlowControl
§ CodeBlocks
§ RepeEEveExecuEon
§ IteraEveExecuEon
§ CondiEonalExecuEon
§ TentaEveExecuEon(ExcepEonHandling)
§ EssenEalPoints
§ Hands-OnExercise:FlowControl
05-33©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Part1– Prepareabrandindexandamodelindexusingsets
§ Part2– PrepareaninformaEonbasefromdatareadfromtheLoudacrelogfile
Hands-OnExercise:FlowControl
ProgramStructureChapter6
06-2©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
CourseChapters
§ IntroducEon
§ IntroducEontoPython
§ Variables
§ CollecEons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
06-3©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
Inthischapteryouwilllearn
§ Howtocreateandworkwithnamedfunc?onsincludingabasic
understandingoffunc?oncontextandscoping
§ Howtocreateandworkwithanonymousfunc?ons(alsoknownas
Lambdafunc?ons)
– Note:LambdafuncEonsareusedextensivelytointerfacewithBigDataservicesandsoOware
§ Howtopassonefunc?ontoanotherfunc?onbyreference
§ ThepurposeandgeneraluseofPythongenerators
ProgramStructure
06-4©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
ProgramStructure
§ NamedFunc?ons
§ AnonymousFuncEons(Lambda)
§ GeneratorFuncEons
§ EssenEalPoints
§ Hands-OnExercise:ProgramStructure
06-5©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Definesafunc?onandassociatesitwithaname
NamedFuncEons
def name(parameters): code-block return variable
• return must be indented so it is ‘within’ the function
• return is optional
return(1, 2, "data")
Q: Will this work? If so, what type of variable will it return?
Function gets a local copy of the passed parameters
Optional return variable
06-6©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ExampleofNamedFuncEons
# Define function to return month from date time stamp # def getmonth(dts): months={1:'JAN',2:'FEB',3:'MAR',4:'APR',5:'MAY', \ 6:'JUN',7:'JUL',8:'AUG',9:'SEP',10:'OCT', \ 11:'NOV',12:'DEC'} month = dts[5:7] nummonth = int(month) return months[nummonth] dts="2014-03-15:10:10:20" print getmonth(dts) > MAR Continuation marks used for formatting of slide.
The marked lines can be entered as a single line of code or use continuation marks for readability.
06-7©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Variablesmaybedefinedwithinthefunc?on(i.e.,localscope)
§ Localvariablesdefinedinthecodethatcallsafunc?onarenotvisibletothefunc?onthatisbeingcalled
§ Parametersarepassedintofunc?onsbycopying,notbyreference
§ Modifyingthevalueofaparameteronlymodifiesthelocalcopy
§ Whenafunc?onreturns,thelocalvariablesdefinedwithinthefunc?on
arelost
§ Itispossibletodefineglobalvariablesbyprefixingtheirdefini?onwiththeglobalkeyword,butthisshouldbeavoidedwheneverpossible
VariableScope
06-8©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Defini?ons– aisapassedparameter– bisaglobalvariable– cisalocalvariable
§ Changes– Inthecalltofun1()allthreevariablesareincremented
§ Results– WhatwillhappeninthemainprogramaOerthecalltofun1() – Whatwilltheresultoftheprintstatementsbe?
Scope
# -- function definition def fun1(a): global b c = 1 # -- change of values a += 1 # passed parameter b += 1 # global variable c += 1 # local variable return a # -- main a = 7 b = 3 d = fun1(a) print a # What will this do? print b # What will this do? print c # What will this do?
06-9©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ ais7
§ Whenaispassedintofun1(),fun1getsalocalcopyofa
§ Local-ahasthesamevalueasthe
ainmain,7
§ Local-aisincrementedto8
§ Whenfun1()ends,andcontrolreturnstomain,local-aisdeleted
§ Theprint astatementprints
thevalueofglobal-a,7,andnot
local-a,whichnolongerexists
Answerto“a”
# -- function definition def fun1(a): global b c = 1 # -- change of values a += 1 # passed parameter b += 1 # global variable c += 1 # local variable # -- main a = 7 b = 3 fun1(a) print a print b print c
06-10©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ bis3
§ Thestatementglobal binfun1()makesglobal-b
accessiblefromwithinthe
func?on
§ Whenbisincrementedin
fun1(),itisglobal-b
§ Theprint bstatementprints
thevalueofglobal-bwhichis
now4
Answerto“b”
# -- function definition def fun1(a): global b c = 1 # -- change of values a += 1 # passed parameter b += 1 # global variable c += 1 # local variable # -- main a = 7 b = 3 fun1(a) print a print b print c
06-11©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ cdoesnotexistinmain
§ Infun1(),alocalvariableciscreatedandgivenavalueof1
§ Local-cisincrementedto2
§ Whenfun1()ends,andcontrolreturnstomain,local-cisdeleted
§ Theprint cstatementcauses
an<Error>,becauseintheglobalcontextcneverexisted
Answerto“c”
# -- function definition def fun1(a): global b c = 1 # -- change of values a += 1 # passed parameter b += 1 # global variable c += 1 # local variable # -- main a = 7 b = 3 fun1(a) print a print b print c
06-12©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Occursonasingleline
§ returnkeywordprecedessinglelineofcode
Single-LineNotaEonFuncEons
def CtoF(tempC): return (tempC*9/5) + 32 print CtoF(25) > 77
def name(parameters): return code-line
No code-block here
06-13©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
No parentheses, pass by reference Parentheses:
execute
§ Whenafunc?onnameappearsinparentheses,meaningitispassedasa
parameter,Pythonexecutesit
§ Whenafunc?onnameappearswithoutparentheses,Pythontreatsitas
anobjectandpassesitbyreference(calledfirstclassfunc/ons)
PassbyReference
def square(number): return number*number def halve(number): return number/2 def doit(somefunction,input): return somefunction(input) doit(halve,7) > 3 doit(square,7) > 49
06-14©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
ProgramStructure
§ NamedFuncEons
§ AnonymousFunc?ons(Lambda)
§ GeneratorFuncEons
§ EssenEalPoints
§ Hands-OnExercise:ProgramStructure
06-15©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
AnonymousFuncEons
§ Enablesafunc?ontobedefinedinlinewithoutasymbolicname
– DefinedwithinthecalltoanotherfuncEon– Usefulforone-EmefuncEonswherereuseisnotanobjecEve– Limitedtoasingleline–EnyfuncEons– Codemustimplicitlyreturnavalue
lambda parameters: code-line
06-16©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Iteratesovereachelementinthecollec?onandreturnsonlythose
elementsthatmatchthecriteriadeterminedbythefunc?on
– Thecodebelowsimulatesfilter'sbasicoperaEon– Anexampleusingfilterisonthenextslide
filter()
filter(test_function, collection)
def myfilter(function, collection): newlist=[] for each in collection[1:]: if function(each): newlist.append(each) return newlist
06-17©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Anonymousfunc?onincalltofilter()
Examplefilter()
# -- display only names fewer than 6 characters # devices = ['iFruit','Sorrento','Titanic','Ronin','MeToo'] filtered_devices = filter(lambda x: len(x)<6, devices) print filtered_devices > ['Ronin', 'MeToo']
06-18©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Iteratesovereachelementinthecollec?onandreturns1-for-1elements
thataretransformedbythefunc?on
– Thecodebelowsimulatesmap'sbasicoperaEon– Anexampleusingmapisonthenextslide
map()
map(transform_function, collection)
def mymap(function, collection): newlist=[] for each in collection[1:]: newlist.append(function(each)) return newlist
06-19©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Anonymousfunc?onincalltomap()
Examplemap()
# -- CPU utilization for devices was read as strings # Convert to numeric, and turn them into percentages # cpu_util= ['32','46','18','76','23','49','91','87'] cpu_percent = map(lambda c: float(c)/100.0, cpu_util) print cpu_percent > [0.32000000000000001, 0.46000000000000002, 0.17999999999999999, 0.76000000000000001, 0.23000000000000001, 0.48999999999999999, 0.91000000000000003, 0.87]
06-20©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Youcanpassmap()mul?plecollec?onsofthesamesize
SecondExampleofmap()
# -- Find the absolute difference between ambient # temperature and device temperature # ambient = [21,26,54,87,69,77,61,47,81,49,37] device = [86,97,53,98,31,61,71,70,92,98,58] variation = map(lambda x,y: abs(x-y), ambient,device) print variation > [65, 71, 1, 11, 38, 16, 10, 23, 11, 49, 21]
06-21©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Iteratesoverasequenceofelementsinthecollec?onandperformsa
func?ononthesequence
– Thecodebelowsimulatesreduce'sbasicoperaEon– Anexampleusingreduceisonthenextslide
reduce()
reduce(reduce_function, collection)
def myreduce(function, collection): tally = collection[0] for next in collection[1:]: tally = function(tally, next) return tally
06-22©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Anonymousfunc?onincalltoreduce()
Examplereduce()
# -- restarts is a dictionary the contains number of # reported device restarts by brand name # This code uses reduce() to calculate the total # number of restarts on all devices # restarts = {'MeToo':7,'Ronin':7,'Sorrento':15, \ 'Titanic':11, 'iFruit':7} total = reduce(lambda x,y: x+y, restarts.values()) print total > 47
06-23©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
ProgramStructure
§ NamedFuncEons
§ AnonymousFuncEons(Lambda)
§ GeneratorFunc?ons
§ EssenEalPoints
§ Hands-OnExercise:ProgramStructure
06-24©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Loudacremarke?ngwantstosendanadver?singtextmessagetoevery
telephoneinthe555areacode.
§ Firstchallengeistowriteaprogramtogenerateallthephonenumbers
– 555-eee-ssss=3+1+3+1+4=Eachphoneisa12characterstring– Skipzeroesintheexchangefield– aaa-123-4567=10,000,000=about10millionnumbers– 12x10m=120MBofdata
Problem#1
872-0117CutyourmonthlypaymentinhalfwithLoudacremobileservice!
Messages
mobileudacreoL
06-25©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
def phonegen(): d1=d2=d3=d4=d5=d6=d7 = 0 phonelist = [] for d1 in range(1,10): for d2 in range(1,10): for d3 in range(1,10): for d4 in range(0,10): for d5 in range(0,10): for d6 in range(0,10): for d7 in range(0,10): phone = '555'+'-'+str(d1)+ \ str(d2)+str(d3) phone = phone+'-'+str(d4)+ \ str(d5)+str(d6)+str(d7) phonelist.append(phone) return phonelist # -- test program list_of_phones = phonegen() for each_phone in list_of_phones: # send_message(each_phone) print each_phone
§ Generatesthephonenumbersinthe555
areacodeinsequence
skippingleadingzeroes
intheexchangedigits
§ Returnsalistofthenumbersgenerated
§ Thiscodetakesabout30secondstorunon
thetargetplaaorm
phonegen()
06-26©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Theadver?singprogramwasasuccess!Marke?ngwantstoexpandto
cover250areacodesintheUnitedStates.
§ Concerns– 250areacodesx120MBeach=30GBofdata
– Mycomputeronlyhas16GBofRAM!– Thepreviousprogramtookabout30secondstogeneratethelist
– Thenewprogramwilltakeabouttwohourstogeneratethelist!
§ What'sneededisawaytoiterateovereachphonenumberinsequence
§ Buthowdoyouiteratewithafunc?on?– LocalvariablesarelostwhenthefuncEonreturns– HowwouldyoupreservethecurrentstatesoyoucouldconEnueatthenextnumberinsequenceeachEmeyoucallphonegen()?
Problem#2
06-27©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ APythongeneratorisaniterablefunc?on– AlsoknownasacofuncAonincomputerscience
§ yield – PreservesthefuncEonalcontext(localvariablesandstate)– Returnscontroltothecaller– Passesaniteratorobjectback
Generators
06-28©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Yieldwillpassbackfunc?onalcontext.Soanewwaytopassthephonenumberbacktothecallingcodeisneeded
#-- Global phone number phonenum = "- " # --- Generate all phone numbers in the target area codes def phonegen(): target_area_codes = [555, 205, 251, 256, 334, 907, 480, 520, \\ 623, 928, 501, 870, 209, 213, 310, 323, 408, 415, 510, 530, 559, \\ 562, 619, 626, 650, 661, 707, 714, 760, 805, 818, 831, 858, 909, \\ 916, 925, 949, 303, 719, 720, 970, 203, 860, 302, 305, 321, 352, \\ 386, 407, 561, 727, 754, 772, . . . . . . 704, \\ 828, 910, 919, 980, 701, 216, 234, 330, 419, 440, 513, 614, 740, \\ 717, 724, 814, 878, 401, 803, 843, 864, 210, 214, 254, 281, 361, \\ 409, 469, 512, 682, 713, 806, 817, 830, 832, 903, 915, 936, 940, \\ 956, 972, 979, 435, 801, 802, 276, 434, 540, 571, 703, 757, 804, \\ 206, 253, 360, 425, 509, 715, 920, 307]
AreaCodesforphonegen()
06-29©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ PythonGeneratorexpression
§ Returnsaniterator
phonegen() def phonegen(): global phonenum d1=d2=d3=d4=d5=d6=d7 = 0 for d1 in range(1,10): for d2 in range(1,10): for d3 in range(1,10): for d4 in range(0,10): for d5 in range(0,10): for d6 in range(0,10): for d7 in range(0,10): phone = '555'+'-'+str(d1)+ \ str(d2)+str(d3) phone = phone+'-'+str(d4)+ \ str(d5)+str(d6)+str(d7) phonenum = phone yield phonenum it = 0 for phonenum in phonegen(): if it > 50: break print phonenum it += 1
06-30©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
ProgramStructure
§ NamedFuncEons
§ AnonymousFuncEons(Lambda)
§ GeneratorFuncEons
§ Essen?alPoints
§ Hands-OnExercise:ProgramStructure
06-31©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ Namedfunc?ons
– Callbyreference§ Variablescopingandthefunc?oncontext
– Global§ Anonymous(lambda)func?ons
– Commonlyusedin:filter(),map(),reduce()
§ Generatorfunc?ons– yield
EssenEalPoints
06-32©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
ChapterTopics
ProgramStructure
§ NamedFuncEons
§ AnonymousFuncEons(Lambda)
§ GeneratorFuncEons
§ EssenEalPoints
§ Hands-OnExercise:ProgramStructure
06-33©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriAenconsentfromCloudera.
§ NamedFunc?ons
– WriteafuncEonthatcontrolshowfilter()works– PassbyreferencetoanamedfuncEon
§ AnonymousFunc?ons
– Convertfilter()programtouseanonymousfuncEon
§ GeneratorFunc?ons– WriteanumberseriesgeneratorusingastandardfuncEon
Hands-OnExercise:ProgramStructure
WorkingwithLibrariesChapter7
07-2©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
CourseChapters
§ IntroducGon
§ IntroducGontoPython
§ Variables
§ CollecGons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
07-3©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
Inthischapteryouwilllearn
§ Howtoorganizecodeintoseparatemodules
§ Howtogainandcontrolaccesstomodularcode
§ Whatisofferedincommonstandardlibraries
WorkingwithLibraries
07-4©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
ChapterTopics
WorkingwithLibraries
§ StoringandRetrievingFuncFons
§ ModuleControl
§ CommonStandardLibraries
§ EssenGalPoints
§ Hands-OnExercise:Libraries
07-5©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ Alsocalleda“library”
§ TheprogramthatisloadedintoPythonfromthecommandlineisthemainprogram– $ python program.py
§ Loadcodefromother*.pyfilesintothemainprogram– import libraryname
– Searchesthepathforlibraryname.py– Loadsthelibraryasanobject,andallfuncFonsasmethods– Toaccessfun1()inlibraryname,uselibraryname.fun1()
– from libraryname import fun1, fun2, fun3 – LoadsthefuncFonsdirectlyintothemainprogram– Toaccessfun1(),usefun1()
PythonModules
07-6©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
ExampleLibrary
def loadfile(filename): lines=[ ] file = open(filename,'rt') lines = file.readlines(); file.close() return lines def userinput(prompt): keybuffer = raw_input(prompt) return keybuffer
File:library.py
07-7©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ TheimportsyntaxloadsthefuncFonsasmethodsofthelibraryobject
§ ThefromsyntaxloadsthefuncFonsdirectlyintothemainprogramsymboltable
ExamplesofAccessingtheLibrary
import library mylist = library.loadfile('loudacre.log') print mylist print "\n\n" myline = library.userinput('Greetings: ') print myline
from library import loadfile, userinput mylist = loadfile('loudacre.log') print mylist print "\n\n" myline = userinput('Greetings: ') print myline
07-8©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
ChapterTopics
WorkingwithLibraries
§ StoringandRetrievingFuncGons
§ ModuleControl
§ CommonStandardLibraries
§ EssenGalPoints
§ Hands-OnExercise:Libraries
07-9©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ AtPythonstartup– Currentdirectorywheretheprogramislocated– PYTHONPATH
– OS-dependentpath– AlistofdirectorynameswiththesamesyntaxasPATHintheOS– Pathtodefaultlibrariessuchasprefix/lib/pythonversion
§ AXerprogramisrunning– Thepathisavailableandcanbechangedbyarunningprogram– Itislocatedinsys.path – Addtopathwithsys.path.append(newpath)
HowDoesPythonFindLibraries?
07-10©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ Addingauser-defineddirectoryoflibraries
ExampleofAddingtothePath
import sys print sys.path > ['', '/Library/Frameworks/Python.framework/Versions/2.7/bin', '/Library/Python/2.7/site-packages/bigquery-2.0.17-py2.7.egg', '/Library/Python/2.7/site-packages/httplib2-0.8-py2.7.egg', '/Library/Python/2.7/site-packages/oauth2client-1.2-py2.7.egg', . . .and so forth . . . sys.path.append('/home/user/python-libs')
07-11©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ ProvidesasinglelocaFonformaintenanceofsite-specificpaths
§ Onimport,siteextendssys.pathwithsys.prefixandsys.exec_prefix – Theexactfiles/pathsareOS-dependent
– ExampleUnix:lib/python$version/site-packagesandlib/site-python
– Youcangloballyset/changesite-specificpaths
Site-SpecificPaths
import sys print sys.path print sys.prefix print sys.exec_prefix import site print site.PREFIXES print site.USER_BASE print site.USER_SITE
07-12©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ Standard(shipswithPython)– Notloadedbydefault,mustbeimported– Examples:sys,math,re
§ Separatelyinstalled– *.eggfiles– APython“egg”(*.egg)isadistribuGonofaPythonprojectthatmaycontaincode,metadata,andresources– Separatelyinstalledusingpiporeasy_install
WhereDoLibrariesComeFrom?
/Library/Python/2.7/site-packages
bigquery-2.0.17-py2.7.egg oauth2client-1.2-py2.7.egg pip-1.5.2-py2.7.egg google_api_python_client-1.2-py2.7.egg python_gflags-2.0-py2.7.egg
07-13©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ import libraryname
§ dir(libraryname)
What'sinaLibrary?
import math dir(math) > ['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'hypot', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']
07-14©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ __doc__ContainsadocumentaFonstring
§ __name__ – Containsastringwiththecontextinwhichthecodeisrunning,nameofthelibraryifimported,or__main__
§ __file__Containsthepathfromwhichthelibraryoriginated
SpecialStrings
print math.__doc__ > 'This module is always available. It provides access to the\nmathematical functions defined by the C standard.' print math.__name__ > 'math' print math.__file__ > '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/math.so'
07-15©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ Aprogramcantest__name__todetermineifitisrunningaXerimportinamainprogramorifitisrunningstandalone– Oneuseistotestcodethatonlyrunsifthelibraryisexecutedstandalone
Exampleof__name__
if __name__ == '__main__': # only prints if in main program print "running library test" def loadfile(filename): lines=[ ] file = open(filename,'rt') lines = file.readlines(); file.close() return lines
$ python library.py > running library test
07-16©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ Argumentsarepassedfromthecommandlineusingthesysmodule– sysisabuilt-in;notenabledbydefault,youhavetoimportit– import sys
§ Argumentsonthecommandlinearepassedasspace-delimitedstring
PassingCommandLineArguments
import sys arguments = len(sys.argv) for i in range(0,arguments): print "argv[",i,"] ", sys.argv[i]
$ python args.py 1 2 3 à sys.argv[0]='1', sys.argv[1]='2', sys.argv[2]='3'
$ python args.py 1,2,3 à sys.argv[0] ='1,2,3'
07-17©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
ChapterTopics
WorkingwithLibraries
§ StoringandRetrievingFuncGons
§ ModuleControl
§ CommonStandardLibraries
§ EssenGalPoints
§ Hands-OnExercise:Libraries
07-18©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ import sys – Importssystemenvironmentobject– help(sys)àdetailsonavailablevaluesandmethods
– sys.version • GivestheversionofPython
'2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]'
– sys.executable • GivesthepathtothePythonexecutable
– sys.getrefcount(var) • Givestherefcountofavariable
Built-insys
07-19©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ import math
§ Importsthemathmodule § math.ceil()
– roundsuptoawholenumber,returnsafloat
§ math.floor() – roundsdowntowholenumber,returnsafloat
§ math.sqrt() – squareroot
§ math.pi() – constant
§ math.e() – constant
Built-inmath
More information: https://docs.python.org/2/library/math.html
07-20©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ import re – ImportsregularexpressionmoduleforsophisGcatedpaCernmatching– re.compile(pattern)
– Createsthepatternobject– Loadsandparsestheregularexpressionsearchstring
– pattern.match(content) – createstheMatchobject– Executestheregularexpressioncodeandstorestheresults– YoucanthencallmethodsonthepaCernobjecttoinvesGgate
• pattern.start(),pattern.end() • pattern.group(),pattern.span()
Built-inre
07-21©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ RegExre-defineselementswithinthepa^ernstringthatalreadyhaveadifferentmeaningtoPythonoutsideofthesearchstring– [ ]=delimitsagroupofcharacters,anyofwhichareamatch– [0-9]=matchescharactersbetweenzeroandnine– [^0-9]=matchesanycharactersexceptzerothroughnine– [0-9]*=matchesnumberofGmes(zeroormorerepeGGons)– .=matchesanycharacterexceptanewline– \w=matchesanywhitespacecharacter
§ Pythonwrinkle…ifyouwantedtomatch'['– you'dhavetoescapeitlikethisinthesearchstring:\[ – but…Pythonwillprocesstheescapesequenceinthestring…– so…you'dneedtoescapetheescapeandthebracket:\\\\[ – or…usePythonraw:r"\[" – or…triplequotes:"""match[this] string exactly"""
OverviewofRegularExpressions
07-22©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
RegExExample
import re regobj = re.compile("[a-z]*") print(type(regobj)) mat = regobj.match("abcdefghijk") print(mat) print(type(mat)) print print(mat.start(), mat.end(), mat.group(), mat.span()) > <type '_sre.SRE_Pattern'> > <_sre.SRE_Match object at 0x1006ae030> > <type '_sre.SRE_Match'> > (0, 11, 'abcdefghijk', (0, 11))
07-23©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
ChapterTopics
WorkingwithLibraries
§ StoringandRetrievingFuncGons
§ ModuleControl
§ CommonStandardLibraries
§ EssenFalPoints
§ Hands-OnExercise:Libraries
07-24©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ CreaFngandUsingLibraries– Method-callingsemanGcs
– import library – FuncGon-callingsemanGcs
– from library import functions § Control
– Paths– Specialstrings– Pipand*.egg
§ StandardLibraries– sys,math,site,re
EssenGalPoints
07-25©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
ChapterTopics
WorkingwithLibraries
§ StoringandRetrievingFuncGons
§ ModuleControl
§ CommonStandardLibraries
§ EssenGalPoints
§ Hands-OnExercise:Libraries
07-26©Copyright2010-2016Cloudera.Allrightsreserved.NottobereproducedorsharedwithoutpriorwriCenconsentfromCloudera.
§ DemonstratebasicskillswithPythonModules– Createalibraryfileandcallitfromamainprogram– MovethelibrarytoadifferentlocaGon– UseacommonstandardlibrarytointerfacewiththeOS– UsetheRegExlibrarytoperformasophisGcatedsearch
Hands-OnExercise:Libraries
ConclusionChapter8
08-2©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
CourseChapters
§ IntroducDon
§ IntroducDontoPython
§ Variables
§ CollecDons
§ FlowControl
§ ProgramStructure
§ WorkingwithLibraries
§ Conclusion
08-3©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
Duringthiscourse,youwilllearn
§ “JustEnough”PythonProgramming– "JustEnough"meanstoenableasolidfoundaDonforHands-OnExercisesinClouderatrainingclasses– Not"proficientasaPythonprogrammer”(Pythonista,Pythoneer)
CourseObjecDves
08-4©Copyright2010-2016Cloudera.Allrightsreserved.Nottobereproducedorsharedwithoutpriorwri@enconsentfromCloudera.
WhichCoursetoTakeNext?
Clouderaoffersarangeoftrainingcoursesforyouandyourteam
§ Fordevelopers– ClouderaDeveloperTrainingforApacheSparkandHadoop– DesigningandBuildingBigDataApplica;ons
§ Forsystemadministrators– ClouderaAdministratorTrainingforApacheHadoop
§ FordataanalystsanddatascienFsts– ClouderaDataAnalystTraining:UsingPig,Hive,andImpalawithHadoop– DataScienceatScaleusingSparkandHadoop
§ Forarchitects,managers,CIOs,andCTOs– ClouderaEssen;alsforApacheHadoop