Hackathon Hadoop Exercise

  • 8/12/2019 Hackathon Hadoop Exercise

    IBM Software An IBM Proof of Technology

    Hadoop Basics with InfoSphere BigInsights

    Lesson 2: Hadoop architecture

    Catalog Number

    Copyright IBM Corporation, 2013

    US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

    Contents

    Lab 1 Exploring Hadoop Distributed File System

    1.1 Getting Started

    1.2 Exploring Hadoop Distributed File System (Terminal)

    1.2.1 Using the command line interface

    1.3 Exploring Hadoop Distributed File System (Web Console)

    1.3.1 Using the Web Console

    1.3.2 Working with the Welcome page

    1.3.3 Administering BigInsights

    1.3.4 Inspecting the status of your cluster

    1.3.5 Starting and stopping a component

    1.3.6 Working with Files

    1.4 Summary

    Lab 1 Exploring Hadoop Distributed File System

    The overwhelming trend towards digital services, combined with cheap storage, has generated massive amounts of data that enterprises need to effectively gather, process, and analyze. Data analysis techniques from the data warehouse and high-performance computing communities are invaluable for many enterprises; however, their cost or the complexity of scaling up often discourages the accumulation of data without an immediate need. Because valuable knowledge may nevertheless be buried in this data, related scaled-up technologies have been developed. Examples include Google's MapReduce and the open-source implementation, Apache Hadoop.

    Hadoop is an open-source pro...

    1.1 Getting Started

    To prepare for the contents of this lab, you must go through the process of getting all of the Hadoop components started.

    __1. Start the VMware image by clicking the Play virtual machine button in the VMware Player if it is not already on.

    __2. Log in to the VMware virtual machine using the following credentials.

    User: biadmin

    Password: biadmin

    __3. After you log in, your screen should look similar to the one below.

    Before we can start working with the Hadoop Distributed File System, we must first start all the BigInsights components. There are two ways of doing this: through the terminal, or by simply double-clicking an icon. Both of these methods will be shown in the following steps.

    __4. Now open the terminal by double-clicking the BigInsights Shell icon.

    __5. Click on the Terminal icon.

    __6. Once the terminal has been opened, change to the $BIGINSIGHTS_HOME/bin directory (by default, $BIGINSIGHTS_HOME is /opt/ibm/biginsights).

    cd $BIGINSIGHTS_HOME/bin

    or

    cd /opt/ibm/biginsights/bin

    __7. Start the Hadoop components (daemons) on the BigInsights server. You can practice starting all components with these commands. Please note that they will take a few minutes to run.

    ./start-all.sh

    __8. Sometimes certain Hadoop components may fail to start. You can start and stop the failed components one at a time by using start.sh and stop.sh respectively. For example, to start and stop Hive use:

    ./start.sh hive

    ./stop.sh hive

    Notice that since Hive did not initially fail, the terminal is telling us that Hive is already running.
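
    The stop-then-start sequence from the step above can be wrapped in a small shell function. This is only a sketch: restart_component is a hypothetical helper name, and it assumes that start.sh and stop.sh sit in $BIGINSIGHTS_HOME/bin and take the component name as their only argument, as the hive example suggests.

```shell
# Hypothetical helper: restart one BigInsights component.
# Assumption: start.sh and stop.sh live in $BIGINSIGHTS_HOME/bin and accept
# the component name (for example "hive") as their only argument.
restart_component() {
    component="$1"
    if [ -z "$component" ]; then
        echo "usage: restart_component <component>" >&2
        return 2
    fi
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "${BIGINSIGHTS_HOME:-/opt/ibm/biginsights}/bin" || exit 1
        ./stop.sh "$component" && ./start.sh "$component"
    )
}
```

    For example, restart_component hive would run ./stop.sh hive followed by ./start.sh hive from the bin directory.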

    __9. Once all components have started successfully you may move on.

    __10. If you would like to stop all components, execute the command below. However, for this lab please leave all components started.

    ./stop-all.sh

    Next, let us look at how you would start all the components by double-clicking an icon.

    __11. Double-clicking on the Start BigInsights icon executes a script that performs the steps mentioned above. Once all components are started, the terminal exits and you are set. Simple.

    __12. We can stop the components in a similar manner, by double-clicking on the Stop BigInsights icon (to the right of the Start BigInsights icon).

    Now that all components are started you may move on to the next section.

    1.2 Exploring Hadoop Distributed File System (Terminal)

    Hadoop Distributed File System (HDFS) allows user data to be organized in the form of files and directories. It provides a command line interface called FS shell that lets a user interact with the data in HDFS accessible to Hadoop MapReduce programs.

    There are two methods to interact with HDFS:

    1. You can use the command-line approach and invoke the FileSystem (fs) shell using the format: hadoop fs

    2. You can also manipulate HDFS using the BigInsights Web Console.

    We will be using both methods in this lab.

    1.2.1 Using the command line interface

    We will start with the hadoop fs -ls command, which returns the list of files and directories with permission information.

    Ensure the Hadoop components are all started, and from the same terminal window as before (and logged on as biadmin), follow these instructions:

    __1. List the contents of the root directory.

    hadoop fs -ls /

    __2. To list the contents of the /user/biadmin directory, execute:

    hadoop fs -ls

    or

    hadoop fs -ls /user/biadmin

    Note that in the first command there was no directory referenced, but it is equivalent to the second command where /user/biadmin is explicitly specified. Each user gets its own home directory under /user. For example, in the case of user biadmin, the home directory is /user/biadmin. Any command where there is no explicit directory specified will be relative to the user's home directory. User space in the native file system (Linux) is generally found under /home/biadmin or /usr/biadmin, but in HDFS user space is /user/biadmin (spelled as "user" rather than "usr").
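
    The home-directory rule described above can be sketched as a tiny shell function. This only mimics the rule for illustration; Hadoop resolves paths internally, and the function name hdfs_abs_path and its arguments are hypothetical.

```shell
# Illustrative only: mimics how a relative HDFS path is resolved against the
# user's home directory (/user/<name>). The function name and its arguments
# are hypothetical; Hadoop performs this resolution itself.
hdfs_abs_path() {
    path="$1"
    user="${2:-biadmin}"
    case "$path" in
        /*) printf '%s\n' "$path" ;;                   # already absolute
        "") printf '/user/%s\n' "$user" ;;             # no path: the home directory
        *)  printf '/user/%s/%s\n' "$user" "$path" ;;  # relative to the home directory
    esac
}

hdfs_abs_path ""               # prints /user/biadmin
hdfs_abs_path test             # prints /user/biadmin/test
hdfs_abs_path /user/biadmin    # prints /user/biadmin
```

    This is why hadoop fs -ls and hadoop fs -ls /user/biadmin list the same directory when run as biadmin.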

    __3. To create the directory test you can issue the following command:

    hadoop fs -mkdir test

    __4. Issue the ls command again to see the subdirectory test:

    hadoop fs -ls /user/biadmin

    The result of ls here is similar to that found with Linux, except for the second column (in this case either "1" or "-"). The "1" indicates the replication factor (generally "1" for pseudo-distributed clusters and "3" for distributed clusters); directory information is kept in the namenode and thus not sub...

    __7. To move files between your regular Linux file system and HDFS you can use the put and get commands. For example, move the text file README to the Hadoop file system:

    hadoop fs -put /home/biadmin/README README

    hadoop fs -ls /user/biadmin

    You should now see a new file called /user/biadmin/README listed as shown above.
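
    Running -put again against a target that already exists normally fails, so a guarded upload can be convenient when repeating the lab. The sketch below relies on hadoop fs -test -e, which exits with status 0 when the path exists; hdfs_put_once is a hypothetical helper name, not part of Hadoop or BigInsights.

```shell
# Hypothetical helper: upload a local file to HDFS unless the target exists.
# "hadoop fs -test -e <path>" exits with status 0 when the path exists.
hdfs_put_once() {
    local_file="$1"
    hdfs_path="$2"
    if hadoop fs -test -e "$hdfs_path"; then
        echo "skipped: $hdfs_path already exists in HDFS"
    else
        hadoop fs -put "$local_file" "$hdfs_path"
    fi
}
```

    For example, hdfs_put_once /home/biadmin/README README uploads the file the first time and only prints a message on later runs.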

    __8. In order to view the contents of this file use the -cat command as follows:

    hadoop fs -cat README

    You should see the output of the README file (that is stored in HDFS). We can also use the Linux diff command to see if the file we put in HDFS is actually the same as the original on the local filesystem.

    __9. Execute the commands below to use the diff command.

    cd /home/biadmin/

    diff <( hadoop fs -cat README ) README

    Since the diff command produces no output we know that the files are the same (the diff command prints all the lines in the files that differ).
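
    The <( ... ) form used with diff above is bash process substitution. As an alternative sketch that also works in a plain POSIX shell, the HDFS copy can be piped into cmp and the exit status checked. The helper name hdfs_same_as_local is hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) when the HDFS file and the
# local file have identical contents.
hdfs_same_as_local() {
    hdfs_path="$1"
    local_file="$2"
    # "cmp -s" is silent; "-" makes it read the HDFS copy from standard input.
    # The function's exit status is the exit status of cmp.
    hadoop fs -cat "$hdfs_path" | cmp -s - "$local_file"
}
```

    A possible use is: if hdfs_same_as_local README /home/biadmin/README; then echo "identical"; fi. Unlike diff, this reports only whether the files differ, not which lines.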

    To find the size of files you need to use the -du or -dus commands. Keep in mind that these commands return the file size in bytes.

    __10. To find the size of the README file use the following command.

    hadoop fs -du README

    __11. To find the size of all files individually in the /user/biadmin directory use the following command:

    hadoop fs -du /user/biadmin

    __12. To find the total size of all files in the /user/biadmin directory use the following command.

    hadoop fs -dus /user/biadmin
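
    Because -du and -dus report sizes in bytes, a small conversion helper can make large numbers easier to read. The function below is a sketch with a hypothetical name; it takes a byte count as its argument and does not parse the hadoop output itself, since that format can differ between versions.

```shell
# Hypothetical helper: convert a byte count into a rough human-readable size.
# Integer division truncates, so 1536 bytes is reported as "1 KiB".
bytes_to_human() {
    bytes="$1"
    unit="B"
    for next in KiB MiB GiB TiB; do
        if [ "$bytes" -lt 1024 ]; then
            break
        fi
        bytes=$((bytes / 1024))
        unit="$next"
    done
    printf '%s %s\n' "$bytes" "$unit"
}

bytes_to_human 512        # prints 512 B
bytes_to_human 1048576    # prints 1 MiB
```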

    __13. If you would like to get more information about hadoop fs commands, invoke -help as follows.

    hadoop fs -help

    __14. For specific help on a command, add the command name after help. For example, to get help on the dus command you'd do the following.

    hadoop fs -help dus

    We are now done with the terminal section; you may close the terminal.

    1.3 Exploring Hadoop Distributed File System (Web Console)

    The first step to accessing the BigInsights Web Console is to launch all of the BigInsights processes (Hadoop, Hive, Oozie, Map/Reduce, etc.). They should have been started at the beginning of this lab.

    1.3.1 Using the Web Console

    __1. Start the Web Console by double-clicking on the BigInsights WebConsole icon.

    __2. Verify that your Web console appears similar to this, and note each section: Tasks: Quick access to popular BigInsights tasks; Quick Links: Links to internal and external quick links and downloads to enhance your environment; and Learn More: Online resources available to learn more about BigInsights.

    1.3.2 Working with the Welcome page

    This section introduces you to the Web console's main page displayed through the Welcome tab. The Welcome page features links to common tasks, many of which can also be launched from other areas of the console. In addition, the Welcome page includes links to popular external resources, such as the BigInsights InfoCenter (product documentation) and community forum. You'll explore several aspects of this page.

    __3. In the Welcome tab, the Tasks pane allows you to quickly access common tasks. Select the View, start or stop a service task. If necessary, scroll down.

    __4. This takes you to the Cluster Status tab. Here, you can stop and start Hadoop services, as well as gain additional information as shown in the next section.

    __5. Click on the Welcome tab to return to the main page.

    __6. Inspect the Quick Links pane at top right and use its vertical scroll bar (if necessary) to become familiar with the various resources accessible through this pane. The first several links simply activate different tabs in the Web console, while subsequent links enable you to perform set-up functions, such as adding BigInsights plug-ins to your Eclipse development environment.

    __7. Inspect the Learn More pane at lower right. Links in this area access external Web resources that you may find useful, such as the Accelerator demos and documentation, BigInsights InfoCenter, a public discussion forum, IBM support, and IBM's BigInsights product site. If desired, click on one or more of these links to see what's available to you.

    __11. Optionally, cut-and-paste the URL for Hive's Web interface into a new tab of your browser. You'll see an open source tool provided with Hive for administration purposes, as shown below.

    __12. Close this tab and return to the Cluster Status section of the BigInsights Web console.

    1.3.5 Starting and stopping a component

    __13. If necessary, click on the Hive service to display its status.

    __14. In the pane to the right (which displays the Hive status), click the red Stop button to stop the service.

    __15. When prompted to confirm that you want to stop the Hive service, click OK and wait for the operation to complete. The right pane should appear similar to the following image.

    __16. Restart the Hive service by clicking on the green arrow

    1.3.6 Working with Files

    The Files tab of the console enables you to explore the contents of your file system, create new subdirectories, upload small files for test purposes, and perform other file-related functions. In this module, you'll learn how to perform such tasks against the Hadoop Distributed File System (HDFS) of BigInsights.

    __17. Click on the Files tab of the console to begin exploring your distributed file system.

    __18. Expand the directory tree shown in the pane at left (/user/biadmin). If you already uploaded files to HDFS, you'll be able to navigate through the directory to locate them.

    __19. Become familiar with the functions provided through the icons at the top of this pane, as we'll refer to some of these in subsequent sections of this module. Simply point your cursor at an icon to learn its function. From left to right, the icons enable you to copy a file or directory, move a file, create a directory, rename, upload a file to HDFS, download a file from HDFS to your local file system, delete a file from HDFS, set permissions, open a command window to launch HDFS shell commands, and refresh the Web console page.

    __20. Position your cursor on the /user/biadmin directory and click the Create Directory icon to create a subdirectory for test purposes.

    __21. When a pop-up window appears prompting you for a directory name, enter ConsoleLab and click OK.

    __22. Expand the directory hierarchy to verify that your new subdirectory was created.

    __23. Create another directory named ConsoleLabTest.

    __24. Use the Rename icon to rename this directory to ConsoleLabTest2.

    __25. Click the Move icon. When the pop-up Move screen appears, select the ConsoleLab directory and click OK.

    __26. Using the Set Permission icon, you can change the permission settings for your directory. When finished, click OK.

    __27. While highlighting the ConsoleLabTest2 folder, select the Remove icon and delete the directory.

    __28. Remain in the ConsoleLab directory, and click the Upload icon to upload a small sample file for test purposes.

    __29. When the pop-up window appears, click the Browse button to browse your local file system for a sample file.

    __30. Navigate through your local file system to the directory where BigInsights was installed. For the IBM-provided VMware image, BigInsights is installed in file system: /opt/ibm/biginsights.

    Locate the /IHC subdirectory and select the CHANGES.txt file. Click Open.

    __31. Verify that the window displays the name of this file. Note that you can continue to Browse for additional files to upload and that you can delete files as upload targets from the displayed list. However, for this exercise, simply click OK.

    __32. When the upload completes, verify that the CHANGES.txt file appears in the directory tree at left. If it is not immediately visible, click the refresh button. On the right, you should see a subset of the file's contents displayed in text format.

    __33. Highlight the CHANGES.txt file in your ConsoleLab directory and click the Download button.

    __34. When prompted, click the Save File button. Then select OK.

    __35. If Firefox is set as the default browser, the file will be saved to your user Downloads directory. For this exercise, the default directory location is fine.
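
    For comparison, the file operations performed through the Files tab in this section map roughly onto FS shell commands. The sketch below is not what the Web Console runs internally; it assumes you are logged on as biadmin (so relative paths resolve under /user/biadmin), and the permission value 750, the run wrapper, and the DRY_RUN switch are illustrative choices only.

```shell
# Sketch of FS shell commands that roughly mirror the Web Console steps above.
# Assumptions: run as biadmin, so relative paths resolve under /user/biadmin;
# the permission value 750 and the DRY_RUN switch are illustrative only.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # only show the command
    else
        "$@"         # execute it
    fi
}

console_lab_cli() {
    run hadoop fs -mkdir ConsoleLab                        # create a directory
    run hadoop fs -mkdir ConsoleLabTest                    # create another one
    run hadoop fs -mv ConsoleLabTest ConsoleLabTest2       # rename
    run hadoop fs -mv ConsoleLabTest2 ConsoleLab           # move into ConsoleLab
    run hadoop fs -chmod 750 ConsoleLab/ConsoleLabTest2    # set permissions
    # -rmr matches the Hadoop 1.x era of this lab; newer releases use "-rm -r".
    run hadoop fs -rmr ConsoleLab/ConsoleLabTest2          # remove the directory
    run hadoop fs -put /opt/ibm/biginsights/IHC/CHANGES.txt ConsoleLab   # upload
    run hadoop fs -get ConsoleLab/CHANGES.txt .            # download
}
```

    Setting DRY_RUN=1 before calling console_lab_cli prints the eight commands without touching HDFS, which is a safe way to review them first.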

    1.4 Summary

    Congratulations! You're now familiar with the Hadoop Distributed File System. You now know how to manipulate files within it by using the terminal and the BigInsights Web Console. You may move on to the next Unit.

    Copyright IBM Corporation 2013.

    The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. References in these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. This information is based on current IBM product plans and strategy, which are subject to change by IBM without notice. Product release dates and/or capabilities referenced in these materials may change at any time at IBM's sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way.

    IBM, the IBM logo and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.