Computing tools for data analysis
a practical introduction
First things first:
• Can you launch ROOT?
• Download the examples from here (two files)
– http://phy.duke.edu/~atarce/Hep101/
– move both files to the directory where you launch ROOT.
What is LHC computing?
pattern recognition
storing events
brokering access to data
simulating events
calculating observables
making pretty pictures
classifying signals
classifying events
analysis
production
operations
Outline
• Brief overview of production and operations
• The last step: analysis
– introduction to data representation
– Exercise: plotting and fitting data with ROOT
• The first step: simulation
– introduction to Monte Carlo simulation
– Exercise/homework: writing a toy Monte Carlo
LHC computing is huge!
• LHC data output estimate: 15 PB/year
– stored and processed on the shared WLCG (Worldwide LHC Computing Grid)
– and… we need multiple copies!
• US LHC: nearly 500k CPU-hours daily.
ATLAS computer use
[Figure: ATLAS computer use, split into production and analysis]
ATLAS computing for users
• Main programming languages:
– FORTRAN (some generators)
– C++ (main reconstruction algorithms, analysis)
– Python (steering, analysis)
• Main interactive interfaces:
– athena
• controls MC and data production
– root (C++ interface)
• viewing/analyzing reconstructed data
• pyroot (for Python fans: in Python, “import ROOT”)
Production: data
[Diagram: data production chain. Labels: ATLAS trigger (fast pattern recognition, event sorting); bytestream; convert; RDO (raw); RECO (pattern recognition); ESD, AOD, TAG; merge & derive (more sorting); AOD, D3PD]
Production: Monte Carlo
[Diagram: Monte Carlo production chain. Labels: RDO (raw); RECO; ESD, AOD, TAG; merge & derive; AOD, D3PD]
Data representation
• always organized by event
• global quantities:
– metadata
– missing energy…
• physics object lists:
– muons
– jets
– tracks
– “truth” particles…
• object properties:
– hits on tracks
– jet constituents
[Figure: an event shown as an “n-tuple” or “tree” containing a µ, jets, and tracks, with tracks containing hits]
Data representation
Event number  nTracks  track pT  track eta  track phi  track layers  …
0             3        12.4      0.3        2.1        30
                       8.1       1.1        1.0        14
                       5.0       -0.9       4.0        17
1             2        24.5      1.1        0.2        22
                       20.5      0.9        3.3        17
2             1        2.0       1.9        1.4        5
3             5        50.4      0.1        0.8        21
…             …        …         …          …          …
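The table above can be read as one record per event holding a variable-length list of tracks. A minimal sketch of that layout in plain C++ follows; the type and function names (Track, Event, makeExampleEvents, nTracks, leadingTrackPt) are hypothetical and are not part of ROOT or of the D3PD format. Only the first three events are filled in, because the table truncates event 3.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One reconstructed track; the fields follow the table columns.
struct Track {
    double pT;   // transverse momentum
    double eta;  // pseudorapidity
    double phi;  // azimuthal angle
    int layers;  // number of layers
};

// One event: a global event number plus a variable-length track list.
struct Event {
    int eventNumber;
    std::vector<Track> tracks;
};

// Build events 0, 1 and 2 from the table.
std::vector<Event> makeExampleEvents() {
    return {
        {0, {{12.4, 0.3, 2.1, 30}, {8.1, 1.1, 1.0, 14}, {5.0, -0.9, 4.0, 17}}},
        {1, {{24.5, 1.1, 0.2, 22}, {20.5, 0.9, 3.3, 17}}},
        {2, {{2.0, 1.9, 1.4, 5}}},
    };
}

// nTracks does not need its own field here: it is the length of the list.
std::size_t nTracks(const Event& e) { return e.tracks.size(); }

// Highest track pT in an event (the event must contain at least one track).
double leadingTrackPt(const Event& e) {
    assert(!e.tracks.empty());
    double best = e.tracks.front().pT;
    for (const Track& t : e.tracks) {
        if (t.pT > best) best = t.pT;
    }
    return best;
}
```

A flat n-tuple stores the same information as one array per column plus the nTracks counter, which is why the table repeats the track columns once per track.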
User’s interface to nature: histograms
• "Hello World" for HEP computing: making a histogram

    histo = makeHisto(nbins=50, firstbin=0*GeV, lastbin=200*GeV)
    for thisEvent in allEvents:
        if HasZ(thisEvent):
            m = recoZMass(thisEvent)
            histo.FillWith(m)

• In ROOT:

    TH1F("name", "title; x title; y title", nBins, firstBinValue, lastBinValue)
    TH1F::Fill(value, weight)
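To make the fill step concrete, here is a minimal sketch of what a fixed-width histogram does with (value, weight). It is an illustrative stand-in written with the C++ standard library, not ROOT's implementation: the class name Histo1D and its accessors are hypothetical, and bins are numbered from 0 here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal fixed-width 1D histogram. The constructor arguments mirror the
// (nBins, firstBinValue, lastBinValue) part of the TH1F constructor.
class Histo1D {
public:
    Histo1D(std::size_t nBins, double lo, double hi)
        : lo_(lo), hi_(hi), counts_(nBins, 0.0) {
        assert(nBins > 0 && hi > lo);
    }

    // Add `weight` to the bin containing `value`. Values outside [lo, hi)
    // are tallied separately as underflow or overflow.
    void Fill(double value, double weight = 1.0) {
        if (value < lo_) { underflow_ += weight; return; }
        if (value >= hi_) { overflow_ += weight; return; }
        const double width = (hi_ - lo_) / counts_.size();
        std::size_t bin = static_cast<std::size_t>((value - lo_) / width);
        if (bin >= counts_.size()) bin = counts_.size() - 1;  // rounding guard
        counts_[bin] += weight;
    }

    double BinContent(std::size_t bin) const { return counts_.at(bin); }
    double Underflow() const { return underflow_; }
    double Overflow() const { return overflow_; }

private:
    double lo_, hi_;
    std::vector<double> counts_;
    double underflow_ = 0.0, overflow_ = 0.0;
};
```

With 50 bins from 0 to 200, each bin is 4 units wide, so a value of 91.2 lands in bin 22 of this sketch.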
EXERCISE ONE
Note: in code examples, your input is given in green.
Let’s look at a mini-D3PD:
• Open the ROOT file:
– you% root Hep101MysteryEvents.root
• Browse it:
– root [1] new TBrowser();
• Notice the TTree (event data).
• Look at some histograms
– Click to make histograms!
– Also try “drawing” more complex functions…
Interacting with the tree
• Histograms: TH1F
• Functions: TF1
• Drawing:
– root [2] example->Draw("mc_px");
• you now have a default histogram called htemp.
– root [3] TH1F *h2 = new TH1F("h2", "myHist;GeV", 10, -200, 200);
– root [4] example->Draw("mc_px/1000.>>h2");
– Note: the expression can contain most functions:
• mc_px + mc_py
• mc_px * cos(1.2)
• sqrt(mc_px*mc_py)
• Note: ROOT is also a nice command-line calculator with Lorentz vectors, special functions, etc.
PART TWO Monte Carlo in HEP
What is Monte Carlo?
• HEP predictions require a lot of convolution integrals
– one reason: QM!
• The Monte Carlo Method:
– use random numbers as an integration tool
Monte Carlo calculation of π:

    pick random x, random y
    if y^2 < 1 - x^2: increment area
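The π calculation above can be sketched in a few lines. This version uses the C++ standard library generator std::mt19937 rather than ROOT's TRandom3, and the function name estimatePi is hypothetical.

```cpp
#include <cassert>
#include <random>

// Estimate pi by throwing nThrows random points into the unit square and
// counting those under the quarter circle y^2 < 1 - x^2.
double estimatePi(unsigned long nThrows, unsigned int seed) {
    assert(nThrows > 0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    unsigned long inside = 0;
    for (unsigned long i = 0; i < nThrows; ++i) {
        const double x = uniform(gen);
        const double y = uniform(gen);
        if (y * y < 1.0 - x * x) ++inside;  // "increment area"
    }
    // inside / nThrows estimates the quarter-circle area, pi / 4.
    return 4.0 * static_cast<double>(inside) / static_cast<double>(nThrows);
}
```

The statistical error on the estimate falls like 1/sqrt(nThrows), so each additional digit of precision costs about 100 times more throws.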
What is Monte Carlo?
• The Monte Carlo Method:
– use random numbers as an integration tool
– useful because predictions in HEP require a lot of convolution integrals!
• Much simpler picture:
– a series of random choices
[Diagram: Z picks mass and decay angles → electron ET → calorimeter mismeasurement → observed electron ET]
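The "series of random choices" picture can be sketched as a toy generator. Everything beyond the chain of steps is an assumption made for illustration and is not taken from the slides: the Z is produced at rest, the electron is massless, the decay is flat in cos(theta), the mass follows a simple Breit-Wigner shape restricted to a 66 to 116 GeV window, and the calorimeter resolution is a Gaussian fraction of ET supplied by the caller. The names ToyElectron and generateToyElectron are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct ToyElectron {
    double zMass;       // GeV, mass chosen for this Z
    double trueET;      // GeV, electron ET before detector effects
    double observedET;  // GeV, electron ET after calorimeter smearing
};

// Generate one toy event as a series of random choices.
// `resolution` is a hypothetical fractional ET resolution (e.g. 0.03).
ToyElectron generateToyElectron(std::mt19937& gen, double resolution) {
    assert(resolution >= 0.0);
    const double kZMass = 91.19;  // GeV
    const double kZWidth = 2.50;  // GeV, full width

    // 1. Z picks a mass: Breit-Wigner (Cauchy) shape, resampled until it
    //    falls inside the mass window.
    std::cauchy_distribution<double> lineShape(kZMass, 0.5 * kZWidth);
    double mass = lineShape(gen);
    while (mass < 66.0 || mass > 116.0) mass = lineShape(gen);

    // 2. Z picks a decay angle (flat in cos(theta) in this toy).
    std::uniform_real_distribution<double> flat(-1.0, 1.0);
    const double cosTheta = flat(gen);
    const double sinTheta = std::sqrt(1.0 - cosTheta * cosTheta);

    // 3. Electron ET: in the Z rest frame the electron energy is mass / 2.
    const double trueET = 0.5 * mass * sinTheta;

    // 4. Calorimeter mismeasurement: Gaussian smearing proportional to ET.
    std::normal_distribution<double> unitGaus(0.0, 1.0);
    const double observedET = trueET * (1.0 + resolution * unitGaus(gen));

    return {mass, trueET, observedET};
}
```

Filling a histogram with observedET over many calls performs the convolution of the line shape, the angular distribution, and the detector response numerically, without writing any of the integrals down.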
Meet your Monte Carlo
Slides: Sjöstrand
Meet your MC: PYTHIA, HERWIG, MadGraph, MCFM, MC@NLO, BaurMC, POWHEG, etc.
Meet your MC: PYTHIA, HERWIG/JIMMY, …
EXERCISE TWO
Note: in code examples, your input is given in green.
Let’s write a Monte Carlo!
B± meson decays in top events
• The B± is unstable but has a long lifetime (picoseconds) because it decays weakly.
• We often measure the lifetime from the flight distance (picoseconds are hard to measure, micrometers are not).
• The flight distance for a B± produced at the origin of the ATLAS detector is L. Lxy is the projection of this distance onto the plane perpendicular to the beam.
[Figure: coordinate axes x, y, z]
Generating the Lxy distribution
[Figure: path of the B±, the tracks from its decay products, and Lxy in the xy-plane, drawn relative to the z axis]
We use Lxy because distances perpendicular to the beam are most precisely measured.
Writing the Monte Carlo
The “proper decay length”
• proper decay length = c * (lifetime in B± rest frame)
• Tables often list c * mean lifetime (cτ)
• Useful particle data for B±
– reference: pdg.lbl.gov
A program outline
• create an empty histogram
• “event” loop:
– choose decay time in rest frame
– choose B velocity vector
– calculate Lxy (these 3 steps can be combined!):
• calculate decay time in lab frame
• calculate decay distance in lab frame
• project onto x-y plane
– fill your histogram
• plot and save your histogram!
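The outline can be sketched as follows for a B± with fixed energy and pT. This is one possible arrangement under stated assumptions, not a reference solution: it uses the C++ standard library instead of ROOT, returns the Lxy values in a vector where a real program would fill a histogram, takes cτ = 492.0 microns from the TRandom3 example in these slides, and assumes a rounded B± mass of 5.279 GeV. The names kBMass, kBcTau, lxyFromProperLength, and generateLxy are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

const double kBMass = 5.279;  // GeV, B± mass (assumed, rounded)
const double kBcTau = 492.0;  // microns, c * mean lifetime

// Lxy in microns for one decay, given the proper decay length (microns)
// and the B energy and transverse momentum (GeV).
double lxyFromProperLength(double properLength, double energy, double pT) {
    assert(energy > kBMass && pT >= 0.0);
    const double p = std::sqrt(energy * energy - kBMass * kBMass);
    assert(pT <= p);
    const double gamma = energy / kBMass;  // time dilation factor
    const double beta = p / energy;        // speed in units of c
    // Decay time in the lab frame is gamma times the proper time, and the
    // distance travelled is beta * c * (lab time).
    const double labLength = beta * gamma * properLength;
    // Project onto the x-y plane: sin(theta) = pT / p.
    return labLength * (pT / p);
}

// "Event" loop: choose a proper decay length, then calculate Lxy.
std::vector<double> generateLxy(std::size_t nDecays, double energy,
                                double pT, unsigned int seed) {
    std::mt19937 gen(seed);
    // std::exponential_distribution takes a rate, i.e. 1 / mean.
    std::exponential_distribution<double> properLength(1.0 / kBcTau);
    std::vector<double> lxy;
    lxy.reserve(nDecays);
    for (std::size_t i = 0; i < nDecays; ++i) {
        lxy.push_back(lxyFromProperLength(properLength(gen), energy, pT));
    }
    return lxy;
}
```

The calculation steps are deliberately kept separate here; collapsing them into a single expression is the simplification the outline points to.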
Random Numbers
• Any MC generator needs random numbers! Use the ROOT class TRandom3.
• Example
– root [1] TRandom3 r;
– root [2] float random1 = r.Gaus(0,35);
• // generate a Gaussian-distributed random number with mean 0 and width 35
– root [3] float random2 = r.Exp(492.0);
• // generate a B± decay length (cτ) in microns
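For readers working outside ROOT, the two calls above have close counterparts in the C++ standard library. The sketch below wraps them in a hypothetical ToyRandom struct with the same method names. One convention differs and is easy to get wrong: TRandom3::Exp takes the mean (tau), while std::exponential_distribution takes the rate, 1 / mean.

```cpp
#include <cassert>
#include <random>

// Standard-library stand-in for the two TRandom3 calls shown above.
struct ToyRandom {
    explicit ToyRandom(unsigned int seed) : gen(seed) {}

    // Gaussian-distributed number with the given mean and width.
    double Gaus(double mean, double sigma) {
        assert(sigma > 0.0);
        return std::normal_distribution<double>(mean, sigma)(gen);
    }

    // Exponentially distributed number with mean tau.
    double Exp(double tau) {
        assert(tau > 0.0);
        return std::exponential_distribution<double>(1.0 / tau)(gen);
    }

    std::mt19937 gen;  // Mersenne Twister engine
};
```

Calling Exp(492.0) on such an object plays the role of r.Exp(492.0) above, although the two generators will not produce the same number sequence.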
Hints
Note: This assignment requires a modest amount of programming (loops and conditionals).
• You might want to break this problem into two steps:
– write an Lxy generator for B decays with fixed E = 25 GeV, pT = 16 GeV
– modify this generator to use any (E, pT) distribution
• The relativistic calculation is simple (and good practice).
– Try it by hand! How simple can you make the expression for Lxy?
– Give up? Try Google, but be sure you understand the steps in the calculation.
– Compare the result you get using the ROOT classes TLorentzVector and TLorentzRotation!
• Now add a more realistic B (E, pT) distribution.
– Look in the file Hep101AdvancedDecay.root. It should have the information you need to finish your generator (i.e., choose the B decay vector).
• Run your final Monte Carlo, and fill a histogram with the observable decay length Lxy!
• Good statistical practices:
– How many decays did you generate? Why?
• Print your histogram!
More homework help:
• About TRandom:
– http://root.cern.ch/root/html/TRandom3
• About TLorentzVector:
– http://root.cern.ch/root/html/TLorentzVector.html
• Getting correlated random numbers from a histogram:
– http://root.cern.ch/root/html/TH2.html
• see TH2::GetRandom2()
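The idea behind drawing correlated pairs from a 2D histogram can be sketched without ROOT: pick a bin with probability proportional to its content, then pick a uniform point inside that bin. The code below is an analogue written with the C++ standard library, not the ROOT implementation of TH2::GetRandom2(), and the names Grid2D and sampleXY are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// A small 2D grid of bin contents with uniform binning in x and y.
struct Grid2D {
    std::size_t nx, ny;
    double xlo, xhi, ylo, yhi;
    std::vector<double> contents;  // row-major: contents[iy * nx + ix]
};

// Draw one (x, y) pair distributed like the grid contents.
std::pair<double, double> sampleXY(const Grid2D& g, std::mt19937& gen) {
    assert(!g.contents.empty() && g.contents.size() == g.nx * g.ny);
    // Choose a bin with probability proportional to its content.
    std::discrete_distribution<std::size_t> pickBin(g.contents.begin(),
                                                    g.contents.end());
    const std::size_t bin = pickBin(gen);
    const std::size_t ix = bin % g.nx;
    const std::size_t iy = bin / g.nx;
    // Choose a uniform point inside the selected bin.
    const double dx = (g.xhi - g.xlo) / g.nx;
    const double dy = (g.yhi - g.ylo) / g.ny;
    std::uniform_real_distribution<double> u(0.0, 1.0);
    const double x = g.xlo + (ix + u(gen)) * dx;
    const double y = g.ylo + (iy + u(gen)) * dy;
    return {x, y};
}
```

Because x and y come from the same bin, any correlation present in the grid (for example between E and pT) is preserved, which two independent 1D draws would lose. Rebuilding the bin-picking distribution on every call keeps the sketch short; a real generator would build it once.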
• You can write your program out on the command line, or in a separate file “myprog.C” (this is easier)
• To run it in ROOT, type
– root [1]
• You can print your histogram to a PDF file by clicking the menu in the canvas, or typing
– root [1]