
3rd May’03 Nick Brook – 4th LHC Symposium 1

Data Analysis – Present & Future

Nick Brook

University of Bristol

• Generic Requirements & Introduction

• Expt specific approaches:

Slide 2

Complexity of the Problem

• Detectors: ~2 orders of magnitude more channels than today
• Triggers must choose correctly only 1 event in every 400,000
• High Level triggers are software-based
• Computer resources will not be available in a single location

Slide 3

Complexity of the Problem

Major challenges associated with:
• Communication and collaboration at a distance
• Distributed computing resources
• Remote software development and physics analysis

Slide 4

Analysis Software System

[Diagram: the analysis chain Reconstruction → Selection → Analysis, with Monte Carlo production (50 kSI2000 s/event) feeding in.]

• Experiment-wide activity (10^9 events) – Reconstruction: re-processing 3 per year, driven by new detector calibrations or understanding; 30 kSI2000 s/event, 1 job/year (first pass) and 3 jobs per year (re-processing)
• ~20 groups' activity (10^9 → 10^7 events) – Selection: iterative selection once per month, with trigger-based and physics-based refinements; 0.25 kSI2000 s/event, ~20 jobs per month
• ~25 individuals per group (10^6 – 10^8 events) – Analysis: different physics cuts & MC comparison ~1 time per day, algorithms applied to data to get results; 0.1 kSI2000 s/event, ~500 jobs per day

(Benchmark: 2 GHz ~ 700 SI2000)
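Taking the slide's benchmark of one 2 GHz CPU ≈ 700 SI2000, the per-event costs above can be turned into rough farm sizes. A back-of-envelope sketch (the turnaround-time assumptions here are illustrative, not from the slide):

```python
# Back-of-envelope translation of the slide's numbers, assuming the
# quoted benchmark of one 2 GHz CPU ~ 700 SI2000 (0.7 kSI2000).
CPU_KSI2000 = 0.7          # one 2 GHz CPU, per the slide
SECONDS_PER_YEAR = 3.15e7

def cpus_needed(ksi2000_sec_per_event, n_events, wall_time_sec):
    """Number of 2 GHz CPUs to process n_events within wall_time_sec."""
    total_ksi2000_sec = ksi2000_sec_per_event * n_events
    return total_ksi2000_sec / (CPU_KSI2000 * wall_time_sec)

# Reconstruction: 30 kSI2000 s/event over 1e9 events, assuming one pass
# must finish in ~4 months -- several thousand CPUs.
print(round(cpus_needed(30, 1e9, SECONDS_PER_YEAR / 3)))

# One individual analysis pass: 0.1 kSI2000 s/event over 1e8 events in
# a day -- a few hundred CPUs.
print(round(cpus_needed(0.1, 1e8, 86400)))
```

The same arithmetic explains why these resources cannot sit in a single location.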

Slide 5

Analysis Software System

[Diagram: layered analysis software system, all behind a Stable User Interface.]

• User-facing tools: Data Management tools, Detector/Event Display, Data Browser, Analysis job wizards, Generic analysis tools
• Expt tools: Reconstruction, Simulation, Framework
• LCG tools and GRID: Distributed Data Store & Computing Infrastructure
• Coherent set of basic tools and mechanisms; software development and installation

Slide 6

Philosophy

• we want to perform analysis from day 1 (now)!
• building on Grid tools/concepts to simplify the distributed environment

[Figure: Gartner "hype cycle" – Hype vs. Time, rising from the technology Trigger to the Peak of Inflated Expectations, falling into the Trough of Disillusionment, then climbing the Slope of Enlightenment to the Plateau of Productivity.]

Slide 7

Data Challenges & Production Tools

All experiments have well-developed production tools for co-ordinated data challenges – see, e.g., the CHEP talks on

DIRAC – Distributed Infrastructure with Remote Agent Control

Tools provide management of workflows, job submission, monitoring, book-keeping, …

Slide 8

AliEn (ALIce ENvironment) is an attempt to gradually approach and tackle computing problems at the LHC scale and to implement the ALICE Computing Model.

Main features:
– Distributed file catalogue built on top of an RDBMS
– File replica and cache manager with interface to MSS
  • CASTOR, HPSS, HIS, …
  • AliEnFS – Linux file system that uses the AliEn File Catalogue and replica manager
– SASL-based authentication which supports various authentication mechanisms (including Globus/GSSAPI)
– Resource Broker with interface to batch systems
  • LSF, PBS, Condor, BQS, …
– Various user interfaces
  • command line, GUI, Web portal
– Package manager (dependencies, distribution, …)
– Metadata catalogue
– C/C++/perl/java API
– ROOT interface (TAliEn)
– SOAP/Web Services
– EDG-compatible user interface
  • Common authentication
  • Compatible JDL (Job Description Language) based on Condor ClassAds
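The ClassAd idea behind AliEn's JDL is that a job advertises its requirements and the broker matches them against resource attributes. A toy sketch of that matchmaking (illustrative only; every name here is hypothetical, and neither Condor's nor AliEn's real implementation works this simply):

```python
# Illustrative ClassAd-style matchmaking, in the spirit of the Condor
# ClassAds on which AliEn's JDL is based. A toy sketch, not AliEn's or
# Condor's actual code; job and machine attributes are invented.
def matches(job_ad, machine_ad):
    """A job matches a machine if the machine's attributes satisfy the
    job's Requirements expression."""
    return bool(eval(job_ad["Requirements"], {"__builtins__": {}}, machine_ad))

job = {
    "Executable": "aliroot",
    "Requirements": "Memory >= 512 and LRMS in ('LSF', 'PBS')",
}
machines = [
    {"Name": "ce01", "Memory": 256, "LRMS": "PBS"},
    {"Name": "ce02", "Memory": 1024, "LRMS": "LSF"},
]
eligible = [m["Name"] for m in machines if matches(job, m)]
print(eligible)  # ce01 lacks memory, so only ce02 qualifies
```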

Slide 9

AliEn Architecture

[Diagram: AliEn layers, from low level to high level.
External software: RDBMS (MySQL), LDAP, V.O. packages & commands, Perl core, Perl modules, external libraries.
AliEn core components & services: database proxy (DBD/DBI), authentication, RB (Resource Broker), file & metadata catalogue, SOAP/XML interfaces, CE, SE, FS, logger, Config Mgr, Package Mgr.
Interfaces and user side: API (C/C++/perl), CLI, GUI, Web portal, user application, ADBI.]

Slide 10

ALICE has deployed a distributed computing environment which meets its experimental needs:
• Simulation & Reconstruction
• Event mixing
• Analysis

Using Open Source components (representing 99% of the code), internet standards (SOAP, XML, PKI, …) and a scripting language (perl) has been a key element – enabling quick prototyping and very fast development cycles.

Close to finalizing the AliEn architecture and API.

OpenAliEn?

Slide 11

PROOF – The Parallel ROOT Facility

A collaboration between the core ROOT group at CERN and the MIT Heavy Ion Group.

Part of, and based on, the ROOT framework; it makes heavy use of ROOT networking and other infrastructure classes. Currently no external technologies.

Motivation:
o interactive analysis of very large sets of ROOT data files on a cluster of computers
o speed up query processing by employing parallelism
o extend from a local cluster to a wide-area "virtual cluster" – the GRID
o analyse a globally distributed data set and get back a "single" result with a "single" query
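The processing model behind that motivation – a master splits one query across workers that each hold a slice of the data, then merges the partial results – can be sketched with the Python standard library (a toy illustration only; real PROOF workers are separate processes on cluster nodes running compiled ROOT selectors, not Python threads):

```python
# Toy illustration of the PROOF master/worker model: the "query" runs
# on each worker's slice of the data and the master merges the partial
# results into a single answer. Sketch of the idea only, not PROOF.
from concurrent.futures import ThreadPoolExecutor

def selector(events):
    """Worker-side 'selector': count and sum the events passing a cut."""
    passing = [e for e in events if e > 0.5]
    return sum(passing), len(passing)

def proof_query(dataset, n_workers=4):
    """Master side: split, farm out, merge -- one result for one query."""
    slices = [dataset[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(selector, slices))
    return (sum(s for s, _ in partials),   # merged sum
            sum(n for _, n in partials))   # merged count

data = [i / 1000 for i in range(1000)]
print(proof_query(data))
```

The key property, as on the slide, is that the user sees a "single" result from a "single" query regardless of how the data are distributed.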

Slide 12

PROOF – Parallel Script Execution

[Diagram: a local PC running a root session against a remote PROOF cluster – one master server plus slave servers on node1–node4, each holding *.root files accessed via TNetFile/TFile; stdout and objects return to the local PC. The cluster is described by a #proof.conf file listing "slave node1" … "slave node4".]

Local session:
  $ root
  root [0] .x ana.C

Same script through PROOF:
  $ root
  root [0] .x ana.C
  root [1] gROOT->Proof("remote")

Processing a tree or chain on the cluster:
  $ root
  root [0] tree->Process("ana.C")
  root [1] gROOT->Proof("remote")
  root [2] chain->Process("ana.C")

Slide 13

PROOF & the Grid

Slide 14

Gaudi – ATLAS/LHCb software framework

[Diagram: the Application Manager steers Algorithms, which see data through the Event Data Service (Transient Event Store), the Detector Data Service (Transient Detector Store) and the Histogram Service (Transient Histogram Store); each store is populated from Data Files via a Persistency Service and Converters. Common services: Message Service, JobOptions Service, Particle Properties Service, other services.]

Slide 15

GANGA: Gaudi ANd Grid Alliance – a joint ATLAS and LHCb project.

[Diagram: the GANGA GUI sits between the user and the GAUDI program, handling JobOptions and Algorithms on one side and Collective & Resource Grid Services on the other, returning Histograms, Monitoring and Results.]

Based on the concept of a Python bus:
• use whichever modules are required to provide the full functionality of the interface
• use Python to glue these modules together, i.e. allow interaction and communication between them
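The "Python bus" idea – independent modules registered on a common bus, with Python as the glue between them – might be sketched as follows (an illustrative toy, not GANGA's actual code; the module names are invented):

```python
# Minimal sketch of a "Python software bus": modules register on the
# bus and communicate only through it, with Python as the glue.
# Illustrative only -- GANGA's real bus is far richer than this, and
# the module names below are hypothetical.
class SoftwareBus:
    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        self._modules[name] = module

    def call(self, name, method, *args):
        # Route a request to whichever module provides the service.
        return getattr(self._modules[name], method)(*args)

class JobOptionsModule:
    def options_for(self, app):
        return {"EvtMax": 100, "app": app}

class SubmitterModule:
    def __init__(self, bus):
        self.bus = bus

    def submit(self, app):
        # One module reaching another only via the bus.
        opts = self.bus.call("joboptions", "options_for", app)
        return f"submitted {opts['app']} with {opts['EvtMax']} events"

bus = SoftwareBus()
bus.register("joboptions", JobOptionsModule())
bus.register("submitter", SubmitterModule(bus))
print(bus.call("submitter", "submit", "DaVinci"))
```

The design payoff is that modules can be swapped or added without the others knowing anything beyond the bus interface.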

Slide 16

Python Software Bus

[Diagram: a remote user (client) runs a GUI and the GANGA Core Module on a PYTHON SW BUS, together with an XML-RPC module, an OS Module, Athena/GAUDI handlers, GaudiPython and PythonROOT, plus a local Job DB. Over LAN/WAN the client talks to a server – an XML-RPC server on its own PYTHON SW BUS – fronting the Bookkeeping DB, Production DB, Job Configuration DB and an EDG UI, submitting work to the GRID or a local LRMS.]

Slide 17

Current Status

Most of the base classes are developed. Serialization of objects (user jobs) is implemented with the Python pickle module.

The GaudiApplicationHandler can access the Configuration DB for some Gaudi applications (Brunel). It is implemented with the xmlrpclib module. Ganga can create user-customized Job Options files using this DB.

DaVinci and AtlFast application handlers are implemented. Handlers for various LRMS are implemented, allowing jobs to be submitted, and simple monitoring information retrieved, on several batch systems.

Much of the GRID-related functionality is already implemented in the GridJobHandler using EDG testbed 1.4 software. Ganga can submit, monitor, and get output from GRID jobs.

The JobsRegistry class provides job monitoring via a multithreaded environment based on the Python threading module.

A GUI is available, using the wxPython extension module. An ALPHA release is available.
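The pickle-based job serialization mentioned above can be illustrated with a few lines of standard-library Python (the Job class here is a hypothetical stand-in, not GANGA's actual job object):

```python
# Minimal sketch of serializing a user job with the pickle module, as
# the status slide describes. The Job class is hypothetical, not
# GANGA's actual one.
import pickle

class Job:
    def __init__(self, application, backend, options):
        self.application = application
        self.backend = backend
        self.options = options
        self.status = "new"

job = Job("DaVinci", "LSF", {"EvtMax": 1000})
blob = pickle.dumps(job)          # e.g. written to the local job DB
restored = pickle.loads(blob)     # e.g. read back by a monitor thread
print(restored.application, restored.status)
```

Pickling whole job objects lets the local job DB persist state across sessions without a hand-written schema.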

Slide 18

CMS analysis/production chain

[Diagram: Event generation (PYTHIA) → Detector simulation (OSCAR), producing detector hits → Digitization (ORCA), producing digis (raw data) with minimum-bias pile-up per bunch crossing (bx) → Reconstruction, L1 and HLT (ORCA), with calibration input, producing DSTs → DST stripping (ORCA) for the b/τ, e/γ and JetMet streams → Analysis (Iguana/Root/PAW) on ntuples containing MC info, tracks, etc., plus MC ntuples.]

Slide 19

CMS components and data flows

[Diagram: the production system and data repositories at Tier 0/1/2 feed ORCA analysis farm(s) (or a distributed `farm' using grid queues) at Tier 1/2. TAG and AOD extraction/conversion/transport services fill RDBMS-based data warehouse(s) and PIAF/Proof/…-type analysis farm(s). At Tier 3/4/5 the user works from local disk with a local analysis tool (Iguana/ROOT/…) with a tool plugin module, or a web browser, talking to data extraction and query web service(s). Flows shown: production data flow, TAGs/AODs data flow, physics query flow.]

Slide 20

CLARENS – a CMS Grid Portal

Grid-enabling the working environment for physicists' data analysis.

Clarens consists of a server communicating with various clients via the commodity XML-RPC protocol; this ensures implementation independence.

[Diagram: Client ↔ RPC over http/https ↔ Web Server hosting the Clarens Service.]

The server will provide a remote API to Grid tools:
• the Virtual Data Toolkit: object collection access
• data movement between Tier centres using GSI-FTP
• CMS analysis software (ORCA/COBRA)
• security services provided by the Grid (GSI)

No Globus is needed on the client side, only a certificate. The current prototype is running on the Caltech proto-Tier2.
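The architecture – any client, in any language, calling a remote API over commodity XML-RPC – can be sketched with the Python standard library (the service and its `list_files` method are hypothetical stand-ins, not the real Clarens API, and this sketch omits the GSI security layer entirely):

```python
# Sketch of the Clarens idea -- a server exposing a remote API over the
# commodity XML-RPC protocol -- using only the Python standard library.
# The list_files method is an invented stand-in, and there is no
# security here (real Clarens authenticates clients via Grid
# certificates over https).
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def list_files(directory):
    """Stand-in for a Grid-tool call the real server would forward."""
    return [directory + "/run001.root", directory + "/run002.root"]

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(list_files)
port = server.server_address[1]  # port 0 above means "pick a free port"
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any XML-RPC client, in any language, can now call the remote API.
client = ServerProxy(f"http://127.0.0.1:{port}")
files = client.list_files("/store/jetmet")
print(files)

server.shutdown()
```

Because XML-RPC is implementation-independent, the same server could be called from C/C++, Java, or a browser-based client, as the next slide lists.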

Slide 21

CLARENS

Several web-services applications have been built on the Clarens web service architecture:
• proxy escrow
• client access from a wide variety of languages: Python, C/C++, Java applications, and Java/Javascript browser-based clients
• access to JetMET data via SQL2ROOT
• Root access to remote data files
• access to files managed by the San Diego SC storage resource broker (SRB)

Slide 22

Summary

• all 4 expts have successfully "managed" distributed production
  • many lessons learnt – not only by the expts, but also useful feedback to m/w providers
  • a large degree of automation achieved
• expts moving onto the next challenge – analysis
  • chaotic, unmanaged access to data & resources
  • tools already (being) developed to aid Joe Bloggs
• success will be measured in terms of:
  • simplicity, stability & effectiveness
  • access to resources
  • management & access to data
  • ease of development of user applications