GRID: Computing Without Borders
Kajari Mazumdar
Department of High Energy Physics, Tata Institute of Fundamental Research, Mumbai.
Soft-computing workshop, University of Mumbai, December 1, 2009
Disclaimer:
• I am only a physicist, whose research field drives and uses cutting-edge technology.
• I have mostly borrowed slides from various resources.
Plan of talk
Grid concept in simple terms
Requirements of today’s scientific community
Evolution of Grid
LHC Computing Grid
TIFR grid computing centre
DAE contributions
Outlook
Grid computing in simple words
The Grid is a utility, or infrastructure, for huge and complex computations, in which remote resources are accessible over the internet from a desktop, laptop, or mobile phone. It is similar to the power grid: the user does not have to worry about the source of the computing power.
Imagine millions of computers owned by individuals, institutes from various countries across the world connected to form a single, huge, super-computer!
This technology, developed over only the last decade, is already being used by:
• High-energy physicists, to analyse the data soon to be produced by the LHC experiment, in which Indian scientists are taking part.
• Earth scientists, to monitor ozone-layer activity (dealing daily with a data volume equivalent to about 150 CDs).
It is the natural evolution of the internet.
1. Share more than information: data, computing power, and applications, in dynamic, multi-institutional, virtual organizations (Ian Foster: The Anatomy of the Grid).
2. Efficient use of resources at many institutes; people from many institutions work together to solve a common problem (a virtual organisation).
3. Join local communities.
4. Interaction with the underlying layers must be transparent and seamless to the user.
From Web to Grid Computing
Computing requirements
High-end computing application
Weather forecast
Share data between thousands of scientists with multiple interests
• Link major and minor computer centres
• Ensure all data accessible anywhere, anytime
• Grow rapidly, yet remain reliable for more than a decade
• Cope with different management policies of different centres
• Ensure data security
• Be up and running routinely
• Check the health of the facility on a 24x7 basis
A huge amount of manpower is at work, invisibly.
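Much of that invisible manpower goes into exactly the 24x7 health checks mentioned above. A minimal sketch of such a check, using only the standard library; the site list and the simple "can we open a TCP connection" test are illustrative assumptions, not the actual WLCG monitoring tools.

```python
# Hypothetical 24x7 site health check: probe each site's service endpoint
# and report UP/DOWN. Real grid monitoring is far more elaborate.
import socket

SITES = {                       # hypothetical site -> (host, port) endpoints
    "T2_IN_TIFR": ("localhost", 80),
}

def site_is_up(host, port, timeout=2.0):
    """Return True if a TCP connection to the service endpoint succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def health_report(sites):
    """Map each site name to 'UP' or 'DOWN'."""
    return {name: ("UP" if site_is_up(h, p) else "DOWN")
            for name, (h, p) in sites.items()}
```

In practice such a probe would run from a cron-like scheduler and raise an alarm when a site flips to DOWN.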
Challenges in scientific computations
Ever-increasing demand
• A PC of the early-2000s era is as fast as a supercomputer of the 1990s. Still, for many applications it is not adequate, and users continue to buy new machines!
• The storage available in a PC today could not have been imagined in the 1990s; storage capacity doubles every 12 months or so!
• The closing years of this decade are seeing mammoth scientific projects whose data volumes run to several petabytes per year.
• To work with a colleague even across a campus, at the petabyte scale, we need an ultrafast network.
Even though CPU power, disk storage, and communication speed continue to increase, computing resources fail to satisfy users' demands, and they are difficult to use.
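The "doubles every 12 months" figure compounds quickly, which is why single machines keep falling behind demand. A quick back-of-the-envelope check:

```python
# If capacity doubles roughly every 12 months, how long until it grows by a
# given factor? A factor of 1000 takes about 10 doublings (2**10 = 1024).
import math

def doublings_needed(factor):
    """Number of capacity doublings needed to grow by `factor`."""
    return math.ceil(math.log2(factor))

years_for_1000x = doublings_needed(1000)   # ~10 doublings -> ~10 years
```

Yet demand in projects like the LHC grows even faster than this, so no single system keeps up.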
Supercomputer
Parallel computer
PC clusters, multiple PCs
Clusters: Primary IT infrastructures
Clusters replace traditional computing platforms and can be configured according to need:
• network load distribution and load balancing
• high availability
• high performance / computation-intensive work, ...
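The "network load distribution" role above can be as simple as rotating incoming requests across the cluster's nodes. A toy sketch (node names are invented):

```python
# Round-robin load balancing: each incoming request is dispatched to the
# next node in rotation, so work spreads evenly across the cluster.
from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def pick(self):
        """Return the node that should serve the next incoming request."""
        return next(self._nodes)

lb = RoundRobinBalancer(["node01", "node02", "node03"])
assignments = [lb.pick() for _ in range(6)]
# each node receives every third request
```

Production balancers also weight nodes by capacity and skip nodes that fail health checks, but the rotation idea is the same.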
Issues related to building clusters
• Scalability of the interconnection network
• Scalability of software components (libraries, applications, ...)
• Auto-installation, cluster management, trouble-shooting, ...
• Space management (desktop/rack-mounted)
• Layout of nodes, noise, cable layout, cooling, ...
• Power management
• Centralized infrastructure-management software
• Performance / price / power consumption
The cost of ownership is not very low!
Peer to Peer (P2P) computing
Computing based on the idea of sharing distributed resources with each other, with or without support from a server.
There are many under-utilised resources: even with powerful PCs, real utilisation today is < 10%.
Large organizations have thousands of PCs, increasing day by day: utilise them in cycle-stealing mode!
• Total deliverable power is > a few Mflops
• Total available free disk space is > 100 terabytes
• The latency and bandwidth of a LAN environment are mostly quite adequate for P2P computing.
• Space is not a problem either: keep the PCs wherever they are!
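The cycle-stealing argument is easy to quantify: if utilisation is below 10%, over 90% of the installed capacity sits idle. A sketch with purely illustrative numbers (the PC count and per-PC speed are assumptions, not figures from the slides):

```python
# How much compute is left idle across an organisation of desktop PCs?
def idle_capacity(n_pcs, gflops_per_pc, utilisation):
    """Aggregate compute (GFLOPS) left idle and harvestable by cycle stealing."""
    return n_pcs * gflops_per_pc * (1.0 - utilisation)

idle = idle_capacity(n_pcs=5000, gflops_per_pc=1.0, utilisation=0.10)
# -> 4500 GFLOPS of otherwise wasted capacity
```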
Internet computing
• Today you cannot simply run your jobs "on the internet".
• Internet computing using idle PCs is becoming an important computing platform (Seti@home, Napster, ...).
• The web is the most promising candidate for the core component of a wide-area distributed computing environment: efficient client/server models and protocols; transparent networking, navigation, and GUIs with multimedia access and dissemination for data visualization.
• Mechanisms for distributed computing: CGI, Java.
• With improving price/performance and open-source, free software and web services, it is becoming easy to develop loosely coupled distributed applications.
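A loosely coupled client/server model of the Seti@home kind can be sketched in a few dozen lines of standard-library Python. The "protocol" below is an invented toy (the server hands out an integer, the client returns its square); real projects ship signed work units, redundancy checks, and checkpointing.

```python
# Minimal work-distribution sketch: an HTTP server hands out work units,
# volunteer clients compute a result and post it back.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class WorkServer(BaseHTTPRequestHandler):
    work_units = [1, 2, 3]          # pending inputs (toy example)
    results = []                    # completed {"unit": ..., "result": ...} records

    def do_GET(self):               # a client asks for a work unit
        unit = self.work_units.pop(0) if self.work_units else None
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"unit": unit}).encode())

    def do_POST(self):              # a client reports a finished result
        n = int(self.headers["Content-Length"])
        self.results.append(json.loads(self.rfile.read(n)))
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):   # keep the demo quiet
        pass

def run_client(url):
    """Fetch one unit, compute its square, post the result back."""
    with urllib.request.urlopen(url) as r:
        unit = json.load(r)["unit"]
    if unit is None:
        return
    data = json.dumps({"unit": unit, "result": unit * unit}).encode()
    urllib.request.urlopen(urllib.request.Request(url, data=data)).close()

server = HTTPServer(("localhost", 0), WorkServer)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://localhost:{server.server_address[1]}/"
for _ in range(3):
    run_client(url)
server.shutdown()
```

Note how loosely coupled the pieces are: clients know only a URL and a tiny JSON format, which is exactly what makes such systems easy to grow.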
Working together apart
Essentials of GRID computing
Virtual organizations: GRID
TIFR
Grid Components
Grid overview
How does grid work?
GRID portal / Gateway
Grid Services: grid middleware
LHC and the GRID Computing
A pathologist uses a microscope to examine blood cells, of size about one thousandth of a mm, i.e., 10^-6 m.
High energy probes structure of fundamental matter.
LHC will collide very, very high energy protons for this purpose.
Mammoth, very complex detectors (length 30 m, dia 20 m) are the technical eye of several thousand scientists to probe the smallest length scale.
Complexity of LHC experiments
When two very-high-energy protons collide at the LHC, the situation in the detector will mostly look like this: very crowded.
About 10 million electrical signals have to be recorded in a tiny fraction of a second, repeatedly, for a long time (about 10 years). Using computers, a digital image is created for each such instance. The image size is about 2 MB on average, but varies considerably.
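At ~2 MB per digital image, the data volume piles up quickly. The event size is from the text; the recording rate (300 events/s) and the seconds of running per year (~10^7) are assumptions for illustration only:

```python
# Back-of-the-envelope data volume for an LHC experiment.
EVENT_SIZE_MB = 2                         # from the text: ~2 MB per image
EVENTS_PER_SECOND = 300                   # assumed recording rate
SECONDS_OF_RUNNING_PER_YEAR = 10_000_000  # assumed ~10^7 s of beam per year

mb_per_second = EVENT_SIZE_MB * EVENTS_PER_SECOND                  # 600 MB/s
pb_per_year = mb_per_second * SECONDS_OF_RUNNING_PER_YEAR / 1e9    # ~6 PB/year
```

This single experiment's raw stream alone is already petabytes per year, consistent in scale with the LHC-wide figure of 15 PB/year quoted below.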
But most of these pictures are not interesting! Good things are always rare!
In an LHC experiment the task of the scientist is to look for an instance with patterns of this type among 10 thousand billion (10^13) crowded pictures.
This picture contains a clue about our universe.
Such a job is like searching for a needle in a million haystacks, or looking for one particular person in a thousand times today's world population (6 billion; India's population 1.2 billion). A single computing system will never scale up to the challenge. The concept of GRID computing developed from such requirements.
The LHC will collide 6-8 hundred million protons on protons per second, for several years.
Only 1 collision in 20 thousand will have an important tale to tell, but we do not know which one, so we have to search through all of them!
A huge task!
• 15 petabytes (1 PB = 10^15 bytes) of data a year
• Analysis requires ~100,000 computers to get results in a reasonable time.
GRID computing is essential
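Putting the slide's own figures together shows why no single system suffices:

```python
# The selection and distribution problem in numbers, using figures from the
# slides: 1 interesting collision in ~20,000, ~10**13 recorded pictures,
# 15 PB of data per year, and ~100,000 computers for the analysis.
recorded_events = 10**13
interesting = recorded_events // 20_000        # ~5 * 10**8 candidate events

data_per_year_tb = 15_000                      # 15 PB expressed in TB
computers = 100_000
tb_per_machine = data_per_year_tb / computers  # 0.15 TB per machine per year
```

Even after the 1-in-20,000 selection there are hundreds of millions of candidate events, and only by spreading the 15 PB across ~100,000 machines does the per-machine share become manageable.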
In hard numbers
LHC-CERN DAE collaboration
[Figure: data links in the LHC-CERN-DAE collaboration, with bandwidths of 100-175 Mbps to the Indian sites.]

WLCG Computing Grid Infrastructure
The way CMS uses the GRID (WLCG)
• CMS detector → Tier-0 / CAF: 450 MB/s (at 300 Hz)
• Tier-0: prompt reconstruction; archival of a copy of the RAW and first RECO data; calibration streams (CAF); data distribution to the Tier-1s (30-300 MB/s per site, aggregate ~800 MB/s)
• 7 Tier-1s: re-reconstruction; skimming; second archival of RAW; served copy of RECO; archival of simulation; data distribution to the Tier-2s (50-500 MB/s); ~50k jobs/day
• ~50 Tier-2s: primary resources for physics analysis and detector studies by users; MC simulation uploaded back to the Tier-1s (10-20 MB/s); ~150k jobs/day; Tier-3s attach below the Tier-2s (~100 MB/s)
CMS in total: 1 Tier-0 at CERN (Geneva), 7 Tier-1s on 3 continents, 50 Tier-2s on 4 continents.
P. Kreuzer - GRID Computing - Mumbai
Tiered/layered structure connecting computers across the globe:
• Tier 0: the experimental site and the CERN computer centre, Geneva
• Tier 1: national centres, e.g. Asia (Taiwan), France, Italy, Germany, USA
• Tier 2: regional groups, e.g. India (TIFR), China, Korea, Pakistan under the Taiwan Tier-1
• Different universities and institutes, e.g. Delhi U., Panjab U.
• Individual scientist's PC

A useful model for particle-physics experiments, but not necessary for others.
India's CMS Tier-2 site: T2_IN_TIFR
Hardware at TIFR site: T2_IN_TIFR
About 50 users/scientists at present, still growing.
Another similar Tier2 centre in Kolkata for a different experiment at LHC.
Grid facility has been functional at TIFR for almost a year.
The CMS collaboration at LHC, CERN has been using the computer resources.
• Storage: 350 TB
• 300 worker nodes
• Internet bandwidth: 1 Gbps, to be upgraded in the near future.
Note: continuous monitoring is essential; we are managing with 5 engineers, not all of them full time.
Networking , GRID Middleware , Sites
GRID Middleware Services:
• Storage Element
• Computing Element
• Workload Management System
• Local File Catalog
• Information System
• Virtual Organisation Management Service
• Inter-operability between GRIDs: EGEE, OSG, NorduGrid, ...
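For concreteness, a job handed to the Workload Management System is described in a small text file (a gLite-style JDL). The sketch below is illustrative only: the executable, file names, and the site requirement are invented.

```
Executable     = "analysis.sh";
Arguments      = "dataset.list";
StdOutput      = "out.log";
StdError       = "err.log";
InputSandbox   = {"analysis.sh", "dataset.list"};
OutputSandbox  = {"out.log", "err.log"};
Requirements   = other.GlueCEUniqueID == "ce.example.org:2119/jobmanager-pbs-cms";
```

The WMS matches the Requirements expression against the Information System and dispatches the job to a suitable Computing Element; the sandboxes carry small input and output files between the user and the site.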
Networking
Site Specificities, e.g. Storage/Batch systems at CMS Tier-1s:
Site      Storage               Batch
FNAL      dCache/Enstore        Condor
RAL       Castor                Torque/Maui
CCIN2P3   dCache/HPSS           BQS
PIC       dCache/Enstore        Torque/Maui
ASGC      Castor                Torque/Maui
INFN      Castor+Storm          LSF
FZK       dCache/Chimera+TSM    PBSPro
Data transfers from/to TIFR
• TIFR ↔ T1 transfer quality (last 3 months): improving; aim for stability.
• TIFR ↔ T1 production transfers (last year): modest, but ready to grow!
TIFR → ASGC: 8 TB of MC data (custodial storage)
T1s → TIFR: 37 TB in total, from 7 T1s
Statistics and plotsSite summary table
Site history
Site ranking
CMS Software Deployment
Deployment of CMS SW to 90% of sites within a few hours.
Basic strategy: use RPM (with apt-get) in the CMS SW area.
EGEE
CMS Centers and Computing Shifts
CMS Centre at CERN: monitoring, computing operations, analysis
CMS Experiment Control Room
CMS Remote Operations Centre at Fermilab
• CMS runs computing shifts 24/7
• Remote shifts are encouraged
• Main task: monitor, and alarm the CMS sites & computing experts
Challenge
Current Issues involved
You may very soon be one of the scientists working at the LHC and using the GRID computing facility, perhaps even helping to improve it!
Compute-communication cross-over
World Wide Web – Information Sharing
• Invented at CERN by Tim Berners-Lee (around 1990)
• Agreed protocols: HTTP, HTML, URLs
• Anyone can access information and post their own
• Quickly crossed over into public use
[Plot: number of Internet hosts (millions) vs. year]
Going back