PanDAMon Integration in CMS
Support for Distributed Computing
CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
SDC
PanDAMon Integration in CMS
Workshop on Analysis Tools Development
May 16th 2013
Nicolò Magini
CERN IT-SDC-OL
Outline
• Status after the prototype
• Current status of the testbed deployment
• Plans for the integration testbed
• Next steps
Monitoring of PanDA jobs
• Reminder: “Monitoring of jobs in PanDA” is more than “PanDA Monitor”
• ATLAS ops and users take advantage of Dashboard (populated from PanDA DB) to complement PanDA Monitor, especially for
– Task monitoring
– Historical view
• Here I’m going to look only at the “PanDA Monitor” itself, in particular for job debugging
PanDAMon for the prototype
• Using ATLAS PanDA Monitor as-is, with minimal updates by V. Fine (ATLAS PanDAMon developer) to make it functional for CMS jobs
• Already used successfully by CMS power users in the proof-of-concept phase
PanDAMon for the prototype
• viewlogfiles: perform LFN2PFN conversion with PhEDEx datasvc to find log file location (instead of looking up in central ATLAS catalog)
– Recently had an issue with logfile retrieval, now fixed by V. Fine
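The LFN2PFN lookup could look roughly like the sketch below. It is only an illustration, not the PanDAMon code: the base URL, the query parameters (node, lfn, protocol), the reply layout (phedex.mapping), the node name and the file name are all assumptions, and the reply is hand-written so that no network access is needed.

```python
import json
from urllib.parse import urlencode

# Assumed base URL of the PhEDEx data service (production instance, JSON output).
DATASVC = "https://cmsweb.cern.ch/phedex/datasvc/json/prod"


def lfn2pfn_url(node, lfn, protocol="srmv2"):
    """Build the lfn2pfn query URL for one LFN at one PhEDEx node."""
    query = urlencode({"node": node, "lfn": lfn, "protocol": protocol})
    return f"{DATASVC}/lfn2pfn?{query}"


def extract_pfns(response_text):
    """Return {lfn: pfn} from a JSON reply (assumed layout: phedex.mapping[])."""
    reply = json.loads(response_text)
    return {m["lfn"]: m["pfn"] for m in reply["phedex"]["mapping"]}


# Hypothetical log file LFN and a hand-written reply standing in for the service.
lfn = "/store/user/someuser/job_1.log.tar.gz"
url = lfn2pfn_url("T2_CH_CERN", lfn)
sample_reply = json.dumps({
    "phedex": {"mapping": [{
        "node": "T2_CH_CERN", "protocol": "srmv2", "lfn": lfn,
        "pfn": "srm://example.invalid/path" + lfn,
    }]}
})
pfns = extract_pfns(sample_reply)
print(url)
print(pfns[lfn])
```

In a real deployment the reply would come from an HTTP GET on the built URL, with error handling for nodes that have no mapping for the requested protocol.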
Testbed deployment
• An additional 2-core, 8 GB VM could be useful as a PanDAMon “development instance” to test deployment and new modules
vocms09 | Panda Mon (varnish) SLC6 LB | Preslav | VM | 2 cores, 8 GB mem, 500 GB disk | prototype
vocms35 | Panda Mon (varnish) SLC6 LB | Preslav | VM | 2 cores, 8 GB mem, 500 GB disk | prototype
vocms33 | Panda Mon SLC6 - power node LB | Preslav | 23-JAN-14 | 24 cores, 32 GB mem, 2x750 GB disk | prototype
vocms100 | Panda Mon SLC6 LB (temporary node, this is the ASO spare) | Preslav | 27-JAN-14 | 8 cores, 24 GB mem, 3x1TB disk | spare
Testbed status
• Basic Quattor configuration performed by the VOC on all machines, following the ATLAS templates
• Now in contact with ATLAS Distributed Computing operators for software deployment and configuration procedures
PanDAMon testbed goals
• During testbed phase
– Reproduce the working PanDA Monitor setup from the prototype phase in the CMS instance
– Identify “ATLAS” assumptions in monitoring, assess usability for CMS
  • Some examples found by developers in job debugging views are reported in the following
  • More will surely be found by CMS ops and users; feedback will be gathered
– Produce new PanDAMon custom modules for CMS integration for items not covered by current PanDAMon or Dashboard
Navigation
• A lot of information on the website is aggregated by cloud
• For CMS, more useful to look at sites rather than clouds?
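To make the difference concrete, the sketch below groups the same job records once by cloud and once by site. The records, field names and site/cloud labels are hypothetical and do not reflect the PanDA DB schema; the point is only that per-site grouping exposes a failure that the per-cloud view averages over.

```python
from collections import Counter

# Hypothetical job records; the field names and values are illustrative only.
jobs = [
    {"cloud": "US", "site": "T2_US_Nebraska", "status": "finished"},
    {"cloud": "US", "site": "T2_US_Nebraska", "status": "failed"},
    {"cloud": "US", "site": "T2_US_Purdue", "status": "finished"},
    {"cloud": "CERN", "site": "T2_CH_CERN", "status": "finished"},
]


def summarize(jobs, key):
    """Count jobs per (grouping value, status) for the given grouping key."""
    return Counter((job[key], job["status"]) for job in jobs)


by_cloud = summarize(jobs, "cloud")
by_site = summarize(jobs, "site")

for (name, status), count in sorted(by_site.items()):
    print(name, status, count)
```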
Dataset info
• Dataset info linking to DQ2
Dataset info
• Need to update to link to DAS/DBS
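A dataset link pointing to DAS could be generated along these lines. This is a sketch under assumptions: the DAS base URL and its "input" query parameter are taken as given here, the helper name is hypothetical, and the dataset name is a placeholder following the usual three-part /Primary/Processed/Tier form.

```python
import re
from urllib.parse import urlencode

# Assumed DAS web endpoint; check against the deployed DAS instance.
DAS_BASE = "https://cmsweb.cern.ch/das/request"

# CMS dataset names have exactly three slash-separated parts.
DATASET_RE = re.compile(r"^/[^/]+/[^/]+/[^/]+$")


def das_dataset_link(dataset):
    """Return a DAS query link for a dataset name (hypothetical helper)."""
    if not DATASET_RE.match(dataset):
        raise ValueError(f"not a /Primary/Processed/Tier name: {dataset!r}")
    query = urlencode({"input": f"dataset={dataset}"})
    return f"{DAS_BASE}?{query}"


link = das_dataset_link("/Primary/Processed-v1/AOD")
print(link)
```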
Task monitoring
• Linked to ATLAS Task Monitoring
• Integrate with CMS Task Monitoring
Output file links
• Links to log and output file locations work in the “viewlogfile” page; they still need to be fixed in “findfile”
• (do we want to update output location in PanDA DB from /store/temp/user to /store/user after ASO?)
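If the answer is yes, the update amounts to a path rewrite like the one sketched below. It assumes a plain prefix swap from /store/temp/user to /store/user; the real mapping applied by ASO may differ in directory naming, and the function name and file name are hypothetical.

```python
TEMP_PREFIX = "/store/temp/user/"
FINAL_PREFIX = "/store/user/"


def final_location(lfn):
    """Return the /store/user form of an LFN, leaving other paths untouched.

    Assumption: only the prefix changes after the ASO transfer.
    """
    if lfn.startswith(TEMP_PREFIX):
        return FINAL_PREFIX + lfn[len(TEMP_PREFIX):]
    return lfn


moved = final_location("/store/temp/user/someuser/out_1.root")
untouched = final_location("/store/user/someuser/out_1.root")
print(moved)
```

The update would only make sense once the ASO transfer has succeeded, since before that the file still sits in the temporary area.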
Error reporting
• ASO failures reported to DB and visible in monitoring but not in “Error details”
• CMS transformation (job wrapper) exit code is visible in PanDAMon, but not the detailed error message, which includes cmsRun messages
• Update links to support mail…
Next steps
• Next week: deploy PanDAMon as-is on dev server in testbed setup
• When testbed setup is ready, start looking into reported issues
• Interact with PanDAMon developers to learn how to integrate new modules if needed by CMS
– First session already done
• Reproduce deployment on prod server