Status September 2013


Page 1: Status September 2013

Status September 2013
Information System meeting with users

24th September 2013

Page 2: Status September 2013

Information System meeting with users - 1st October 2013


Performed releases

• Update 3 on 09.09.2013: http://gridinfo.web.cern.ch/sys-admins/bdii-releases
  – bdii 5.2.22-1
    • Fix for hardcoded path affecting ARC
  – glite-info-provider-ldap 1.4.6-1
    • Rollback to previous version in order to publish GLUE 2 Contact and Location objects
      – These were not published after a modification in the LDAP query requested by ARC
• New GOCDB v5 release
  – Tested with Top BDII: backwards compatible for retrieving the site BDII endpoints

Page 3: Status September 2013


Upcoming releases

• Fix in the glite-info-provider-ldap needed
  – Top BDIIs may not publish all sites if the host is performing slowly
    • Tracked in https://savannah.cern.ch/bugs/?102608
    • Documented in http://gridinfo.web.cern.ch/sys-admins/known-issues
    • Few top BDIIs seem to be affected
      – Looking into GLUE 2 (we could look at GLUE 1 too):
        » 328 site BDIIs published by GOCDB
        » Average of 30 EGI site BDIIs unresponsive
        » Publishing more than 300 sites means top BDII is OK
        » 20 out of 80 top BDII endpoints may be affected
• Performance issues
  – Currently monitoring performance of top BDII
    • Due to LDAP design feature
    • Performance issues already showed up in GLUE 1!
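The rule of thumb on this slide (a top BDII publishing more than 300 sites is considered OK) can be sketched as a small check. This is only an illustration: the threshold comes from the slide, while the function names and the assumed DN layout (`GLUE2DomainID=<site>,GLUE2GroupID=grid,o=glue`) are assumptions, not something the presentation specifies.

```python
import re

# Threshold taken from the slide: more than 300 published sites means the top BDII is OK.
SITE_THRESHOLD = 300

# Assumption: a site entry is a DN whose first RDN is a GLUE2DomainID,
# e.g. "GLUE2DomainID=SITE-A,GLUE2GroupID=grid,o=glue".
_DOMAIN_RDN = re.compile(r"^GLUE2DomainID=([^,]+),", re.IGNORECASE)


def count_published_sites(dns):
    """Count distinct site names among the DNs returned by a top BDII."""
    sites = set()
    for dn in dns:
        match = _DOMAIN_RDN.match(dn.strip())
        if match:
            sites.add(match.group(1).lower())
    return len(sites)


def top_bdii_looks_ok(dns, threshold=SITE_THRESHOLD):
    """Apply the slide's heuristic: strictly more than `threshold` sites published."""
    return count_published_sites(dns) > threshold
```

The list of DNs would come from whatever LDAP query is used against each top BDII endpoint; DNs of objects below a site (services, shares, and so on) are ignored because their first RDN is not a GLUE2DomainID.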

Page 4: Status September 2013


BDII deployment status

Package  EMI 2     EMI 3     UMD 2     UMD 3     EPEL 5    EPEL 6
bdii     5.2.22-1  5.2.22-1  5.2.12-1  5.2.21-1  5.2.21-1  5.2.21-1

           EGI                       WLCG
           Jun-13  Jul-13  Sep-13    Jun-13  Jul-13  Sep-13
site BDII     407     420     337       132     137     131
top BDII      104      98      88        47      48      40

site BDII
                                             EGI                       WLCG
Date        Change                  Version  Jun-13  Jul-13  Sep-13    Jun-13  Jul-13  Sep-13
16.03.2012  LDAP improvements       5.2.10       21      15      13         5       5       5
09.08.2012  IPv6 support            5.2.12      329     292     169       107      82      54
            EPEL only               5.2.13       19      12       9         8       6       4
11.03.2013  ARC integration         5.2.17       20      13       7         6       9       5
31.05.2013  GLUE 2 Delete bug       5.2.20       18      23       8         6       9       4
05.08.2013  Security Vulnerability  5.2.21        0      65      26         0      26      11
09.09.2013  Contact and Location    5.2.22        0       0     105         0       0      48
Total                                           407     420     337       132     137     131

top BDII
                                             EGI                       WLCG
Date        Change                  Version  Jun-13  Jul-13  Sep-13    Jun-13  Jul-13  Sep-13
16.03.2012  LDAP improvements       5.2.10        5       3       1         4       2       1
09.08.2012  IPv6 support            5.2.12       55      46      31        24      21      15
            EPEL only               5.2.13       12       8       8         3       3       1
11.03.2013  ARC integration         5.2.17       15       9       4         9       6       3
31.05.2013  GLUE 2 Delete bug       5.2.20       17      20      10         7       9       7
05.08.2013  Security Vulnerability  5.2.21        0      12       9         0       7       3
09.09.2013  Contact and Location    5.2.22        0       0      25         0       0      10
Total                                           104      98      88        47      48      40
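As a way of reading these figures, the sketch below recomputes the monthly totals for the EGI site BDIIs and the share of instances running at least a given bdii version (for example 5.2.21, the security fix). The numbers are copied from this slide; the function names and the choice of 5.2.21 as a cut-off are illustrative assumptions.

```python
# EGI site BDII counts per bdii version, copied from the slide.
# Each tuple holds the counts for (Jun-13, Jul-13, Sep-13).
MONTHS = ("Jun-13", "Jul-13", "Sep-13")
EGI_SITE_BDII = {
    "5.2.10": (21, 15, 13),
    "5.2.12": (329, 292, 169),
    "5.2.13": (19, 12, 9),
    "5.2.17": (20, 13, 7),
    "5.2.20": (18, 23, 8),
    "5.2.21": (0, 65, 26),
    "5.2.22": (0, 0, 105),
}


def version_key(version):
    """Turn '5.2.21' into (5, 2, 21) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def column_totals(counts):
    """Sum each month's column across all versions."""
    return tuple(sum(column) for column in zip(*counts.values()))


def share_at_or_above(counts, minimum_version, month):
    """Fraction of instances in `month` running `minimum_version` or newer."""
    index = MONTHS.index(month)
    total = sum(row[index] for row in counts.values())
    matching = sum(
        row[index]
        for version, row in counts.items()
        if version_key(version) >= version_key(minimum_version)
    )
    return matching / total


print(column_totals(EGI_SITE_BDII))
print(f"{share_at_or_above(EGI_SITE_BDII, '5.2.21', 'Sep-13'):.1%}")
```

The column totals reproduce the 407, 420 and 337 site BDIIs reported for EGI; the same functions can be applied to the WLCG or top BDII figures by swapping in the corresponding rows.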

Page 5: Status September 2013


EGI Technical Forum

• Training on glue-validator
  – Recorded and available at: https://documents.egi.eu/public/ShowDocument?docid=1955
  – 17 people registered
    • Few non-registered participants
    • Few connected remotely
• OGF GLUE WG meeting
  – Discussion to include ARC changes in LDAP
• Information System workshop
  – Unicore and Globus resources now integrated in BDII

Page 6: Status September 2013


BDII and ARC DITs

• Works for ARC but not for BDII (glite-info-provider-ldap 1.4.4-1)
  – GLUE2Contact and GLUE2Location missing!
    ldapsearch -x -LLL -h site-bdii -p port -b GLUE2DomainID=site-name,o=glue
    ldapsearch -x -LLL -h site-bdii -p port -b GLUE2DomainID=site-name,o=glue -s base
• Works for BDII but not for ARC (glite-info-provider-ldap 1.4.6-1)
  – Services missing!
    ldapsearch -x -LLL -h site-bdii -p port -b GLUE2DomainID=site-name,o=glue
• However, all ARC sites in WLCG seem to have a BDII-like DIT!
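The queries on this slide differ only in the search scope: without `-s`, ldapsearch searches the whole subtree below the base DN, while `-s base` returns only the named entry. A minimal sketch of building both command lines is shown here; the helper name, the example host and site name, and the port 2170 are assumptions for illustration, and nothing is executed against a real server.

```python
def build_ldapsearch(host, port, site_name, base_only=False):
    """Return the argv list for the ldapsearch query shown on the slide.

    With base_only=False the default subtree scope applies; with
    base_only=True the search is limited to the site entry itself.
    """
    argv = [
        "ldapsearch", "-x", "-LLL",
        "-h", host,
        "-p", str(port),
        "-b", f"GLUE2DomainID={site_name},o=glue",
    ]
    if base_only:
        argv += ["-s", "base"]
    return argv


# Hypothetical host, port and site name, used only to show the two variants.
print(" ".join(build_ldapsearch("site-bdii.example.org", 2170, "EXAMPLE-SITE")))
print(" ".join(build_ldapsearch("site-bdii.example.org", 2170, "EXAMPLE-SITE", base_only=True)))
```

The returned list can be passed to `subprocess.run` if the query should actually be issued; keeping the arguments as a list avoids shell quoting problems with the DN.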

Page 7: Status September 2013


EPEL

• No progress to obtain packaging status
  – Started during holidays
  – Postponed due to other priorities
  – M. Ellert agreed to release bdii when needed
• EPEL status
  – https://twiki.cern.ch/twiki/bin/view/EMI/BDIIEPELstatus

Page 8: Status September 2013


GLUE 2 validation for sites

• Still analysing September results
  – Will summarise findings for GDB next week
• Checking that sites can actually fix problems
  – Using exclude-known-issues option
• Most errors related to default values being published!
  – Estimated Average and Worst waiting times, Max running jobs -> calculated by dynamic scheduler
  – Waiting jobs (famous 444444)
  – All these attributes rely on batch system configuration!
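To illustrate the kind of error described here, the sketch below flags attributes that still carry a placeholder default such as 444444. It is not how glue-validator works internally; the attribute names are assumed GLUE 2 LDAP names for the quantities listed on the slide, and the entry format (a dict mapping attribute names to lists of values) is a hypothetical choice.

```python
# 444444 is the placeholder named on the slide; further values could be added.
PLACEHOLDER_VALUES = {"444444"}

# Assumed GLUE 2 LDAP attribute names for the quantities listed on the slide.
CHECKED_ATTRIBUTES = (
    "GLUE2ComputingShareWaitingJobs",
    "GLUE2ComputingShareMaxRunningJobs",
    "GLUE2ComputingShareEstimatedAverageWaitingTime",
    "GLUE2ComputingShareEstimatedWorstWaitingTime",
)


def find_placeholder_values(entry):
    """Return {attribute: value} for checked attributes holding a placeholder."""
    flagged = {}
    for name in CHECKED_ATTRIBUTES:
        for value in entry.get(name, []):
            text = str(value).strip()
            if text in PLACEHOLDER_VALUES:
                flagged[name] = text
    return flagged


# Hypothetical entry: waiting jobs carries the default, max running jobs does not.
example = {
    "GLUE2ComputingShareWaitingJobs": ["444444"],
    "GLUE2ComputingShareMaxRunningJobs": ["200"],
}
print(find_placeholder_values(example))
```

A hit usually points at the batch system configuration rather than at the information system itself, since these values are computed by the dynamic scheduler.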

Page 9: Status September 2013


GLUE 2 validation for middleware

• Sent a mail to URT to get testing resources to check newer versions
  – No answer! IS not a priority for MW developers
• Storage Capacity in GLUE 2
  – Discussions need to be restarted
  – Is there a need for a usage document for GLUE 2?

Page 10: Status September 2013


Glue-validator in Nagios

• Final version on midmon 01.10.2013
• Validation by COD/ROD team 10.10.2013
• Glue-validator in operations on 01.11.2013

Page 11: Status September 2013


Retirement of GLUE 1

• EGI is preparing the retirement of GLUE 1
  – Test GLUE 2 information consumption
    • 2014 QR1
  – Stop support of GLUE 1 as of May 2014
    • If no blocking issues are found
    • How will the end of support actually be implemented?
      – Modify information providers so they don’t publish GLUE 1?
        » This will take time!
• Retiring GLUE 1 will take a long time
  – But it won’t be a trustworthy source of information once support officially ends

Page 12: Status September 2013


FCR

• FCR only in GLUE 1
  – Any plans to write it also for GLUE 2?
• CMS queues are removed completely if they are blacklisted
  – All the ACBRs are removed
  – The object is no longer valid and does not get published
  – Do we want to fix this? It’s a known issue
• Entries removed due to FCR are still cached
  – Is this OK? Should this be fixed?
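The queue-removal behaviour described on this slide can be modelled roughly as follows: blacklisted ACBRs are stripped from each entry, and an entry left without any ACBR is dropped instead of being published. This is a sketch of the described effect, not the actual FCR implementation; the function name, the entry format and the sample queue names are hypothetical, and `GlueCEAccessControlBaseRule` is assumed to be the GLUE 1 attribute holding the ACBRs.

```python
ACBR_ATTRIBUTE = "GlueCEAccessControlBaseRule"  # assumed GLUE 1 attribute name


def apply_blacklist(entries, blacklisted_rules):
    """Strip blacklisted ACBRs and drop entries that end up with none."""
    published = []
    for entry in entries:
        remaining = [
            rule for rule in entry.get(ACBR_ATTRIBUTE, [])
            if rule not in blacklisted_rules
        ]
        if not remaining:
            # No ACBR left: the object is no longer valid, so it is not published.
            continue
        updated = dict(entry)
        updated[ACBR_ATTRIBUTE] = remaining
        published.append(updated)
    return published


# Hypothetical queues: one serves only CMS, the other serves CMS and ATLAS.
queues = [
    {"id": "queue-cms", ACBR_ATTRIBUTE: ["VO:cms"]},
    {"id": "queue-shared", ACBR_ATTRIBUTE: ["VO:cms", "VO:atlas"]},
]
print(apply_blacklist(queues, {"VO:cms"}))
```

In this model the CMS-only queue disappears entirely, which matches the behaviour the slide raises as a known issue, while the shared queue stays published with its remaining ACBR.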