Publication patterns in HEP computing
Maria Grazia Pia, INFN Genova 1
Publication patterns in HEP computing
M. G. Pia (1), T. Basaglia (2), Z. W. Bell (3), P. V. Dressendorfer (4)
(1) INFN Genova, Genova, Italy
(2) CERN, Geneva, Switzerland
(3) ORNL, Oak Ridge, TN, USA
(4) IEEE, Piscataway, NJ, USA
CHEP 2012, NYC
Analysis topics
General tools
− Geant4
− ROOT
HEP experiments
− LEP: ALEPH, DELPHI, L3, OPAL
− BaBar
− LHC: ALICE, ATLAS, CMS, LHCb, TOTEM
Grid computing
− LCG
What they publish
How much
Where
Citations
Technology vs physics
Software vs hardware
Software/DAQ-trigger
Representative
Not exhaustive
Data sources
Thomson-Reuters: ISI Web of Knowledge
− CERN subscription: since 1970, conference database not included
− Search by keywords, collaboration name
Journal web sites
− IEEE TNS
− NIM, Comp. Phys. Comm. (Elsevier)
− JINST (IOP/SISSA)
➤ Full-text searches
CERN databases
− CERN Document System
− Greybook
Years: 1982-2011 (LEP), 1992-2011 (BaBar, LHC)
− Reproducible sample
Data sample
Contamination
− Non-pertinent entries in the data sample
Omission
− Pertinent papers are not included in the data sample
➩ Cross-checks
− WoS/CDS, WoS/publishers' web sites
WoS inconsistencies and errors
− Total number of citations includes Conference database
− Proceedings papers: false classifications and omissions
➩ Manually corrected whenever possible
Automated analysis (whenever possible)
Manual evaluation: abstracts and full-text papers
− Some degree of subjectivity
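The cross-checks between data sources can be pictured as set operations on paper identifiers. The following is only a minimal sketch: the function name `cross_check`, the identifiers, and the idea that manual evaluation yields a set of pertinent papers are assumptions made for illustration, not the procedure used in the study.

```python
def cross_check(primary, secondary, pertinent):
    """Compare a primary sample (e.g. WoS) with a secondary one (e.g. CDS).

    All three arguments are collections of paper identifiers (hypothetical).
    `pertinent` stands for the outcome of manual evaluation.
    """
    primary, secondary, pertinent = set(primary), set(secondary), set(pertinent)
    return {
        # entries in the primary sample that do not belong there
        "contamination": primary - pertinent,
        # pertinent papers found only through the secondary source
        "omission": (secondary & pertinent) - primary,
        # sample after removing contamination and recovering omissions
        "corrected": (primary | secondary) & pertinent,
    }


# Hypothetical identifiers, for illustration only
wos = {"paper-1", "paper-2", "paper-3", "unrelated-9"}
cds = {"paper-2", "paper-3", "paper-4"}
judged_pertinent = {"paper-1", "paper-2", "paper-3", "paper-4"}

report = cross_check(wos, cds, judged_pertinent)
print(sorted(report["contamination"]))
print(sorted(report["omission"]))
```

Using sets keeps the comparison independent of the order in which each database returns its entries.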
S. Agostinelli et al., Geant4: a simulation toolkit, NIM A, vol. 506, no. 3, pp. 250-303, 2003
− 2934 citations (14 May 2012)
− 2026 citations excluding proceedings
− Most cited CERN publication in WoS (excluding Rev. Part. Properties)
J. Allison et al., Geant4 Developments and Applications, IEEE Trans. Nucl. Sci., vol. 53, no. 1, pp. 270-278, 2006
− 574 citations (14 May 2012)
− 381 citations excluding proceedings
Many papers cite the NIM paper but omit the TNS one, even though both are listed at http://cern.ch/geant4
Many papers that use Geant4 do not cite either reference
Citation analysis: until 2011 (reproducibility)
Plot: citations per year, 2003-2011, for Geant4 NIM and Geant4 TNS
Plot: Geant4 NIM: Citing Journals (axis: citations). Journals shown: Radiat. Prot. Dosim., J. Korean Phys. Soc., Radiat. Meas., Appl. Radiat. Isot., J. Phys. G, JHEP, Astrop. Phys., EPJC, JINST, NIM B, Phys. Lett. B, Phys. Rev. C, Phys. Med. Biol., Med. Phys., Phys. Rev. Lett., TNS, Phys. Rev. D, NIM A
− 30% Physics
− 75% citations (plot)
Plot: G4 NIM: Citing Collaborations (axis: citations; legend: LHC, HEP, Other). Collaborations shown: ISOLDE, ALICE, JET EFDA, BES III, N TOF, MiniBooNE, LUNA, CDF, HARP, LHCb, CMS, ATLAS, BaBar
− 16% citations (plot)
− 19% citations from collaborations
Born from LHC experimental requirements
Multidisciplinary sources of citations
R. Brun and F. Rademakers, ROOT - An object oriented data analysis framework, NIM A, vol. 389, no. 1-2, pp. 81-86, 1997
− 540 citations (14 May 2012)
− 347 citations excluding proceedings
− AIHENP Workshop proceedings paper
I. Antcheva et al., ROOT - A C++ framework for petabyte data storage, statistical analysis and visualization, Comp. Phys. Comm., vol. 180, no. 12, pp. 2499-2512, 2009
− 27 citations (14 May 2012)
− 20 citations excluding proceedings
Citation analysis: until 2011 (reproducibility)
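The citation counts quoted for the Geant4 and ROOT reference papers allow a quick estimate of how much of each total comes from proceedings. The sketch below assumes that the counts belong to the papers in the order the slides list them, and that "excluding proceedings" means the remainder is citations from proceedings papers. The function name `proceedings_share` is introduced here for illustration.

```python
# (total citations, citations excluding proceedings), as of 14 May 2012
CITATIONS = {
    "Geant4 NIM A (2003)": (2934, 2026),
    "Geant4 TNS (2006)": (574, 381),
    "ROOT NIM A (1997)": (540, 347),
    "ROOT CPC (2009)": (27, 20),
}


def proceedings_share(total, excluding_proceedings):
    """Percentage of the total citations attributed to proceedings."""
    return 100.0 * (total - excluding_proceedings) / total


for paper, (total, excluding) in CITATIONS.items():
    share = proceedings_share(total, excluding)
    print(f"{paper}: {share:.1f}% of citations from proceedings")
```

Under these assumptions the share is roughly a quarter to a third for all four papers.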
Plot: citations per year, 1997-2011, for ROOT Proc. and ROOT CPC
Plot: ROOT Proc.: Citing Journals (axis: citations). Journals shown: NIM A, TNS, Comp. Phys. Comm., Phys. Rev. C, Phys. Rev. D, JINST, Phys. Med. Biol., EPJC, Med. Phys., JHEP, NIM B, Lect. Notes Comp., Astropart. Phys.
Plot: citing collaborations (axis: citations). Collaborations shown: AUGER, D0, H1, JET-EFDA, PHOBOS, RISING, ALICE, T2K, D0, CDF
− 75% citations
− 8% of all citations from collaborations
Field of citing journals
             Geant4 %   ROOT %
Technology   30.3       49.6
Physics      29.9       18.2
BioMedical   13.9       6.0
HEP experiments
LEP
• ALEPH
• DELPHI
• L3
• OPAL
BaBar
LHC
• ALICE
• ATLAS
• CMS
• LHCb
• TOTEM
Start of run
• LEP: 1989
• BaBar: 1999
• LHC: 2008
Time distribution
Run start: LEP 1989, BaBar 1999, LHC 2008
Publication year rescaled w.r.t. the year of run start
Time distribution
Run start: LEP 1989, BaBar 1999, LHC 2008
Same as previous slide, rescaled by the number of experiment members
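The two rescalings used for the time distributions (publication year relative to the start of run, then division by the number of experiment members) can be sketched as below. The run-start years are those given on the slides. The function name `rescale`, the yearly paper counts, and the collaboration size are hypothetical values chosen only to show the arithmetic.

```python
# Year of start of run, as quoted on the slides
RUN_START = {"LEP": 1989, "BaBar": 1999, "LHC": 2008}


def rescale(papers_per_year, run_start, members=1):
    """Map {publication year: papers} to {years since run start: papers per member}.

    With members=1 only the time axis is rescaled.
    """
    return {
        year - run_start: count / members
        for year, count in sorted(papers_per_year.items())
    }


# Hypothetical counts and collaboration size, for illustration only
example_counts = {2007: 10, 2008: 20, 2009: 30}
print(rescale(example_counts, RUN_START["LHC"]))
print(rescale(example_counts, RUN_START["LHC"], members=1000))
```

Putting all experiments on a common "years since run start" axis is what makes LEP, BaBar and LHC curves comparable despite their different calendar dates.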
Publications
Share of hardware, software and DAQ-trigger publications
Physics publications
LEP experiments completed their life-cycle
LHC experiments: at an early stage of their physics production
Technological publications
Roughly constant trends, once the number of publications is normalized to the number of collaborators
Software vs. hardware
Hardware publications: approximately 4 times more than software
DAQ-trigger publications: approximately 1.3 times more than software
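The two ratios quoted above can be converted into approximate shares of technological publications. This is a sketch under the assumption that hardware, DAQ-trigger and software are the only three categories and that "4 times more" means a ratio of 4 to 1. The function name `shares` is illustrative.

```python
# Publication counts relative to software (software = 1), from the slide
RATIO_TO_SOFTWARE = {"hardware": 4.0, "DAQ-trigger": 1.3, "software": 1.0}


def shares(ratios):
    """Convert relative ratios into percentage shares that sum to 100."""
    total = sum(ratios.values())
    return {category: 100.0 * value / total for category, value in ratios.items()}


for category, percent in shares(RATIO_TO_SOFTWARE).items():
    print(f"{category}: {percent:.0f}%")
```

Under these assumptions software accounts for roughly one sixth of the technological publications.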
Journals
Plot labels: hardware, DAQ-trigger, software; TNS, NIM A, JINST
Journals: LEP and LHC
LHC: still dominated by technological publications
LEP: dominated by physics publications
Journals: pre- and post-2000
IEEE TNS is the most popular journal for HEP technological publications in recent years
Citations
The most cited papers are often the general reference papers about the detector published by each experiment
Citations of the most cited paper
− ALEPH: 340
− DELPHI: 309
− L3: 509
− OPAL: 473
− BaBar: 859
− ALICE: 116
− CMS: 129
− LHCb: 101
− TOTEM: 35
− ATLAS: 185 (ATLAS pixel detector electronics and sensors)
0 citations: 4% 0 citations: 17%
0 citations: 27% 0 citations: 25%
References
More references, more citations
Physics papers cite more references than technological papers
Bibliographical entries in software papers are often web sites
Pages
The number of pages of a paper depends on the format of the journal
− 1 page (TNS) ≈ 2.5 pages (JINST)
Different journal formats in the same category
Evolution of the format of some journals (e.g. NIM)
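Page counts from different journals need a common unit before they can be compared. The sketch below uses the only conversion given on the slide (1 TNS page ≈ 2.5 JINST pages) and expresses lengths in JINST-equivalent pages. The function name is hypothetical, and no factors are assumed for other journals.

```python
# JINST-equivalent pages per printed page; only the TNS factor comes from the slide
JINST_PAGES_PER_PAGE = {"JINST": 1.0, "TNS": 2.5}


def jinst_equivalent_pages(pages, journal):
    """Convert a page count to approximate JINST-equivalent pages."""
    if journal not in JINST_PAGES_PER_PAGE:
        # No conversion factor is quoted for this journal, so refuse to guess
        raise ValueError(f"no conversion factor known for {journal!r}")
    return pages * JINST_PAGES_PER_PAGE[journal]


print(jinst_equivalent_pages(8, "TNS"))
print(jinst_equivalent_pages(20, "JINST"))
```

Raising an error for unlisted journals avoids silently mixing unconverted page counts into a comparison.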
Sources of citations to physics papers
Plot (LEP): citations (%) to DELPHI and ALEPH physics papers by citing journal. Journals shown: Nucl. Phys. A, Phys. Atom. Nucl., Mod. Phys. Lett. A, Phys. Rep., NIM A, J. Phys. G, Acta Phys. Pol. B, Int. J. Mod. Phys. A, Z. Phys. C, JHEP, Phys. Rev. Lett., Nucl. Phys. B Proc. Suppl., Nucl. Phys. B, EPJC, Phys. Lett. B, Phys. Rev. D
Plot (LHC): citations (%) to CMS and ATLAS physics papers by citing journal. Journals shown: Ann. Rev. Nucl. Part. Sci., J. Cosm. Astrop. Phys., JINST, New J. Phys., Progr. Theor. Phys. Suppl., Int. J. Mod. Phys. A, J. Phys. G Nucl., Mod. Phys. Lett. A, Nucl. Phys. A, Phys. Rev. C, Acta Phys. Pol. B, Phys. Rev. Lett., EPJC, Phys. Lett. B, JHEP, Phys. Rev. D
Samples in plots account for >90% of citations
Citations to HEP physics papers mostly come from journals specialized in HEP and a few related fields (astroparticle and nuclear physics)
Sources of citations to technological papers
Plot (LHC): citations (%) to CMS and ATLAS technological papers by citing journal. Journals shown: Int. J. Mod. Phys. A, Comp. Phys. Comm., Phys. Lett. B, Nucl. Phys. B Proc. Suppl., JHEP, Phys. Rev. D, EPJC, JINST, TNS, NIM A
Plot (LEP): citations (%) to DELPHI and ALEPH technological papers by citing journal. Journals shown: Rep. Prog. Phys., Rev. Mod. Phys., Ann. Rev. Nucl. Part. Sci., Phys. Rep., Int. J. Mod. Phys. A, Acta Phys. Pol. B, JHEP, Phys. Rev. D, Nucl. Phys. B, Comp. Phys. Comm., Z. Phys. C, Nucl. Phys. B Proc. Suppl., TNS, Phys. Lett. B, EPJC, NIM A
Citations from HEP physics and technology journals
2008-2011
More refined analysis of technological papers published since the start of the LHC run
Plot: TNS 2008-2011, number of papers (Hardware, Software, DAQ-trigger) for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC
Plot: NIM 2008-2011, number of papers (Hardware, Software) for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC
Citations 2008-2011
Plots (TNS): number of self-citations and number of outside citations (Hardware, Software, DAQ-trigger) for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC
Plots (NIM A): number of self-citations and number of outside citations (Hardware, Software) for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC
LCG – LHC Computing Grid
Source: WoS
− Sakamoto, H, Data grid deployment for high energy physics in Japan, CPC, 2007
− Shiers, J, The Worldwide LHC Computing Grid (worldwide LCG), CPC, 2007
− Belov, S et al., LCG MCDB - a knowledgebase of Monte-Carlo simulated events, CPC, 2008
− Yin, F et al., Grid resource management policies for load-balancing and energy-saving by vacation queuing theory, CPC, 2009
− Malawski, M et al., Invocation of operations from script-based Grid applications, Fut. Gen. Comp. Syst., 2010
− Huedo, E et al., A modular meta-scheduling architecture for interfacing with pre-WS and WS Grid resource management services, Fut. Gen. Comp. Syst., 2007
− Agarwal, A et al., GridX1: A Canadian computational grid, Fut. Gen. Comp. Syst., 2007
− Chytracek, R et al., POOL development status and production experience, TNS, 2005
− Hatlo, M et al., Developments of mathematical software libraries for the LHC experiments, TNS, 2005
− Pfeiffer, A et al., The LCG PI project: Using interfaces for physics data analysis, TNS, 2005
− Munro, C et al., Measurement of the LCG2 and gLite File Catalogue's performance, TNS, 2006
− Li, H, Realistic Workload Modeling and Its Performance Impacts in Large-Scale eScience Grids, IEEE Trans. Par. Distr. Syst., 2010
− Andreeva, J et al., High-Energy Physics on the Grid: the ATLAS and CMS Experience, J. Grid Comp., 2008
− Munoz, VM et al., A Decentralized Deployment Strategy and Performance Evaluation of LCG File Catalog Service, J. Grid Comp., 2011
− Hou, S et al., PacCAF: a Grid Portal in Pacific Asia for the CDF Experiment, J. Grid Comp., 2009
− Kim, BK et al., A Composition of Monitoring Services for the LHC Computing Grid, J. Grid Comp., 2009
LCG
Small sample of publications
Hard to perform any statistical analysis
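One way to see why such a small sample resists statistical analysis is to look at the relative fluctuation expected for a simple count. The sketch below treats the per-journal counts of the LCG papers listed on the previous slide as Poisson-distributed, which is an assumption made here for illustration and not a statement from the talk. The function name is also illustrative.

```python
import math

# Number of LCG papers per journal, counted from the list on the previous slide
LCG_PAPERS_BY_JOURNAL = {
    "CPC": 4,
    "Fut. Gen. Comp. Syst.": 3,
    "TNS": 4,
    "IEEE Trans. Par. Distr. Syst.": 1,
    "J. Grid Comp.": 4,
}


def relative_poisson_uncertainty(count):
    """Relative statistical uncertainty sqrt(N)/N = 1/sqrt(N) of a Poisson count."""
    return 1.0 / math.sqrt(count)


total = sum(LCG_PAPERS_BY_JOURNAL.values())
print(f"total papers: {total}, relative uncertainty: {relative_poisson_uncertainty(total):.0%}")
for journal, count in LCG_PAPERS_BY_JOURNAL.items():
    print(f"{journal}: {count} papers, relative uncertainty: {relative_poisson_uncertainty(count):.0%}")
```

With uncertainties of this size on every per-journal count, differences between journals or years cannot be distinguished from fluctuations.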
Conclusions
Software is largely underrepresented in HEP scholarly literature w.r.t. hardware
Publication patterns appear similar in the LEP and LHC era
Citation patterns are different for publications by HEP experiments and about general software tools
Publish!…and don’t forget to cite