Publication patterns in HEP computing

Maria Grazia Pia, INFN Genova

M. G. Pia (1), T. Basaglia (2), Z. W. Bell (3), P. V. Dressendorfer (4)
(1) INFN Genova, Genova, Italy
(2) CERN, Geneva, Switzerland
(3) ORNL, Oak Ridge, TN, USA
(4) IEEE, Piscataway, NJ, USA

CHEP 2012, NYC

Transcript of Publication patterns in HEP computing

Page 1: Publication patterns in HEP computing

Publication patterns in HEP computing

M. G. Pia (1), T. Basaglia (2), Z. W. Bell (3), P. V. Dressendorfer (4)

(1) INFN Genova, Genova, Italy
(2) CERN, Geneva, Switzerland
(3) ORNL, Oak Ridge, TN, USA
(4) IEEE, Piscataway, NJ, USA

CHEP 2012, NYC

Page 2: Publication patterns in HEP computing

Analysis topics

General tools
− Geant4
− ROOT

HEP experiments
− LEP: ALEPH, DELPHI, L3, OPAL
− BaBar
− LHC: ALICE, ATLAS, CMS, LHCb, TOTEM

Grid computing
− LCG

What they publish

How much

Where

Citations

Technology vs physics

Software vs hardware

Software/DAQ-trigger

Representative, not exhaustive

Page 3: Publication patterns in HEP computing

Data sources

Thomson-Reuters: ISI Web of Knowledge
− CERN subscription: since 1970, conference database not included
− Search by keywords, collaboration name

Journal web sites
− IEEE TNS
− NIM, Comp. Phys. Comm. (Elsevier)
− JINST (IOP/SISSA)
➤ Full-text searches

CERN databases
− CERN Document System
− Greybook

Years: 1982-2011 (LEP), 1992-2011 (BaBar, LHC)
− Reproducible sample

Page 4: Publication patterns in HEP computing

Data sample

Contamination
− Non-pertinent entries in the data sample

Omission
− Pertinent papers are not included in the data sample

➩ Cross-checks
− WoS/CDS, WoS/publishers' web sites

WoS inconsistencies and errors
− Total number of citations includes Conference database
− Proceedings papers: false classifications and omissions
➩ Manually corrected whenever possible

Automated analysis (whenever possible)

Manual evaluation: abstracts and full-text papers
− Some degree of subjectivity
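
The cross-check between two bibliographic sources can be pictured with plain set operations. The sketch below is only an illustration, not the procedure actually used for this analysis: the helper name `cross_check` and all identifiers are hypothetical, and it assumes each source has already been reduced to a set of comparable record identifiers (for example DOIs).

```python
def cross_check(wos_ids, cds_ids, pertinent_ids):
    """Compare hypothetical sets of paper identifiers.

    wos_ids, cds_ids: identifiers returned by each source for the same query.
    pertinent_ids: identifiers judged pertinent after manual evaluation.
    """
    wos, cds, pertinent = set(wos_ids), set(cds_ids), set(pertinent_ids)
    return {
        # Found by one source only: candidates for manual inspection.
        "only_in_wos": wos - cds,
        "only_in_cds": cds - wos,
        # Contamination: entries in the WoS sample that are not pertinent.
        "contamination": wos - pertinent,
        # Omission: pertinent papers missing from the WoS sample.
        "omission": pertinent - wos,
    }


# Made-up identifiers, for illustration only.
report = cross_check(
    wos_ids=["p1", "p2", "p3", "x9"],
    cds_ids=["p1", "p2", "p4"],
    pertinent_ids=["p1", "p2", "p3", "p4"],
)
for name, ids in report.items():
    print(name, sorted(ids))
```

The set of pertinent papers is itself the product of the manual evaluation, so the subjectivity noted above carries over to the contamination and omission counts.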

Page 5: Publication patterns in HEP computing

S. Agostinelli et al., Geant4: a simulation toolkit, NIM A, vol. 506, no. 3, pp. 250-303, 2003
− 2934 citations (14 May 2012)
− 2026 citations excluding proceedings
− Most cited CERN publication in WoS (excluding Rev. Part. Properties)

J. Allison et al., Geant4 Developments and Applications, IEEE Trans. Nucl. Sci., vol. 53, no. 1, pp. 270-278, 2006
− 574 citations (14 May 2012)
− 381 citations excluding proceedings

Many papers cite the NIM paper but omit the TNS one, even though both are listed at http://cern.ch/geant4

Many papers that use Geant4 do not cite either reference

Citation analysis: until 2011 (reproducibility)

Page 6: Publication patterns in HEP computing

[Figure: Geant4 NIM and Geant4 TNS citations per year, 2003-2011. Axes: Year, Citations]

[Figure: Geant4 NIM: Citing Journals. Axis: Citations. Journals shown: Radiat. Prot. Dosim., J. Korean Phys. Soc., Radiat. Meas., Appl. Radiat. Isot., J. Phys. G, JHEP, Astrop. Phys., EPJC, JINST, NIM B, Phys. Lett. B, Phys. Rev. C, Phys. Med. Biol., Med. Phys., Phys. Rev. Lett., TNS, Phys. Rev. D, NIM A. Annotations: 30% Physics; 75% citations (plot)]

[Figure: G4 NIM: Citing Collaborations. Axis: Citations. Legend: LHC, HEP, Other. Collaborations shown: ISOLDE, ALICE, JET EFDA, BES III, N TOF, MiniBooNE, LUNA, CDF, HARP, LHCb, CMS, ATLAS, BaBar. Annotations: 16% citations (plot); 19% citations from collaborations]

Born from LHC experimental requirements

Multidisciplinary sources of citations

Page 7: Publication patterns in HEP computing

R. Brun and F. Rademakers, ROOT - An object oriented data analysis framework, NIM A, vol. 389, no. 1-2, pp. 81-86, 1997
− 540 citations (14 May 2012)
− 347 citations excluding proceedings
− AIHENP Workshop proceedings paper

I. Antcheva et al., ROOT - A C++ framework for petabyte data storage, statistical analysis and visualization, Comp. Phys. Comm., vol. 180, no. 12, pp. 2499-2512, 2009
− 27 citations (14 May 2012)
− 20 citations excluding proceedings

Citation analysis: until 2011 (reproducibility)
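
The share of citations that come from proceedings papers follows directly from the two counts quoted for each reference paper. The snippet below is a small worked example rather than part of the original analysis; the pairing of each pair of counts with a specific paper is an assumption based on the order of the slide text, and the dictionary keys are labels chosen here for readability.

```python
# (total citations, citations excluding proceedings) as of 14 May 2012,
# copied from the slides; the pairing of counts to papers is an assumption.
CITATIONS = {
    "Geant4 NIM A 2003": (2934, 2026),
    "Geant4 TNS 2006": (574, 381),
    "ROOT NIM A 1997": (540, 347),
    "ROOT CPC 2009": (27, 20),
}


def proceedings_share(total, excluding_proceedings):
    """Fraction of citations that come from proceedings papers."""
    return (total - excluding_proceedings) / total


shares = {paper: proceedings_share(*counts) for paper, counts in CITATIONS.items()}
for paper, share in shares.items():
    print(f"{paper}: {100 * share:.1f}% of citations from proceedings")
```

With these numbers, proceedings account for roughly a quarter to a third of the citations of each paper, which is why the counts with and without the conference database differ so visibly.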

Page 8: Publication patterns in HEP computing

[Figure: ROOT Proc. and ROOT CPC citations per year, 1997-2011. Axes: Year, Citations]

[Figure: ROOT Proc.: Citing Journals. Axis: Citations. Journals shown: NIM A, TNS, Comp. Phys. Comm., Phys. Rev. C, Phys. Rev. D, JINST, Phys. Med. Biol., EPJC, Med. Phys., JHEP, NIM B, Lect. Notes Comp., Astropart. Phys. Annotation: 75% citations]

[Figure: citing collaborations. Axis: Citations. Collaborations shown: AUGER, D0, H1, JET-EFDA, PHOBOS, RISING, ALICE, T2K, D0, CDF. Annotation: 8% of all citations from collaborations]

Field of citing journals

             Geant4 %   ROOT %
Technology   30.3       49.6
Physics      29.9       18.2
BioMedical   13.9       6.0

Page 9: Publication patterns in HEP computing

HEP experiments

LEP
• ALEPH
• DELPHI
• L3
• OPAL

BaBar

LHC
• ALICE
• ATLAS
• CMS
• LHCb
• TOTEM

Start of run
• LEP: 1989
• BaBar: 1999
• LHC: 2008

Page 10: Publication patterns in HEP computing

Time distribution

Run start: LEP 1989, BaBar 1999, LHC 2008

[Figure: time distribution of publications. Publication year rescaled w.r.t. the year of run start]

Page 11: Publication patterns in HEP computing

Time distribution

Run start: LEP 1989, BaBar 1999, LHC 2008

Same as previous slide, rescaled by the number of experiment members
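
The two rescalings used for the time distributions (shifting the publication year by the year of run start, then dividing by the size of the collaboration) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `rescaled_counts`, the publication years and the collaboration size are made up, and only the run-start years are taken from the slides.

```python
from collections import Counter

# Year of start of run, as given on the slides.
RUN_START = {"LEP": 1989, "BaBar": 1999, "LHC": 2008}


def rescaled_counts(publication_years, run_start, members=None):
    """Histogram publications by year relative to the start of the run.

    publication_years: iterable of publication years for one experiment.
    run_start: year of start of run of that experiment.
    members: optional collaboration size; if given, counts are divided by it.
    """
    counts = Counter(year - run_start for year in publication_years)
    scale = 1 if members is None else members
    return {offset: n / scale for offset, n in sorted(counts.items())}


# Made-up publication years and collaboration size, for illustration only.
years = [2006, 2008, 2008, 2009, 2010, 2010, 2010]
per_year = rescaled_counts(years, RUN_START["LHC"])
per_member = rescaled_counts(years, RUN_START["LHC"], members=1000)
print(per_year)
print(per_member)
```

Negative offsets correspond to papers published before the start of the run, which lets experiments from different eras be overlaid on a common axis.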

Page 12: Publication patterns in HEP computing

Publications

Share of hardware, software and DAQ-trigger publications

Page 13: Publication patterns in HEP computing

Physics publications

LEP experiments: completed their life-cycle

LHC experiments: at an early stage of their physics production

Page 14: Publication patterns in HEP computing

Technological publications

Roughly constant trends, once the number of publications is normalized to the number of collaborators

Page 15: Publication patterns in HEP computing

Software vs. hardware

Hardware publications: approximately 4 times more than software

DAQ-trigger publications: approximately 1.3 times more than software

Page 16: Publication patterns in HEP computing

Journals

[Figure: journals by publication category. Categories: hardware, DAQ-trigger, software. Journals: TNS, NIM A, JINST]

Page 17: Publication patterns in HEP computing

Journals: LEP and LHC

LHC: still dominated by technological publications

LEP: dominated by physics publications

Page 18: Publication patterns in HEP computing

Journals: pre- and post-2000

IEEE TNS is the most popular journal for HEP technological publications in recent years

Page 19: Publication patterns in HEP computing

Citations

The most cited papers are often the general reference papers about the detector published by each experiment

Citations of the most cited paper
− ALEPH: 340
− DELPHI: 309
− L3: 509
− OPAL: 473
− BaBar: 859
− ALICE: 116
− CMS: 129
− LHCb: 101
− TOTEM: 35
− ATLAS: 185 (ATLAS pixel detector electronics and sensors)

[Figure: four panels, labelled "0 citations: 4%", "0 citations: 17%", "0 citations: 27%" and "0 citations: 25%"]

Page 20: Publication patterns in HEP computing

References

More references, more citations

Physics papers cite more references than technological papers

Bibliographical entries in software papers are often web sites

Page 21: Publication patterns in HEP computing

Pages

The number of pages of a paper depends on the format of the journal
− 1 page in TNS ≈ 2.5 pages in JINST

Different journal formats in the same category

Evolution of the format of some journals (e.g. NIM)

Page 22: Publication patterns in HEP computing

Sources of citations to physics papers

[Figure, LEP: citing journals for DELPHI and ALEPH physics papers. Axis: Citations (%). Journals shown: Nucl. Phys. A, Phys. Atom. Nucl., Mod. Phys. Lett. A, Phys. Rep., NIM A, J. Phys. G, Acta Phys. Pol. B, Int. J. Mod. Phys. A, Z. Phys. C, JHEP, Phys. Rev. Lett., Nucl. Phys. B Proc. Suppl., Nucl. Phys. B, EPJC, Phys. Lett. B, Phys. Rev. D]

[Figure, LHC: citing journals for CMS and ATLAS physics papers. Axis: Citations (%). Journals shown: Ann. Rev. Nucl. Part. Sci., J. Cosm. Astrop. Phys., JINST, New J. Phys., Progr. Theor. Phys. Suppl., Int. J. Mod. Phys. A, J. Phys. G Nucl., Mod. Phys. Lett. A, Nucl. Phys. A, Phys. Rev. C, Acta Phys. Pol. B, Phys. Rev. Lett., EPJC, Phys. Lett. B, JHEP, Phys. Rev. D]

Samples in plots account for >90% of citations

Citations to HEP physics papers mostly come from journals specialized in HEP and a few related fields (astroparticle and nuclear physics)

Page 23: Publication patterns in HEP computing

Sources of citations to technological papers

[Figure, LHC: citing journals for CMS and ATLAS technological papers. Axis: Citations (%). Journals shown: Int. J. Mod. Phys. A, Comp. Phys. Comm., Phys. Lett. B, Nucl. Phys. B Proc. Suppl., JHEP, Phys. Rev. D, EPJC, JINST, TNS, NIM A]

[Figure, LEP: citing journals for DELPHI and ALEPH technological papers. Axis: Citations (%). Journals shown: Rep. Prog. Phys., Rev. Mod. Phys., Ann. Rev. Nucl. Part. Sci., Phys. Rep., Int. J. Mod. Phys. A, Acta Phys. Pol. B, JHEP, Phys. Rev. D, Nucl. Phys. B, Comp. Phys. Comm., Z. Phys. C, Nucl. Phys. B Proc. Suppl., TNS, Phys. Lett. B, EPJC, NIM A]

Citations from HEP physics and technology journals

Page 24: Publication patterns in HEP computing

2008-2011

More refined analysis of technological papers published since start of LHC run

[Figure: TNS 2008-2011. Number of papers for ATLAS, CMS, LHCb, ALICE, TOTEM and LHC, by category: Hardware, Software, DAQ-trigger]

[Figure: NIM 2008-2011. Number of papers for ATLAS, CMS, LHCb, ALICE, TOTEM and LHC, by category: Hardware, Software]

Page 25: Publication patterns in HEP computing

Citations 2008-2011: self-citations and outside citations

[Figure: TNS. Number of self-citations for ATLAS, CMS, LHCb, ALICE, TOTEM and LHC, by category: Hardware, Software, DAQ-trigger]

[Figure: TNS. Number of outside citations for ATLAS, CMS, LHCb, ALICE, TOTEM and LHC, by category: Hardware, Software, DAQ-trigger]

[Figure: NIM A. Number of self-citations for ATLAS, CMS, LHCb, ALICE, TOTEM and LHC, by category: Hardware, Software]

[Figure: NIM A. Number of outside citations for ATLAS, CMS, LHCb, ALICE, TOTEM and LHC, by category: Hardware, Software]

Page 26: Publication patterns in HEP computing

LCG – LHC Computing Grid

Sakamoto, H. Data grid deployment for high energy physics in Japan. CPC, 2007
Shiers, J. The Worldwide LHC Computing Grid (worldwide LCG). CPC, 2007
Belov, S. et al. LCG MCDB - a knowledgebase of Monte-Carlo simulated events. CPC, 2008
Yin, F. et al. Grid resource management policies for load-balancing and energy-saving by vacation queuing theory. CPC, 2009
Malawski, M. et al. Invocation of operations from script-based Grid applications. Fut. Gen. Comp. Syst., 2010
Huedo, E. et al. A modular meta-scheduling architecture for interfacing with pre-WS and WS Grid resource management services. Fut. Gen. Comp. Syst., 2007
Agarwal, A. et al. GridX1: A Canadian computational grid. Fut. Gen. Comp. Syst., 2007
Chytracek, R. et al. POOL development status and production experience. TNS, 2005
Hatlo, M. et al. Developments of mathematical software libraries for the LHC experiments. TNS, 2005
Pfeiffer, A. et al. The LCG PI project: Using interfaces for physics data analysis. TNS, 2005
Munro, C. et al. Measurement of the LCG2 and gLite File Catalogue's performance. TNS, 2006
Li, H. Realistic Workload Modeling and Its Performance Impacts in Large-Scale eScience Grids. IEEE Trans. Par. Distr. Syst., 2010
Andreeva, J. et al. High-Energy Physics on the Grid: the ATLAS and CMS Experience. J. Grid Comp., 2008
Munoz, V. M. et al. A Decentralized Deployment Strategy and Performance Evaluation of LCG File Catalog Service. J. Grid Comp., 2011
Hou, S. et al. PacCAF: a Grid Portal in Pacific Asia for the CDF Experiment. J. Grid Comp., 2009
Kim, B. K. et al. A Composition of Monitoring Services for the LHC Computing Grid. J. Grid Comp., 2009

Source: WoS

Page 27: Publication patterns in HEP computing

LCG

Small sample of publications

Hard to perform any statistical analysis

Page 28: Publication patterns in HEP computing

Conclusions

Software is largely underrepresented in HEP scholarly literature w.r.t. hardware

Publication patterns appear similar in the LEP and LHC era

Citation patterns are different for publications by HEP experiments and about general software tools

Publish!…and don’t forget to cite