Enterprise and Infrastructure Resilience Conference

Enterprise and Infrastructure Resilience Conference: Managing Software for Scientific Projects U.S. EPA Office of Research and Development: Daniel L. Young Cincinnati, OH 12-13 August 2019 8/29/2019 1

Transcript of Enterprise and Infrastructure Resilience Conference

Page 1: Enterprise and Infrastructure Resilience Conference

Enterprise and Infrastructure Resilience Conference:

Managing Software for Scientific Projects

U.S. EPA Office of Research and Development: Daniel L. Young

Cincinnati, OH, 12-13 August 2019

8/29/2019 1

Page 2: Enterprise and Infrastructure Resilience Conference

Overview

• QA/QC and software development at US EPA and its availability to the public (GitHub).

– How we manage software in ORD.

• Summary of a few individual software programs and how they support sustainability and resilience.

Disclaimer:

The views expressed in this presentation are those of the authors and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

Page 3: Enterprise and Infrastructure Resilience Conference

Outline

I. System Life Cycle Management (SLCM)

II. Requirements | Design

III. Development | Coding

IV. Testing | V&V

V. Maintenance | Retire

VI. Case Studies

VII. Summary

VIII. References

Page 5: Enterprise and Infrastructure Resilience Conference

System Life Cycle Management (SLCM)

• Software Engineering is the systematic approach to the requirements analysis, design, development, testing, operation, maintenance, and retirement of software.

• SLCM Phases:

– Requirements | Design

– Development | Coding

– Testing | V&V

– Maintenance | Retire

Page 6: Enterprise and Infrastructure Resilience Conference

System Life Cycle Management (SLCM)

• Agile Methodology

• Waterfall Model

• Spiral Model

Page 7: Enterprise and Infrastructure Resilience Conference

System Life Cycle Management (SLCM)

US EPA Developer Guidance

• EPA has information to assist software developers @ https://developer.epa.gov/guide/.

• Information on Agile @ https://developer.epa.gov/guide/templates-guides/agile/.

• Open Source Code @ https://developer.epa.gov/guide/open-source-code/. *

• Waterfall @ https://developer.epa.gov/guide/templates-guides/waterfall/. *

– * More information forthcoming.

Page 8: Enterprise and Infrastructure Resilience Conference

• Requirements | Design

Page 9: Enterprise and Infrastructure Resilience Conference

Requirements | Design

• Open Source Software (OSS) Categorization.

– Level 1: Requires a full documentation suite, either via a SW QAPP or as an attached SCM Plan within an approved QAPP.

• Category A QAPP.

– Example:

• “Work Breakdown Structure (WBS) Cost Model and EPA Treatability Database (TDB) PFAS Treatment Information Updates.”

Page 10: Enterprise and Infrastructure Resilience Conference

Requirements | Design

• Open Source Software (OSS) Categorization

– Level 2: Provides a reduced documentation suite, may include attachment to QAPP.

• Category B QAPP.

– Examples:

• SciNote: Ruby on Rails | Docker container web app (internal intranet)

• Toxicity Estimation Software Tool (TEST): stand-alone executable

Page 11: Enterprise and Infrastructure Resilience Conference

Requirements | Design

• Open Source Software (OSS) Categorization.

– Level 3: Prescribes the minimum level of documentation to meet the requirements of QA/QC as outlined in “Guidance for Quality Assurance Project Plans, EPA QA/G-5.”

• Category B QAPP and/or Software Plan.

– Examples:

• Excel workbook (GREENSCOPE/GEMM/GWSC), macros, etc.

Note: Level 3 software is an excellent tool for verifying Level 1 or 2 web apps…
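
One way a Level 3 workbook could serve as that check is to treat its cell values as reference results and compare the web app's outputs against them within a tolerance. The sketch below is illustrative only: the result names, the numbers, and the tolerance are hypothetical and do not come from GREENSCOPE, GEMM, or GWSC.

```python
import math


def find_mismatches(app_results, workbook_results, rel_tol=1e-6):
    """Return the names whose web-app value disagrees with the workbook value.

    Both arguments map a result name to a number. A name missing from the
    web-app results is reported as a mismatch.
    """
    mismatches = []
    for name, expected in workbook_results.items():
        actual = app_results.get(name)
        if actual is None or not math.isclose(actual, expected, rel_tol=rel_tol):
            mismatches.append(name)
    return sorted(mismatches)


# Hypothetical values, used only to show the comparison.
workbook = {"seepage_rate": 0.125, "mass_balance": 42.0}
web_app = {"seepage_rate": 0.125, "mass_balance": 42.5}
print(find_mismatches(web_app, workbook))  # ['mass_balance']
```

An empty list means every workbook value was reproduced by the web app within the chosen tolerance.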

Page 12: Enterprise and Infrastructure Resilience Conference

• Development | Coding

Page 13: Enterprise and Infrastructure Resilience Conference

Development | Coding

• Software Configuration Management (SCM)

– NRMRL Standard Operating Procedure (SOP) for SCM G-LMMD-SOP-1399-0

• This procedure establishes the responsibilities and process for implementing and documenting Software Configuration Management (SCM) activities within the Land and Materials Management Division (LMMD). SCM activities in this procedure are designed to establish and maintain the integrity of software work products throughout the life cycle of software, primarily using the ORD-approved SCM tool, GitHub. Regardless of the methodology chosen (Agile, Waterfall, Spiral), all EPA-funded research MUST properly document data in an EPA-approved repository.

– EPA approved repositories (version control)

• US EPA GitHub https://github.com/USEPA

• *US EPA Bitbucket (Enterprise Only) http://intranet.epa.gov/atlassian/index.html

– *Newly added internal repo not for OSS.

Page 14: Enterprise and Infrastructure Resilience Conference

Development | Coding

• Coding Standards (SOP in development)

– Code Style Preferences

– Style Guides

• PEP8

• PEP257

– Linting

• flake8

• mypy

• pep8

• pydocstyle

• pylama

• pylint

• JSONLint

• linter-ruby

• RuboCop

• PVS-Studio

• ESLint

• linter-javac

– Commenting Your Code
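
As an illustration of what these guides and linters look for, the short Python function below is laid out per PEP 8, documented with a PEP 257 docstring, and annotated with type hints so that mypy can check it. The function is hypothetical and is not taken from any EPA code base.

```python
def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    # Comment the reasoning, not the syntax: 9/5 rescales the degree size
    # and 32 shifts the zero point.
    return temp_c * 9 / 5 + 32


print(celsius_to_fahrenheit(100.0))  # 212.0
```

Running a linter such as flake8 or pylint over a file like this is how deviations from the style guides would be reported.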

Page 15: Enterprise and Infrastructure Resilience Conference

• Testing | V&V

Page 16: Enterprise and Infrastructure Resilience Conference

Testing | V&V

• Verification & Validation (V&V)

– NRMRL Standard Operating Procedure (SOP) for SW V&V G-LMMD-SOP-1396-0

• The purpose of this procedure is to provide the methodology for performing the testing, verification and validation (V&V), and/or independent verification and validation (IV&V) processes throughout the software life cycle.

• This SOP has been developed to further assist researchers in the Land and Materials Management Division (LMMD) with documenting data that is analyzed during the testing, V&V, and IV&V activities of a Software Development Life Cycle (SDLC).

• The Software Test Report (STR) is a record of the qualification testing performed on a Computer Software Configuration Item (CSCI), a software system or subsystem, or another software-related item.

Page 17: Enterprise and Infrastructure Resilience Conference

Testing | V&V

• Verification & Validation (V&V)

– NRMRL Standard Operating Procedure (SOP) for SW V&V G-LMMD-SOP-1396-0

• This includes the processes that address all system, software, and hardware life cycle processes including the Agreement, Organizational Project-Enabling, Project, Technical, Software Implementation, Software Support, and Software Reuse process groups. This SOP is compatible with all life cycle models (e.g., system, software, and hardware); however, not all life cycle models use all the processes listed in this SOP.

• Testing, V&V, and IV&V processes determine whether the development products of a given activity conform to the requirements of that activity and whether the product satisfies its intended use and user needs.

• This determination may include the analysis, evaluation, review, inspection, assessment, and testing of products and processes.

• The purpose of this SOP is to:

– (1) Establish a common framework of the Testing, V&V, and IV&V processes, activities, and tasks in support of all system, software, and hardware life cycle processes;

– (2) Define the Testing, V&V, and IV&V tasks, required inputs, and required outputs in each life cycle process;

– (3) Identify the minimum Testing, V&V, and IV&V tasks corresponding to a four-level integrity schema; and

– (4) Define the content of the Verification and Validation Plan.

Page 18: Enterprise and Infrastructure Resilience Conference

Testing | V&V

• Software Test Report (STR)

– Examples are provided… e.g., Section one…

Page 19: Enterprise and Infrastructure Resilience Conference

Testing | V&V

• Software Test Report (STR)

– Examples are provided… e.g., Road Map

Page 20: Enterprise and Infrastructure Resilience Conference

Testing | V&V

• Software Test Report (STR)

– Examples are provided… e.g., Python Code Organization

Page 21: Enterprise and Infrastructure Resilience Conference

• Maintenance | Retire

Page 22: Enterprise and Infrastructure Resilience Conference

Maintenance | Retire

• The purpose of software maintenance is defined in the international standard for software maintenance: ISO/IEC/IEEE 14764 [5].

Page 23: Enterprise and Infrastructure Resilience Conference

• Case Studies & Lessons Learned

Page 24: Enterprise and Infrastructure Resilience Conference

Case Study: HELP

• Case Study | HELP

– Hydrologic Evaluation of Landfill Performance (HELP).

– Active Project

• G-LRPCD-0030107

• QAPP | SLCM Plan G-LRPCD-0030107-QP-1-1 | 2019-07-22

– Fortran 90.

– Last update 1992.

– Legacy code initial commit to GitHub: 26 JUL 2018.

• Lessons Learned

– Legacy code for software still in use should be added to the repo ASAP!

– Fortran 90 code is compatible with Fortran 2018: update the code or retire the software.

Page 25: Enterprise and Infrastructure Resilience Conference

Case Study: EPANET

• Case Study | EPANET

– Open Source EPANET

– Active Project

• G-WSD-0031455 | 2018-01-04

• No QAPP | SLCM Plan logged

– User interface in development.

• Lessons Learned

– Legacy code for software still in use should be added to the repo…

• Legacy code added 17 AUG 2017: https://github.com/USEPA/EPANET-legacy-user-interface.

• QAPP not required for legacy code unless there is new development.

– Current Active Project requires QAPP.

Page 26: Enterprise and Infrastructure Resilience Conference

Case Study: EPANET

• Case Study | EPANET

– EPANET: https://www.epa.gov/water-research/epanet [4]

– Application for modeling drinking water distribution systems.

– Used throughout the world to model water distribution systems.

– EPANET and SWMM have joint average downloads of approximately 100,000 per year [4].

– Tool for understanding the movement and fate of drinking water constituents within distribution systems.

– No major update to EPANET software since the official release of EPANET version 2.0 in 2000 [4].

– During the last 20+ years, the EPANET community has primarily depended on one person, Dr. Lew Rossman at the USEPA [4].

Page 27: Enterprise and Infrastructure Resilience Conference

Case Study: EPANET

• Case Study | EPANET

– Academicians and researchers have modified and extended the source code algorithms.

– Commercial vendors integrated and enhanced the core software codes within their products.

– Open source software packages remain fragmented with respect to development.

– Research code bases have failed to coordinate the significant code development efforts.

– Reference

• https://ascelibrary.org/doi/10.1061/(ASCE)WR.1943-5452.0000918

Page 28: Enterprise and Infrastructure Resilience Conference

Case Study: ORD QA Track

• Case Study | ORD QA Track

– Python | Django Web App

– Active Project

• G-STD-0019873 | 12 FEB 2014

• QAPP G-STD-0019873-QP-1-3

• Lessons Learned

– Push code on a daily basis.

• At Christmas 2016, the lead P4 developer passed away at age 49 with 8 weeks of work locked on an EPA laptop.

• Potential loss to the Agency: $48,000 (8 hr/day × 5 days/week × 8 weeks × $150/hr).

• Work was performed on EPA equipment; the code was recovered after several weeks, at cost to the Agency.
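
The potential-loss figure on this slide can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers stated above; the constant and function names are made up for illustration.

```python
HOURS_PER_DAY = 8
DAYS_PER_WEEK = 5
WEEKS_OF_WORK = 8
RATE_USD_PER_HOUR = 150


def potential_loss_usd(hours_per_day, days_per_week, weeks, rate_per_hour):
    """Return the labor cost of work that exists only on one machine."""
    total_hours = hours_per_day * days_per_week * weeks
    return total_hours * rate_per_hour


loss = potential_loss_usd(
    HOURS_PER_DAY, DAYS_PER_WEEK, WEEKS_OF_WORK, RATE_USD_PER_HOUR
)
print(f"${loss:,}")  # $48,000
```

The same calculation shows the effect of the lesson: if code is pushed daily, at most one day of work (8 hours, or $1,200 at this rate) is exposed at any time.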

Page 29: Enterprise and Infrastructure Resilience Conference

Case Study: ORD QA Track

• Case Study | QATRACK

– Python/Django/PostgreSQL web app for managing all EPA/ORD QA/QC scientific documentation.

– Project started in the Sustainable Technology Division in 2013, was adopted by NRMRL in 2015, and was rolled out to all of ORD in 2016.

– Contract support of P4 computer programmers at roughly $150/hr.

– SLCM has been in place since inception; however, it is a continual struggle to have developers push code to the repository daily.

Page 30: Enterprise and Infrastructure Resilience Conference

Case Study: ORD QA Track

QA Track: The app is maintained by assigned Django administrators; however, any code added via the command line or an IDE can break the tool if it is not properly committed to the repository.
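
A simple guard against this is to check a working copy for changes that have not been committed. The sketch below is one possible approach, not the procedure used for QA Track: it runs the standard `git status --porcelain` command and lists the affected paths. The function names are hypothetical, and the parsing is deliberately basic (a renamed file appears as "old -> new").

```python
import subprocess


def changed_paths(porcelain_output: str) -> list[str]:
    """Extract file paths from `git status --porcelain` output."""
    paths = []
    for line in porcelain_output.splitlines():
        if line.strip():
            # Each line is a two-character status code, a space, then the path.
            paths.append(line[3:])
    return paths


def uncommitted_paths(repo_dir: str) -> list[str]:
    """Return paths in repo_dir with changes that are not yet committed."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return changed_paths(result.stdout)
```

Calling `uncommitted_paths` on a server checkout and getting a non-empty list would indicate edits that still need to be committed and pushed. It does not detect commits that exist locally but have not been pushed.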

Page 31: Enterprise and Infrastructure Resilience Conference

Case Study: GEMM

• Case Study | GEMM

– Python | Django Web App

– Active Project

• G-LMMD-0031134 | 29 MAR 2015

• QAPP G-LMMD-0031134-QP-1-0

• Lessons Learned

– The Django admin function in web apps helps keep researchers who may not be P3/P4 computer engineers/programmers involved in the code…

• Downside: changes made via Django admin must be pulled from the server and pushed to the repo…

Page 32: Enterprise and Infrastructure Resilience Conference

Case Study: GEMM

• Lessons Learned cont.

– Wikis are encouraged; in LMMD they are required.

– Wikis should include specific “Build & Run” guidance, Testing, Collaborations (especially for extramural projects), and License information.

– Minimum recommended Wiki pages are:

• Home page,

• Build & Run page,

• Contributing page,

• Testing page, and

• License page.
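
The minimum page list above lends itself to an automated check. The sketch below compares a wiki's page titles against that list; how the titles are collected is left open, and the function name and exact title spellings are assumptions made for illustration.

```python
REQUIRED_WIKI_PAGES = ("Home", "Build & Run", "Contributing", "Testing", "License")


def missing_wiki_pages(existing_pages):
    """Return the required pages absent from the wiki, in checklist order."""
    existing = {title.strip().lower() for title in existing_pages}
    return [page for page in REQUIRED_WIKI_PAGES if page.lower() not in existing]


print(missing_wiki_pages(["Home", "Build & Run", "Testing"]))
# ['Contributing', 'License']
```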

Page 33: Enterprise and Infrastructure Resilience Conference

Case Study: GEMM

• “Green Engineering Materials Management” (GEMM) Lifecycle Assessment (LCA) Tool

– https://ordgemm.epa.gov/

– https://gemm.engineering4sustainability.com/

– https://github.com/USEPA/E4S

Page 34: Enterprise and Infrastructure Resilience Conference

Additional LMMD SW

• SciNote Electronic Laboratory & Field Notebook.

– https://ordscinote.epa.gov/.

• Ground Water Seepage Calculator (GWSC).

– https://groundwatercalculator.epa.gov/.

– https://groundwatercalculator.engineering4sustainability.com/.

• Gauging Reaction Effectiveness for the ENvironmental Sustainability of Chemistries with a multi-Objective Process Evaluator (GREENSCOPE) tool.

– https://greenscope.epa.gov/.

– https://greenscope.engineering4sustainability.com/.

• These apps follow same SLCM guidelines…

Page 35: Enterprise and Infrastructure Resilience Conference

• Summary

• Questions?

Page 36: Enterprise and Infrastructure Resilience Conference

References

1. GitHub Guidance. https://www.epa.gov/webguide/github-guidance.

2. Interim Open Source Software (OSS) Policy, Directive No.: CIO 2125.0, CIO Approval: 1-11-2018, Transmittal No.: 18-001. https://www.epa.gov/open/interim-open-source-software-oss-policy.

3. System Life Cycle Management (SLCM) Procedure. CIO 2121-P-03.0. Issued by the EPA Chief Information Officer, Pursuant to Delegation 1-19, dated 07/07/2005. https://www.epa.gov/sites/production/files/documents/cio_2121-p-03.0.pdf.

4. Uber, J. G., Boccelli, D. L., Hatchett, S., Kapelan, Z., Saldarriaga, J., Simpson, A. R., and Van Zyl, J. E. “Let’s Get Moving and Write Software: An Open Source Project for EPANET.” Journal of Water Resources Planning and Management, Volume 144, Issue 4, April 2018. Retrieved July 22, 2019, from https://ascelibrary.org/doi/10.1061/(ASCE)WR.1943-5452.0000918.

5. Swebokwiki.org. (2019). Chapter 5: Software Maintenance - SWEBOK. [online] Available at: http://swebokwiki.org/Chapter_5:_Software_Maintenance [Accessed 24 Jul. 2019].
