
Page 1:

Testing a Strategic Evaluation Framework for Incrementally Building Evaluation Capacity in a Federal R&D Program

27th Annual Conference of the American Evaluation Association

Washington, DC, October 17, 2013

JOHN TUNNA
Director, Office of Research and Development
Office of Railroad Policy and Development, Federal Railroad Administration

Page 2:

Federal Railroad Administration (FRA) Evaluation Implementation Plan

• Introduction
  – R&D Evaluation Mandate
  – R&D Evaluation Goals
  – R&D Evaluation Standards

• Uses of Evaluation
  – Formative
  – Summative

• Types of Evaluation (CIPP Evaluation Model)
  – Context
  – Input
  – Implementation
  – Impact

• Evaluation Framework & Key Evaluation Questions
• Start-up Pilot Evaluations
• Institutionalizing and Mainstreaming Evaluation
  – Metaevaluation
  – The Evaluation Manual
    • Evaluation templates
    • Attestation of standards

Page 3:

R&D Evaluation Mandate

• Congressional Mandates
  – Government Performance and Results Act (GPRA, 1993)
  – Program Assessment Rating Tool (PART, 2002)
  – GPRA Modernization Act of 2010

• OMB Memos
  – M-13-17, July 26, 2013: Next Steps in the Evidence and Innovation Agenda
  – M-13-16, July 26, 2013: Science and Technology Priorities for the FY 2015 Budget
  – M-10-32, July 29, 2010: Evaluating Programs for Efficacy and Cost-Efficiency
  – M-10-01, October 7, 2009: Increased Emphasis on Program Evaluations
  – M-09-27, August 8, 2009: Science and Technology Priorities for the FY 2011 Budget

• Federal Evaluation Working Group
  – Reconvened in 2012 to help build evaluation capacity across the federal government
  – “[We] need to use evidence and rigorous evaluation in budget, management, and policy decisions to make government work effectively.”

• GAO Reports
  – Program Evaluation: Strategies to Facilitate Agencies’ Use of Evaluation in Program Management and Policy Making (June 2013)
  – Program Evaluation: A Variety of Rigorous Methods Can Help Identify Effective Interventions (GAO-10-30, November 2009)
  – Program Evaluation: Experienced Agencies Follow a Similar Model for Prioritizing Research (GAO-11-176, January 2011)

Page 4:

R&D Evaluation Mandate
OMB Memo M-13-16 (July 26, 2013)
Subject: Science and Technology Priorities for the FY 2015 Budget

“Agencies . . . should give priority to R&D that strengthens the scientific basis for decision-making in their mission areas, including but not limited to health, safety, and environmental impacts. This includes efforts to enhance the accessibility and usefulness of data and tools for decision support, as well as research in the social and behavioral sciences to support evidence-based policy and effective policy implementation.”

“Agencies should work with their OMB contacts to agree on a format within their 2015 Budget submissions to: (1) explain agency progress in using evidence and (2) present their plans to build new knowledge of what works and is cost-effective.”

Page 5:

R&D Evaluation Goals

• Meet R&D accountability requirements
• Guide and strengthen Division R&D program effectiveness and impact
• Facilitate knowledge diffusion and technology transfer
• Build R&D evaluation capacity
• Improve railroad safety

Page 6:

Why Evaluation in R&D? Assessing the logic of R&D Programs

[Logic model: Activities → Outputs → Outcomes → Impacts]
• Activities: funded activity “family” (scientific research; technology development)
• Outputs: deliverables/products (technical report(s); forecasting model(s))
• Outcomes: application of research (data use; adoption of guidelines, standards, or regulations; changing practices; emergent outcomes)
• Impacts: reduced accidents and injuries; negative environmental effects; positive knowledge gains

Page 7:

The Research-Evaluation Paradigm

Primary Purpose:
  – Research: contribute to knowledge; improve understanding
  – Evaluation: program improvement; decision-making

Primary Audience:
  – Research: scholars; researchers; academicians
  – Evaluation: program funders; administrators; decision makers

Types of Questions:
  – Research: hypotheses; theory driven; preordinate
  – Evaluation: practical; applied; open-ended, flexible

Sources of Data:
  – Research: surveys; tests; experiments; preordinate
  – Evaluation: interviews; field observations; documents; mixed sources; open-ended, flexible

Criteria:
  – Research: validity; reliability; generalizability
  – Evaluation: utility; feasibility; propriety; accuracy; accountability

Page 8:

Program Evaluation Standards: Guiding Principles for Conducting Evaluations

• Utility (useful): to ensure evaluations serve the information needs of the intended users.

• Feasibility (practical): to ensure evaluations are realistic, prudent, diplomatic, and frugal.

• Propriety (ethical): to ensure evaluations will be conducted legally, ethically, and with due regard for the welfare of those involved in the evaluation, as well as those affected by its results.

• Accuracy (valid): to ensure that an evaluation will reveal and convey valid and reliable information about all important features of the subject program.

• Accountability (professional): to ensure that those responsible for conducting the evaluation document and make available for inspection all aspects of the evaluation that are needed for independent assessments of its utility, feasibility, propriety, accuracy, and accountability.

Note: The Program Evaluation Standards were developed by the Joint Committee on Standards for Educational Evaluation and have been approved by the American National Standards Institute (ANSI).

Page 9:

Types of Evaluation

CIPP Evaluation Model: (Context, Input, Process, Product)
• Context
• Input
• Implementation
• Impact

This framework is Daniel L. Stufflebeam's adaptation of his CIPP Evaluation Model for guiding program evaluations in the Federal Railroad Administration's Office of Research and Development. For additional information, see Stufflebeam, D.L. (2000). The CIPP model for evaluation. In D.L. Stufflebeam, G.F. Madaus, & T. Kellaghan (Eds.), Evaluation models (2nd ed., Chapter 16). Boston: Kluwer Academic Publishers.

Stakeholder engagement is key.

Page 10:

Evaluation Framework: Roles and Types of Evaluation

Formative Evaluation (proactive)
  – Context: identifies needs, problems, and assets; helps set goals and priorities
  – Inputs: assesses alternative approaches; develops program plans, designs, and budgets
  – Implementation: monitors, documents, and guides execution
  – Impact: assesses positive and negative outcomes; reassesses project and program plans; informs policy development and strategic planning

Summative Evaluation (retroactive)
  – Context: assesses original program goals and priorities
  – Inputs: assesses original procedural plans and budget
  – Implementation: assesses execution
  – Impact: assesses outcomes, impacts, side effects, and cost-effectiveness

Page 11:


Evaluation Framework: Key Evaluation Questions – Safety Culture

Formative Evaluation
  – Context: What are the highest priority needs to improve safety culture in the U.S. rail industry?
  – Inputs: What are the most promising alternatives for safety culture interventions (BBS, ISROP, Rules Revision, Close Calls, etc.)? How do they compare (potential success, costs, etc.)? How can these interventions be most effectively implemented? What are some potential barriers to implementation?
  – Implementation: To what extent do safety culture interventions proceed on time, within budget, and effectively? If needed, how can the intervention design be improved?
  – Impact: How can safety culture interventions be implemented to maximize effectiveness? What indicators of impact or use, if any, have emerged to show that these interventions are being adopted more broadly? What are some emerging outcomes (positive or negative)? How can the implementations be modified to minimize costs and maximize effectiveness?

Summative Evaluation
  – Context: To what extent did this intervention address the high-priority safety need?
  – Inputs: What intervention strategy was chosen, and why was it chosen over other viable strategies (prospects for success, feasibility, costs)?
  – Implementation: To what extent was the intervention carried out as planned, or modified with an improved plan?
  – Impact: To what extent did these interventions improve safety and safety culture? Were there any unanticipated negative or positive side effects? What conclusions and lessons learned can be reached (e.g., cost-effectiveness, stakeholder engagement, program effectiveness)?

Page 12:

Evaluation as a Key Strategy Tool

• Ask questions that matter about processes, products, programs, policies, and impacts. Then develop appropriate and rigorous methods to answer them.

• Measure the extent to which, and the ways in which, program goals are being met. What's working, and why or why not?

• Use findings to refine program strategy, design, and implementation. Inform others about lessons learned, progress, and program impacts.

• Improve the likelihood of success with:
  – Intended users
  – Intended uses
  – Outcomes and impacts
  – Unanticipated (positive) outcomes

• Use evaluation to develop appropriate and useful performance measures for reporting R&D outcomes and for monitoring those outcomes for continuous improvement.

Page 13:

Michael Coplen
Senior Evaluator
Office of Research & Development
Federal Railroad Administration
[email protected]

Page 14:

QUESTIONS?


Page 15:

Supplemental Information


Page 16:

Evaluation Framework: Illustrative Questions – Fatigue Website

Formative Evaluation
  – Context: What are the highest priority needs for sleep health and safety in the railroad industry?
  – Inputs: Given the need for sleep health education and training, what are the most promising alternatives (fatigue website, regulations, etc.)? How do they compare (potential success, costs, etc.)? How can this strategy be most effectively implemented? What are some potential barriers to implementation?
  – Implementation: To what extent is the website project proceeding on time, within budget, and effectively? If needed, how can the design be improved?
  – Impact: To what extent are people using the website? What other indicators of use, if any, have emerged that indicate the website is being accessed and the information is being acted upon? What are some emerging outcomes (positive or negative)? How can the implementation be modified to maintain and measure success?

Summative Evaluation
  – Context: To what extent did the fatigue website address this high-priority need?
  – Inputs: What strategy was chosen, and why, compared to other viable strategies (prospects for success, feasibility, costs)?
  – Implementation: To what extent was the website project carried out as planned, or modified with an improved plan?
  – Impact: To what extent did this project effectively address the need to educate railroad employees on sleep health and safety? Were there any unanticipated negative or positive side effects? What conclusions and lessons learned can be reached (e.g., cost-effectiveness, stakeholder engagement, program effectiveness)?

Page 17:

Input Evaluation: Program Design and Partnership Commitment to Change
Clear Signal for Action (CSA) Theory of Change

[Theory-of-change diagram]
• Safety culture: values, management, attitudes, competencies, and patterns of behavior, which shape at-risk conditions, at-risk behaviors, and incidents
• Intervention (Management & Labor) cycle: establish steering committee (management); develop checklist (steering committee); observer training (steering committee, observers); data gathering & feedback (observers); data analysis & corrective action planning (steering committee, CA team); corrective actions (CA team where workers don't have control; steering committee where workers have control)

Page 18:

Implementation Evaluation

[Diagram]
• Continuous Improvement (CI)
• Safety Leadership Development (SLD)
• Peer-to-Peer Feedback
• Safety Outcomes

Page 19:

Impact Evaluation: Expected changes and possible metrics (Union Pacific example)

[Flow diagram: Implementation → First Order Impacts → Second Order Impacts → Third Order Impacts]
• Implementation (S.T.E.E.L. activities): steering committee training; checklist development; sampler training; coaching; communications; feedback; data analysis; sampling; barrier identification; barrier removal; leadership training
• Impact areas: general employee practices; S.T.E.E.L.-targeted employee practices; culture; reactions to problems; corporate results; employee well-being; incidents
• Possible metrics: attitude toward safety; safe behaviors; safety culture; labor-management relations; personal sense of control/responsibility; equipment control; close calls; personal injuries; derailments; collisions; rule compliance; job satisfaction; safety hotline; health; stress; liability; incident costs; productivity; public image; discipline; FTX results; investigations; decertifications; management practices; communication quality, amount, and consistency; safety-enabling leadership behaviors; awareness; employee involvement in S.T.E.E.L.
• Other influences include: corporate policy changes; FRA practices