Department of Defense Experimentation Guidebook


Office of the Under Secretary of Defense for Research and Engineering

Prototypes and Experiments

October 2021

(Version 2.0)

DISTRIBUTION STATEMENT A. Approved for public release: distribution unlimited.


Record of Changes

DATE        VERSION  CHANGE DESCRIPTION                                                             SECTION
08/01/2019  1.0      Original document                                                              All
10/08/2021  2.0      Grammar and readability edits                                                  All
10/08/2021  2.0      Removed references to outdated policy documents                                2.0, 4.3.5, 5.1.1, 5.2.1, 6.0
10/08/2021  2.0      Updated name of Federal Business Opportunities to System for Award Management  5.3


Table of Contents

1 Foreword
2 Introduction
3 Purpose and Scope
4 Experimentation Basics
    4.1 Experimentation Fundamentals
    4.2 Types of Experiments
    4.3 Why Experiment in the DoD?
    4.4 Differentiating Experimentation from Prototyping, Testing, and Demonstration
    4.5 Experimentation Methods
    4.6 Cultural Implications for Experimentation
5 Experimentation Activities
    5.1 Formulating Experiments
    5.2 Planning Experiments
    5.3 Soliciting Proposed Solutions for Experiments
    5.4 Selecting Potential Solutions for Experiments
    5.5 Preparing For and Conducting Experiments
    5.6 Data Analysis and Interpretation
    5.7 Results of Experimentation
6 Summary
Appendix 1: Acronyms
Appendix 2: Definitions
Appendix 3: References

Table of Tables

Table 1: Primary Scenario Factors
Table 2: Risks Common to Experimentation
Table 3: Examples of Selection Criteria for Navy's TnTE2 Methodology


1 Foreword

Military Departments and Defense Agencies have long used experimentation to support

innovation and develop solutions to vexing military problems. Many of these organizations and

their subject matter experts (SME) have developed processes, methods, and tools that help them

succeed in their efforts. The Office of the Under Secretary of Defense for Research and

Engineering (OUSD(R&E)), Prototypes and Experiments (P&E), was tasked to capture and

consolidate these approaches, best practices, and recommendations into a single reference

document for the Department of Defense (DoD). This guidebook is designed to complement

DoD, Military Service, and Defense Agency policy pertaining to experimentation, providing the

reader with discretionary best practices that should be tailored to the circumstances of each

experiment. It is a living document and will be updated periodically to ensure that direction

captured from governing documents is current and that best practices are fresh.

To draft this guidebook, P&E conducted an extensive literature review, gleaning information from

legal, congressional, academic, and regulatory documents and reports. This information was

refined through interviews with research and acquisition professionals across the DoD who

provided insights into proven experimentation programs and processes and who documented best

practices and lessons learned from previous defense experimentation efforts. This approach to

developing the guidebook resulted in a product with broad applicability to the defense

experimentation community.

2 Introduction

United States technological superiority has sustained U.S. military dominance for over 70 years.

However, the explosion of technological gains in the defense and commercial industries over the past several decades and their broad availability to both nation-states and non-state actors has

resulted in a dramatic increase in the technical prowess of U.S. adversaries and the erosion of the U.S. competitive military advantage.1 This erosion is exacerbated by the rate at which U.S.

adversaries are making these technological advances. Furthermore, U.S. adversaries are not just

embracing advanced technology; they are also studying U.S. strategies and tactics and are rapidly innovating novel applications of new and existing technologies to maximize their effectiveness

against those strategies and tactics.

A new (or renewed) approach to capability development is required—one that uses

experimentation, rapid concept exploration, and prototyping to integrate materiel and non-

materiel solutions in ways that most effectively address warfighter capability gaps. This

guidebook explores the topic of DoD experimentation and its role in rapid capability

development.

In general terms, experimentation answers the question, “If I do this, what will happen?” Defense experimentation extends that question to the military domain, providing decision makers with the information they need to make good decisions. Defense experiments give technologists and warfighters opportunities to evaluate potential solutions to existing or emerging warfighter capability gaps and to probe how technology development and concept exploration can be integrated to maximize their synergies. Experimentation also enables rapid evaluation of a military problem, increasing the speed at which knowledge and understanding are gained and decisions can be made. According to the Defense Science Board (DSB), “experimentation fuels the discovery and creation of knowledge and leads to the development and improvement of products, processes, systems, and organizations.”2

1 Mattis, Summary of the 2018 National Defense Strategy of the United States of America,

True experimentation must embrace risk. In fact, experiments that result in the greatest benefits

are often accompanied by substantial risk. These high-risk experiments give the Department the

greatest opportunity to find transformative solutions to fill capability gaps and meet warfighter

needs. Historically, however, DoD’s risk-averse acquisition culture contributed to the failure of

past experimentation efforts. In 2013, the DSB observed that “experimentation in the Department

[became] synonymous with scripted demonstrations, testing, and training in an environment and

culture that is arguably much more risk-averse today than it was just 20 years ago.”3

The environment in DoD has slowly begun to change in recent years, however, evolving into one

that increasingly fosters innovation and risk taking and that promotes exploration through experimentation and prototyping. Congress recognized the need to further encourage this

evolution, stating in the Fiscal Year (FY) 2017 National Defense Authorization Act (NDAA) Conference Report that they “expect that the [USD(R&E)] would take risks, press the technology

envelope, test and experiment, and have the latitude to fail, as appropriate.”4 This type of

risk-tolerant experimentation is necessary for DoD, and it is key to restoring the U.S. defense technology overmatch.

3 Purpose and Scope

This guidebook contains overarching guidance on the application of experimentation in DoD. It

provides a basic introduction to experimentation and details on specific defense experimentation

activities. This guidebook is primarily intended to be used by DoD personnel who plan to use

experimentation to explore solutions to existing and emerging military capability problems. It is

also intended to be used as an introductory and reference document by staff officers and senior

leaders seeking to increase their knowledge of experimentation.

This guidebook is not policy, nor is it intended to be directive in nature. It does not

supersede DoD, Military Service, or Defense Agency policy pertaining to acquisitions or

experimentation. It is not a substitute for Defense Acquisition University (DAU) training

and it does not describe every activity necessary to be effective.

2 Defense Science Board, The Defense Science Board Report on Technology and Innovation Enablers for Superiority in 2030 (Washington DC: Department of Defense, 2013), 85, https://www.acq.osd.mil/DSB/reports/2010s/DSB2030.pdf.

3 Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 78.

4 U.S. Congress, House, Conference Report: National Defense Authorization Act for Fiscal Year 2017, S. 2943, 114th Cong., 2d sess. (2016), 1130, https://www.congress.gov/114/crpt/hrpt840/CRPT-114hrpt840.pdf.

4 Experimentation Basics

So, what is “experimentation?” In its purest sense, experimentation is the application of the

scientific method (the processes used since the 17th century to explore natural science) to determine cause-and-effect relationships—manipulating one or more inputs, recording the effects on an output while controlling the environment and other potential influencers, and analyzing the data to validate the relationships.5 People conduct experiments all the time, sometimes formally (like a child in science class who enthusiastically watches to see what happens when baking soda and vinegar are combined), but more often informally in our day-to-day lives (e.g., if I take another route to work that is longer, but avoids traffic lights, will I make it to work more quickly?).

Defense experimentation is the extension of this type of thinking and activity into the military

domain. From the beginning of warfare, militaries have experimented with capabilities and

concepts to develop and identify better ways of conducting war and solving warfighting

capability gaps (e.g., using gunpowder to propel projectiles, using airpower to sink ships, and

targeting terrorists using armed drones). To ensure clarity regarding the term, this guidebook

will use the following definition for defense experimentation:

Defense Experimentation: Testing a hypothesis, under measured conditions, to explore

unknown effects of manipulating proposed warfighting concepts, technologies, or

conditions.

Experimentation is not an end in itself, nor is it a research, acquisition, or doctrine development process. Instead, experimentation is a tool that can be used in any of those processes to explore unknown relationships and outcomes that result from new disruptive technologies and concepts, new applications of existing capabilities, or emerging threats.6

4.1 Experimentation Fundamentals.

Before exploring defense experimentation further, it is important to explain some fundamental

principles regarding classic experimentation. Classic experiments are built around a hypothesis

that clearly states the proposed causal relationship, typically in an if-then statement. For

example, a hypothesis might read:

If a Hellfire missile is mounted on and fired from a reconnaissance drone,

then the kill-chain will be shortened.

This hypothesis is composed of an independent variable in the “If” statement (“a Hellfire missile is mounted on and fired from a reconnaissance drone”) and a dependent variable in the “Then” statement (“the kill-chain will be shortened”). In addition to the independent and dependent variables, there are intervening variables that affect the relationship between them. Examples might include participants’ level of training, skill of the pilots, and weather. Experiments then manipulate the independent variable to see whether and how the dependent variable is affected. Classic experiments are conducted systematically, following scenarios under tightly controlled conditions, to increase confidence that the relationship is valid. In ideal experiments, only one independent variable is manipulated at a time, while all intervening variables are controlled.7

5 The Technical Cooperation Program, Pocketbook Version of GUIDEx (Slim-Ex): Guide for Understanding and Implementing Defense Experimentation (Ottawa, Canada: Canadian Forces Experimentation Centre, 2006), 5-6, https://www.acq.osd.mil/ttcp/guidance/documents/GUIDExPocketbookMar2006.pdf.

6 Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 79-80.
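The parts of a classic hypothesis can be written down explicitly before an experiment is designed. The following is a minimal Python sketch, not a DoD tool: the class name Hypothesis, the function mean_change, and the timing values are all assumptions introduced here for illustration. It records the independent, dependent, and intervening variables and compares the measured outcome with and without the manipulation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """An if-then hypothesis split into the parts named in the text (hypothetical helper)."""
    independent: str                   # the "if" condition that is manipulated
    dependent: str                     # the "then" outcome that is measured
    intervening: Tuple[str, ...] = ()  # factors to hold constant across trials

    def statement(self) -> str:
        return f"If {self.independent}, then {self.dependent}."


def mean_change(baseline: Sequence[float], treatment: Sequence[float]) -> float:
    """Change in the dependent variable's mean when the independent variable is applied."""
    return mean(treatment) - mean(baseline)


if __name__ == "__main__":
    h = Hypothesis(
        independent="a Hellfire missile is mounted on and fired from a reconnaissance drone",
        dependent="the kill-chain will be shortened",
        intervening=("participant training", "pilot skill", "weather"),
    )
    print(h.statement())
    # Placeholder timings in minutes, invented purely for illustration.
    print(mean_change(baseline=[30.0, 34.0, 32.0], treatment=[21.0, 25.0, 23.0]))
```

A negative result from mean_change would be consistent with a shortened kill-chain, but only if the listed intervening variables were held constant across both sets of trials.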

It is important at this point to introduce four experimentation criteria—validity, reliability,

precision, and credibility. Validity addresses how well the experiment measures what it intends to

measure.8 Reliability pertains to the objectivity of the experiment and whether the same values

would be measured for the same observations every time. Precision addresses whether or not

instrumentation is calibrated to tolerances that enable detection of meaningful differences or

changes. Credibility pertains to whether or not the measures are understood and respected. With

the trades and assumptions that naturally need to be made during an experiment, it is the

responsibility of the experimentation team to ensure that the experiment is designed to most

effectively balance validity, reliability, precision, and credibility in order for the experiment to be

as useful as possible within the known limitations and constraints.

4.2 Types of Experiments.

The way practitioners categorize experiments depends on the prisms through which they view the

subject. Often the prisms reflect the environment or the specific disciplines within which the

experimenters operate. Even within a specific discipline, several categorizations may exist.

Defense experimentation is no different. For example, some experimenters group defense

experiments according to whether they assess materiel solutions (e.g., emerging technology) or

non-materiel solutions (e.g., a transformational concept, doctrine, concepts of operations

(CONOPS), etc.). Others categorize experiments by the level of realism inherent in the

experiment (i.e., technological experiments conducted in a controlled setting versus operational

experiments that are typically conducted in the field). One of the more prominent categorizations

of defense experiments found in literature addresses the maturity of the solution being assessed—

discovery experiments, hypothesis-testing experiments, and demonstration experiments.9

Regardless of how an experiment is categorized, the fundamental activities associated with

conducting an experiment are typically consistent across all types of experiments. As a result,

rather than attempt to address each of these different types of experiments individually, the

guidebook describes the activities common to most (if not all) types of experiments.

4.2.1 Classic Experimentation vs. Free Play.

It is, however, important to note the difference between classic experimentation, as described in Section 4.1, and free play experiments as often conducted by DoD organizations. While the high level of scientific rigor associated with classic experimentation enhances the validity and reliability of the experiment, it requires significant control of the experiment variables, participants, and environment, increasing the time and cost of experimentation. Instead, DoD experimenters will often introduce new technologies, new applications of existing systems, and new concepts at experimentation events and during exercises where operators can use them to simply explore the question “what happens if I do ‘X’?” This more informal approach to experimentation, referred to as free play, affords experimenters significant flexibility in designing experiments, in contrast to a rigid scenario-based activity. The experimentation team must weigh the pros and cons of enhanced scientific rigor against the experiment’s objective and determine the appropriate balance between free play and scientific rigor in the experiment’s design. While free play experiments are typically less formal than classic experimentation, experimenters should still document their hypotheses before designing and planning the experiment, both to ensure the experiment tests what is intended and so that analysis of the results can clearly determine the validity of the hypotheses.

7 David S. Alberts and Richard E. Hayes, Code of Best Practice: Experimentation (Washington, DC: Command and Control Research Program, 2002), 142, http://dodccrp.org/files/Alberts_Experimentation.pdf.

8 Validity comes in two forms: internal validity and external validity. Internal validity suggests that the experiment has been designed and conducted in a way that ensures that no alternative explanations exist for the experiment results. External validity suggests that the results of the experiment can be generalized to other environments. In defense experiments, external validity relates to the operational realism of the experiment and whether the results can be generalized to the combat environment.

9 Alberts, Code of Best Practice: Experimentation, 4.

4.3 Why Experiment in the DoD?

The ultimate purpose for all experimentation is to enrich the understanding of a particular issue or

domain, providing knowledge to better inform decision makers. At the end of each experiment,

experimenters should be able to answer the questions that compelled the experiment, identify

additional information necessary for further research on the topic, and provide decision makers

with the information they need to make decisions. Defense experimentation includes the

additional purpose of accelerating the development and deployment of concepts and capabilities to

the warfighter. The following sections comprise a non-exhaustive list of purposes and benefits

associated with defense experimentation.

4.3.1 Identify and Refine Capability Gaps and Requirements.

Defense experimentation can be used to identify and help clarify current and future warfighting

problems. Bringing together operators, intelligence experts, and technologists to discuss and

explore current and future warfighting environments and the impact of existing and emerging

technologies enables the development, refinement, prioritization, and validation of capability

gaps and requirements. Independent teams that imitate anticipated adversary actions and

responses, known as red teams, can also be used in experiments to identify how adversaries

might use emerging technologies to create new threats or modify existing threats. Results from

these types of experiments can be used in the Capabilities Based Assessment process to help

guide development of alternative materiel and non-materiel solutions.

4.3.2 Explore Innovative Technology Solutions.

Experimentation can be used to explore, identify, and enhance technological solutions that

address capability gaps and requirements and identify opportunities that emerging technologies

afford. These solutions may be developed in DoD and National laboratories or by commercial

innovators (many of whom DoD may not otherwise have access to). Experiments are often used

to facilitate the exploration of numerous potential emerging technology solutions by warfighters

in order to identify the most promising solutions to pursue and their associated technical and

integration risks. Experiments are also used to explore new ways of applying existing

technologies to obtain a military advantage. Just as important, experiments can also help decision

makers identify innovations or current research and development (R&D) efforts that will not close

a capability gap, enabling them to terminate the efforts before they become programs of record

(PoR) and redirect funding to more promising solutions.


4.3.3 Explore Non-Materiel Solutions.

Experimentation can also be used to investigate the full range of possible non-materiel

innovations across the Doctrine, Organization, Training, Materiel, Leadership and Education,

Personnel, Facilities, and Policy (DOTMLPF-P) spectrum. These experiments help warfighters

investigate the impact of changes to organizational structure; CONOPS; tactics, techniques, and

procedures; training; etc. before operationalizing the changes. Larger, more complex

transformational concepts can also be developed, explored, and refined through experimentation.

4.3.4 Evaluate Operational Value.

Experiments that place capabilities in the hands of the warfighter in an operationally realistic

environment enable operators and technologists to explore the operational utility and limitations

of the capabilities, sometimes facilitating discovery of unexpected applications in an operational

environment.

4.3.5 Rapidly Learn.

Experimentation can be used to enable programs to take advantage of the “fail fast/fail cheap”

philosophy (referred to by some in the Department as “learn fast/learn cheap”). This philosophy

seeks to use the simplest and least expensive representative model possible (rather than an

expensive final development article) to quickly determine the value of a concept or technology

solution through incremental development and evaluation. When the experiment reveals

something isn’t working as expected or desired (i.e., a “failure”), the concept or technology can

either be modified or reevaluated, or decision makers can pivot to a different approach. The faster

the solution “fails,” the faster learning can occur, and the faster decisions can be made regarding

the next appropriate step in the development or innovation process.

4.3.6 Strengthen and Expand the Technology Base.

Defense experiments are also used to reach nontraditional defense contractors that might

otherwise have little interest in working through the arduous federal acquisition and contracting

process. These nontraditional partners are often the sources of disruptive innovation critical to the

U.S. military. Experiments allow both industry and operators to understand how novel

technologies can provide value to operations (and in some cases direction on how the technology

needs to be evolved) in a far less obstructive environment.

4.4 Differentiating Experimentation from Prototyping, Testing, and Demonstration.

To help improve communication and reduce misunderstanding, this section explains how this

guidebook differentiates DoD experimentation from prototyping, testing, and demonstration.

4.4.1 Experimentation vs. Prototyping.

Among DoD personnel, the terms experimentation and prototyping are often mentioned in the

same sentence and sometimes used synonymously. The terms, however, are quite different.

Experimentation is used to address uncertainty when analysis is insufficient to draw

conclusions. Experimentation focuses on developing and evaluating a hypothesis to determine

if a causal relationship exists between two variables (i.e., to answer the question, “Does ‘A’

cause ‘B’?”). It also applies to informal experiments in which the question asked may be as

simple as: “What happens if I do X?” In contrast, for purposes of this guidebook, prototyping

has two meanings. First, prototyping is the act of designing and creating a representative

model for use in tests, experiments or demonstrations. For instance, X-planes were prototypes

that were used to conduct experiments in supersonic flight, variable geometry aerostructures,


etc. In this context, the prototypes developed are used to inform decisions and answer a broad

spectrum of questions (e.g., Are the requirements technically feasible? Can the end item be

manufactured affordably? Is the CONOPS valid?). The second meaning of prototyping

pertains to actions typically taken prior to mass producing a solution. When an experiment

identifies a promising design, prototyping develops and evaluates a representation of that

design to ensure it fully satisfies the need.

4.4.2 Experimentation vs. Testing.

Experimentation and testing are also closely linked and actually follow many of the same

processes. The key difference between experimentation and testing is that experiments typically

seek out “unknowns” in an attempt to uncover knowledge and confirm a cause-and-effect

relationship between variables. Experiments also often seek to identify and characterize

performance limitations in order to determine the point at which an item will fail. Testing, on the

other hand, verifies and validates that a capability meets user-defined requirements to

successfully accomplish a mission or mission thread, usually using pass-fail criteria.

4.4.3 Experimentation vs. Demonstration.

The key difference between experimentation and demonstration is that experimentation increases

knowledge in a specific domain, while demonstrations simply present and confirm what is already

known. Experimentation identifies specific areas of uncertainty and custom-designs and conducts

experiments to address that uncertainty. With demonstrations, however, the uncertainty has

already been resolved; demonstrations simply recreate that knowledge to reveal the relationships

between variables. DoD demonstrations are typically scripted and orchestrated activities that

minimize the risk that the solution demonstrated will fail. They are primarily intended to display a

solution’s military utility in specific operational environments to people unfamiliar with the

technology or concept or to senior leaders responsible for making decisions regarding its

employment, deployment, or acquisition in order to garner support for the technology or concept.

4.5 Experimentation Methods.

When considering experimentation, what often comes to mind are experiments conducted in

laboratories. While some defense experiments do occur in laboratories, they often take place

outside of the lab in a variety of settings using a variety of methods. The following are brief

summaries of several of the most common methods used in defense experimentation.

4.5.1 Workshops.

Workshops bring together a diverse group of warfighters, policy makers, requirements writers,

threat analysts, and technologists to explore threats, technologies, and concepts. They are often

used to identify and refine capability gaps, establish requirements, identify and determine the

feasibility of a new technology, discover and generate concepts, and develop CONOPS.

Workshops can be conducted as informal brainstorming or idea-generation sessions or as structured

deliberations of the merits and weaknesses of the topic being discussed.

4.5.2 Wargames.

Wargames are simulations of warfare where technology, concepts, and CONOPS can be evaluated

without the dangers of military conflict. Wargames seek to enhance the physical and

psychological realism of a military problem to the extent possible by using warfighters as the

players and evaluating their actions using models or rulesets. Wargames are often conducted

using tabletop exercises and/or virtual environments.


4.5.2.1 Tabletop Exercise.

As the name implies, tabletop exercises do not involve fielded forces. They are typically structured wargames in which warfighters, in a room together (or spread across multiple rooms to simulate real communications), work through scenarios to discover and define capability gaps and their boundaries and discuss initial insights into the value of proposed solutions to those gaps across the full DOTMLPF-P spectrum.

4.5.2.2 Virtual Wargames.

Wargames can also be played virtually. Modeling and simulation (M&S) can be used to create

virtual scenarios that simulate the interaction of two or more opposing forces. These simulations

can then be used by the warfighter to evaluate alternative technologies and concepts, refine

concepts, and help design future experiments. Types of simulations are differentiated by the level

of human involvement in the simulation, from no human involvement in constructive simulations to

a great degree of human involvement in human-in-the-loop (HITL) simulations. The following

subsections provide brief summaries of three types of virtual wargames.

4.5.2.2.1 Constructive Simulations.

In constructive simulations, the experiment designer chooses the input parameters of a force-on-

force simulation and initiates the simulation. No human intervention occurs once the simulation

begins. Results are then recorded and analyzed. This type of simulation enables participants to replay the same battle under identical conditions while systematically changing the input

parameters (e.g., different technological solutions), enabling a side-by-side comparison of the parameters.
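The idea of replaying the same battle under identical conditions can be illustrated with a seeded random number generator. The Python sketch below is a toy model written for this illustration only; run_battle, compare_options, and the hit-probability parameter are assumptions, not features of any fielded simulation. Fixing the seed means every replay sees the same sequence of chance events, so differences in the results come only from the input parameter that was changed (an approach sometimes called common random numbers).

```python
import random
from typing import Dict


def run_battle(blue_hit_prob: float, seed: int, engagements: int = 1000) -> int:
    """Toy constructive run: count Blue's successful engagements, with no human input."""
    rng = random.Random(seed)  # same seed -> same sequence of chance events on every replay
    return sum(1 for _ in range(engagements) if rng.random() < blue_hit_prob)


def compare_options(options: Dict[str, float], seed: int = 2021) -> Dict[str, int]:
    """Replay the identical battle once per option, changing only the input parameter."""
    return {name: run_battle(p, seed) for name, p in options.items()}


if __name__ == "__main__":
    # Hypothetical option names and probabilities, chosen only to show the comparison.
    print(compare_options({"current system": 0.50, "candidate technology": 0.65}))
```

Because both options face the same random draws, the candidate with the higher probability can never score lower than the baseline in this toy model, which is what makes the side-by-side comparison clean.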

4.5.2.2.2 Analytic Wargames.

Analytic wargames employ military participants organized in Blue, White, and Red Cells to plan and execute a military operation. In a typical engagement, the Blue Cell provides its course of

action to the White Cell, which communicates that action to the Red Cell. The Red Cell then communicates its counter move to the White Cell, which then runs the simulation using these

inputs. The simulation generates the outcome of the fight. Analytic wargames allow warfighters

to compare the operational values of multiple inputs by enabling the participants to fight the same battle multiple times using different inputs.10
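The message flow in an analytic wargame can be sketched as a small function. This is a simplified Python illustration under stated assumptions: play_engagement, compare_inputs, and the callables passed to them are hypothetical names, and a real White Cell would apply far richer adjudication rules. The sketch shows Blue's course of action being relayed to Red, Red's counter-move being returned, and the simulation being run on both inputs, then repeated for several Blue inputs.

```python
from typing import Any, Callable, Dict


def play_engagement(
    blue_coa: str,
    red_cell: Callable[[str], str],
    simulate: Callable[[str, str], Any],
) -> Any:
    """One engagement, with the White Cell's relay role made explicit."""
    relayed = blue_coa               # White Cell passes Blue's course of action to Red
    red_counter = red_cell(relayed)  # Red Cell replies with its counter-move
    return simulate(blue_coa, red_counter)  # White Cell runs the simulation on both inputs


def compare_inputs(
    blue_options: Dict[str, str],
    red_cell: Callable[[str], str],
    simulate: Callable[[str, str], Any],
) -> Dict[str, Any]:
    """Fight the same battle once per Blue input and collect the outcomes."""
    return {
        name: play_engagement(coa, red_cell, simulate)
        for name, coa in blue_options.items()
    }
```

Keeping the Red Cell and the simulation as interchangeable callables lets the same battle be fought repeatedly while only the Blue input changes.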

4.5.2.2.3 Human-in-the-Loop (HITL) Simulations.

Of all the virtual wargames, HITL simulations are probably the most operationally realistic.

HITL simulations are real-time simulations with a great degree of human-machine interaction in which military participants receive real-time inputs from the simulation, make real-time

decisions, and direct simulated forces or platforms against simulated threat forces. A good example of a HITL simulation is a flight simulator. HITL simulations reflect warfighting

decision-making better than constructive simulations and analytic wargames, but, due to human involvement, they also introduce variability, making significant changes in results more difficult

to detect and cause-and-effect relationships more difficult to determine.11
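The effect of human variability on detectability can be shown with a standardized difference: the change in mean outcome divided by the pooled standard deviation of the trials. The Python sketch below is illustrative only. The function name and all trial values are invented for this example, and the pooled-variance formula is the standard two-sample one, not a method prescribed by this guidebook. The same two-unit improvement appears much smaller relative to the spread once the trials vary more.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def standardized_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Difference in means divided by the pooled standard deviation of both samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Invented trial outcomes with the same two-unit mean change in both cases.
    low_spread = standardized_difference([10, 11, 9, 10], [8, 9, 7, 8])
    high_spread = standardized_difference([10, 14, 6, 10], [8, 12, 4, 8])
    print(abs(low_spread) > abs(high_spread))
```

In practice this is one reason HITL experiments tend to need more trials than constructive runs to reach the same confidence in a result.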

4.5.3 Field Experiments.

Field experiments are the most realistic experimentation method because they can best replicate

real operational environments. Conducted in the anticipated operational environment using

10 The Technical Cooperation Program, Pocketbook Version of GUIDEx (Slim-Ex), 28.

11 Kass, The Logic of Warfighting Experiments, 116.

military personnel and equipment, field experiments best emulate the conditions that

warfighters will likely face in combat. The scope of field experiments varies widely from

small-scale experiments, where operators are invited to simply try out new technologies and

concepts, to large-scale experiments and exercises that emulate a battle scenario.

4.5.3.1 Small-Scale Field Experiments.

Small-scale field experimentation provides warfighters the opportunity to explore the effects of a

proposed technology or concept solution in an operationally representative environment and

confirm whether the capability demonstrates military utility or meets particular performance

objectives. These small-scale experiments enable the warfighter to conduct multiple trials with a

single solution, collaborate with technologists to refine their solutions and observe the effect of the

changes in real-time, and compare the impact of multiple solutions in an operational environment.

Small-scale field experiments often set the stage for participation in a large-scale field experiment.

4.5.3.2 Large-Scale Field Experiments.

Large-scale field experiments, conducted at large experimentation venues or as part of major

military exercises, often provide the most realistic assessment of the effectiveness and utility of a

technology or concept at scale in combat operations. Large-scale field experiments that include

operational environment stresses can be used to validate technology solutions, obtain greater insight

into a solution’s endurance and reliability, and demonstrate safety characteristics of a proposed

solution. On the other hand, although large-scale field experiments are highly applicable to combat operations, their scale means that multiple trials are seldom conducted, making it difficult to observe changes and determine true cause-and-effect relationships.12

4.6 Cultural Implications for Experimentation.

At the heart of good experimentation is the real likelihood that the experiment will fail. In fact, the most successful experiment designs ensure that failure is a possibility by stressing the object

of the experiment beyond known or expected limits. This provides both an understanding of whether the proposed solution will meet the capability need and if/when it might fail to deliver

the expected performance.13 It also allows solution developers to modify and retest failed capabilities or pivot away from them altogether and explore other opportunities.

Designing the possibility of failure into their experiments, however, is difficult for most DoD

experimenters. The typical DoD practice of evaluating experimenters based on the success of

their experiments has caused experimenters to become increasingly risk averse, as the DSB

observed as early as 2013.14 This heightened risk aversion often results in experiment designs

that have a low probability of failure, diminishing the quality and usefulness of the experiment.

One way to mitigate this risk-averse culture is to institutionalize, within the Department, a new

understanding of what experimentation “success” and “failure” mean. As mentioned earlier, the

ultimate purpose for all experimentation is to advance knowledge, providing decision makers with

information they need to make decisions. As a result, an experiment “succeeds” if it produces

sufficient evidence to conclude that a cause-and-effect relationship exists between two variables—

12 Kass, The Logic of Warfighting Experiments, 117.

13 Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 103.

14 Defense Science Board, Report on Technology and Innovation Enablers for Superiority in 2030, 78.

even if the experiment does not produce the expected results. In other words, experiments that

establish the ineffectiveness of proposed solutions are not failures; rather, they are successful

learning activities. On the other hand, a “safe” experiment that does not advance knowledge by

producing important evidence is a “failed” experiment. It fails to increase knowledge about the

effectiveness of a proposed solution. The litmus test of “success” and “failure” in experimentation

has less to do with the expected results of the experiment and more to do with the data that the

experiment generates.

Congress recognized this in its FY17 NDAA Conference Report noting that USD(R&E) should take risks and have the latitude to fail, as appropriate.15 The Department agreed with Congress in its August 2017 report to Congress on “Restructuring the Department of Defense Acquisition, Technology and Logistics Organization and Chief Management Officer Organizations,” emphasizing: “This requires a culture change and the re-education of our workforce. This is a significant cultural shift that must be continually reinforced with risk tolerance and the move away from a perceived ‘zero risk’ mentality.”16

In order to show that experiments that fail to produce expected results can actually succeed in their

intended purpose, experimenters must clearly identify, up-front, the purpose of the experiment, the

information to be learned, and the value of that information. That way, even if the experiment fails

to produce the expected results, the developer can point to the metrics of success, which were

identified during the planning process, to justify the investment and demonstrate that the

experiment was, in fact, a success.

Institutionalizing new definitions of what constitutes experiment “success” and “failure” is critical to fostering a healthy culture of experimentation with tailored risk.17 Faster, less expensive “failures” in experimentation ultimately lead to more rapid, iterative system development that will reduce cost and technical risk.

5 Experimentation Activities

Even though each experiment is unique, several key activities are universally applicable and

should be considered for all experiments:

Formulating experiments

Planning experiments

Soliciting proposed solutions for experiments

Selecting proposed solutions for experiments

Preparing for and conducting experiments

Data analysis and interpretation

Results of experimentation

Depending on the specific experiment, experiment type, experiment scope, and the venue

selected, experimenters may determine that some of these activities are unnecessary or they may

discover that some activities are performed by the experimentation venue. Experimenters should

15 U.S. Congress, House, Conference Report: National Defense Authorization Act for Fiscal Year 2017, 1130.

16 Report to Congress, 30.

17 Mattis, Summary of the 2018 National Defense Strategy of the United States of America, 7.

tailor their activities to address their specific experiment. This section describes each of these

activities and provides recommendations for each based on best practices captured from literature

and from SMEs in the experimentation community.

5.1 Formulating Experiments.

Experimentation should start with a clear articulation of why the experiment is being conducted and

how the conclusions will be applied. This involves several iterative activities explained in this

section: generating the problem statement, establishing the experimentation team, and developing

the hypothesis. Since these activities are so closely aligned and their products iteratively refined

throughout the experiment, it is often difficult to identify which activity occurs first. Regardless,

the results of this formulation activity should include a clear, unambiguous problem statement and

hypothesis, along with a robust experimentation team.

5.1.1 Generate the Problem Statement.

Generating and refining the problem statement is one of the most critical activities in experimentation. The problem statement helps to identify appropriate team members and keeps

the team focused on the experiment’s purpose throughout the experiment lifecycle, from planning to execution and analysis. The problem statement should address the complete issue being

studied, not just the specific hypothesis being analyzed,18 and include the following components:

A clear articulation of the specific capability gap, need, opportunity, condition, or

obstacle to be overcome;

Identification of affected stakeholders; and

The specific capability needed.19

The robustness of the problem statement is a function of the formality of the experiment. For

informal experiments that allow significant free play, the problem statement should not be overly

restrictive, allowing sufficient flexibility for operators and technologists to pursue ideas. Problem

statements for more formal experiments, on the other hand, should be very specific, enabling

experimenters to adequately design and control the experiment in order to generate the information

needed for decision makers.

Experimenters can use numerous sources of information—both formal and informal—to identify

the core capability needs and develop the problem statements to be explored in their experiments.

The most obvious sources of capability needs are validated requirements that are documented

through formal processes. Examples include requirements listed in approved Joint Capabilities

Integration and Development System (JCIDS) documents and strategic needs recorded in the

following documents:

National Defense Strategy;

USD(R&E)’s Road to Dominance modernization priorities;

The Chairman's Risk Assessment; and

The Joint Requirements Oversight Council-led Capability Gap Assessment.

18 Alberts, Code of Best Practice: Experimentation, 129.

19 Experiment Planning Guide (Norfolk, VA: Navy Warfare Development Command, 2013), 14.

Formal requirements also include capability gaps that have been validated by Components or the

Joint Staff and documented by Joint or Military Services’ requirements processes, such as

Integrated Priority Lists (IPL) and Initial Capability Documents. In addition, urgent needs are

often documented in Components’ urgent needs documents or in the Joint Staff’s Joint Urgent

Operational Needs Statements and Joint Emergent Operational Needs Statements.

Unlike formal acquisition programs, however, experimentation is not bound by traditional Joint or

Military Service requirements processes. Instead, experimenters can design and conduct experiments to address military capability gaps identified and provided by the warfighter, outside

of those requirements processes. Sources for these gaps include, but are not limited to, the following:

Critical intelligence parameter breaches;

Emerging needs and opportunities that are identified through threat, intelligence, and risk

assessments; and

Offsetting or disruptive needs that are identified through ongoing operations, other

experiments, demonstrations, and exercises.

5.1.2 Establish the Experimentation Team.

Membership on the experimentation team is not static, and active participation will ebb and flow

throughout the lifecycle of the experiment. That said, the core members of the team should

include the experiment lead, innovative operational experts, logistics representatives, financial

process experts (contracting, acquisition, etc.), and technologists (scientists, coders, and engineers

proficient in the experimentation domain). These members must be identified and must be

actively engaged in problem statement and hypothesis development at the start of the project and

throughout the design, planning, execution, and analysis of the experiment. In addition to core

members, experiment leads should consider including supporting elements such as planners,

requirements experts, vendors, red teams, experiment designers, trainers, knowledge management

experts, scenario developers, M&S experts, and data analysts. As relevant and feasible,

international partners and allies may be invited to participate to provide differing perspectives

that may improve the experimentation process.

5.1.3 Develop the Hypothesis.

As previously mentioned, a hypothesis is a formal statement of the problem being evaluated and a proposed solution to that problem. Hypotheses are not formal conclusions based on proven

theory; rather, they are educated guesses of expectations intended to guide the experiment.20

Hypotheses are often written in an if-then format that describes a proposed causal relationship

between the proposed solution and the problem, where the “If” part of the statement represents

the proposed solution (the independent variable) and the operational constraints to be controlled (intervening variables), and the “Then” part of the statement addresses the possible outcome to

the problem (the dependent variable). For example, a hypothesis might read:

If proposed solution (A) is deployed under operational conditions (C),

Then operational capability gap (B) will be resolved.

20 Kass, The Logic of Warfighting Experiments, 35-36.


Similar to the problem statement, the robustness of the hypothesis is a function of the formality

of the experiment being conducted. Formal experiments need to be designed and adequately

controlled to ensure the data produced generates the information needed by decision makers.

The hypothesis that guides these types of experiments should be very precise. Less formal

experiments, on the other hand, require flexibility for the warfighter and technologist to explore

possibilities and trade spaces. As a result, hypotheses developed for these types of experiments

will be less rigorous and more general.

Some DoD organizations have determined that the complexity of their field experiments makes it

nearly impossible to clearly produce a state of independence between variables. The intervening

variables are too numerous to control or account for appropriately. Instead, these organizations

develop robust objective statements that state the intent of the experiment, questions to be

answered, conditions required, measures and metrics to be taken, and the data to be collected. For

these organizations, the objective statement drives the design, planning, and execution of the

experiment.21

Best Practices for Formulating the Experiment

Key principles for developing effective problem statements:

o Be as precise as possible in specifying the issue;

o Formulate the problem statement as a comparison against a baseline, if possible; and

o Sufficiently research the problem to ensure the hypothesis includes all known factors.22

A problem statement should be just that…a clear statement of the problem. In order to minimize bias, problem statements should avoid assigning blame or proposing a cause for the problem, and they should also avoid taking a position or suggesting a solution.23

Experimenters should consider reviewing DoD databases that catalog reports, studies, and lessons from prior experiments conducted.24

Operations security (OPSEC) should be addressed early in the experimentation process and emphasized throughout the project. Experimenters should consider adding an OPSEC-trained SME to the experimentation team to assess experimentation planning, execution, and reporting.

In order to clearly articulate the problem or hypothesis, experimenters should consider diagraming the problem or the relationships between the hypothesis variables. These diagrams are often referred to as conceptual models.25

5.2 Planning Experiments.

Successful experimentation begins with effective planning. Experiment plans should be

constructed as living documents that act as roadmaps for their experiments, modified and added

to, as appropriate, from the start of experiment formulation through the execution of the

experiment. When execution starts, the plan should provide a comprehensive summary of all

aspects of the experiment and a compilation of the individual functional area plans in a single

location. At a minimum, experimenters should consider providing or discussing the following

topics in their plans:

21 Dr. Shelley P. Gallup, email message to author, June 20, 2019.

22 Alberts, Code of Best Practice: Experimentation, 128.

23 Experiment Planning Guide, D-1.

24 Databases available to DoD experimenters include: DTIC databases (https://discover.dtic.mil), Joint Staff’s Joint Lessons Learned Information System (https://www.jcs.mil/Doctrine/Joint-Lessons-Learned), and the Center for Army Lessons Learned database (https://call2.army.mil).

25 Experiment Planning Guide, 19.

Clear, unambiguous problem statement for the experiment;

Clear hypothesis (or set of hypotheses) to assess;

Contracting strategy;

Funding strategy;

General approach to the experiment and experiment design;

Schedule of events (to include experiment set-up and dry-run);

Organization (Blue force, Red force, and experiment team);

Scenarios and plans for free play;

Control plan;

Data collection and analysis;

Personnel;

Logistics and infrastructure;

Training and training materials;

Risk management;

OPSEC;

Communications;26

Safety considerations;27 and

Forecast for the next steps, given success or failure.

The type and scope of the experiment and the experiment venue used will determine the topics to

be included in the plans and the level of detail to be included in these sections. While it is impossible

to plan a perfect experiment, it is the experimenter’s responsibility to make the experiment as

useful as possible considering assumptions, limitations, and constraints and to caveat the results of

the experiment appropriately. The following subsections further develop some of the more critical

topics that the plan needs to address.

5.2.1 Selecting the Contracting Strategy.

Experimentation is a tool that can help streamline the process of developing capabilities and

delivering them to the warfighter. The speed at which experimentation can support the warfighter

is governed in large part by the tools experimenters have at their disposal to get these efforts on

contract. Experimenters have a number of expedited Federal Acquisition Regulation (FAR)-

based contracting and non-FAR-based non-contract vehicles available for use with

experimentation that are, in large part, the same strategies available for prototyping. Additional

information pertaining to contracting strategies can be found in Section 6 of the DoD Prototyping

Guidebook.28 Experimenters should express the urgency of their project to their contracting

authority and work with them to structure an appropriate contracting strategy for their effort.

5.2.2 Securing Funding for the Experiment.

One of the biggest obstacles to experimentation is securing funding to either conduct the

experiment or to apply the recommendations resulting from the experiment. Specific challenges

that experimenters face when securing funding include:

26 Experiment Planning Guide, 53 & H-1-1.

27 Experimenters should consider producing a document that describes the specific hazards of the experiment and indicates the capability is safe for use and maintenance by typical troops. See discussion of “System Safety” at https://www.dau.mil/acquipedia/pages/articledetails.aspx#!483.

28 Department of Defense, Department of Defense Prototyping Guidebook.

DoD’s rigid funding structure that regulates the type of technology development that an

organization can pursue;

The length of time it takes DoD’s Planning, Programming, Budgeting, and Execution

process to make funding available (nearly two years from the time a funding need is

identified); and

Limitations Congress places on the specific use of funding in the NDAA.

Obtaining appropriate funding for experimentation is a challenge inherent in prototyping as well and is discussed in the DoD Prototyping Guidebook. For a summary of funding vehicles and DoD offices that can be pursued as potential funding sources for experimentation, please refer to Section 7 of that guidebook.29

5.2.3 Experiment Design.

Second in importance only to a well-defined problem statement is the experiment design. The design must ensure that, at the conclusion of the experiment, a determination can be made regarding the causal relationship in the hypothesis and that decision makers have confidence in both the results of the experiment and the information they need to make their decision.30

Best Practices for Experiment Design

When designing experiments, designers should consider several important topics:

o Ensure all relevant variables and associated ranges are identified;

o Determine how each variable will be measured;

o Identify the factors that are believed to influence the relationships between variables;

o Determine how these variables will be controlled when needed;

o Identify the baseline that will be used for comparison;

o Select the sample size needed to achieve the statistical relevance desired;

o Establish the number of trials that will be run;

o Determine the amount and type of data that will be needed; and

o Select the appropriate analytic strategy.31

Experimenters should consider using two-level factorial experiments32 to help focus subsequent experiments on the independent variables and their settings that have the greatest impact on the dependent variable(s). This enables experimenters to use their time and available resources on experiments that are most beneficial.

Include stakeholders early in the design process to ensure the experiment satisfies stakeholders’ objectives and intent.

Encourage early, firm decision-making on scenarios, participants, funding, technical environment, and study issues. The longer it takes to make decisions on these topics, the more difficult it will be to control the variables.33

Address safety of personnel and equipment early in the planning process and throughout the planning and execution of the experiment.
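The two-level factorial screening recommended in the best practices above can be sketched in a few lines of Python. This is an illustrative sketch only: the factor names and the response values are hypothetical placeholders, and a real experiment would substitute measured results for each run. Each factor is coded -1 (low setting) or +1 (high setting), every combination is run once, and the main effect of a factor is the mean response at its high setting minus the mean response at its low setting.

```python
from itertools import product

def full_factorial(names):
    """All 2^k runs, each factor coded -1 (low) or +1 (high)."""
    return [dict(zip(names, levels))
            for levels in product((-1, 1), repeat=len(names))]

def main_effects(runs, responses):
    """Main effect = mean response at +1 minus mean response at -1."""
    effects = {}
    for name in runs[0]:
        high = [y for run, y in zip(runs, responses) if run[name] == 1]
        low = [y for run, y in zip(runs, responses) if run[name] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects

# Hypothetical factors for illustration only.
runs = full_factorial(["sensor", "comms", "crew"])

# Placeholder responses standing in for measured results of the 8 runs.
responses = [50 + 10 * r["sensor"] + 2 * r["comms"] for r in runs]

for factor, effect in main_effects(runs, responses).items():
    print(factor, effect)
```

Factors with the largest absolute main effects are the natural candidates for closer study in subsequent experiments, while factors with effects near zero can often be held fixed.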

29 Department of Defense, Department of Defense Prototyping Guidebook.

30 Kass, The Logic of Warfighting Experiments, 19.

31 Alberts, Code of Best Practice: Experimentation, 74.

32 For an example and further information regarding two-level factorial experiments, please refer to section R5.3.7 at the following link: http://umich.edu/~elements/05chap/html/05prof2.htm.

33 The Technical Cooperation Program, Pocketbook Version of GUIDEx (Slim-Ex), 52.

Confidence in the results of an experiment is measured by an experiment’s validity, reliability,

precision, and credibility. Unfortunately, it is impossible to design an experiment that fully

satisfies all of these measures, and emphasizing one criterion often reduces another.

The challenge for experiment designers, then, is to design the experiment in a way that

emphasizes the desired validity, reliability, precision, and credibility for that particular experiment

within the funding and schedule constraints provided.

5.2.4 Scenario Development.

To ensure their experiments generate the data that decision makers need, many experimenters rely on scenarios (scripted sequences of events) that focus the experiment on the problem being evaluated and provide boundaries for the experiment. Table 1 identifies the four primary factors that comprise scenarios and provides examples of each.34

Table 1: Primary Scenario Factors

Factor: Context

Examples: Objectives being pursued, the geopolitical situation, and other background information pertinent to the problem (e.g., timeframe)

Factor: Participants

Examples: Numbers, types, intentions, and capabilities of Blue forces, Red forces, and other players

Factor: Environment

Examples: Physical location of the problem including manmade and natural obstacles and considerations (e.g., landmines, climate, weather)

Factor: Events

Examples: Scenario injects, their purposes, and the activities to be observed

Scenarios are composed of pre-planned events, called scenario events or injects, that are intended

to drive the actions of experiment participants. A chronological listing of these events and actions

is often recorded in the master scenario event list (MSEL). Each entry in the MSEL includes

important information regarding the scenario event, such as:

A designated time for delivering the inject;

An event synopsis;

The name of the experiment controller responsible for delivering the inject;

Special delivery instructions;

The task and objective to be demonstrated;

The expected action; and

The intended player receiving the inject.35
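One way to picture an MSEL is as a list of records, one per inject, kept in delivery order. The Python sketch below is a hypothetical illustration: the class name, field names, and sample injects are assumptions chosen to mirror the items listed above, not a DoD-defined schema.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class MselEntry:
    """One scenario event (inject); fields mirror the list above."""
    delivery_time: datetime       # designated time for delivering the inject
    synopsis: str                 # event synopsis
    controller: str               # controller responsible for delivery
    delivery_instructions: str    # special delivery instructions
    task_and_objective: str       # task and objective to be demonstrated
    expected_action: str          # expected player action
    intended_player: str          # player receiving the inject

def build_msel(entries):
    """Return the entries as a chronological listing."""
    return sorted(entries, key=lambda e: e.delivery_time)

# Hypothetical injects, entered out of order on purpose.
entries = [
    MselEntry(datetime(2021, 10, 5, 14, 30), "Primary comms link lost",
              "Controller B", "Deliver by radio", "Restore communications",
              "Switch to backup link", "Blue Cell"),
    MselEntry(datetime(2021, 10, 5, 9, 0), "Initial contact report",
              "Controller A", "Deliver by message", "Assess the threat",
              "Issue a warning order", "Blue Cell"),
]

msel = build_msel(entries)
for e in msel:
    print(e.delivery_time.strftime("%H:%M"), e.synopsis, "->", e.intended_player)
```

Keeping each inject as a structured record makes it straightforward to check, before execution, that every entry names a controller, an expected action, and an intended player.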

The type of experiment conducted drives the level of specificity and control included in the

scenario. Typically, the more formal the experiment, the more specific and controlled the

scenario. Scenario developers should be careful to appropriately scope the scenario for the type of

experiment being conducted. Scenarios written for more formal experiments that are too general

may fail to generate the data needed to support the analysis. Likewise, for less formal

34 NATO Code of Best Practice for C2 Assessment (Washington, DC: Command and Control Research Program, 2002), 164-165, http://dodccrp.org/files/NATO_COBP.pdf.

35 Department of Defense, DoD Participation in the National Exercise Program (NEP), DoD Instruction 3020.47, (Washington DC: Department of Defense, 2019), 17, https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/302047p.pdf?ver=2019-01-29-080914-067.

experiments, overly specific scenarios may inadvertently eliminate examination of some relevant

factors and relationships. In short, scenarios need to be valid, reliable, and credible and must be

developed or adapted in a way that supports the objectives of the experiment.

Best Practices for Scenario Development

Develop and use multiple scenarios in an experiment. Using only a single scenario encourages suboptimization and decreases how broadly the findings can be applied.36

Include three echelons of command in the scenario—one above and one below the focus of the experiment.37

To reduce experiment costs associated with scenario development, re-use or modify existing scenarios as appropriate and when doing so doesn’t compromise the experiment. Consider using commercial games,38 if appropriate.39

5.2.5 Data Collection and Analysis Plans.

Planning for data collection and data analysis are critical efforts that need to begin early in the

experiment planning process and be coordinated with other aspects of the plan (e.g., scenario

development) to ensure valid, reliable, precise, and credible data are captured and that the analysis

will generate the information needed to address the issue being evaluated. Closely linked, the data

collection plan and the data analysis plan will be developed iteratively and will be updated

throughout the experiment’s lifecycle. Typically developed first, the data analysis plan contains a

description of the analysis tools that will be used to evaluate the experiment data and a discussion

of potential bias and risk in the experiment design. The data collection plan, on the other hand,

describes the data needed to be collected to support the analysis plan and provides the structure

that ensures the scenarios, participants, and environment will generate the data needed. Challenges

associated with data collection will be assessed and integrated in a revised data analysis plan. This

iterative process will continue through the life of the experiment.

The data analysis plan will usually include several types of analyses depending on the purpose and

focus of the experiment, the information required, and the data collection means available. The

type of experiment conducted will influence the tools selected to conduct the analysis. For

example, because less formal experiments offer significant opportunity for unscripted free play,

they require more open-ended analysis tools and techniques (e.g., histograms, scatter plots, mean

values, etc.). However, more formal experiments are rigidly planned, requiring rigorous tools and

techniques that enable statistical control (e.g., t-test, regression analysis, correlation analysis, etc.).40
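To make the more formal end of this spectrum concrete, the sketch below computes a two-sample t statistic for comparing a baseline with a candidate solution. It uses Welch's variant of the t-test, which does not assume equal variances; that choice, the function name, and the sample data are assumptions made for illustration. Only the Python standard library is used, so the sketch stops at the statistic and degrees of freedom. A statistics package such as SciPy (scipy.stats.ttest_ind with equal_var=False) would also return the p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical trial results, e.g., minutes to detect a target.
baseline = [12, 15, 11, 14, 13]
candidate = [9, 10, 8, 11, 9]

t, df = welch_t(baseline, candidate)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large absolute t value relative to the t distribution with the computed degrees of freedom suggests the difference between the two conditions is unlikely to be due to chance alone, which is the kind of statistical control formal experiments are designed to provide.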

36 Alberts, Code of Best Practice: Experimentation, 200.

37 Alberts, Code of Best Practice: Experimentation, 92.

38 DoD has used games for wargaming purposes for decades. Improvements in computer technology, especially with commercial personal gaming, fueled the modification and re-use of commercial entertainment computer games for military wargaming purposes. For example, “‘America's Army,’ a modification of Unreal Tournament; ‘DARWARS Ambush,’ and [sic] adaptation of ‘Operation Flashpoint;’ and X-Box's ‘Full Spectrum Warrior’ have all been used by the military. ‘Marine Doom’ was…an early modification of idSoftware's ‘Doom II.’”

39 Alberts, Code of Best Practice: Experimentation, 93 & 222.

40 Alberts, Code of Best Practice: Experimentation, 113-114.

Best Practices for Data Analysis and Collection Plans

Experimenters should understand what data decision makers and stakeholders consider the most useful.

Address protection of vendor intellectual property rights, as appropriate.

Keep in mind that the only reason for collecting data is to support the data analysis. Losing sight of this can result in simply collecting data that is easy to collect as opposed to collecting the right data needed for the experiment.41

The Navy Warfare Development Command developed a data collection plan template that includes the following topics:

o Data collection methodologies: sensor system electronic data, communications network data, observer manual collections, surveys (form or electronic), interviews, etc.

o Collection plan specifics: battle rhythm, type, periodicity, format, location, timeframe, method, etc.

o Data collection personnel: instructions, training requirements, location, timeframe, transportation and billeting requirements

o Collection form templates

o Collection equipment

o Observer logs

o External collection requirements: related data that cannot be captured during the execution event, e.g., surveys, interviews42

Data collection plans should include descriptions of the content of the data, collection methods,

and data handling and storage procedures. In developing the data collection plan, experiment

teams should consider the myriad ways that data collection can be accomplished. The most

reliable form of data collection is automated collection, in which the systems used to drive the

experiment or the operators’ systems collect and store the data. Care must be taken to ensure

that the data collection systems’ clocks are synchronized and that use of automated data

collection tools will not impact the functionality of the systems under testing. Other means of

data collection include screen captures, email archives, snapshots of databases, audio and/or

video recording, survey instruments, proficiency testing of subjects, and human observation.

Key steps to developing a data collection plan include the following:

Specify the variables to be measured;

Prioritize the variables to be measured;

Identify the collection method for each variable;

Ensure access for collecting data for each variable;

Specify the number of observations needed for each variable and confirm the expectation

to collect all observations;

Identify required training;

Specify the mechanisms that will be used to capture and store the data; and

Define the processes needed for data reduction and assembly.43
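Several of the steps above (specifying and prioritizing variables, naming a collection method, and confirming that the needed observations can actually be collected) lend themselves to a simple structured check. The Python sketch below is one hypothetical way to do that; the class, field names, and sample variables are assumptions for illustration, not a prescribed template.

```python
from dataclasses import dataclass

@dataclass
class VariablePlan:
    """Collection plan for one variable (hypothetical structure)."""
    name: str
    priority: int               # 1 = highest priority
    method: str                 # e.g., automated log, observer, survey
    observations_needed: int    # number required by the analysis plan
    observations_expected: int  # number the venue and schedule allow

def shortfalls(plan):
    """Variables expected to fall short, highest priority first."""
    short = [v for v in plan
             if v.observations_expected < v.observations_needed]
    return sorted(short, key=lambda v: v.priority)

# Hypothetical variables and counts.
plan = [
    VariablePlan("operator workload", 2, "survey", 30, 24),
    VariablePlan("time to detect", 1, "automated log", 40, 40),
    VariablePlan("message latency", 3, "network capture", 100, 60),
]

for v in shortfalls(plan):
    gap = v.observations_needed - v.observations_expected
    print(f"priority {v.priority}: {v.name} short by {gap} ({v.method})")
```

Running a check like this during planning surfaces collection gaps early enough to adjust the scenario, the schedule, or the analysis plan.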

5.2.6 Risk Management.

As with any acquisition project, experimenters must analyze, mitigate, and monitor risks to their experiments. Table 2 summarizes the risks common to experimentation.44

41 Alberts, Code of Best Practice: Experimentation, 224-225.

42 Experiment Planning Guide, H-2-2.

43 Alberts, Code of Best Practice: Experimentation, 242.

44 Experiment Planning Guide, 22.

Table 2: Risks Common to Experimentation

Risk Category: Experiment Risks
Description: Internal activities that could affect the success of the experiment
Examples: Failure of the concept or technology to perform as advertised; Insufficient participants with the correct skills and experience; Unrealistic timeline; Safety hazards; System security challenges

Risk Category: Programmatic Risks
Description: Risks that are imposed externally
Examples: Insufficient funding; Schedule constraints; Increases in scope

Risk Category: Operational Risks
Description: Risks associated with a solution’s ability to perform in an operational environment
Examples: Non-ruggedized equipment; Inappropriate operational environment; “Acts of God”

Not all risks can be eliminated, but they should be identified, catalogued, prioritized, and managed to minimize their impact on the experiment. Experimenters can find additional information on risk management practices in DAU’s “Defense Acquisition Guidebook”45 or the “DoD Risk, Issue, and Opportunity Management Guide for Defense Acquisition Programs.”46
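One way to catalogue and prioritize risks is a small risk register. The following is a minimal sketch, assuming each risk is rated for likelihood and consequence on a 1-to-5 scale and ranked by the product of the two; the scale, the scoring rule, and the example entries are assumptions for illustration and are not taken from the guides cited above.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    category: str       # "Experiment", "Programmatic", or "Operational" (Table 2)
    description: str
    likelihood: int     # assumed scale: 1 (remote) to 5 (near certain)
    consequence: int    # assumed scale: 1 (minimal) to 5 (severe)
    mitigation: str = ""

    @property
    def score(self):
        # Simple product used only to rank risks; not an official formula.
        return self.likelihood * self.consequence


def prioritize(register):
    """Return the risks sorted from highest to lowest score."""
    return sorted(register, key=lambda risk: risk.score, reverse=True)


# Hypothetical register entries for illustration only.
register = [
    Risk("Programmatic", "Insufficient funding", 2, 5,
         "Identify a reduced-scope option"),
    Risk("Experiment", "Unrealistic timeline", 4, 3,
         "Add schedule margin before set up"),
    Risk("Operational", "Non-ruggedized equipment", 3, 2,
         "Bring spares"),
]

for risk in prioritize(register):
    print(risk.score, risk.category, risk.description)
```

Reviewing the sorted register at each planning meeting gives the team a simple way to monitor whether mitigations are reducing the highest-ranked risks.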

5.2.7 Selecting the Experimentation Venue.

DoD holds numerous events each year where experiments are conducted. These venues may be physical or virtual, depending on the type of experiment being conducted and its objectives. According to the U.S. Air Force Scientific Advisory Board, as long as a venue facilitates the exploration of ideas and insights, the venue (whether physical or virtual) can be used for experimentation.47 A critical component of the selection decision is the infrastructure that the venue offers. Physical venues should include appropriate infrastructure to support data collection, enable capturing the locations of relevant entities over time, and permit or provide adequate communication for the experiment team.

5.2.7.1 Relevant Environment.

Experimenters should select a venue that maximizes the relevance of the environment to the

problem the experiment is intended to inform. Not all relevant environments need to be

operational environments. Depending on the problem, the relevant environment could be a

virtual crowdsourcing environment, a laboratory bench, a seminar or workshop, a wind tunnel, a

test and evaluation facility, a simulated environment, a defense experimentation venue, or a

training exercise—to name just a few. The key is to ensure the venue environment is relevant to

the problem statement and allows the experimentation team to implement the experiment as

designed. For example, a large exercise or wargame may seem like an ideal venue for an

experiment because of the opportunity for hands-on warfighter involvement with the proposed

solution. However, because of the cost and scope of these venues, it is unlikely that multiple

trials (necessary for many experiments) will be conducted, which could impact the ability of the

decision makers to make an informed decision. On the other hand, if the ultimate objective is to

deploy the solution for operational use, the relevant environment must include hands-on

experimenting with the solution by the warfighter in an operationally representative environment.

45 Defense Acquisition University, Defense Acquisition Guidebook (2018), https://www.dau.mil/tools/dag.

46 Department of Defense, Department of Defense Risk, Issue, and Opportunity Management Guide for Defense Acquisition Programs (Washington, DC: Department of Defense, 2017), https://www.dau.mil/tools/Lists/DAUTools/Attachments/140/RIO-Guide-January2017.pdf.

47 United States Air Force Scientific Advisory Board, United States Air Force Scientific Advisory Board Report on System-Level Experimentation: Executive Summary and Annotated Brief, SAB-TR-06-02 (Washington DC: United States Air Force Scientific Advisory Board, 2006), 10, https://apps.dtic.mil/dtic/tr/fulltext/u2/a463950.pdf.

5.2.7.2 Examples of DoD Experimentation Venues.

This subsection provides a representative sample of DoD experimentation venues. Participants in

these events are typically responsible for covering their own costs. For additional information

regarding each event, the names of the events are hyperlinked to their applicable online presence

(as of the date of this publication).

5.2.7.2.1 Advanced Naval Technology Exercise (ANTX).

The Naval Undersea Warfare Center Division Newport conducts the annual ANTX, which

provides a maritime demonstration and experimentation environment that targets specific

technology focus areas or emerging warfighting concepts with a goal of getting potential

capabilities out to the warfighter in 12 to 18 months. ANTXs are low-barrier-to-entry, loosely

scripted experimentation events where technologists and warfighters are encouraged to explore

alternate tactics and technology pairings in a field or simulated environment. Participants

receive feedback from government technologists and operational SMEs. ANTXs are hosted by

labs and warfare centers from across the naval R&D establishment.

5.2.7.2.2 Army Expeditionary Warrior Experiment (AEWE).

The Army Maneuver Center of Excellence conducts an annual AEWE campaign of

experimentation to identify concepts and capabilities that enhance the effectiveness of the current

and future forces by putting new technology in the hands of Soldiers. AEWE is executed in three

phases—live fire, non-networked, and force-on-force—providing participants the opportunity to

examine emerging technologies of promise, experiment with small unit concepts and capabilities,

and help determine DOTMLPF-P implications of new capabilities.

5.2.7.2.3 Chemical Biological Operational Analysis (CBOA).

CBOAs are scenario-based events that support vulnerability and system limitation analysis of

emerging capabilities in chemically- and biologically-contested environments. These live field

experiments, conducted at operationally relevant venues, provide an opportunity for technology

developers to interact with operational personnel and determine how their efforts might fill

military capability gaps and meet high priority mission deficiencies. CBOAs are sponsored by

the Defense Threat Reduction Agency’s Research and Development-Chemical and Biological

Warfighter Integration Division.

5.2.7.2.4 Joint Interagency Field Experimentation (JIFX).

The JIFX program conducts quarterly collaborative experimentation in an operational field

environment using established infrastructure at Camp Roberts and San Clemente Island. JIFX

experiments provide an environment where DoD and other organizations can conduct concept

experimentation using surrogate systems, demonstrate and evaluate new technologies, and

incorporate emerging technologies into their operations. JIFX is run by the Naval Postgraduate

School.

https://www.navsea.navy.mil/Home/Warfare-Centers/NUWC-Newport/What-We-Do/ANTX-2019/

http://www.benning.army.mil/MCoE/CDID/AEWE/

https://www.sam.gov/

https://my.nps.edu/web/fx


5.2.7.2.5 Sea Dragon 2025.

Sea Dragon 2025 is a series of real-world experiments intended to refine the U.S. Marine Corps

(USMC) of the future. Sea Dragon experiments are conducted in several phases that span a

number of years. The first phase concentrated on the future makeup of the USMC infantry

battalion. The second phase is an on-going three-year campaign focusing on hybrid logistics,

operations in the information environment, and expeditionary advanced base operations. Sea

Dragon 2025 is run by the Marine Corps Warfighting Laboratory.

5.2.7.2.6 U.S. Special Operations Command (USSOCOM) Technical Experimentation (TE).

USSOCOM conducts TE events throughout the United States with Government, academia, and

private industry representation. TE events are typically held in austere, remote outdoor locations

under various weather and environmental conditions, creating a setting where technology

developers can interact with the Special Operations Forces (SOF) community in a collaborative

manner. TE events are conducted by USSOCOM’s SOF Acquisition, Technology, and Logistics

Center.

5.3 Soliciting Proposed Solutions for Experiments.

The need to solicit for solution proposals depends on the dynamics of the experiment. When a

technology or concept solution is already known and an experiment is planned to further refine

or determine the operational utility of the solution, this step is not needed. However, as is often

the case, the problem statement is drafted without a specific solution in mind. In these cases,

once the problem statement is clearly drafted and the experiment plan is developed, the next

major activity is soliciting potential solutions that meet the stated need. Potential solutions can

be obtained from a number of sources. DoD Project/Program Managers or Program Executive

Officers may recognize and offer legacy or new capabilities as potential solutions to the

problem. National laboratories, defense laboratories, centers of excellence, and other DoD

organizations are also great sources of new capability and prototypes that should be considered.

Another approach is reaching out to Federally Funded Research and Development Centers and

University Affiliated Research Centers that develop technology solutions. Finally, industry,

academia, and international partners can also be sources of innovative

solutions.

When seeking non-Government, non-sole-source solutions, the FAR requires the use of the

System for Award Management (SAM) website, formerly known as FedBizOpps

(https://www.sam.gov/), for opportunities greater than $25,000. This website is a great

resource for reaching traditional partners. However, experimenters who want to expand their

target audience to include nontraditional suppliers of potential solutions will need to exploit

alternative solicitation strategies. Additional information and best practices regarding

soliciting potential solutions from traditional and nontraditional suppliers can be found in

Section 5.3 of the DoD Prototyping Guidebook.48

48 Department of Defense, Department of Defense Prototyping Guidebook.

https://www.marines.mil/News/Messages/Messages-Display/Article/1381238/usmc-fy18-experiment-plan-sea-dragon-25-phase-ii/

https://www.socom.mil/SOF-ATL/Pages/technical-experimentation.aspx

5.4 Selecting Potential Solutions for Experiments.

Determining which of the proposed solutions to include in the experiment is the next step in the process. To identify the most promising, innovative, and cost-effective solutions, experimenters should establish selection criteria that clearly address the purpose or objective of the experiment. These criteria will often be weighted to emphasize specific attributes of the solution over others. Selection criteria and their weighting should be developed to address the problem statement directly, the future decision to be made, and the data needed to make that decision. Additional information and best practices associated with selecting potential solutions can be found in Section 5.4 of the DoD Prototyping Guidebook.49

Best Practices for Selecting Potential Solutions for Experimentation

The Warfighting Lab Incentive Fund office employs members of the Joint Staff and the Office of Cost Assessment and Program Evaluation to evaluate submissions using the following criteria:

o Potential for disruptive innovation

o Potential contribution to offset key U.S. vulnerabilities

o Potential for cost imposition/enhancements to U.S. national interest across the conflict continuum

o Potential cost/benefit for the Department

o Amount of funding requested

o Time required to execute and generate results

o Potential for advancing U.S. national interests

o Past performance of proposing organization

The Navy’s Tactics and Technology Exploration and Experimentation (TnTE2) methodology uses two categories of criteria to select solutions—technical ability and potential operational utility. Table 3 provides examples of criteria considered under each of these categories.

Table 3: Examples of Selection Criteria for Navy's TnTE2 Methodology

Technical Ability: Technical maturity; Readiness to integrate with other systems; Reliability; Standardization

Potential Operational Utility: Operational relevance; Personnel burden; Environmental constraints
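Weighted selection criteria can be applied with a simple weighted sum. The following is a minimal sketch, assuming each proposal is rated from 1 to 5 on the Table 3 example criteria, with higher ratings always more favorable; the weights, ratings, and proposal names are hypothetical and do not represent the TnTE2 methodology's actual scoring.

```python
# Hypothetical weights; they sum to 1.0 so scores stay on the 1-5 rating scale.
WEIGHTS = {
    "technical_maturity": 0.20,
    "integration_readiness": 0.15,
    "reliability": 0.10,
    "standardization": 0.05,
    "operational_relevance": 0.30,
    "personnel_burden": 0.10,
    "environmental_constraints": 0.10,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted sum of a proposal's ratings."""
    if set(ratings) != set(weights):
        raise ValueError("ratings must cover exactly the weighted criteria")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical ratings (1 = least favorable, 5 = most favorable).
proposals = {
    "Solution A": {
        "technical_maturity": 4, "integration_readiness": 3, "reliability": 4,
        "standardization": 2, "operational_relevance": 2,
        "personnel_burden": 3, "environmental_constraints": 3,
    },
    "Solution B": {
        "technical_maturity": 3, "integration_readiness": 3, "reliability": 3,
        "standardization": 4, "operational_relevance": 5,
        "personnel_burden": 4, "environmental_constraints": 3,
    },
}

ranked = sorted(proposals, key=lambda name: weighted_score(proposals[name]),
                reverse=True)
for name in ranked:
    print(name, round(weighted_score(proposals[name]), 2))
```

Because the weights encode which attributes matter most for the decision at hand, they should be agreed on before proposals are rated.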

5.5 Preparing For and Conducting Experiments.

Preparing for an experiment starts as soon as the venue is selected, long before the actual event occurs, and it proceeds in an iterative fashion throughout the planning and execution process. As

the evolving plan identifies new requirements for the experiment, experimenters begin the effort to satisfy those requirements. If the venue is unable to meet an experiment requirement,

experimenters will need to revise the plan. This iterative process continues through the experiment execution. All this planning and preparation typically culminates in an experiment

that runs for three days to two weeks.50

While scope and complexity of experiments differ significantly depending on the type of

experiment conducted, the following subsections address major topics of consideration that are

fairly universal for all experiments. Naturally, the activities associated with each of these topics

will vary greatly depending on the experimentation method employed. For example, the activities

associated with field experiments are nearly always more substantial and complex than the

activities associated with workshops or simulations. The following subsections are written to

address the activities typically required for more-rigorous field experiments. Regardless of the

scope and complexity of the experiment, however, experimenters should consider each of these

topics as they plan for, prepare, and execute their experiments. (For additional information

regarding setting up and executing tabletop exercises51 and wargames,52 please refer to the

footnoted references.)

49 Department of Defense, Department of Defense Prototyping Guidebook.

50 Experiment Planning Guide, 83.

5.5.1 Logistics and Set Up.

The set up schedule is dictated by the scope and complexity of the experiment. The greater the

scope and complexity and the higher the levels of validity, reliability, precision, and/or credibility

required, the longer the lead-time needed to prepare for the experiment. Physical set up at the venue typically occurs two to four weeks before the experiment begins;53 however, many logistics

activities must begin long before the physical set up. For example, to ensure availability when needed, experimenters must begin the effort early to secure specific requirements, like frequency

spectrum, airspace clearance, and military training ranges. Likewise, experiment participants must be notified with sufficient lead-time to secure travel and billeting.

The following list contains examples of logistics activities that experimenters should address during

the two to four-week set up time prior to the experiment, ensuring that:

Necessary infrastructure is available and operable;

Systems operate correctly and interoperate, as appropriate, with other systems;

All nodes are sufficiently challenged and present an adequate representation of . . . ;

Communications methods function effectively;

Instrumentation is calibrated and synchronized; and

Contingency plans have been prepared.54

5.5.2 Training.

All experiment participants must be adequately trained to ensure they are able to effectively

perform their functions. Inadequately trained participants create a significant risk to an otherwise

well-constructed experiment. Training will usually occur during the two to four-week experiment

set-up period, but preparations must begin long before. A well-planned training program,

including training materials, is key to successful participant involvement. Training should focus

on four groups of participants: subjects, data collectors, experiment controllers, and the support

team.

5.5.2.1 Subjects.

Training for experiment subjects should focus on the purpose of the experiment, the background

and scenario(s), processes subjects will use during the experiment, and technical skills needed to

operate the systems being evaluated as well as infrastructure equipment necessary for the

experiment. If the experiment includes a comparison of multiple systems, subjects will need to be

proficient in all systems being evaluated, including hands-on training when possible.

Experimenters should consider requiring subjects to pass a proficiency exam prior to the start of

the experiment.

51 Eugene A. Razzetti, “Tabletop Exercises: for Added Value in Affordable Acquisition,” Defense AT&L Vol XLVI, No. 6, DAU 259 (November - December 2017): 26-31, https://www.dau.mil/library/defense-atl/_layouts/15/WopiFrame.aspx?sourcedoc=/library/defense-atl/DATLFiles/Nov-Dec_2017/DATL_Nov_Dec2017.pdf&action=default.

52 United States Army War College, Strategic Wargaming Series: Handbook (Carlisle, PA: United States Army War College, 2015), https://ssi.armywarcollege.edu/PDFfiles/PCorner/WargameHandbook.pdf.

53 Experiment Planning Guide, 83.

54 Experiment Planning Guide, 83.

5.5.2.2 Data Collectors.

While providing a thorough overview of the basics of the experiment (e.g., purpose, context,

problem statement, hypotheses, scenario(s), and major events), training for data collectors should

focus on techniques for observation and data collection as well as timing and location of data that

is to be collected. In addition, data collectors and experiment controllers must be clear on the data

that the analysts expect to receive and be prepared to identify and record anomalies so that the

analysts know what to do with the data. Experimenters should consider evaluating the data

collectors’ proficiency with collection methodologies, tools, and processes through a written exam

as well as a dry run of their data collection tasks.

5.5.2.3 Experiment Controllers.

Experiment controllers require training on experiment basics with an emphasis on the scenarios

and MSELs. Responsible for the successful execution of the scenarios, controllers must have a

thorough understanding of the timing and application of scenario injects and be proficient in other

appropriate controller responsibilities.

5.5.2.4 Support Team.

The support team must be well trained on the experiment basics as well as the systems they will

be expected to operate. Often, the support team ends up training other participants on their roles

or on the use of technical systems.

5.5.3 Pre-Experiment Dry Run.

Successful experiments are typically preceded by a full run-through of every aspect of the

experiment. This run-through includes conducting pretests of individual systems and experiment

components (e.g., workstations, communications networks, databases, etc.) in stand-alone mode to

ensure their functionality, as well as exercising them in an integrated system-of-systems approach

to confirm their interoperability. Experimenters should also run full trials of each scenario using

fully-trained subjects, data collectors, controllers, and support team staff, stressing the system to at

least the same level expected during the experiment. Finally, the dry run should produce the same

data expected during the experiment, and analysts should reduce and analyze the data as planned

during the experiment.

5.5.4 Execution.

Experiments typically run from three days to two weeks, with the duration being a function of the

scope and complexity of the experiment or the experiment venue. Each day should begin with a

review of planned activities for the day and should end with a review of experiment activities

conducted that day and a discussion of changes that should be made to increase the effectiveness

of the next day’s activities. The experiment management team typically performs the control

function during the experiment and is responsible for executing the MSEL events at the time and

in the manner that they are scheduled to occur. As with any event, flexibility and ingenuity are

required to address complications experienced during execution that challenge the schedule or

effectiveness of the experiment.

5.5.5 Data Collection and Management.

Data collectors should follow the collection, handling, and storage procedures contained in the

data collection plan. Experiment leads should evaluate data collection activities daily to ensure

the correct data are being collected in the format needed for analysis and that they are handled

and stored according to plans. Instrumentation used to collect data must be calibrated and


operated in a way that minimizes any disruption to the operational realism experienced by the

participants. The data collectors should be monitored and critiqued continuously to ensure that

data are being collected consistently across collectors and in the manner specified in the

collection plan. Raw data should be reduced as soon as possible, per the collection plan, and both

the raw data and reduced data must be archived for analysis purposes.

5.6 Data Analysis and Interpretation.

The final step in the experimentation process is analyzing and interpreting the data collected

during the experiment. Data analysis should be conducted using the tools and techniques detailed

in the data analysis plan.

While being sure to complete the analysis contained in the plan, analysts should also be

encouraged to pursue excursions with data of interest outside of the analysis plan. Technologists

and operational SMEs should then interpret the results of the analysis, validating or invalidating

the hypothesis, and provide decision makers the information they need to inform the decision

that initiated the experiment.

5.7 Results of Experimentation.

The measure of a successful experiment is whether it produces sufficient evidence to conclude

that a cause-and-effect relationship exists between two variables—even if the experiment does not

produce the expected results. If the experiment does not successfully produce the necessary

evidence, experimenters can choose to conduct the experiment again (if schedule and funding

permit) or they can terminate the effort to identify a relationship between the variables.

However, experiments that do successfully produce sufficient evidence typically result in one or

more of the following actions.

5.7.1 Data are Used to Create or Update Models.

Experimentation data can be used to either create new models or validate and refine existing ones.

In some cases, experimenters will create a conceptual model at the start of the experimentation

process, to assist in developing the problem statement, hypothesis, or the experiment design.

Experimenters can then use the data from the experiment for sensitivity analysis, to validate the

model, to reveal model stability in light of the intervening variables, or they can use the data to

modify the model, so it better reflects the results of the experiment.

5.7.2 Results Generate or Refine Requirements for New Experiments.

A single experiment may not generate the information needed for senior leaders to conclude that a proposed solution will or will not solve the problem. Sometimes, decisions require a series of experiments testing different facets of the solution. This is especially true when experimenters initiate the process to solve a complex problem and recognize that they will need numerous experiments to generate the type of information decision makers will need. Such a series is sometimes called a campaign of experimentation, in which experimenters apply a systematic approach to planning and conducting related serial and parallel experiments in order to methodically move a solution from a vague idea to a fielded system or approach.55 In these cases, the results of one experiment can generate or refine hypotheses for subsequent experiments.

Best Practices for Results of Experimentation

Experimenters should consider institutionalizing the results and valuable lessons learned during their experiment in available databases so other stakeholders across the defense community can benefit from their work.

Best Practices for Data Analysis and Interpretation

For experiments to have maximum effect, rather than simply tabulating the data, experimenters should interpret the data and draw applicable conclusions.

At the conclusion of the experiment, after the data have been analyzed and interpreted, experimenters should revisit the purpose/hypothesis for the experiment and, as explicitly as possible, state what was learned through the data and what was not.

5.7.3 Results Generate Changes to the Proposed Solution.

Sometimes experimentation helps to mature the proposed solution. In the case of a non-materiel

DOTMLPF-P solution, results of the experiment may reveal changes that need to be made to the

proposed solution or another DOTMLPF-P element to make it more effective. In the case of a

materiel solution, the results of the experiment may support the transition of the technology further

along DoD’s technology readiness level continuum or identify changes that a technologist will

want to make to the design of a prototype to improve its effectiveness or reduce its lifecycle cost.

5.7.4 Failed Solutions are Filtered Out.

Successful experiments will sometimes identify potential solutions that fail to solve the problem

being studied. Identifying failed solutions is as important as identifying successful solutions as it

may provide decision makers with information they need to terminate R&D activities associated

with failed solutions and reallocate R&D resources to other promising capabilities.

5.7.5 Successful Solutions Transition to Operations.

In some cases, at the conclusion of the experiment, the solutions will transition to operational use

to address an existing critical warfighter capability gap. These solutions can exist along the entire

DOTMLPF-P spectrum. Experiments evaluating non-materiel solutions may result in

recommendations to operationalize one or more of the non-materiel DOTMLPF-P solutions

evaluated. Experiments evaluating materiel solutions may result in a fielded materiel operational

capability.

For experiments where operationalizing the solution is an objective, it is critical for the innovator,

program manager, and the operational unit to begin collaborating early in the planning phase and

continue interacting throughout the project. This collaboration will enable the stakeholders to:

Clearly understand the operational need;

Establish the criteria that define a successful experiment in an operational environment;

Develop an appropriate sustainment package (e.g., standard operating procedures,

training requirements, etc.); and

Ensure appropriate system safety, security, and technical certifications are delivered with

the capability.

55 For additional information regarding campaigns of experimentation, please refer to the following source: David S. Alberts and Richard E. Hayes, Code of Best Practice: Campaigns of Experimentation (Washington, DC: Command and Control Research Program, 2005), http://www.dodccrp.org/files/Alberts_Campaigns.pdf.

5.7.6 Successful Solutions Transition to Rapid Fielding.

In Section 804 of the FY16 NDAA, Congress provided an expedited acquisition pathway to rapidly field successful technical solutions.56 This Middle Tier Acquisition pathway is available to decision makers for solutions that meet the following criteria:

Existing products and proven technology (with minimal development required) that meet

needs communicated by the warfighter;

Selected using a merit-based process;

Performance was successfully demonstrated and evaluated for current operational

purposes;

Lifecycle costs and issues of logistics support and system integration are addressed; and

Production must begin within six months and complete fielding within five years of an

approved requirement.
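The timeline criterion in the last item above can be expressed as a simple date check. The following is a minimal sketch, assuming both the six-month and five-year periods are measured from the date the requirement is approved; that interpretation, the function names, and the example dates are assumptions for illustration, and the statute and implementing policy govern how the periods are actually measured.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` after `start`, clamped to the month's last day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def meets_rapid_fielding_timeline(requirement_approved, production_start,
                                  fielding_complete):
    """Check the six-month production and five-year fielding windows."""
    production_deadline = add_months(requirement_approved, 6)
    fielding_deadline = add_months(requirement_approved, 60)
    return (production_start <= production_deadline
            and fielding_complete <= fielding_deadline)


# Hypothetical dates for illustration only.
approved = date(2021, 8, 31)
print(meets_rapid_fielding_timeline(approved, date(2022, 2, 15),
                                    date(2026, 6, 1)))
```

The other criteria in the list are judgments rather than dates and would still need to be assessed by the decision maker.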

5.7.7 Successful Solutions Integrate Into Existing Programs of Record (PoRs) or Initiate

New Acquisition Programs.

Decision makers may choose to initiate new FAR-based acquisition programs for successful

solutions or integrate the solutions into an existing PoR through traditional acquisition pathways

pursuant to DoD Instruction 5000.02, “Operation of the Defense Acquisition System.” If

this pathway is expected from the outset of experiment planning, early collaboration with

appropriate DoD and Military Services process owners and the receiving PoR should be initiated

to ensure integration and interoperability success.

6 Summary

U.S. national security is affected by the rapid development of technological advancements that are

accessible to both state and non-state actors and novel applications of technologies that are

integrated with new emerging concepts. This has eroded the technological overmatch the U.S.

military has operated in for decades. Current bureaucratic processes that emphasize exceptional

performance, thoroughness, and minimizing risk at the expense of speed have directly contributed

to this erosion.

Experimentation is a tool that enables speed, iterative approaches, and tradeoffs and that expands the roles of

warfighters and intelligence analysis. The information and best practices provided in this

guidebook are designed to help senior leaders, decision makers, staff officers, and experimenters

most effectively use experimentation to inform decisions, supporting the ultimate goal of

delivering capabilities to the warfighter at the speed of relevance.

56 National Defense Authorization Act for Fiscal Year 2016, Pub. L. No. 114-92 § 804, 129 Stat. 883 (2015),

https://www.gpo.gov/fdsys/pkg/PLAW-114publ92/pdf/PLAW-114publ92.pdf.


Appendix 1: Acronyms

AEWE Army Expeditionary Warrior Experiment

ANTX Advanced Naval Technology Exercise

CBOA Chemical Biological Operational Analysis

CONOPS Concept of Operations

DAU Defense Acquisition University

DoD Department of Defense

DOTMLPF-P Doctrine, Organization, Training, Materiel, Leadership and Education, Personnel, Facilities, and Policy

DSB Defense Science Board

FAR Federal Acquisition Regulation

FY Fiscal Year

HITL Human-in-the-Loop

IPL Integrated Priority List

JCIDS Joint Capabilities Integration and Development System

JIFX Joint Interagency Field Experimentation

M&S Modeling and Simulation

MSEL Master Scenario Event List

NDAA National Defense Authorization Act

NDS National Defense Strategy

OPSEC Operations Security

P&E Prototypes and Experiments

PoR Program of Record

R&D Research and Development

SME Subject Matter Expert

SOF TE Special Operations Forces Technical Experimentation

USD (R&E) Under Secretary of Defense for Research and Engineering

USMC U.S. Marine Corps

USSOCOM U.S. Special Operations Command


Appendix 2: Definitions

Credibility. Measure of understanding, respect, and acceptance of the results by the

professional communities participating in the experiment.

Defense Experimentation. Testing a hypothesis, under measured conditions, to explore

unknown effects of manipulating proposed warfighting concepts, technologies, or conditions.

Dependent Variable. Feature or attribute of the subject of an experiment that is expected to change

as a result of the introduction or manipulation of other influencing factors.

External Validity. Experimental design and conduct that ensures the results of the experiment

can be generalized to other environments.

Hypothesis. A formal statement of the problem being evaluated and a proposed solution to that

problem. Often written in an if-then format that describes a proposed causal relationship between

the proposed solution and the problem. The “if” part of the statement represents the proposed

solution (the independent variable) and the operational constraints to be controlled (intervening

variables), and the “then” part of the statement addresses the possible outcome to the problem

(the dependent variable).

Independent Variable. An influencing factor in an experiment that is not changed by other

factors in the experiment and is introduced or manipulated in order to observe the impact on the

subject of the experiment.

Internal Validity. Experimental design and conduct that ensures that no alternative

explanations exist for the experiment results.

Intervening Variable. Feature of an experiment that, unless controlled, could affect the results of

the experiment.

Master Scenario Event List (MSEL). A document that lists all of the scenario events/injects

for an experiment.

Middle Tier Acquisition Pathway. Acquisition pathway that uses the authorities in Section 804 of the FY16 NDAA to fill the gap between traditional PoRs and urgent operational needs. Rapid prototyping must be completed within a period of five years. Rapid fielding must begin production within six months of initiation and be completed within another five years.57

Military Capability Gap. Needs or capability gaps in meeting national defense strategies that

are generated by the user or user-representative to address mission area deficiencies, evolving

threats, emerging technologies, or weapon system cost improvements. For the purposes of

prototyping and rapid fielding, military capability gaps include both formal requirements listed

in approved JCIDS documents as well as other needs identified through the Combatant

Command IPL accepted into the Chairman’s Capability Gap Assessment process, critical

intelligence parameter breaches, and emerging needs identified through formal threat,

intelligence, and risk assessments.

57 National Defense Authorization Act for Fiscal Year 2016, § 804, 129 Stat. 882-883.

Nontraditional Defense Contractor. An entity that is not currently performing and has not

performed, for at least the one-year period preceding the solicitation of sources by the DoD for the procurement or transaction, any contract or subcontract for the DoD that is subject to full

coverage under the cost accounting standards prescribed pursuant to section 1502 of title 41 and the regulations implementing such section.58

Precision. Measure of whether or not the instrumentation is calibrated to tolerances that enable

detection of meaningful differences.

Prototype. A physical or virtual model that is used to evaluate feasibility and usefulness.

Reliability. Measure of the objectivity of the experiment. Experimental design and conduct that

ensures repeatability of the results of the experiment when conducted under similar conditions

by other experimenters.

Rapid Prototyping. A prototyping pathway using nontraditional acquisition processes to

rapidly develop and deploy prototypes of innovative technologies. It is the intent that these

technologies provide new capabilities to meet emerging military needs, are demonstrated in an

operational environment, and provide a residual operational capability within five years of

project approval.

Scenario Events/Injects. Pre-planned events intended to drive the actions of experiment

participants.

Technologists. Scientists and engineers proficient in the experiment domain.

Technology Base. The development efforts in basic and applied research.

Validity. Measure of how well the experiment measures what it intends to measure.

58 National Defense Authorization Act for Fiscal Year 2016, 10 U.S.C. § 2302(9) (2015),

https://www.law.cornell.edu/uscode/text/10/2302.


Appendix 3: References

Accelerating New Technologies to Meet Emerging Threats: Testimony before the U.S. Senate

Subcommittee on Emerging Threats and Capabilities of the Committee on Armed Services. 115th

Cong., 2018 (testimony of Michael D. Griffin, Under Secretary of Defense for Research and Engineering (USD(R&E))). https://www.armed-services.senate.gov/imo/media/doc/18-40_04-18-18.pdf.

Alberts, David S. and Richard E. Hayes. Code of Best Practice: Experimentation. Washington,

DC: Command and Control Research Program, 2002.

http://dodccrp.org/files/Alberts_Experimentation.pdf.

Defense Acquisition University. Defense Acquisition Guidebook. 2018.

https://www.dau.mil/tools/dag.

Defense Science Board. The Defense Science Board Report on Technology and Innovation

Enablers for Superiority in 2030. Washington, DC: Department of Defense, 2013.

https://www.acq.osd.mil/DSB/reports/2010s/DSB2030.pdf.

Department of Defense. Department of Defense Prototyping Guidebook, Version 1.1.

Washington, DC: Department of Defense, 2019. https://www.dau.mil/tools/t/DoD-Prototyping-Guidebook.

Department of Defense. Department of Defense Risk, Issue, and Opportunity Management Guide

for Defense Acquisition Programs. Washington, DC: Department of Defense, 2017.

https://www.dau.mil/tools/Lists/DAUTools/Attachments/140/RIO-Guide-January2017.pdf.

Department of Defense. DoD Participation in the National Exercise Program (NEP). DoD

Instruction 3020.47. Washington, DC: Department of Defense, 2019.

https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/302047p.pdf?ver=2019-01-29-080914-067.

Experiment Planning Guide. Norfolk, VA: Navy Warfare Development Command, 2013.

Jackson, Carly, Aileen Sansone, Christopher Mercer, and Douglas King. “Application of Set-Based

Decision Methods to Accelerate Acquisition through Tactics and Technology Exploration

and Experimentation (TnTE2).” In Proceedings of the Fifteenth Annual Acquisition Research

Symposium (May 2018): 335-361.

https://calhoun.nps.edu/bitstream/handle/10945/58779/SYM-AM-18-095-024_Jackson.pdf?sequence=1&isAllowed=y.

Kass, Richard A. The Logic of Warfighting Experiments. Washington, DC: Command and

Control Research Program, 2006. http://www.dodccrp.org/files/Kass_Logic.pdf.

Mattis, James N., Secretary of Defense. Summary of the 2018 National Defense Strategy of the

United States of America: Sharpening the American Military’s Competitive Edge. Washington,

DC: Department of Defense, 2018. https://dod.defense.gov/Portals/1/Documents/pubs/2018-National-Defense-Strategy-Summary.pdf.


McLeroy, Carrie. “History of Military gaming.” U.S. Army. August 27, 2008.

https://www.army.mil/article/11936/history_of_military_gaming.

National Defense Authorization Act for Fiscal Year 2016. 10 U.S.C. § 2302(9). Washington, DC,

2015. https://www.law.cornell.edu/uscode/text/10/2302.

National Defense Authorization Act for Fiscal Year 2016. Pub. L. No. 114-92 § 804, 129 Stat.

882. Washington, DC, 2015. https://www.gpo.gov/fdsys/pkg/PLAW-114publ92/pdf/PLAW-114publ92.pdf.

NATO Code of Best Practice for C2 Assessment. Washington, DC: Command and Control Research

Program, 2002. http://dodccrp.org/files/NATO_COBP.pdf.

Razzetti, Eugene A. “Tabletop Exercises: for Added Value in Affordable Acquisition.” Defense

AT&L Vol XLVI, No. 6, DAU 259 (November - December 2017): 26-31.

https://www.dau.mil/library/defense-atl/_layouts/15/WopiFrame.aspx?sourcedoc=/library/defense-atl/DATLFiles/Nov-Dec_2017/DATL_Nov_Dec2017.pdf&action=default.

Report to Congress: Restructuring the Department of Defense Acquisition, Technology and

Logistics Organization and Chief Management Officer Organization, In Response to Section 901

of the National Defense Authorization Act for Fiscal Year 2017 (Public Law 114 - 328).

Washington, DC: Department of Defense, 2017.

https://dod.defense.gov/Portals/1/Documents/pubs/Section-901-FY-2017-NDAA-Report.pdf.

The Technical Cooperation Program. Pocketbook Version of GUIDEx (Slim-Ex): Guide for

Understanding and Implementing Defense Experimentation. Ottawa, Canada: Canadian Forces

Experimentation Centre, 2006.

https://www.acq.osd.mil/ttcp/guidance/documents/GUIDExPocketbookMar2006.pdf.

United States Air Force Scientific Advisory Board. United States Air Force Scientific Advisory

Board Report on System-Level Experimentation: Executive Summary and Annotated Brief.

SAB-TR-06-02. Washington, DC: United States Air Force Scientific Advisory Board, 2006.

https://apps.dtic.mil/dtic/tr/fulltext/u2/a463950.pdf.

United States Army War College. Strategic Wargaming Series: Handbook. Carlisle, PA: United

States Army War College, 2015.

https://ssi.armywarcollege.edu/PDFfiles/PCorner/WargameHandbook.pdf.

U.S. Congress, House. Conference Report: National Defense Authorization Act for Fiscal Year

2017. S. 2943. 114th Cong., 2d sess. 2016. https://www.congress.gov/114/crpt/hrpt840/CRPT-114hrpt840.pdf.
