Getting it Right: Commanders’ Judgment, Decisions, and Accuracy
by
Colonel Brandon R. Tegtmeier United States Army
Strategy Research Project
Under the Direction of: Mr. John J. Patterson
United States Army War College Class of 2017
DISTRIBUTION STATEMENT A: Approved for Public Release. Distribution is Unlimited.
The views expressed herein are those of the author(s) and do not necessarily reflect the official policy or position of the Department of the Army, Department of Defense, or the U.S. Government. The U.S. Army War College is accredited by
the Commission on Higher Education of the Middle States Association of Colleges and Schools, an institutional accrediting agency recognized by the U.S.
Secretary of Education and the Council for Higher Education Accreditation.
Getting it Right: Commanders’ Judgment, Decisions, and Accuracy
(5200 words)
Abstract
Commanders of joint inter-agency special operations task forces make hundreds of
judgments every day as they target enemy networks. Many of these judgments lead to
critical decisions that have wide reaching implications. Joint inter-agency special
operations task force staffs make countless judgments, as well, in order to assist the
commander in making these decisions. While commanders have been very effective
historically, there is room for improvement. Recent findings from decision science
research can improve the accuracy of commanders’ judgments through standardization
of probabilistic language and documentation of accuracy for all individual judgments.
Being wrong comes with the territory of any decision maker. However, commanders of
joint inter-agency special operations task forces can make real adjustments to increase
their accuracy, thereby lowering the risk to friendly forces, lowering the risk of
strategically negative events, and, most importantly, enhancing effects on the enemy.
Getting it Right: Commanders’ Judgment, Decisions, and Accuracy
Since 9/11, joint inter-agency special operations task forces have proven
extremely effective in their ability to target specific enemy networks in support of
national policy objectives. These task forces have leveraged military and civilian
expertise, the Nation’s network of intelligence and law enforcement organizations, a
flattened information network, and impressive military capabilities to disrupt, and in
some cases dismantle, enemy organizations. The speed at which these joint inter-
agency special operations task forces (JSOTF) can find a target, conduct an
appropriate action against that target, exploit, and then analyze the results to generate
follow-on targets has been critical to their success.1 This speed means that JSOTF
commanders are often making multiple critical decisions daily. The burden on
commanders is undeniable as these crucial decisions may result in the death of enemy
combatants, put their own force at risk, and put non-combatant lives at risk.
Commanders are also making hundreds of less critical decisions daily. Likewise,
members of JSOTF staffs are making numerous judgments as they advise the
commander throughout the targeting process.
While JSOTFs have been very successful making judgments, they have also
made tragic mistakes. On October 3, 2015, coalition forces mistakenly conducted an
airstrike on a hospital in Kunduz, Afghanistan, killing 42 civilian staff and patients.2 On
September 17, 2016 coalition aircraft mistakenly bombed Syrian troops believing the
target was Islamic State of Iraq and the Levant (ISIL) fighters.3 These mistakes, while
rare, often have strategically important negative impacts. While all of these decisions
were made by very well intentioned professionals operating in extremely challenging
environments, organizations need to be - and can be - more accurate with their
judgments. Accuracy is critical to maintain credibility with civilian policy makers,
maintain credibility with partner nation forces, justify resources, operate ethically, and
most importantly, produce desired effects on the enemy network.
Commanders of joint inter-agency special operations task forces can make real
adjustments to increase accuracy. Recent academic research dealing with decision,
prediction, and judgment is particularly compelling and the findings illuminate two
obvious paths to improving accuracy. JSOTF commanders can be more accurate in
their judgments first by standardizing the use of probabilistic language and then by
documenting the accuracy of all individual judgments. The resulting increase in
accuracy would thereby lower the risk to friendly forces, lower the risk of strategically
negative events, and, most importantly, enhance effects on the enemy. Assuming these
two recommendations achieve the desired success at the JSOTF level, the entire
Department of Defense and inter-agency community could subsequently employ them
at all levels.
This discussion, and the follow-on recommendations, will focus on problems of
cognition. Problems of cognition, as defined by James Surowiecki, are questions with
concrete answers.4 They are questions like: Is individual X in compound 11? Are the two
individuals located with individual X enemy combatants or civilians? Is compound 11
used for enemy purposes only or does it also have civilian uses? These types of
questions can either be assessments of past occurrences, assessments of present
conditions, or reasonably constrained predictions for the future. All of these types of
assessments can be summarized simply as judgments.
Decision Research
Research on decision-making, prediction, and the limitations of human judgment
is robust and spans the fields of psychology, social sciences, economics, statistics, and
mathematics. Well-known theories explaining how humans make judgments and the limits of human cognition have existed for decades. These better-understood theories can be characterized as critical thinking, and they include ideas such as heuristics and biases.5 Another well-studied theory
of how humans make decisions is Irving Janis’ classic groupthink theory.6 Generally, the
joint inter-agency community appreciates the problems framed in these ideas. There are
findings from recent decision research, though, that JSOTF commanders may be able
to utilize to increase the accuracy of their judgments. This research includes findings
regarding what makes some individuals better at making judgments than others, as well
as research on how groups can be more accurate than even good individuals.
Philip E. Tetlock and several colleagues have conducted some particularly
compelling research about judgment and prediction. Tetlock, a social scientist, argues
that humankind’s greatest achievements are largely due to measurements. As an
example, humankind truly began to make progress in the medical field when medical
researchers began applying scientific methods and actually measuring results.7 Tetlock
argues that forecasters in any field must be measured to determine their credibility.
Through measurement, he concluded that individuals with an above-average ability to predict accurately are not born but made.8 Tetlock found that individuals
who were open-minded and had a broad, but perhaps shallow, base of knowledge
generally displayed better judgment abilities than confident experts in a particular field.9
He asserts that “foresight isn’t a mysterious gift bestowed at birth. It is the product of
particular ways of thinking, of gathering information, of updating beliefs. These habits of
thought can be learned and cultivated by any intelligent, thoughtful, and determined
person.”10
A research project called the Good Judgment Project, conducted within the larger research efforts of the Intelligence Advanced Research Projects Activity (IARPA), an organization founded to make the American intelligence community better at what it does, supported this conclusion.11 This project included over 20,000 people making
hundreds of predictions about world events. Predictions were recorded and determined
to be either correct or incorrect. The project was, at its core, a forecasting tournament in which multiple teams competed to make the most consistently accurate predictions. The Good Judgment Project team outperformed the control group by
78% and even did better than “professional intelligence analysts with access to
classified data.”12
Through this research, Tetlock and his team were able to identify what he calls
superforecasters, which are individuals who not only beat chance, but also show a
consistent ability to be accurate.13 He then found common characteristics in the way
they thought and approached problems. Among other characteristics, they tended to
be cautious, humble, actively open-minded, intellectually curious, reflective, pragmatic,
comfortable with diverse views, probabilistic, thoughtful updaters, and motivated to
improve themselves.14 From these characteristics, the team developed forecasting rules.15 Tetlock also put the best forecasters together in groups and found that their accuracy improved even further.16
Tetlock’s research uncovers three important findings that JSOTF commanders
may be able to leverage to increase their accuracy. First, forecasters need to think
probabilistically.17 Judgments should not be recorded as yes or no predictions, but
rather should be recorded as having a percentage likelihood of occurring, much as meteorologists predict whether or not it will rain tomorrow. It is important to
understand that if a forecaster determines that something is 90% likely to happen, and it
does not, that forecaster was not necessarily wrong.18 This instance simply happened to
fit into the 10% bin of the forecast. That said, statistically the forecaster should not be
wrong more than 10% of the time when making a 90% likely prediction. Research also
shows that not only do forecasters need to think probabilistically, but also that
communicating using numbers is far better than using verbal probabilities such as
“likely,” “probably,” and “almost certainly.” Significant problems with verbal probabilities
include the overall vagueness of the terms.19
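Calibration of this kind can be checked mechanically. The sketch below is a minimal illustration using invented forecast records, not an existing JSOTF tool: it bins forecasts by their stated probability and compares each bin against how often the forecasted event actually occurred. A well-calibrated forecaster's 90% judgments should prove true roughly 90% of the time.

```python
from collections import defaultdict

# Hypothetical forecast records as (stated probability, outcome) pairs.
# In practice these would come from a task force's resolved judgments.
forecasts = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True),
    (0.6, True), (0.6, False), (0.6, False), (0.6, True), (0.6, True),
]

# Bin forecasts by stated probability, then compare each stated
# probability against the observed frequency of the outcome.
bins = defaultdict(list)
for prob, outcome in forecasts:
    bins[prob].append(outcome)

for prob in sorted(bins):
    observed = sum(bins[prob]) / len(bins[prob])
    print(f"stated {prob:.0%}: observed {observed:.0%} over {len(bins[prob])} forecasts")
```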
Next, forecasters actually have to forecast, measure how they are doing, and then adjust how they make judgments.20 As with any skill, forecasters must practice to become proficient, and they have to know whether or not they are accurate. Additionally, research shows that accountability itself increases accuracy.21 When individuals know that their specific judgments will be assessed, or that they will have to defend a position, they do a better job of thinking critically, listening to others, searching for evidence, and updating their ideas.
Finally, a forecaster in a team of forecasters, or a leader in a group, should strive
to understand all arguments and question personnel precisely without micromanaging
and stifling opposing ideas.22 It is important that all members of a team understand the
specifics of other members’ arguments so that all perspectives are understood, even if
they are not accepted.
Recent research also shows how decision-makers can leverage the power of
groups to increase the accuracy of judgments. James Surowiecki describes how “under
the right circumstances, groups are remarkably intelligent, and are often smarter than
the smartest people in them.”23 In a way, this is the flipside of Irving Janis’ groupthink
theory.24 While groups carry risk, they can also be incredibly powerful if the judgments of individuals are aggregated under the right conditions.
Surowiecki explains that for a group to be intelligent, it needs three key characteristics. First, the group needs to be diverse in thought. Diversity of thought and ideas is important so that the differences between ideas are significant, not just trivial.25 The next condition is independence among the individuals in the group; independence is critical to effective decision making because it keeps any individual’s mistake from having an outsized negative influence upon the judgment of the group.26 Keeping everyone in a group independent is a challenge, however.
The final group characteristic is decentralization. Decentralization encourages
specialization and allows for tacit knowledge. Important to the idea of decentralization is
that the closer someone is to a problem, the more likely he or she is to have a good
solution for it.27 Decentralization can be a problem, though, if critical information from
one decentralized element does not get to other elements. A system to aggregate
judgments or information is key to harnessing decentralization.28 Stove-piping
information will kill the potential advantages of a group. Aggregating judgments within a
group can be even more powerful if success rates of the individuals comprising the
group can be accounted for.29 Instead of simply averaging the judgments of all members, the judgments from individuals with poor track records can be discounted and the judgments from consistently accurate individuals weighted more heavily. Obviously, leaders can only
use this technique if there is a track record depicting accuracy.
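As a rough illustration of such track-record weighting, consider the sketch below. The advisor names and data are invented, and the inverse-Brier weighting is one simple scheme among many, not a method prescribed by the research cited.

```python
def brier_score(history):
    """Mean squared error between stated probabilities and outcomes;
    0.0 is perfect, and always answering 50% scores 0.25."""
    return sum((p - o) ** 2 for p, o in history) / len(history)

def weighted_aggregate(judgments, histories):
    """Average probability judgments, weighting each individual by the
    inverse of his or her historical Brier score."""
    weights = {name: 1.0 / (brier_score(histories[name]) + 1e-6)
               for name in judgments}
    total = sum(weights.values())
    return sum(judgments[n] * weights[n] for n in judgments) / total

# Invented inputs: current judgments plus track records of
# (stated probability, outcome as 1 or 0) pairs.
judgments = {"analyst_a": 0.75, "analyst_b": 0.70, "analyst_c": 0.55}
histories = {
    "analyst_a": [(0.8, 1), (0.7, 1), (0.6, 0)],  # fairly accurate
    "analyst_b": [(0.9, 0), (0.8, 0), (0.6, 1)],  # poorly calibrated
    "analyst_c": [(0.9, 1), (0.2, 0), (0.8, 1)],  # very accurate
}
print(f"track-record-weighted estimate: {weighted_aggregate(judgments, histories):.0%}")
```

Note how the historically accurate dissenter (analyst_c at 55%) pulls the aggregate well below a simple average of the three judgments.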
Current State of Affairs
No two JSOTF commanders make decisions in the same way, and no two
intelligence analysts, military services, or government agencies are alike. That said,
certain generalizations can be made which will provide a starting point from which to
make recommendations for improvement. It is important to understand how
commanders make decisions, how the task forces are organized, how personnel
communicate judgments, how individuals are trained, and how personnel within these
task forces are assessed.
First, authority for higher risk decisions is often withheld at the joint inter-agency
special operations task force command level. Decisions to execute raids and kinetic strikes, as well as decisions about how to execute them, all rest on the shoulders of the commander.
Commanders generally come with extensive experience making different types of
judgments.
Commanders generally solicit input to varying degrees, but they may lean heavily
on intuition.30 Most input is filtered through the hierarchical nature of the unit structure.
Therefore, assessments made at lower levels may be diluted by the biases of senior intelligence officers or analysts when they communicate recommendations to the
commander. Some commanders will seek the input of all personnel with any relevant background on the problem, whether a private with four months of experience or a 25-year civilian intelligence professional, but, in the author’s observation, that practice is personality-dependent. Most commanders have a small group of trusted personnel (e.g., senior
enlisted advisor, senior operations officer, senior intelligence officer, or the senior
civilian) who hear the same recommendations that the commander does, and provide
him or her with final advice based upon all inputs.
Commanders’ decisions have tangible results. These decisions can have far-reaching, sometimes tragic, effects on human beings. Generally,
commanders feel tension between three competing motivations. The first is the
motivation to produce results. If the unit is tasked with defeating enemy networks, then
the commander will be motivated to aggressively capture or kill key enemy personnel.
This is why the JSOTF exists. Next is the motivation to keep the risk to the force as low as possible while still achieving the assigned purpose. This motivation exists because the commander needs to preserve combat power, maintain the trust of the force, and care for his or her subordinates. The third form of motivation a commander will experience is
the pressure to reduce the risk of negative strategic impacts. An example of this is the
need to reduce non-combatant casualties for both moral and strategic reasons. All three
of these motivations are integral to a commander’s decision calculus.
A commander, whether consciously or subconsciously, will assess these three
types of motivation against the certainty of a judgment in order to make a decision as to
whether to execute an action, as well as how to execute that action. If the commander
has a clear understanding of the level of certainty that something is true, then he or she
can weigh that certainty against the level of risk to friendly forces and the level of risk of
strategically negative impacts. If the commander is 90% certain that individual X of an
enemy group is in compound 11, and non-combatants have not been observed there,
then the commander will likely be more comfortable conducting a raid to capture the
individual. The commander assesses that the high likelihood of success justifies the low
risk to non-combatants and medium risk to friendly forces. If, on the other hand, the
commander is only 55% certain that the same individual X is in the compound, and if
multiple non-combatants are present in the compound, then the commander may
decide that a raid is not worth the higher risk to non-combatants and the medium risk to
friendly forces. The commander may decide to wait until his or her certainty of mission success is higher. If the commander does not have a clear understanding of his or her level of
certainty that individual X is in compound 11, it is impossible for him or her to effectively
weigh risk versus the likelihood of gain.
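This weighing can be caricatured in a few lines of code. The sketch below is purely illustrative: the certainty thresholds are invented for the example, not drawn from doctrine, and a real commander’s calculus weighs far more than two risk labels.

```python
def raid_decision(p_target_present, risk_to_force, risk_to_noncombatants):
    """Toy decision rule: demand higher certainty as assessed risk rises.
    Threshold values are invented for illustration, not doctrinal."""
    required_certainty = {"low": 0.55, "medium": 0.70, "high": 0.85}
    required = max(required_certainty[risk_to_force],
                   required_certainty[risk_to_noncombatants])
    return "execute" if p_target_present >= required else "wait"

# The 90% and 55% cases from the example above:
print(raid_decision(0.90, "medium", "low"))   # execute: 90% clears 70%
print(raid_decision(0.55, "medium", "high"))  # wait: 55% is below 85%
```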
Next, joint inter-agency special operations task forces consist of a wide variety of
personnel with a vast array of experience levels and areas of expertise. A joint inter-
agency headquarters is usually manned with a core of military personnel from various units, augmented by civilian intelligence professionals from multiple government
agencies. The commander may have limited contact with analysts most knowledgeable
about the problem due to the organizational structure. Within a headquarters element,
intelligence sections are usually hierarchical with the senior intelligence officer, non-
commissioned officer, and senior intelligence analyst at the top.31 Below them, the
section is usually organized into either functional teams or enemy focused teams
depending on the overall task and purpose of the unit. As an example, an intelligence
section can be broken into separate teams that each focus on a different enemy
network, a geographical area, or some other specific problem set. These teams are
usually led by an operations or intelligence officer and senior intelligence analyst.
Of particular note, JSOTFs tend to be inconsistent in how they communicate
certainty. Within the intelligence community as a whole, there is a shortfall in how well
analysts communicate probabilities.32 Some intelligence organizations use common
verbal probability language instead of numerical probabilities, but JSOTFs do not
generally use this language. The National Intelligence Council, when communicating
intelligence assessments, includes an annex (see Figure 1) that defines verbal
equivalency to numerical value.33 Even when defined well, as in this case, the terms can
have up to a 20 percent variance. Even if this type of language were used consistently
and understood by all, it is hardly specific. Joint Publication 2-0, the joint force’s intelligence manual, is even more vague. That publication establishes verbal probability expressions for only three categories along the entire spectrum of probability (see Figure 2).34 There is no consistently used standard throughout the entire joint and inter-
agency community, and even where terms are defined, they have a wide certainty
variance.
Figure 1. National Intelligence Council Estimative Language35
Figure 2. Joint Publication 2-0 Confidence Language36
Next, while training and education vary for members of a JSOTF, most personnel
are trained well on basic biases, heuristics, and the value of critical thinking. As an
example, Army intelligence manuals describe techniques to avoid biases and
groupthink.37 Confirmation bias is particularly emphasized in Army intelligence training
and critical thinking models are taught in multiple professional development courses.
The Army funds and operates five courses of varying lengths designed to enhance
critical thinking through the University of Foreign Military and Cultural Studies, which
describes itself as “the US Army’s purveyor of critical thinking, decision support,
groupthink mitigation, fostering cultural empathy, self awareness and reflection, and red
teaming tools.”38 Another example is the National Intelligence University (NIU), which
describes its core curriculum as including education in “critical thinking and analytical
theory.”39 The question is not so much how to better train personnel on the fundamentals of critical thinking, but rather how to refine individual performance in applying those fundamentals.
Finally, it is important to understand that JSOTF intelligence personnel are
usually not assessed on the accuracy of their judgments, but instead on characteristics
such as their ability to communicate, interpersonal skills, leadership abilities, technical
knowledge, and work ethic. When intelligence personnel are judged on the accuracy of their assessments at all, in the author’s experience, it is only anecdotally. Results are generally not recorded and used to evaluate the analysts.40 Where the joint force clearly evaluates a
close air support pilot on his ability to destroy a ground target under certain conditions,
and a tank gunner on his ability to destroy an enemy tank at designated ranges, the
joint force never systematically assesses an intelligence analyst based upon the
accuracy of his or her judgments.
The judgments that commanders and other operations personnel make are
usually not recorded and compared against results either. Some task forces will record
overall success of targeting, but this data is normally only kept as a measure of overall
success, not as precise data accounting for certainty assessments. Of course, commanders’ and analysts’ judgments will be anecdotally accounted for by
subordinates, peers, and superiors. The problem is that conclusions drawn from
anecdotal observations can be wildly inaccurate.
Recommendations to Improve Accuracy of Judgments
There are numerous ways that JSOTFs can use recent decision research to
improve the accuracy of commanders’ judgments. There are two key recommendations, however, that can serve as the foundation for all other future improvements. These two recommendations are necessary for effective self-examination and provide the launching pad for all follow-on accuracy improvements.41
The first recommendation to increase the accuracy of judgment is to express
assessments using numerical probabilities. There are three primary reasons to use
numerical probabilities rather than verbal ones. The first reason is that expressing
certainty in numerical probabilities prevents any confusion between what the
communicator means and what the receiver understands. Even if a JSOTF uses
standard verbal percentages that are well defined, like the National Intelligence Council
standard presented above, personnel will exhibit variance in their understanding of the
terms. This raises the question: why use qualitative terms that must then be translated into numbers when personnel can just use quantitative terms from the start, eliminating a potentially problematic translation?
As an example, a senior intelligence analyst can say to a commander, “I assess
that it is likely individual X is in compound 11.” So what does the analyst mean? If the
JSOTF uses the National Intelligence Council standard in Figure 1, the analyst could mean that his or her certainty is anywhere from 60% to 75% that the enemy combatant is located there. It is actually somewhat unclear what the overall variance is, because the term “likely” is not precisely defined in Figure 1. Assuming the variance is fifteen percentage points, it is significant. What if, based on the commander’s
assessment of the risk to force and the risk of strategic error, the commander is
comfortable executing a raid based on 75% certainty, but not based on 62% certainty?
How does the commander work through this variance? Why not use a numerical
percentage from the start?
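To see the variance problem concretely, consider the following sketch. The numeric bands are assumptions loosely modeled on Figure 1 (which, as noted, leaves “likely” imprecise), and the 70% comfort threshold is invented for the example.

```python
# Assumed verbal-to-numeric bands, loosely modeled on Figure 1; the
# band for "likely" is itself an assumption since the figure leaves it
# imprecisely defined.
VERBAL_BANDS = {
    "unlikely": (0.20, 0.45),
    "roughly even chance": (0.45, 0.55),
    "likely": (0.60, 0.75),
    "very likely": (0.80, 0.95),
}

low, high = VERBAL_BANDS["likely"]
threshold = 0.70  # commander's illustrative comfort level for this raid
print(f'"likely" spans {low:.0%}-{high:.0%}; at a {threshold:.0%} threshold, '
      f"the decision flips within the band: {low >= threshold} vs {high >= threshold}")
```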
The second reason for JSOTFs to communicate using numerical probabilities is
that it encourages precision, and, subsequently, revision. One of the habits of Tetlock’s
best forecasters is that they refine the probability as much as possible based on their
level of doubt, and they constantly update their forecasts based on the evidence.42 As
an example, an intelligence analyst may currently assess that there is a 65% certainty
that individual X is located in compound 11. If she receives an additional human
intelligence report that further confirms the likelihood of individual X being located in
compound 11, the intelligence analyst can now update her assessment. She may now
say that there is a 75% level of certainty. This is a significant difference that will likely
impact the commander’s overall assessment. If the JSOTF were still using verbal
probabilities, then the analyst would continue to use the same term “likely,” despite the
significant additional information. Nothing new would have been communicated to the
commander unless he or she were specifically briefed on that piece of additional
intelligence. The precision of numerical probabilities is compelling.
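One way to make this updating habit mechanical is Bayes’ rule in odds form, where the new report is expressed as a likelihood ratio. This is an illustration of the arithmetic only, not a procedure the cited research or doctrine prescribes, and the 1.6 ratio is an invented value chosen to reproduce the 65%-to-75% jump in the example.

```python
def update_probability(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds equal prior odds times
    the likelihood ratio of the new evidence."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# 65% prior that individual X is in compound 11, then a HUMINT report
# judged about 1.6 times more probable if X is present than if not
# (an illustrative value, not derived from any real reporting).
print(f"updated certainty: {update_probability(0.65, 1.6):.0%}")  # ~75%
```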
The third reason for JSOTFs to communicate using numerical probabilities is that
it will provide the foundation for effective aggregation by commanders. When all of the personnel providing assessments use numerical probabilities, the commander can easily aggregate the recommendations into an overall numerical probability, to which he or she can then apply his or her own judgment in order to make the decision.
The primary argument against using numerical probabilities is that many personnel will
want to keep their judgments vague. Some forecasters are not comfortable with
precision, because they believe there is too much chance involved to give a precise
estimate of probability. Precision would also impose a particular standard upon the
predictor. The predictor may want to hedge his or her bets by only making broad
forecasts. The first problem with this argument is that, as discussed earlier, the research
shows that precise predictions result in more accurate predictions over time. The
second problem with this argument is that the commander does not have the luxury to
be vague. He or she has to make a decision that actually results in tangible
consequences. Why should a JSOTF commander use vague recommendations to
inform a very precise, concrete action?
The second recommendation is the most important fundamental change for improving the accuracy of judgments within a JSOTF. In addition to using numerical probabilities, JSOTFs should begin documenting the judgments of all intelligence analysts and operational personnel who provide recommendations on any question that can be answered empirically. Commanders’ judgments should also be recorded. JSOTFs should maintain a database that records each individual’s numerical probability assessment, expressed as a percentage level of certainty, and then documents the outcome, once determined. When those questions have been answered with reliable information, all individual judgments should be measured against the actual outcome. All personnel providing judgments to the commander, and the commander as well, will develop a scorecard of accuracy over time.
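A minimal sketch of what one record in such a database might contain follows. The field names and the Brier-style scoring are assumptions chosen for illustration, not an existing schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class JudgmentRecord:
    """One row in a hypothetical JSOTF accuracy database; the fields
    are illustrative, not an existing schema."""
    question: str                   # e.g., "Is individual X in compound 11?"
    forecaster: str                 # analyst, staff officer, or commander
    probability: float              # stated certainty, 0.0 to 1.0
    outcome: Optional[bool] = None  # recorded once resolved to the
                                    # JSOTF's truth standard (e.g., 95%)

    def squared_error(self) -> Optional[float]:
        """Per-judgment Brier component; averaged over a forecaster's
        resolved records, this becomes his or her running scorecard."""
        if self.outcome is None:
            return None  # unresolved judgments are simply not scored
        return (self.probability - float(self.outcome)) ** 2

record = JudgmentRecord("Is individual X in compound 11?", "analyst_a", 0.75)
record.outcome = True  # the raid confirmed the assessment
print(f"squared error: {record.squared_error():.3f}")  # 0.063
```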
This accuracy database would do three primary things. First, while most JSOTF
personnel are very professional and diligent about their recommendations already, it
would lend additional gravity to each judgment, perhaps inspiring an extra look, a harder examination of possible biases, or a better exchange of ideas with peers. It
would enable analysts to examine where they individually made accurate or inaccurate
assessments and critique themselves. Over time, all advisors and commanders can
track their accuracy against the norm and subsequently scrutinize performance to
enhance critical thinking skills. This crucial self-assessment is much less likely to occur
without a hard record.
The second result of the database would be that the organization could scrutinize trends built over time for commanders and staff. Success or failure rates
of personal judgments would no longer be anecdotal. Personnel who are exceedingly
successful could be studied and their habits replicated. Personnel who are not
successful could be critiqued by their raters and peers in order to improve accuracy. In
the worst of cases, individuals who remain below the standard could be removed from
the organization.
The third result of using an accuracy database is that the aggregation the commander can assemble will be much more accurate, because the success rates of individual advisors could be accounted for over time. In other words, the judgments of successful advisors could count for more than those of advisors with less successful track records. As an example, the commander’s advisors may all be providing assessments of between 70 and 80% certainty that individual X is located in compound 11, except for one. This lone advisor may assess only 55% certainty.
How does the commander know how much credence to give to this assessment?
Without knowing the accuracy of individual advisors, the commander would tend to give
all assessments equal credence. But, if the commander knows that the dissenting
advisor has been, by far, the most historically accurate, he or she will clearly give much
more credence to this 55% assessment. At a minimum, the commander would likely ask
more questions to determine why the dissenting advisor’s assessment is so different
from the norm.
There are several arguments against recording judgment accuracy in an
accuracy database. First, there is the question of how to determine what the truth
actually is. A JSOTF needs to determine the true outcome, in order to have a result to
compare the original judgments against. Determining the truth is itself an assessment. Realistically, many judgments could never be scored as accurate or inaccurate, but many others could. Ultimately, the true outcome will
be a judgment based on a level of certainty. The JSOTF will have to establish a
standard for what to document as a true outcome for application in the accuracy
database, such as 95% certainty. As an example, if the JSOTF executes a raid on
compound 11 to capture individual X, and the force captures an individual who admits
that he is individual X, then the commander has likely reached a 95% or higher certainty
level that the task force indeed captured individual X. The database can be updated
appropriately. On the other hand, if there is no admission from the detainee, and no
additional intelligence is collected confirming or denying the capture of individual X, then
this particular judgment outcome may not be able to be recorded at all. Many times, however, the JSOTF will be able to reach 95% or higher certainty post-action.
Another argument against recording the accuracy of JSOTF personnel is that it might create a zero-defect culture that values being correct over producing results. It
may discourage action by commanders. A commander cannot be wrong if there is no
action to judge. This risk can be mitigated though. First, as discussed earlier,
commanders are the only predictors that are already assessed to some degree. While
their accuracy rates are not aggregated over entire careers, in some JSOTFs they are
recorded over short time spans. It is doubtful that recording all individual assessments within a JSOTF will change commanders’ current attitudes. Furthermore, this is exactly why
accuracy would be recorded based on level of certainty, not solely based upon an
empirical result. As discussed earlier, if a meteorologist forecasts a 90% chance of rain and it does not rain, that does not mean the meteorologist was wrong. It just means that he or she should not be wrong more than 10% of the time when making a 90% chance forecast.
Another argument against establishing an accuracy database is that it would cost
money and valuable man-hours to establish and maintain. That said, the author can
think of no better use of resources than to spend them increasing the accuracy of
commanders’ judgments. When compared against aviation fuel expended on a wasted
raid, or still worse, a U.S. casualty, the resources spent on an accuracy database would
seem a mere pittance.
A final argument against using an accuracy database is similar to the counter-
argument described regarding numerical probabilities. There is a popular belief that
making assessments is not hard science, but instead an art that is riddled with chance.
Using this logic, applying the scientific method to judgment is useless because warfare is far too uncertain and complex. Carl von Clausewitz highlighted the critical role that chance plays in warfare, and the U.S. military generally concurs.43 This argument presents a
formidable obstacle to improving judgment in warfare. The joint inter-agency force may
be hesitant to hold personnel accountable for judgments because its members have
been trained and educated that warfare is incredibly complex and uncertain.
Unfortunately, this way of thinking may be limiting the joint inter-agency force’s full
potential.
The problem with this argument is that if science cannot be applied to decision
making in warfare, then all of the decision research findings cited earlier would have to
be flawed as well. Yet, like many of the judgments required of a commander in war, the questions used in the decision research cited above were not simple either. They were empirical, but by no means easy.44 One of the examples that Clausewitz
uses when discussing chance and friction is one in which rain bogs down a hypothetical
battalion’s movement, “keeping it not three, but eight hours on the march.”45 What was
once deemed an axiomatic example of Clausewitz’s chance, the weather, is now remarkably predictable. Humankind applied scientific inquiry and developed an impressive ability to forecast the weather during a battalion movement, and can therefore plan appropriately for the movement time. Modern medicine is rife
with such examples. Healers used to bleed people when they had a fever. Ailments and
cures were attributed to mysticism. When humankind began applying scientific methods to medicine, however, we made incredible strides.46 This is exactly what Tetlock argues in his work. Perhaps applying the scientific method to the field of judgment in warfare can evoke similarly dramatic improvements.
Conclusion
Naturally, successful implementation of these recommendations will not produce
omniscient commanders. Commanders will continue to make faulty assessments no
matter how adept we become at critical thinking, forecasting skills, and aggregation.
Errors will always be present simply due to the limits of human cognition, the limits of
data collection, and the complexity of the environment. That said, joint inter-agency
special operations task forces at all levels need to continue to strive for perfection.
JSOTF commanders can become more accurate by communicating using numerical
probabilities and by recording the accuracy of individual advisors and decision-makers.
While the recommendations herein are tailored for joint inter-agency special operations
task forces, these recommendations are applicable to any decision maker, in any
enterprise, at any level. In fact, once results have been assessed at the JSOTF level,
and assuming accuracy increases, these proposals can be implemented across the
entire joint force by updating doctrine, organization, training and education, and materiel. Overall, these simple changes should directly result in lower risk to friendly
forces, lower risk of strategically negative events, and, most importantly, pronounced
effects on the enemy. The author is 86% certain.
Endnotes
1 Although this paper deals specifically with joint inter-agency special operations task forces targeting enemy networks, the acronym JSOTF will be used for the purpose of brevity and clarity. It is important to understand that the joint and inter-agency nature of these task forces is far more important to the problem than their special operations nature.
2 Chad Garland, “Report: 16 to Get Reprimand for Konduz Hospital Airstrike,” Stars and Stripes, April 28, 2016.
3 Stephen Losey, “Investigation: ‘Confirmation Bias,’ Mistakes Led Coalition To Mistakenly Bomb Syrian Troops,” Air Force Times, November 29, 2016.
4 James Surowiecki, The Wisdom of Crowds (New York: Doubleday, 2004), xvii.
5 Daniel Kahneman, Thinking, Fast and Slow (New York: Farrar, Straus and Giroux, 2011), iBook.
6 Irving Janis, Groupthink (Boston: Houghton Mifflin Company, 1982), 9.
7 Philip E. Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction (New York: Crown Publishers, 2015), 24-38.
8 Ibid., 4.
9 Philip E. Tetlock, Expert Political Judgment (Princeton, NJ: Princeton University Press, 2005), 21.
10 Tetlock, Superforecasting: The Art and Science of Prediction, 18.
11 Ibid., 16.
12 Ibid., 18.
13 Ibid., 3.
14 Ibid., 191-192.
15 Ibid., 277-284.
16 Ibid., 201.
17 Ibid., 191.
18 Ibid., 58.
19 Mandeep K. Dhami et al., “Improving Intelligence Analysis With Decision Science,” Perspectives on Psychological Science Online 10, no. 6 (2015): 754, http://journals.sagepub.com/doi/pdf/10.1177/1745691615598511 (accessed February 7, 2017).
20 Tetlock, Superforecasting: The Art and Science of Prediction, 277-284.
21 Jennifer S. Lerner and Philip E. Tetlock, “Accounting for the Effects of Accountability,” Psychological Bulletin Online 125, no. 3 (1999): 263, https://www.researchgate.net/publication/13203036_Accounting_for_The_Effects_of_Accountability (accessed February 7, 2017).
22 Tetlock, Superforecasting: The Art and Science of Prediction, 284.
23 Surowiecki, The Wisdom of Crowds, xiii.
24 Janis, Groupthink, 9.
25 Surowiecki, The Wisdom of Crowds, 28.
26 Ibid., 41.
27 Ibid., 71.
28 Ibid., 78.
29 David V. Budescu and Eva Chen, “Identifying Expertise and Using it to Extract the Wisdom of the Crowds,” Management Science Online 61, no. 2 (February 2015): 267-280.
30 Stephen J. Gerras and Leonard Wong, Changing Minds in the Army: Why it is so Difficult and What to Do about it (Carlisle Barracks, PA: U.S. Army War College Press, October 28, 2013), 15.
31 U.S. Joint Chiefs of Staff, Special Operations, Joint Publication 3-05 (Washington, DC: U.S. Joint Chiefs of Staff, July 16, 2014), A-9.
32 Committee on Behavioral and Social Science Research to Improve Intelligence Analysis for National Security, Intelligence Analysis for Tomorrow: Advances from the Behavioral and Social Sciences (Washington, DC: The National Academies Press, 2011), 36.
33 National Intelligence Council, Intelligence Community Assessment: Assessing Russian Activities and Intentions in Recent US Elections (Washington, DC: Office of the Director of National Intelligence, 2017), 13.
34 U.S. Joint Chiefs of Staff, Joint Intelligence, Joint Publication 2-0 (Washington, DC: U.S. Joint Chiefs of Staff, October 22, 2013), A-2.
35 National Intelligence Council, Intelligence Community Assessment: Assessing Russian Activities and Intentions in Recent US Elections, 13.
36 U.S. Joint Chiefs of Staff, Joint Intelligence, A-2.
37 U.S. Department of the Army, Soldier’s Manual and Trainer’s Guide For the Intelligence Analyst MOS 35F Skill Level 1/2/3/4, STP 34-35F14-SM-TG (Washington, DC: U.S. Department of the Army, January 7, 2014), E-6.
38 University of Foreign Military and Cultural Studies Home Page, http://usacac.army.mil/organizations/ufmcs-red-teaming (accessed February 8, 2017).
39 National Intelligence University, National Intelligence University 2016-2017 Catalog (Washington, DC: National Intelligence University, 2016), 5.
40 Dhami, “Improving Intelligence Analysis With Decision Science,” 754.
41 Mandeep K. Dhami et al., made recommendations similar to the author’s but specific to the international intelligence community in their article, “Improving Intelligence Analysis With Decision Science,” 753-757.
42 Tetlock, Superforecasting: The Art and Science of Prediction, 277-284.
43 Carl von Clausewitz, On War, eds. and trans. Michael Howard and Peter Paret (Princeton, NJ: Princeton University Press, 1976), 120.
44 Tetlock, Superforecasting: The Art and Science of Prediction, 88.
45 Clausewitz, On War, 120.
46 Tetlock, Superforecasting: The Art and Science of Prediction, 19.