Variability Analysis based on Multi-Objective Performance ...und+Bachelor_Arbeiten/MasterThe… ·...
Transcript of Variability Analysis based on Multi-Objective Performance ...und+Bachelor_Arbeiten/MasterThe… ·...
Thomas Hennig
Variability Analysis based on
Multi-Objective Performance and
Throw Acceleration in Dart-Game
Intelligent Cooperative Systems
Computational Intelligence
Variability Analysis based on Multi-Objective
Performance and Throw Acceleration in
Dart-Game
Master Thesis
Thomas Hennig
November 20, 2018
Supervisors: Prof. Dr. Sanaz Mostaghim,
Chair of Computational Intelligence
Prof. Dr. Frank Ohl,
OvGU, Leibniz Institute for Neurobiology
Advisors: M.Sc. Heiner Zille
Dipl.-Neurowiss. Lars T. Boenke
Thomas Hennig: Variability Analysis based on Multi-ObjectivePerformance and Throw Acceleration in Dart-GameOtto-von-Guericke UniversitätIntelligent Cooperative SystemsComputational IntelligenceMagdeburg, 2018.
Abstract
Movements are part of everyday activities, need to be learned and can be im-proved regarding their movement goal. In related work, occurring variancein movements is mostly considered as noise. Whereas theories from cyber-netics and dynamic system theory suggest that part of this variance supportsachieving the movement goal.
The aim of this thesis was to explore the relation of variability between throwperformance and throw physiology in throwing darts. Di�erences betweenplayers that perform better and players that perform worse should be identi-�ed. In Addition, di�erences between players that improve more and othersthat improve less should be identi�ed. In related work, a single measure is usedto quantify performance in throwing dart. In this thesis, multiple measuresshould be used to quantify performance. For analyzing the relation betweenperformance and the way of throwing, 6 visual analyses were conducted. Ineach analysis, a good and a bad group based on performance measures wereconfronted so that di�erences in their throw pattern could be identi�ed. Toquantify performance, two performance-measures for sets of throws were de-�ned to represent scatter and average deviation. To quantify the variance ofthrow movements, the variance of an acceleration model that was �tted toevery throw was determined.
A triangular relation between our performance measures was found that is nei-ther the e�ect of their de�nition nor an e�ect solely created by our method.Additional unique di�erences were identi�ed by analyzing both performancemeasures together using the fronts from non-dominated sorting. For the ma-jority of observed di�erences in the throw pattern, a better performance wasrelated to a smaller variance in the throw pattern. For few observed di�erencesthrowing, a better performance was related to a higher variance in the throwpattern. For the di�erences in the development of parts of the throw pattern,clearly less observed di�erences were related to the development of variance.
Di�erent types of variability relations are suggested by our results. The vari-ance of these relations is compound when quantifying the variance in the wayof throwing. Further work needs to be done to separate these compound vari-ances.
I
Preface
First of all, I like to thank Lars T. Boenke, Heiner Zille, Prof. Dr. Frank Ohl,and Prof. Dr. Sanaz Mostaghim for giving me the opportunity to work onthis topic. I thank them for all their time for our many meetings, for all theirsupport and all their advice during the work for this topic. I thank them andDominik Weikert for their preceding work and ideas.
Next, I want to thank Prof. Ross Cunnington, Tamara Powell et al. forproviding us with the data of the experiment for our analyses and for explainingthese in detail.
Furthermore, I like to thank Dr. Christoph Steup for sharing his know-how inthrowing darts and his expertise on analyzing measured data, as well as thestudents in the work group of Prof. Dr. Mostaghim for their advice and ideas.
It the end, as this thesis is also the completion of my studies, I like to thankmy partner and my family for their support throughout the years.
III
Contents
List of Figures VII
List of Tables IX
1 Introduction 1
2 Background 5
2.1 Game of Darts . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Theories of Motor Control . . . . . . . . . . . . . . . . . . . . . 7
2.3 Approaches to quantify Physiology and Performance . . . . . . 10
2.4 Approach to handle multiple Performance Measures . . . . . . . 13
2.5 Goals and Expectations . . . . . . . . . . . . . . . . . . . . . . 16
3 Data Basis 17
3.1 Dart Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Towards Performance Measures . . . . . . . . . . . . . . . . . . 20
3.3 Focus on accelerating Force as recorded physiological Measure . 23
4 Proposed Method 29
4.1 Performance Measures . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Describing the Throw Pattern by Throw Features . . . . . . . . 31
4.3 Quantifying Development . . . . . . . . . . . . . . . . . . . . . 37
5 Results 39
5.1 Performance measures from Experiment Data . . . . . . . . . . 40
5.1.1 Regarding Block Performance . . . . . . . . . . . . . . . 40
5.1.2 Regarding Performance Development . . . . . . . . . . . 45
5.2 Relation between Performance and Throw Pattern . . . . . . . . 48
5.2.1 Groups based on Accuracy-error . . . . . . . . . . . . . . 49
V
Contents
5.2.2 Groups based on Precision-error . . . . . . . . . . . . . . 50
5.2.3 Groups based on Fronts from non-dominated Sorting . . 54
5.2.4 Comparison of Sets of Observations . . . . . . . . . . . . 56
5.3 Relation between Performance Developments and Throw Pat-
tern Developments . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.3.1 Groups based on Accuracy-error . . . . . . . . . . . . . . 60
5.3.2 Groups based on Precision-error . . . . . . . . . . . . . . 63
5.3.3 Groups based on Fronts from non-dominated Sorting . . 65
5.3.4 Comparison of Sets of Observations . . . . . . . . . . . . 68
6 Discussion 71
6.1 Our Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2 Our Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Bibliography 87
VI
List of Figures
2.1 The dartboard that we used in this thesis . . . . . . . . . . . . . 6
3.1 The inertial measurement unit attached to the wrist . . . . . . . 19
3.2 Distribution of hits with high scatter and high displacement . . 21
3.3 Distribution of hits with low scatter and low displacement . . . 22
3.4 Distribution of hits with low scatter and high displacement . . . 22
3.5 Distribution of hits with high scatter and low displacement . . . 23
3.6 Time series for accelerating forces for two subsequent throws . . 24
3.7 Properties of rest phases . . . . . . . . . . . . . . . . . . . . . . 27
4.1 Extracted time series of the accelerating force for all throws of
participant J01 . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2 Time related form features in the piecewise linear model . . . . 33
4.3 Amplitude related form features in the piecewise linear model . 33
4.4 Fit for the 42nd throw of participant J01 . . . . . . . . . . . . . 35
4.5 d_t1_t2 as example for a throw feature from a time di�erence . 36
4.6 d_v1_v4 as example for a throw feature from a amplitude dif-
ference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.7 s12 as example for a throw feature from a slope . . . . . . . . . 37
5.1 Occurring performance measure values . . . . . . . . . . . . . . 41
5.2 Visualizing distortions in the relation between our performance
measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3 Occurring values for development in performance measures . . . 46
5.4 Confrontation based on accuracy-error . . . . . . . . . . . . . . 51
5.5 Confrontation based on precision-error . . . . . . . . . . . . . . 52
VII
List of Figures
5.6 Confrontation based on non-dominated fronts on both perfor-
mance measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.7 Confrontation based on development of accuracy-error . . . . . 62
5.8 Confrontation based on development of precision-error . . . . . 64
5.9 Confrontation based on fronts from non-dominated sorting on
the development in accuracy-error and precision-error . . . . . . 67
6.1 Overview of our method to prepare the confronting visualizations 72
6.2 Traces of block performance . . . . . . . . . . . . . . . . . . . . 76
6.3 Annihilation of forces . . . . . . . . . . . . . . . . . . . . . . . . 78
6.4 40th throw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.5 41st throw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.6 42nd throw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.7 Development of IQR for d_t1_t2 over time . . . . . . . . . . . 83
VIII
List of Tables
4.1 Performace measure examples . . . . . . . . . . . . . . . . . . . 31
5.1 Overview of throw features with observed di�erences in perfor-
mance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2 Overview of throw features with observed di�erences in devel-
opment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
IX
1 Introduction
When we learn a second language, we learn words, phrases, grammar, pro-
nunciation and as part of this also often sounds that are speci�c to the new
language. For people with German as the �rst language, this might be the
th-sound when learning English. The other way around, for people with En-
glish as the �rst language who learn German, there are the umlauts. To create
the th-sound, we need to coordinate tongue and the tip of the upper teeth
to form a slit through which we press air with our breath to create a hissing
sound. For the umlauts, we take the example of the sound for ü. We start
by coordinating the jaw, tongue, and lips to continuously create the English
sound for the letter e. Now, keeping the jaw and the tongue in this position,
we purse our lips slowly until just a �ngertip-sized opening is left. We see that
we need to coordinate the movement of multiple parts of our body to create
a speci�c new sound. To use these sounds in the language, we need to learn
to create those sounds reliably and fast - so, others may understand us. As a
baby, there are not just a few sounds that we need to learn. We need to learn
to move mouth, jaw, tongue and vocal cords and to coordinate this with the
breath for the many sounds of the language of our parents.
For many everyday actions, we need to learn movements. For communication,
apart from talking, we also need to learn the movements for writing whether
using a pen or using a keyboard. We need to learn the movements for walking,
for grabbing. After an accident, we may need to learn to talk again. We may
need to learn to walk using an arti�cial limb. In sports, we need to learn
movements. For football, we need the moments to run and control a ball at
the same time. In darts or curling, we need a movement to accelerate a missile
to hit a distant target.
We know that we move due to forces that apply to limbs and tissue. We
know that these forces originate from muscles that contract. We know that
these muscles contract due to signals provoked in the brain. We know the
structure of the brain, that certain brain areas correspond to certain body
1
1 Introduction
parts. However, we do not know how it works so that movements emerge. We
do not know how the brain learns a new movement. We do not know how
the brain changes to improve the movement. If we would know, we could tell
what to do to be an expert and how to get there. We could choose strategies
to learn faster.
A question that is related to this is whether we execute always the exact
same movement. [Bernstein1967] observed that eventhough the movement
appears to be very similar, there is always a variability in the movement which
also covers alternative movements for the case that some movements are not
feasible. For example, we can walk. Yet, the movement di�ers as we walk on
a rough ground or a slippery groung, as we walk a slope up, down or on a �at
ground.
Our intention for this thesis is to make �rst steps in the exploration of the
variability in movements. Our goal is to explore for throwing darts the relation
between performance and the way of throwing with a focus on variability. As
we see in the related work in Chapter 2 that usually just a single objective
or in other words performance measures is used, we want to include multiple
performance measures in our analyses. The movement of throwing darts as an
object of investigation has several advantages compared to other movements.
We have a precisely de�ned goal with the dartboard. The movement itself
a�ects comparatively few limbs - mainly arm and hand. In addition, there are
few constraints to throwing darts. So, we do not need exceptional strength,
speed, agility, endurance or intelligence. In the exploration of the dart throwing
movement, we follow two research questions.
• What are the di�erences between players that show a high performance
and players that show a low performance?
• What are di�erences between players that improve more in their perfor-
mance than others?
To �nd answers to these questions, we structure this thesis into the following
parts.
• In the next chapter, Chapter 2, we present the basis to investigate the
movement of throwing darts.
• In Chapter 3, we present the dataset that is the basis for our analyses.
• In Chapter 4, we describe our developed method to analyze this dataset.
2
• In Chapter 5, we present the results of applying our method to the
dataset.
• In Chapter 6, we discuss the results, the strengths and weaknesses of our
method, as well as options for future work.
3
2 Background
In this chapter, we want to provide the knowledge that is required to under-
stand our method. We will start by describing the game of darts - what it is
and how it works. Based on this, we will introduce the topic of motor control
which gives us theories to explain how we create throw movements in darts
and which leads us to our expectations for the following chapters. Afterward,
we continue with the topic of quantifying throw performance as well as throw
physiology which will be necessary for analyses.
2.1 Game of Darts
Darts is a popular game and competitive sport. Participants throw repeatedly
from a de�ned distance small missiles - called dart - at a static target. The
target is a disk - the dartboard - which is attached in a de�ned height at a
wall. The dartboard is subdivided into multiple �elds. In the center, there
is a circular �eld called bull's eye. Around it, the dartboard is divided into
several rings as well as it is divided into several sectors of a circle which
forms a multitude of ring segments. Figure 2.1 shows the dartboard that
we used for this thesis. It has a diameter of about 43 cm and is subdivided
into 9 rings and 12 equal sized sectors that form 108 ring segments. There are
di�erent variants of the game with di�erent goals in which participants may
compete. These variants have in common that a certain amount of points is
assigned to each �eld which is gathered for every hit in the respective �eld.
In a variant of the game for more advanced participants, participants may
seek to hit a certain combination of �elds in order to gather points for their
personal scores with the goal to reach a certain sum without exceeding it. On
the other hand, beginners may just try to hit the bull's eye receiving more
points the closer their hit is to the bull's eye. The process of throwing a dart
starts with an acceleration phase in order to reach an appropriate velocity and
5
2 Background
Figure 2.1: The dartboard that we used in this thesis - The red circular �eld in
the center is the bull's eye. Around it, the dartboart is subdivided
by black and white rings, as well as 12 sectors of a circle into �elds
in the shape of ring segments.
movement direction so the dart may �y from the participant to the dartboard
bridging the distance between both. When the dart reaches an appropriate
velocity then the participant releases the darts into �ight. In �ight, the dart
is not in�uenced by the participant anymore. Instead, for example, gravity is
in�uencing the dart which was compensated by the participant before. The
�ight phase ends with the impact of the dart. It hits the dartboard or the
surrounding wall reducing it velocity greatly. On hitting the dartboard, the
dart usually gets stuck. Under the name hit, we refer to the impact position
of a single throw. Though, we name hits of throws that missed the dartboard
also misses. Regarding the throw process of participants, we observe that
participants prepare a throw by reaching an initial state that usually extends
the distance for acceleration. Assuming that they already hold the dart in one
hand, they may move their hands up to the level of the shoulder and backward.
Afterward, they start accelerating this hand. Doing so, the hand moves forward
towards maximum extension. There are many ways to accelerate the dart. To
mention just two, we observed players mainly rotating their forearm around the
elbow joint creating a rotational movement while for others, we saw an ejecting
movement by expanding forearm and upper arm rotating in elbow joint and
shoulder joint. Since there is a maximum extension of the arm, there must be
a phase of deceleration that reduces the velocity of hand and arm. Reaching
6
2.2 Theories of Motor Control
the maximum extension means that in this movement at some point the bones
of upper arm and forearm collide in the elbow joint. To prevent such a hard
break, participants also cancel acceleration before the maximum extension is
reached and decelerate actively to stop the arm. Releasing the darts between
those phases of acceleration and deceleration is a matter of timing. While
the arm is used to accelerate the dart, it is the hand that holds the dart and
releases it into �ight. On releasing too early, the dart may not have reached
the necessary velocity. On releasing too late, the dart may already be to slow
again due to deceleration. The throw of a single dart is a complex process. It
requires that the participant coordinates the hold and release of a dart with
the acceleration and deceleration of the arm in order to hit a distant target.
In the following section, we will take a look at theories that try to explain how
we control those complex motoric processes.
2.2 Theories of Motor Control
Motor control is the discipline in which scientists study how movements arise in
humans that su�ce certain movement goals (cf. [Edwards2011]). This includes
the question for the role of cognitive processes. In darts, an example movement
goal is to create repeatedly a movement to throw the dart and hit the bull's
eye. With the concept of performance we qualify the ful�llment of movement
goals in this context (cf. [Edwards2011]). Permanently hitting the bull's eye
ful�lls constantly and to a maximum extent the goal which represents the
highest performance. Hits close to the bull's eye ful�ll the goal more - showing
a higher performance - than hits that miss the dartboard. Based on motor
control, the discipline in which scientists study how movements are acquired
and their performance improved is called motor learning (cf. [Edwards2011]).
This discipline encompasses - for example - our research question of what
changes occur in throwing dart together with improvements in performance. So
in the game of darts, learning is connected to the improvement in performance.
For a detailed overview of the development and branches in motor control and
motor learning, we like to recommend [Edwards2011]. Here, we just want to
provide the steppingstones that lead to the expectations for our analyses.
According to [Edwards2011], the older branch in motor control are closed-
system theories or also called cognitive-based theories. Applying these, we
7
2 Background
consider mind and body to be separated and that the mind exclusively con-
trols the body. At the beginning of a movement the mind is fed by sensory
inputs which are then processed by the mind to extract information about body
and environment. Using the available information and following the movement
goal, the mind chooses a certain movement for which it plans a program to
steer the body. Afterward, the program is executed by sending signals to mus-
cles which evoke actions. A movement is the sum of these actions. Applying
this to darts, participants start by observing the environment and the arm in
the initial throwing position. From these sensory inputs, they extract infor-
mation on the vector towards the desired �eld of the dartboard. With this
information, they decide, e.g., for a rotational throw movement that creates a
throw trajectory towards that �eld. The participants start to plan which mus-
cles, joints, and bones are needed to create this motion. After the planning is
done, the signals are sent towards the muscles that extend the forearm regard-
ing the upper arm by contracting and pulling the forearm accelerating. This
accelerates the arm as well as the dart that is held in the hand. On another
signal to the muscles of the hand, the dart is released into �ight. On the last
signal, the muscles of the arm stop accelerating the arm and start decelerating
it instead until the arm rests again. The actions for acceleration, release, and
deceleration form the movement of a throw.
[Bernstein1967] objected the idea of movements to be exclusively controlled
by cognitive processes as he identi�ed two problems that oppose the idea of
cognitive-based theories. His �rst problem - the degree of freedom problem
- describes that from limbs down to the level of muscles, joints, and bones,
down to the level of muscle cells there are too many entities that would re-
quire control for a movement. So a controlling program for any movement
would be too complex to be able to be e�ciently executed. With his second
problem - context-conditioned variability - he questioned the assumption of
cognitive-based theories that the body is independent of external in�uences.
He observed that depending on di�erent external in�uences movements pur-
suing the same movement goal were generated di�erently. He observed that
while slowly lowering a stretched arm, muscles in the back are pulling the arm
down when another person is actively pushing the arm up. Whereas, when
only gravity is a�ecting the arm then muscles in the shoulder are pulling the
arm up.
Bernstein's chain of thoughts opposed an exclusive control of cognitive pro-
cesses and showed a strong interrelation of body and environment in the gen-
8
2.2 Theories of Motor Control
eration movements. To explain how we generate movements despite the de-
grees of freedom, he introduced the model of temporary groupings of muscles,
joints, and bones - called synergies - to certain action units that reduce the
number of degrees of freedom. Doing so, he opened the alternative branch of
dynamic-system theories in motor control in which we consider body, mind,
and environment as complex, interrelated subsystems in which movements
are emergent patterns from self-organizing structures in body and mind (cf.
[Edwards2011]).
Another of Bernstein's observations was that even though movements that
pursued the same movement goal were very similar, there were always varia-
tions. He named this phenomenon repetition without repetition as it describes
repetition in performance without repetition of the precise movement. Based
on this, [Latash2008] developed the concept of good and bad variance. While
there is a variance in the movements that su�ce the given goal there is another
variance that reduces the performance. An example of good variance may be
advanced darts players who show similar performance according to di�erent
techniques. They may execute a rotational throw and hit the bull's eye, they
may execute an ejecting throw and hit the bull's eye, and they may throw
backward above their shoulders and hit it again. On the other hand, we could
introduce bad variance by starting to shake them during the throw which adds
noise and degrades their performance.
From variance in motor control, we make a step to variety in cybernetics -
"the science that studies the abstract principles of organization in complex
systems" [Heylighen and Joslyn2001]. Here, [Ashby1958] described for sys-
tems with control that pursues a certain state with his law of requisite variety
and his idea of variety destroying variety the relation between the variety of the
state and the variety of in�uences. In his model, we have the system state for
which control pursues to keep it as close as possible to a goal state. There are
perturbations that in�uence the system state to deviate from the goal state.
The controller has certain actions to counter the e�ect of perturbations. Ac-
cording to his law, a controller needs enough variety in their actions to counter
the variety due to perturbations to the system in order to maintain the desired
state. Equation 2.1 represents the relation between the system's in�uences and
the goal variable (cf. [Heylighen and Joslyn2001]). V(S) represents the variety
in the system state. V(P) represents the variety of perturbances. The variety
9
2 Background
of regulating actions is represented by V(A). B is a spontaneous decrease in
variety due to bu�ering.
V (S) ≥ V (P )− V (A)−B (2.1)
Translating it to darts, we have - for example - the goal to hit the bull's eye.
Hitting the bull's eye, the relevant part of the system state is the desired state.
For hits close to the bull's eye the system state is closer to the desired state
than for hits in distance to the bull's eye. Ideally, we have no or a very small
variety in the system state always hitting the bull's eye �eld. Perturbations in
dart can be inside the player, e.g., due to breathing or due to the heartbeat.
They can also be in the environment, e.g., a sudden wind that in�uences the
path of the �ying dart. One action for the participant as a controller to counter
perturbations is, e.g., to wait for the right moment to avoid perturbations due
to heartbeat or breath. A bu�er in this example is the bull's eye itself. Though
it is small it covers a certain area allowing a variety of di�erent positions or
states to su�ce the goal or to be the goal state. If we imagine that we increase
the size of the bull's eye to the size of the dartboard then we see that for a
�xed set of hits more hits are inside the bull's eye achieving the goal state and
improving performance.
In our research questions, we ask for the di�erences between good and bad
participants in darts, and the di�erences between participants that improve
more than others. Now, we combine these with Ashby's law to concretize our
hypotheses. Following Ashby's "variety destroys variety", we expect for good
participants with lower variety in ful�lling the goal to observe a higher variety
in throwing the dart in comparison with participants with a bad performance.
For the development over time, we expect that participants that improve more
in performance also show a bigger increase in variety during their throws com-
pared to participants that improve less. So our approach is to compare variety
in the regulation - represented by the variety in the physiology during the
throw - with the variety in the ful�llment of the goal variable which relates
to the inverse of performance. This leads us to the task of quantifying the
physiology of throws as well as performance.
10
2.3 Approaches to quantify Physiology and Performance
2.3 Approaches to quantify Physiology and
Performance
We can quantify the throw process at di�erent levels. We could measure
the electrical activity for the brain to quantify cognitive processes. For the
subsystem of the muscles that contract to generate the forces that lead to a
movement, we could measure their electrical activity using EMG (electromyo-
gram). The next level could be the kinematics of the throwing arm and hand
- the forces and resulting accelerations, velocities and trajectories. Lastly, we
could also track the kinematics of the thrown dart. Let's take a look at what
others did in this area. [Cheng et al.2015] opposed a group of novice dart play-
ers with a group of expert dart players to �nd di�erences in electrical surface
activity of the brain. They quanti�ed physiology by measuring the electrical
surface activity on the brain using EEG (electroencephalography) and extract-
ing oscillations in a certain frequency interval. [Weber and Doppelmayr2016]
explored the e�ects of imagining throwing on performance and electrical sur-
face activity on the skull using EEG. Works that examine the kinematics of the
throwing arm include [Lee et al.2014]. They explore the di�erences in throw
movement due to alterations in the bone structure of the wrist after an injury.
To collect kinematic data, they simulated the throw process stepwise taking
pictures using CT (computer tomography) and reconstructed the movement
from the collected data. Even though in our related work nobody used kine-
matics to quantify the physiology of a throw, we see some options to do so.
[Preim et al.2009] do not investigate the movements of any kind of throw. But,
they do investigate the dynamics of a �uid in the blood circulation. In their
work, they give an overview of techniques to visually explore perfusion prop-
erties of body tissue. These properties are determined from tracking a �uid
contrast-agent in the blood circulation using, e.g., MRI (magnetic resonance
imaging) or CT. For each voxel of these recordings, they get a time-series on
the measured signal intensity that changes due to the moving contrast agent.
These series share a certain pattern when the contrast-agent crosses. In the
analyses of dart throw kinematics, we have also process patterns that occur in
every throw, e.g., the initial acceleration and the �nal deceleration of a throw.
[Preim et al.2009] �tted a simple process model for the signal intensity due to
the passage of the contrast agent in order to abstract the data and to extract
certain form properties for the process in each voxel. Afterward, they aggre-
gated these properties visually to show the distribution in the tissue. Having
11
2 Background
such properties for dart throws we may aggregate them to quantify variance
across several throws.
Looking at performance measures, each of the following related works set the
movement goal to hit the bull's eye by throwing the dart. [Cheng et al.2015]
and [Weber and Doppelmayr2016] quanti�ed performance as averaged ring
scores over a batch of throws. They assigned decreasing scores from the bull's
eye to the outer rings of the dartboard. Each hit was then awarded the score
of the hit �eld. Closely related to ring scores are displacements of hits in
relation to the bull's eye. While ring scores assign high values to hits close
to the bull's eye and small values to hits in distance, it is the opposite for
displacements. They measure the distance between the bull's eye and hit. So
they are small when a hit was close to the bull's eye and big when far away.
[Seidler et al.2017] investigated the e�ects of electrical surface stimulation on
the brain to the adaption to changed visual conditions. They quanti�ed perfor-
mance as the mean of the horizontal displacement in cm. [van Beers et al.2013]
used a di�erent approach while investigating the learning rate in a darts ex-
periment. They quantify performance using the variance in the displacement
of hits and motivate this with the fact that the variance of hits must be small
to reliably hit the bull's eye.
In the related work, each time just a single performance measure was used. We
recognize here the problem of using just a single performance measure because
for each one there are cases that seem equal according to the measure but that
are di�erent in ful�lling the goal on hitting the bull's eye. Regarding the use
of variance as only performance measures, though a small variance in hits is
needed to reliably hit the bull's eye, it is easy to see that this condition is not
su�cient. Let's imagine a participant with the smallest possible variance in
hits always hitting the same position. If this participant always hits the bull's
eye then this shows the highest possible performance. Yet, when the hits are
always at the lower rim of the dartboard then it fails the movement goal of
hitting the bull's eye for which we expect a low performance. Nevertheless,
the variance for both cases is equal. On the other hand, also regarding the
displacement measures, we want to show an example pair of equal performance
according to the displacement measure with di�erent degrees in ful�lling the
goal of hitting the bull's eye. For the �rst participant, we imagine a throw
behavior of hitting repeatedly just the same ring on the dart board. Based on
the mean displacement of this participant, we imagine a second one that shares
the same mean displacement but whose hits are scattered over the dartboard.
12
2.4 Approach to handle multiple Performance Measures
We can now argue that the �rst one ful�lls the goal less than the second because
the �rst one is unable to hit the bull's eye while for the second one some hits
may land inside.
We see that with just one of those measures we cannot completely quantify
performance on hitting the bull's eye. In the following thesis, we do not want
to search for a better measure for quantifying performance that overcomes
this problem. Yet, we want to use for our analyses multiple complementary
measures.
2.4 Approach to handle multiple Performance
Measures
In the discipline of multi-objective optimization optimization problems with
multiple performance measures are handled. Objective is the term for a
performance measure in this discipline. Having multiple objectives, we may
evaluate two elements for each of them. If all of those objective support that
an element x is better than an element y then it is simple to conclude that x
is better according to all objectives. However, if one objective supports that x
is better than y and another objective supports the opposite - that x is worse
than y - then we have a con�icting situation and no clear concept of better
and worse. To be still able to apply the concept of better and worse, [Deb2011]
propose to extract for a set of elements the non-dominated front which is a
subset that we can consider better than the rest of the set. Extracting non-
dominated fronts iteratively gives us a partition of the elements with multiple
performance measures into disjunct subsets for which it de�nes that certain
subsets are better than others.
The subsets are determined by iteratively computing the subset of non-
dominated elements - also called non-dominated front - and removing it from
the set for the next iteration. The non-dominated front is the set of elements
for which we cannot state that they are clearly worse than any other element.
Having multiple measures, we can de�ne a domination relation for each pair
of two elements based on the domination relations for single measures. For
a single measure, an element x dominates the other one y when we consider
x better than y (cf. Equation 2.2). If we think of scores, we might consider
13
2 Background
higher scores better than smaller ones. If we think of error measures, we usu-
ally consider smaller values to be better. For multiple measures, an element
x dominates an element y when x is not dominated in any measure by y and
x dominates y in at least one measure (cf. Equation 2.3). This relation al-
lows that an element x either dominates an element y, that x is dominated
by y, or that both do not dominate each other. Searching the elements that
are not dominated by any other element gives us the non-dominated front (cf.
Equation 2.4). The elements of the non-dominated front do not dominate each
other but they dominate all the remaining elements. For this reason, we con-
sider the elements of the this front to be better. By iteratively determining the
non-dominated front and removing it from the set, we partition the original set
into a series of fronts. For non-dominated fronts, all the elements inside a front
do not dominate each other. Though, all elements of a front are dominated by
elements of fronts that appear earlier in this series - except for the �rst front
of the original set. For this reason, we can consider fronts that appear earlier
in this series of fronts to be better than fronts that appear later. This gives us
an order based on multiple performance measures.
x dominates y regarding f ⇐⇒x≺
fy ⇐⇒ x is better than y regarding f
with f as objective
(2.2)
x dominates y ⇐⇒x≺ y ⇐⇒ (x dominates y regarding at least one objective)
and (x is not dominated by y regarding any objective)
⇐⇒ ( ∃f∈F
: x≺fy) ∧ (¬ ∃
f∈F: y≺
fx)
with F as set of objectives
(2.3)
non-dominated front =subset of X consiting of elements that
are not dominated by any other element of X
={x|(x ∈ X) ∧ (¬ ∃y∈X
: y≺x)}(2.4)
14
2.4 Approach to handle multiple Performance Measures
In order to visualize fronts based on 2 measures, we can exploit their inherent
order. Since the elements of a front do not dominate each other, we know that
for every two elements x and y the element x dominates y in one measure and
y dominates x in another measure. Since there are just two measures, we can
sort all elements in improving order according to one measure to a series that is
sorted at the same time in degrading order according to the second measures.
In Figure 5.3, Sub�gure (2) we depicted two performance measures along the
horizontal as well as the vertical axis. For each front, we used their inherent
order to create a polyline which connects all elements. In the sub�gure, the
best fronts are located near the bottom and left part of the area. The worst
fronts are in the top right.
Next, we take a look on how others applied non-dominated sorting for ana-
lyzing data using multiple performance measures. [Vrugt et al.2007] developed
and analyzed a simulation on bird migration between Europe and Africa. They
used �ight time and energy use as performance criteria for di�erent migration
routes. They analyzed the relation between the �rst non-dominated front on
these measures and the migration routes. They found that the tradeo� between
both measures leads to two main branches in migration routes - each one pri-
oritizing one measure. [Levin2014] explored the application of non-dominated
sorting in the analysis of startup performance. They used growth and prof-
itability as performance measures. By extracting the �rst non-dominated front,
they were able to retrieve the relation between the measures that were used to
generate their data. Both works focus on analyzing the relations in the �rst
non-dominated front. Doing so, they ignore big parts of their datasets that
are not optimal or even show bad performance.
[Hulikanthe Math2017] explored empirical data on startup performance using
survival time and the number of employees as performance measures. With
an analysis called intrafront-analysis, they opposed the ends of a given front
from non-dominated sorting in order to identify di�erences between the ends
or respectively tendencies along the front regarding other startup properties.
They applied this to chosen good and bad fronts. In addition, they opposed
a good and a bad front regarding their startup properties which they called
interfront-analysis. [Hulikanthe Math2017] did not focus just on the �rst non-
dominated front. Instead, using their intrafront- and interfront analysis, they
also explored the worse parts of the data using multiple performance measures.
15
2 Background
We see the possibility to use non-dominated sorting for our analysis using
multiple indented performance measures on the performance in throwing darts.
It will give us a concept of good and bad according to multiple measures.
Based on this concept, we can oppose well and badly performing participants
to �nd di�erences in their throw physiology. This will give us an answer to our
research question that will be di�erent from using just a single performance
measure.
2.5 Goals and Expectations
In this thesis, our goal is to explore the relation between throw performance
an throw physiology in darts. For this, we want to quantify the physiology of
the throw movement. In addition, we want to quantify the performance of the
throw movement. Here, we want to use multiple performance measures and
analyze them in a combined way using the fronts from non-dominated sorting.
Due to the connection to Ashby's law of requisite variety, we expect better
performing participants to show more variety in their physiology than worse
performing participants. For the development over time, we expect partici-
pants that show a stronger improvement in performance to also show a stronger
increase in variety in their physiology compared to players that improve less.
Lastly, we expect to get additional �ndings by applying multiple performance
measures and using them in a combined way for the analyses.
In the next chapter, we present our method that we developed to ful�ll these
goals and to test our expectations.
16
3 Data Basis
In this chapter, we describe the dataset that is the basis for our analyses.
Based on our research questions and our expectations, our goal is to explore
the relation between performance and physiology of dart throwing. We are
grateful as we were provided with the dataset for a dart experiment which
consists of data related to the outcome of throws and measures related to the
physiology of the respective throws. First, we describe the experiment and the
collected data. Second, we describe how the data is related to performance.
Lastly, we describe why we focus in this thesis on the data for accelerating
forces.
3.1 Dart Experiment
We thank Prof. Ross Cunnington, Tamara Powell, et al. from the University
of Queensland and Queensland Brain Institute for providing us the datasets
for the dart experiment that they conducted. For this experiment, they let
participants throw darts subsequently 500 times with the task to hit the bull's
eye.
They conducted this experiment at the University of Queensland and Queens-
land Brain Institute approved by the University of Queensland Medical Re-
search Ethics Committee (UQ Project No. 2011001394). They conducted the
experiment in two phases with Phase 1 in 2015 and Phase 2 in winter 2017
/ 2018.
In a well-lit room they marked a line on the ground with a distance of 267cm
to the wall with the dartboard marking the throw area for participants. In
Figure 2.1, we see the dartboard that they used in this experiment. In the
center, there is the red circular bull's eye. Around it, the dartboard is divided
into 9 Rings and 12 sectors of a circle which creates 108 ring segment-shaped
17
3 Data Basis
�elds. The dartboard spans in total a diameter of 43cm. It was attached to
the wall with a height between �oor and bull's eye of 155cm in Phase 1 and 143
cm in Phase 2. As for darts, in Phase 1 two sets of 5-6 darts were provided.
One set with a dart weight of 22g each as well as another with a dart weight
of 26g each. In Phase 2 only a set of 5 darts with 22g was supplied.
They instructed each participant to stand behind the line marked on the
ground, to try to hit the bull's eye, and to throw the set of 5-6 darts be-
fore retrieving the darts. In Phase 2, they gave participants additional in-
formation on how to throw according to the recommendations from http:
//dartbrokers.com/dart-technique.
In each recording session, there was just one participant. They conducted a
session in one piece. In the beginning, they equipped participants with an
inertial and EMG measurement device at the forearm as well as a mobile scalp
EEG. The participants threw their set of 5-6 darts with own pace. In total
participants threw 500 times separated into 5 sequences of 100 throws each.
We call these sequences of 100 throws blocks. So block 1 refers to the �rst 100
throws while block 5 refers to the �fth or last 100 throws. After each block,
there was a brief rest period for the participants.
During the experiment sessions, Cunnington et al. recorded the participant's
physiological properties using the equipped sensors as well as they observed
and recorded the hits on or o� the dartboard.
They recorded for each throw the approximate position of the hit by recording
the hit dartboard �eld using two measures score and clock time. The dart-
board assigns to a hit in the bull's eye a score of 10. For each of the subsequent
9 rings, it assigns a score reduced by one. So it rewards the outermost ring
with a score of 1. For missing the dartboard they assigned a score of 0. For
clock time, we have 12 evenly spaced sectors going clockwise - in reference to
a clock - having sector 12 starting on the top going clockwise until sector 1
starts. For each sector following clockwise the sector number increases by one.
Hits in the bull's eye �eld do not have a clock time assigned. This clock time
measure relates to the direction of the hit relative to the bull's eye. Except
for the �rst 7 participants of Phase 2, they also recorded the clock time when
participants missed the dartboard.
Previous to the throw session, they made the participants take part in a ques-
tionnaire to asses their familiarity with dart throwing asking them to esti-
mate their personal frequency of dart games per year as well as the time that
18
3.1 Dart Experiment
has passed since the last game. Immediately before the throw session, they
Figure 3.1: The inertial measurement unit attached to the wrist - Cunnington
et al. attached the Shimmer 3 device by Realtime technologies ltd
to the wrist of the trowing arm to record throw kinematics and
muscle activity.
equipped participants with the inertial measurement unit (IMU) Shimmer 3 by
Shimmer at their right wrist for measuring characteristics of the throw move-
ment (cf. Figure 3.1). The wrist is close to the �ngers which hold the dart. For
this reason, we expect that the measured in�uences are similar to the actual
in�uences that a�ect the darts during the throw process. The IMU recorded
during the experiment time series with a sampling rate of 512Hz for accelerat-
ing forces according to three axes relative to this device using an accelerometer,
as well as rotation velocity in three axes relative to the device using a gyro-
scope and orientation using a magnetometer. As the measurements took place
inside a building, we considered the magnetometer measurements not reliable
from the beginning. The accelerometer settings were modi�ed from a today
unknown setting in Phase 1 to a setting of +/- 16g in Phase 2. In Addition,
this device was extended by two bipolar electrodes for measuring electrical sur-
face activity that we a�xed on the arm above the extensor digitorum muscle.
19
3 Data Basis
This muscle takes part in releasing the darts. So, Cunnington et al. expected
high electrical activity to be related to the moment of releasing the dart during
the throw. As paid volunteers and having given written informed consent, 23
participants took part in Phase 1 as well as 23 other participants in Phase 2.
There were 26 females and 20 males between 18 and 40 years with a mean age
of 23 years. They had normal or corrected vision and no neurological or psy-
chiatric disorders in the present nor in the past. All of them were right-handed
and most of them were beginners in the game of darts.
As the �rst step of postprocessing, Cunnington et al. identi�ed markers for
individual throws in the recorded session data for further processing. Unfor-
tunately, the EMG data was too noisy to identify single throws. In gyroscope
data, they observed peaks in rotational velocity which correlate with throws.
During a throw, the throwing arm moves faster than in the rest phases which
also includes faster rotations. So, they scanned through the gyroscope data
and identi�ed those peaks and recorded for every throw the corresponding
peak time as a marker that lays inside the timeframe of a throw. Further-
more, they discarded the datasets for participants with artifacts in EEG data
or failed synchronization between EEG and IMU.
After all, we were provided with complete datasets consisting of hit data,
kinematic data, and data regarding the electrical activity of the brain for 17
participants in Phase 1 and for 15 participants in Phase 2. Based on this data,
we will quantify performance as well as the physiology of throws next.
3.2 Towards Performance Measures
In the experiment, for every throw, the approximate hit position was recorded
using the measures score and clock time that describe the hit �eld. We take
an initial look at the collected data. Figure 3.2 is a heatmap that shows the
distribution of the hits on the �elds of the dartboard and its surrounding sectors
for the 100 throws in the �rst block of participant J14. The top and bottom of
the diagram represent the top and bottom of the dartboard. Respectively, the
left and right represent the left and right of the dartboard. We see that there
are hits scattered all over the dartboard. Yet, the focus of hits lays in the lower
half of the diagram mostly missing the bottom of the dartboard to the 4 lower
sectors. Most hits are far from the bull's eye thus showing a bad performance.
20
3.2 Towards Performance Measures
In Figure 3.3, we see for a di�erent participant that the distribution of hits is
less scattered and less displaced from the bull's eye. This participant seems to
perform better. The distribution for the next participant in Figure 3.4 is also
more concentrated than the distribution for the �rst participant. On the other
hand, the focus of the distribution is not centered around the bull's eye. It is
displaced downwards. So, this distribution shows a performance worse than
the second participant. Lastly, in Figure 3.5, we see a distribution that is also
scattered all over the dartboard but without a strong tendency for a general
displacement in a certain direction. Here, the performance also is worse than
the performance of the second participant. To sum it up, we reach a higher
performance when our hits are generally closer to the bull's eye and when our
hits are less scattered.
20 0 20
20
10
0
10
20
J14 B1
0246810121416
hits
in %
Figure 3.2: Distribution of hits with high scatter and high displacement -
We depicted the distribution of hits on the �elds of the dartboard
and the surrounding sectors as a heatmap. This heatmap shows
distribution of the 100 hits from the �rst block of participant J14.
21
3 Data Basis
20 0 20
20
10
0
10
20
J02 B1
0246810121416
hits
in %
Figure 3.3: Distribution of hits with low scatter and low displacement
20 0 20
20
10
0
10
20
J13 B3
0246810121416
hits
in %
Figure 3.4: Distribution of hits with low scatter and high displacement
22
3.3 Focus on accelerating Force as recorded physiological Measure
20 0 20
20
10
0
10
20
J01 B4
0246810121416
hits
in %
Figure 3.5: Distribution of hits with high scatter and low displacement
3.3 Focus on accelerating Force as recorded
physiological Measure
In the experiment, Cunnington et al. recorded the way of throwing darts re-
garding the executed movements and the activity of the brain. For the throw
execution, they recorded time series about the kinematic in�uences close to
the thrown dart measuring the accelerating force and rotational velocity. Re-
garding the activity of the brain, they recorded time series on the activity
of di�erent brain areas using EEG. In the conceptual chain from brain ac-
tivity, over muscle activity, over emerging movements, over �ight trajectories,
towards the ful�llment of the goal, we see more steps from brain activity to
performance than from emerging movements towards performance. To begin
analyzing the provided dataset, we decided to analyze the shorter connection
from emerging movements to the ful�llment of the movement goal. For this
reason, we ignore the EEG data in the following parts of the thesis. Instead,
we focus on the kinematic data that describes the movements which emerged.
For the kinematic data, we have the 3-dimensional time-series from the gyro-
scope, the accelerometer, and the magnetometer of the inertial measurement
unit. We already dropped the measurements of the magnetometer for being
23
3 Data Basis
not reliable due to being recorded indoors. For the gyroscope - which measures
the rotational velocity relative to the device - and the accelerometer - which
measures the accelerating forces also relative to the device -, we decided to fo-
cus further on the measurements of the accelerometer. Our reason is, that we
see the measurements for accelerating forces to be more relevant towards the
throw of darts because the rotational velocity depends on the technique used
for throwing. In the case of a rotational throw - when the throw is executed
mainly by a rotation in the elbow joint -, the rotational velocity correlates to
a high degree to the velocity of the sensor and respectively to the velocity of
the thrown dart. Yet, using a di�erent technique like an ejecting throw - when
the rotation takes place in the elbow joint and the shoulder joint -, weakens
this correlation. Since we do not know which techniques the participants used,
we decided to focus on the acceleration forces that a�ect the sensor instead.
The accelerating forces of the sensor are similar to the accelerating forces that
a�ect the dart while it is held. In Figure 3.6, we show the time series of two
1573 1574 1575 1576 1577 1578 1579t in s
15
10
5
0
5
10
15
20
Figure 3.6: Time series for the accelerating force of two subsequent throws -
We depicted the time series for the recorded strength of the accel-
erating force according of the three axes of the sensor in red, green,
and blue. In addition, we added the time series for the norm of the
accelerating force in black.
subsequent throws for the 3 axes of the sensor in red, green, and blue. First,
we see for both throws after the time 1573 and after the time 1578 distinct
amplitudes. Second, also between both throws, the amplitudes do not collec-
tively get close to 0 though it is a phase of slow motions and small accelerations
24
3.3 Focus on accelerating Force as recorded physiological Measure
compared to throws - so-called a rest phase. Between throws, participants
usually rest their arm and hand or take the next dart. A second accelerating
force that the accelerometer recorded compound with accelerating forces to
accelerate the dart is gravity. Gravity is a static accelerating force that is di-
rected downwards. For each time of the recorded series of accelerating forces,
we have three components according to the three measuring axes of the sensor.
Together, these components represent a force vector relative to the sensor. We
can compute the length or strength of this force vector using the Euclidean
norm (cf. Equation 3.1).
‖~x‖ =√x1
2 + x22 + x3
2
with ‖‖ as Euclidean norm
with ~x as vector
with x1, x2, and x3 as vector components
(3.1)
We aggregated using the norm the three channels for every measurement time
to create a time series for the norm. We depict this series as the black series in
Figure 3.6. The black series shows us for the rest phase between both throws
a certain baseline that is shifted to the value of 2.5. The almost constant
strength of the accelerating forces in the rest phase matches to the constant
strength of gravity with only small accelerations due to movements. Though
the strength of the accelerating forces is almost constant in the rest phase,
we also see that the amplitudes for the separate channels change between
both throws. For example, the value for the green series raises, drops, and
raises again almost in parallel to the blue series. The reason for this is our
sensor and its measuring axes that rotate during movements while the gravity
remains pointing downward. This also a�ects the accelerating forces that we
are interested in to quantify the physiology of the throw movement. So due to
this rotation, a constant accelerating force may initially appear in one channel,
may be spread over time to several channels, and may, in the end, remain in a
di�erent channel. The norm is independent of the rotations of the force vector
because rotations do not change the length of the vector. For this reason, we
decided to focus further on the norm of the accelerating forces to quantify the
physiological properties of throws.
Yet, another observation that we made is that the scales for the accelerating
forces for the records of Phase 1 and Phase 2 data do not match. In rest
phases, the accelerating forces due to movement should be small while the
25
3 Data Basis
main accelerating force is gravity with a constant strength. For this reason,
we expect the average value and the scatter of values in rest phases to be
similar. In the time series of the norm of accelerating forces for all �ve blocks
of throws for 9 participants from Phase 1 and 9 participants from Phase 2,
we selected manually each time three rest phases. For these 54 rest phases,
we determined the mean and standard deviation to characterize the position
and the scatter of the values. In Figure 3.7, we confront the mean along the
horizontal axis and the standard deviation along the vertical axis. Every point
represents the properties of a single rest phase. The blue points for the rest
phases of Phase 1 form a small cluster in the lower right with small values
for mean and standard deviation. The orange points for the rest phases of
Phase 2 are clearly separated from this cluster having higher mean values.
We consider the rest phases to be corresponding reference values of the scales
that were used to measure the accelerating forces. In the �gure, we see that
these references values do not correspond well between Phase 1 and Phase 2
which implies the application of di�erent scales. Thus, we cannot compare
the amplitudes between both phases because any value has di�erent meanings
according to the di�erent scales, e.g., a participant with the height value 190
is big according to the centimeter scale but small according to the millimeter
scale. As for this reason the amplitudes are not comparable and as the scales
in Phase 2 vary more, we decided to drop the Phase 2 data and focus on the
data for the 17 participants from Phase 1.
In consequence of our decisions, we focus on the norm of the accelerating forces
for the participants in Phase 1 in order to quantify the physiology of throws.
26
3.3 Focus on accelerating Force as recorded physiological Measure
Figure 3.7: Properties of rest phases - We manually selected 3 rest phases in
the time series for the norm of accelerating forces for each of 9
participants from Phase 1 and 9 participants from Phase 2. For
the selected rest phases, we determined the mean and the variance.
We confront these properties with the mean along the horizontal
axis and the standard deviation (std) along the vertical axis.
27
4 Proposed Method
In this chapter, we want to describe our developed method.
Based on our research questions and our expectations, we want to explore
the relation between performance in dart and the way of throwing the dart
which we call the physiology of throws. The main goal of our method is
to provide information to explore the relation between performance and throw
physiology. To do so, we quantify performance and throw physiology, as well as
the development over time in those measures. Finally, we analyze in Sections
5.2 and 5.3 the developed visualizations that oppose both sides of throwing
dart, which we used to characterize the relation.
4.1 Performance Measures
In the provided experiment, the outcome of every throw was recorded using
score and clock time that represent the approximate position for the hit on the
dartboard. Now, we want to use these measures to quantify the throw perfor-
mance of participants. In Section 2.3, we decided to use multiple performance
measures to quantify the performance of throwing dart which is di�erent to
the corresponding related work. So, we observed in Section 3.2 that we con-
sider a distribution of hits better regarding the goal of hitting the bull's eye
when the distribution is less scattered and generally shows a smaller deviation
from the bull's eye. In order to capture both - the scatter of hits and a general
displacement -, we de�ne two performance measures. We de�ne both measures
as error measures, i.e, the smaller the error the better the performance. So, an
error of 0 represents the maximum possible performance according to the re-
spective measure. First, we name the concept of performance from the general
proximity of hits towards the bull's eye accuracy. The participant in Figure
3.2 showed a smaller accuracy than the participant in Figure 3.5 because the
hits of the �rst one generally deviated from the bull's eye downwards while
29
4 Proposed Method
the hits of the second one are generally centered at the bull's eye. Based on
accuracy, we de�ne the accuracy-error as the distance between the center of
the bull's eye and the center of the hits (cf. Equation 4.1). As we will see in
an instant, we can remove the position of the center of the bull's eye from the
equation.
accuracy-error =‖~xcenter of bull's eye − ~xcenter of hits‖
=‖~0− 1
|H|∑~x∈H
~x‖
=1
|H|‖∑~x∈H
~x‖
with ~x as position vector for a hit
with H as sequence of hits
(4.1)
For the second performance measure, we name the concept of performance
from smaller scatter of hits precision. The participant in Figure 3.3 has
a higher precision than the participant in Figure 3.5 because the �rst one
shows a distribution of hits that is more concentrated than the distribution of
the second one. Based on precision, we de�ne precision-error as the mean
distance between two hits (cf. Equation 4.2).
precision-error =1
|P |∑
{~x1,~x2}∈P
‖~x1 − ~x2‖
with ~x1 and ~x2 as position vectors for hits
with P as sequence of all pairs of hits
(4.2)
For the de�nition of accuracy-error and precision-error, we assumed that we
already have the exact positions of the hits. Yet, in our experiment, we col-
lected just an approximate position consisting of score and clock time that
describe the hit �eld. Thus, we estimate exact hit positions from the dimen-
sions of the dartboard and its �elds relative to the center of the bull's eye with
the following strategy. For hits in the bull's eye �eld, we mapped the position
to the center of the bull's eye with coordinates ( 00 ). For hits in ring segment-
shaped �elds, we mapped the position to the intersection of the inner arc and
the angle bisector of the respective ring segment. For misses, we mapped the
position to the intersection of the outer rim of the dartboard and the angle bi-
sector of the respective sector. Since all estimated positions are relative to the
center of the bull's eye at ( 00 ) = ~0, we can remove the center of the bull's eye
30
4.2 Describing the Throw Pattern by Throw Features
from Equation 4.1 for the computation of accuracy-error. In addition, as we
use the original dimensions in cm, the values for both performance measures
also have centimeters as units.
In Table 4.1, we show the values for accuracy-error and precision-error for the
examples from Section 3.2. As before, the second distribution is better than
the �rst distribution according to both measures representing less scatter and
general less deviation from the bull's eye. The third distribution is better
than the �rst according to precision-error being less scattered but worse than
the second distribution according to accuracy-error having generally a bigger
displacement from the bull's eye. The fourth distribution of hits is worse than
the second and the third according to precision-error being more scattered.
Yet, it is better than the �rst distribution according to accuracy-error. The
fourth distribution shows the smallest accuracy-error being better than both
distributions with low precision-error.
Table 4.1: Performace measure examples - We computed for the examples
from Section 3.2 the values for accuracy-error and precision-error
rounded to one position after decimal point.
�gure accuracy-error precision-error
1st 3.2 8.5 14.4
2nd 3.3 1.2 5.1
3rd 3.4 2.3 7.2
4th 3.5 0.6 11.2
In order to quantify the performance for a block of throws, we estimate the
exact positions of hits which we use to compute our performance measures
accuracy-error and precision-error.
4.2 Describing the Throw Pattern by Throw
Features
In Section 3.3, we decided to focus on the norm of accelerating forces as a
measure to characterize the physiology of throws. In this section, we describe
how we extract speci�ed features from single throws using a model regarding
31
4 Proposed Method
the physiological processes and how we aggregate those features over many
throws to quantify the physiology of throwing.
To prepare the data, we determine the norm of accelerating forces and extract
the time series for every throw. For every throw, we use the peak time as a
reference point in time and extract an interval starting 1500 milliseconds before
the peak time and ending 500 milliseconds after it. In Figure 4.1, we see the
extracted series for all throws of participant J01. We recognize a characteristic
pattern consisting of to subsequent peaks for the processes during throwing.
As we can see a clear pattern for the curves of the time series, we follow the
Figure 4.1: Extracted time series of the accelerating force for all throws of
participant J01
idea of [Preim et al.2009] to simplify the time series to few form features that
describe a single throw. To do so, we develop a model which we match to the
curve of a throw to extract characteristic throw features.
These features base on a conceptual model of the throw process. In Section 2.1,
we already described the process of throwing a dart consisting of a preparation
phase to move the dart to an initial throw position, an acceleration phase to
accelerate the dart to a certain velocity, and a deceleration phase to stop
the throwing arm. The movements in the preparation phase are slow and
for this accompanied by weak acceleration. In the acceleration phase, the
velocity of the arm increases a lot in a short time. So here, the acceleration is
strong. Lastly, in the deceleration phase, the velocity of the arm decreases in
a short time a lot. So in this phase, there is a strong deceleration. Hence, the
acceleration phase and the deceleration phase with their strong acceleration
correspond to the �rst and the second peak that we see in Figure 4.1.
The model that we �t to the curve is piecewise linear, i.e., it consists of linear
and constant functions that apply in a certain interval. We can see in Figure
4.2, that we approximate the baseline in the rest phases before and after throws
using a horizontal line. Using a raising and a falling line segment, we model
32
4.2 Describing the Throw Pattern by Throw Features
each peak for acceleration and deceleration phase. Using this model, we
t1 t2 t3 t4 t5
1
2
3
4
5
Figure 4.2: Time related form features in the piecewise linear model
v1
v2
v3
v4
1
2
3
4
5
Figure 4.3: Amplitude related form features in the piecewise linear model
gather 5 points in time at the connection of the line segments to characterize
the model which we show in Figure 4.2. We describe with t1 the begin of the
acceleration phase, with t2 the time of the peak acceleration, with t3 the time
of the transition towards the deceleration phase, with t4 the time of the peak
deceleration, and lastly with t5 the end of the deceleration phase. On the other
hand, we have 4 amplitude related features at the connection of line segments.
With v1, we describe the amplitude of the baseline, with v2 the amplitude
of the peak acceleration, with v3 the amplitude at the transition towards
the deceleration phase, and with v4 the amplitude of the peak deceleration.
There is no need for a v5 since at the end of the deceleration phase we reach
the baseline whose amplitude is already described by v1. To limit the valid
33
4 Proposed Method
con�gurations of those form features, we de�ned a set of constraints including
that the time-related form features keep their order (cf. Equation 4.3) and
that the amplitude-related form features maintain both peaks (cf. Equation
4.4).
t1 < t2 < t3 < t4 < t5 (4.3)
v1 <v2
v2 >v3
v3 <v4
v4 >v1
(4.4)
In order to match our model as good as possible to the time series of a
throw, we construct �rst a good initial solution for the model and optimize
all form features afterward using the Levenberg�Marquardt algorithm (see
[Ranganathan2004]) that depends on a good initial solution as it �nds local
optima. To quantify the distance between data and model, we de�ne our resid-
ual as the sum of squared di�erences between data and model over the whole
time series for the throw. In Figure 4.4, we show the time series for the norm
of the accelerating forces in black. To �nd a good initial solution, we start with
a static prede�ned model that we scale to match the maximum amplitude in
the data. Afterward, we search with a brute force approach the best temporal
o�set. These three intermediate models are depicted in the �gure with grey
dotted lines. Finally, we apply the Levenberg-Marquardt algorithm to adjust
all form features to minimize the residual value. The �nal model, we depict as
red polyline in the �gure. We see that it could capture the baseline as well as
the peaks though there is some noise that our model cannot represent.
At this point, we have a set of form features that describe the shape of the time
series for a single throw. Yet, the time-related features are still relative to the
peak time which may vary relative to the process in the throw. The amplitude-
related features still contain the in�uence of the gravity which is not part of
the accelerations due to the throw movement. To avoid those in�uences, we
de�ne throw features which base on the di�erences between form features.
From all combinations of the 5 time-related features, we de�ne 10 di�erences
as time-related throw features. With the notation d_t1_t2, we refer to the
34
4.2 Describing the Throw Pattern by Throw Features
Figure 4.4: Fit for the 42nd throw of participant J01 - We depicted the time-
series as the black line, the representation of the �nal �tted model
as the red line, and the intermediate results for �nding a good
initial model for �tting as grey dotted lines.
35
4 Proposed Method
di�erence of t1 to t2 - in other words, d_t1_t2 = t2−t1. In Figure 4.5, we see arepresentation of d_t1_t2. So, it represents the duration from the begin of the
acceleration phase to the time of the peak acceleration. From all combinations
of the 4 amplitude-related form features, we de�ne 6 di�erences as amplitude-
related throw features. Respectively, with the notation d_v1_v4, we refer to
the di�erence from v1 to v4, i.e., d_v1_v4 = v4 − v1. This throw feature
represents the amplitude of the peak deceleration without the height of the
baseline as we can see in Figure 4.6. Lastly, we de�ne as 4 additional throw
features the 4 slopes for the acceleration and deceleration phase. With the
notation s12, we refer to the slope between the �rst and the second connection
(cf. Figure 4.7), i.e., s12 = d_v1_v2/d_t1_t2.
t1 t2
1
2
d_t1_t2
Figure 4.5: d_t1_t2 as example for a throw feature from a time di�erence
v1
v4
1
4d_v1_v4
Figure 4.6: d_v1_v4 as example for a throw feature from a amplitude di�er-
ence
36
4.3 Quantifying Development
1
2
s12
Figure 4.7: s12 as example for a throw feature from a slope
Having those 20 throw features, we can characterize throws individually. Yet,
we have no measure for the variance of the movement in general. To character-
ize general properties of the throwing of a participant, we determine properties
of the distribution for each throw feature over a block of throws. We character-
ize the scatter or variance of each feature using the IQR (interquartile range).
For the average value of each feature, we determine the median. We chose
both for the reason that they are more robust against outliers that may occur
in the �tting.
In order to quantify the physiological properties of throwing, we abstract the
time series data of accelerating forces by �tting a model from which we extract
throw parameters that characterize each throw individually. Afterward, we
determine for each throw feature the distribution properties median and IQR
during a block of throws to describe the general value and the scatter.
4.3 Quantifying Development
We have now performance measures that describe in the time frame of a
certain block the status of performance. In addition, we have distribution
properties of throw features that describe also in the time frame of a certain
block the way of throwing dart. As we are also interested in the changes
over time, we quantify the development as follows. We compute the di�er-
ences for performance measures as well as the distribution properties. We
set the �rst block as reference value computing the di�erences in all these
37
4 Proposed Method
measures from the value regarding the �rst block to the value of later blocks,
e.g., developmentaccuracy_error = block2accuracy_error − block1accuracy_error. As
we have 5 blocks in the experiment we will have for one measure 4 di�erences
- to the second, third, fourth, and �fth block.
38
5 Results
In the following sections, we describe in detail the analyses that we make
to answer the research questions as well as the corresponding observations.
First, in Section 5.1, we have a closer look at the performance measures that
occurred during our experiment and which are the basis for the confrontation
in the following analyses. In Section 5.2, we look for di�erences in our throw
features based on confronting good and bad groups based on the performance
measures. In Section 5.3, we look for di�erences in the development of our
throw features.
In order to explore the relation between performance measures and phys-
iological measures of throws, we follow the idea as seen in the work of
[Günnemann et al.2010]. They select groups of similar elements according to
one space and visually explore their distribution in a di�erent space with the
help of scatter plots. Based on these visualizations, they identify manually
patterns of interest. We select a good and a bad group according to the per-
formance measures and explore their relation according to the distribution
properties of throw features.
Regarding our �rst expectation - that participants with a better performance
have a higher variance of throwing -, we conduct three analyses. We select both
groups �rst based on accuracy-error (see Section 5.2.1) and second based on
precision-error (see Section 5.2.2). Third, we select them according to the non-
dominated fronts on both measures in the manner of an interfront analysis (see
Section 5.2.3). We do not conduct an intrafront analyses because two groups
based on the opposite ends of front have no notion of good and bad, or better
and worse. Regarding our second expectation - that participants that improve
more also show a stronger increase in the variance of throwing -, we conduct
another three analyses. First and second, we confront groups based on the
development in either accuracy-error (see Section 5.3.1) or precision-error (see
Section 5.3.2). As third and last analyses, we select the groups based on the
39
5 Results
non-dominated fronts on the development in both performance measures as
interfront analysis (see Section 5.3.3).
5.1 Performance measures from Experiment
Data
In the following two subsections, we want to get an impression on the oc-
curring values for performance measures that we receive from applying our
performance measures to the experimental data. In addition, we want to get
an idea of how the characteristics of these values will in�uence the following
analyses for answering our research questions. In the �rst subsection, we will
have a look at the actual values for accuracy-error and precision-error which
we introduced in Section 4.1. In the second subsection, we will examine the
development in those measures.
5.1.1 Regarding Block Performance
We aggregate the performance data from our experiment for single hits - score
and clock time - for each of the �ve blocks for each participant computing the
values for precision-error and accuracy-error. These represent respectively the
mean distance between two hits on the dartboard in cm and the distance of
the center of the hits to the dartboard center in cm. Since both measures are
error measures and it is better to make smaller errors, we consider smaller
values better than higher values. Or in other words, we consider low values
- relating to the range of occurring values - as good and the ones with high
values as bad. The resulting values for the performance measures are shown
in the sub�gures of Figure 5.1.
Sub�gure (0) shows us a histogram regarding accuracy-error where we get an
impression on the distribution of occurring values. We see the range starting
from the left close to zero ending on the right close to 10. We see this distri-
bution to be skewed towards the left or the better values with the majority of
data points having small values for accuracy-error. Regarding precision-error,
we see the corresponding histogram in Sub�gure (1). The occurring values
for precision-error range from 5 up to 20. In a di�erent way from accuracy-
error, this distribution is slightly skewed towards the right, i.e., towards higher
40
5.1 Performance measures from Experiment Data
5 10accuracy_error
0
10
20
(0)
5 10 15precision_error
0
10
(1)
0 5 10accuracy_error
0
10
prec
ision
_erro
r
(2)
Figure 5.1: Occurring performance measure values - We determined our per-
formance measures for each block of each participant using the
data from our experiment. Sub�gures (0) and (1) are histograms
on this data showing the frequency of occurring values. Sub�gure
(2) is a scatterplot confronting both measures and also showing the
non-dominated fronts as polylines.
values for precision error. Comparing Sub�gures (0) and (1), we observe as
an additional di�erence that for accuracy-error some participants in certain
blocks were able to reach error values close to 0. Yet, for precision-error, we
see nobody with a precision-error less than 5 in any block.
From these two sub�gures, we get an idea for the distribution of one perfor-
mance measure independent from the other one. The opposing skewness of
both distributions may be a hint towards a negative correlation, i.e., that low
values for accuracy-error may occur together with high values for precision-
error.
In order to get an impression on the relation between both measures in our
experiment, we look now at the remaining sub�gure - Figure 5.1, Sub�gure (2).
This diagram is a scatterplot confronting accuracy-error along the horizontal
axes with precision-error along the vertical axis. Each point still represents
a certain block of a certain participant. In addition, we see the fronts from
non-dominated sorting as polylines connecting the elements that belong to it.
We explained how we determine these fronts in Section 2.4.
Looking at the data points, we see that they cover for accuracy-error a range
starting from left close to zero going to the right up to a value of 10 like before
in Sub�gure (0). Also in precision-error, they cover the range starting at the
bottom with about 5 going up to a value close to 20 just like in Sub�gure (1).
However, we see no negative correlation which would be visible as a perceiv-
41
5 Results
able line or strip of points going from the top left to the bottom right of the
diagram. What we see resembles more a positive correlation. It is a triangle-
like distribution of points starting at the bottom left with performances good
in accuracy-error and precision-error going up to performances still with good
accuracy-error but now with bad precision-error values and ending in the top
right with performances that are bad according to both measures. Regarding
the non-dominated fronts, the best are located on the left and the worst in
the top right. The best fronts with low accuracy-error mainly go vertical from
performances with high precision-error to performances with low precision-
error whereas the worst fronts follow a direction that is diagonal to horizontal.
Underneath the diagonal of this observed triangle are no data points. The
area that would complete the upper triangle of data points to a rectangle, we
refer to it in the following as empty triangle. This empty triangle means
that in our experiment and according to our method no participant showed a
performance in any block that would lead to a combination of low values for
precision-error and high values for accuracy-error.
Observing this empty triangle, we wonder for the reason of this shape. One
possible reason is the errors that we introduce in our method that may distort
the distribution of occurring values. We measure the position of the hit of a
dart on the dartboard using the dartboard �eld that the dart hit. Later, we
estimate for each hit a position in these �elds independent of the actual posi-
tion in the respective �eld in order to apply our performance measures. Here,
we introduce an error between the actual position of hit and the estimated
position. Since the dartboard �elds increase in area from the inside to the out-
side, participants may be a�ected to a di�erent extent for example according
to their accuracy-error. So participants with higher accuracy-error - a bigger
o�set of their hits from the center - are a�ected by bigger errors as they tend
to hit the bigger �elds more often than participants with low accuracy-error.
This distortion may move occurring performances with low accuracy and high
precision above the diagonal of the triangle where we cannot distinguish them
anymore from the other performances. Since we determine the fronts based
on these values they may not capture the initially intended combination of
elements that are exceptionally good in just one performance measure and
elements that are good in both measures. This would in�uence the further
analyses based on these fronts.
In order to test whether our method is the reason for the empty triangle or
in other words whether it is possible to achieve performances in it, we visual-
42
5.1 Performance measures from Experiment Data
ized the distortions according to both measures. Following the idea of what
would be di�erent if we had the exact o�set in horizontal and vertical direc-
tion instead of clock time and score, we made a simulation to generate arti�cial
performance data. For this, we generated arti�cial performance data to con-
front performance measures that contain the quanti�cation to the �elds of the
dartboard and performance measures that do not. We chose simple circular
two-dimensional normal distributions on the plane of the wall as representa-
tives for the distribution of hits relative to the dartboard center from which
we drew samples as hits with x and y relative to the dartboard center. For
these distributions, we have 2 parameters - one for the location or center of the
distribution and one for the extent of horizontal as well as vertical scatter - the
standard deviation. Since the dartboard is rotational invariant, we consider
for the location parameter just a positive o�set to the right of the dartboard
center. We de�ned 500 di�erent distributions based on the combinations of 50
di�erent values for the horizontal o�set to the center and 10 di�erent values
for the standard deviation. From each of these distributions, we drew a sample
block, i.e., 100 samples as simulated hit positions. For each of these arti�cial
blocks, we computed the performance measures using the actual hit position
from sampling for a �rst version and using the quanti�ed positions from map-
ping to score and clock time for a second version. If the distortions that lead
to the empty triangle are a general property of our performance measures we
should observe these distortions and the empty triangle when we look at the
�rst version of the performance measures. If these distortions are a result of
our process - of using the dartboard and estimating the position of hits - we
should observe the empty triangle just for the performance measures from the
arti�cial data that was mapped to the dartboard �elds.
Similar to Figure 5.1, Sub�gure (2), we confront in Figure 5.2 accuracy-error
along the horizontal axis and precision-error along the vertical axis. Here, we
marked the triangle-like area that contains performances that occurred during
our experiment with a continuous line. The empty triangle is marked with
a dashed line. For each of the 500 distributions that we sampled there is
an empty circle that represents the performance measures that are directly
based on the sampled hit positions. For the empty circles with precision-
error close to 0, we see the regular distances that we chose for the values of
the o�set parameter. The more we look up the harder it gets to recognize
these lines. In the dashed triangle representing the empty triangle, we see
lines of empty circles. So just according to our performance measures, it is
43
5 Results
0 10 20 30 40accuracy_error
0
10
20
30
40
prec
ision
_erro
r
based on sampled positionbased on dart board fieldsempty triangle in experimentarea of observed values in experiment
Figure 5.2: Visualizing distortions in the relation between our performance
measures - We generated arti�cial blocks of hit positions that we
used to compute the performance measures directly (empty circles)
as well as by mapping them to the �elds of the dartboard �rst
(�lled circles). We marked as a reference the area that contains
the occurring values in our experiment as well the empty triangle
underneath.
44
5.1 Performance measures from Experiment Data
possible to reach this area. For each of the sampled distributions, there is
a �lled circle in the �gure which represents the performance measures based
on score and clock time. For these, we see multiple distortions. First, we
see that the distribution is limited to the top and the right by a bow-like
structure. Most likely this originates from the limited size of the dartboard
and the mapping of missing hits to the margin of the dartboard. For example, if
participants always miss the dartboard to the right then it makes no di�erence
how far they miss as all these hits are later mapped to the right margin of
the dartboard. According to this fact, we also see the maximum in accuracy-
error for the �lled circle at about 21 cm which is the radius of the dartboard.
The second distortion is a positive correlation that we see for precision-error
below 5. That �ts the notion that the error that we introduce gets bigger
to the outer rings. Nevertheless also in the area of the empty triangle, we
see �lled circles. Comparing corresponding �lled and empty circles inside the
triangle for the occurring values from our experiment, we see the �lled circles
shifted downwards. This means that using the dartboard for measurement, we
underestimate precision-error.
In summary, we observed between our two performance measures from our
experiment a triangle-shaped relation that is - according to our simulation
- neither caused by their de�nition nor caused by introducing quanti�cation
errors by using the dartboard.
5.1.2 Regarding Performance Development
In order to quantify development with respect to the performance measures,
we determine the di�erences in the performance measures from �rst blocks to
corresponding later blocks. A negative di�erence for example in accuracy-error
tells us that the accuracy-error in the later block was smaller than in the �rst
block. This means that the corresponding participant improved in accuracy-
error respective to his accuracy-error in the �rst block. Correspondingly, a
positive di�erence refers to an increase which means that the participant de-
graded. Apart from improving and degrading, we can also observe that par-
ticipants show no or hardly any development with di�erence values close to 0.
Applied to the performance measures from our experiment, we determine for
each participant the di�erences from the �rst block to the later 4 blocks.
45
5 Results
2.5 0.0 2.5accuracy_error
0
10
20(0)
5 0precision_error
0
10
(1)
2.5 0.0 2.5accuracy_error
5
0
prec
ision
_erro
r
(2)
Figure 5.3: Occurring values for development in performance measures - We
determined from the occurring performance measure values the
di�erences from the �rst to the later blocks for each participant.
Sub�gures (0) and (1) are histograms on this data showing the
frequency of occurring values. Sub�gure (2) is a scatterplot con-
fronting the di�erences in both measures and also showing the non-
dominated fronts as polylines.
We visualize these di�erences in the sub�gures of Figure 5.3. Sub�gure (0)
is a histogram that shows us the distribution of occurring di�erence values
regarding accuracy-error. The distribution ranges from the smallest values -
improvements of around -5 - on the left to the biggest values - degradations
up to 2.5 - to the right. We observe the highest peak of the distribution to
cover negative values but also to show very small development close to 0. Also,
looking at the area left and right of 0, we see that the majority of occurring
di�erence values are negative. For accuracy-error, we see that participants
improved more often and also to a bigger extent than they degraded. So, we
see a tendency of getting better in accuracy. Switching to precision-error, Sub-
�gure (1) shows us accordingly a histogram for the distribution of occurring
di�erence values precision-error. Similarly, the values range from -5 up to 2.5,
the peak covers negative values close to zero, and occurrences have predomi-
nantly negative values. Yet, this time, the peak is not as pronounced as before.
Also for precision-error, we see that participants improved more often and to a
bigger extent which we interpret as a tendency of improving in precision-error
over time. Even though both distributions are similar, yet, we do not know
the relation between the development in both measures. For this, we look at
the remaining Sub�gure (2). This diagram is a scatterplot that confronts -
like before in Figure 5.1, Sub�gure (2) - the values for accuracy-error along
the horizontal axis and the values for precision-error along the vertical axis.
46
5.1 Performance measures from Experiment Data
Whereas this time, the values are the di�erences for these measures. A point
in this diagram represents the di�erences in accuracy-error and precision-error
from the �rst block to a certain later block for a certain participant. Based
on these points, we also determined the non-dominated fronts which we will
use in later analyses. Here, we show these fronts again as polylines connecting
the corresponding points. To emphasize increase and decrease, or improve-
ment and degradation, we added orthogonal to both axes lines that mark the
border between both di�erent developments at the value 0. This gives us 4
quadrants. The lower left quadrant shows us an area with improvement in
both measures, the upper left improvement in accuracy-error and degradation
in precision-error, the upper right degradation in both measures, and the lower
right quadrant an area with degradation in accuracy-error and improvement in
precision-error. We see that the points are distributed across all quadrants. So
all combinations of developments occurred which is di�erent from the empty
triangle in the previous section. We also see points with no development in one
of the measures or no development in both. Generally, the density of points
near the intersection at (0, 0) is higher than in the distant parts. A tendency
of improving regarding both measures, we see from the fact that the lower left
quadrant for improvement regarding both measures contains the most points
in comparison to the other quadrants. Regarding the non-dominated fronts,
we see the best fronts starting from the lower left to the worst fronts in the
top right. The best fronts span from the upper left quadrant over the lower
left quadrant to the lower right quadrant. Doing so, they combine points with
strong improvement in one and degradation in the other measure with points
that show improvement in both measures. The worst front span inside the
upper right quadrant combining points with degradations according to both
performance measures.
So far, we examined the occurring values for development in performance mea-
sures that we will confront in Section 5.3.3 with development in distribution
properties of our throw features in order to identify characteristics of partici-
pants that improve more than others. We observed the tendency to improve
according to both measures. Apart from improvement, we also observed cases
of no development and degradation. This gives us the possibility to confront
cases of development later on. Di�erent from the fronts on the actual values
for the performance measures in the previous section, this time, especially the
best fronts span a full range and are not just in�uenced by just one of the
measures. Here they combine both accuracy-error and precision.
47
5 Results
5.2 Relation between Performance and Throw
Pattern
In the following, we want to analyze the performance data in their relation to
the throw features. We expect that better participants show a higher variety in
their throw physiology. So far, we determined for each of the 5 blocks - so for
every 100 throws - of a participant the performance measures accuracy-error
and precision-error, as well as the distribution properties median and IQR for
each throw feature. Applying IQR to a throw features gives us an physiological
variance measure. At this point, we have for each participant 5 data points
which are the basis for the following Figures 5.4, 5.5, and 5.6. Each of them
shows in Sub�gure (0) a visualization of one or both performance measures
as well as in Sub�gures (1) to (20) the distribution measures for each feature.
Looking at Figure 5.5, Sub�gure (6) just as an example, we see a scatterplot
that opposes for the data points of all participants for the feature d_t2_t4
the median along the horizontal axis as well as the IQR along the vertical axis.
In addition, we see data points colored in transparent grey, blue or red. Yet
in order to get to know this kind of diagram, we will ignore the meaning of
the color for this moment. Considering the vertical axis on IQR, we observe
that the elements of the blue group of data points tend to reach higher values
for IQR than the elements of the red group. For the feature of this sub�gure -
d_t2_t4 - which is the latency between the time of the maximum acceleration
and the time of maximum deceleration, this means that the variations of this
latency for the blue group tend to be bigger than the respective variations
for the red group. On the other hand considering the horizontal axis on the
median, we see that the red group is more concentrated as it covers a smaller
interval between values from 100 to 150. However, the blue group seems to
be more scattered as it covers a greater interval. Regarding the latency
between peak acceleration and peak deceleration this means that the elements
of the red group tend to be more similar in their average latency. Whereas the
elements of the blue group reach higher as well as lower average latency being
less similar. Also, when we consider both axes, we see that the red points are
more concentrated than the blue points showing that the elements of the red
group are also more similar regarding IQR.
In order to �nd answers for our research question of which di�erences exist
between participants with a high performance and participants with a low
48
5.2 Relation between Performance and Throw Pattern
performance, we de�ne a group of good elements with a high performance
and a group of bad elements with a low performance based on di�erent crite-
ria regarding our performance measures - accuracy-error and precision-error.
Having these groups, we oppose both of them visually to look for di�erences in
their distribution properties for throw features. In the following Figures 5.4,
5.5, and 5.6, we show the good group in red color and the bad group in blue
color. Elements that we do not assign to a group, we show in a transparent
grey for context. Applied to our initial example from Figure 5.5, Sub�gure (6),
we see that it is the group of good elements in red that shows less variance in
the given feature and whose elements are more similar to each other than the
elements in the bad group in blue. To shorten the following descriptions, we
abbreviate the good red group as R as well as the bad blue group as B.
According to our notion of variability in motor control, we expect participants
with a higher performance to show higher variance in their throw physiology.
We quantify this variance using the IQR distribution property of throw features
over a whole block of throws. In consequence, we expect for the scatterplots
for the distribution measures of the throw features to see that R shows higher
IQR values than B. So what we see in our example in Sub�gure (6) does not
match to what we expect. It rather opposes what we expect as R shows less
IQR instead of more IQR.
In the next three sections, we de�ne R and B, �rst and second solely based
on one of our performance measures and third based on both of them using
fronts from non-dominated sorting. Each time, we use these groups to identify
di�erences according to the throw features and check whether they �t our
expectations. Afterward, we summarize and compare the sets of observations
that we make.
5.2.1 Groups based on Accuracy-error
In this �rst analysis, we select R and B solely based on the accuracy-error
performance measure. As this measure is an error measure it is better if the
error's value is smaller. Because of this, we selected for R the 10 elements
with the smallest values for accuracy-error. On the other hand, we selected for
B the 10 elements with the highest values for accuracy-error. In Figure 5.4,
Sub�gure (0), we see a histogram on the occurring values of accuracy-error.
49
5 Results
With respect to our selection R forms the left border with the smallest values
while B forms the right border with the highest values.
Based on this de�nition of R and B from accuracy error, we will now describe
our observations of di�erences between both groups regarding the distribution
measures for each throw feature in Figure 5.4. Yet, we will not describe features
or in other words, Sub�gures in which we see no clear di�erence.
Regarding the features d_t2_t3 and d_t2_t4 in Sub�gures (5) and (6), we
observe that R except for two outliers is more concentrated according to IQR
also tending to have smaller IQR than B. This opposes our expectations as we
expect higher IQR for the elements of R.
Looking at Sub�gure (13), we see R having the tendency to have higher IQR
than B which �ts what we expect.
Lastly, in Sub�gure (19), we observe - ignoring two outliers - B tending to have
lower IQR for a given median. So also here R shows what we expect, i.e., a
higher IQR than B.
Using solely the accuracy-error to de�ne R and B, we observed di�erences in
4 of 20 throw features according to their distribution properties. Though the
di�erences for 2 features match our expectations there are also 2 features that
show di�erences that oppose them.
5.2.2 Groups based on Precision-error
In this analysis, we de�ne the selections for good as well as bad based solely on
the precision-error performance measure. As this measure is an error measure
like the accuracy-error in the previous analysis, we consider smaller values
preferable or better than bigger values. Respectively, we selected for the group
of good elements R the 10 elements with the lowest values in precision-error
and for the group of bad elements B the 10 elements with the highest values.
Looking at the histogram in Figure 5.5, Sub�gure (0), we see R at the left
border of the distribution with the lowest occurring values and we see B at the
right border with the highest occurring values.
Regarding the Sub�gure (1) for the �rst feature d_t1_t2, we see R and B
separated by a conceivable diagonal. So for a given median R shows smaller
50
5.2 Relation between Performance and Throw Pattern
5 10accuracy_error
0
10
20
(0)
100 200d_t1_t2_median
50
100
150
d_t1
_t2_
iqr
(1)
100 200 300d_t1_t3_median
50
100
150
d_t1
_t3_
iqr
(2)
200 400d_t1_t4_median
50
100
150
d_t1
_t4_
iqr
(3)
300 400d_t1_t5_median
50
100
150
d_t1
_t5_
iqr
(4)
50 100d_t2_t3_median
25
50
75d_
t2_t
3_iq
r(5)
100 150d_t2_t4_median
25
50
75
d_t2
_t4_
iqr
(6)
150 200d_t2_t5_median
20
40
60
80
d_t2
_t5_
iqr
(7)
25 50d_t3_t4_median
20
40
60
d_t3
_t4_
iqr
(8)
100 150d_t3_t5_median
20
40
60
d_t3
_t5_
iqr
(9)
50 100d_t4_t5_median
25
50
75d_
t4_t
5_iq
r(10)
5 10d_v1_v2_median
2
4
d_v1
_v2_
iqr
(11)
0 10d_v1_v3_median
1
2
3
4
d_v1
_v3_
iqr
(12)
10 20 30d_v1_v4_median
2
4
6
d_v1
_v4_
iqr
(13)
10 0d_v2_v3_median
2
4
d_v2
_v3_
iqr
(14)
10 20d_v2_v4_median
2
4
6d_
v2_v
4_iq
r(15)
10 20 30d_v3_v4_median
2
4
6
8
d_v3
_v4_
iqr
(16)
0.1 0.2s12_median
0.000
0.025
0.050
0.075
s12_
iqr
(17)
0.2 0.0s23_median
0.0
0.1
0.2
0.3
s23_
iqr
(18)
0.250.500.751.00s34_median
0.1
0.2
0.3
s34_
iqr
(19)
0.750.500.25s45_median
0.0
0.2
0.4
0.6
s45_
iqr
(20)
Figure 5.4: Confrontation based on accuracy-error - Sub�gure (0) represents
the selected good and bad group based on accuracy-error. Sub�g-
ures (1) to (20) show the distribution properties of these groups
for each throw feature.51
5 Results
5 10 15precision_error
0
5
10
15(0)
100 200d_t1_t2_median
50
100
150
d_t1
_t2_
iqr
(1)
100 200 300d_t1_t3_median
50
100
150
d_t1
_t3_
iqr
(2)
200 400d_t1_t4_median
50
100
150
d_t1
_t4_
iqr
(3)
300 400d_t1_t5_median
50
100
150
d_t1
_t5_
iqr
(4)
50 100d_t2_t3_median
25
50
75d_
t2_t
3_iq
r(5)
100 150d_t2_t4_median
25
50
75
d_t2
_t4_
iqr
(6)
150 200d_t2_t5_median
20
40
60
80
d_t2
_t5_
iqr
(7)
25 50d_t3_t4_median
20
40
60
d_t3
_t4_
iqr
(8)
100 150d_t3_t5_median
20
40
60
d_t3
_t5_
iqr
(9)
50 100d_t4_t5_median
25
50
75d_
t4_t
5_iq
r(10)
5 10d_v1_v2_median
2
4
d_v1
_v2_
iqr
(11)
0 10d_v1_v3_median
1
2
3
4
d_v1
_v3_
iqr
(12)
10 20 30d_v1_v4_median
2
4
6
d_v1
_v4_
iqr
(13)
10 0d_v2_v3_median
2
4
d_v2
_v3_
iqr
(14)
10 20d_v2_v4_median
2
4
6d_
v2_v
4_iq
r(15)
10 20 30d_v3_v4_median
2
4
6
8
d_v3
_v4_
iqr
(16)
0.1 0.2s12_median
0.000
0.025
0.050
0.075
s12_
iqr
(17)
0.2 0.0s23_median
0.0
0.1
0.2
0.3
s23_
iqr
(18)
0.250.500.751.00s34_median
0.1
0.2
0.3
s34_
iqr
(19)
0.750.500.25s45_median
0.0
0.2
0.4
0.6
s45_
iqr
(20)
Figure 5.5: Confrontation based on precision-error
52
5.2 Relation between Performance and Throw Pattern
values for IQR than B which opposes our expectation of R having higher IQR
values.
Looking at Sub�gure (5), we observe R having lower values and being less
scattered according to IQR. This stands again against what we expect.
Similar to this, we see in Sub�gure (6) the same properties for R. In addition
R is also more concentrated regarding the median values.
In Sub�gure (11) for d_v1_v2 which represents the maximum acceleration R
and B intersect. Yet R is more concentrated with small values for median and
IQR. B tends to have higher IQR, which is not what we expect.
In the next sub�gure - Sub�gure (12) -, we see a similar yet stronger di�er-
ence as R and B just slightly overlap while R reaches lower values for both
distribution measures.
Next, we look at Sub�gures (13) and (15) for the related features d_v1_v4
and d_v2_v4 which represent respectively the maximum deceleration and the
di�erence between maximum acceleration and maximum deceleration. Both
times, we observe - ignoring one outlier each time - that R tends to have higher
IQR and smaller median. This �ts both times our expectation.
Regarding d_v2_v3 in Sub�gure (14), we observe R to be more concentrated
according to IQR as well as median. B reaches higher values for the median
as well as the IQR. This opposes what we expect to see.
Looking at Sub�gure (17) for s12 - the slope at the beginning of the acceleration
phase -, we observe R and B intersecting yet R being concentrated with small
values in both distribution measures. On the other hand, B reaches higher
values for median and IQR which opposes our expectations.
Lastly, we look at s23 in Sub�gure (18) which opposes again what we expected
as R shows lower IQR for a given median.
Using solely precision-error to de�ne a good group and a bad group, we ob-
served di�erences between those groups in the distribution measures for 10 of
20 throw features. Here just 2 observations match our expectation of R having
higher IQR while 8 observations show the opposite.
53
5 Results
5.2.3 Groups based on Fronts from non-dominated
Sorting
As we are interested if we get more insight by not looking at a single perfor-
mance criterion but instead considering multiple performance measures in a
combined way, we determine our groups in this last comparison based on the
fronts from non-dominated sorting on both performance measures - precision-
error and accuracy-error.
As these are error measures smaller values are more preferable. So smaller
values are more dominant for both measures during the non-dominated sorting.
We see the resulting fronts in Figure 5.6 Sub�gure (0) as polylines with the
best fronts starting from the left going to the worst fronts in the upper right.
Having these fronts, we de�ne R - the set of good elements - using the best three
fronts which contain in total 14 data points. For B - the set of bad elements
-, we select the last four fronts which contain a total of 15 data points.
In Sub�gure (0), we see R and B respectively in red and blue. R contains the
elements with the lowest values for accuracy-error and spans from elements
with the lowest occurring values for precision-error almost up to elements with
the highest values. This is a consequence of the absence of data points with
low precision-error and high accuracy-error which we discussed in Section 5.1.
So R consists mainly of elements from a slice starting from the left solely
in�uenced by accuracy-error. On the other hand, B consists of elements with
a medium to high accuracy-error and precision-error.
Looking at Sub�gures (5), we see for d_t2_t3 as well as d_t2_t4 which is the
latency between the time of maximum acceleration and the time of maximum
deceleration that R tends to be more concentrated with lower values for IQR.
This opposes our expectations.
Also, regarding d_v1_v2 which is the maximum acceleration, we see in Sub-
�gure (11) that R is more concentrated regarding IQR. Here it is intersecting
with the lower part of B. Yet again it opposes what we expect to observe.
In Sub�gure (12), we observe that R tends to have lower values for median
and IQR than B. This is contrary to our expectation.
Regarding Sub�gures (13) and (15), we observe the distribution measures for
d_v1_v4 and d_v2_v4 respectively the maximum deceleration and the dif-
ference from maximum acceleration to maximum deceleration. Here R tends
54
5.2 Relation between Performance and Throw Pattern
0 10accuracy_error
0
5
10
15pr
ecisi
on_e
rror
(0)
100 200d_t1_t2_median
50
100
150
d_t1
_t2_
iqr
(1)
100 200 300d_t1_t3_median
50
100
150
d_t1
_t3_
iqr
(2)
200 400d_t1_t4_median
50
100
150
d_t1
_t4_
iqr
(3)
300 400d_t1_t5_median
50
100
150
d_t1
_t5_
iqr
(4)
50 100d_t2_t3_median
25
50
75d_
t2_t
3_iq
r(5)
100 150d_t2_t4_median
25
50
75
d_t2
_t4_
iqr
(6)
150 200d_t2_t5_median
20
40
60
80
d_t2
_t5_
iqr
(7)
25 50d_t3_t4_median
20
40
60
d_t3
_t4_
iqr
(8)
100 150d_t3_t5_median
20
40
60
d_t3
_t5_
iqr
(9)
50 100d_t4_t5_median
25
50
75d_
t4_t
5_iq
r(10)
5 10d_v1_v2_median
2
4
d_v1
_v2_
iqr
(11)
0 10d_v1_v3_median
1
2
3
4
d_v1
_v3_
iqr
(12)
10 20 30d_v1_v4_median
2
4
6
d_v1
_v4_
iqr
(13)
10 0d_v2_v3_median
2
4
d_v2
_v3_
iqr
(14)
10 20d_v2_v4_median
2
4
6d_
v2_v
4_iq
r(15)
10 20 30d_v3_v4_median
2
4
6
8
d_v3
_v4_
iqr
(16)
0.1 0.2s12_median
0.000
0.025
0.050
0.075
s12_
iqr
(17)
0.2 0.0s23_median
0.0
0.1
0.2
0.3
s23_
iqr
(18)
0.250.500.751.00s34_median
0.1
0.2
0.3
s34_
iqr
(19)
0.750.500.25s45_median
0.0
0.2
0.4
0.6
s45_
iqr
(20)
Figure 5.6: Confrontation based on non-dominated fronts on both perfor-
mance measures
55
5 Results
to be smaller regarding the median but it is also more scattered regarding IQR
and tends to reach higher values in it. This matches what we expect.
Also in Sub�gure (16) for d_v3_v4, we see R being more scattered and tending
to reach higher values for IQR which �ts our expectations.
For s12 - the slope towards the maximum acceleration - in Sub�gure (17) R
is more concentrated with low values according to both distribution measures
which opposes what we expect.
Lastly regarding Sub�gure (18) for s23 which is the slope of decreasing accel-
eration after maximum acceleration R tends to have smaller median than B.
This neither �ts nor opposes our expectations.
Using groups based on the fronts from non-dominated on our performance
measures, we found di�erences in the distribution measures of 9 of 20 throw
features. Though 3 observed di�erences match our expectation of higher IQR
for R there are also 5 observations that oppose it.
5.2.4 Comparison of Sets of Observations
In each of the previous three sections, we de�ned a good and a bad group based
on di�erent strategies. The �rst time, we de�ned the groups based on the
accuracy-error, the second time based on precision-error, and the third time
based on the fronts from non-dominated sorting to combine both measures.
We used these groups to identify throw features that showed di�erences in the
distribution measures according to these groups.
In total, we made 23 observations of di�erences which we show in Table 5.1.
There, we marked that we observed a di�erence for a certain feature using a
certain strategy to de�ne the groups by marking the respective table cell with
an "x".
To get more insight into the data, we decided at the beginning of the devel-
opment of our method to use the two performance measures accuracy-error
and precision-error, and to combine them using non-dominated sorting. Now,
we see in the table that the accuracy-error based strategy gave us - with 4
observed di�erences - the fewest observations. The other two strategies gave
us - with about 10 each - more observations. Yet, for each strategy, the set of
56
5.3 Relation between Performance Developments and Throw Pattern Developments
observations is di�erent to the others and also contains at least one observation
that is unique as it appears under no other strategy.
In Section 5.2.3, we described that the selected R for the front-based selection
resembles a slice of data points according to just accuracy-error which we
can see in Figure 5.6, Sub�gure (0). Also, B appears to be close to being
selected by another slice just on accuracy-error except for some data points
below B. With this impression the groups for that section are very similar to
the groups from Section 5.2.1 with groups exclusively based on accuracy-error
which are represented in Figure 5.4, Sub�gure (0). Nevertheless, in Table 5.1,
we see that for the front-based selection that we have more than double the
amount of observed di�erences making it very di�erent than the accuracy-
based approach. According to the number of observations, the front-based
approach is more similar to the precision-based approach.
Regarding our expectations of higher variance for better players, we assessed
for each feature in which we saw a di�erence between the good and the bad
group whether this di�erence matches to our expectation or whether it opposes
showing the opposite. We summarized also this in Table 5.1 by marking a
match with a "(+)" as well an opposition with a "(-)". 22 of 23 observations
are either matching or opposing what we expect. Only one - for the feature
s23 in the front-based approach - shows a di�erence that neither matches nor
opposes as the di�erence relates exclusively to the median which we see in
Figure 5.6, Sub�gure (18). We want to emphasize that our expectations just
focus on IQR and we also observed di�erences in median in many other cases.
Looking row-wise through the table comparing the matching of observations
for a feature using di�erent approaches, we see for 7 of 8 rows with multiple
observations that they show the same whether they match or oppose. Overall,
we made 7 observations that �t our expectation but also 15 that oppose.
5.3 Relation between Performance
Developments and Throw Pattern
Developments
Based on our de�nition of development as the di�erence from the �rst block
to the later blocks, we have 4 data points for each participant. Each data
57
5 Results
Table 5.1: Overview of throw features with observed di�erences in perfor-
mance - We de�ned a good group and a bad group according to
di�erent strategies. When we observed a di�erence between the
groups according to a certain strategy (column) in a certain feature
(row) then we mark it here with a x. In addition, we marked dif-
ferences that �t or oppose our expectations respectively with (+)
or (-).
sub�gure feature accuracy-based precision-based front-based
see section 5.2.1 see section 5.2.2 see section 5.2.3
1 d_t1_t2 x (-)
5 d_t2_t3 x (-) x (-) x (-)
6 d_t2_t4 x (-) x (-) x (-)
11 d_v1_v2 x (-) x (-)
12 d_v1_v3 x (-) x (-)
13 d_v1_v4 x (+) x (+) x (+)
14 d_v2_v3 x (-)
15 d_v2_v4 x (+) x (+)
16 d_v3_v4 x (+)
17 s12 x (-) x (-)
18 s23 x (-) x
19 s34 x (+)
58
5.3 Relation between Performance Developments and Throw Pattern Developments
point consists of di�erences for performance measures as well as di�erences of
distribution measures for throw features. In the following Figures 5.7, 5.8, and
5.9, we show those di�erences for performance measures each time as Sub�gure
(0) and for distribution measures in Sub�gures (1) to (20).
Looking at Figure 5.7, Sub�gure (1) as an example, we see a scatterplot that
confronts the di�erences of median as well as IQR for the feature d_t1_t2.
In order to understand what we see in this type of diagram, we look at the
set of blue elements and ignore the meaning of this set for now. According to
median that is represented along the horizontal axis, we see that this blue set
shows small as well as above yet positive di�erence values. Some lay with a
very small di�erence value close to the vertical 0-line showing very small or no
development. Others, we see further to the right showing us an increase in
the median distribution measure. For the feature d_t1_t2 that captures the
time from the start of the acceleration until the maximum acceleration, this
means that the elements of the blue group take longer on average later on to
reach that peak acceleration. Switching to the vertical axis for the di�erences
in IQR, we see for some elements of the blue group that they are very close
to the horizontal 0-line or in other words that their di�erence value is very
close to 0. These show again very small or no development. The rest of the
blue group shows absolute bigger but negative di�erence values in IQR for
the given feature which makes them represent a decrease in IQR. For the
d_t1_t2 feature, this means that the variance of the time to the maximum
acceleration got smaller for these elements, which may mean that the process
of throwing got more stable or similar.
We see in this blue group for the d_t1_t2 feature a tendency for an increase
in median with a decrease in IQR though some elements show no development
in median or IQR. This observation may be more signi�cant or convincing
if all elements would show this increase and decrease. Yet the blue group is
more clear in its development being more concentrated compared to the red
group. While some elements of the red group lay close to the blue elements
there are others that actually show stronger developments but also opposite
developments which makes it more diverse and appear more scattered. For
example, there are also red elements in the bottom right of the �gure that
show a decrease in median as well as IQR.
Returning to our research question of what are di�erences between partici-
pants which improve a lot and participants which do not we will de�ne in the
59
5 Results
following a group that is good or better according to a chosen criteria based
on the development as well as another group that is bad or worse based on the
same criteria. We show the good group as the red group as well as we show
the bad group as the blue group. For our example from before from Figure
5.7, Sub�gure (1) this means that both groups tend to reduce IQR. Yet they
are di�erent in the development according to median as the bad group just
shows an increase while the good group shows a decrease as well as increase.
According to our idea of variability in motor control, we expect that the im-
provement of participants is related to an increase in variability which is here
represented by the distribution measure IQR for throw features. According to
this, our good group should show positive IQR di�erence values which repre-
sent an increase in IQR. Comparing both groups, we expect the IQR di�erences
for the good group to be bigger than the IQR di�erences for the bad group.
Considering our example from before, we see that the red good group and
the blue bad group are not separated according to IQR di�erences. Also, for
both, we rather see a decrease in IQR than an increase in IQR. Though some
elements show an increase in IQR yet our example shows a combination that
we did not expect.
In the following three sections, we observe di�erences in the developments of
throw features of our good and bad group based on three di�erent strategies
to de�ne those groups. First and second, we de�ne the good and bad group
solely based on the development in one of our performance measures. Third,
we de�ne them based on fronts from non-dominated sorting on developments
in both measures. Afterward, we compare the sets of observations that we
make.
5.3.1 Groups based on Accuracy-error
Now, we want to de�ne both groups based on the development in the accuracy-
error performance using di�erences in that measure from the �rst block to later
blocks. The good group is again assigned with the color red. We abbreviate
it in the following observations with R. On the other side the bad group is
assigned again to the color blue. So, we abbreviate it in the following observa-
tions with B. We selected the 11 elements with the biggest negative di�erence
values as R. They represent the elements with the biggest improvement as
their accuracy-error decreased the most from the �rst to the later block. In
60
5.3 Relation between Performance Developments and Throw Pattern Developments
addition, we selected the 11 elements with the biggest di�erence values as B.
Looking at Figure 5.7 Sub�gure (0), we see that they show a degradation as
their accuracy-error increased. From here, we can see that R and B represent
the left and right end of the distribution of occurring di�erence values.
Let's describe the observations for di�erences according to a good and a bad
group de�ned based on the development in accuracy-error performance mea-
sure. We focus exclusively on Figure 5.7. Yet, we do not describe sub�gures
in which we cannot see a clear di�erence between the groups.
Looking at Sub�gure (1) for the development of the distribution measures of
the d_t1_t2 throw feature, we see that B as well as R mostly change a little
or rather gets smaller in IQR. This does not match our expectations. However,
we see a di�erence in median as R is more scattered than B showing partially
an increase and partially a decrease in median while B shows just an increase
in median.
Looking at Sub�gure (2) for the development of the distribution of d_t1_t3,
we see that B is more concentrated with the tendency of increasing the median
while R is more scattered with di�erent combinations of increases or decreases
in median and IQR. We see similar developments for d_t1_t4 in Sub�gure (3)
as well as d_t1_t5 in Sub�gure (4).
In Sub�gure (6) for d_t2_t4, we see for B a slight decrease in median time
whereas R shows rather a decrease in IQR. This means the latency between
peak acceleration and peak deacceleration gets smaller for B whereas for R it
is more characteristic that the variance of that latency decreases meaning that
the throws getting more similar regarding that feature.
Regarding Sub�gure (8) for d_t3_t4 B seems to be slightly more scattered
according to developments in IQR. Yet, we see no di�erence that matches our
expectations.
In Sub�gure (9), we see for the total time of deceleration - d_t3_t5 - for R
mainly an increase in IQR whereas B shows rather a decrease in IQR. This
matches our expectations.
Starting to look at the amplitude features, we see for the maximum acceleration
- d_v1_v2 - that B decreases in IQR while R shows smaller also increasing
developments. Also, this di�erence does not support our expectations.
In Sub�gure (15) for d_v2_v4 - the amplitude di�erence between the maxi-
mum acceleration and the maximum deceleration -, we see no or in comparison
61
5 Results
2.5 0.0 2.5accuracy_error
0
10
20(0)
50 0 50d_t1_t2_median
100
0
d_t1
_t2_
iqr
(1)
50 0 50d_t1_t3_median
100
50
0
50
d_t1
_t3_
iqr
(2)
50 0 50d_t1_t4_median
100
50
0
50
d_t1
_t4_
iqr
(3)
50 0 50d_t1_t5_median
100
50
0
50
d_t1
_t5_
iqr
(4)
0 50d_t2_t3_median
40
20
0
d_t2
_t3_
iqr
(5)
50 0d_t2_t4_median
40
20
0
20
d_t2
_t4_
iqr
(6)
0 50d_t2_t5_median
40
20
0
20
d_t2
_t5_
iqr
(7)
25 0d_t3_t4_median
20
0
20
d_t3
_t4_
iqr
(8)
25 0 25d_t3_t5_median
25
0
25
d_t3
_t5_
iqr
(9)
50 0d_t4_t5_median
25
0
25
50d_
t4_t
5_iq
r(10)
2.5 0.0 2.5d_v1_v2_median
2
0
d_v1
_v2_
iqr
(11)
5 0d_v1_v3_median
2
0
d_v1
_v3_
iqr
(12)
0 10d_v1_v4_median
2
0
2
d_v1
_v4_
iqr
(13)
5 0d_v2_v3_median
2
1
0
d_v2
_v3_
iqr
(14)
0 10d_v2_v4_median
2
0
d_v2
_v4_
iqr
(15)
0 10d_v3_v4_median
2
0
d_v3
_v4_
iqr
(16)
0.05 0.00 0.05s12_median
0.05
0.00
s12_
iqr
(17)
0.1 0.0s23_median
0.3
0.2
0.1
0.0
s23_
iqr
(18)
0.00 0.25s34_median
0.2
0.0
0.2
s34_
iqr
(19)
0.5 0.0 0.5s45_median
0.2
0.0
0.2
s45_
iqr
(20)
Figure 5.7: Confrontation based on development of accuracy-error
62
5.3 Relation between Performance Developments and Throw Pattern Developments
to the rest small changes for R. On the other hand, B shows a tendency for a
reduction in median while it is also more diverse in IQR. Also this di�erence,
we cannot �t to our expected di�erences.
Looking at Sub�gure (18) for s23, we see as the only di�erence between B and
R that R is less scattered than B which we cannot �t to our expectations.
For Sub�gure (20) the feature s45, we see just for R an increase in IQR. This
�ts our expectations.
Summing up, we found 11 features with di�erences in development for both
groups based on the developments in accuracy-error, e.g., di�erent develop-
ments in median or IQR, or di�erences in the variations of the strength of
developments. Yet, there were just 2 cases that actually �t our expectations
on increasing variance.
5.3.2 Groups based on Precision-error
This time, we de�ne the groups based on the developments in precision-error
that is de�ned as the di�erences in precision-error of the �rst block to later
blocks. Like before, we select for the red good group - R - the 11 elements with
the smallest di�erence values. As we can see in Figure 5.8, Sub�gure (0) these
are elements at the lower border of the distribution of occurring di�erences
with a negative di�erence that stands for a decrease in precision-error. So
they are the elements that improved the most according to the precision-error.
On the other side, we assigned the 11 elements with the biggest di�erence
values which are closest to the upper limit of the distribution in Sub�gure (0)
to the blue bad group - B. They show di�erences close to 0 or positive. So they
show a small to medium increase in precision-error which is a degradation.
Looking at Sub�gure (1), we see for B a tendency to slightly increase in me-
dian as well as to slightly decrease in IQR. R is di�erent as it shows stronger
developments that are also more diverse because there are in addition elements
with opposite developments as described for B.
In Sub�gure (5) R appears again scattered compared to B without a certain
tendency for a common development. On the other hand, B is rather concen-
trated with a tendency to slightly decrease in median.
In Sub�gures (6), (7), (9) as well as (20), we see B as rather concentrated while
we see R rather scattered without a certain tendency for the developments.
63
5 Results
5 0precision_error
0
5
10
15(0)
50 0 50d_t1_t2_median
100
0
d_t1
_t2_
iqr
(1)
50 0 50d_t1_t3_median
100
50
0
50
d_t1
_t3_
iqr
(2)
50 0 50d_t1_t4_median
100
50
0
50
d_t1
_t4_
iqr
(3)
50 0 50d_t1_t5_median
100
50
0
50
d_t1
_t5_
iqr
(4)
0 50d_t2_t3_median
40
20
0
d_t2
_t3_
iqr
(5)
50 0d_t2_t4_median
40
20
0
20
d_t2
_t4_
iqr
(6)
0 50d_t2_t5_median
40
20
0
20
d_t2
_t5_
iqr
(7)
25 0d_t3_t4_median
20
0
20
d_t3
_t4_
iqr
(8)
25 0 25d_t3_t5_median
25
0
25
d_t3
_t5_
iqr
(9)
50 0d_t4_t5_median
25
0
25
50d_
t4_t
5_iq
r(10)
2.5 0.0 2.5d_v1_v2_median
2
0
d_v1
_v2_
iqr
(11)
5 0d_v1_v3_median
2
0
d_v1
_v3_
iqr
(12)
0 10d_v1_v4_median
2
0
2
d_v1
_v4_
iqr
(13)
5 0d_v2_v3_median
2
1
0
d_v2
_v3_
iqr
(14)
0 10d_v2_v4_median
2
0
d_v2
_v4_
iqr
(15)
0 10d_v3_v4_median
2
0
d_v3
_v4_
iqr
(16)
0.05 0.00 0.05s12_median
0.05
0.00
s12_
iqr
(17)
0.1 0.0s23_median
0.3
0.2
0.1
0.0
s23_
iqr
(18)
0.00 0.25s34_median
0.2
0.0
0.2
s34_
iqr
(19)
0.5 0.0 0.5s45_median
0.2
0.0
0.2
s45_
iqr
(20)
Figure 5.8: Confrontation based on development of precision-error
64
5.3 Relation between Performance Developments and Throw Pattern Developments
Regarding d_v1_v4 in Sub�gure (13), we observe a stronger tendency for R
to reduce in median compared to B.
Looking at d_v2_v3 in Sub�gure (14), we see for R decrease in IQR that we
do not see for B. This is opposing what we expect.
In Sub�gure (17) for s12 which represents the slope towards the maximum
acceleration, we see for B the tendency to decrease in median as well as in
IQR. We do not see this tendency for R.
Summing up, we identi�ed 9 features with di�erences in development based
on groups from development in precision-error. None of these observations
actually �ts our expectations while one of them even opposes our expectations.
5.3.3 Groups based on Fronts from non-dominated
Sorting
In this analysis, we de�ne R and B based on fronts from non-dominated sorting.
Similar to the application of non-dominated sorting in the analysis of status
for the multi-criteria case in Section 5.2.3, we determine the fronts based on
minimizing each measure. In that analysis, a smaller value for each of the
performance measures was better as they represent errors where a smaller error
is preferable. This time, we use the developments instead, i.e., the di�erences
in the performance measures from the �rst to later blocks. Yet, also with
these, a smaller value is preferable as we consider di�erences now. Negative
values represent a decrease in the errors while positive di�erences represent
an increase in error or in other words a degradation. The result of the non-
dominated sorting, we see in Figure 5.9, Sub�gure (0) where the resulting
fronts are visible as polylines with the best fronts starting in the bottom left
and the worst fronts in the top right. For the group of good elements that
we show again in red and abbreviate with R, we selected the elements of the
�rst 2 fronts with 12 elements. So the group size is close to the group size
in the previous analyses. As we can see in Sub�gure (0) R contains elements
with strong improvements in both performance measures showing elements
with negative di�erence values in both performance measures. Yet, at the
ends of the selected fronts, we see elements with strong improvement in one
measure while there is a degradation in the other measure. For the group of
bad elements -B -we selected the elements of the last 4 fronts with 10 elements.
65
5 Results
In Sub�gure (0), we see that this group consists of elements that slightly or
stronger degrade in both performance measures.
Looking at Sub�gure (1), we observe B being more concentrated with a slight
increase in median while R shows increases as well decreases in the median of
the feature d_t1_t2. In addition for R, we see a tendency to reduce in IQR.
This, plus the fact that B shows almost no development in IQR seems to show
the opposite of what we expect.
Regarding the next feature d_t1_t3 in Sub�gure (2), we also see that B is
more concentrated with a tendency to slightly increase in median while R
shows stronger developments with increases as well as decreases in median.
This also applies for the following feature in Sub�gure (3).
In Sub�gure (5) for d_t2_t3, we observe that B shows no development in IQR
while R shows decrease in IQR. This observation opposes our expectation.
Regarding the next Sub�gures (6) and (7), we see a similar di�erence for B
and R like in Sub�gure (5). Again R shows a stronger decrease in IQR which
opposes our expectation.
Looking at Sub�gure (8) for feature d_t3_t4, we observe for R a tendency to
decrease in in median while for B we observe a tendency to increase.
In Sub�gure (12), for B as well as R, we see a tendency to decrease in median.
Yet R decreases stronger.
For feature d_v2_v4 in Sub�gure (15), we observe for B the tendency to
decrease in IQR while for R we see increases as well as decreases in IQR. This
�ts partly to our expectations. Though the B and parts of R show decrease in
IQR nevertheless other parts of R show the expected increase in IQR and the
stronger increase than B.
In Sub�gure (16) for d_v3_v4, we observe for R the tendency to increase in
median while in B we see a weaker increase and also decrease.
Lastly, in Sub�gure (18), we see again for R a stronger tendency to decrease in
IQR while B is rather concentrated with no development in IQR which opposes
our expectations.
Summing it up using groups based on fronts from non-dominated sorting, we
found 11 features that showed di�erences in development according to those
groups. We found no feature in which the groups showed the development that
66
5.3 Relation between Performance Developments and Throw Pattern Developments
2.5 0.0 2.5accuracy_error
5
0
prec
ision
_erro
r(0)
50 0 50d_t1_t2_median
100
0
d_t1
_t2_
iqr
(1)
50 0 50d_t1_t3_median
100
50
0
50
d_t1
_t3_
iqr
(2)
50 0 50d_t1_t4_median
100
50
0
50
d_t1
_t4_
iqr
(3)
50 0 50d_t1_t5_median
100
50
0
50
d_t1
_t5_
iqr
(4)
0 50d_t2_t3_median
40
20
0
d_t2
_t3_
iqr
(5)
50 0d_t2_t4_median
40
20
0
20
d_t2
_t4_
iqr
(6)
0 50d_t2_t5_median
40
20
0
20
d_t2
_t5_
iqr
(7)
25 0d_t3_t4_median
20
0
20
d_t3
_t4_
iqr
(8)
25 0 25d_t3_t5_median
25
0
25
d_t3
_t5_
iqr
(9)
50 0d_t4_t5_median
25
0
25
50d_
t4_t
5_iq
r(10)
2.5 0.0 2.5d_v1_v2_median
2
0
d_v1
_v2_
iqr
(11)
5 0d_v1_v3_median
2
0
d_v1
_v3_
iqr
(12)
0 10d_v1_v4_median
2
0
2
d_v1
_v4_
iqr
(13)
5 0d_v2_v3_median
2
1
0
d_v2
_v3_
iqr
(14)
0 10d_v2_v4_median
2
0
d_v2
_v4_
iqr
(15)
0 10d_v3_v4_median
2
0
d_v3
_v4_
iqr
(16)
0.05 0.00 0.05s12_median
0.05
0.00
s12_
iqr
(17)
0.1 0.0s23_median
0.3
0.2
0.1
0.0
s23_
iqr
(18)
0.00 0.25s34_median
0.2
0.0
0.2
s34_
iqr
(19)
0.5 0.0 0.5s45_median
0.2
0.0
0.2
s45_
iqr
(20)
Figure 5.9: Confrontation based on fronts from non-dominated sorting on the
development in accuracy-error and precision-error
67
5 Results
we expected. Yet, we found 5 features where the groups showed developments
that rather oppose our expectations.
5.3.4 Comparison of Sets of Observations
We made 3 analyses using 3 di�erent strategies to de�ne good and bad groups.
For the �rst two analyses, we used the development in one of the perfor-
mance measures each. For the last analysis, we combined the development
in both measures using the fronts from non-dominated sorting. We opposed
these groups in each analysis in order to �nd di�erences in the development of
features that �t our expectation of a rising variance.
As we summarize in Table 5.2, we found 31 di�erences for 18 of 20 features,
in at least 9 features in each of our analyses.
Though we see di�erent developments many times, most of the time those
observed di�erences do not match to our expectation. Even 5 times they seem
to oppose our expectation and there are just 2 observations that match to
them.
Our initial motivation to use fronts from non-dominated sorting to combine
both performance measures was to observe e�ects that could not be found using
single performance measures. After these 3 analyses, we see that the selected
groups for the 3rd analysis - that we can see in Figure 5.9, Sub�gure (0) - could
not have been selected by the single measure approach on this data. For R, we
combined elements with the smallest values according to accuracy error which
were selected for R in our �rst analyses with elements with the smallest values
according to precision error which were selected for R in the second analyses.
For B it is more like an intersection, i.e., selecting elements that just appear
in both sets, of the Bs that we selected in the �rst two analyses. Also, the
observations that we made appear to us like a combination of both measures
as for 12 of 16 cases, we see in Table 5.2 that if we observed a di�erence in the
1st or the 2nd analyses then we also observed a di�erence in the 3rd analyses.
Yet, what we observed in the 3rd analysis is also di�erent because we see the
most cases of observed di�erences that oppose our expectations. In addition
for 2 features, we saw di�erences that we did not see in the analyses that were
based on the single performance measures.
68
5.3 Relation between Performance Developments and Throw Pattern Developments
Table 5.2: Overview of throw features with observed di�erences in develop-
ment - We de�ned a good and a bad group according to di�erent
strategies. When we observed a di�erence between the groups ac-
cording to a certain strategy (column) in a certain feature (row)
then we mark it here with a x. In addition, we marked di�erences
that �t or oppose our expectations respectively with (+) or (-).
sub�gure feature accuracy-based precision-based front-based
see section 5.3.1 see section 5.3.2 see section 5.3.3
1 d_t1_t2 x x x (-)
2 d_t1_t3 x x
3 d_t1_t4 x x
4 d_t1_t5 x
5 d_t2_t3 x x (-)
6 d_t2_t4 x x x (-)
7 d_t2_t5 x x (-)
8 d_t3_t4 x x
9 d_t3_t5 x (+) x
11 d_v1_v2 x
12 d_v1_v3 x
13 d_v1_v4 x
14 d_v2_v3 x (-)
15 d_v2_v4 x x
16 d_v3_v4 x
17 s12 x
18 s23 x x (-)
20 s45 x (+) x
69
6 Discussion
In this chapter, we will �rst interpret whether our method and results ful�ll
our goals and expectations. Afterward, we will give answers to our research
questions. At last, we will show ideas for future work to improve and extend
this work, and to �nd answers to related questions.
First, we like to recap this thesis. In Chapter 1, we said that we need to
learn movements for everyday activities. Based on the example of playing
darts, we planned to explore the relation between performance and the way
of throwing and named our research questions accordingly. What are the
di�erences between better and worse players? What are the di�erences between
players that improve more than others?
In Chapter 2, we followed the concept of variety destroying variety. Based on
this, we expected that better players show a bigger variety in their throws and
that players that improve more also show a bigger increase in the variety of
their throws. In addition, we decided to use multiple complementary measures
to quantify performance because we expected to make additional �ndings.
To test our expectations, we wanted to quantify throw performance and throw
physiology, as well as their developments over time, and to oppose them in
order to explore their relation. Additionally, we wanted to combine multiple
performance measures using non-dominated sorting in our analyses.
In Chapter 4, we proposed accuracy-error and precision-error to quantify per-
formance. For quantifying the physiology of throwing, we proposed to describe
the pattern of single throws regarding the time series of the norm of accelerat-
ing forces using throw features and to describe the variations of this pattern for
a set of throws by determining median and IQR for every throw features (cf.
Figure 6.1). Applying those measures to experimental data, we identi�ed in
Chapter 5 di�erences in the distribution properties of throw features between
good and bad groups based the performance measures.
71
6 Discussion
Figure 6.1: Overview of our method to prepare the confronting visualizations
72
6.1 Our Results
6.1 Our Results
In this section, we will evaluate how our results match our expectations in
order to give an answer to our research questions. We expected to make ad-
ditional �ndings by using multiple performance measures and by combining
them using fronts from non-dominated sorting. In Chapter 5, we found an
unexpected triangle-like relation between our performance measures. In ad-
dition, we observed di�erences regarding the status of physiological measures
and their development that were only visible with this approach. We expected
to observe for better performing participants a higher variance in their throw
physiology. Yet, in Section 5.2, we saw that much more observed di�erences
were opposing this expectation showing better players to have a lower vari-
ance. Just for a few features that are related to the maximum deceleration
- namely d_v1_v4, d_v2_v4, d_v3_v4, and s34 -, we observed di�erences
that match those expectations. For stronger improving participants, we ex-
pected to observe a higher increase in variance of throw physiology. Again, the
observations that we made in section 5.3 majorly did not ful�ll this expecta-
tion. Mostly they did neither support nor oppose it. So generally, more of our
observations opposed those expectations regarding the variance in throw kine-
matics. Apart from the in�uences of our method that we discuss in the next
section, we see our expectations as a reason. They base on an understanding
in which we assume that the variance in throw physiology is exclusively coun-
tering perturbations. This way, we ignored variance in the way of throwing
that is not related to improvements in performance. We want to give an illus-
trating example. When we start shaking a participant who is throwing darts
then the variance in throwing will increase but we do not expect that the per-
formance will also improve. Quite the contrary, we expect the performance to
degrade due to this additional perturbations. This leads us back to concept
of [Latash2008] of good and bad variance. In this thesis, we were ignoring
that certain variance in throwing decreases performance which is also part of
the variance of the throwing physiology that we measure. To recap, Latash's
good and bad variance describe relations between the variety of a movement
and the variety of ful�lling its goal. Here, good variance represents a negative
correlation as more variance in movement leads to less variance in ful�lling
the goal. On the other hand, bad variance represents a positive correlation
as more variance in movement leads to more variance in ful�lling the goal.
[Boenke2018] subclassi�ed Latash's bad variance. They name the relation of
73
6 Discussion
good variance quality variability. The relation of bad variance is separated
into two types. Trivial variability is related to pursuing di�erent goals. If a
participant would alternatingly throw a dart towards a dartboard on a wall
and drop a dart on a dartboard in front of their feet then we would observe
more variety in their movement considering both movements a throw. The
second part of bad variance is noise which relates to unintended disturbances
of the movement. Shaking a participant during a throw increases the variance
of the throwing but also decreases the performance. When we try to explore
those correlations, we can measure the variance of the movement as we did in
this thesis. Yet, this variance is compound from the variance of all three in-
�uences. As also the strength of these in�uences is unknown, the in�uences of
triavial and noise variability may have superimposed the in�uences of quality
variability. So, with our method, we may have observed the e�ect of trivial and
noise variability instead of the in�uence of quality variability that we wanted
to observe.
Based on the results of this thesis, we can answer our research questions like
this.
• What are the di�erences between good and bad dart players?
� For the participants in our experiment, we saw that better partici-
pants generally showed less variance in the kinematic properties of
their throws.
• What are di�erences between participants that improved more and par-
ticipants that improved less?
� For the participants in our experiment, we saw mostly no clear
di�erence according to the development in the variance of kinematic
properties of their throws.
6.2 Our Method
In this section, we will evaluate whether our method ful�lls our goals and point
out in which parts we see strengths and weaknesses in our approach.
We will brie�y summarize our method, �rst. We started with a dart throw
experiment which we used to collect data for the physiology of throws and
74
6.2 Our Method
their outcome. Based on this, we quanti�ed - on the one hand - performance
according to two measures. On the other hand, we quanti�ed physiology by
focussing on the norm of the accelerating forces from the kinematic data, by
�tting an acceleration model in order to extract throw features, by quantifying
distribution properties of those features using IQR and median. We quan-
ti�ed the development in performance measures and physiological measures
using the di�erences from the �rst to later blocks. We opposed performance
and physiology by selecting good and bad groups according to single or both
performance measure and visualizing these groups for every throw feature.
Similarly, we opposed development in performance and development in physi-
ological measures by selecting such groups and visualizing them regarding the
distribution measures of throw features. Based on these visualizations, we
identi�ed di�erences regarding our expectations.
We see factors in our method that may in�uence the results to be less reliable
and less convincing.
First, we want to emphasize the in�uence of the variance of participants from
the experiment. We aggregate the throws block-wise to compute our perfor-
mance and physiological-related measures. Then, we analyze the compound
dataset including all 5 blocks for the status analyses and the 4 di�erences for
each participant to later blocks in the analyses of the development. In Figure
6.2, we visualized again the values for accuracy-error and precision-error as we
already did before in Figure 5.1, Sub�gure (2) and in Figure 5.6, Sub�gure (0).
Yet, this time, we also connected for every participant the data points in the
order of the blocks to a polyline. We see in the lower left of the diagram the
polylines for three participants with low values according to both measures.
They do not intersect each other and lay separated from the polylines for the
rest of the participants. Comparing those three to the other polylines, we see
that their points lay closer to each other. This matches the concept of the law
of practice which states that under practice the performance changes less and
less over time (cf. [Edwards2011]). The three advanced participants - with
presumably more practice in their lives - show smaller changes in the perfor-
mance measures between blocks. The participants with worse performance
measure values - presumably beginners - show bigger changes in comparison of
subsequent blocks which makes the points of their polylines lay further apart.
Based on these measures, we select in our method the good and bad groups
for opposing physiological throw properties visually. Since the points of those
advanced participants lay closer together, it is more probable that more points
75
6 Discussion
0 2 4 6 8 10accuracy-error
0.0
2.5
5.0
7.5
10.0
12.5
15.0
17.5
prec
ision
-erro
r
Figure 6.2: Traces of block performance - After aggregating the performance
measures for every block of every participant we connected for ev-
ery participant the data points in block order to a polyline. The
start and respectively Block 1 is marked with a dot.
76
6.2 Our Method
of the same participant get selected for a good group. As the points for the
beginners are more scattered, it is more probable that the points of many dif-
ferent participants get selected for the bad groups. To exaggerate, we could
imagine that for the good group we select only the points of a single partici-
pant and compare this group in our analyses to a bad group where each point
belongs to a di�erent participant. So, to a certain extent, we generalize for the
bad performances more than for the good performances.
Furthermore, we see in Figure 6.2 that the majority of participants does not
reach the proximity of the initial performance of the participants in the lower
left. So, there are at least two classes of performance levels in the data in our
experiment. This leads us to the open question of whether our participants
are actually comparable using our method.
We see as another in�uence the errors that we introduce in our method. We
have measuring errors during the recording of the experiment data. We intro-
duce errors while �tting the model due to the nature of �tting a model but
also due to our model not �tting to all occurring cases. Lastly, we manually
identify the di�erences which is a subjective process.
We use a model that describes the throw process regarding the strength of
the acceleration. We expect two short and strong accelerations - �rst, an
acceleration phase to accelerate the dart to an appropriate velocity and second
a deceleration phase to stop the throwing arm again. We model this using a
piecewise linear model with two subsequent peaks. In our experiment, we
measured the accelerating forces at the wrist of the throwing arm according
to three axes. We simpli�ed this 3-dimensional time series data by computing
the norm of the accelerating forces. Then we �tted for every throw our model
to the time-series of the norm.
So, we �tted our model about the strength of acceleration to data about the
strength of accelerating forces. One in�uence that occurs as accelerating force
but which we did not model is the gravity. In Section 3.3, we already observed
the e�ect of gravity to lift the baseline of the accelerating forces in rest phases.
We could compensate this for our amplitude features by using the di�erence
to the baseline for characterizing, e.g., the height of the �rst or the second
peak. Yet, there is another in�uence which appears to displace certain form
features. In Figure 6.3, we see again a time-series of the norm of the accel-
erating forces for a throw as a black line and the �tted model as a red line.
Along the horizontal axis, we see the time in milliseconds in an interval with a
77
6 Discussion
length of 2000 milliseconds. Along the vertical axis, we depicted the strength.
Additionally, we can see the intermediate versions of the �tted model with
dotted lines. According to the data as well as the model, we can see a baseline
at a strength of about 10. We clearly see the two peaks for the acceleration
phases. Yet, we also observe a stronger mismatch between model and data
between the two peaks and even stronger after both peaks or in other words
at the time 0. Here, we see that the data falls below the baseline. Our model
cannot follow because it assumes the baseline to be the minimum. We suppose
Figure 6.3: Annihilation of forces - Between both peaks and after the second
peak at time 0 the norm of the measured forces fall far below the
baseline which the model cannot represent.
that this is an e�ect of measuring accelerating forces from the throw process
together with gravity. With our sensor, we measure the sum of accelerating
forces over time according to three axes. One accelerating in�uence is gravity
that is globally pointing downwards and static with a �xed strength. In phases
in which the sensor rests or in phases of weak accelerations and slow velocity -
e.g., when the participants slowly lift the arm to the initial throwing position -
it is the strongest force that superimposes others. For this reason, we observe
in those phases a baseline that is lifted to the static value for the strength
of gravity. Yet, in phases of strong acceleration and deceleration, the related
forces superimpose the e�ect of gravity. For this reason, we observe our two
78
6.2 Our Method
peaks. Yet, both forces are often not additive. For example, at the end of
the phase of deceleration, some participants are moving their throwing hand
almost vertically. They decelerate by carrying out a decelerating force that is
directed upward. The sensor measures the sum of these force vectors. While
gravity is pointing downwards, decelerating forces with a similar strength are
pointing up. The summed vector that a�ects the sensor is shorter - or in other
words, has a smaller strength - than gravity. In other words, those forces an-
nihilate each other. For the norm of the accelerating forces at that moment it
means that the norm has a value beneath the baseline which is what we can
see in Figure 6.3. This annihilation depends on the directions of the occurring
forces. Thus, it is speci�c to the throw pattern of participants. So it may
occur for some participants and may not occur for others. The consequences
are that the shape of the time-series contains unexpected features which our
model cannot handle. For this reason, certain form features can deviate from
their intended position and our model is less robust to be �tted to the data,
which may, later on, in�uence the distribution properties by displacing median
and increasing IQR. These form features are the basis for the throw features
and in�uence therefore further analyses.
Another case in which the �tting of our model is less robust occurs when both
peaks for the acceleration phases are badly separated. In Figure 6.4 and Figure
6.5 we have the data and the �tted models for two subsequent throws of the
same participant. The course of curves for the data in both �gures is similar
with a small peak for the acceleration phase at time -100 ms, a big peak for
the deceleration phase at time 0 ms, and an unclear transition between both.
While both time-series are similar, the �tted models vary in this interval. In
the following throw which we can see in Figure 6.6, the �rst peak could not be
identi�ed. Though our measures - median and IQR - are more robust against
outliers, if the frequency of mis�ts is too high then this will a�ect median and
IQR on these features presumably shifting median and increasing IQR.
Lastly, in our analyses, we manually identi�ed the di�erences between good
and bad groups according to the diagrams that we created. Yet, this is also a
subjective and sequential process in which we may change our subjective crite-
ria for identifying a di�erence. On time we may have accepted a visual pattern
as a di�erence. Another time, we may have ignored a similar visual pattern.
We see that there are multiple in�uences that may reduce the reliabliity of the
results of our method. Yet, their extent is unknown so far.
79
6 Discussion
Figure 6.4: 40th throw - The model (red) could not be �tted to capture the
valley between both peaks in the data (black).
Figure 6.5: 41st throw - The model (red) could be �tted to capture the valley
between both peaks in the data (black).
80
6.3 Future Work
Figure 6.6: 42nd throw - The model (red) could not be �tted to capture the
�rst peak for the acceleration phase and the valley towards the
peak for the deceleration phase in the data (black).
Nevertheless, we managed to base our analyses on relevant data from a dart
experiment in which we recorded data regarding performance and physiology
of throws. Based on this data, we managed to make multichannel time-series
of throws comparable by capturing kinematic features of throws. As we in-
tended, we found and applied a way to quantify performance and physiology
regarding darts-game, as well as the development in those measures. We also
managed to apply multiple performance measures in a combined way in our
analyses. Lastly, by assessing the relation between performance-related and
physiological measures, we were able to identify the di�erences between high
and low performing respectively strongly and weakly improving participants
which we could compare against our expectations.
6.3 Future Work
In this last section, we like to connect this thesis to future research by pro-
viding ideas for future work for improving and extending our method, and to
answer related questions. First, how may we improve our method to answer
81
6 Discussion
the questions apart from the examination and handling of the problems that
we mentioned in Section 6.2?
To seperate the compound variances for the di�erent types of variability in
order to explore the quality variability, [Boenke2018] propose to focus solely
on throws that ful�ll the goal in order to avoid trivial variability and noise
in order to keep the e�ect of the movement stable. We see another option
to explore those correlations by exploring the development over time. We
assume that trivial variability as well as noise variability decrease over time
while quality variability increases over time. As participants practice, they �nd
more options to ful�ll the goal. Applying more di�erent movements that reach
the goal increase the variance in the movement that we observe. Regarding
noise variability, participants may get better in avoiding noise. They may
start to time their movements with breathing and the heartbeat. This will
decrease the variance of noise variability over time towards a certain minimum.
Regarding trivial variability, participants may get better over time to focus
on the given goal. Also here, the variance in the movement will decrease
over time to a certain minimum. Following the idea that trivial variability
and noise variability decrease over time and only quality variability increases
over time, we should see for participants that decrease in trivial and noise
variability initially faster than they increase in quality variability a certain
v-shape of the development. As the decrease in trivial and noise variability
superimpose in the sum, the sum initially decreases as well. Over time the
changes in both variabilities get smaller until the increase in variance for quality
variability superimposes and the sum of variances start to increase. This is
just an example development. In Figure 6.7 we depicted for every participant
the development of the IQR of the feature d_t1_t2 over the 5 blocks of our
experiment. The red lines mark participants who show in the �fth block a
higher IQR than in the �rst Block. We see for several participants a decrease
towards the second or third block. We also see for some participants an increase
in towards the 4th and �fth block. This is a tempting approach which leaves the
question of how these developments relate to the development in performance.
We decided in this thesis to exclude the data that we recorded in the second
phase of the execution of our experiment from further analyses for several
reasons like mismatching scales and artifacts from recording. We could try to
solve these problems to extend the dataset. We may then use the additional
data to validate the di�erences that we observed here.
82
6.3 Future Work
B1 B2 B3 B4 B5
20406080
100120140160180
d_t1
_t2_
iqr
Figure 6.7: Development of IQR for d_t1_t2 over time
In order to reduce the errors that we introduce by abstracting the throw time-
series data, we may search for a better model and improve the �tting process
itself. Our model based on a piecewise linear function. Eventually, we could
model the acceleration phases as bell curves or similar curves.
We could try to answer the research questions in di�erent ways.
Following the idea of [Hulikanthe Math2017], we could explore the tendencies
following a front from non-dominated sorting and compare the tendencies be-
tween good and bad fronts. This could provide us with hints regarding the
in�uence of single performance criteria.
From the physiological data that we collected in our experiment, we focused
solely on the strength of the accelerating forces. A di�erent approach could be
to explore in a similar way other measures like the rotational velocity. Next,
we could also focus on multi-dimensional data. We could focus on the multi-
dimensional kinematic data eventually de�ning a multidimensional model or
reconstructing and analyzing the trajectories of throwing movements. Another
multidimensional measure that we collected are the electrical activities from
83
6 Discussion
EEG. With them, we could explore the relation between brain activity and
performance or between brain activity and the variance in physiological throw
measures.
In this thesis, we decided to apply di�erences to quantify development in per-
formance measures and physiological measures. Though there is also the option
of ratios as quanti�cation for changes, we are curious about a multidimensional
option. In this thesis, we already noticed the e�ect that is described by the
law of practice. Depending on the progress of practice which is related to
the performance the acquired improvements get smaller and smaller. To ex-
tract whether a participant improved much or few in comparison to others
and in relation to the initial performance, we could determine the fronts from
non-dominated sorting based on the performance and the development in per-
formance.
In order to identify di�erences, we opposed good and bad groups which we se-
lected based on the performance measures and the fronts from non-dominated
sorting. We can think of other selection strategies. For development in per-
formance measures, we have the concept of improvement and degradation.
According to this, we could also oppose elements that improve with elements
that degrade. So far, we followed the idea of selecting according to performance
and observe the di�erences in the physiological measures. Another option is
to select groups according to physiological measures and observe their di�er-
ences according to performance measures. Summing up, future work can be
done regarding di�erent ways of quantifying physiology and performance, and
other strategies to select groups for comparison to answer the same research
questions.
From this work, we also met related questions.
Assuming that participants do not improve anymore, e.g., because they already
reached some kind of personal maximum performance, we wonder if there is a
tradeo� between our two performance measures. This is related to our next
question.
In Section 5.1.1, we observed the triangle-like relation between our performance
measures. We were able to show that our method does not evoke this pattern
in general. So, what creates this pattern? Some of our considerations suspect
some kind of feedback mechanism. With low precision, the hits of a participant
are scattered all over the dartboard. This makes it harder to access a bias like
84
6.3 Future Work
our accuracy-error compared to the case of densely concentrated hits for a
participant with high precision. In other words, it may be easier to perceive
the extent of the accuracy-error and so to counter it when the precision-error
is smaller. This may relate to the batch size that we use to aggregate our
measures. We �xed our batch size to 100 which represents a block. Assuming
that we would aggregate just 10 throws instead of 100, we suppose that we
observe performance beneath the diagonal of the triangle that we observed.
So, how does the batch size in�uence our results?
Since we considered the relation between performance and physiological mea-
sures, as well as the relation between the development of performance and
physiological measures, a slightly di�erent question would be the relation be-
tween initial status and development. So, how do good or bad performing
participants change over time?
Lastly, We manually identi�ed the di�erences between good and bad groups.
We spend quite some time to do this. In addition, the results are a�ected by
our subjective judgment. To improve this we wonder how we could automatize
this process which may include the interfront and intrafront analyses.
85
Bibliography
[Ashby1958] W. R. Ashby. Requisite variety and its implications for the con-
trol of complex systems. Cybernetica, (1:2):83�99, 1958.
[Bernstein1967] N. Bernstein. The co-ordination and regulation of move-
ments. Pergamon Press, Oxford, 1. engl. ed. edition, 1967.
[Boenke2018] Lars T. Boenke. Reframing variability: From nuisance to an aid
to understand complex systems dynamics, Talk presented in the McIntosh
Lab, Rotman Research Institute - Baycrest Centre, Toronto, Canada, 2018.
[Cheng et al.2015] Ming-Yang Cheng, Chiao-Ling Hung, Chung-Ju Huang,
Yu-Kai Chang, Li-Chuan Lo, Cheng Shen, and Tsung-Min Hung. Expert-
novice di�erences in smr activity during dart throwing. Biological psychol-
ogy, 110:212�218, 2015.
[Deb2011] Kalyanmoy Deb. Multi-objective optimisation using evolutionary
algorithms: An introduction. In Multi-objective evolutionary optimisation
for product design and manufacturing, pages 3�34. Springer, London [u.a.],
2011.
[Edwards2011] William H. Edwards. An introduction to motor learning and
motor control. Wadsworth Cengage Learning, Belmont CA, 1. ed., interna-
tional ed. edition, 2011.
[Günnemann et al.2010] Stephan Günnemann, Ines Färber, Hardy Kremer,
and Thomas Seidl. Coda. Proceedings of the VLDB Endowment, 3(1-
2):1633�1636, 2010.
[Heylighen and Joslyn2001] Francis Heylighen and Cli� Joslyn. Cybernetics
and second-order cybernetics. Encyclopedia of physical science & technology,
4:155�170, 2001.
[Hulikanthe Math2017] Nithish Hulikanthe Math. Multi-Objective Success
Factor Analysis for IT based Startups. Master thesis, Otto-von-Guericke-
University, Magdeburg, 2017.
87
Bibliography
[Latash2008] Mark L. Latash. Synergy. Oxford University Press, Oxford and
New York, 2008.
[Lee et al.2014] S. Lee, Y. S. Kim, C. S. Park, K. G. Kim, Y. H. Lee, H. S.
Gong, H. J. Lee, and G. H. Baek. Ct-based three-dimensional kinematic
comparison of dart-throwing motion between wrists with malunited distal
radius and contralateral normal wrists. Clinical radiology, 69(5):462�467,
2014.
[Levin2014] Andreas Levin. Analyzing startup performance: How methods of
machine learning and multi-criteria optimization may help to uncover suc-
cessful startups. Master thesis, Karlsruhe Institute of Technology, Karlsruhe,
2014.
[Preim et al.2009] Bernhard Preim, Ste�en Oeltze, Matej Mlejnek, Eduard
Gröeller, Anja Hennemuth, and Sarah Behrens. Survey of the visual explo-
ration and analysis of perfusion data. IEEE transactions on visualization
and computer graphics, 15(2):205�220, 2009.
[Ranganathan2004] Ananth Ranganathan. The levenberg-marquardt algo-
rithm. Tutorial on LM algorithm, 11(1):101�110, 2004.
[Seidler et al.2017] Rachael D. Seidler, Brittany S. Gluskin, and Brian Gree-
ley. Right prefrontal cortex transcranial direct current stimulation enhances
multi-day savings in sensorimotor adaptation. Journal of neurophysiology,
117(1):429�435, 2017.
[van Beers et al.2013] Robert J. van Beers, Yor van der Meer, and Richard M.
Veerman. What autocorrelation tells us about motor variability: insights
from dart throwing. PLoS ONE, 8(5):e64332, 2013.
[Vrugt et al.2007] Jasper A. Vrugt, Jelmer van Belle, and Willem Bouten.
Pareto front analysis of �ight time and energy use in long-distance bird
migration. Journal of Avian Biology, 38(4):432�442, 2007.
[Weber and Doppelmayr2016] E. Weber and M. Doppelmayr. Kinesthetic
motor imagery training modulates frontal midline theta during imagination
of a dart throw. International journal of psychophysiology : o�cial journal
of the International Organization of Psychophysiology, 110:137�145, 2016.
88
Declaration of Authorship
I hereby declare that this thesis was created by me and me alone using only
the stated sources and tools.
Thomas Hennig Magdeburg, November 20, 2018