Variability Analysis based on Multi-Objective Performance ...und+Bachelor_Arbeiten/MasterThe… ·...

Thomas Hennig

Variability Analysis based on

Multi-Objective Performance and

Throw Acceleration in Dart-Game

Intelligent Cooperative Systems

Computational Intelligence

Variability Analysis based on Multi-Objective

Performance and Throw Acceleration in

Dart-Game

Master Thesis

Thomas Hennig

November 20, 2018

Supervisors: Prof. Dr. Sanaz Mostaghim,

Chair of Computational Intelligence

Prof. Dr. Frank Ohl,

OvGU, Leibniz Institute for Neurobiology

Advisors: M.Sc. Heiner Zille

Dipl.-Neurowiss. Lars T. Boenke

Thomas Hennig: Variability Analysis based on Multi-ObjectivePerformance and Throw Acceleration in Dart-GameOtto-von-Guericke UniversitätIntelligent Cooperative SystemsComputational IntelligenceMagdeburg, 2018.

Abstract

Movements are part of everyday activities, need to be learned and can be im-proved regarding their movement goal. In related work, occurring variancein movements is mostly considered as noise. Whereas theories from cyber-netics and dynamic system theory suggest that part of this variance supportsachieving the movement goal.

The aim of this thesis was to explore the relation of variability between throwperformance and throw physiology in throwing darts. Di�erences betweenplayers that perform better and players that perform worse should be identi-�ed. In Addition, di�erences between players that improve more and othersthat improve less should be identi�ed. In related work, a single measure is usedto quantify performance in throwing dart. In this thesis, multiple measuresshould be used to quantify performance. For analyzing the relation betweenperformance and the way of throwing, 6 visual analyses were conducted. Ineach analysis, a good and a bad group based on performance measures wereconfronted so that di�erences in their throw pattern could be identi�ed. Toquantify performance, two performance-measures for sets of throws were de-�ned to represent scatter and average deviation. To quantify the variance ofthrow movements, the variance of an acceleration model that was �tted toevery throw was determined.

A triangular relation between our performance measures was found that is nei-ther the e�ect of their de�nition nor an e�ect solely created by our method.Additional unique di�erences were identi�ed by analyzing both performancemeasures together using the fronts from non-dominated sorting. For the ma-jority of observed di�erences in the throw pattern, a better performance wasrelated to a smaller variance in the throw pattern. For few observed di�erencesthrowing, a better performance was related to a higher variance in the throwpattern. For the di�erences in the development of parts of the throw pattern,clearly less observed di�erences were related to the development of variance.

Di�erent types of variability relations are suggested by our results. The vari-ance of these relations is compound when quantifying the variance in the wayof throwing. Further work needs to be done to separate these compound vari-ances.

I

Preface

First of all, I like to thank Lars T. Boenke, Heiner Zille, Prof. Dr. Frank Ohl,and Prof. Dr. Sanaz Mostaghim for giving me the opportunity to work onthis topic. I thank them for all their time for our many meetings, for all theirsupport and all their advice during the work for this topic. I thank them andDominik Weikert for their preceding work and ideas.

Next, I want to thank Prof. Ross Cunnington, Tamara Powell et al. forproviding us with the data of the experiment for our analyses and for explainingthese in detail.

Furthermore, I like to thank Dr. Christoph Steup for sharing his know-how inthrowing darts and his expertise on analyzing measured data, as well as thestudents in the work group of Prof. Dr. Mostaghim for their advice and ideas.

It the end, as this thesis is also the completion of my studies, I like to thankmy partner and my family for their support throughout the years.

III

Contents

List of Figures VII

List of Tables IX

1 Introduction 1

2 Background 5

2.1 Game of Darts . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.2 Theories of Motor Control . . . . . . . . . . . . . . . . . . . . . 7

2.3 Approaches to quantify Physiology and Performance . . . . . . 10

2.4 Approach to handle multiple Performance Measures . . . . . . . 13

2.5 Goals and Expectations . . . . . . . . . . . . . . . . . . . . . . 16

3 Data Basis 17

3.1 Dart Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . 17

3.2 Towards Performance Measures . . . . . . . . . . . . . . . . . . 20

3.3 Focus on accelerating Force as recorded physiological Measure . 23

4 Proposed Method 29

4.1 Performance Measures . . . . . . . . . . . . . . . . . . . . . . . 29

4.2 Describing the Throw Pattern by Throw Features . . . . . . . . 31

4.3 Quantifying Development . . . . . . . . . . . . . . . . . . . . . 37

5 Results 39

5.1 Performance measures from Experiment Data . . . . . . . . . . 40

5.1.1 Regarding Block Performance . . . . . . . . . . . . . . . 40

5.1.2 Regarding Performance Development . . . . . . . . . . . 45

5.2 Relation between Performance and Throw Pattern . . . . . . . . 48

5.2.1 Groups based on Accuracy-error . . . . . . . . . . . . . . 49

V

Contents

5.2.2 Groups based on Precision-error . . . . . . . . . . . . . . 50

5.2.3 Groups based on Fronts from non-dominated Sorting . . 54

5.2.4 Comparison of Sets of Observations . . . . . . . . . . . . 56

5.3 Relation between Performance Developments and Throw Pat-

tern Developments . . . . . . . . . . . . . . . . . . . . . . . . . 57

5.3.1 Groups based on Accuracy-error . . . . . . . . . . . . . . 60

5.3.2 Groups based on Precision-error . . . . . . . . . . . . . . 63

5.3.3 Groups based on Fronts from non-dominated Sorting . . 65

5.3.4 Comparison of Sets of Observations . . . . . . . . . . . . 68

6 Discussion 71

6.1 Our Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.2 Our Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

6.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Bibliography 87

VI

List of Figures

2.1 The dartboard that we used in this thesis . . . . . . . . . . . . . 6

3.1 The inertial measurement unit attached to the wrist . . . . . . . 19

3.2 Distribution of hits with high scatter and high displacement . . 21

3.3 Distribution of hits with low scatter and low displacement . . . 22

3.4 Distribution of hits with low scatter and high displacement . . . 22

3.5 Distribution of hits with high scatter and low displacement . . . 23

3.6 Time series for accelerating forces for two subsequent throws . . 24

3.7 Properties of rest phases . . . . . . . . . . . . . . . . . . . . . . 27

4.1 Extracted time series of the accelerating force for all throws of

participant J01 . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.2 Time related form features in the piecewise linear model . . . . 33

4.3 Amplitude related form features in the piecewise linear model . 33

4.4 Fit for the 42nd throw of participant J01 . . . . . . . . . . . . . 35

4.5 d_t1_t2 as example for a throw feature from a time di�erence . 36

4.6 d_v1_v4 as example for a throw feature from a amplitude dif-

ference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4.7 s12 as example for a throw feature from a slope . . . . . . . . . 37

5.1 Occurring performance measure values . . . . . . . . . . . . . . 41

5.2 Visualizing distortions in the relation between our performance

measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

5.3 Occurring values for development in performance measures . . . 46

5.4 Confrontation based on accuracy-error . . . . . . . . . . . . . . 51

5.5 Confrontation based on precision-error . . . . . . . . . . . . . . 52

VII

List of Figures

5.6 Confrontation based on non-dominated fronts on both perfor-

mance measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5.7 Confrontation based on development of accuracy-error . . . . . 62

5.8 Confrontation based on development of precision-error . . . . . 64

5.9 Confrontation based on fronts from non-dominated sorting on

the development in accuracy-error and precision-error . . . . . . 67

6.1 Overview of our method to prepare the confronting visualizations 72

6.2 Traces of block performance . . . . . . . . . . . . . . . . . . . . 76

6.3 Annihilation of forces . . . . . . . . . . . . . . . . . . . . . . . . 78

6.4 40th throw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

6.5 41st throw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

6.6 42nd throw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

6.7 Development of IQR for d_t1_t2 over time . . . . . . . . . . . 83

VIII

List of Tables

4.1 Performace measure examples . . . . . . . . . . . . . . . . . . . 31

5.1 Overview of throw features with observed di�erences in perfor-

mance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

5.2 Overview of throw features with observed di�erences in devel-

opment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

IX

1 Introduction

When we learn a second language, we learn words, phrases, grammar, pro-

nunciation and as part of this also often sounds that are speci�c to the new

language. For people with German as the �rst language, this might be the

th-sound when learning English. The other way around, for people with En-

glish as the �rst language who learn German, there are the umlauts. To create

the th-sound, we need to coordinate tongue and the tip of the upper teeth

to form a slit through which we press air with our breath to create a hissing

sound. For the umlauts, we take the example of the sound for ü. We start

by coordinating the jaw, tongue, and lips to continuously create the English

sound for the letter e. Now, keeping the jaw and the tongue in this position,

we purse our lips slowly until just a �ngertip-sized opening is left. We see that

we need to coordinate the movement of multiple parts of our body to create

a speci�c new sound. To use these sounds in the language, we need to learn

to create those sounds reliably and fast - so, others may understand us. As a

baby, there are not just a few sounds that we need to learn. We need to learn

to move mouth, jaw, tongue and vocal cords and to coordinate this with the

breath for the many sounds of the language of our parents.

For many everyday actions, we need to learn movements. For communication,

apart from talking, we also need to learn the movements for writing whether

using a pen or using a keyboard. We need to learn the movements for walking,

for grabbing. After an accident, we may need to learn to talk again. We may

need to learn to walk using an arti�cial limb. In sports, we need to learn

movements. For football, we need the moments to run and control a ball at

the same time. In darts or curling, we need a movement to accelerate a missile

to hit a distant target.

We know that we move due to forces that apply to limbs and tissue. We

know that these forces originate from muscles that contract. We know that

these muscles contract due to signals provoked in the brain. We know the

structure of the brain, that certain brain areas correspond to certain body

1

1 Introduction

parts. However, we do not know how it works so that movements emerge. We

do not know how the brain learns a new movement. We do not know how

the brain changes to improve the movement. If we would know, we could tell

what to do to be an expert and how to get there. We could choose strategies

to learn faster.

A question that is related to this is whether we execute always the exact

same movement. [Bernstein1967] observed that eventhough the movement

appears to be very similar, there is always a variability in the movement which

also covers alternative movements for the case that some movements are not

feasible. For example, we can walk. Yet, the movement di�ers as we walk on

a rough ground or a slippery groung, as we walk a slope up, down or on a �at

ground.

Our intention for this thesis is to make �rst steps in the exploration of the

variability in movements. Our goal is to explore for throwing darts the relation

between performance and the way of throwing with a focus on variability. As

we see in the related work in Chapter 2 that usually just a single objective

or in other words performance measures is used, we want to include multiple

performance measures in our analyses. The movement of throwing darts as an

object of investigation has several advantages compared to other movements.

We have a precisely de�ned goal with the dartboard. The movement itself

a�ects comparatively few limbs - mainly arm and hand. In addition, there are

few constraints to throwing darts. So, we do not need exceptional strength,

speed, agility, endurance or intelligence. In the exploration of the dart throwing

movement, we follow two research questions.

• What are the di�erences between players that show a high performance

and players that show a low performance?

• What are di�erences between players that improve more in their perfor-

mance than others?

To �nd answers to these questions, we structure this thesis into the following

parts.

• In the next chapter, Chapter 2, we present the basis to investigate the

movement of throwing darts.

• In Chapter 3, we present the dataset that is the basis for our analyses.

• In Chapter 4, we describe our developed method to analyze this dataset.

2

• In Chapter 5, we present the results of applying our method to the

dataset.

• In Chapter 6, we discuss the results, the strengths and weaknesses of our

method, as well as options for future work.

3

2 Background

In this chapter, we want to provide the knowledge that is required to under-

stand our method. We will start by describing the game of darts - what it is

and how it works. Based on this, we will introduce the topic of motor control

which gives us theories to explain how we create throw movements in darts

and which leads us to our expectations for the following chapters. Afterward,

we continue with the topic of quantifying throw performance as well as throw

physiology which will be necessary for analyses.

2.1 Game of Darts

Darts is a popular game and competitive sport. Participants throw repeatedly

from a de�ned distance small missiles - called dart - at a static target. The

target is a disk - the dartboard - which is attached in a de�ned height at a

wall. The dartboard is subdivided into multiple �elds. In the center, there

is a circular �eld called bull's eye. Around it, the dartboard is divided into

several rings as well as it is divided into several sectors of a circle which

forms a multitude of ring segments. Figure 2.1 shows the dartboard that

we used for this thesis. It has a diameter of about 43 cm and is subdivided

into 9 rings and 12 equal sized sectors that form 108 ring segments. There are

di�erent variants of the game with di�erent goals in which participants may

compete. These variants have in common that a certain amount of points is

assigned to each �eld which is gathered for every hit in the respective �eld.

In a variant of the game for more advanced participants, participants may

seek to hit a certain combination of �elds in order to gather points for their

personal scores with the goal to reach a certain sum without exceeding it. On

the other hand, beginners may just try to hit the bull's eye receiving more

points the closer their hit is to the bull's eye. The process of throwing a dart

starts with an acceleration phase in order to reach an appropriate velocity and

5

2 Background

Figure 2.1: The dartboard that we used in this thesis - The red circular �eld in

the center is the bull's eye. Around it, the dartboart is subdivided

by black and white rings, as well as 12 sectors of a circle into �elds

in the shape of ring segments.

movement direction so the dart may �y from the participant to the dartboard

bridging the distance between both. When the dart reaches an appropriate

velocity then the participant releases the darts into �ight. In �ight, the dart

is not in�uenced by the participant anymore. Instead, for example, gravity is

in�uencing the dart which was compensated by the participant before. The

�ight phase ends with the impact of the dart. It hits the dartboard or the

surrounding wall reducing it velocity greatly. On hitting the dartboard, the

dart usually gets stuck. Under the name hit, we refer to the impact position

of a single throw. Though, we name hits of throws that missed the dartboard

also misses. Regarding the throw process of participants, we observe that

participants prepare a throw by reaching an initial state that usually extends

the distance for acceleration. Assuming that they already hold the dart in one

hand, they may move their hands up to the level of the shoulder and backward.

Afterward, they start accelerating this hand. Doing so, the hand moves forward

towards maximum extension. There are many ways to accelerate the dart. To

mention just two, we observed players mainly rotating their forearm around the

elbow joint creating a rotational movement while for others, we saw an ejecting

movement by expanding forearm and upper arm rotating in elbow joint and

shoulder joint. Since there is a maximum extension of the arm, there must be

a phase of deceleration that reduces the velocity of hand and arm. Reaching

6

2.2 Theories of Motor Control

the maximum extension means that in this movement at some point the bones

of upper arm and forearm collide in the elbow joint. To prevent such a hard

break, participants also cancel acceleration before the maximum extension is

reached and decelerate actively to stop the arm. Releasing the darts between

those phases of acceleration and deceleration is a matter of timing. While

the arm is used to accelerate the dart, it is the hand that holds the dart and

releases it into �ight. On releasing too early, the dart may not have reached

the necessary velocity. On releasing too late, the dart may already be to slow

again due to deceleration. The throw of a single dart is a complex process. It

requires that the participant coordinates the hold and release of a dart with

the acceleration and deceleration of the arm in order to hit a distant target.

In the following section, we will take a look at theories that try to explain how

we control those complex motoric processes.


Motor control is the discipline in which scientists study how movements arise in

humans that su�ce certain movement goals (cf. [Edwards2011]). This includes

the question for the role of cognitive processes. In darts, an example movement

goal is to create repeatedly a movement to throw the dart and hit the bull's

eye. With the concept of performance we qualify the ful�llment of movement

goals in this context (cf. [Edwards2011]). Permanently hitting the bull's eye

ful�lls constantly and to a maximum extent the goal which represents the

highest performance. Hits close to the bull's eye ful�ll the goal more - showing

a higher performance - than hits that miss the dartboard. Based on motor

control, the discipline in which scientists study how movements are acquired

and their performance improved is called motor learning (cf. [Edwards2011]).

This discipline encompasses - for example - our research question of what

changes occur in throwing dart together with improvements in performance. So

in the game of darts, learning is connected to the improvement in performance.

For a detailed overview of the development and branches in motor control and

motor learning, we like to recommend [Edwards2011]. Here, we just want to

provide the steppingstones that lead to the expectations for our analyses.

According to [Edwards2011], the older branch in motor control are closed-

system theories or also called cognitive-based theories. Applying these, we

7

2 Background

consider mind and body to be separated and that the mind exclusively con-

trols the body. At the beginning of a movement the mind is fed by sensory

inputs which are then processed by the mind to extract information about body

and environment. Using the available information and following the movement

goal, the mind chooses a certain movement for which it plans a program to

steer the body. Afterward, the program is executed by sending signals to mus-

cles which evoke actions. A movement is the sum of these actions. Applying

this to darts, participants start by observing the environment and the arm in

the initial throwing position. From these sensory inputs, they extract infor-

mation on the vector towards the desired �eld of the dartboard. With this

information, they decide, e.g., for a rotational throw movement that creates a

throw trajectory towards that �eld. The participants start to plan which mus-

cles, joints, and bones are needed to create this motion. After the planning is

done, the signals are sent towards the muscles that extend the forearm regard-

ing the upper arm by contracting and pulling the forearm accelerating. This

accelerates the arm as well as the dart that is held in the hand. On another

signal to the muscles of the hand, the dart is released into �ight. On the last

signal, the muscles of the arm stop accelerating the arm and start decelerating

it instead until the arm rests again. The actions for acceleration, release, and

deceleration form the movement of a throw.

[Bernstein1967] objected the idea of movements to be exclusively controlled

by cognitive processes as he identi�ed two problems that oppose the idea of

cognitive-based theories. His �rst problem - the degree of freedom problem

- describes that from limbs down to the level of muscles, joints, and bones,

down to the level of muscle cells there are too many entities that would re-

quire control for a movement. So a controlling program for any movement

would be too complex to be able to be e�ciently executed. With his second

problem - context-conditioned variability - he questioned the assumption of

cognitive-based theories that the body is independent of external in�uences.

He observed that depending on di�erent external in�uences movements pur-

suing the same movement goal were generated di�erently. He observed that

while slowly lowering a stretched arm, muscles in the back are pulling the arm

down when another person is actively pushing the arm up. Whereas, when

only gravity is a�ecting the arm then muscles in the shoulder are pulling the

arm up.

Bernstein's chain of thoughts opposed an exclusive control of cognitive pro-

cesses and showed a strong interrelation of body and environment in the gen-

8


eration movements. To explain how we generate movements despite the de-

grees of freedom, he introduced the model of temporary groupings of muscles,

joints, and bones - called synergies - to certain action units that reduce the

number of degrees of freedom. Doing so, he opened the alternative branch of

dynamic-system theories in motor control in which we consider body, mind,

and environment as complex, interrelated subsystems in which movements

are emergent patterns from self-organizing structures in body and mind (cf.

[Edwards2011]).

Another of Bernstein's observations was that even though movements that

pursued the same movement goal were very similar, there were always varia-

tions. He named this phenomenon repetition without repetition as it describes

repetition in performance without repetition of the precise movement. Based

on this, [Latash2008] developed the concept of good and bad variance. While

there is a variance in the movements that su�ce the given goal there is another

variance that reduces the performance. An example of good variance may be

advanced darts players who show similar performance according to di�erent

techniques. They may execute a rotational throw and hit the bull's eye, they

may execute an ejecting throw and hit the bull's eye, and they may throw

backward above their shoulders and hit it again. On the other hand, we could

introduce bad variance by starting to shake them during the throw which adds

noise and degrades their performance.

From variance in motor control, we make a step to variety in cybernetics -

"the science that studies the abstract principles of organization in complex

systems" [Heylighen and Joslyn2001]. Here, [Ashby1958] described for sys-

tems with control that pursues a certain state with his law of requisite variety

and his idea of variety destroying variety the relation between the variety of the

state and the variety of in�uences. In his model, we have the system state for

which control pursues to keep it as close as possible to a goal state. There are

perturbations that in�uence the system state to deviate from the goal state.

The controller has certain actions to counter the e�ect of perturbations. Ac-

cording to his law, a controller needs enough variety in their actions to counter

the variety due to perturbations to the system in order to maintain the desired

state. Equation 2.1 represents the relation between the system's in�uences and

the goal variable (cf. [Heylighen and Joslyn2001]). V(S) represents the variety

in the system state. V(P) represents the variety of perturbances. The variety

9

2 Background

of regulating actions is represented by V(A). B is a spontaneous decrease in

variety due to bu�ering.

V (S) ≥ V (P )− V (A)−B (2.1)

Translating it to darts, we have - for example - the goal to hit the bull's eye.

Hitting the bull's eye, the relevant part of the system state is the desired state.

For hits close to the bull's eye the system state is closer to the desired state

than for hits in distance to the bull's eye. Ideally, we have no or a very small

variety in the system state always hitting the bull's eye �eld. Perturbations in

dart can be inside the player, e.g., due to breathing or due to the heartbeat.

They can also be in the environment, e.g., a sudden wind that in�uences the

path of the �ying dart. One action for the participant as a controller to counter

perturbations is, e.g., to wait for the right moment to avoid perturbations due

to heartbeat or breath. A bu�er in this example is the bull's eye itself. Though

it is small it covers a certain area allowing a variety of di�erent positions or

states to su�ce the goal or to be the goal state. If we imagine that we increase

the size of the bull's eye to the size of the dartboard then we see that for a

�xed set of hits more hits are inside the bull's eye achieving the goal state and

improving performance.

In our research questions, we ask for the di�erences between good and bad

participants in darts, and the di�erences between participants that improve

more than others. Now, we combine these with Ashby's law to concretize our

hypotheses. Following Ashby's "variety destroys variety", we expect for good

participants with lower variety in ful�lling the goal to observe a higher variety

in throwing the dart in comparison with participants with a bad performance.

For the development over time, we expect that participants that improve more

in performance also show a bigger increase in variety during their throws com-

pared to participants that improve less. So our approach is to compare variety

in the regulation - represented by the variety in the physiology during the

throw - with the variety in the ful�llment of the goal variable which relates

to the inverse of performance. This leads us to the task of quantifying the

physiology of throws as well as performance.

10

2.3 Approaches to quantify Physiology and Performance

2.3 Approaches to quantify Physiology and

Performance

We can quantify the throw process at di�erent levels. We could measure

the electrical activity for the brain to quantify cognitive processes. For the

subsystem of the muscles that contract to generate the forces that lead to a

movement, we could measure their electrical activity using EMG (electromyo-

gram). The next level could be the kinematics of the throwing arm and hand

- the forces and resulting accelerations, velocities and trajectories. Lastly, we

could also track the kinematics of the thrown dart. Let's take a look at what

others did in this area. [Cheng et al.2015] opposed a group of novice dart play-

ers with a group of expert dart players to �nd di�erences in electrical surface

activity of the brain. They quanti�ed physiology by measuring the electrical

surface activity on the brain using EEG (electroencephalography) and extract-

ing oscillations in a certain frequency interval. [Weber and Doppelmayr2016]

explored the e�ects of imagining throwing on performance and electrical sur-

face activity on the skull using EEG. Works that examine the kinematics of the

throwing arm include [Lee et al.2014]. They explore the di�erences in throw

movement due to alterations in the bone structure of the wrist after an injury.

To collect kinematic data, they simulated the throw process stepwise taking

pictures using CT (computer tomography) and reconstructed the movement

from the collected data. Even though in our related work nobody used kine-

matics to quantify the physiology of a throw, we see some options to do so.

[Preim et al.2009] do not investigate the movements of any kind of throw. But,

they do investigate the dynamics of a �uid in the blood circulation. In their

work, they give an overview of techniques to visually explore perfusion prop-

erties of body tissue. These properties are determined from tracking a �uid

contrast-agent in the blood circulation using, e.g., MRI (magnetic resonance

imaging) or CT. For each voxel of these recordings, they get a time-series on

the measured signal intensity that changes due to the moving contrast agent.

These series share a certain pattern when the contrast-agent crosses. In the

analyses of dart throw kinematics, we have also process patterns that occur in

every throw, e.g., the initial acceleration and the �nal deceleration of a throw.

[Preim et al.2009] �tted a simple process model for the signal intensity due to

the passage of the contrast agent in order to abstract the data and to extract

certain form properties for the process in each voxel. Afterward, they aggre-

gated these properties visually to show the distribution in the tissue. Having

11

2 Background

such properties for dart throws we may aggregate them to quantify variance

across several throws.

Looking at performance measures, each of the following related works set the

movement goal to hit the bull's eye by throwing the dart. [Cheng et al.2015]

and [Weber and Doppelmayr2016] quanti�ed performance as averaged ring

scores over a batch of throws. They assigned decreasing scores from the bull's

eye to the outer rings of the dartboard. Each hit was then awarded the score

of the hit �eld. Closely related to ring scores are displacements of hits in

relation to the bull's eye. While ring scores assign high values to hits close

to the bull's eye and small values to hits in distance, it is the opposite for

displacements. They measure the distance between the bull's eye and hit. So

they are small when a hit was close to the bull's eye and big when far away.

[Seidler et al.2017] investigated the e�ects of electrical surface stimulation on

the brain to the adaption to changed visual conditions. They quanti�ed perfor-

mance as the mean of the horizontal displacement in cm. [van Beers et al.2013]

used a di�erent approach while investigating the learning rate in a darts ex-

periment. They quantify performance using the variance in the displacement

of hits and motivate this with the fact that the variance of hits must be small

to reliably hit the bull's eye.

In the related work, each time just a single performance measure was used. We

recognize here the problem of using just a single performance measure because

for each one there are cases that seem equal according to the measure but that

are di�erent in ful�lling the goal on hitting the bull's eye. Regarding the use

of variance as only performance measures, though a small variance in hits is

needed to reliably hit the bull's eye, it is easy to see that this condition is not

su�cient. Let's imagine a participant with the smallest possible variance in

hits always hitting the same position. If this participant always hits the bull's

eye then this shows the highest possible performance. Yet, when the hits are

always at the lower rim of the dartboard then it fails the movement goal of

hitting the bull's eye for which we expect a low performance. Nevertheless,

the variance for both cases is equal. On the other hand, also regarding the

displacement measures, we want to show an example pair of equal performance

according to the displacement measure with di�erent degrees in ful�lling the

goal of hitting the bull's eye. For the �rst participant, we imagine a throw

behavior of hitting repeatedly just the same ring on the dart board. Based on

the mean displacement of this participant, we imagine a second one that shares

the same mean displacement but whose hits are scattered over the dartboard.

12

2.4 Approach to handle multiple Performance Measures

We can now argue that the �rst one ful�lls the goal less than the second because

the �rst one is unable to hit the bull's eye while for the second one some hits

may land inside.

We see that with just one of those measures we cannot completely quantify

performance on hitting the bull's eye. In the following thesis, we do not want

to search for a better measure for quantifying performance that overcomes

this problem. Yet, we want to use for our analyses multiple complementary

measures.

2.4 Approach to handle multiple Performance

Measures

In the discipline of multi-objective optimization optimization problems with

multiple performance measures are handled. Objective is the term for a

performance measure in this discipline. Having multiple objectives, we may

evaluate two elements for each of them. If all of those objective support that

an element x is better than an element y then it is simple to conclude that x

is better according to all objectives. However, if one objective supports that x

is better than y and another objective supports the opposite - that x is worse

than y - then we have a con�icting situation and no clear concept of better

and worse. To be still able to apply the concept of better and worse, [Deb2011]

propose to extract for a set of elements the non-dominated front which is a

subset that we can consider better than the rest of the set. Extracting non-

dominated fronts iteratively gives us a partition of the elements with multiple

performance measures into disjunct subsets for which it de�nes that certain

subsets are better than others.

The subsets are determined by iteratively computing the subset of non-

dominated elements - also called non-dominated front - and removing it from

the set for the next iteration. The non-dominated front is the set of elements

for which we cannot state that they are clearly worse than any other element.

Having multiple measures, we can de�ne a domination relation for each pair

of two elements based on the domination relations for single measures. For

a single measure, an element x dominates the other one y when we consider

x better than y (cf. Equation 2.2). If we think of scores, we might consider

13

2 Background

higher scores better than smaller ones. If we think of error measures, we usu-

ally consider smaller values to be better. For multiple measures, an element

x dominates an element y when x is not dominated in any measure by y and

x dominates y in at least one measure (cf. Equation 2.3). This relation al-

lows that an element x either dominates an element y, that x is dominated

by y, or that both do not dominate each other. Searching the elements that

are not dominated by any other element gives us the non-dominated front (cf.

Equation 2.4). The elements of the non-dominated front do not dominate each

other but they dominate all the remaining elements. For this reason, we con-

sider the elements of the this front to be better. By iteratively determining the

non-dominated front and removing it from the set, we partition the original set

into a series of fronts. For non-dominated fronts, all the elements inside a front

do not dominate each other. Though, all elements of a front are dominated by

elements of fronts that appear earlier in this series - except for the �rst front

of the original set. For this reason, we can consider fronts that appear earlier

in this series of fronts to be better than fronts that appear later. This gives us

an order based on multiple performance measures.

x dominates y regarding f ⇐⇒x≺

fy ⇐⇒ x is better than y regarding f

with f as objective

(2.2)

x dominates y ⇐⇒x≺ y ⇐⇒ (x dominates y regarding at least one objective)

and (x is not dominated by y regarding any objective)

⇐⇒ ( ∃f∈F

: x≺fy) ∧ (¬ ∃

f∈F: y≺

fx)

with F as set of objectives

(2.3)

non-dominated front =subset of X consiting of elements that

are not dominated by any other element of X

={x|(x ∈ X) ∧ (¬ ∃y∈X

: y≺x)}(2.4)

14

2.4 Approach to handle multiple Performance Measures

In order to visualize fronts based on 2 measures, we can exploit their inherent

order. Since the elements of a front do not dominate each other, we know that

for every two elements x and y the element x dominates y in one measure and

y dominates x in another measure. Since there are just two measures, we can

sort all elements in improving order according to one measure to a series that is

sorted at the same time in degrading order according to the second measures.

In Figure 5.3, Sub�gure (2) we depicted two performance measures along the

horizontal as well as the vertical axis. For each front, we used their inherent

order to create a polyline which connects all elements. In the sub�gure, the

best fronts are located near the bottom and left part of the area. The worst

fronts are in the top right.

Next, we take a look on how others applied non-dominated sorting for ana-

lyzing data using multiple performance measures. [Vrugt et al.2007] developed

and analyzed a simulation on bird migration between Europe and Africa. They

used �ight time and energy use as performance criteria for di�erent migration

routes. They analyzed the relation between the �rst non-dominated front on

these measures and the migration routes. They found that the tradeo� between

both measures leads to two main branches in migration routes - each one pri-

oritizing one measure. [Levin2014] explored the application of non-dominated

sorting in the analysis of startup performance. They used growth and prof-

itability as performance measures. By extracting the �rst non-dominated front,

they were able to retrieve the relation between the measures that were used to

generate their data. Both works focus on analyzing the relations in the �rst

non-dominated front. Doing so, they ignore big parts of their datasets that

are not optimal or even show bad performance.

[Hulikanthe Math2017] explored empirical data on startup performance using

survival time and the number of employees as performance measures. With

an analysis called intrafront-analysis, they opposed the ends of a given front

from non-dominated sorting in order to identify di�erences between the ends

or respectively tendencies along the front regarding other startup properties.

They applied this to chosen good and bad fronts. In addition, they opposed

a good and a bad front regarding their startup properties which they called

interfront-analysis. [Hulikanthe Math2017] did not focus just on the �rst non-

dominated front. Instead, using their intrafront- and interfront analysis, they

also explored the worse parts of the data using multiple performance measures.

15

2 Background

We see the possibility to use non-dominated sorting for our analysis using

multiple indented performance measures on the performance in throwing darts.

It will give us a concept of good and bad according to multiple measures.

Based on this concept, we can oppose well and badly performing participants

to �nd di�erences in their throw physiology. This will give us an answer to our

research question that will be di�erent from using just a single performance

measure.

2.5 Goals and Expectations

In this thesis, our goal is to explore the relation between throw performance

an throw physiology in darts. For this, we want to quantify the physiology of

the throw movement. In addition, we want to quantify the performance of the

throw movement. Here, we want to use multiple performance measures and

analyze them in a combined way using the fronts from non-dominated sorting.

Due to the connection to Ashby's law of requisite variety, we expect better

performing participants to show more variety in their physiology than worse

performing participants. For the development over time, we expect partici-

pants that show a stronger improvement in performance to also show a stronger

increase in variety in their physiology compared to players that improve less.

Lastly, we expect to get additional �ndings by applying multiple performance

measures and using them in a combined way for the analyses.

In the next chapter, we present our method that we developed to ful�ll these

goals and to test our expectations.

16

3 Data Basis

In this chapter, we describe the dataset that is the basis for our analyses.

Based on our research questions and our expectations, our goal is to explore

the relation between performance and physiology of dart throwing. We are

grateful as we were provided with the dataset for a dart experiment which

consists of data related to the outcome of throws and measures related to the

physiology of the respective throws. First, we describe the experiment and the

collected data. Second, we describe how the data is related to performance.

Lastly, we describe why we focus in this thesis on the data for accelerating

forces.

3.1 Dart Experiment

We thank Prof. Ross Cunnington, Tamara Powell, et al. from the University

of Queensland and Queensland Brain Institute for providing us the datasets

for the dart experiment that they conducted. For this experiment, they let

participants throw darts subsequently 500 times with the task to hit the bull's

eye.

They conducted this experiment at the University of Queensland and Queens-

land Brain Institute approved by the University of Queensland Medical Re-

search Ethics Committee (UQ Project No. 2011001394). They conducted the

experiment in two phases with Phase 1 in 2015 and Phase 2 in winter 2017

/ 2018.

In a well-lit room they marked a line on the ground with a distance of 267cm

to the wall with the dartboard marking the throw area for participants. In

Figure 2.1, we see the dartboard that they used in this experiment. In the

center, there is the red circular bull's eye. Around it, the dartboard is divided

into 9 Rings and 12 sectors of a circle which creates 108 ring segment-shaped

17

3 Data Basis

�elds. The dartboard spans in total a diameter of 43cm. It was attached to

the wall with a height between �oor and bull's eye of 155cm in Phase 1 and 143

cm in Phase 2. As for darts, in Phase 1 two sets of 5-6 darts were provided.

One set with a dart weight of 22g each as well as another with a dart weight

of 26g each. In Phase 2 only a set of 5 darts with 22g was supplied.

They instructed each participant to stand behind the line marked on the

ground, to try to hit the bull's eye, and to throw the set of 5-6 darts be-

fore retrieving the darts. In Phase 2, they gave participants additional in-

formation on how to throw according to the recommendations from http:

//dartbrokers.com/dart-technique.

In each recording session, there was just one participant. They conducted a

session in one piece. In the beginning, they equipped participants with an

inertial and EMG measurement device at the forearm as well as a mobile scalp

EEG. The participants threw their set of 5-6 darts with own pace. In total

participants threw 500 times separated into 5 sequences of 100 throws each.

We call these sequences of 100 throws blocks. So block 1 refers to the �rst 100

throws while block 5 refers to the �fth or last 100 throws. After each block,

there was a brief rest period for the participants.

During the experiment sessions, Cunnington et al. recorded the participant's

physiological properties using the equipped sensors as well as they observed

and recorded the hits on or o� the dartboard.

They recorded for each throw the approximate position of the hit by recording

the hit dartboard �eld using two measures score and clock time. The dart-

board assigns to a hit in the bull's eye a score of 10. For each of the subsequent

9 rings, it assigns a score reduced by one. So it rewards the outermost ring

with a score of 1. For missing the dartboard they assigned a score of 0. For

clock time, we have 12 evenly spaced sectors going clockwise - in reference to

a clock - having sector 12 starting on the top going clockwise until sector 1

starts. For each sector following clockwise the sector number increases by one.

Hits in the bull's eye �eld do not have a clock time assigned. This clock time

measure relates to the direction of the hit relative to the bull's eye. Except

for the �rst 7 participants of Phase 2, they also recorded the clock time when

participants missed the dartboard.

Previous to the throw session, they made the participants take part in a ques-

tionnaire to asses their familiarity with dart throwing asking them to esti-

mate their personal frequency of dart games per year as well as the time that

18

http://dartbrokers.com/dart-technique

http://dartbrokers.com/dart-technique

3.1 Dart Experiment

has passed since the last game. Immediately before the throw session, they

Figure 3.1: The inertial measurement unit attached to the wrist - Cunnington

et al. attached the Shimmer 3 device by Realtime technologies ltd

to the wrist of the trowing arm to record throw kinematics and

muscle activity.

equipped participants with the inertial measurement unit (IMU) Shimmer 3 by

Shimmer at their right wrist for measuring characteristics of the throw move-

ment (cf. Figure 3.1). The wrist is close to the �ngers which hold the dart. For

this reason, we expect that the measured in�uences are similar to the actual

in�uences that a�ect the darts during the throw process. The IMU recorded

during the experiment time series with a sampling rate of 512Hz for accelerat-

ing forces according to three axes relative to this device using an accelerometer,

as well as rotation velocity in three axes relative to the device using a gyro-

scope and orientation using a magnetometer. As the measurements took place

inside a building, we considered the magnetometer measurements not reliable

from the beginning. The accelerometer settings were modi�ed from a today

unknown setting in Phase 1 to a setting of +/- 16g in Phase 2. In Addition,

this device was extended by two bipolar electrodes for measuring electrical sur-

face activity that we a�xed on the arm above the extensor digitorum muscle.

19

3 Data Basis

This muscle takes part in releasing the darts. So, Cunnington et al. expected

high electrical activity to be related to the moment of releasing the dart during

the throw. As paid volunteers and having given written informed consent, 23

participants took part in Phase 1 as well as 23 other participants in Phase 2.

There were 26 females and 20 males between 18 and 40 years with a mean age

of 23 years. They had normal or corrected vision and no neurological or psy-

chiatric disorders in the present nor in the past. All of them were right-handed

and most of them were beginners in the game of darts.

As the �rst step of postprocessing, Cunnington et al. identi�ed markers for

individual throws in the recorded session data for further processing. Unfor-

tunately, the EMG data was too noisy to identify single throws. In gyroscope

data, they observed peaks in rotational velocity which correlate with throws.

During a throw, the throwing arm moves faster than in the rest phases which

also includes faster rotations. So, they scanned through the gyroscope data

and identi�ed those peaks and recorded for every throw the corresponding

peak time as a marker that lays inside the timeframe of a throw. Further-

more, they discarded the datasets for participants with artifacts in EEG data

or failed synchronization between EEG and IMU.

After all, we were provided with complete datasets consisting of hit data,

kinematic data, and data regarding the electrical activity of the brain for 17

participants in Phase 1 and for 15 participants in Phase 2. Based on this data,

we will quantify performance as well as the physiology of throws next.

3.2 Towards Performance Measures

In the experiment, for every throw, the approximate hit position was recorded

using the measures score and clock time that describe the hit �eld. We take

an initial look at the collected data. Figure 3.2 is a heatmap that shows the

distribution of the hits on the �elds of the dartboard and its surrounding sectors

for the 100 throws in the �rst block of participant J14. The top and bottom of

the diagram represent the top and bottom of the dartboard. Respectively, the

left and right represent the left and right of the dartboard. We see that there

are hits scattered all over the dartboard. Yet, the focus of hits lays in the lower

half of the diagram mostly missing the bottom of the dartboard to the 4 lower

sectors. Most hits are far from the bull's eye thus showing a bad performance.

20

3.2 Towards Performance Measures

In Figure 3.3, we see for a di�erent participant that the distribution of hits is

less scattered and less displaced from the bull's eye. This participant seems to

perform better. The distribution for the next participant in Figure 3.4 is also

more concentrated than the distribution for the �rst participant. On the other

hand, the focus of the distribution is not centered around the bull's eye. It is

displaced downwards. So, this distribution shows a performance worse than

the second participant. Lastly, in Figure 3.5, we see a distribution that is also

scattered all over the dartboard but without a strong tendency for a general

displacement in a certain direction. Here, the performance also is worse than

the performance of the second participant. To sum it up, we reach a higher

performance when our hits are generally closer to the bull's eye and when our

hits are less scattered.

20 0 20

20

10

0

10

20

J14 B1

0246810121416

hits

in %

Figure 3.2: Distribution of hits with high scatter and high displacement -

We depicted the distribution of hits on the �elds of the dartboard

and the surrounding sectors as a heatmap. This heatmap shows

distribution of the 100 hits from the �rst block of participant J14.

21

3 Data Basis

20 0 20

20

10

0

10

20

J02 B1

0246810121416

hits

in %

Figure 3.3: Distribution of hits with low scatter and low displacement

20 0 20

20

10

0

10

20

J13 B3

0246810121416

hits

in %

Figure 3.4: Distribution of hits with low scatter and high displacement

22

3.3 Focus on accelerating Force as recorded physiological Measure

20 0 20

20

10

0

10

20

J01 B4

0246810121416

hits

in %

Figure 3.5: Distribution of hits with high scatter and low displacement

3.3 Focus on accelerating Force as recorded

physiological Measure

In the experiment, Cunnington et al. recorded the way of throwing darts re-

garding the executed movements and the activity of the brain. For the throw

execution, they recorded time series about the kinematic in�uences close to

the thrown dart measuring the accelerating force and rotational velocity. Re-

garding the activity of the brain, they recorded time series on the activity

of di�erent brain areas using EEG. In the conceptual chain from brain ac-

tivity, over muscle activity, over emerging movements, over �ight trajectories,

towards the ful�llment of the goal, we see more steps from brain activity to

performance than from emerging movements towards performance. To begin

analyzing the provided dataset, we decided to analyze the shorter connection

from emerging movements to the ful�llment of the movement goal. For this

reason, we ignore the EEG data in the following parts of the thesis. Instead,

we focus on the kinematic data that describes the movements which emerged.

For the kinematic data, we have the 3-dimensional time-series from the gyro-

scope, the accelerometer, and the magnetometer of the inertial measurement

unit. We already dropped the measurements of the magnetometer for being

23

3 Data Basis

not reliable due to being recorded indoors. For the gyroscope - which measures

the rotational velocity relative to the device - and the accelerometer - which

measures the accelerating forces also relative to the device -, we decided to fo-

cus further on the measurements of the accelerometer. Our reason is, that we

see the measurements for accelerating forces to be more relevant towards the

throw of darts because the rotational velocity depends on the technique used

for throwing. In the case of a rotational throw - when the throw is executed

mainly by a rotation in the elbow joint -, the rotational velocity correlates to

a high degree to the velocity of the sensor and respectively to the velocity of

the thrown dart. Yet, using a di�erent technique like an ejecting throw - when

the rotation takes place in the elbow joint and the shoulder joint -, weakens

this correlation. Since we do not know which techniques the participants used,

we decided to focus on the acceleration forces that a�ect the sensor instead.

The accelerating forces of the sensor are similar to the accelerating forces that

a�ect the dart while it is held. In Figure 3.6, we show the time series of two

1573 1574 1575 1576 1577 1578 1579t in s

15

10

5

0

5

10

15

20

Figure 3.6: Time series for the accelerating force of two subsequent throws -

We depicted the time series for the recorded strength of the accel-

erating force according of the three axes of the sensor in red, green,

and blue. In addition, we added the time series for the norm of the

accelerating force in black.

subsequent throws for the 3 axes of the sensor in red, green, and blue. First,

we see for both throws after the time 1573 and after the time 1578 distinct

amplitudes. Second, also between both throws, the amplitudes do not collec-

tively get close to 0 though it is a phase of slow motions and small accelerations

24


compared to throws - so-called a rest phase. Between throws, participants

usually rest their arm and hand or take the next dart. A second accelerating

force that the accelerometer recorded compound with accelerating forces to

accelerate the dart is gravity. Gravity is a static accelerating force that is di-

rected downwards. For each time of the recorded series of accelerating forces,

we have three components according to the three measuring axes of the sensor.

Together, these components represent a force vector relative to the sensor. We

can compute the length or strength of this force vector using the Euclidean

norm (cf. Equation 3.1).

‖~x‖ =√x1

2 + x22 + x3

2

with ‖‖ as Euclidean norm

with ~x as vector

with x1, x2, and x3 as vector components

(3.1)

We aggregated using the norm the three channels for every measurement time

to create a time series for the norm. We depict this series as the black series in

Figure 3.6. The black series shows us for the rest phase between both throws

a certain baseline that is shifted to the value of 2.5. The almost constant

strength of the accelerating forces in the rest phase matches to the constant

strength of gravity with only small accelerations due to movements. Though

the strength of the accelerating forces is almost constant in the rest phase,

we also see that the amplitudes for the separate channels change between

both throws. For example, the value for the green series raises, drops, and

raises again almost in parallel to the blue series. The reason for this is our

sensor and its measuring axes that rotate during movements while the gravity

remains pointing downward. This also a�ects the accelerating forces that we

are interested in to quantify the physiology of the throw movement. So due to

this rotation, a constant accelerating force may initially appear in one channel,

may be spread over time to several channels, and may, in the end, remain in a

di�erent channel. The norm is independent of the rotations of the force vector

because rotations do not change the length of the vector. For this reason, we

decided to focus further on the norm of the accelerating forces to quantify the

physiological properties of throws.

Yet, another observation that we made is that the scales for the accelerating

forces for the records of Phase 1 and Phase 2 data do not match. In rest

phases, the accelerating forces due to movement should be small while the

25

3 Data Basis

main accelerating force is gravity with a constant strength. For this reason,

we expect the average value and the scatter of values in rest phases to be

similar. In the time series of the norm of accelerating forces for all �ve blocks

of throws for 9 participants from Phase 1 and 9 participants from Phase 2,

we selected manually each time three rest phases. For these 54 rest phases,

we determined the mean and standard deviation to characterize the position

and the scatter of the values. In Figure 3.7, we confront the mean along the

horizontal axis and the standard deviation along the vertical axis. Every point

represents the properties of a single rest phase. The blue points for the rest

phases of Phase 1 form a small cluster in the lower right with small values

for mean and standard deviation. The orange points for the rest phases of

Phase 2 are clearly separated from this cluster having higher mean values.

We consider the rest phases to be corresponding reference values of the scales

that were used to measure the accelerating forces. In the �gure, we see that

these references values do not correspond well between Phase 1 and Phase 2

which implies the application of di�erent scales. Thus, we cannot compare

the amplitudes between both phases because any value has di�erent meanings

according to the di�erent scales, e.g., a participant with the height value 190

is big according to the centimeter scale but small according to the millimeter

scale. As for this reason the amplitudes are not comparable and as the scales

in Phase 2 vary more, we decided to drop the Phase 2 data and focus on the

data for the 17 participants from Phase 1.

In consequence of our decisions, we focus on the norm of the accelerating forces

for the participants in Phase 1 in order to quantify the physiology of throws.

26


Figure 3.7: Properties of rest phases - We manually selected 3 rest phases in

the time series for the norm of accelerating forces for each of 9

participants from Phase 1 and 9 participants from Phase 2. For

the selected rest phases, we determined the mean and the variance.

We confront these properties with the mean along the horizontal

axis and the standard deviation (std) along the vertical axis.

27

4 Proposed Method

In this chapter, we want to describe our developed method.

Based on our research questions and our expectations, we want to explore

the relation between performance in dart and the way of throwing the dart

which we call the physiology of throws. The main goal of our method is

to provide information to explore the relation between performance and throw

physiology. To do so, we quantify performance and throw physiology, as well as

the development over time in those measures. Finally, we analyze in Sections

5.2 and 5.3 the developed visualizations that oppose both sides of throwing

dart, which we used to characterize the relation.

4.1 Performance Measures

In the provided experiment, the outcome of every throw was recorded using

score and clock time that represent the approximate position for the hit on the

dartboard. Now, we want to use these measures to quantify the throw perfor-

mance of participants. In Section 2.3, we decided to use multiple performance

measures to quantify the performance of throwing dart which is di�erent to

the corresponding related work. So, we observed in Section 3.2 that we con-

sider a distribution of hits better regarding the goal of hitting the bull's eye

when the distribution is less scattered and generally shows a smaller deviation

from the bull's eye. In order to capture both - the scatter of hits and a general

displacement -, we de�ne two performance measures. We de�ne both measures

as error measures, i.e, the smaller the error the better the performance. So, an

error of 0 represents the maximum possible performance according to the re-

spective measure. First, we name the concept of performance from the general

proximity of hits towards the bull's eye accuracy. The participant in Figure

3.2 showed a smaller accuracy than the participant in Figure 3.5 because the

hits of the �rst one generally deviated from the bull's eye downwards while

29

4 Proposed Method

the hits of the second one are generally centered at the bull's eye. Based on

accuracy, we de�ne the accuracy-error as the distance between the center of

the bull's eye and the center of the hits (cf. Equation 4.1). As we will see in

an instant, we can remove the position of the center of the bull's eye from the

equation.

accuracy-error =‖~xcenter of bull's eye − ~xcenter of hits‖

=‖~0− 1

|H|∑~x∈H

~x‖

=1

|H|‖∑~x∈H

~x‖

with ~x as position vector for a hit

with H as sequence of hits

(4.1)

For the second performance measure, we name the concept of performance

from smaller scatter of hits precision. The participant in Figure 3.3 has

a higher precision than the participant in Figure 3.5 because the �rst one

shows a distribution of hits that is more concentrated than the distribution of

the second one. Based on precision, we de�ne precision-error as the mean

distance between two hits (cf. Equation 4.2).

precision-error =1

|P |∑

{~x1,~x2}∈P

‖~x1 − ~x2‖

with ~x1 and ~x2 as position vectors for hits

with P as sequence of all pairs of hits

(4.2)

For the de�nition of accuracy-error and precision-error, we assumed that we

already have the exact positions of the hits. Yet, in our experiment, we col-

lected just an approximate position consisting of score and clock time that

describe the hit �eld. Thus, we estimate exact hit positions from the dimen-

sions of the dartboard and its �elds relative to the center of the bull's eye with

the following strategy. For hits in the bull's eye �eld, we mapped the position

to the center of the bull's eye with coordinates ( 00 ). For hits in ring segment-

shaped �elds, we mapped the position to the intersection of the inner arc and

the angle bisector of the respective ring segment. For misses, we mapped the

position to the intersection of the outer rim of the dartboard and the angle bi-

sector of the respective sector. Since all estimated positions are relative to the

center of the bull's eye at ( 00 ) = ~0, we can remove the center of the bull's eye

30

4.2 Describing the Throw Pattern by Throw Features

from Equation 4.1 for the computation of accuracy-error. In addition, as we

use the original dimensions in cm, the values for both performance measures

also have centimeters as units.

In Table 4.1, we show the values for accuracy-error and precision-error for the

examples from Section 3.2. As before, the second distribution is better than

the �rst distribution according to both measures representing less scatter and

general less deviation from the bull's eye. The third distribution is better

than the �rst according to precision-error being less scattered but worse than

the second distribution according to accuracy-error having generally a bigger

displacement from the bull's eye. The fourth distribution of hits is worse than

the second and the third according to precision-error being more scattered.

Yet, it is better than the �rst distribution according to accuracy-error. The

fourth distribution shows the smallest accuracy-error being better than both

distributions with low precision-error.

Table 4.1: Performace measure examples - We computed for the examples

from Section 3.2 the values for accuracy-error and precision-error

rounded to one position after decimal point.

�gure accuracy-error precision-error

1st 3.2 8.5 14.4

2nd 3.3 1.2 5.1

3rd 3.4 2.3 7.2

4th 3.5 0.6 11.2

In order to quantify the performance for a block of throws, we estimate the

exact positions of hits which we use to compute our performance measures

accuracy-error and precision-error.

4.2 Describing the Throw Pattern by Throw

Features

In Section 3.3, we decided to focus on the norm of accelerating forces as a

measure to characterize the physiology of throws. In this section, we describe

how we extract speci�ed features from single throws using a model regarding

31

4 Proposed Method

the physiological processes and how we aggregate those features over many

throws to quantify the physiology of throwing.

To prepare the data, we determine the norm of accelerating forces and extract

the time series for every throw. For every throw, we use the peak time as a

reference point in time and extract an interval starting 1500 milliseconds before

the peak time and ending 500 milliseconds after it. In Figure 4.1, we see the

extracted series for all throws of participant J01. We recognize a characteristic

pattern consisting of to subsequent peaks for the processes during throwing.

As we can see a clear pattern for the curves of the time series, we follow the

Figure 4.1: Extracted time series of the accelerating force for all throws of

participant J01

idea of [Preim et al.2009] to simplify the time series to few form features that

describe a single throw. To do so, we develop a model which we match to the

curve of a throw to extract characteristic throw features.

These features base on a conceptual model of the throw process. In Section 2.1,

we already described the process of throwing a dart consisting of a preparation

phase to move the dart to an initial throw position, an acceleration phase to

accelerate the dart to a certain velocity, and a deceleration phase to stop

the throwing arm. The movements in the preparation phase are slow and

for this accompanied by weak acceleration. In the acceleration phase, the

velocity of the arm increases a lot in a short time. So here, the acceleration is

strong. Lastly, in the deceleration phase, the velocity of the arm decreases in

a short time a lot. So in this phase, there is a strong deceleration. Hence, the

acceleration phase and the deceleration phase with their strong acceleration

correspond to the �rst and the second peak that we see in Figure 4.1.

The model that we �t to the curve is piecewise linear, i.e., it consists of linear

and constant functions that apply in a certain interval. We can see in Figure

4.2, that we approximate the baseline in the rest phases before and after throws

using a horizontal line. Using a raising and a falling line segment, we model

32


each peak for acceleration and deceleration phase. Using this model, we

t1 t2 t3 t4 t5

1

2

3

4

5

Figure 4.2: Time related form features in the piecewise linear model

v1

v2

v3

v4

1

2

3

4

5

Figure 4.3: Amplitude related form features in the piecewise linear model

gather 5 points in time at the connection of the line segments to characterize

the model which we show in Figure 4.2. We describe with t1 the begin of the

acceleration phase, with t2 the time of the peak acceleration, with t3 the time

of the transition towards the deceleration phase, with t4 the time of the peak

deceleration, and lastly with t5 the end of the deceleration phase. On the other

hand, we have 4 amplitude related features at the connection of line segments.

With v1, we describe the amplitude of the baseline, with v2 the amplitude

of the peak acceleration, with v3 the amplitude at the transition towards

the deceleration phase, and with v4 the amplitude of the peak deceleration.

There is no need for a v5 since at the end of the deceleration phase we reach

the baseline whose amplitude is already described by v1. To limit the valid

33

4 Proposed Method

con�gurations of those form features, we de�ned a set of constraints including

that the time-related form features keep their order (cf. Equation 4.3) and

that the amplitude-related form features maintain both peaks (cf. Equation

4.4).

t1 < t2 < t3 < t4 < t5 (4.3)

v1 <v2

v2 >v3

v3 <v4

v4 >v1

(4.4)

In order to match our model as good as possible to the time series of a

throw, we construct �rst a good initial solution for the model and optimize

all form features afterward using the Levenberg�Marquardt algorithm (see

[Ranganathan2004]) that depends on a good initial solution as it �nds local

optima. To quantify the distance between data and model, we de�ne our resid-

ual as the sum of squared di�erences between data and model over the whole

time series for the throw. In Figure 4.4, we show the time series for the norm

of the accelerating forces in black. To �nd a good initial solution, we start with

a static prede�ned model that we scale to match the maximum amplitude in

the data. Afterward, we search with a brute force approach the best temporal

o�set. These three intermediate models are depicted in the �gure with grey

dotted lines. Finally, we apply the Levenberg-Marquardt algorithm to adjust

all form features to minimize the residual value. The �nal model, we depict as

red polyline in the �gure. We see that it could capture the baseline as well as

the peaks though there is some noise that our model cannot represent.

At this point, we have a set of form features that describe the shape of the time

series for a single throw. Yet, the time-related features are still relative to the

peak time which may vary relative to the process in the throw. The amplitude-

related features still contain the in�uence of the gravity which is not part of

the accelerations due to the throw movement. To avoid those in�uences, we

de�ne throw features which base on the di�erences between form features.

From all combinations of the 5 time-related features, we de�ne 10 di�erences

as time-related throw features. With the notation d_t1_t2, we refer to the

34


Figure 4.4: Fit for the 42nd throw of participant J01 - We depicted the time-

series as the black line, the representation of the �nal �tted model

as the red line, and the intermediate results for �nding a good

initial model for �tting as grey dotted lines.

35

4 Proposed Method

di�erence of t1 to t2 - in other words, d_t1_t2 = t2−t1. In Figure 4.5, we see arepresentation of d_t1_t2. So, it represents the duration from the begin of the

acceleration phase to the time of the peak acceleration. From all combinations

of the 4 amplitude-related form features, we de�ne 6 di�erences as amplitude-

related throw features. Respectively, with the notation d_v1_v4, we refer to

the di�erence from v1 to v4, i.e., d_v1_v4 = v4 − v1. This throw feature

represents the amplitude of the peak deceleration without the height of the

baseline as we can see in Figure 4.6. Lastly, we de�ne as 4 additional throw

features the 4 slopes for the acceleration and deceleration phase. With the

notation s12, we refer to the slope between the �rst and the second connection

(cf. Figure 4.7), i.e., s12 = d_v1_v2/d_t1_t2.

t1 t2

1

2

d_t1_t2

Figure 4.5: d_t1_t2 as example for a throw feature from a time di�erence

v1

v4

1

4d_v1_v4

Figure 4.6: d_v1_v4 as example for a throw feature from a amplitude di�er-

ence

36

4.3 Quantifying Development

1

2

s12

Figure 4.7: s12 as example for a throw feature from a slope

Having those 20 throw features, we can characterize throws individually. Yet,

we have no measure for the variance of the movement in general. To character-

ize general properties of the throwing of a participant, we determine properties

of the distribution for each throw feature over a block of throws. We character-

ize the scatter or variance of each feature using the IQR (interquartile range).

For the average value of each feature, we determine the median. We chose

both for the reason that they are more robust against outliers that may occur

in the �tting.

In order to quantify the physiological properties of throwing, we abstract the

time series data of accelerating forces by �tting a model from which we extract

throw parameters that characterize each throw individually. Afterward, we

determine for each throw feature the distribution properties median and IQR

during a block of throws to describe the general value and the scatter.

4.3 Quantifying Development

We have now performance measures that describe in the time frame of a

certain block the status of performance. In addition, we have distribution

properties of throw features that describe also in the time frame of a certain

block the way of throwing dart. As we are also interested in the changes

over time, we quantify the development as follows. We compute the di�er-

ences for performance measures as well as the distribution properties. We

set the �rst block as reference value computing the di�erences in all these

37

4 Proposed Method

measures from the value regarding the �rst block to the value of later blocks,

e.g., developmentaccuracy_error = block2accuracy_error − block1accuracy_error. As

we have 5 blocks in the experiment we will have for one measure 4 di�erences

- to the second, third, fourth, and �fth block.

38

5 Results

In the following sections, we describe in detail the analyses that we make

to answer the research questions as well as the corresponding observations.

First, in Section 5.1, we have a closer look at the performance measures that

occurred during our experiment and which are the basis for the confrontation

in the following analyses. In Section 5.2, we look for di�erences in our throw

features based on confronting good and bad groups based on the performance

measures. In Section 5.3, we look for di�erences in the development of our

throw features.

In order to explore the relation between performance measures and phys-

iological measures of throws, we follow the idea as seen in the work of

[Günnemann et al.2010]. They select groups of similar elements according to

one space and visually explore their distribution in a di�erent space with the

help of scatter plots. Based on these visualizations, they identify manually

patterns of interest. We select a good and a bad group according to the per-

formance measures and explore their relation according to the distribution

properties of throw features.

Regarding our �rst expectation - that participants with a better performance

have a higher variance of throwing -, we conduct three analyses. We select both

groups �rst based on accuracy-error (see Section 5.2.1) and second based on

precision-error (see Section 5.2.2). Third, we select them according to the non-

dominated fronts on both measures in the manner of an interfront analysis (see

Section 5.2.3). We do not conduct an intrafront analyses because two groups

based on the opposite ends of front have no notion of good and bad, or better

and worse. Regarding our second expectation - that participants that improve

more also show a stronger increase in the variance of throwing -, we conduct

another three analyses. First and second, we confront groups based on the

development in either accuracy-error (see Section 5.3.1) or precision-error (see

Section 5.3.2). As third and last analyses, we select the groups based on the

39

5 Results

non-dominated fronts on the development in both performance measures as

interfront analysis (see Section 5.3.3).

5.1 Performance measures from Experiment

Data

In the following two subsections, we want to get an impression on the oc-

curring values for performance measures that we receive from applying our

performance measures to the experimental data. In addition, we want to get

an idea of how the characteristics of these values will in�uence the following

analyses for answering our research questions. In the �rst subsection, we will

have a look at the actual values for accuracy-error and precision-error which

we introduced in Section 4.1. In the second subsection, we will examine the

development in those measures.

5.1.1 Regarding Block Performance

We aggregate the performance data from our experiment for single hits - score

and clock time - for each of the �ve blocks for each participant computing the

values for precision-error and accuracy-error. These represent respectively the

mean distance between two hits on the dartboard in cm and the distance of

the center of the hits to the dartboard center in cm. Since both measures are

error measures and it is better to make smaller errors, we consider smaller

values better than higher values. Or in other words, we consider low values

- relating to the range of occurring values - as good and the ones with high

values as bad. The resulting values for the performance measures are shown

in the sub�gures of Figure 5.1.

Sub�gure (0) shows us a histogram regarding accuracy-error where we get an

impression on the distribution of occurring values. We see the range starting

from the left close to zero ending on the right close to 10. We see this distri-

bution to be skewed towards the left or the better values with the majority of

data points having small values for accuracy-error. Regarding precision-error,

we see the corresponding histogram in Sub�gure (1). The occurring values

for precision-error range from 5 up to 20. In a di�erent way from accuracy-

error, this distribution is slightly skewed towards the right, i.e., towards higher

40

5.1 Performance measures from Experiment Data

5 10accuracy_error

0

10

20

(0)

5 10 15precision_error

0

10

(1)

0 5 10accuracy_error

0

10

prec

ision

_erro

r

(2)

Figure 5.1: Occurring performance measure values - We determined our per-

formance measures for each block of each participant using the

data from our experiment. Sub�gures (0) and (1) are histograms

on this data showing the frequency of occurring values. Sub�gure

(2) is a scatterplot confronting both measures and also showing the

non-dominated fronts as polylines.

values for precision error. Comparing Sub�gures (0) and (1), we observe as

an additional di�erence that for accuracy-error some participants in certain

blocks were able to reach error values close to 0. Yet, for precision-error, we

see nobody with a precision-error less than 5 in any block.

From these two sub�gures, we get an idea for the distribution of one perfor-

mance measure independent from the other one. The opposing skewness of

both distributions may be a hint towards a negative correlation, i.e., that low

values for accuracy-error may occur together with high values for precision-

error.

In order to get an impression on the relation between both measures in our

experiment, we look now at the remaining sub�gure - Figure 5.1, Sub�gure (2).

This diagram is a scatterplot confronting accuracy-error along the horizontal

axes with precision-error along the vertical axis. Each point still represents

a certain block of a certain participant. In addition, we see the fronts from

non-dominated sorting as polylines connecting the elements that belong to it.

We explained how we determine these fronts in Section 2.4.

Looking at the data points, we see that they cover for accuracy-error a range

starting from left close to zero going to the right up to a value of 10 like before

in Sub�gure (0). Also in precision-error, they cover the range starting at the

bottom with about 5 going up to a value close to 20 just like in Sub�gure (1).

However, we see no negative correlation which would be visible as a perceiv-

41

5 Results

able line or strip of points going from the top left to the bottom right of the

diagram. What we see resembles more a positive correlation. It is a triangle-

like distribution of points starting at the bottom left with performances good

in accuracy-error and precision-error going up to performances still with good

accuracy-error but now with bad precision-error values and ending in the top

right with performances that are bad according to both measures. Regarding

the non-dominated fronts, the best are located on the left and the worst in

the top right. The best fronts with low accuracy-error mainly go vertical from

performances with high precision-error to performances with low precision-

error whereas the worst fronts follow a direction that is diagonal to horizontal.

Underneath the diagonal of this observed triangle are no data points. The

area that would complete the upper triangle of data points to a rectangle, we

refer to it in the following as empty triangle. This empty triangle means

that in our experiment and according to our method no participant showed a

performance in any block that would lead to a combination of low values for

precision-error and high values for accuracy-error.

Observing this empty triangle, we wonder for the reason of this shape. One

possible reason is the errors that we introduce in our method that may distort

the distribution of occurring values. We measure the position of the hit of a

dart on the dartboard using the dartboard �eld that the dart hit. Later, we

estimate for each hit a position in these �elds independent of the actual posi-

tion in the respective �eld in order to apply our performance measures. Here,

we introduce an error between the actual position of hit and the estimated

position. Since the dartboard �elds increase in area from the inside to the out-

side, participants may be a�ected to a di�erent extent for example according

to their accuracy-error. So participants with higher accuracy-error - a bigger

o�set of their hits from the center - are a�ected by bigger errors as they tend

to hit the bigger �elds more often than participants with low accuracy-error.

This distortion may move occurring performances with low accuracy and high

precision above the diagonal of the triangle where we cannot distinguish them

anymore from the other performances. Since we determine the fronts based

on these values they may not capture the initially intended combination of

elements that are exceptionally good in just one performance measure and

elements that are good in both measures. This would in�uence the further

analyses based on these fronts.

In order to test whether our method is the reason for the empty triangle or

in other words whether it is possible to achieve performances in it, we visual-

42


ized the distortions according to both measures. Following the idea of what

would be di�erent if we had the exact o�set in horizontal and vertical direc-

tion instead of clock time and score, we made a simulation to generate arti�cial

performance data. For this, we generated arti�cial performance data to con-

front performance measures that contain the quanti�cation to the �elds of the

dartboard and performance measures that do not. We chose simple circular

two-dimensional normal distributions on the plane of the wall as representa-

tives for the distribution of hits relative to the dartboard center from which

we drew samples as hits with x and y relative to the dartboard center. For

these distributions, we have 2 parameters - one for the location or center of the

distribution and one for the extent of horizontal as well as vertical scatter - the

standard deviation. Since the dartboard is rotational invariant, we consider

for the location parameter just a positive o�set to the right of the dartboard

center. We de�ned 500 di�erent distributions based on the combinations of 50

di�erent values for the horizontal o�set to the center and 10 di�erent values

for the standard deviation. From each of these distributions, we drew a sample

block, i.e., 100 samples as simulated hit positions. For each of these arti�cial

blocks, we computed the performance measures using the actual hit position

from sampling for a �rst version and using the quanti�ed positions from map-

ping to score and clock time for a second version. If the distortions that lead

to the empty triangle are a general property of our performance measures we

should observe these distortions and the empty triangle when we look at the

�rst version of the performance measures. If these distortions are a result of

our process - of using the dartboard and estimating the position of hits - we

should observe the empty triangle just for the performance measures from the

arti�cial data that was mapped to the dartboard �elds.

Similar to Figure 5.1, Sub�gure (2), we confront in Figure 5.2 accuracy-error

along the horizontal axis and precision-error along the vertical axis. Here, we

marked the triangle-like area that contains performances that occurred during

our experiment with a continuous line. The empty triangle is marked with

a dashed line. For each of the 500 distributions that we sampled there is

an empty circle that represents the performance measures that are directly

based on the sampled hit positions. For the empty circles with precision-

error close to 0, we see the regular distances that we chose for the values of

the o�set parameter. The more we look up the harder it gets to recognize

these lines. In the dashed triangle representing the empty triangle, we see

lines of empty circles. So just according to our performance measures, it is

43

5 Results

0 10 20 30 40accuracy_error

0

10

20

30

40

prec

ision

_erro

r

based on sampled positionbased on dart board fieldsempty triangle in experimentarea of observed values in experiment

Figure 5.2: Visualizing distortions in the relation between our performance

measures - We generated arti�cial blocks of hit positions that we

used to compute the performance measures directly (empty circles)

as well as by mapping them to the �elds of the dartboard �rst

(�lled circles). We marked as a reference the area that contains

the occurring values in our experiment as well the empty triangle

underneath.

44


possible to reach this area. For each of the sampled distributions, there is

a �lled circle in the �gure which represents the performance measures based

on score and clock time. For these, we see multiple distortions. First, we

see that the distribution is limited to the top and the right by a bow-like

structure. Most likely this originates from the limited size of the dartboard

and the mapping of missing hits to the margin of the dartboard. For example, if

participants always miss the dartboard to the right then it makes no di�erence

how far they miss as all these hits are later mapped to the right margin of

the dartboard. According to this fact, we also see the maximum in accuracy-

error for the �lled circle at about 21 cm which is the radius of the dartboard.

The second distortion is a positive correlation that we see for precision-error

below 5. That �ts the notion that the error that we introduce gets bigger

to the outer rings. Nevertheless also in the area of the empty triangle, we

see �lled circles. Comparing corresponding �lled and empty circles inside the

triangle for the occurring values from our experiment, we see the �lled circles

shifted downwards. This means that using the dartboard for measurement, we

underestimate precision-error.

In summary, we observed between our two performance measures from our

experiment a triangle-shaped relation that is - according to our simulation

- neither caused by their de�nition nor caused by introducing quanti�cation

errors by using the dartboard.

5.1.2 Regarding Performance Development

In order to quantify development with respect to the performance measures,

we determine the di�erences in the performance measures from �rst blocks to

corresponding later blocks. A negative di�erence for example in accuracy-error

tells us that the accuracy-error in the later block was smaller than in the �rst

block. This means that the corresponding participant improved in accuracy-

error respective to his accuracy-error in the �rst block. Correspondingly, a

positive di�erence refers to an increase which means that the participant de-

graded. Apart from improving and degrading, we can also observe that par-

ticipants show no or hardly any development with di�erence values close to 0.

Applied to the performance measures from our experiment, we determine for

each participant the di�erences from the �rst block to the later 4 blocks.

45

5 Results

2.5 0.0 2.5accuracy_error

0

10

20(0)

5 0precision_error

0

10

(1)


5

0

prec

ision

_erro

r

(2)

Figure 5.3: Occurring values for development in performance measures - We

determined from the occurring performance measure values the

di�erences from the �rst to the later blocks for each participant.

Sub�gures (0) and (1) are histograms on this data showing the

frequency of occurring values. Sub�gure (2) is a scatterplot con-

fronting the di�erences in both measures and also showing the non-

dominated fronts as polylines.

We visualize these di�erences in the sub�gures of Figure 5.3. Sub�gure (0)

is a histogram that shows us the distribution of occurring di�erence values

regarding accuracy-error. The distribution ranges from the smallest values -

improvements of around -5 - on the left to the biggest values - degradations

up to 2.5 - to the right. We observe the highest peak of the distribution to

cover negative values but also to show very small development close to 0. Also,

looking at the area left and right of 0, we see that the majority of occurring

di�erence values are negative. For accuracy-error, we see that participants

improved more often and also to a bigger extent than they degraded. So, we

see a tendency of getting better in accuracy. Switching to precision-error, Sub-

�gure (1) shows us accordingly a histogram for the distribution of occurring

di�erence values precision-error. Similarly, the values range from -5 up to 2.5,

the peak covers negative values close to zero, and occurrences have predomi-

nantly negative values. Yet, this time, the peak is not as pronounced as before.

Also for precision-error, we see that participants improved more often and to a

bigger extent which we interpret as a tendency of improving in precision-error

over time. Even though both distributions are similar, yet, we do not know

the relation between the development in both measures. For this, we look at

the remaining Sub�gure (2). This diagram is a scatterplot that confronts -

like before in Figure 5.1, Sub�gure (2) - the values for accuracy-error along

the horizontal axis and the values for precision-error along the vertical axis.

46


Whereas this time, the values are the di�erences for these measures. A point

in this diagram represents the di�erences in accuracy-error and precision-error

from the �rst block to a certain later block for a certain participant. Based

on these points, we also determined the non-dominated fronts which we will

use in later analyses. Here, we show these fronts again as polylines connecting

the corresponding points. To emphasize increase and decrease, or improve-

ment and degradation, we added orthogonal to both axes lines that mark the

border between both di�erent developments at the value 0. This gives us 4

quadrants. The lower left quadrant shows us an area with improvement in

both measures, the upper left improvement in accuracy-error and degradation

in precision-error, the upper right degradation in both measures, and the lower

right quadrant an area with degradation in accuracy-error and improvement in

precision-error. We see that the points are distributed across all quadrants. So

all combinations of developments occurred which is di�erent from the empty

triangle in the previous section. We also see points with no development in one

of the measures or no development in both. Generally, the density of points

near the intersection at (0, 0) is higher than in the distant parts. A tendency

of improving regarding both measures, we see from the fact that the lower left

quadrant for improvement regarding both measures contains the most points

in comparison to the other quadrants. Regarding the non-dominated fronts,

we see the best fronts starting from the lower left to the worst fronts in the

top right. The best fronts span from the upper left quadrant over the lower

left quadrant to the lower right quadrant. Doing so, they combine points with

strong improvement in one and degradation in the other measure with points

that show improvement in both measures. The worst front span inside the

upper right quadrant combining points with degradations according to both

performance measures.

So far, we examined the occurring values for development in performance mea-

sures that we will confront in Section 5.3.3 with development in distribution

properties of our throw features in order to identify characteristics of partici-

pants that improve more than others. We observed the tendency to improve

according to both measures. Apart from improvement, we also observed cases

of no development and degradation. This gives us the possibility to confront

cases of development later on. Di�erent from the fronts on the actual values

for the performance measures in the previous section, this time, especially the

best fronts span a full range and are not just in�uenced by just one of the

measures. Here they combine both accuracy-error and precision.

47

5 Results

5.2 Relation between Performance and Throw

Pattern

In the following, we want to analyze the performance data in their relation to

the throw features. We expect that better participants show a higher variety in

their throw physiology. So far, we determined for each of the 5 blocks - so for

every 100 throws - of a participant the performance measures accuracy-error

and precision-error, as well as the distribution properties median and IQR for

each throw feature. Applying IQR to a throw features gives us an physiological

variance measure. At this point, we have for each participant 5 data points

which are the basis for the following Figures 5.4, 5.5, and 5.6. Each of them

shows in Sub�gure (0) a visualization of one or both performance measures

as well as in Sub�gures (1) to (20) the distribution measures for each feature.

Looking at Figure 5.5, Sub�gure (6) just as an example, we see a scatterplot

that opposes for the data points of all participants for the feature d_t2_t4

the median along the horizontal axis as well as the IQR along the vertical axis.

In addition, we see data points colored in transparent grey, blue or red. Yet

in order to get to know this kind of diagram, we will ignore the meaning of

the color for this moment. Considering the vertical axis on IQR, we observe

that the elements of the blue group of data points tend to reach higher values

for IQR than the elements of the red group. For the feature of this sub�gure -

d_t2_t4 - which is the latency between the time of the maximum acceleration

and the time of maximum deceleration, this means that the variations of this

latency for the blue group tend to be bigger than the respective variations

for the red group. On the other hand considering the horizontal axis on the

median, we see that the red group is more concentrated as it covers a smaller

interval between values from 100 to 150. However, the blue group seems to

be more scattered as it covers a greater interval. Regarding the latency

between peak acceleration and peak deceleration this means that the elements

of the red group tend to be more similar in their average latency. Whereas the

elements of the blue group reach higher as well as lower average latency being

less similar. Also, when we consider both axes, we see that the red points are

more concentrated than the blue points showing that the elements of the red

group are also more similar regarding IQR.

In order to �nd answers for our research question of which di�erences exist

between participants with a high performance and participants with a low

48

5.2 Relation between Performance and Throw Pattern

performance, we de�ne a group of good elements with a high performance

and a group of bad elements with a low performance based on di�erent crite-

ria regarding our performance measures - accuracy-error and precision-error.

Having these groups, we oppose both of them visually to look for di�erences in

their distribution properties for throw features. In the following Figures 5.4,

5.5, and 5.6, we show the good group in red color and the bad group in blue

color. Elements that we do not assign to a group, we show in a transparent

grey for context. Applied to our initial example from Figure 5.5, Sub�gure (6),

we see that it is the group of good elements in red that shows less variance in

the given feature and whose elements are more similar to each other than the

elements in the bad group in blue. To shorten the following descriptions, we

abbreviate the good red group as R as well as the bad blue group as B.

According to our notion of variability in motor control, we expect participants

with a higher performance to show higher variance in their throw physiology.

We quantify this variance using the IQR distribution property of throw features

over a whole block of throws. In consequence, we expect for the scatterplots

for the distribution measures of the throw features to see that R shows higher

IQR values than B. So what we see in our example in Sub�gure (6) does not

match to what we expect. It rather opposes what we expect as R shows less

IQR instead of more IQR.

In the next three sections, we de�ne R and B, �rst and second solely based

on one of our performance measures and third based on both of them using

fronts from non-dominated sorting. Each time, we use these groups to identify

di�erences according to the throw features and check whether they �t our

expectations. Afterward, we summarize and compare the sets of observations

that we make.

5.2.1 Groups based on Accuracy-error

In this �rst analysis, we select R and B solely based on the accuracy-error

performance measure. As this measure is an error measure it is better if the

error's value is smaller. Because of this, we selected for R the 10 elements

with the smallest values for accuracy-error. On the other hand, we selected for

B the 10 elements with the highest values for accuracy-error. In Figure 5.4,

Sub�gure (0), we see a histogram on the occurring values of accuracy-error.

49

5 Results

With respect to our selection R forms the left border with the smallest values

while B forms the right border with the highest values.

Based on this de�nition of R and B from accuracy error, we will now describe

our observations of di�erences between both groups regarding the distribution

measures for each throw feature in Figure 5.4. Yet, we will not describe features

or in other words, Sub�gures in which we see no clear di�erence.

Regarding the features d_t2_t3 and d_t2_t4 in Sub�gures (5) and (6), we

observe that R except for two outliers is more concentrated according to IQR

also tending to have smaller IQR than B. This opposes our expectations as we

expect higher IQR for the elements of R.

Looking at Sub�gure (13), we see R having the tendency to have higher IQR

than B which �ts what we expect.

Lastly, in Sub�gure (19), we observe - ignoring two outliers - B tending to have

lower IQR for a given median. So also here R shows what we expect, i.e., a

higher IQR than B.

Using solely the accuracy-error to de�ne R and B, we observed di�erences in

4 of 20 throw features according to their distribution properties. Though the

di�erences for 2 features match our expectations there are also 2 features that

show di�erences that oppose them.

5.2.2 Groups based on Precision-error

In this analysis, we de�ne the selections for good as well as bad based solely on

the precision-error performance measure. As this measure is an error measure

like the accuracy-error in the previous analysis, we consider smaller values

preferable or better than bigger values. Respectively, we selected for the group

of good elements R the 10 elements with the lowest values in precision-error

and for the group of bad elements B the 10 elements with the highest values.

Looking at the histogram in Figure 5.5, Sub�gure (0), we see R at the left

border of the distribution with the lowest occurring values and we see B at the

right border with the highest occurring values.

Regarding the Sub�gure (1) for the �rst feature d_t1_t2, we see R and B

separated by a conceivable diagonal. So for a given median R shows smaller

50


5 10accuracy_error

0

10

20

(0)

100 200d_t1_t2_median

50

100

150

d_t1

_t2_

iqr

(1)

100 200 300d_t1_t3_median

50

100

150

d_t1

_t3_

iqr

(2)


50

100

150

d_t1

_t4_

iqr

(3)


50

100

150

d_t1

_t5_

iqr

(4)


25

50

75d_

t2_t

3_iq

r(5)


25

50

75

d_t2

_t4_

iqr

(6)


20

40

60

80

d_t2

_t5_

iqr

(7)

25 50d_t3_t4_median

20

40

60

d_t3

_t4_

iqr

(8)


20

40

60

d_t3

_t5_

iqr

(9)


25

50

75d_

t4_t

5_iq

r(10)

5 10d_v1_v2_median

2

4

d_v1

_v2_

iqr

(11)

0 10d_v1_v3_median

1

2

3

4

d_v1

_v3_

iqr

(12)

10 20 30d_v1_v4_median

2

4

6

d_v1

_v4_

iqr

(13)

10 0d_v2_v3_median

2

4

d_v2

_v3_

iqr

(14)

10 20d_v2_v4_median

2

4

6d_

v2_v

4_iq

r(15)


2

4

6

8

d_v3

_v4_

iqr

(16)

0.1 0.2s12_median

0.000

0.025

0.050

0.075

s12_

iqr

(17)

0.2 0.0s23_median

0.0

0.1

0.2

0.3

s23_

iqr

(18)

0.250.500.751.00s34_median

0.1

0.2

0.3

s34_

iqr

(19)

0.750.500.25s45_median

0.0

0.2

0.4

0.6

s45_

iqr

(20)

Figure 5.4: Confrontation based on accuracy-error - Sub�gure (0) represents

the selected good and bad group based on accuracy-error. Sub�g-

ures (1) to (20) show the distribution properties of these groups

for each throw feature.51

5 Results

5 10 15precision_error

0

5

10

15(0)


50

100

150

d_t1

_t2_

iqr

(1)

100 200 300d_t1_t3_median

50

100

150

d_t1

_t3_

iqr

(2)


50

100

150

d_t1

_t4_

iqr

(3)


50

100

150

d_t1

_t5_

iqr

(4)


25

50

75d_

t2_t

3_iq

r(5)


25

50

75

d_t2

_t4_

iqr

(6)


20

40

60

80

d_t2

_t5_

iqr

(7)

25 50d_t3_t4_median

20

40

60

d_t3

_t4_

iqr

(8)


20

40

60

d_t3

_t5_

iqr

(9)


25

50

75d_

t4_t

5_iq

r(10)

5 10d_v1_v2_median

2

4

d_v1

_v2_

iqr

(11)

0 10d_v1_v3_median

1

2

3

4

d_v1

_v3_

iqr

(12)


2

4

6

d_v1

_v4_

iqr

(13)

10 0d_v2_v3_median

2

4

d_v2

_v3_

iqr

(14)

10 20d_v2_v4_median

2

4

6d_

v2_v

4_iq

r(15)


2

4

6

8

d_v3

_v4_

iqr

(16)

0.1 0.2s12_median

0.000

0.025

0.050

0.075

s12_

iqr

(17)

0.2 0.0s23_median

0.0

0.1

0.2

0.3

s23_

iqr

(18)

0.250.500.751.00s34_median

0.1

0.2

0.3

s34_

iqr

(19)

0.750.500.25s45_median

0.0

0.2

0.4

0.6

s45_

iqr

(20)

Figure 5.5: Confrontation based on precision-error

52


values for IQR than B which opposes our expectation of R having higher IQR

values.

Looking at Sub�gure (5), we observe R having lower values and being less

scattered according to IQR. This stands again against what we expect.

Similar to this, we see in Sub�gure (6) the same properties for R. In addition

R is also more concentrated regarding the median values.

In Sub�gure (11) for d_v1_v2 which represents the maximum acceleration R

and B intersect. Yet R is more concentrated with small values for median and

IQR. B tends to have higher IQR, which is not what we expect.

In the next sub�gure - Sub�gure (12) -, we see a similar yet stronger di�er-

ence as R and B just slightly overlap while R reaches lower values for both

distribution measures.

Next, we look at Sub�gures (13) and (15) for the related features d_v1_v4

and d_v2_v4 which represent respectively the maximum deceleration and the

di�erence between maximum acceleration and maximum deceleration. Both

times, we observe - ignoring one outlier each time - that R tends to have higher

IQR and smaller median. This �ts both times our expectation.

Regarding d_v2_v3 in Sub�gure (14), we observe R to be more concentrated

according to IQR as well as median. B reaches higher values for the median

as well as the IQR. This opposes what we expect to see.

Looking at Sub�gure (17) for s12 - the slope at the beginning of the acceleration

phase -, we observe R and B intersecting yet R being concentrated with small

values in both distribution measures. On the other hand, B reaches higher

values for median and IQR which opposes our expectations.

Lastly, we look at s23 in Sub�gure (18) which opposes again what we expected

as R shows lower IQR for a given median.

Using solely precision-error to de�ne a good group and a bad group, we ob-

served di�erences between those groups in the distribution measures for 10 of

20 throw features. Here just 2 observations match our expectation of R having

higher IQR while 8 observations show the opposite.

53

5 Results

5.2.3 Groups based on Fronts from non-dominated

Sorting

As we are interested if we get more insight by not looking at a single perfor-

mance criterion but instead considering multiple performance measures in a

combined way, we determine our groups in this last comparison based on the

fronts from non-dominated sorting on both performance measures - precision-

error and accuracy-error.

As these are error measures smaller values are more preferable. So smaller

values are more dominant for both measures during the non-dominated sorting.

We see the resulting fronts in Figure 5.6 Sub�gure (0) as polylines with the

best fronts starting from the left going to the worst fronts in the upper right.

Having these fronts, we de�ne R - the set of good elements - using the best three

fronts which contain in total 14 data points. For B - the set of bad elements

-, we select the last four fronts which contain a total of 15 data points.

In Sub�gure (0), we see R and B respectively in red and blue. R contains the

elements with the lowest values for accuracy-error and spans from elements

with the lowest occurring values for precision-error almost up to elements with

the highest values. This is a consequence of the absence of data points with

low precision-error and high accuracy-error which we discussed in Section 5.1.

So R consists mainly of elements from a slice starting from the left solely

in�uenced by accuracy-error. On the other hand, B consists of elements with

a medium to high accuracy-error and precision-error.

Looking at Sub�gures (5), we see for d_t2_t3 as well as d_t2_t4 which is the

latency between the time of maximum acceleration and the time of maximum

deceleration that R tends to be more concentrated with lower values for IQR.

This opposes our expectations.

Also, regarding d_v1_v2 which is the maximum acceleration, we see in Sub-

�gure (11) that R is more concentrated regarding IQR. Here it is intersecting

with the lower part of B. Yet again it opposes what we expect to observe.

In Sub�gure (12), we observe that R tends to have lower values for median

and IQR than B. This is contrary to our expectation.

Regarding Sub�gures (13) and (15), we observe the distribution measures for

d_v1_v4 and d_v2_v4 respectively the maximum deceleration and the dif-

ference from maximum acceleration to maximum deceleration. Here R tends

54


0 10accuracy_error

0

5

10

15pr

ecisi

on_e

rror

(0)


50

100

150

d_t1

_t2_

iqr

(1)

100 200 300d_t1_t3_median

50

100

150

d_t1

_t3_

iqr

(2)


50

100

150

d_t1

_t4_

iqr

(3)


50

100

150

d_t1

_t5_

iqr

(4)


25

50

75d_

t2_t

3_iq

r(5)


25

50

75

d_t2

_t4_

iqr

(6)


20

40

60

80

d_t2

_t5_

iqr

(7)

25 50d_t3_t4_median

20

40

60

d_t3

_t4_

iqr

(8)


20

40

60

d_t3

_t5_

iqr

(9)


25

50

75d_

t4_t

5_iq

r(10)

5 10d_v1_v2_median

2

4

d_v1

_v2_

iqr

(11)

0 10d_v1_v3_median

1

2

3

4

d_v1

_v3_

iqr

(12)


2

4

6

d_v1

_v4_

iqr

(13)

10 0d_v2_v3_median

2

4

d_v2

_v3_

iqr

(14)

10 20d_v2_v4_median

2

4

6d_

v2_v

4_iq

r(15)


2

4

6

8

d_v3

_v4_

iqr

(16)

0.1 0.2s12_median

0.000

0.025

0.050

0.075

s12_

iqr

(17)

0.2 0.0s23_median

0.0

0.1

0.2

0.3

s23_

iqr

(18)

0.250.500.751.00s34_median

0.1

0.2

0.3

s34_

iqr

(19)

0.750.500.25s45_median

0.0

0.2

0.4

0.6

s45_

iqr

(20)

Figure 5.6: Confrontation based on non-dominated fronts on both perfor-

mance measures

55

5 Results

to be smaller regarding the median but it is also more scattered regarding IQR

and tends to reach higher values in it. This matches what we expect.

Also in Sub�gure (16) for d_v3_v4, we see R being more scattered and tending

to reach higher values for IQR which �ts our expectations.

For s12 - the slope towards the maximum acceleration - in Sub�gure (17) R

is more concentrated with low values according to both distribution measures

which opposes what we expect.

Lastly regarding Sub�gure (18) for s23 which is the slope of decreasing accel-

eration after maximum acceleration R tends to have smaller median than B.

This neither �ts nor opposes our expectations.

Using groups based on the fronts from non-dominated on our performance

measures, we found di�erences in the distribution measures of 9 of 20 throw

features. Though 3 observed di�erences match our expectation of higher IQR

for R there are also 5 observations that oppose it.

5.2.4 Comparison of Sets of Observations

In each of the previous three sections, we de�ned a good and a bad group based

on di�erent strategies. The �rst time, we de�ned the groups based on the

accuracy-error, the second time based on precision-error, and the third time

based on the fronts from non-dominated sorting to combine both measures.

We used these groups to identify throw features that showed di�erences in the

distribution measures according to these groups.

In total, we made 23 observations of di�erences which we show in Table 5.1.

There, we marked that we observed a di�erence for a certain feature using a

certain strategy to de�ne the groups by marking the respective table cell with

an "x".

To get more insight into the data, we decided at the beginning of the devel-

opment of our method to use the two performance measures accuracy-error

and precision-error, and to combine them using non-dominated sorting. Now,

we see in the table that the accuracy-error based strategy gave us - with 4

observed di�erences - the fewest observations. The other two strategies gave

us - with about 10 each - more observations. Yet, for each strategy, the set of

56

5.3 Relation between Performance Developments and Throw Pattern Developments

observations is di�erent to the others and also contains at least one observation

that is unique as it appears under no other strategy.

In Section 5.2.3, we described that the selected R for the front-based selection

resembles a slice of data points according to just accuracy-error which we

can see in Figure 5.6, Sub�gure (0). Also, B appears to be close to being

selected by another slice just on accuracy-error except for some data points

below B. With this impression the groups for that section are very similar to

the groups from Section 5.2.1 with groups exclusively based on accuracy-error

which are represented in Figure 5.4, Sub�gure (0). Nevertheless, in Table 5.1,

we see that for the front-based selection that we have more than double the

amount of observed di�erences making it very di�erent than the accuracy-

based approach. According to the number of observations, the front-based

approach is more similar to the precision-based approach.

Regarding our expectations of higher variance for better players, we assessed

for each feature in which we saw a di�erence between the good and the bad

group whether this di�erence matches to our expectation or whether it opposes

showing the opposite. We summarized also this in Table 5.1 by marking a

match with a "(+)" as well an opposition with a "(-)". 22 of 23 observations

are either matching or opposing what we expect. Only one - for the feature

s23 in the front-based approach - shows a di�erence that neither matches nor

opposes as the di�erence relates exclusively to the median which we see in

Figure 5.6, Sub�gure (18). We want to emphasize that our expectations just

focus on IQR and we also observed di�erences in median in many other cases.

Looking row-wise through the table comparing the matching of observations

for a feature using di�erent approaches, we see for 7 of 8 rows with multiple

observations that they show the same whether they match or oppose. Overall,

we made 7 observations that �t our expectation but also 15 that oppose.

5.3 Relation between Performance

Developments and Throw Pattern

Developments

Based on our de�nition of development as the di�erence from the �rst block

to the later blocks, we have 4 data points for each participant. Each data

57

5 Results

Table 5.1: Overview of throw features with observed di�erences in perfor-

mance - We de�ned a good group and a bad group according to

di�erent strategies. When we observed a di�erence between the

groups according to a certain strategy (column) in a certain feature

(row) then we mark it here with a x. In addition, we marked dif-

ferences that �t or oppose our expectations respectively with (+)

or (-).

sub�gure feature accuracy-based precision-based front-based

see section 5.2.1 see section 5.2.2 see section 5.2.3

1 d_t1_t2 x (-)

5 d_t2_t3 x (-) x (-) x (-)

6 d_t2_t4 x (-) x (-) x (-)

11 d_v1_v2 x (-) x (-)

12 d_v1_v3 x (-) x (-)

13 d_v1_v4 x (+) x (+) x (+)

14 d_v2_v3 x (-)

15 d_v2_v4 x (+) x (+)

16 d_v3_v4 x (+)

17 s12 x (-) x (-)

18 s23 x (-) x

19 s34 x (+)

58


point consists of di�erences for performance measures as well as di�erences of

distribution measures for throw features. In the following Figures 5.7, 5.8, and

5.9, we show those di�erences for performance measures each time as Sub�gure

(0) and for distribution measures in Sub�gures (1) to (20).

Looking at Figure 5.7, Sub�gure (1) as an example, we see a scatterplot that

confronts the di�erences of median as well as IQR for the feature d_t1_t2.

In order to understand what we see in this type of diagram, we look at the

set of blue elements and ignore the meaning of this set for now. According to

median that is represented along the horizontal axis, we see that this blue set

shows small as well as above yet positive di�erence values. Some lay with a

very small di�erence value close to the vertical 0-line showing very small or no

development. Others, we see further to the right showing us an increase in

the median distribution measure. For the feature d_t1_t2 that captures the

time from the start of the acceleration until the maximum acceleration, this

means that the elements of the blue group take longer on average later on to

reach that peak acceleration. Switching to the vertical axis for the di�erences

in IQR, we see for some elements of the blue group that they are very close

to the horizontal 0-line or in other words that their di�erence value is very

close to 0. These show again very small or no development. The rest of the

blue group shows absolute bigger but negative di�erence values in IQR for

the given feature which makes them represent a decrease in IQR. For the

d_t1_t2 feature, this means that the variance of the time to the maximum

acceleration got smaller for these elements, which may mean that the process

of throwing got more stable or similar.

We see in this blue group for the d_t1_t2 feature a tendency for an increase

in median with a decrease in IQR though some elements show no development

in median or IQR. This observation may be more signi�cant or convincing

if all elements would show this increase and decrease. Yet the blue group is

more clear in its development being more concentrated compared to the red

group. While some elements of the red group lay close to the blue elements

there are others that actually show stronger developments but also opposite

developments which makes it more diverse and appear more scattered. For

example, there are also red elements in the bottom right of the �gure that

show a decrease in median as well as IQR.

Returning to our research question of what are di�erences between partici-

pants which improve a lot and participants which do not we will de�ne in the

59

5 Results

following a group that is good or better according to a chosen criteria based

on the development as well as another group that is bad or worse based on the

same criteria. We show the good group as the red group as well as we show

the bad group as the blue group. For our example from before from Figure

5.7, Sub�gure (1) this means that both groups tend to reduce IQR. Yet they

are di�erent in the development according to median as the bad group just

shows an increase while the good group shows a decrease as well as increase.

According to our idea of variability in motor control, we expect that the im-

provement of participants is related to an increase in variability which is here

represented by the distribution measure IQR for throw features. According to

this, our good group should show positive IQR di�erence values which repre-

sent an increase in IQR. Comparing both groups, we expect the IQR di�erences

for the good group to be bigger than the IQR di�erences for the bad group.

Considering our example from before, we see that the red good group and

the blue bad group are not separated according to IQR di�erences. Also, for

both, we rather see a decrease in IQR than an increase in IQR. Though some

elements show an increase in IQR yet our example shows a combination that

we did not expect.

In the following three sections, we observe di�erences in the developments of

throw features of our good and bad group based on three di�erent strategies

to de�ne those groups. First and second, we de�ne the good and bad group

solely based on the development in one of our performance measures. Third,

we de�ne them based on fronts from non-dominated sorting on developments

in both measures. Afterward, we compare the sets of observations that we

make.

5.3.1 Groups based on Accuracy-error

Now, we want to de�ne both groups based on the development in the accuracy-

error performance using di�erences in that measure from the �rst block to later

blocks. The good group is again assigned with the color red. We abbreviate

it in the following observations with R. On the other side the bad group is

assigned again to the color blue. So, we abbreviate it in the following observa-

tions with B. We selected the 11 elements with the biggest negative di�erence

values as R. They represent the elements with the biggest improvement as

their accuracy-error decreased the most from the �rst to the later block. In

60


addition, we selected the 11 elements with the biggest di�erence values as B.

Looking at Figure 5.7 Sub�gure (0), we see that they show a degradation as

their accuracy-error increased. From here, we can see that R and B represent

the left and right end of the distribution of occurring di�erence values.

Let's describe the observations for di�erences according to a good and a bad

group de�ned based on the development in accuracy-error performance mea-

sure. We focus exclusively on Figure 5.7. Yet, we do not describe sub�gures

in which we cannot see a clear di�erence between the groups.

Looking at Sub�gure (1) for the development of the distribution measures of

the d_t1_t2 throw feature, we see that B as well as R mostly change a little

or rather gets smaller in IQR. This does not match our expectations. However,

we see a di�erence in median as R is more scattered than B showing partially

an increase and partially a decrease in median while B shows just an increase

in median.

Looking at Sub�gure (2) for the development of the distribution of d_t1_t3,

we see that B is more concentrated with the tendency of increasing the median

while R is more scattered with di�erent combinations of increases or decreases

in median and IQR. We see similar developments for d_t1_t4 in Sub�gure (3)

as well as d_t1_t5 in Sub�gure (4).

In Sub�gure (6) for d_t2_t4, we see for B a slight decrease in median time

whereas R shows rather a decrease in IQR. This means the latency between

peak acceleration and peak deacceleration gets smaller for B whereas for R it

is more characteristic that the variance of that latency decreases meaning that

the throws getting more similar regarding that feature.

Regarding Sub�gure (8) for d_t3_t4 B seems to be slightly more scattered

according to developments in IQR. Yet, we see no di�erence that matches our

expectations.

In Sub�gure (9), we see for the total time of deceleration - d_t3_t5 - for R

mainly an increase in IQR whereas B shows rather a decrease in IQR. This

matches our expectations.

Starting to look at the amplitude features, we see for the maximum acceleration

- d_v1_v2 - that B decreases in IQR while R shows smaller also increasing

developments. Also, this di�erence does not support our expectations.

In Sub�gure (15) for d_v2_v4 - the amplitude di�erence between the maxi-

mum acceleration and the maximum deceleration -, we see no or in comparison

61

5 Results


0

10

20(0)

50 0 50d_t1_t2_median

100

0

d_t1

_t2_

iqr

(1)


100

50

0

50

d_t1

_t3_

iqr

(2)


100

50

0

50

d_t1

_t4_

iqr

(3)


100

50

0

50

d_t1

_t5_

iqr

(4)

0 50d_t2_t3_median

40

20

0

d_t2

_t3_

iqr

(5)

50 0d_t2_t4_median

40

20

0

20

d_t2

_t4_

iqr

(6)

0 50d_t2_t5_median

40

20

0

20

d_t2

_t5_

iqr

(7)

25 0d_t3_t4_median

20

0

20

d_t3

_t4_

iqr

(8)


25

0

25

d_t3

_t5_

iqr

(9)

50 0d_t4_t5_median

25

0

25

50d_

t4_t

5_iq

r(10)

2.5 0.0 2.5d_v1_v2_median

2

0

d_v1

_v2_

iqr

(11)

5 0d_v1_v3_median

2

0

d_v1

_v3_

iqr

(12)

0 10d_v1_v4_median

2

0

2

d_v1

_v4_

iqr

(13)

5 0d_v2_v3_median

2

1

0

d_v2

_v3_

iqr

(14)

0 10d_v2_v4_median

2

0

d_v2

_v4_

iqr

(15)

0 10d_v3_v4_median

2

0

d_v3

_v4_

iqr

(16)

0.05 0.00 0.05s12_median

0.05

0.00

s12_

iqr

(17)

0.1 0.0s23_median

0.3

0.2

0.1

0.0

s23_

iqr

(18)

0.00 0.25s34_median

0.2

0.0

0.2

s34_

iqr

(19)

0.5 0.0 0.5s45_median

0.2

0.0

0.2

s45_

iqr

(20)

Figure 5.7: Confrontation based on development of accuracy-error

62


to the rest small changes for R. On the other hand, B shows a tendency for a

reduction in median while it is also more diverse in IQR. Also this di�erence,

we cannot �t to our expected di�erences.

Looking at Sub�gure (18) for s23, we see as the only di�erence between B and

R that R is less scattered than B which we cannot �t to our expectations.

For Sub�gure (20) the feature s45, we see just for R an increase in IQR. This

�ts our expectations.

Summing up, we found 11 features with di�erences in development for both

groups based on the developments in accuracy-error, e.g., di�erent develop-

ments in median or IQR, or di�erences in the variations of the strength of

developments. Yet, there were just 2 cases that actually �t our expectations

on increasing variance.

5.3.2 Groups based on Precision-error

This time, we de�ne the groups based on the developments in precision-error

that is de�ned as the di�erences in precision-error of the �rst block to later

blocks. Like before, we select for the red good group - R - the 11 elements with

the smallest di�erence values. As we can see in Figure 5.8, Sub�gure (0) these

are elements at the lower border of the distribution of occurring di�erences

with a negative di�erence that stands for a decrease in precision-error. So

they are the elements that improved the most according to the precision-error.

On the other side, we assigned the 11 elements with the biggest di�erence

values which are closest to the upper limit of the distribution in Sub�gure (0)

to the blue bad group - B. They show di�erences close to 0 or positive. So they

show a small to medium increase in precision-error which is a degradation.

Looking at Sub�gure (1), we see for B a tendency to slightly increase in me-

dian as well as to slightly decrease in IQR. R is di�erent as it shows stronger

developments that are also more diverse because there are in addition elements

with opposite developments as described for B.

In Sub�gure (5) R appears again scattered compared to B without a certain

tendency for a common development. On the other hand, B is rather concen-

trated with a tendency to slightly decrease in median.

In Sub�gures (6), (7), (9) as well as (20), we see B as rather concentrated while

we see R rather scattered without a certain tendency for the developments.

63

5 Results

5 0precision_error

0

5

10

15(0)


100

0

d_t1

_t2_

iqr

(1)


100

50

0

50

d_t1

_t3_

iqr

(2)


100

50

0

50

d_t1

_t4_

iqr

(3)


100

50

0

50

d_t1

_t5_

iqr

(4)

0 50d_t2_t3_median

40

20

0

d_t2

_t3_

iqr

(5)

50 0d_t2_t4_median

40

20

0

20

d_t2

_t4_

iqr

(6)

0 50d_t2_t5_median

40

20

0

20

d_t2

_t5_

iqr

(7)

25 0d_t3_t4_median

20

0

20

d_t3

_t4_

iqr

(8)


25

0

25

d_t3

_t5_

iqr

(9)

50 0d_t4_t5_median

25

0

25

50d_

t4_t

5_iq

r(10)

2.5 0.0 2.5d_v1_v2_median

2

0

d_v1

_v2_

iqr

(11)

5 0d_v1_v3_median

2

0

d_v1

_v3_

iqr

(12)

0 10d_v1_v4_median

2

0

2

d_v1

_v4_

iqr

(13)

5 0d_v2_v3_median

2

1

0

d_v2

_v3_

iqr

(14)

0 10d_v2_v4_median

2

0

d_v2

_v4_

iqr

(15)

0 10d_v3_v4_median

2

0

d_v3

_v4_

iqr

(16)

0.05 0.00 0.05s12_median

0.05

0.00

s12_

iqr

(17)

0.1 0.0s23_median

0.3

0.2

0.1

0.0

s23_

iqr

(18)

0.00 0.25s34_median

0.2

0.0

0.2

s34_

iqr

(19)

0.5 0.0 0.5s45_median

0.2

0.0

0.2

s45_

iqr

(20)

Figure 5.8: Confrontation based on development of precision-error

64


Regarding d_v1_v4 in Sub�gure (13), we observe a stronger tendency for R

to reduce in median compared to B.

Looking at d_v2_v3 in Sub�gure (14), we see for R decrease in IQR that we

do not see for B. This is opposing what we expect.

In Sub�gure (17) for s12 which represents the slope towards the maximum

acceleration, we see for B the tendency to decrease in median as well as in

IQR. We do not see this tendency for R.

Summing up, we identi�ed 9 features with di�erences in development based

on groups from development in precision-error. None of these observations

actually �ts our expectations while one of them even opposes our expectations.

5.3.3 Groups based on Fronts from non-dominated

Sorting

In this analysis, we de�ne R and B based on fronts from non-dominated sorting.

Similar to the application of non-dominated sorting in the analysis of status

for the multi-criteria case in Section 5.2.3, we determine the fronts based on

minimizing each measure. In that analysis, a smaller value for each of the

performance measures was better as they represent errors where a smaller error

is preferable. This time, we use the developments instead, i.e., the di�erences

in the performance measures from the �rst to later blocks. Yet, also with

these, a smaller value is preferable as we consider di�erences now. Negative

values represent a decrease in the errors while positive di�erences represent

an increase in error or in other words a degradation. The result of the non-

dominated sorting, we see in Figure 5.9, Sub�gure (0) where the resulting

fronts are visible as polylines with the best fronts starting in the bottom left

and the worst fronts in the top right. For the group of good elements that

we show again in red and abbreviate with R, we selected the elements of the

�rst 2 fronts with 12 elements. So the group size is close to the group size

in the previous analyses. As we can see in Sub�gure (0) R contains elements

with strong improvements in both performance measures showing elements

with negative di�erence values in both performance measures. Yet, at the

ends of the selected fronts, we see elements with strong improvement in one

measure while there is a degradation in the other measure. For the group of

bad elements -B -we selected the elements of the last 4 fronts with 10 elements.

65

5 Results

In Sub�gure (0), we see that this group consists of elements that slightly or

stronger degrade in both performance measures.

Looking at Sub�gure (1), we observe B being more concentrated with a slight

increase in median while R shows increases as well decreases in the median of

the feature d_t1_t2. In addition for R, we see a tendency to reduce in IQR.

This, plus the fact that B shows almost no development in IQR seems to show

the opposite of what we expect.

Regarding the next feature d_t1_t3 in Sub�gure (2), we also see that B is

more concentrated with a tendency to slightly increase in median while R

shows stronger developments with increases as well as decreases in median.

This also applies for the following feature in Sub�gure (3).

In Sub�gure (5) for d_t2_t3, we observe that B shows no development in IQR

while R shows decrease in IQR. This observation opposes our expectation.

Regarding the next Sub�gures (6) and (7), we see a similar di�erence for B

and R like in Sub�gure (5). Again R shows a stronger decrease in IQR which

opposes our expectation.

Looking at Sub�gure (8) for feature d_t3_t4, we observe for R a tendency to

decrease in in median while for B we observe a tendency to increase.

In Sub�gure (12), for B as well as R, we see a tendency to decrease in median.

Yet R decreases stronger.

For feature d_v2_v4 in Sub�gure (15), we observe for B the tendency to

decrease in IQR while for R we see increases as well as decreases in IQR. This

�ts partly to our expectations. Though the B and parts of R show decrease in

IQR nevertheless other parts of R show the expected increase in IQR and the

stronger increase than B.

In Sub�gure (16) for d_v3_v4, we observe for R the tendency to increase in

median while in B we see a weaker increase and also decrease.

Lastly, in Sub�gure (18), we see again for R a stronger tendency to decrease in

IQR while B is rather concentrated with no development in IQR which opposes

our expectations.

Summing it up using groups based on fronts from non-dominated sorting, we

found 11 features that showed di�erences in development according to those

groups. We found no feature in which the groups showed the development that

66



5

0

prec

ision

_erro

r(0)


100

0

d_t1

_t2_

iqr

(1)


100

50

0

50

d_t1

_t3_

iqr

(2)


100

50

0

50

d_t1

_t4_

iqr

(3)


100

50

0

50

d_t1

_t5_

iqr

(4)

0 50d_t2_t3_median

40

20

0

d_t2

_t3_

iqr

(5)

50 0d_t2_t4_median

40

20

0

20

d_t2

_t4_

iqr

(6)

0 50d_t2_t5_median

40

20

0

20

d_t2

_t5_

iqr

(7)

25 0d_t3_t4_median

20

0

20

d_t3

_t4_

iqr

(8)


25

0

25

d_t3

_t5_

iqr

(9)

50 0d_t4_t5_median

25

0

25

50d_

t4_t

5_iq

r(10)

2.5 0.0 2.5d_v1_v2_median

2

0

d_v1

_v2_

iqr

(11)

5 0d_v1_v3_median

2

0

d_v1

_v3_

iqr

(12)

0 10d_v1_v4_median

2

0

2

d_v1

_v4_

iqr

(13)

5 0d_v2_v3_median

2

1

0

d_v2

_v3_

iqr

(14)

0 10d_v2_v4_median

2

0

d_v2

_v4_

iqr

(15)

0 10d_v3_v4_median

2

0

d_v3

_v4_

iqr

(16)

0.05 0.00 0.05s12_median

0.05

0.00

s12_

iqr

(17)

0.1 0.0s23_median

0.3

0.2

0.1

0.0

s23_

iqr

(18)

0.00 0.25s34_median

0.2

0.0

0.2

s34_

iqr

(19)

0.5 0.0 0.5s45_median

0.2

0.0

0.2

s45_

iqr

(20)

Figure 5.9: Confrontation based on fronts from non-dominated sorting on the

development in accuracy-error and precision-error

67

5 Results

we expected. Yet, we found 5 features where the groups showed developments

that rather oppose our expectations.

5.3.4 Comparison of Sets of Observations

We made 3 analyses using 3 di�erent strategies to de�ne good and bad groups.

For the �rst two analyses, we used the development in one of the perfor-

mance measures each. For the last analysis, we combined the development

in both measures using the fronts from non-dominated sorting. We opposed

these groups in each analysis in order to �nd di�erences in the development of

features that �t our expectation of a rising variance.

As we summarize in Table 5.2, we found 31 di�erences for 18 of 20 features,

in at least 9 features in each of our analyses.

Though we see di�erent developments many times, most of the time those

observed di�erences do not match to our expectation. Even 5 times they seem

to oppose our expectation and there are just 2 observations that match to

them.

Our initial motivation to use fronts from non-dominated sorting to combine

both performance measures was to observe e�ects that could not be found using

single performance measures. After these 3 analyses, we see that the selected

groups for the 3rd analysis - that we can see in Figure 5.9, Sub�gure (0) - could

not have been selected by the single measure approach on this data. For R, we

combined elements with the smallest values according to accuracy error which

were selected for R in our �rst analyses with elements with the smallest values

according to precision error which were selected for R in the second analyses.

For B it is more like an intersection, i.e., selecting elements that just appear

in both sets, of the Bs that we selected in the �rst two analyses. Also, the

observations that we made appear to us like a combination of both measures

as for 12 of 16 cases, we see in Table 5.2 that if we observed a di�erence in the

1st or the 2nd analyses then we also observed a di�erence in the 3rd analyses.

Yet, what we observed in the 3rd analysis is also di�erent because we see the

most cases of observed di�erences that oppose our expectations. In addition

for 2 features, we saw di�erences that we did not see in the analyses that were

based on the single performance measures.

68


Table 5.2: Overview of throw features with observed di�erences in develop-

ment - We de�ned a good and a bad group according to di�erent

strategies. When we observed a di�erence between the groups ac-

cording to a certain strategy (column) in a certain feature (row)

then we mark it here with a x. In addition, we marked di�erences

that �t or oppose our expectations respectively with (+) or (-).

sub�gure feature accuracy-based precision-based front-based

see section 5.3.1 see section 5.3.2 see section 5.3.3

1 d_t1_t2 x x x (-)

2 d_t1_t3 x x

3 d_t1_t4 x x

4 d_t1_t5 x

5 d_t2_t3 x x (-)

6 d_t2_t4 x x x (-)

7 d_t2_t5 x x (-)

8 d_t3_t4 x x

9 d_t3_t5 x (+) x

11 d_v1_v2 x

12 d_v1_v3 x

13 d_v1_v4 x

14 d_v2_v3 x (-)

15 d_v2_v4 x x

16 d_v3_v4 x

17 s12 x

18 s23 x x (-)

20 s45 x (+) x

69

6 Discussion

In this chapter, we will �rst interpret whether our method and results ful�ll

our goals and expectations. Afterward, we will give answers to our research

questions. At last, we will show ideas for future work to improve and extend

this work, and to �nd answers to related questions.

First, we like to recap this thesis. In Chapter 1, we said that we need to

learn movements for everyday activities. Based on the example of playing

darts, we planned to explore the relation between performance and the way

of throwing and named our research questions accordingly. What are the

di�erences between better and worse players? What are the di�erences between

players that improve more than others?

In Chapter 2, we followed the concept of variety destroying variety. Based on

this, we expected that better players show a bigger variety in their throws and

that players that improve more also show a bigger increase in the variety of

their throws. In addition, we decided to use multiple complementary measures

to quantify performance because we expected to make additional �ndings.

To test our expectations, we wanted to quantify throw performance and throw

physiology, as well as their developments over time, and to oppose them in

order to explore their relation. Additionally, we wanted to combine multiple

performance measures using non-dominated sorting in our analyses.

In Chapter 4, we proposed accuracy-error and precision-error to quantify per-

formance. For quantifying the physiology of throwing, we proposed to describe

the pattern of single throws regarding the time series of the norm of accelerat-

ing forces using throw features and to describe the variations of this pattern for

a set of throws by determining median and IQR for every throw features (cf.

Figure 6.1). Applying those measures to experimental data, we identi�ed in

Chapter 5 di�erences in the distribution properties of throw features between

good and bad groups based the performance measures.

71

6 Discussion

Figure 6.1: Overview of our method to prepare the confronting visualizations

72

6.1 Our Results

6.1 Our Results

In this section, we will evaluate how our results match our expectations in

order to give an answer to our research questions. We expected to make ad-

ditional �ndings by using multiple performance measures and by combining

them using fronts from non-dominated sorting. In Chapter 5, we found an

unexpected triangle-like relation between our performance measures. In ad-

dition, we observed di�erences regarding the status of physiological measures

and their development that were only visible with this approach. We expected

to observe for better performing participants a higher variance in their throw

physiology. Yet, in Section 5.2, we saw that much more observed di�erences

were opposing this expectation showing better players to have a lower vari-

ance. Just for a few features that are related to the maximum deceleration

- namely d_v1_v4, d_v2_v4, d_v3_v4, and s34 -, we observed di�erences

that match those expectations. For stronger improving participants, we ex-

pected to observe a higher increase in variance of throw physiology. Again, the

observations that we made in section 5.3 majorly did not ful�ll this expecta-

tion. Mostly they did neither support nor oppose it. So generally, more of our

observations opposed those expectations regarding the variance in throw kine-

matics. Apart from the in�uences of our method that we discuss in the next

section, we see our expectations as a reason. They base on an understanding

in which we assume that the variance in throw physiology is exclusively coun-

tering perturbations. This way, we ignored variance in the way of throwing

that is not related to improvements in performance. We want to give an illus-

trating example. When we start shaking a participant who is throwing darts

then the variance in throwing will increase but we do not expect that the per-

formance will also improve. Quite the contrary, we expect the performance to

degrade due to this additional perturbations. This leads us back to concept

of [Latash2008] of good and bad variance. In this thesis, we were ignoring

that certain variance in throwing decreases performance which is also part of

the variance of the throwing physiology that we measure. To recap, Latash's

good and bad variance describe relations between the variety of a movement

and the variety of ful�lling its goal. Here, good variance represents a negative

correlation as more variance in movement leads to less variance in ful�lling

the goal. On the other hand, bad variance represents a positive correlation

as more variance in movement leads to more variance in ful�lling the goal.

[Boenke2018] subclassi�ed Latash's bad variance. They name the relation of

73

6 Discussion

good variance quality variability. The relation of bad variance is separated

into two types. Trivial variability is related to pursuing di�erent goals. If a

participant would alternatingly throw a dart towards a dartboard on a wall

and drop a dart on a dartboard in front of their feet then we would observe

more variety in their movement considering both movements a throw. The

second part of bad variance is noise which relates to unintended disturbances

of the movement. Shaking a participant during a throw increases the variance

of the throwing but also decreases the performance. When we try to explore

those correlations, we can measure the variance of the movement as we did in

this thesis. Yet, this variance is compound from the variance of all three in-

�uences. As also the strength of these in�uences is unknown, the in�uences of

triavial and noise variability may have superimposed the in�uences of quality

variability. So, with our method, we may have observed the e�ect of trivial and

noise variability instead of the in�uence of quality variability that we wanted

to observe.

Based on the results of this thesis, we can answer our research questions like

this.

• What are the di�erences between good and bad dart players?

� For the participants in our experiment, we saw that better partici-

pants generally showed less variance in the kinematic properties of

their throws.

• What are di�erences between participants that improved more and par-

ticipants that improved less?

� For the participants in our experiment, we saw mostly no clear

di�erence according to the development in the variance of kinematic

properties of their throws.

6.2 Our Method

In this section, we will evaluate whether our method ful�lls our goals and point

out in which parts we see strengths and weaknesses in our approach.

We will brie�y summarize our method, �rst. We started with a dart throw

experiment which we used to collect data for the physiology of throws and

74

6.2 Our Method

their outcome. Based on this, we quanti�ed - on the one hand - performance

according to two measures. On the other hand, we quanti�ed physiology by

focussing on the norm of the accelerating forces from the kinematic data, by

�tting an acceleration model in order to extract throw features, by quantifying

distribution properties of those features using IQR and median. We quan-

ti�ed the development in performance measures and physiological measures

using the di�erences from the �rst to later blocks. We opposed performance

and physiology by selecting good and bad groups according to single or both

performance measure and visualizing these groups for every throw feature.

Similarly, we opposed development in performance and development in physi-

ological measures by selecting such groups and visualizing them regarding the

distribution measures of throw features. Based on these visualizations, we

identi�ed di�erences regarding our expectations.

We see factors in our method that may in�uence the results to be less reliable

and less convincing.

First, we want to emphasize the in�uence of the variance of participants from

the experiment. We aggregate the throws block-wise to compute our perfor-

mance and physiological-related measures. Then, we analyze the compound

dataset including all 5 blocks for the status analyses and the 4 di�erences for

each participant to later blocks in the analyses of the development. In Figure

6.2, we visualized again the values for accuracy-error and precision-error as we

already did before in Figure 5.1, Sub�gure (2) and in Figure 5.6, Sub�gure (0).

Yet, this time, we also connected for every participant the data points in the

order of the blocks to a polyline. We see in the lower left of the diagram the

polylines for three participants with low values according to both measures.

They do not intersect each other and lay separated from the polylines for the

rest of the participants. Comparing those three to the other polylines, we see

that their points lay closer to each other. This matches the concept of the law

of practice which states that under practice the performance changes less and

less over time (cf. [Edwards2011]). The three advanced participants - with

presumably more practice in their lives - show smaller changes in the perfor-

mance measures between blocks. The participants with worse performance

measure values - presumably beginners - show bigger changes in comparison of

subsequent blocks which makes the points of their polylines lay further apart.

Based on these measures, we select in our method the good and bad groups

for opposing physiological throw properties visually. Since the points of those

advanced participants lay closer together, it is more probable that more points

75

6 Discussion

0 2 4 6 8 10accuracy-error

0.0

2.5

5.0

7.5

10.0

12.5

15.0

17.5

prec

ision

-erro

r

Figure 6.2: Traces of block performance - After aggregating the performance

measures for every block of every participant we connected for ev-

ery participant the data points in block order to a polyline. The

start and respectively Block 1 is marked with a dot.

76

6.2 Our Method

of the same participant get selected for a good group. As the points for the

beginners are more scattered, it is more probable that the points of many dif-

ferent participants get selected for the bad groups. To exaggerate, we could

imagine that for the good group we select only the points of a single partici-

pant and compare this group in our analyses to a bad group where each point

belongs to a di�erent participant. So, to a certain extent, we generalize for the

bad performances more than for the good performances.

Furthermore, we see in Figure 6.2 that the majority of participants does not

reach the proximity of the initial performance of the participants in the lower

left. So, there are at least two classes of performance levels in the data in our

experiment. This leads us to the open question of whether our participants

are actually comparable using our method.

We see as another in�uence the errors that we introduce in our method. We

have measuring errors during the recording of the experiment data. We intro-

duce errors while �tting the model due to the nature of �tting a model but

also due to our model not �tting to all occurring cases. Lastly, we manually

identify the di�erences which is a subjective process.

We use a model that describes the throw process regarding the strength of

the acceleration. We expect two short and strong accelerations - �rst, an

acceleration phase to accelerate the dart to an appropriate velocity and second

a deceleration phase to stop the throwing arm again. We model this using a

piecewise linear model with two subsequent peaks. In our experiment, we

measured the accelerating forces at the wrist of the throwing arm according

to three axes. We simpli�ed this 3-dimensional time series data by computing

the norm of the accelerating forces. Then we �tted for every throw our model

to the time-series of the norm.

So, we �tted our model about the strength of acceleration to data about the

strength of accelerating forces. One in�uence that occurs as accelerating force

but which we did not model is the gravity. In Section 3.3, we already observed

the e�ect of gravity to lift the baseline of the accelerating forces in rest phases.

We could compensate this for our amplitude features by using the di�erence

to the baseline for characterizing, e.g., the height of the �rst or the second

peak. Yet, there is another in�uence which appears to displace certain form

features. In Figure 6.3, we see again a time-series of the norm of the accel-

erating forces for a throw as a black line and the �tted model as a red line.

Along the horizontal axis, we see the time in milliseconds in an interval with a

77

6 Discussion

length of 2000 milliseconds. Along the vertical axis, we depicted the strength.

Additionally, we can see the intermediate versions of the �tted model with

dotted lines. According to the data as well as the model, we can see a baseline

at a strength of about 10. We clearly see the two peaks for the acceleration

phases. Yet, we also observe a stronger mismatch between model and data

between the two peaks and even stronger after both peaks or in other words

at the time 0. Here, we see that the data falls below the baseline. Our model

cannot follow because it assumes the baseline to be the minimum. We suppose

Figure 6.3: Annihilation of forces - Between both peaks and after the second

peak at time 0 the norm of the measured forces fall far below the

baseline which the model cannot represent.

that this is an e�ect of measuring accelerating forces from the throw process

together with gravity. With our sensor, we measure the sum of accelerating

forces over time according to three axes. One accelerating in�uence is gravity

that is globally pointing downwards and static with a �xed strength. In phases

in which the sensor rests or in phases of weak accelerations and slow velocity -

e.g., when the participants slowly lift the arm to the initial throwing position -

it is the strongest force that superimposes others. For this reason, we observe

in those phases a baseline that is lifted to the static value for the strength

of gravity. Yet, in phases of strong acceleration and deceleration, the related

forces superimpose the e�ect of gravity. For this reason, we observe our two

78

6.2 Our Method

peaks. Yet, both forces are often not additive. For example, at the end of

the phase of deceleration, some participants are moving their throwing hand

almost vertically. They decelerate by carrying out a decelerating force that is

directed upward. The sensor measures the sum of these force vectors. While

gravity is pointing downwards, decelerating forces with a similar strength are

pointing up. The summed vector that a�ects the sensor is shorter - or in other

words, has a smaller strength - than gravity. In other words, those forces an-

nihilate each other. For the norm of the accelerating forces at that moment it

means that the norm has a value beneath the baseline which is what we can

see in Figure 6.3. This annihilation depends on the directions of the occurring

forces. Thus, it is speci�c to the throw pattern of participants. So it may

occur for some participants and may not occur for others. The consequences

are that the shape of the time-series contains unexpected features which our

model cannot handle. For this reason, certain form features can deviate from

their intended position and our model is less robust to be �tted to the data,

which may, later on, in�uence the distribution properties by displacing median

and increasing IQR. These form features are the basis for the throw features

and in�uence therefore further analyses.

Another case in which the �tting of our model is less robust occurs when both

peaks for the acceleration phases are badly separated. In Figure 6.4 and Figure

6.5 we have the data and the �tted models for two subsequent throws of the

same participant. The course of curves for the data in both �gures is similar

with a small peak for the acceleration phase at time -100 ms, a big peak for

the deceleration phase at time 0 ms, and an unclear transition between both.

While both time-series are similar, the �tted models vary in this interval. In

the following throw which we can see in Figure 6.6, the �rst peak could not be

identi�ed. Though our measures - median and IQR - are more robust against

outliers, if the frequency of mis�ts is too high then this will a�ect median and

IQR on these features presumably shifting median and increasing IQR.

Lastly, in our analyses, we manually identi�ed the di�erences between good

and bad groups according to the diagrams that we created. Yet, this is also a

subjective and sequential process in which we may change our subjective crite-

ria for identifying a di�erence. On time we may have accepted a visual pattern

as a di�erence. Another time, we may have ignored a similar visual pattern.

We see that there are multiple in�uences that may reduce the reliabliity of the

results of our method. Yet, their extent is unknown so far.

79

6 Discussion

Figure 6.4: 40th throw - The model (red) could not be �tted to capture the

valley between both peaks in the data (black).

Figure 6.5: 41st throw - The model (red) could be �tted to capture the valley

between both peaks in the data (black).

80

6.3 Future Work

Figure 6.6: 42nd throw - The model (red) could not be �tted to capture the

�rst peak for the acceleration phase and the valley towards the

peak for the deceleration phase in the data (black).

Nevertheless, we managed to base our analyses on relevant data from a dart

experiment in which we recorded data regarding performance and physiology

of throws. Based on this data, we managed to make multichannel time-series

of throws comparable by capturing kinematic features of throws. As we in-

tended, we found and applied a way to quantify performance and physiology

regarding darts-game, as well as the development in those measures. We also

managed to apply multiple performance measures in a combined way in our

analyses. Lastly, by assessing the relation between performance-related and

physiological measures, we were able to identify the di�erences between high

and low performing respectively strongly and weakly improving participants

which we could compare against our expectations.

6.3 Future Work

In this last section, we like to connect this thesis to future research by pro-

viding ideas for future work for improving and extending our method, and to

answer related questions. First, how may we improve our method to answer

81

6 Discussion

the questions apart from the examination and handling of the problems that

we mentioned in Section 6.2?

To seperate the compound variances for the di�erent types of variability in

order to explore the quality variability, [Boenke2018] propose to focus solely

on throws that ful�ll the goal in order to avoid trivial variability and noise

in order to keep the e�ect of the movement stable. We see another option

to explore those correlations by exploring the development over time. We

assume that trivial variability as well as noise variability decrease over time

while quality variability increases over time. As participants practice, they �nd

more options to ful�ll the goal. Applying more di�erent movements that reach

the goal increase the variance in the movement that we observe. Regarding

noise variability, participants may get better in avoiding noise. They may

start to time their movements with breathing and the heartbeat. This will

decrease the variance of noise variability over time towards a certain minimum.

Regarding trivial variability, participants may get better over time to focus

on the given goal. Also here, the variance in the movement will decrease

over time to a certain minimum. Following the idea that trivial variability

and noise variability decrease over time and only quality variability increases

over time, we should see for participants that decrease in trivial and noise

variability initially faster than they increase in quality variability a certain

v-shape of the development. As the decrease in trivial and noise variability

superimpose in the sum, the sum initially decreases as well. Over time the

changes in both variabilities get smaller until the increase in variance for quality

variability superimposes and the sum of variances start to increase. This is

just an example development. In Figure 6.7 we depicted for every participant

the development of the IQR of the feature d_t1_t2 over the 5 blocks of our

experiment. The red lines mark participants who show in the �fth block a

higher IQR than in the �rst Block. We see for several participants a decrease

towards the second or third block. We also see for some participants an increase

in towards the 4th and �fth block. This is a tempting approach which leaves the

question of how these developments relate to the development in performance.

We decided in this thesis to exclude the data that we recorded in the second

phase of the execution of our experiment from further analyses for several

reasons like mismatching scales and artifacts from recording. We could try to

solve these problems to extend the dataset. We may then use the additional

data to validate the di�erences that we observed here.

82

6.3 Future Work

B1 B2 B3 B4 B5

20406080

100120140160180

d_t1

_t2_

iqr

Figure 6.7: Development of IQR for d_t1_t2 over time

In order to reduce the errors that we introduce by abstracting the throw time-

series data, we may search for a better model and improve the �tting process

itself. Our model based on a piecewise linear function. Eventually, we could

model the acceleration phases as bell curves or similar curves.

We could try to answer the research questions in di�erent ways.

Following the idea of [Hulikanthe Math2017], we could explore the tendencies

following a front from non-dominated sorting and compare the tendencies be-

tween good and bad fronts. This could provide us with hints regarding the

in�uence of single performance criteria.

From the physiological data that we collected in our experiment, we focused

solely on the strength of the accelerating forces. A di�erent approach could be

to explore in a similar way other measures like the rotational velocity. Next,

we could also focus on multi-dimensional data. We could focus on the multi-

dimensional kinematic data eventually de�ning a multidimensional model or

reconstructing and analyzing the trajectories of throwing movements. Another

multidimensional measure that we collected are the electrical activities from

83

6 Discussion

EEG. With them, we could explore the relation between brain activity and

performance or between brain activity and the variance in physiological throw

measures.

In this thesis, we decided to apply di�erences to quantify development in per-

formance measures and physiological measures. Though there is also the option

of ratios as quanti�cation for changes, we are curious about a multidimensional

option. In this thesis, we already noticed the e�ect that is described by the

law of practice. Depending on the progress of practice which is related to

the performance the acquired improvements get smaller and smaller. To ex-

tract whether a participant improved much or few in comparison to others

and in relation to the initial performance, we could determine the fronts from

non-dominated sorting based on the performance and the development in per-

formance.

In order to identify di�erences, we opposed good and bad groups which we se-

lected based on the performance measures and the fronts from non-dominated

sorting. We can think of other selection strategies. For development in per-

formance measures, we have the concept of improvement and degradation.

According to this, we could also oppose elements that improve with elements

that degrade. So far, we followed the idea of selecting according to performance

and observe the di�erences in the physiological measures. Another option is

to select groups according to physiological measures and observe their di�er-

ences according to performance measures. Summing up, future work can be

done regarding di�erent ways of quantifying physiology and performance, and

other strategies to select groups for comparison to answer the same research

questions.

From this work, we also met related questions.

Assuming that participants do not improve anymore, e.g., because they already

reached some kind of personal maximum performance, we wonder if there is a

tradeo� between our two performance measures. This is related to our next

question.

In Section 5.1.1, we observed the triangle-like relation between our performance

measures. We were able to show that our method does not evoke this pattern

in general. So, what creates this pattern? Some of our considerations suspect

some kind of feedback mechanism. With low precision, the hits of a participant

are scattered all over the dartboard. This makes it harder to access a bias like

84

6.3 Future Work

our accuracy-error compared to the case of densely concentrated hits for a

participant with high precision. In other words, it may be easier to perceive

the extent of the accuracy-error and so to counter it when the precision-error

is smaller. This may relate to the batch size that we use to aggregate our

measures. We �xed our batch size to 100 which represents a block. Assuming

that we would aggregate just 10 throws instead of 100, we suppose that we

observe performance beneath the diagonal of the triangle that we observed.

So, how does the batch size in�uence our results?

Since we considered the relation between performance and physiological mea-

sures, as well as the relation between the development of performance and

physiological measures, a slightly di�erent question would be the relation be-

tween initial status and development. So, how do good or bad performing

participants change over time?

Lastly, We manually identi�ed the di�erences between good and bad groups.

We spend quite some time to do this. In addition, the results are a�ected by

our subjective judgment. To improve this we wonder how we could automatize

this process which may include the interfront and intrafront analyses.

85

Bibliography

[Ashby1958] W. R. Ashby. Requisite variety and its implications for the con-

trol of complex systems. Cybernetica, (1:2):83�99, 1958.

[Bernstein1967] N. Bernstein. The co-ordination and regulation of move-

ments. Pergamon Press, Oxford, 1. engl. ed. edition, 1967.

[Boenke2018] Lars T. Boenke. Reframing variability: From nuisance to an aid

to understand complex systems dynamics, Talk presented in the McIntosh

Lab, Rotman Research Institute - Baycrest Centre, Toronto, Canada, 2018.

[Cheng et al.2015] Ming-Yang Cheng, Chiao-Ling Hung, Chung-Ju Huang,

Yu-Kai Chang, Li-Chuan Lo, Cheng Shen, and Tsung-Min Hung. Expert-

novice di�erences in smr activity during dart throwing. Biological psychol-

ogy, 110:212�218, 2015.

[Deb2011] Kalyanmoy Deb. Multi-objective optimisation using evolutionary

algorithms: An introduction. In Multi-objective evolutionary optimisation

for product design and manufacturing, pages 3�34. Springer, London [u.a.],

2011.

[Edwards2011] William H. Edwards. An introduction to motor learning and

motor control. Wadsworth Cengage Learning, Belmont CA, 1. ed., interna-

tional ed. edition, 2011.

[Günnemann et al.2010] Stephan Günnemann, Ines Färber, Hardy Kremer,

and Thomas Seidl. Coda. Proceedings of the VLDB Endowment, 3(1-

2):1633�1636, 2010.

[Heylighen and Joslyn2001] Francis Heylighen and Cli� Joslyn. Cybernetics

and second-order cybernetics. Encyclopedia of physical science & technology,

4:155�170, 2001.

[Hulikanthe Math2017] Nithish Hulikanthe Math. Multi-Objective Success

Factor Analysis for IT based Startups. Master thesis, Otto-von-Guericke-

University, Magdeburg, 2017.

87

Bibliography

[Latash2008] Mark L. Latash. Synergy. Oxford University Press, Oxford and

New York, 2008.

[Lee et al.2014] S. Lee, Y. S. Kim, C. S. Park, K. G. Kim, Y. H. Lee, H. S.

Gong, H. J. Lee, and G. H. Baek. Ct-based three-dimensional kinematic

comparison of dart-throwing motion between wrists with malunited distal

radius and contralateral normal wrists. Clinical radiology, 69(5):462�467,

2014.

[Levin2014] Andreas Levin. Analyzing startup performance: How methods of

machine learning and multi-criteria optimization may help to uncover suc-

cessful startups. Master thesis, Karlsruhe Institute of Technology, Karlsruhe,

2014.

[Preim et al.2009] Bernhard Preim, Ste�en Oeltze, Matej Mlejnek, Eduard

Gröeller, Anja Hennemuth, and Sarah Behrens. Survey of the visual explo-

ration and analysis of perfusion data. IEEE transactions on visualization

and computer graphics, 15(2):205�220, 2009.

[Ranganathan2004] Ananth Ranganathan. The levenberg-marquardt algo-

rithm. Tutorial on LM algorithm, 11(1):101�110, 2004.

[Seidler et al.2017] Rachael D. Seidler, Brittany S. Gluskin, and Brian Gree-

ley. Right prefrontal cortex transcranial direct current stimulation enhances

multi-day savings in sensorimotor adaptation. Journal of neurophysiology,

117(1):429�435, 2017.

[van Beers et al.2013] Robert J. van Beers, Yor van der Meer, and Richard M.

Veerman. What autocorrelation tells us about motor variability: insights

from dart throwing. PLoS ONE, 8(5):e64332, 2013.

[Vrugt et al.2007] Jasper A. Vrugt, Jelmer van Belle, and Willem Bouten.

Pareto front analysis of �ight time and energy use in long-distance bird

migration. Journal of Avian Biology, 38(4):432�442, 2007.

[Weber and Doppelmayr2016] E. Weber and M. Doppelmayr. Kinesthetic

motor imagery training modulates frontal midline theta during imagination

of a dart throw. International journal of psychophysiology : o�cial journal

of the International Organization of Psychophysiology, 110:137�145, 2016.

88

Declaration of Authorship

I hereby declare that this thesis was created by me and me alone using only

the stated sources and tools.

Thomas Hennig Magdeburg, November 20, 2018

Variability Analysis based on Multi-Objective Performance ...und+Bachelor_Arbeiten/MasterThe… ·...

Documents

Transcript of Variability Analysis based on Multi-Objective Performance ...und+Bachelor_Arbeiten/MasterThe… ·...