Journal of Theoretical and Applied Computer Science
Vol. 6, No. 3, 2012
QCA & CQCA: QUAD COUNTRIES ALGORITHM AND CHAOTIC QUAD COUNTRIES ALGORITHM
M. A. Soltani-Sarvestani, Shahriar Lotfi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
EFFECTIVENESS OF MINI-MODELS METHOD WHEN DATA MODELLING WITHIN A 2D-SPACE IN AN
INFORMATION DEFICIENCY SITUATION
Marcin Pietrzykowski . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
SMARTMONITOR: RECENT PROGRESS IN THE DEVELOPMENT OF AN INNOVATIVE VISUAL
SURVEILLANCE SYSTEM
Dariusz Frejlichowski, Katarzyna Gościewska, Paweł Forczmański, Adam Nowosielski,
Radosław Hofman . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
NONLINEARITY OF HUMAN MULTI-CRITERIA IN DECISION-MAKING
Andrzej Piegat, Wojciech Sałabun . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
METHOD OF NON-FUNCTIONAL REQUIREMENTS BALANCING DURING SERVICE DEVELOPMENT
Larisa Globa, Tatiana Kot, Andrei Reverchuk, Alexander Schill . . . . . . . . . . . . . . . . . . . . . . . 50
DONOR LIMITED HOT DECK IMPUTATION: EFFECTS ON PARAMETER ESTIMATION
Dieter William Joenssen, Udo Bankhofer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Journal of Theoretical and Applied Computer Science
Scientific quarterly of the Polish Academy of Sciences, The Gdańsk Branch, Computer Science Commission
Scientific advisory board:
Chairman:
Prof. Henryk Krawczyk, Corresponding Member of Polish Academy of Sciences,
Gdansk University of Technology, Poland
Members:
Prof. Michał Białko, Member of Polish Academy of Sciences, Koszalin University of Technology, Poland
Prof. Aurélio Campilho, University of Porto, Portugal
Prof. Ran Canetti, School of Computer Science, Tel Aviv University, Israel
Prof. Gisella Facchinetti, Università del Salento, Italy
Prof. André Gagalowicz, The National Institute for Research in Computer Science and Control (INRIA), France
Prof. Constantin Gaindric, Corresponding Member of Academy of Sciences of Moldova, Institute of Mathematics and Computer
Science, Republic of Moldova
Prof. Georg Gottlob, University of Oxford, United Kingdom
Prof. Edwin R. Hancock, University of York, United Kingdom
Prof. Jan Helmke, Hochschule Wismar, University of Applied Sciences, Technology, Business and Design, Wismar, Germany
Prof. Janusz Kacprzyk, Member of Polish Academy of Sciences, Systems Research Institute, Polish Academy of Sciences, Poland
Prof. Mohamed Kamel, University of Waterloo, Canada
Prof. Marc van Kreveld, Utrecht University, The Netherlands
Prof. Richard J. Lipton, Georgia Institute of Technology, USA
Prof. Jan Madey, University of Warsaw, Poland
Prof. Kirk Pruhs, University of Pittsburgh, USA
Prof. Elisabeth Rakus-Andersson, Blekinge Institute of Technology, Karlskrona, Sweden
Prof. Leszek Rutkowski, Corresponding Member of Polish Academy of Sciences, Czestochowa University of Technology, Poland
Prof. Ali Selamat, Universiti Teknologi Malaysia (UTM), Malaysia
Prof. Stergios Stergiopoulos, University of Toronto, Canada
Prof. Colin Stirling, University of Edinburgh, United Kingdom
Prof. Maciej M. Sysło, University of Wrocław, Poland
Prof. Jan Węglarz, Member of Polish Academy of Sciences, Poznan University of Technology, Poland
Prof. Antoni Wiliński, West Pomeranian University of Technology, Szczecin, Poland
Prof. Michal Zábovský, University of Zilina, Slovakia
Prof. Quan Min Zhu, University of the West of England (UWE), Bristol, United Kingdom
Editorial board:
Editor-in-chief:
Dariusz Frejlichowski, West Pomeranian University of Technology, Szczecin, Poland
Managing editor:
Piotr Czapiewski, West Pomeranian University of Technology, Szczecin, Poland
Section editors:
Michaela Chocholata, University of Economics in Bratislava, Slovakia
Piotr Dziurzański, West Pomeranian University of Technology, Szczecin, Poland
Paweł Forczmański, West Pomeranian University of Technology, Szczecin, Poland
Przemysław Klęsk, West Pomeranian University of Technology, Szczecin, Poland
Radosław Mantiuk, West Pomeranian University of Technology, Szczecin, Poland
Jerzy Pejaś, West Pomeranian University of Technology, Szczecin, Poland
Izabela Rejer, West Pomeranian University of Technology, Szczecin, Poland
ISSN 2299-2634
The on-line edition of JTACS can be found at: http://www.jtacs.org. The printed edition is to be considered the primary one.
Publisher:
Polish Academy of Sciences, The Gdańsk Branch, Computer Science Commission
Address: Waryńskiego 17, 71-310 Szczecin, Poland
http://www.jtacs.org, email: [email protected]
Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 3-20
ISSN 2299-2634 http://www.jtacs.org
QCA & CQCA: Quad Countries Algorithm and Chaotic
Quad Countries Algorithm
M. A. Soltani-Sarvestani¹, Shahriar Lotfi²
1 Computer Engineering Department, University College of Nabi Akram, Tabriz, Iran
2 Computer Science Department, University of Tabriz, Tabriz, Iran
[email protected], [email protected]
Abstract: This paper introduces an improved evolutionary algorithm based on the Imperialist Competitive Algorithm (ICA), called the Quad Countries Algorithm (QCA), and, with a small change, the Chaotic Quad Countries Algorithm (CQCA). The Imperialist Competitive Algorithm is inspired by the socio-political process of imperialistic competition in the real world and has shown reliable performance in optimization problems. This algorithm converges quickly, but is easily stuck in a local optimum when solving high-dimensional optimization problems. In the ICA, the countries are classified into two groups, Imperialists and Colonies, where the Imperialists absorb the Colonies. In the proposed algorithm, two other kinds of countries, namely Independent and Seeking Independence countries, are added to the collection of countries, which helps to increase exploration. In the suggested algorithm, Seeking Independence countries move in the direction opposite to the Imperialists, and Independent countries move arbitrarily; in this paper, two different movements are considered for this group: random movement (QCA) and Chaotic movement (CQCA). On the other hand, in the ICA the Imperialists' positions are fixed, while in the proposed algorithm, Imperialists move if they can reach a better position than their previous one. The proposed algorithm was tested on well-known benchmarks, and comparing the results of the QCA and CQCA with those of the ICA, the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), the Particle Swarm inspired Evolutionary Algorithm (PS-EA) and the Artificial Bee Colony (ABC) shows that the QCA performs better than all the mentioned algorithms. Across all cases, the QCA, ABC and PSO perform best in about 50%, 41.66% and 8.33% of cases, respectively.
Keywords: Optimization, Imperialist Competitive Algorithm (ICA), Independent country, Seeking Inde-
pendent country, Quad Countries Algorithm (QCA) and Chaotic Quad Countries Algorithm
(CQCA).
1. Introduction
Evolutionary algorithms (EA) [1, 2] are algorithms inspired by nature, with many applications to solving NP problems in various fields of science. Some famous evolutionary algorithms proposed for optimization problems are the Genetic Algorithm (GA) [2, 3, 4], first proposed by Holland in 1962 [3], and the Particle Swarm Optimization algorithm (PSO) [5], first proposed by Kennedy and Eberhart in 1995 [5]. In 2007, Atashpaz and Lucas proposed an algorithm known as the Imperialist Competitive Algorithm (ICA) [6, 7], inspired by a socio-human phenomenon. Since 2007, several attempts have been made to increase the efficiency of the ICA. In 2009, Zhang, Wang and Peng proposed an approach based on the concept of small probability perturbation to enhance the movement of colonies toward the imperialist [8]. In 2010, Faez, Bahrami and Abdechiri proposed a new method using the
chaos theory to adjust the angle of colonies movement toward the Imperialist’s positions
(CICA: Imperialist Competitive Algorithm using Chaos Theory for Optimization) [9], and in
another paper in the same year, they proposed another algorithm that applies the probability
density function to adapt the angle of colonies movement towards imperialist’s position
dynamically, during iterations (AICA: Adaptive Imperialist Competitive Algorithm) [10].
In the Imperialist Competitive Algorithm (ICA), there are only two different types of countries: Imperialists, and the Colonies that the Imperialists absorb. In the real world, however, there are also Independent countries, which are neither Imperialists nor Colonies. Some of the Independent countries are at peace with the Imperialists, while others challenge the Imperialists to secure their independence. In the ICA, only the Colonies' movements toward the Imperialists are considered, while in the real world each Imperialist also moves in order to promote its political and cultural position. In the Quad Countries Algorithm (QCA) and the Chaotic Quad Countries Algorithm (CQCA), countries are divided into four categories, Imperialist, Colony, Seeking Independence and Independent, each with its own kind of movement. In the QCA and CQCA, as in the real world, an Imperialist will move if the move advances it to a better position than its current one.
The rest of this paper is arranged as follows. Section two discusses related work. Section three presents a brief description of the Imperialist Competitive Algorithm. Section four explains the proposed algorithm. In section five, the results are analyzed and the performance of the algorithms is evaluated. In section six, a conclusion is presented.
2. Related Works
In 2009, Zhang, Wang and Peng [8] noted that the original approach in the Imperialist Competitive Algorithm becomes difficult to implement in practice as the dimension of the search space increases, due to the ambiguous definition of the "random angle" in the optimization process. Compared to the original algorithm, their approach, based on the concept of small probability perturbation, is simpler to implement, especially when solving high-dimensional optimization problems. Furthermore, their algorithm has been extended to constrained optimization problems, using a classical penalty technique to handle constraints.
In 2010, Faez, Bahrami and Abdechiri [9] introduced a new Imperialist Competitive Algorithm using chaotic maps (CICA). In their algorithm, the chaotic maps were used to adapt the angle of the colonies' movement towards the imperialist's position, to enhance the capability of escaping from a local optimum trap.
In the same year, Faez, Bahrami and Abdechiri [10] introduced an algorithm in which the Absorption Policy is changed dynamically to adapt the angle of the colonies' movement towards the imperialist's position. They noted that the ICA is easily stuck in a local optimum when solving high-dimensional multi-modal numerical optimization problems. To overcome this shortcoming, they used a probabilistic model that utilizes the information of the colonies' positions to balance the exploration and exploitation abilities of the imperialistic competitive algorithm. Using this mechanism, the exploration capability of the ICA was enhanced.
3. The Imperialist Competitive Algorithm (ICA)
The Imperialist Competitive Algorithm (ICA) was proposed for the first time by Atashpaz and Lucas in 2007 [6]. The ICA is an evolutionary algorithm in the Evolutionary Computation (EC) field, based on human socio-political evolution. The algorithm starts with an initial random population called countries; some of the best countries in the population are selected to be the imperialists, and the rest form the colonies of these imperialists, divided between them according to imperial power. In an N_var-dimensional optimization problem, a country is a 1×N_var array, defined as:

country = [p_1, p_2, ..., p_Nvar].   (1)

The cost of a country is found by evaluating the cost function f at the variables (p_1, p_2, ..., p_Nvar):

c_i = f(country_i) = f(p_1, p_2, ..., p_Nvar).   (2)
The algorithm starts with N_pop initial countries, of which the N_imp most powerful are chosen as imperialists. The remaining countries are colonies, each belonging to an empire according to its power. To distribute the colonies among the imperialists proportionally, the normalized cost of an imperialist is defined as

C_n = c_n − max_i{c_i},   (3)

where c_n is the cost of the nth imperialist and C_n is its normalized cost. An imperialist with a higher cost value has a lower normalized cost value. Having the normalized cost, the normalized power of each imperialist is calculated as below and, based on this, the colonies are distributed among the imperialist countries:

P_n = | C_n / Σ_{i=1}^{N_imp} C_i |,   (4)

where P_n is the normalized power of an imperialist. On the other hand, the normalized power of an imperialist is assessed by its colonies. Then, the initial number of colonies of an empire will be

NC_n = round{P_n · N_col},   (5)

where NC_n is the initial number of colonies of the nth empire and N_col is the number of all colonies.
To distribute the colonies among the imperialists, NC_n of the colonies are selected randomly and assigned to their imperialist. The imperialist countries absorb the colonies towards themselves using the absorption policy. The absorption policy forms the main core of this algorithm and causes the countries to move towards their minimum optima; this policy is shown in Fig. 1. In the absorption policy, the colony moves towards the imperialist by x units. The direction of movement is the vector from the colony to the imperialist, as shown in Fig. 1. In this figure, the distance between the imperialist and the colony is denoted by d, and x is a random variable with uniform distribution:

x ~ U(0, β×d),   (6)

where β is a number greater than 1 and close to 2; in [6] it is mentioned that β = 2 can be a proper choice. In the ICA, to search different points around the imperialist, a random amount of deviation is added to the direction of the colony's movement towards the imperialist. In Fig. 1, this deflection angle is shown as θ, which is chosen randomly with a uniform distribution:

θ ~ U(−γ, γ).   (7)
While moving toward the imperialist countries, a colony may reach a better position, so
the colony position changes according to the position of the imperialist.
Figure 1. Moving colonies toward their imperialist [6]
The imperialists absorb these colonies towards themselves with respect to their power, as described in (8). The total power of each imperialist is determined by the power of both of its parts: the empire's power plus a percentage of its average colonies' power:

TC_n = cost(imperialist_n) + ξ · mean{cost(colonies of empire_n)},   (8)

where TC_n is the total cost of the nth empire and ξ is a positive number which is considered to be less than one. In the ICA, the imperialistic competition has an important role. During the imperialistic competition, weak empires lose their power and their colonies. To model this competition, first the probability of possessing all the colonies is calculated for each empire, considering its total cost:

NTC_n = TC_n − max_i{TC_i},   (9)

where TC_n is the total cost of the nth empire and NTC_n is the normalized total cost of the nth empire. Having the normalized total cost, the possession probability of each empire is calculated as below:

p_{p_n} = | NTC_n / Σ_{i=1}^{N_imp} NTC_i |.   (10)
After a while all the empires except the most powerful one will collapse and all the colo-
nies will be under the control of this unique empire.
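The absorption policy described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the deflection of equation (7) is applied here as a per-dimension perturbation, whereas [6] defines θ as an angle of the movement direction.

```python
import numpy as np

def absorb(colony, imperialist, beta=2.0, gamma=np.pi / 4, rng=None):
    """Move a colony toward its imperialist: step length x ~ U(0, beta*d), eq. (6),
    plus a random deviation theta ~ U(-gamma, gamma) per dimension, eq. (7)."""
    rng = rng or np.random.default_rng()
    direction = imperialist - colony                   # vector from colony to imperialist
    d = np.linalg.norm(direction)                      # distance d between the two
    x = rng.uniform(0.0, beta * d)                     # eq. (6): step length
    theta = rng.uniform(-gamma, gamma, colony.shape)   # eq. (7): random deviation
    return colony + x * direction / (d + 1e-12) + theta
```

With β = 2 a colony can overshoot its imperialist, which matches the behaviour intended by equation (6).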
4. Quad Countries Algorithm (QCA)
In this paper, a new Imperialist Competitive Algorithm, called the Quad Countries Algorithm, is proposed, in which two new categories of countries are added to the collection of countries: Independent and Seeking Independence countries. In addition, in the new algorithm Imperialists can also move like the other countries. In the main ICA, there are only two categories of countries, Imperialist and Colony, and the only movement is the Colonies' movement towards the Imperialists, so the primary ICA may fall into a local minimum trap during the search process and end up far from the global optimum. In the proposed algorithm, there are four categories of countries with different movements. With these changes to the ICA, a new algorithm called the QCA was made, whose power of exploration of the search space substantially increases, preventing it from sticking in local traps.
4.1. Independent Country
In the real world, there are always countries which are neither Colonies nor Imperialists. These countries may perform any movement in order to gain advantage and try to improve their current situation. In the proposed algorithm, some countries are defined as Independent countries, which explore the search space randomly. As illustrated in Fig. 2, if during the search process an Independent country reaches a better position than an Imperialist, they exchange their positions: the Independent country becomes a new Imperialist and takes over the old Imperialist's Colonies, while the old Imperialist becomes an Independent country and starts to explore the search space like these kinds of countries.
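The position exchange of Fig. 2 amounts to a simple conditional swap of roles. A minimal sketch, where `cost` stands for any cost function (lower is better):

```python
def exchange_if_better(imperialist, independent, cost):
    """If an Independent country found a lower-cost position than an Imperialist,
    swap their roles: the Independent becomes the Imperialist (inheriting the old
    Imperialist's colonies), and the old Imperialist becomes an Independent."""
    if cost(independent) < cost(imperialist):
        return independent, imperialist  # new (imperialist, independent) pair
    return imperialist, independent
```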
As mentioned, the Independent countries can perform any movement in the algorithm, and their movements are arbitrary. In this paper, two different kinds of movement are considered for the Independent countries. The first is a completely random movement: the Independent countries move completely randomly in different directions, independently from each other; this variant is named the QCA. In the second kind of movement, these countries move based on Chaos Theory; this variant is named the CQCA and is explained in the next part.
4.1.1. Definition of Chaotic movement for Independent Countries (CQCA)
In this approach, the Independent countries move according to Chaos Theory. In this
kind of movement, the angle of movement is changed in a Chaotic way during the search
process.
Figure 2. Replacing an Empire with an Independent
This Chaotic behaviour of the Independent countries' movements in the CQCA creates the proper conditions for the algorithm to explore more and escape from local peaks; we introduce this approach as the Chaotic Quad Countries Algorithm (CQCA). Chaos variables are usually generated by some well-known Chaotic maps [11, 12]. Table 1 shows some of the Chaotic maps for adjusting the θ parameter (the angle of the Independent countries' movement).
Table 1. Chaotic maps

CM1: θ_{n+1} = α·θ_n·(1 − θ_n)
CM2: θ_{n+1} = α·θ_n²·sin(π·θ_n)
CM3: θ_{n+1} = (θ_n + b − (α/2π)·sin(2π·θ_n)) mod(1)

In Table 1, α is a control parameter and θ is the chaotic variable in the kth iteration, which belongs to the interval (0, 1). During the search process, no value of θ is repeated.
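CM1 in Table 1 is the logistic map; iterating it yields a deterministic but non-repeating sequence of angle values in (0, 1). A small sketch, assuming the common control value α = 4 (the paper does not fix α):

```python
def chaotic_angles(theta0=0.7, alpha=4.0, n=5):
    """Iterate CM1, theta_{n+1} = alpha * theta_n * (1 - theta_n),
    producing n chaotic values in (0, 1) for the movement angle."""
    theta, out = theta0, []
    for _ in range(n):
        theta = alpha * theta * (1.0 - theta)
        out.append(theta)
    return out
```

With α = 4 and a non-trivial seed, the sequence stays inside (0, 1) while never settling into a cycle, which is what makes it usable as an angle generator.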
4.2. Seeking Independence Countries
Seeking Independence countries are countries which have conflicts with the Imperialists and try to stay away from them. In the main ICA, the only movement is the Colonies' movement toward the Imperialists; in fact, there is only an Absorption policy. By defining the Seeking Independence countries in the proposed algorithm, there is also a Repulsion policy alongside the Absorption policy. Fig. 3 illustrates the Repulsion policy.
a) Absorption policy
b) Absorption and Repulsion policy
Figure 3. Different movement policies
As can be seen in Fig. 3.a, there is only the Absorption policy, which matches the ICA. As it shows, using only the Absorption policy causes the countries' positions to get closer to each other and the space they cover to decrease gradually, so the global optimum might be lost; in Fig. 3.a the algorithm is converging to a local optimum. Fig. 3.b illustrates the process of the proposed algorithm. The black squares represent the Seeking Independence countries, and as can be seen, these countries can steer the search process in a direction which the other countries do not cover. It shows that using the Absorption and Repulsion policies together leads to a better coverage of the search space.
To apply the Repulsion policy in the QCA, first the sum of the differences between the Seeking Independence country's position and the Imperialists' positions is calculated as a 1×N vector named Center, as in (11):

Center_i = Σ_{j=1}^{N_imp} (a_i − p_ji),  i = 1, 2, ..., N,   (11)

where Center_i is the ith component of the sum over all Imperialists, p_ji is the ith component of the jth Imperialist, a_i is the ith component of the Seeking Independence country, and N indicates the problem dimension. Then the Seeking Independence country moves in the direction of the obtained vector, as in (12):

D = δ × Center,  δ ∈ (0, 1),   (12)

where δ is a relocation factor and D is the relocation vector, whose components are added component-wise to the Seeking Independence country's components to obtain its new position.
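Equations (11) and (12) translate directly into array operations. A minimal sketch, assuming the Imperialists' positions are stacked row-wise and taking δ = 0.5 (the paper only requires δ ∈ (0, 1)):

```python
import numpy as np

def repel(seeker, imperialists, delta=0.5):
    """Repulsion policy: eq. (11) sums the seeker's differences from every
    imperialist into the Center vector; eq. (12) moves the seeker by delta*Center."""
    center = np.sum(seeker - imperialists, axis=0)  # eq. (11): one term per imperialist
    return seeker + delta * center                  # eq. (12): added component-wise
```

With a single imperialist the seeker moves directly away from it; with imperialists placed symmetrically around the seeker, the repulsion terms cancel.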
4.3. Imperialists Movement
In the real world, all countries, including Imperialists, make ongoing efforts to improve their current situation. In the main ICA, however, Imperialists never move, and this fixed situation sometimes leads to the loss of the global optimum or prevents reaching better solutions. Fig. 4 illustrates this problem clearly: it could be a final state of running the ICA, when only one Imperialist has remained. Since in the ICA Imperialists have no motion, solution 1 is the answer that the ICA returns. In the proposed approach, a random movement is assumed for each Imperialist in each iteration, and the cost of this hypothetical position is calculated. If the cost of the new position is less than the cost of the previous one, the Imperialist moves to the new position; otherwise it does not move. As can be seen in Fig. 4, using this method leads to solution 2, which is a better solution than solution 1.
Figure 4. A final state of ICA and QCA
To apply this policy in the QCA, first random values, equal in number to the problem dimensions, are generated as in (13):

α_i = Rand × I,   (13)

where I is an arbitrary value that depends on the problem size. Then the new position of the Imperialist is obtained as in (14):

(P_1, ..., P_Nvar) = (P_1 + α_1, ..., P_Nvar + α_Nvar)   if f(P_1 + α_1, ..., P_Nvar + α_Nvar) < f(P_1, ..., P_Nvar),
(P_1, ..., P_Nvar) = (P_1, ..., P_Nvar)   otherwise,   (14)

where the α_i are the numbers obtained in Equation (13) and P_i is the value of the ith dimension of a country. In fact, equation (14) states that if the new position of the Imperialist is better than its current position, the Imperialist is transferred to the new position; otherwise it remains in its current position.
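The greedy imperialist move of equations (13) and (14) can be sketched as below. `Rand` is taken here as U(−1, 1) per dimension so the imperialist can move in any direction; this is an assumption, since the paper does not specify the range of Rand:

```python
import numpy as np

def move_imperialist(position, f, I=0.1, rng=None):
    """Eq. (13): draw a random step alpha_i = Rand * I per dimension.
    Eq. (14): keep the new position only if it has a lower cost f."""
    rng = rng or np.random.default_rng()
    alpha = rng.uniform(-1.0, 1.0, position.shape) * I   # eq. (13); U(-1,1) assumed
    candidate = position + alpha
    return candidate if f(candidate) < f(position) else position  # eq. (14)
```

Because a step is only accepted when it lowers the cost, the imperialist's cost is non-increasing over iterations, which is exactly the behaviour argued for with Fig. 4.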
According to the parts above on Seeking Independence and Independent countries, their actions in the algorithm are now specified. By adding these policies and actions, a new algorithm is generated, called the Quad Countries Algorithm (QCA); by additionally defining a Chaotic movement for the Independent countries, another algorithm is generated, named the Chaotic Quad Countries Algorithm (CQCA). Both have better performance compared to the ICA.
5. Evaluation and Experimental Results
In this paper, two new algorithms based on the Imperialist Competitive Algorithm (ICA), called the Quad Countries Algorithm (QCA) and the Chaotic Quad Countries Algorithm (CQCA), are introduced; they were applied to some well-known benchmarks in order to verify their performance and compare them to the ICA. These benchmark functions are presented in Table 2.
The simulation was designed to evaluate the rate of convergence and the quality of the proposed algorithms' optima in comparison to the ICA, with all the benchmarks tested for minimization. All algorithms were applied under identical conditions in 2, 10, 30 and 50 dimensions. The number of countries in each algorithm was 125: 10 Imperialists and 115 Colonies in the ICA, and 10 Imperialists, 80 Colonies, 18 Seeking Independence countries and 17 Independent countries in the QCA and CQCA. Each algorithm was run 100 times, with 1000 generations per run, and the averages over these runs are recorded in Table 3.
Table 2. Benchmarks for simulation

Benchmark | Mathematical Representation | Range
Ackley | f(x) = −20·exp(−0.2·sqrt((1/D)·Σ_{i=1}^{D} x_i²)) − exp((1/D)·Σ_{i=1}^{D} cos(2π·x_i)) + 20 + e | [−32.768, 32.768]
Griewank | f(x) = (1/4000)·Σ_{i=1}^{D} x_i² − Π_{i=1}^{D} cos(x_i/√i) + 1 | [−600, 600]
Rastrigin | f(x) = Σ_{i=1}^{D} (x_i² − 10·cos(2π·x_i)) + 10·D | [−15, 15]
Sphere | f(x) = Σ_{i=1}^{D} x_i² | [−600, 600]
Rosenbrock | f(x) = Σ_{i=1}^{D−1} (100·(x_{i+1} − x_i²)² + (x_i − 1)²) | [−15, 15]
Symmetric Griewank | f(x) = −(1/4000)·Σ_{i=1}^{D} x_i² − Π_{i=1}^{D} cos(x_i/√i) + 1 | [−600, 600]
Symmetric Rastrigin | f(x) = −Σ_{i=1}^{D} (x_i² − 10·cos(2π·x_i) + 10) | [−600, 600]
Symmetric Sphere | f(x) = −Σ_{i=1}^{D} x_i² | [−600, 600]
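Two of the benchmarks in Table 2 can be written compactly as follows; a sketch for checking the tabulated optima (both functions are zero at the origin):

```python
import numpy as np

def sphere(x):
    """Sphere benchmark: f(x) = sum_i x_i^2, global minimum 0 at the origin."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))

def rastrigin(x):
    """Rastrigin benchmark: f(x) = sum_i (x_i^2 - 10*cos(2*pi*x_i)) + 10*D."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x)) + 10.0 * x.size)
```

The symmetric ("inverse") variants in Table 2 negate the sums, which moves the minima to the corners of the search range; for the 2-dimensional inverse Sphere over [−600, 600] this gives −(600² + 600²) = −7.2E+5, matching the optima listed in Table 3.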
Experiments started with the Griewank Inverse function. Griewank Inverse is a hill-like function whose global optima are located in the corners of the search space. All algorithms were applied 100 times under identical conditions, with randomly selected inputs. Fig. 5 illustrates the averaged results of the 100 runs of Griewank Inverse in different dimensions, with 1000 generations per run.
In Figures 5.a, 6.a, 7.a and 8.a, the horizontal axis indicates the run number; these graphs show the result obtained in each run by each algorithm. In Figures 5.b, 6.b, 7.b and 8.b, the horizontal axis indicates the generation number; these graphs illustrate the convergence of the algorithms. As mentioned, two different kinds of motion are defined for the Independent countries, Chaotic and random, named CQCA and QCA respectively, so there are three curves in all the graphs in these Figures: ICA, QCA and CQCA.
Figure 5.a illustrates the results of 100 runs of the algorithms on Griewank Inverse with two dimensions. In 79 of the 100 runs, the QCA and CQCA achieve better results than the ICA. As can be seen in Figures 6.a, 7.a and 8.a, when the function's dimension is increased to 10, 30 and 50 respectively, the QCA and CQCA achieve better results than the ICA in all 100 runs. Figures 5.b, 6.b, 7.b and 8.b illustrate the average convergence of the algorithms; as can be seen, in addition to the quality of the results, the convergence of the QCA and CQCA is also faster than that of the ICA. As the problem's dimension increases, the performance of the ICA decreases, while the QCA and CQCA maintain their performance. It is worth noting that the results of the two kinds of Independent countries' movement are so close to each other that their curves coincide.
The observed results of applying the algorithms to the rest of the benchmarks in Table 2 were approximately similar to those on Griewank Inverse, and the results are shown in Table 3. Table 3 includes 14 columns; from left to right: the 1st column indicates the benchmark's name, the 2nd the range of the function's parameters, the 3rd the function's dimension, and the 4th the optimum of the benchmark. The 5th column gives the best result obtained by the QCA, and the 8th and 11th columns give the best results of the CQCA and the ICA, respectively. The 6th, 9th and 12th columns give the averages of the results over 100 runs of the QCA, CQCA and ICA, respectively, and the 7th, 10th and 13th columns give their standard deviations (SD). The 14th column indicates the rate of improvement of the QCA in comparison to the ICA.
As can be seen, the QCA and CQCA results are better than those of the ICA in all cases except the Schwefel, where all algorithms achieve the same results. The recorded results in Table 3 show that, as the problem dimension increases, the performance of the QCA and CQCA improves relative to the ICA. The results of the QCA and CQCA are remarkably close to each other. Each function in Table 3 was run 100 times, with up to 1000 generations per run, using the same inputs, in 2, 10, 30 and 50 dimensions.
In a further comparison, the results are compared to the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), PS-EA and the Artificial Bee Colony (ABC) in Table 4. As can be seen, the results of the proposed algorithm are better than GA and PS-EA in 100 percent of cases. In the comparison among the QCA, ABC and PSO, the conditions are different: in 50 percent of cases the QCA has the best performance (the best results are highlighted in Table 4), while the ABC and PSO perform best in 41.66 and 8.33 percent of cases, respectively. However, there is a doubt about the ABC results. As can be observed in all results, increasing the problem dimension decreases an algorithm's performance, so the result obtained for a function with higher dimensions should naturally be equal to or bigger than that for the same function with lower dimensions. Considering the Griewank results in Table 4, it can be observed that the ABC acted inversely in this case: the result of applying the algorithm to the 30-dimensional function is smaller than that for the 10-dimensional one, which seems to be a mistake. If this paradox is considered a mistake, the performance of the QCA, PSO and ABC changes to 58.33, 16.66 and 25 percent.
5.a. Stability of ICA, QCA and CQCA
5.b. Convergence of ICA, QCA and CQCA
Figure 5. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 2 Dimensions
6.a. Stability of ICA, QCA and CQCA
6.b. Convergence of ICA, QCA and CQCA
Figure 6. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 10 Dimensions
7.a. Stability of ICA, QCA and CQCA
7.b. Convergence of ICA, QCA and CQCA
Figure 7. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 30 Dimensions
8.a. Stability of ICA, QCA and CQCA
8.b. Convergence of ICA, QCA and CQCA
Figure 8. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 50 Dimensions
Table 3. The results of applying benchmarks on the QCA, CQCA and the ICA with 2, 10, 30 and 50 dimensions. Each entry gives Best Result / Mean / SD; Imp. denotes the improvement of the proposed algorithms relative to the ICA.

Sphere, range [-600, 600]:
  Dim 2, opt. 0: QCA 1.6384E-26 / 7.4682E-20 / 2.7799E-19 | CQCA 1.1889E-26 / 2.6167E-19 / 2.0530E-18 | ICA 2.0568E-20 / 1.371E-10 / 1.1761E-9 | Imp. ≈100%
  Dim 10, opt. 0: QCA 4.6801E-15 / 1.8719E-11 / 3.9881E-11 | CQCA 1.7152E-14 / 3.1369E-11 / 6.4424E-11 | ICA 2.5493E-12 / 3.0484E-8 / 6.445E-8 | Imp. 99.94%
  Dim 30, opt. 0: QCA 2.559E-9 / 7.1833E-7 / 2.2583E-6 | CQCA 3.7622E-9 / 5.2950E-7 / 1.2766E-6 | ICA 1.0972E-6 / 3.2491E-5 / 3.6956E-5 | Imp. 98.37%
  Dim 50, opt. 0: QCA 7.3234E-7 / 3.9662E-5 / 1.0098E-4 | CQCA 6.6159E-7 / 2.8669E-5 / 5.3403E-5 | ICA 2.6172E-4 / 0.0031 / 0.003 | Imp. 99.07%

Sphere Inv., range [-600, 600]:
  Dim 2, opt. -7.2E+5: QCA -7.2E+5 / -7.2E+5 / 0.2526 | CQCA -7.2E+5 / -7.2E+5 / 0.2536 | ICA -7.2E+5 / -7.1998E+5 / 14.8687 | Imp. 0.003%
  Dim 10, opt. -3.6E+6: QCA -3.5995E+6 / -3.5983E+6 / 783.7695 | CQCA -3.5994E+6 / -3.5981E+6 / 847.4918 | ICA -3.5821E+6 / -3.5689E+6 / 6.2142E+3 | Imp. 0.82%
  Dim 30, opt. -1.08E+7: QCA -1.0761E+7 / -1.0734E+7 / 1.4222E+4 | CQCA -1.0759E+7 / -1.0731E+7 / 1.4091E+4 | ICA -1.0506E+7 / -1.0358E+7 / 5.4485E+4 | Imp. 3.63%
  Dim 50, opt. -1.8E+7: QCA -1.7866E+7 / -1.7755E+7 / 4.0419E+4 | CQCA -1.7859E+7 / -1.7756E+7 / 3.8404E+4 | ICA -1.6950E+7 / -1.6706E+7 / 9.4520E+4 | Imp. 6.29%

Rastrigin, range [-15, 15]:
  Dim 2, opt. 0: QCA 0 / 0 / 0 | CQCA 0 / 0 / 0 | ICA 0 / 1.1358E-13 / 6.652E-13 | Imp. 100%
  Dim 10, opt. 0: QCA 0 / 1.3269E-14 / 4.0851E-14 | CQCA 0 / 1.6129E-14 / 3.8613E-14 | ICA 4.464E-12 / 5.5944E-9 / 1.5154E-8 | Imp. 99.99%
  Dim 30, opt. 0: QCA 3.6981E-10 / 1.4274E-8 / 2.4467E-8 | CQCA 2.6805E-10 / 1.5004E-7 / 1.3607E-6 | ICA 1.6195E-4 / 0.3899 / 0.5083 | Imp. ≈100%
  Dim 50, opt. 0: QCA 7.5566E-7 / 0.0599 / 0.2362 | CQCA 1.054E-6 / 0.0203 / 0.1393 | ICA 1.0452 / 5.3211 / 1.7154 | Imp. 99.62%

Rastrigin Inv., range [-600, 600]:
  Dim 2, opt. -7.2E+5: QCA -7.2E+5 / -7.2E+5 / 0.3104 | CQCA -7.2E+5 / -7.2E+5 / 0.6883 | ICA -7.2E+5 / -7.1999E+5 / 13.0936 | Imp. 0.002%
  Dim 10, opt. -3.6E+6: QCA -3.5995E+6 / -3.5983E+6 / 906.4164 | CQCA -3.5956E+6 / -3.5983E+6 / 833.7654 | ICA -3.5861E+6 / -3.5692E+6 / 6.4272E+3 | Imp. 0.82%
  Dim 30, opt. -1.08E+7: QCA -1.0767E+7 / -1.0732E+7 / 1.3192E+4 | CQCA -1.0756E+7 / -1.0732E+7 / 1.3110E+4 | ICA -1.0486E+7 / -1.0348E+7 / 4.7617E+4 | Imp. 3.71%
  Dim 50, opt. -1.8E+7: QCA -1.7830E+7 / -1.7757E+7 / 3.4901E+4 | CQCA -1.7844E+7 / -1.7753E+7 / 4.0589E+4 | ICA -1.7019E+7 / -1.6707E+7 / 1.0234E+5 | Imp. 6.28%

Griewank, range [-600, 600]:
  Dim 2, opt. 0: QCA 0 / 0 / 0 | CQCA 0 / 0 / 0 | ICA 0 / 3.7356E-13 / 2.7586E-12 | Imp. 100%
  Dim 10, opt. 0: QCA 8.9106E-13 / 9.3103E-9 / 2.2949E-8 | CQCA 1.3518E-11 / 1.2774E-8 / 6.6488E-8 | ICA 1.4433E-9 / 6.8886E-6 / 1.9415E-5 | Imp. 99.86%
  Dim 30, opt. 0: QCA 4.8241E-4 / 0.0144 / 0.0220 | CQCA 4.3573E-4 / 0.0155 / 0.0224 | ICA 0.0040 / 0.0721 / 0.0522 | Imp. 80.03%
  Dim 50, opt. 0: QCA 0.0747 / 0.3832 / 37.9352 | CQCA 0.1251 / 0.35 / 34.6526 | ICA 0.1402 / 0.4227 / 41.8484 | Imp. 17.2%

Griewank Inv., range [-600, 600]:
  Dim 2, opt. -180.0121: QCA -179.0827 / -178.9388 / 0.0877 | CQCA -179.0931 / -178.9193 / 0.1022 | ICA -179.0774 / -178.8674 / 0.0832 | Imp. 0.04%
  Dim 10, opt. -901: QCA -898.5777 / -897.6309 / 0.5023 | CQCA -898.515 / -897.6836 / 0.5302 | ICA -893.9284 / -890.7051 / 1.6002 | Imp. 0.79%
  Dim 30, opt. -2.701E+3: QCA -2.6883E+3 / -2.6812E+3 / 3.6252 | CQCA -2.6887E+3 / -2.6817E+3 / 3.3025 | ICA -2.6205E+3 / -2.5886E+3 / 10.4053 | Imp. 3.6%
  Dim 50, opt. -4501: QCA -4.4599E+3 / -4.439E+3 / 9.2071 | CQCA -4.4597E+3 / -4.4401E+3 / 9.6358 | ICA -4.2549E+3 / -4.1779E+3 / 28.4462 | Imp. 6.28%

Ackley, range [-32.768, 32.768]:
  Dim 2, opt. 0: QCA 8.8818E-16 / 8.4754E-13 / 2.0295E-12 | CQCA 4.4409E-15 / 7.6213E-13 / 1.7966E-12 | ICA 3.4195E-13 / 5.2040E-8 / 4.5546E-7 | Imp. 99.99%
  Dim 10, opt. 0: QCA 1.2632E-9 / 2.5552E-8 / 4.8522E-8 | CQCA 2.3995E-10 / 2.4881E-8 / 4.2524E-8 | ICA 1.4476E-7 / 2.4681E-6 / 5.4074E-6 | Imp. 98.99%
  Dim 30, opt. 0: QCA 1.0459E-6 / 4.3273E-6 / 2.5904E-6 | CQCA 1.0613E-6 / 4.2235E-6 / 2.0663E-6 | ICA 3.9508E-5 / 1.7145E-4 / 1.0189E-4 | Imp. 97.54%
  Dim 50, opt. 0: QCA 3.7308E-5 / 9.9126E-5 / 4.1411E-5 | CQCA 3.3848E-5 / 9.0414E-5 / 3.2841E-5 | ICA 4.5648E-4 / 0.0014 / 7.0191E-4 | Imp. 93.54%

Schwefel, range [-500, 500]:
  Dims 2, 10, 30 and 50, opt. 0: QCA, CQCA and ICA all reached 0 / 0 / 0 | Imp. 0
Table 4. The result of GA, PSO, PS-EA, ABC, ICA, QCA and CQCA; each entry gives Mean (SD).

Griewank:
  D = 10: GA [14] 0.05023 (0.02952) | PSO [14] 0.07939 (0.033451) | PS-EA [14] 0.222366 (0.0781) | ABC [13] 0.00087 (0.002535) | ICA 6.889E-6 (1.941E-5) | QCA 9.31E-9 (2.295E-8) | CQCA 1.277E-8 (6.649E-8)
  D = 20: GA 1.0139 (0.02697) | PSO 0.03056 (0.025419) | PS-EA 0.59036 (0.2030) | ABC 2.01E-08 (6.76E-08) | ICA 0.0052 (0.0079) | QCA 1.206E-4 (1.989E-4) | CQCA 1.753E-4 (4.092E-4)
  D = 30: GA 1.2342 (0.11045) | PSO 0.01115 (0.014209) | PS-EA 0.8211 (0.1394) | ABC 2.87E-09 (8.45E-10) | ICA 0.0721 (0.0522) | QCA 0.0144 (0.0220) | CQCA 0.0155 (0.0224)

Rastrigin:
  D = 10: GA 1.3928 (0.76319) | PSO 2.6559 (1.3896) | PS-EA 0.43404 (0.2551) | ABC 0 (0) | ICA 5.594E-9 (1.515E-8) | QCA 1.33E-14 (4.09E-14) | CQCA 1.61E-14 (3.86E-14)
  D = 20: GA 6.0309 (1.4537) | PSO 12.059 (3.3216) | PS-EA 1.8135 (0.2551) | ABC 1.45E-08 (5.06E-08) | ICA 2.154E-4 (0.0016) | QCA 3.31E-11 (6.26E-11) | CQCA 6.11E-11 (1.61E-10)
  D = 30: GA 10.4388 (2.6386) | PSO 32.476 (6.9521) | PS-EA 3.0527 (0.9985) | ABC 0.033874 (0.181557) | ICA 0.3899 (0.5083) | QCA 1.427E-8 (2.447E-8) | CQCA 1.5E-7 (1.361E-6)

Ackley:
  D = 10: GA 0.59267 (0.22482) | PSO 9.85E-13 (9.62E-13) | PS-EA 0.19209 (0.1951) | ABC 7.8E-11 (1.16E-09) | ICA 2.468E-6 (5.407E-6) | QCA 2.555E-8 (4.852E-8) | CQCA 2.488E-8 (4.252E-8)
  D = 20: GA 0.92413 (0.22599) | PSO 1.178E-6 (1.5842E-6) | PS-EA 0.32321 (0.097353) | ABC 1.6E-11 (1.9E-11) | ICA 3.033E-5 (1.916E-5) | QCA 4.719E-7 (3.782E-7) | CQCA 4.311E-7 (3.544E-7)
  D = 30: GA 1.0989 (0.24956) | PSO 1.492E-6 (1.8612E-6) | PS-EA 0.3771 (0.098762) | ABC 3E-12 (5E-12) | ICA 1.715E-4 (1.019E-4) | QCA 4.327E-6 (2.59E-6) | CQCA 4.224E-6 (2.066E-6)

Schwefel:
  D = 10: GA 1.9519 (1.3044) | PSO 161.87 (144.16) | PS-EA 0.32037 (1.6185) | ABC 1.27E-09 (4E-12) | ICA 0 (0) | QCA 0 (0) | CQCA 0 (0)
  D = 20: GA 7.285 (2.9971) | PSO 543.07 (360.22) | PS-EA 1.4984 (0.84612) | ABC 19.83971 (45.12342) | ICA 0 (0) | QCA 0 (0) | CQCA 0 (0)
  D = 30: GA 13.5346 (4.9534) | PSO 990.77 (581.14) | PS-EA 3.272 (1.6185) | ABC 146.8568 (82.3144) | ICA 0 (0) | QCA 0 (0) | CQCA 0 (0)
6. Conclusions
In this paper, two improved imperialist algorithms are introduced which are called re-
spectively the Quad Countries Algorithm (QCA) and the Chaotic Quad Countries Algorithm
(CQCA). In the QCA and CQCA, we define four categories of countries including Imperial-
ist, Colony, Independent, and Seeking Independent country so that each group of countries
has special motion and moves differently compared to the others. The difference between
QCA and CQCA is related to the Independent countries' movement. In the QCA, Independent countries move completely randomly, while in the CQCA they move according to chaotic maps. In
the primary ICA there are only two categories, Colony and Imperialist, and the only motion
is the Colonies’ movement toward Imperialists which is applied through Absorption policy.
Whereas by adding Independent countries in the QCA, a new policy, called the Repulsion policy, is also added. The empirical results, obtained by applying the proposed algorithms to several well-known benchmarks, indicate that the quality of the global optimum solutions and the speed of convergence towards the optima have increased remarkably in the proposed algorithms in comparison with the primary ICA. The experiments clearly show that when the ICA gets stuck in a local optimum trap, the QCA and CQCA still find the global optima. In cases where the ICA found a solution near the global optimum, the QCA and CQCA discovered a solution equal to or better than the ICA's. As the problem dimensionality increases, the performance of the QCA and CQCA improves considerably compared to that of the ICA. In the comparison of the QCA and CQCA with GA, PSO, PS-EA and ABC, it was observed that the proposed algorithms perform better than GA and PS-EA in 100 percent of cases; in comparison with ABC and PSO, the QCA performs better in 50 percent of cases, while ABC and PSO perform better in about 41.66 and 8.33 percent of cases, respectively. Overall, the performed experiments showed that the QCA and CQCA have considerably better performance than the primary ICA, as well as other evolutionary algorithms such as GA, PSO, PS-EA and ABC.
The Quad Countries Algorithm (QCA) has a proper performance to solve optimization
problems, but by changing the countries’ movements and defining new movement policies
its performance will increase. In fact, by defining new movement policies both the ability of
exploration and algorithm performance will increase.
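The four movement policies summarised above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the assimilation coefficient BETA, the uniform random steps for the QCA variant and the use of the logistic map as the chaotic map for the CQCA variant are all assumptions made for the sketch.

```python
import random

BETA = 2.0  # assimilation coefficient (assumed, as commonly used in the primary ICA)

def absorb(colony, imperialist):
    """Absorption policy: a colony moves toward its imperialist."""
    return [c + BETA * random.random() * (i - c)
            for c, i in zip(colony, imperialist)]

def repulse(colony, imperialist):
    """Repulsion policy: a seeking-independent country moves away from its imperialist."""
    return [c - BETA * random.random() * (i - c)
            for c, i in zip(colony, imperialist)]

def move_independent_qca(country, low, high):
    """QCA: an independent country moves completely at random within the bounds."""
    return [random.uniform(low, high) for _ in country]

def logistic_map(z, r=4.0):
    """Logistic map, used here as an example of a chaotic sequence generator."""
    return r * z * (1.0 - z)

def move_independent_cqca(country, low, high, z):
    """CQCA: an independent country moves according to a chaotic map.
    Returns the new position and the updated chaotic state z."""
    new_pos = []
    for _ in country:
        z = logistic_map(z)
        new_pos.append(low + z * (high - low))  # map z in (0, 1) into the search range
    return new_pos, z
```

The CQCA variant replaces only the random-number source of the independent countries; all other country categories move exactly as in the QCA.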
References
[1] Sarimveis H., Nikolakopoulos A.: A Line Up Evolutionary Algorithm for Solving Nonlinear Constrained Optimization Problems. Computers & Operations Research, 32(6), pp. 1499-1514 (2005)
[2] Mühlenbein H., Schomisch M., Born J.: The Parallel Genetic Algorithm as Function Optimizer. Proceedings of the Fourth International Conference on Genetic Algorithms, University of California, San Diego, pp. 270-278 (1991)
[3] Holland J. H.: ECHO: Explorations of Evolution in a Miniature World. In: Farmer
J. D., Doyne J., editors, Proceedings of the Second Conference on Artificial Life (1990)
[4] Mitchell M.: An Introduction to Genetic Algorithms. Cambridge, MA: MIT Press (1999)
[5] Kennedy J., Eberhart R.C.: Particle Swarm Optimization. In: Proceedings of IEEE, pp.
1942-1948 (1995)
[6] Atashpaz-Gargari E., Lucas C.: Imperialist Competitive Algorithm: An Algorithm for
Optimization Inspired by Imperialistic Competition. IEEE Congress on Evolutionary
Computation (CEC 2007), pp. 4661-4667 (2007)
[7] Atashpaz-Gargari E., Hashemzadeh F., Rajabioun R., Lucas C.: Colonial Competitive
Algorithm: A novel approach for PID controller design in MIMO distillation column
process. International Journal of Intelligent Computing and cybernetics (IJICC), Vol. 1
No. 3, pp. 337-355 (2008)
[8] Zhang Y., Wang Y., Peng C.: Improved Imperialist Competitive Algorithm for Con-
strained Optimization. International Forum on Computer Science-Technology and Ap-
plications (2009)
[9] Bahrami H., Faez K., Abdechiri M.: Imperialist Competitive Algorithm using Chaos Theory for Optimization (CICA). Proceedings of the 12th International Conference on Computer Modelling and Simulation (2010)
[10] Bahrami H., Faez K., Abdechiri M.: Adaptive Imperialist Competitive Algorithm (AICA). Proceedings of the 9th IEEE International Conference on Cognitive Informatics (ICCI'10) (2010)
[11] Karaboga D., Basturk B.: A powerful and efficient algorithm for numerical function
optimization: artificial bee colony (ABC) algorithm. Journal of Global Optimization,
vol. 39, issue 3, pp. 459-471 (2007)
[12] Srinivasan D., Seow T.H.: Particle Swarm Inspired Evolutionary Algorithm (PS-EA) for Multiobjective Optimization Problems. In: Proceedings of the Congress on Evolutionary Computation (CEC '03), 8-12 Dec. 2003, Canberra, Australia, vol. 4, pp. 2292-2297 (2003)
[13] Schuster H.G.: Deterministic Chaos: An Introduction. 2nd revised ed., Weinheim, Federal Republic of Germany: Physik-Verlag GmbH (1988)
[14] Zheng W.M.: Kneading plane of the circle map. Chaos, Solitons & Fractals, 4:1221
(1994)
[15] Soltani-Sarvestani M.A., Lotfi S., Ramezani F.: Quad Countries Algorithm (QCA). In:
Proc. of the 4th Asian Conference on Intelligent Information and Database Systems
(ACIIDS 2012), Part III, LNAI, pp. 119-129 (2012)
Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 21-27
ISSN 2299-2634 http://www.jtacs.org
Effectiveness of mini-models method when data
modelling within a 2D-space in an information deficiency
situation
Marcin Pietrzykowski
Faculty of Computer Science and Information Technology, West Pomeranian University of Technology,
Szczecin, Poland
Abstract: This paper examines the mini-models method and its effectiveness when data modelling in an
information deficiency situation. It also compares the effectiveness of mini-models with var-
ious methods of modelling such as neural networks, the KNN-method and polynomials. The
algorithm concentrates only on local query data and does not construct a global model dur-
ing the learning process when it is not necessary. It is characterized by a high efficacy and
a short calculation time. The article briefly describes the method by means of four variants:
linear heuristic, nonlinear heuristic, mini-models based on linear regression, and mini-
models based on polynomial approximation. The paper presents the results of experiments
that compare the effectiveness of mini-models with selected methods of modelling in an in-
formation deficiency situation.
Keywords: mini-models, modelling, parameter of minimum number of samples, leave one out error,
information gap
1. Introduction
The concept of mini-models method was developed by Piegat [1], [2]. In contrast to
most well-known methods of modelling such as neural networks, neuro-fuzzy networks and
polynomial approximation, the method does not create a global model when it is not neces-
sary [3]. Mini-models method, similarly to the method of k-nearest neighbours, operates
only on data from the local neighbourhood of a query [4], [5]. This is a consequence of the
fact that in the modelling process we are generally only interested in an answer to a specific query, such as: "What does the compressive strength of 28-day concrete amount to when that of cement amounts to 163 kg/m3, water to 180 kg/m3, coarse aggregate to 843 kg/m3 and fine aggregate to 746 kg/m3?" The answer to this question requires only the data "cement amounts to about 163 kg/m3, water to 180 kg/m3", etc. This approach frees us from
the time consuming process of creating a global model. Moreover, when a new sample is
acquired, the global model becomes outdated and re-learning is required. Mini-model methods calculate the answer at the query point "ad hoc", which allows them to work in situations where new data points are continuously being received. It is also possible to build a
global model in order to learn the value of a modelled variable across an entire domain. This
can be done very simply by adding together mini-models for subsequent query points.
The main aim of this paper is to compare the effectiveness of mini-models with selected
methods of modelling in information deficiency situations. The article only briefly describes
22 Marcin Pietrzykowski
the methods. Results of experiments on datasets that don't contain information gaps and the
details of mini-models method have been described more comprehensively in previous
works by the author [6], [7].
Mini-models in 2D-space form the basis for mini-models operating in spaces with a
greater number of dimensions. In 2D-space mini-models can take the form of a line segment
for linear models, or either a polynomial curve or an arc of a circle for nonlinear mini-
models. In 3D-space mini-models take the form of a polygon. In 4D-space mini-models take
the form of a polyhedron [1]. However, regardless of the dimensionality of the space in
which mini-models operate it is necessary to define the query point and the mini-model's
local neighbourhood. The query point is a set consisting of some independent variables with known values and a dependent variable with an unknown value. For example, for the simple query in a 2D-space: "What does the compressive strength of 28-day concrete amount to when that of cement amounts to 163 kg/m3?", the dependent variable y is the compressive strength and the independent variable x is the quantity of cement. The query point will therefore take the following form: x_q = 163, y_q = ?; or simply x_q = 163.
For proper operation of a mini-models method the query point and its local neighbour-
hood must first be defined. The local neighbourhood may take various forms that depend
on: the shape of the modelled data, the type of mini-model it is, and the location of the query point. The main parameter in defining the local neighbourhood is that of the minimum number of samples (n_min). This parameter is closely related to the mini-model's limit points. In 2D-space they form graphical representations of the end points of a line or curve segment. It is assumed that the parameter n_min satisfies the formula:

n_min ≥ d + 1, (1)

where d is the number of dimensions of the modelled space. In 2D-space, n_min ≥ 3. The parameter n_min can be defined globally for the entire domain or locally for either a selected range or a selected group of learning points. Unfortunately, there is no simple rule for choosing the optimal value of the parameter n_min for a particular problem. Finding the optimal value instead requires an extensive search through all possible values. Thus, the solution may locally adapt to the modelled data. A test method based on leave-one-out cross-validation is used in the process of testing the effectiveness of mini-models for selected values of the parameter n_min.
2. Details of the method
A mini-models method works on a training dataset D which consists of points P_i and is sorted in ascending order with respect to the variable x:

P_i = (x_i, y_i), (2)
D = {P_1, P_2, P_3, ...}, where x_1 ≤ x_2 ≤ x_3 ≤ ..., card(D) ≥ d + 1. (3)

The local neighbourhood of the query point x_q is defined by boundaries or limit points, i.e. the lower limit x_L and the upper limit x_U:

x_L, x_U ∈ R, (4)
x_q ∈ ⟨x_L; x_U⟩. (5)

We call S the set of points on which the mini-model operates:

S = {P_i ∈ D : x_i ∈ ⟨x_L; x_U⟩}, (6)
Effectiveness of mini-models method when data modelling… 23
card(S) ≥ n_min. (7)
There are two basic variants of the method: linear and nonlinear. As the name suggests line-
ar mini-models form the shape of a line segment in response to query point data, and non-
linear mini-models take the shape of a curve segment after the learning process has
completed.
2.1. Linear mini-models
The simplest linear mini-models are based on linear regression. The learning algorithm for these is as follows:
1. choose a set of points S_j that satisfies properties (4), (5), (6) and (7),
2. calculate the function f_j of the local mini-model using linear regression and the set S_j,
3. calculate the error E_j committed by the model f_j, using the following formula:

E_j = (1 / card(S_j)) · Σ_{P_i ∈ S_j} |f_j(x_i) − y_i|, (8)

4. repeat steps 1-3 until all combinations have been checked,
5. select the model f which yields the minimal value of the error E.
In order to gain a valid solution, an extensive search through all possible combinations is first required to define the local neighbourhood, while satisfying properties (4), (5), (6) and (7).
Note that the error E (8) is also the estimated value of the error that can be committed by the model during the process of calculating the answer to a query point. For example, for the error E = 0.09 and the answer y_q = 0.43 it is assumed that y_q = 0.43 ± 0.09. This estimation of the value of the error also applies to the other versions of mini-models presented later in this article.
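The exhaustive search of steps 1-5 can be sketched as follows. This is a minimal illustration assuming the mean absolute error as the error measure of formula (8) and distinct x values in the sorted data; the function names are hypothetical, not the author's code.

```python
def fit_line(points):
    """Least-squares line y = a*x + b through the given (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def mini_model(data, x_q, n_min=3):
    """data: list of (x, y) sorted by x. Enumerates every window of at least
    n_min consecutive points whose x-range contains the query x_q, fits a line,
    and keeps the window with the smallest error. Returns (answer, error E)."""
    best = None
    for lo in range(len(data)):
        for up in range(lo + n_min - 1, len(data)):  # enforce card(S) >= n_min
            S = data[lo:up + 1]
            if not (S[0][0] <= x_q <= S[-1][0]):     # query must lie inside (5)
                continue
            a, b = fit_line(S)
            E = sum(abs(a * x + b - y) for x, y in S) / len(S)  # formula (8)
            if best is None or E < best[0]:
                best = (E, a, b)
    E, a, b = best
    return a * x_q + b, E
```

Because the data are sorted, every admissible neighbourhood is a contiguous window, which is what makes the exhaustive search feasible.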
The second type of linear mini-model is trained heuristically. Unlike linear regression
mini-models, there is no problem in defining the local neighbourhood of a query point. The
neighbourhood is instead created “ad-hoc” during the training of the mini-model. Heuristic
learning is done by cyclic movement of the limits x_L and x_U along the x- and y-axis. When a change in the location of one limit point does not show any improvement in results, we change the location of the second point. We then repeat the whole operation again with the first limit point, and so on. This whole operation is repeated until the stop condition has been reached. Searching along the y-axis is done by "moving" the limit point by a value of ∆ in the desired direction. Searching along the x-axis is done in a similar way, but the limit point must take the value x_i of the nearest point P_i in the desired direction. The variable does not have to take any intermediate values, since this would not affect the number of points included in a mini-model and thus the error committed by that mini-model. We should remember that the limit points after each operation along the x-axis have to satisfy properties (4), (5), (6) and (7). After each shift of the limit points we calculate the equation of the mini-model based on the equation of a straight line passing through two points on a plane:

y = ((y_U − y_L) / (x_U − x_L)) · (x − x_L) + y_L, (9)
we then calculate the error E committed by the mini-model using (8). The mini-model having the smallest error value will become the output value for the next cycle of operations. In the end, we select the best model with the smallest value of the error E.
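One y-axis search step of this heuristic can be sketched as follows. The step value delta and the helper names are assumptions made for illustration, not the authors' implementation.

```python
def line_through(x_l, y_l, x_u, y_u):
    """Equation (9): the line passing through the two limit points."""
    return lambda x: (y_u - y_l) / (x_u - x_l) * (x - x_l) + y_l

def shift_limit_y(S, x_l, y_l, x_u, y_u, delta=0.05):
    """One y-axis search step: try moving the lower limit point up and down
    by delta and return whichever y position yields the smallest mean
    absolute error over the local set S."""
    def err(y):
        f = line_through(x_l, y, x_u, y_u)
        return sum(abs(f(x) - yi) for x, yi in S) / len(S)
    return min([y_l, y_l + delta, y_l - delta], key=err)
```

The full heuristic would alternate such steps between the two limit points, along both axes, until the stop condition is reached.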
2.2. Nonlinear mini-models
The first variant of nonlinear mini-models is based on polynomial approximation. This
type of mini-model works in a similar way to the linear equivalent. The only difference is
that a polynomial approximation of the second order is used instead of linear regression.
that a polynomial approximation of the second order is used instead of linear regression. There is no need to use a polynomial approximation of higher order, as this would only increase the complexity of the algorithm. A complex function shape can be modelled by a combination of a few mini-models, as long as each of them has a relatively simple shape.
The second variant of nonlinear mini-models is the heuristic mini-models. The initial
stage of these mini-models' learning process is the same as the learning process of heuristic
linear mini-models. After finding the best solution, the model takes the form of a circular
arc when represented graphically. This can curve either "up" or "down", depending on the type of the modelled data. The results of numerical experiments have shown that mini-models which are curved already during the process of determining the locations of the limit points achieve worse results than the mini-models presented above. Training mini-models with a higher number of degrees of freedom is more difficult, and such models often reach only local minima.
3. Experiments and results
In order to test the effectiveness of mini-models in an information deficiency situation
and to compare them with other commonly used methods of modelling, experiments were
performed on the following specially prepared data sets:
• a dataset containing an “information hole” with a width of 10% of the interval,
• a dataset containing an “information hole” with a width of 20% of the interval,
• a dataset with 30% random sample removal.
These experiments were performed with optimal values for all parameters, for all tested
methods. Two types of tests were performed. Firstly, the algorithms were tested using a test method based on leave-one-out cross-validation on the datasets with information loss. Secondly, the tested methods were trained on the datasets with information loss and their effectiveness was checked against the data sets consisting of the "lost data", i.e. the methods were tested with data from the "information holes" that were not involved in the learning process.
The experiments were conducted on 11 different data sets: Compressive Strength of 28-day
Concrete, Concrete Slump Test, Unemployment Rate in Poland, Housing Value Concerns in
the Suburbs of Boston, Computer Hardware Performance, Concentration of NO2 Measured
at Alnabru in Oslo, Sold Production of Industry with Inflation in Poland, Sleep in Mammals,
Air Pollution to Mortality, Fuel Consumption in Miles per Gallon, and Determinants of
Wages. It should be noted that the learning datasets are multi-dimensional and presented as
mini-models operating within 2D-space. It was possible to perform 37 different numerical
experiments for each type of modification of the learning datasets. A summary comparison
of these tested methods is shown in Table 1.
Effectiveness of mini-models method when data modelling… 25
Table 1. The total number of experiments in which the tested methods achieved the best results (i.e. achieved a better result than the other tested methods). For each dataset modification (10% gap, 20% gap, 30% random loss), LOO gives the count and percentage of experiments won under the leave-one-out test (1), and TT under the test on the "lost data" (2).

Mini-models with a global parameter of the minimal number of points:
  heuristic linear mini-model — 10% gap: LOO 0 (0.0%), TT 4 (10.5%); 20% gap: LOO 0 (0.0%), TT 2 (5.3%); 30% loss: LOO 0 (0.0%), TT 3 (7.9%)
  mini-model based on linear regression — 10% gap: LOO 0 (0.0%), TT 3 (7.9%); 20% gap: LOO 0 (0.0%), TT 3 (7.9%); 30% loss: LOO 0 (0.0%), TT 6 (15.8%)
  heuristic nonlinear mini-model — 10% gap: LOO 0 (0.0%), TT 2 (5.3%); 20% gap: LOO 0 (0.0%), TT 6 (15.8%); 30% loss: LOO 0 (0.0%), TT 3 (7.9%)
  mini-model based on polynomial approximation — 10% gap: LOO 1 (2.6%), TT 1 (2.6%); 20% gap: LOO 0 (0.0%), TT 1 (2.6%); 30% loss: LOO 0 (0.0%), TT 2 (5.3%)

Mini-models with a local parameter of the minimal number of points:
  heuristic linear mini-model — 10% gap: LOO 18 (47.4%), TT 3 (7.9%); 20% gap: LOO 18 (47.4%), TT 2 (5.3%); 30% loss: LOO 18 (46.1%), TT 3 (7.9%)
  mini-model based on linear regression — 10% gap: LOO 0 (0.0%), TT 3 (7.9%); 20% gap: LOO 1 (2.6%), TT 3 (7.9%); 30% loss: LOO 1 (2.6%), TT 5 (13.2%)
  heuristic nonlinear mini-model — 10% gap: LOO 12 (31.6%), TT 5 (13.2%); 20% gap: LOO 12 (31.6%), TT 1 (2.6%); 30% loss: LOO 11 (28.2%), TT 3 (7.9%)
  mini-model based on polynomial approximation — 10% gap: LOO 6 (15.8%), TT 4 (10.5%); 20% gap: LOO 6 (15.8%), TT 1 (2.6%); 30% loss: LOO 8 (20.5%), TT 1 (2.6%)

Other methods:
  k-nearest neighbours — 10% gap: LOO 1 (2.6%), TT 3 (7.9%); 20% gap: LOO 1 (2.6%), TT 8 (21.0%); 30% loss: LOO 0 (0.0%), TT 2 (5.2%)
  polynomial approximation of degree n — 10% gap: LOO 0 (0.0%), TT 4 (10.5%); 20% gap: LOO 0 (0.0%), TT 5 (13.2%); 30% loss: LOO 0 (0.0%), TT 3 (7.9%)
  feed-forward neural network — 10% gap: LOO 0 (0.0%), TT 5 (13.2%); 20% gap: LOO 0 (0.0%), TT 6 (15.8%); 30% loss: LOO 1 (2.6%), TT 4 (10.5%)
  General Regression Neural Network [8] — 10% gap: LOO 0 (0.0%), TT 1 (2.6%); 20% gap: LOO 0 (0.0%), TT 0 (0.0%); 30% loss: LOO 0 (0.0%), TT 3 (7.9%)

Summary comparison:
  mini-models with global parameter n_min — 10% gap: LOO 1 (2.6%), TT 10 (26.3%); 20% gap: LOO 0 (0.0%), TT 12 (31.6%); 30% loss: LOO 0 (0.0%), TT 14 (36.8%)
  mini-models with local parameter n_min — 10% gap: LOO 36 (94.7%), TT 15 (39.5%); 20% gap: LOO 37 (97.4%), TT 7 (18.4%); 30% loss: LOO 38 (97.4%), TT 12 (31.6%)
  other methods — 10% gap: LOO 1 (2.6%), TT 13 (34.5%); 20% gap: LOO 1 (2.6%), TT 19 (50.0%); 30% loss: LOO 1 (2.6%), TT 12 (31.6%)
  all mini-models — 10% gap: LOO 37 (97.4%), TT 25 (65.5%); 20% gap: LOO 37 (97.4%), TT 19 (50.0%); 30% loss: LOO 38 (97.4%), TT 26 (68.4%)

(1) Test method based on leave-one-out cross-validation.
(2) Testing using the "lost data", e.g. from an information "hole".
4. Discussion of results
It should be noted that the information gap does not significantly affect the results of ex-
periments using test methods based on leave-one-out cross validation. Mini-models method
was the most effective and the effectiveness of different types of mini-models only varied
slightly. Mini-models were less efficient in the tests with datasets containing information gaps than with the original data, but their advantage is still significant. In the tests with datasets containing a
“hole” with a width of 10%, mini-models were the most efficient and achieved best results
in 65% of the tests. For datasets containing an information “hole” with a width of 20%,
mini-models achieved best results in 50% of the tests. The KNN method also achieved good
results with these datasets. The KNN method is considered as the main competitor to the
mini-models method. It should be remembered that the KNN method in an “information
hole” situation is effective only for datasets where samples are not evenly distributed (Fig-
ure 1c). The method does not work very well with datasets that have a clearly visible trend line; the graph in Figure 2c clearly shows the "steps behaviour" of the method. For
the datasets with 30% of the random loss of samples, the mini-models achieved the best
results in 68% of tests. It should be noted that other methods (except KNN mentioned
above) gained no more than several percent across all tests. Mini-models with a global value of the parameter n_min did not perform as well across the entire range as mini-models with a local value of this parameter.
Figure 1. (a) Original data of compressive strength of 28-day concrete depending on fine aggregate. (b) Global model built with heuristic linear mini-models with a global value of the parameter n_min (best mini-model MAE = 0.1599) for a dataset with a "hole" in the interval [0.5; 0.7]. (c) Global model built with the k-nearest neighbours method (best result MAE = 0.1598) for a dataset with a "hole" in the interval [0.5; 0.7]
Figure 2. (a) Original data of unemployment rate in Poland depending on the money supply. (b) Global model built with heuristic linear mini-models with a global value of the parameter n_min (best result MAE = 0.0374) for a dataset with a "hole" in the interval [0.3; 0.5]. (c) Global model built with the k-nearest neighbours method (worst result MAE = 0.1611) for a dataset with a "hole" in the interval [0.3; 0.5]
5. Conclusions
The results of the experiments have shown the advantages of mini-models over other methods of modelling in information deficiency situations. Their advantage is not as great as when testing with the leave-one-out cross-validation method on the original data, but it still remains significant. The irregularity of the global models created by the mini-models method, combined with their high efficiency, raises questions about the validity of the theory of regularization. Future research should move towards the use of mini-models in spaces with a higher number of dimensions and towards their relevance to the theory of regularization.
References
[1] Piegat A., Wąsikowska B., Korzeń M.: Differences between the method of mini-models
and of the k-nearest neighbors on example of modeling of unemployment rate in Poland
in Information Systems in Management IX: Business Intelligence and Knowledge
Management, Warsaw, 2011, pp. 34-43.
[2] Piegat A., Wąsikowska B., Korzeń M.: Zastosowanie samouczącego się
trzypunktowego minimodelu do modelowania stopy bezrobocia w Polsce, Studia
Informatica, no. 27, pp. 45-58, 2011.
[3] Rutkowski L.: Metody i techniki sztucznej inteligencji. Warszawa: PWN, 2009.
[4] Fix E., Hodges J. L.: Discriminatory analysis, nonparametric discrimination:
Consistency properties, Randolph Field, Texas, 1951.
[5] Kordos M., Blachnik M., Strzempa D.: Do We Need Whatever More than k-NN?, in
Proceedings of the 10th International Conference on Artificial Intelligence and Soft
Computing, Zakopane, 2010.
[6] Pietrzykowski M.: Comparison of effectiveness of linear mini-models with some
methods of modelling, in Młodzi naukowcy dla Polskiej Nauki, Kraków, 2011.
[7] Pietrzykowski M.: The use of linear and nonlinear mini-models in process of data
modelling in a 2D-space, in Nowe trendy w naukach inżynieryjnych., 2011.
[8] Specht D. F.: A General Regression Neural Network, IEEE Transactions on Neural
Networks, pp. 568-576, 1991.
[9] Witten I. H., Frank E.: Data Mining. San Francisco: Morgan Kaufmann Publishers,
2005.
[10] Pluciński M.: Nonlinear ellipsoidal mini-models – application for the function approx-
imation task, paper accepted for ACS Conference, 2012
[11] Pluciński M.: Application of the information-gap theory for evaluation of nearest
neighbours method robustness to data uncertainty, paper accepted for ACS Confer-
ence, 2012
Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 28-35
ISSN 2299-2634 http://www.jtacs.org
SmartMonitor: recent progress in the development of an innovative visual surveillance system
Dariusz Frejlichowski1, Katarzyna Gościewska1,2, Paweł Forczmański1, Adam Nowosielski1, Radosław Hofman2
1 Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, Poland
2 Smart Monitor sp. z o.o., Szczecin, Poland
{dfrejlichowski,pforczmanski,anowosielski}@wi.zut.edu.pl, {katarzyna.gosciewska,radekh}@smartmonitor.pl
Abstract: This paper describes recent improvements in developing SmartMonitor — an innovative security system based on existing traditional surveillance systems and video content analysis algorithms. The system is being developed to ensure the safety of people and assets within small areas. It is intended to work without the need for user supervision and to be widely customizable to meet an individual's requirements. In this paper, the fundamental characteristics of the system are presented, including a simplified representation of its modules. Methods and algorithms that have been investigated so far, alongside those that could be employed in the future, are described. In order to show the effectiveness of the methods and algorithms described, some experimental results are provided together with a concise explanation.
Keywords: SmartMonitor, visual surveillance system, video content analysis
1. Introduction
Existing monitoring systems usually require supervision by a responsible person whose role it is to observe multiple monitors and report any suspicious behaviour. The existing intelligent surveillance systems that have been built to perform additional video content analysis tend to be very specific, narrowly targeted and expensive. For example, the Bosch IVA 4.0 [1], an advanced surveillance system with VCA functionality, is designed to help operators of CCTV monitoring and is applied primarily for the monitoring of public buildings or larger areas, hence making it unaffordable for personal use. In turn, SmartMonitor is being designed for individual customers and home use, and user interaction will only be necessary during system calibration. SmartMonitor's aim is to satisfy the needs of a large number of people who want to ensure the safety of both themselves and their possessions. It will allow for the monitoring of buildings (e.g. houses, apartments, small enterprises, etc.) and their surroundings (e.g. yards, gardens, etc.), where only a small number of objects need to be tracked. Moreover, it will utilize only commonly available and inexpensive hardware, such as a personal computer and digital cameras. Another intelligent monitoring system, described in [2], analyses human location, motion trajectory and velocity in an attempt to classify the type of behaviour. It requires both the participation of a qualified employee and the preparation of a large database during the learning process. These steps are unnecessary with the SmartMonitor system due to a simple calibration mechanism and feature-based methods. Moreover, a precise calibra-
SmartMonitor: recent progress. . . 29
tion can improve a system's effectiveness and allow the system's sensitivity to be adjusted to situations that do not require any system reaction. The customization ability offered by SmartMonitor is very advantageous. In [3], the problem of automatic monitoring systems with object classification was described. It was assumed that the background model used for foreground subtraction does not change with time. This is a crucial limitation caused by the background variability of real videos. Therefore, and due to planned system scenarios, the model that best adapts to changes in the scene will be utilized.
SmartMonitor will be able to operate in four independent modes (scenarios) that will provide home/surroundings protection against unauthorized intrusion, allow for the supervision of people who are ill, detect suspicious behaviours and sudden changes in object trajectory and shape, and detect smoke or fire. Each scenario is characterized by a group of performed actions and conditions, such as movement detection, object tracking, object classification, region limitation, object size limitation, object feature change, weather conditions and work time (with artificial lighting required at night). A more detailed explanation of system scenarios and parameters is provided in [4].
The rest of the paper is organised as follows: Section 2 contains the description of the main system modules; the algorithms and methods utilised in each module are briefly described in Section 3; Section 4 contains selected experimental results; and Section 5 concludes the paper.
2. System Modules

SmartMonitor will be composed of six main modules: background modelling, object tracking, artefacts removal, object classification, event detection and system response. Some of these are common to the intelligent surveillance systems reviewed in [5]. A simplified representation of these system modules is displayed in Fig. 1.
Figure 1. Simplified representation of system modules
Background modelling detects movement through the use of background subtraction methods. Coherent foreground objects that are larger than a specified size are extracted as objects of interest (OOI). The second module, object tracking, tracks object locations across consecutive video frames. When multiple objects are tracked, each object is labelled accordingly. Every object moves along a specified path called a trajectory. Trajectories can be compared and analysed in order to detect suspicious behaviours. The third module, artefacts removal, is an important step preceding classification and should be performed correctly. Artefacts, such as shadows, reflections or false detection results, enlarge the foreground region and usually move with the actual OOI. The fourth module, object classification, will allow for simple classification using object parameters and object templates. The template base will be customizable so that new objects can be added. A more detailed classification will also be possible using more sophisticated methods. The key issue of the fifth module, event detection, is to detect changes in object features. The system will react to both sudden changes (mainly in shape) and a lack of movement. The final module defines how the system responds to detected events. Because the human factor is eliminated, it is important to determine which situations should set off alarms or cause information to be sent to the appropriate services.
3. Employed Methods and Algorithms

For each module we investigated the existing approaches and modified them to apply the best solution for the system. Below we present a brief description and explanation of each.

Background modelling includes models that utilize static background images [3], background images averaged in time [6] and background images built adaptively, e.g. using Gaussian Mixture Models (GMM) [7, 8]. Since the backgrounds of real videos tend to be extremely variable in time, we decided to use a model based on GMM. This builds a per-pixel background image that is updated with every frame, but it is also sensitive to sudden changes in lighting, which can cause false detections, mainly by shadows. It was stated in [9] that shadows affect only the image brightness and not the hue. By comparing foreground images constructed using both the Y component of the YIQ colour scheme and the H component of the HSV colour scheme, it is possible to exclude false detections caused by shadows. Following this, morphological operations are applied to the resulting binary mask. Erosion allows for the elimination of small objects composed of one or a few pixels (such as noise) and the reduction of the region. The subsequent dilation process fills in the gaps.
For the object tracking stage we investigated three possible implementations, namely the Kalman filter [10], Mean Shift and Camshift [11, 12] algorithms. The Mean Shift algorithm is simple and appearance-based. It requires one or more features, such as colour or edge data, to be selected for tracking purposes. This can cause several problems with object localization when particular features change. The Camshift algorithm is simply a version of the Mean Shift algorithm that continuously adapts to the variable size of tracked objects. Unfortunately, this solution is not optimal since it increases the number of computations. Moreover, both methods are effective only when certain assumptions are met, such as that tracked objects will differ from the background (e.g. through variations in colour). The Kalman filter algorithm was therefore selected to overcome these drawbacks. It constitutes a set of mathematical equations that define a predictor-corrector type estimator. The main task is to estimate future values in two steps: prediction based on known values, and correction based on new measurements. It is assumed that objects can move uniformly and in any direction but will not change direction suddenly and unpredictably.
After tracking, the objects are classified (labelled) as either human or not human. A boosted cascade of Haar-like features [13] combined using the AdaBoost algorithm [14] can be utilized. However, at this stage, we replaced the AdaBoost classification with a simpler one. Objects are now classified using their binary masks and threshold values of two of their properties: area size and the aspect ratio of the minimum bounding rectangle.
A specific and detailed classification can be performed using a Histogram of Oriented Gradients (HOG) [15]. A HOG descriptor localises and extracts objects from static scenes through the use of specified patterns. Despite its high computational complexity, the HOG algorithm can be applied in a system under several conditions, such as limited regions or time intervals.
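A much-simplified, hypothetical version of the horizontal template scan can illustrate the idea: here a single orientation histogram per window stands in for the full block-normalized HOG descriptor [15], and windows are compared by Euclidean distance, as in the experiments that follow (smaller distance means a better match):

```python
import numpy as np

def hog_vector(patch, bins=9):
    # simplified HOG: one gradient-orientation histogram over the whole patch
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def scan_row(frame, template, y):
    # slide the template horizontally along row y of the frame
    h, w = template.shape
    tvec = hog_vector(template)
    return [np.linalg.norm(hog_vector(frame[y:y + h, x:x + w]) - tvec)
            for x in range(frame.shape[1] - w + 1)]
```

A template cut out of the frame itself yields distance zero at its true position, while an empty background window is maximally distant.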
4. Experimental Conditions and Results

In this section we present some experimental results from employing the algorithms for object localization, extraction and tracking that have given the best results so far. In order to ensure the experiments were performed under realistic conditions, a set of test video sequences corresponding to certain system scenarios was prepared. These include scenes recorded both inside and outside buildings, with different types of moving objects. A database also had to be created due to the lack of free, universal video databases that matched the planned scenarios.
The results of employing both the GMM algorithm and the methods for removing false objects are presented in Fig. 2. The first row contains the sample frame and background images for the Y and H components. The second row shows the respective foreground images for the Y and H components alongside the foreground object's binary mask after false objects removal. It is noticeable that the foregrounds constructed using the different colour components differ strongly and that, by subtracting one image from the other, we can eliminate false detections.
Figure 2. Results of employing the GMM algorithm and false objects removal methods
Specific objects can be localised and extracted using the HOG descriptor. This detects objects using predefined patterns and extracted feature vectors. Below we present the results of experiments utilizing the HOG descriptor. The first experiment was performed using a fixed template size and two sample frames; the second utilized various template sizes and one sample frame.
The results of the first experiment are pictured in Fig. 3. The figure contains a sample frame with a chosen template (left column) and two frames (middle column) from the same video sequence, which were scanned horizontally in an attempt to identify the matching regions. The depth maps (right column) show the results of the HOG algorithm: the darker the colour, the more similar the region. Black regions indicate a Euclidean distance of zero between two feature vectors.
Figure 3. Results of the experiment utilizing the HOG descriptor with a fixed template size
In the next experiment, devoted to an investigation of the HOG descriptor, various template sizes were tested. The left column of Fig. 4 presents a frame with a chosen template marked by a white rectangle, the central column contains a frame that was scanned horizontally using two different template sizes (dark rectangles in the top left corners define the size of the rescaled template), and the right column provides the respective results of the HOG algorithm. Clearly, the closer the template size is to the object size, the more accurate the depth map.
Figure 4. Results of the experiment utilizing the HOG descriptor with a variable template size
As mentioned in the previous section, we investigated three tracking methods. The first, the Mean Shift algorithm, uses part of an image to create a fixed template model. In this case we converted images to the HSV colour scheme. Fig. 5 presents three sample frames from the tracking process (first row) and their corresponding binary masks (second row). The white masked regions indicate those regions that are similar to the template, the dark rectangle marks the template, and the light points within the rectangle trace the object's trajectory.
Figure 5. Results of the experiment utilizing the Mean Shift algorithm

Camshift was the second tracking method investigated. This uses the HSV colour scheme and a variable template model. The first row in Fig. 6 presents sample frames from the tracking process: the starting frame with the chosen template, the central frame with an enlarged template and the finishing frame where the moving object leaves the scene. The second row in Fig. 6 shows the corresponding binary masks for each frame. Both tracking methods, thanks to their local application, were effective despite the presence of many regions similar to the template.
Figure 6. Results of the experiment utilizing the Camshift algorithm
Fig. 7 shows the result of employing the third algorithm, the Kalman filter, to track a person walking in a garden. Light asterisks mark object positions estimated using a moving object detection algorithm, and dark circles mark positions predicted by the Kalman filter.
5. Summary and Conclusions

In this paper, results recently achieved during the development of the SmartMonitor system were described. We provided basic information about system characteristics, properties and modules. The investigated methods and algorithms were briefly described, and selected experimental results of utilizing various solutions were presented.
SmartMonitor will be an innovative surveillance system based on video content analysis and targeted at individual customers. It will operate in four independent modes which are fully customizable (and can also be combined into custom modes). This allows individual safety rules to be set based on different degrees of system sensitivity. Moreover, SmartMonitor will utilize only commonly available hardware. It will almost eliminate human involvement, which will only be required for the calibration process. Our system will analyse a small number of moving objects over a limited region, which could additionally improve its effectiveness.

Figure 7. Results of the experiment utilizing the Kalman filter
Currently, there are no similar systems on the market. Modern surveillance systems are usually expensive, specific and need to be operated by a qualified employee. SmartMonitor will eliminate these factors by offering less expensive software, making it more affordable for personal use and requiring less effort to use.
Acknowledgements

The project Innovative security system based on image analysis — SmartMonitor prototype construction (original title: Budowa prototypu innowacyjnego systemu bezpieczenstwa opartego o analize obrazu — SmartMonitor) is co-funded by the European Union (project number PL: UDA-POIG.01.04.00-32-008/10-01, value: 9,996,604 PLN, EU contribution: 5,848,800 PLN, realization period: 07.2011–04.2013). European Funds — for the development of innovative economy (Fundusze Europejskie — dla rozwoju innowacyjnej gospodarki).
References

[1] Bosch IVA 4.0 Commercial Brochure, http://resource.boschsecurity.com/documents/Commercial Brochure enUS 1558886539.pdf
[2] Robertson N., Reid I.: A general method for human activity recognition in video. Computer Vision and Image Understanding 104, 232–248 (2006)
[3] Gurwicz Y., Yehezkel R., Lachover B.: Multiclass object classification for real-time video surveillance systems. Pattern Recognition Letters 32, 805–815 (2011)
[4] Frejlichowski D., Forczmanski P., Nowosielski A., Gosciewska K., Hofman R.: SmartMonitor: An Approach to Simple, Intelligent and Affordable Visual Surveillance System. In: Bolc, L. et al. (eds.) ICCVG 2012. LNCS, vol. 7594, pp. 726–734. Springer, Heidelberg (2012)
[5] Forczmanski P., Frejlichowski D., Nowosielski A., Hofman R.: Current trends in the development of intelligent visual monitoring systems (in Polish). Methods of Applied Computer Science 4/2011(29), 19–32 (2011)
[6] Frejlichowski D.: Automatic Localisation of Moving Vehicles in Image Sequences Using Morphological Operations. 1st IEEE International Conference on Information Technology, 439–442 (2008)
[7] Stauffer C., Grimson W. E. L.: Adaptive background mixture models for real-time tracking. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2–252 (1999)
[8] Zivkovic Z.: Improved adaptive Gaussian mixture model for background subtraction. Proceedings of the 17th International Conference on Pattern Recognition 2, 28–31 (2004)
[9] Forczmanski P., Seweryn M.: Surveillance Video Stream Analysis Using Adaptive Background Model and Object Recognition. In: Bolc, L. et al. (eds.) ICCVG 2010, Part I. LNCS, vol. 6374, pp. 114–121. Springer, Heidelberg (2010)
[10] Welch G., Bishop G.: An Introduction to the Kalman Filter. UNC-Chapel Hill, TR 95-041 (24 July 2006)
[11] Cheng Y.: Mean Shift, Mode Seeking, and Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(8), 790–799 (1995)
[12] Comaniciu D., Meer P.: Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(5), 603–619 (2002)
[13] Viola P., Jones M.: Rapid Object Detection Using a Boosted Cascade of Simple Features. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 1, 511–518 (2001)
[14] Avidan S.: Ensemble Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(2), 261–271 (2007)
[15] Dalal N., Triggs B.: Histograms of oriented gradients for human detection. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 1, 886–893 (2005)
Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 36-49
ISSN 2299-2634 http://www.jtacs.org
Nonlinearity of human multi-criteria in decision-making
Andrzej Piegat, Wojciech Sałabun
Faculty of Computer Science and Information Technology, West Pomeranian University of Technology,
Szczecin, Poland
{apiegat, wsalabun}@wi.zut.edu.pl
Abstract: In most cases, known methods of multi-criteria decision-making are used to make a linear aggregation of human preferences. The authors of these methods seem not to take into account the fact that linear functional dependences rather rarely occur in real systems. Linear functions also imply a global character of multi-criteria. This paper shows several examples of human nonlinear multi-criteria that are purely local. In these examples, a nonlinear approach based on fuzzy logic is used. It allows for a better understanding of how important the nonlinear aggregation of human multi-criteria is. The paper also contains a proposal for an indicator of the degree of nonlinearity of criteria. The presented results are based on investigations and experiments realized by the authors.
Keywords: Multi-criteria analysis, multi-criteria decision-analysis, non-linear multi-criteria, fuzzy multi-criteria, indicator of nonlinearity.
1. Introduction
On a daily basis and in professional life we frequently have to make decisions. We then use criteria that depend on our individual preferences or, in the case of group decisions, on the preferences of the group. Further on, criteria representing the preferences of a single person will be called individual criteria, and criteria representing a group will be called group-criteria. Group-criteria can be achieved by aggregation of individual ones. Therefore, the nonlinearity problem of criteria will be analyzed on examples of individual criteria, because the properties of individual criteria are transferred to the group ones. Individual human multi-criteria are "programmed" in our brains, and special methods for their elicitation and mathematical formulation are necessary. The multi-criteria (for short, M-Cr) of different persons are more or less different, and therefore it would not be reasonable to assume one and the same type of mathematical formula for a certain criterion representing thousands of different people, e.g. for the individual criterion of car attractiveness. However, in the case of M-Crs, the most frequently used criterion type is the linear M-Cr form (1).
K = w₁K₁ + w₂K₂ + … + wₙKₙ ,    (1)

where: wᵢ – the weight coefficients of the particular component criteria, with Σᵢ wᵢ = 1; Kᵢ – the component criteria aggregated by the M-Cr (i = 1, …, n). They are mostly used, also in this paper, in the form normalized to the interval [0,1]. The linear criterion-function in 2D space is represented by a straight line, in 3D space by a plane (Fig. 1), and in nD space by a hyper-plane.
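Formula (1) is simply a weighted sum; a minimal sketch (the function name is ours):

```python
def linear_mcr(weights, criteria):
    # formula (1): K = w1*K1 + w2*K2 + ... + wn*Kn, with the weights summing to 1
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * k for w, k in zip(weights, criteria))
```

For instance, with weights (0.5, 0.3, 0.2) and component values (1, 0, 1), the aggregated criterion is K = 0.7; the contribution of each Kᵢ is the same everywhere in the domain, which is exactly the global character criticized above.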
Figure 1. A linear criterion function in 2D space (Fig. 1a) and in 3D space (Fig. 1b)
Let us notice that in the linear criterion-function K, the particular component criteria Ki influence the superior criterion in a mutually independent and uncorrelated way. Apart from this, the influence strength of the particular component criteria Ki is of a global, constant and unchanging character over the full criterion-domain. Both of the above features are great disadvantages of the linear M-Cr, because human M-Cr are in most cases nonlinear, and the significance of a component criterion Ki is not constant, is not independent of the other criteria, and varies in particular, local sub-domains of the global M-Cr. Unfortunately, linear multi-criteria are used in many world-known methods of multi-criteria decision-analysis. The following examples illustrating the above statement can be given: the SAW method (Simple Additive Weighting) [4,15], the well-known and widely used AHP method of Saaty (the Analytic Hierarchy Process) [11,15,18], and the ANP method (Analytic Network Process) [12,13]. Other known M-Cr methods such as TOPSIS [15,16], ELECTRE [2] and PROMETHEE [1,2] are not strictly linear. However, they assume global weight-coefficients wi, constant over the full M-Cr domain, and in certain steps of their algorithms they also use the linear, weighted aggregation of alternatives. The next section presents the simplest examples of nonlinear criterion-functions in 2D-space.
2. Nonlinear human criterion-functions in 2D-space
An example of a very simple human nonlinear criterion-function is the dependence between the coffee taste (CT), CT ∈ [0,1], and the sugar quantity S, S ∈ [0,5], expressed in the number of sugar spoons, Fig. 2. Coffee taste represents an inner human preference.
The criterion function of the coffee taste can be identified by interviewing a given person or, more exactly, experimentally, by giving the person coffees with different amounts of sugar and asking him/her to evaluate the coffee taste or to compare the tastes of pairs of coffees with different amounts of sugar. The achieved taste evaluations can be processed with the various M-Cr methods cited previously or with the method of characteristic objects proposed by one of the paper's authors. However, even without scientific investigation it is easy to see that the criterion-function shown in Fig. 2 is qualitatively correct. This function represents the preferences of the author AP. He does not like coffee with too great an amount of sugar (more than 3 coffee-spoons) and evaluates its taste as CT ≈ 0. The taste of coffee without sugar (S = 0) he also evaluates as poor. He finds the taste best when the cup of coffee contains 2 spoons of sugar (Sopt = 2). For other persons the optimal sugar amount will be different. Thus, this criterion-function is not an "objective" (what would that mean?) function of all people in the world but an individual criterion-function of the AP author of the paper. It is very important to differentiate between individual criteria and group-criteria, which represent a smaller or greater group of people. Similar in character to the function in Fig. 2 is another one-component human criterion function: e.g. the dependence of text-reading easiness on light intensity.
Figure 2. Criterion function representing the dependence of the coffee taste CT on the number of sugar spoons S (felt by an individual person, the paper author AP)
3. Nonlinear, human, multi-criterion function in 3D-space and a method of its identification
Already in the 1960s and 1970s, the American scientists D. Kahneman and A. Tversky, Nobel prize winners of 2002, drew the attention of the scientific community to the nonlinearity of human multi-criteria [5] through their investigations of human decisions based on an M-Cr. In their experiment, several component criteria were aggregated: the value of a possible profit, the probability of that profit value, the value of a possible loss, and the probability of that loss value. Below, a similar but simplified problem will be presented: evaluation of the individual play acceptability-degree K in dependence on a possible winnings-value K1 [$] and a possible loss-value K2 [$]. Both values are not great. The interviewed person has to make decisions in the problem described below.
Among the 25 plays shown in Table 1, with different winnings K1 [$] and losses K2 [$] (if you don't win, you will have to pay a sum equal to the loss K2), first find all plays (K1, K2) which are certainly not accepted by you (K = 0), and next all plays which are certainly accepted by you (K = 1). For the rest of the plays determine a rank with the method of pair-tournament (pair comparisons). The probabilities of winning and losing are the same and equal to 0.5.
Table 1 gives the values of possible winnings and losses (K1, K2) in the particular plays. It also indicates the plays for which the AP author declares full acceptation (full readiness to take up the game), i.e. K = 1, and the plays which he does not accept at all (zero readiness to take up the game), i.e. K = 0. The acceptability degree plays the role of the multi-criterion in this decision-problem.
The acceptability degree of the plays marked with a question mark will be determined with the tournament-rank method. The investigated person chooses from each play-pair the more acceptable play (inserting the value 1 in the table for this play), which means a win. If the person is not able to decide which of the two plays is better, then she/he inserts the value 0.5 for both plays of the pair, which means a draw.
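The pair-tournament scoring can be sketched generically; `prefer` is a stand-in for the interviewed person's choice:

```python
from itertools import combinations

def tournament_scores(items, prefer):
    # prefer(a, b) returns the winning item, or None for a draw
    scores = {item: 0.0 for item in items}
    for a, b in combinations(items, 2):
        winner = prefer(a, b)
        if winner is None:        # a draw: half a point to each play
            scores[a] += 0.5
            scores[b] += 0.5
        else:                     # a win: one point to the preferred play
            scores[winner] += 1.0
    return scores
```

With a toy preference in which the larger number always wins, three items collect 2, 1 and 0 points respectively, mirroring how the scores in Table 3 are accumulated from Table 2.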
Summarized scores from Table 2 are shown in Table 3 for particular plays (K1,K2).
Table 1. Winnings K1 [$] and losses K2 [$] in the particular 25 plays and the first decisions of the interviewed person: determining the unacceptable plays (acceptation degree K = 0) and the fully acceptable plays (K = 1) which would certainly be played by the person. Plays with question marks are plays of a partial (fractional) acceptation that is to be determined.

Losses K2 [$]    Winnings K1 [$]:  0.0   2.5   5.0   7.5   10.0
 0.0                                0     1     1     1     1
 2.5                                0     0     ?     ?     ?
 5.0                                0     0     0     ?     ?
 7.5                                0     0     0     0     ?
10.0                                0     0     0     0     0
Table 2. Tournament results of the particular play-pairs. The value 1 means a win for the play, the value 0.5 means a draw. A single play is denoted by (K1, K2).
Points   (K1, K2) [$]   (K1, K2) [$]   Points       Points   (K1, K2) [$]   (K1, K2) [$]   Points
0 (5.0, 2.5) (7.5, 2.5) 1 1 (7.5, 2.5) (10.0, 7.5) 0
0 (5.0, 2.5) (10.0, 2.5) 1 1 (10.0, 2.5) (7.5, 5.0) 0
0.5 (5.0, 2.5) (7.5, 5.0) 0.5 1 (10.0, 2.5) (10.0, 5.0) 0
0 (5.0, 2.5) (10.0, 5.0) 1 1 (10.0, 2.5) (10.0, 7.5) 0
0.5 (5.0, 2.5) (10.0, 7.5) 0.5 0 (7.5, 2.5) (10.0, 5.0) 1
0 (7.5, 2.5) (10.0, 2.5) 1 0.5 (7.5, 2.5) (10.0, 7.5) 0.5
1 (7.5, 2.5) (7.5, 5.0) 0 1 (10.0, 5.0) (10.0, 7.5) 0
0.5 (7.5, 2.5) (10.0, 5.0) 0.5
Table 3. Scores of the particular plays (K1, K2) and the rank places assigned to the plays with a fractional acceptation degree K (multi-criterion) of the investigated person

Play (K1, K2):    (10.0, 2.5)  (10.0, 5.0)  (7.5, 2.5)  (10.0, 7.5)  (5.0, 2.5)  (7.5, 5.0)
Score(K1, K2):         5            3.5          3.5          1            1           1
Rank(K1, K2):          I            II           II           III          III         III
Analysis of Table 3 shows that in the end we have 3 play types with differentiated values of the multi-criterion K. Apart from the 6 plays with fractional acceptation given in Table 3, we also have 15 plays with zero acceptability (K = 0) and 4 plays with full acceptability (K = 1), see Table 1. Applying the indifference principle of Laplace [2], we can assume that the full difference of acceptation value relating to the plays from Table 3, Kmax − Kmin = 1 − 0 = 1, should be partitioned into 4 equal differences ΔK = 1/4. The plays (5, 2.5), (7.5, 5) and (10, 7.5) achieve the M-Cr value K = 1/4 (the third place in the rank). The plays (7.5, 2.5) and (10, 5) achieve K = 2/4 (the second place in the rank). The play (10, 2.5) achieves K = 3/4 (the first place in the rank of fractional acceptability of plays). The resulting values of the M-Cr K determined for the particular plays with the tournament-rank method are given in Table 4.
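The mapping from tournament scores to fractional K values via Laplace's indifference principle can be sketched as follows (an illustrative reimplementation, reproducing the fractional plays of Table 3):

```python
def scores_to_k(scores):
    """Map tournament scores to fractional acceptability values K.

    By Laplace's indifference principle, the distinct score levels, together
    with the K = 1 and K = 0 extremes, are spaced evenly between 0 and 1.
    """
    levels = sorted(set(scores.values()), reverse=True)   # best score first
    dk = 1.0 / (len(levels) + 1)                          # here: 3 levels -> 1/4
    return {play: 1.0 - dk * (levels.index(s) + 1)
            for play, s in scores.items()}
```

Feeding in the scores of Table 3 yields K = 3/4 for (10, 2.5), K = 2/4 for the second-rank plays and K = 1/4 for the third-rank plays, as in Table 4.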
Table 4. Resulting values of the multi-criterion K = f(K1, K2), which represents the acceptability degree of the particular plays (K1, K2) for the investigated person

Losses K2 [$]    Winnings K1 [$]:  0.0   2.5   5.0   7.5   10.0
 0.0                                0     1     1     1     1
 2.5                                0     0     1/4   2/4   3/4
 5.0                                0     0     0     1/4   2/4
 7.5                                0     0     0     0     1/4
10.0                                0     0     0     0     0
On the basis of Table 4, a visualization of the investigated multi-criterion K of the play acceptability-degree can be realized, Figs. 3 and 4.

Figure 3. Visualization of the 25 analyzed plays (K1, K2) as 25 characteristic objects regularly placed in the decisional domain K1 × K2 of the problem
Each of the 25 characteristic plays (decisional objects) can be interpreted as a crisp rule, e.g.:

IF (K1 = 7.5) AND (K2 = 5) THEN (K = 1/4)    (2)

However, if K1 is not exactly equal to 7.5 and K2 is not exactly equal to 5.0, then rule (2) can be transformed into a fuzzy rule (3) based on the Modus Ponens tautology [8, 9]:

IF (K1 close to 7.5) AND (K2 close to 5.0) THEN (K close to 1/4)    (3)
In this way, 25 fuzzy rules of type (4) were achieved, one for each characteristic object (play). The rules enable calculating values of the nonlinear multi-criterion K for any values of the component criteria K1i and K2j, i, j = 1:5.

IF (K1 close to K1i) AND (K2 close to K2j) THEN (K close to Kij)    (4)

The complete rule base is given in Table 4. To enable calculation of the fuzzy M-Cr function K, it is necessary to define the membership functions µK1i (close to K1i), µK2j (close to K2j) and µKij (close to Kij). These functions are shown in Fig. 4.
Figure 4. Membership functions µK1i (close to K1i), µK2j (close to K2j) of the component criteria and µKij (close to Kij) of the aggregating multi-criterion K
On the basis of the rule base (Table 4) and the membership functions from Fig. 4, it is easy to visualize the function-surface K = f(K1, K2) of the individual multi-criterion of play acceptation. As a visualization tool one can use the fuzzy logic toolbox of MATLAB or one's own knowledge of fuzzy modeling [8, 9]. The functional surface is shown in Fig. 5.
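The effect of rule base (4) with triangular "close to" membership functions can be sketched as weighted-average inference over the Table 4 grid. This is an illustrative reimplementation under our own assumptions (triangular memberships centred on the grid nodes, weighted-average defuzzification), not the authors' exact code:

```python
import numpy as np

GRID = np.array([0.0, 2.5, 5.0, 7.5, 10.0])    # node values of K1 and K2
# Table 4: rows indexed by losses K2, columns by winnings K1
K_TABLE = np.array([
    [0, 1, 1,    1,    1   ],
    [0, 0, 0.25, 0.5,  0.75],
    [0, 0, 0,    0.25, 0.5 ],
    [0, 0, 0,    0,    0.25],
    [0, 0, 0,    0,    0   ],
])

def mu(x, step=2.5):
    # triangular 'close to node' memberships; at most two nodes are active
    return np.maximum(0.0, 1.0 - np.abs(x - GRID) / step)

def fuzzy_k(k1, k2):
    w = np.outer(mu(k2), mu(k1))                  # activation of each of the 25 rules
    return float((w * K_TABLE).sum() / w.sum())   # weighted-average defuzzification
```

At a grid node the function reproduces Table 4 exactly (e.g. fuzzy_k(7.5, 5.0) = 1/4); between nodes it interpolates, e.g. fuzzy_k(8.75, 2.5) lies halfway between 2/4 and 3/4.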
As Fig. 5 shows, the functional surface of the human multi-criterion K = f(K1, K2) is strongly nonlinear. This surface represents the M-Cr of one person. However, in the case of other persons, the surfaces of this multi-criterion are qualitatively very similar (an investigation was realized on approximately 100 students of the Faculty of Computer Science of the West Pomeranian University of Technology in Szczecin and of the Faculty of Management and Economy of the University of Szczecin). Quantitative differences of the multi-criterion K between the particular investigated persons were mostly not considerable. All identified surfaces were strongly nonlinear.
The second co-author of the paper, WS, used the method of characteristic objects in an investigation of the attractiveness degree of colors. In the experiment, two attributes occur:
• the degree of brightness of green (in short, G),
• the degree of brightness of blue (in short, B).
Figure 5. Functional surface of the individual multi-criterion K = f(K1, K2) of the play acceptability with possible winnings K1 [$] and losses K2 [$]; the probabilities of winning and losing are identical and equal to 0.5. This particular surface represents the AP author of the paper.
The degree of red was fixed at a constant brightness level of 50%. The brightness level of each component was normalized to the range [0,1]. The first step was to define linguistic values for the G and B components, presented in Figs. 6 and 7.
Figure 6. Definitions of linguistic values for the component G
Figure 7. Definitions of linguistic values for the component B
The membership functions presented in Fig. 6 are described by formula (5):

µL = (0.5 − G)/0.5,   µML = G/0.5,   µMR = (1 − G)/0.5,   µH = (G − 0.5)/0.5,    (5)

where: L – low, ML – medium left, MR – medium right, H – high, G – the level of brightness of green.
The membership functions presented in Fig. 7 are described by formula (6):

µL = (0.5 − B)/0.5,   µML = B/0.5,   µMR = (1 − B)/0.5,   µH = (B − 0.5)/0.5,    (6)

where: L – low, ML – medium left, MR – medium right, H – high, B – the level of brightness of blue.
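A sketch of formulas (5)/(6) in code. The symbols in the source were garbled, so the exact form of the functions (and the clamping to zero outside each value's support, taken from Figs. 6 and 7) is our assumption:

```python
def membership(x):
    """Linguistic-value memberships for a normalized brightness x in [0, 1].

    Reconstruction of formulas (5)/(6); each function is clamped to [0, 1]
    and set to zero outside its assumed support.
    """
    z = lambda v: max(0.0, min(1.0, v))
    return {
        "L":  z((0.5 - x) / 0.5),                        # 1 at x = 0, 0 at x = 0.5
        "ML": z(x / 0.5) if x <= 0.5 else 0.0,           # rises to 1 at x = 0.5
        "MR": z((1.0 - x) / 0.5) if x >= 0.5 else 0.0,   # falls from 1 at x = 0.5
        "H":  z((x - 0.5) / 0.5),                        # 1 at x = 1
    }
```

For example, at x = 0.25 the brightness is half "low" and half "medium left"; at x = 0.5 both medium values are fully active.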
The linguistic values of the attributes generate 9 characteristic objects. Their distribution in the problem space is presented in Fig. 8.

Figure 8. Characteristic objects Ri in the space of the problem

The attribute values of the characteristic objects Ri, their names and colors are given in Table 5.
Table 5. The complex colors and their rules
Rule [R, G, B] Color
R1 [0.5, 0.0, 0.0]
R2 [0.5, 0.0, 0.5]
R3 [0.5, 0.0, 1.0]
R4 [0.5, 0.5, 0.0]
R5 [0.5, 0.5, 0.5]
R6 [0.5, 0.5, 1.0]
R7 [0.5, 1.0, 0.0]
R8 [0.5, 1.0, 0.5]
R9 [0.5, 1.0, 1.0]
The interviewed person has to make the decisions described below.

In the survey, please indicate which color of the pair of colors is more attractive (please mark this color with X). If both colors have a similar or identical level of attractiveness, please mark a draw. The attractiveness of a color tells you which color of the pair you prefer more.
Evaluation of characteristic objects is determined with the tournament-rank method. If
one color of a pair is preferred, then this color receives 1 point and the second color receives 0
points. If the interviewed person marks a draw, both colors receive 0.5 point. Next, all the
points assigned to each object are added. On the basis of the sums, the ranking of objects is
established. Applying the indifference principle of Laplace, we can assume that the full difference value Kmax − Kmin = 1 − 0 = 1 should be partitioned into m − 1 equal differences (Kmax − Kmin)/(m − 1), where m is the number of places in the ranking. Experimental identification of surfaces
of the multi-criterion showed that, for all interviewed people, these surfaces were strongly
nonlinear. Fig. 9 shows the multi-criterion surface for a randomly chosen person.
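The tournament-rank evaluation described above can be sketched as follows. This is an illustrative helper, assuming every pair of objects is compared exactly once; the function names are ours.

```python
from itertools import combinations

def tournament_scores(objects, prefer):
    """prefer(a, b) returns the preferred object or None for a draw.
    A win is worth 1 point; a draw gives 0.5 point to both objects."""
    scores = {o: 0.0 for o in objects}
    for a, b in combinations(objects, 2):
        winner = prefer(a, b)
        if winner is None:
            scores[a] += 0.5
            scores[b] += 0.5
        else:
            scores[winner] += 1.0
    return scores

def ranks_to_values(scores):
    """Map ranking places onto equidistant K values in [0, 1], i.e. the
    full difference Kmax - Kmin = 1 is split into m - 1 equal steps
    (Laplace's indifference principle). Tied objects share a place."""
    places = sorted(set(scores.values()))
    m = len(places)
    step = 1.0 / (m - 1) if m > 1 else 0.0
    return {o: places.index(s) * step for o, s in scores.items()}

# toy example: alphabetically later objects are always preferred
scores = tournament_scores(["a", "b", "c"], lambda x, y: max(x, y))
print(ranks_to_values(scores))  # {'a': 0.0, 'b': 0.5, 'c': 1.0}
```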
For comparison, Fig. 10 shows the multi-criterion surface for co-author WS of the article.
Figure 9. Functional surface of the individual multi-criterion of the resulting color-attractiveness
achieved by mixing 2 component colors with different proportion-rates.
Figure 10. Functional surface of the individual multi-criterion of attractiveness of the resulting color
achieved by mixing 2 component colors with different proportion-rates (WS)
The investigation also showed that the functional surfaces of the multi-criterion of
all persons were strongly nonlinear. Fig. 9 presents the functional M-Cr surface of one of
the persons taking part in the investigation. For the other interviewed people, these M-Cr
surfaces were also highly nonlinear. (Identification of M-Cr surfaces was performed
for a group of 307 selected people.)
4. Nonlinearity indicator of the functional surface of a multi-criterion
In the case of a 2-component multi-criterion K = f(K1,K2), the functional surface of the
M-Cr can be visualized and its degree of nonlinearity can be evaluated visually, at least to
the extent of judging whether the surface is linear or nonlinear. However, in the case of
higher-dimensional multi-criteria K = f(K1,K2, … ,Kn), visualization and visual evaluation
of nonlinearity become more and more difficult, though they can be realized, e.g., with the
method of lower-dimension cuts [7]. Therefore it would be very useful to construct a
quantitative nonlinearity indicator N-IndK of a model of the multi-criterion K. First, for a
better understanding of the problem, let us analyze the simplest criterion model K = f(K1),
the criterion of the lowest dimension identified with the method of characteristic objects
(Ch-Ob method). Let us assume that, after the investigations, we have at our disposal m
objects, each of them described by a pair (K1,K) of coordinate values, which can be
interpreted as a measurement sample usable for identification of a functional dependence.
Let us assume that the characteristic objects are distributed in the coordinate-system space
as shown in Fig. 11a.
Figure 11. An example placement of characteristic objects (K1i, Ki), i = 1, …, m, in the space K1 × K,
Fig. 11a, and a nonlinear, fuzzy model approximating the characteristic objects, Fig. 11b
Nonlinearity of the fuzzy model approximating the criterion function K = f(K1) will be the
smaller, the smaller the sum of differences (Ki − KLi) between corresponding points lying on the
fuzzy and on the linear approximation of the criterion function. Information about this sum
is delivered by the proposed nonlinearity indicator N-IndK, formula (7):
N-IndK = Σi=1..m |Ki − KLi| / (0.5·m·(Kmax − Kmin)) = Σi=1..m |Ki − (w0 + w1K1i)| / (0.5·m·(Kmax − Kmin)) (7)
The denominator 0.5·m·(Kmax − Kmin) in formula (7) normalizes the indicator
to the interval [0,1]. Fig. 12a presents a distribution of characteristic objects for which the nonlinearity indicator equals zero. Fig. 12b presents the inverse situation, when the indicator assumes the value 1.
Figure 12. Distribution of characteristic objects (K1i, Ki), i = 1, …, m, for which the nonlinearity indicator
N-IndK is equal to zero, Fig. 12a, and distribution for which the indicator assumes the maximal
value 1, Fig. 12b
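Formula (7) translates almost directly into code (a sketch; the function name is ours):

```python
def nonlinearity_indicator(K, K_linear):
    """N-IndK from formulas (7)/(9): the sum of absolute differences
    between criterion values Ki and their linear approximations KLi,
    normalized by 0.5 * m * (Kmax - Kmin) to lie in [0, 1]."""
    m = len(K)
    k_max, k_min = max(K), min(K)
    total = sum(abs(k - kl) for k, kl in zip(K, K_linear))
    return total / (0.5 * m * (k_max - k_min))

# characteristic objects lying exactly on a line give N-IndK = 0
K = [0.0, 0.5, 1.0]
print(nonlinearity_indicator(K, K))  # 0.0
```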
If we use a multi-criterion K aggregating n component criteria Ki, then the linear approximation KL of K has the form (8) and the nonlinearity indicator N-IndK is expressed by formula (9):
KL = w0 + w1K1 + w2K2 + … + wnKn (8)
N-IndK = Σi=1..m |Ki − KLi| / (0.5·m·(Kmax − Kmin)) (9)
The linear approximation KL of an M-Cr can be determined, e.g., with the method of the
minimal sum of square errors, for which many program tools can be found, e.g. in
MATLAB and STATISTICA. As an example, the nonlinearity indicator was determined for
the multi-criterion K = f(K1,K2) aggregating winnings and losses of a play, see Fig. 5 and
Table 3. The achieved value of the indicator was N-IndK = 0.35. The linear model KL of the
multi-criterion K obtained with least squares is presented in Fig. 13a and, for comparison,
the fuzzy model of this criterion, obtained with the characteristic objects method, is shown
in Fig. 13b.
Another example of determining the nonlinearity indicator N-IndK is given for the multi-
criterion of attractiveness of the resulting color achieved by mixing 2 component colors
with different proportion-rates, which was presented in part 3. The indicator N-IndK was
calculated for the nonlinear models from Fig. 9 and Fig. 10.
Figure 13. Comparison of the linear model KL = w0 + w1K1 + w2K2, Fig. 13a, and of the nonlinear
model K = f(K1,K2) of the multi-criterion of acceptability of plays on the basis of their winnings
K1($) and losses K2($). Fig. 13b shows the nonlinear model obtained with the method of characteristic
objects, with the nonlinearity indicator N-IndK = 0.35.
The linear model in Fig. 14a was identified by the method of least squares. For comparison,
the nonlinear model presented in Fig. 14b was determined. For this model the nonlinearity
indicator is equal to 0.49. This means a higher degree of nonlinearity than in the case of the
play problem presented in Fig. 13, where the value was equal to 0.35.
Figure 14. Comparison of the linear model KL = w0 + w1G + w2B, Fig. 14a, and of the nonlinear model
K = f(G, B) of the multi-criterion of attractiveness of the resulting color achieved by mixing 2 component colors with different proportion-rates, Fig. 14b. The nonlinear model obtained with the method of characteristic objects is characterized by the nonlinearity indicator N-IndK = 0.49.
Fig. 15a presents the linear model of the same multi-criterion for co-author WS of the
article. This model was identified with the method of least squares. After comparing the linear
model with the fuzzy model presented in Fig. 15b, the nonlinearity indicator 0.54 was
obtained. This means the highest degree of multi-criterion nonlinearity among all presented
cases.
Figure 15. Comparison of the linear model KL = w0 + w1G + w2B, Fig. 15a, and of the nonlinear model
K = f(G, B) of the multi-criterion of attractiveness of the resulting color achieved by mixing 2 component colors with different proportion-rates, Fig. 15b. The nonlinear multi-criterion was identified
with the method of characteristic objects. Its nonlinearity indicator equals N-IndK = 0.54.
5. Conclusions
Human multi-criteria representing human preferences are usually not of linear but of
nonlinear character. Linearity is an idealized feature and occurs rather seldom in reality.
The paper presented a few examples of nonlinear human multi-criteria; a considerably
greater number could easily be presented. In modeling human multi-criteria, scientists
should go over from linear to nonlinear models (approximations) of these criteria. The paper
presented the method of characteristic objects, which enables identification of more precise,
nonlinear models of human multi-criteria. Because it is difficult to visualize high-dimensional
multi-criteria, a nonlinearity indicator was proposed. This indicator allows for
error evaluation of linear, simplified models of human multi-criteria. The method of characteristic objects and the nonlinearity indicator were conceived by Andrzej Piegat.
References
[1] Brans J.P., Vincke P.: A preference ranking organization method: the PROMETHEE method for MCDM. Management Science, 1985.
[2] Burdzy K.: The search for certainty. World Scientific, New Jersey, London, 2009.
[3] Figueira J. et al.: Multiple criteria decision analysis: state of the arts surveys. Springer
Science + Business Media Inc, New York, 2005.
[4] French S. et al.: Decision behavior, analysis and support. Cambridge, New York, 2009.
[5] Hwang Cl., Yoon K.: Multiple attribute decision making: methods and applications. Springer-Verlag, Berlin, 1981.
[6] Kahneman D., Tversky A.: Choices, values and frames. Cambridge University Press,
Cambridge, New York, 2000.
[7] Lu Jie et al.: Multi-objective group decision-making. Imperial College Press, London, Singapore, 2007.
[8] Piegat A.: Stationary to the lecture Methods of Artificial Intelligence. Faculty of Computer Science, West Pomeranian University of Technology, Szczecin, Poland, not published.
[9] Piegat A.: Fuzzy modeling and control. Springer-Verlag, Heidelberg, New York, 2001.
[10] Rao C.R.: Linear Models: Least Squares and Alternatives. Springer Series in Statistics, 1999.
[11] Rutkowski L.: Metody i techniki sztucznej inteligencji (Methods and techniques of artificial intelligence).
[12] Saaty T.L.: How to make a decision: the analytic hierarchy process. European Journal of Operational Research, vol. 48, no. 1, pp. 9-26, 1990.
[13] Saaty T.L.: Decision making with dependence and feedback: the analytic network process. RWS Publications, Pittsburgh, Pennsylvania, 1996.
[14] Saaty T.L., Brady C.: The encyclicon, volume 2: a dictionary of complex decisions using the analytic network process. RWS Publications, Pittsburgh, Pennsylvania, 2009.
[15] Stadnicki J.: Teoria i praktyka rozwiązywania zadań optymalizacji (Theory and practice of solving optimization problems). Wydawnictwo Naukowo-Techniczne, Warszawa, 2006.
[16] Zarghami M., Szidarovszky F.: Multicriteria analysis. Springer, Heidelberg, New
York, 2011.
[17] Zeleny M.: Compromise programming. In Cochrane J.L., Zeleny M.,(eds). Multiple
criteria decision-making. University of South Carolina Press, Columbia, pp. 263-301,
1973.
[18] Zimmermann H.J.: Fuzzy set theory and its applications. Kluwer Academic Publishers,
Boston/Dordrecht/London, 1991.
Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 50-57
ISSN 2299-2634 http://www.jtacs.org
Method of non-functional requirements balancing during
service development
Larisa Globa 1, Tatiana Kot 1, Andrei Reverchuk 2, Alexander Schill 3
1 National Technical University of Ukraine «Kyiv Polytechnic Institute», Ukraine
2 SITRONICS Telecom Solutions, Czech Republic a.s.
3 Technische Universität Dresden, Fakultät Informatik, Germany
{lgloba, tkot}@its.kpi.ua, [email protected], [email protected]
Abstract: Today, the list of telecom services, their functionality and the requirements for the Service Execution Environment (SEE) are changing extremely fast, especially where requirements for charging are concerned, as they have a high influence on business. This results in the need for constant adaptation and reconfiguration of the Online Charging System (OCS) used in mobile operator networks. Moreover, any new functionality requested from a service can have an impact on system behavior (performance, response time, delays), which in general constitutes non-functional requirements. Currently, this influence and the reconfiguration strategies are poorly formalized and validated. Current state-of-the-art approaches offer methodologies that can model non-functional or functional requirements, but they do not take into account the interaction between functional and non-functional requirements or the collaboration between services. All this results in time- and money-consuming service development and testing, and causes delays during service deployment. The balancing method proposed in this paper fills this gap. It employs a well-defined workflow with predefined stages for the OCS development and deployment process. The applicability of this novel approach is described in a separate section, which contains an example of GPRS service charging. A tool based on this method will be developed, providing automation of the analysis of the influence of service functionality on non-functional requirements and allowing a target deployment model to be provided for a particular customer. The reduction of development time, and thus of the necessary financial input, has been proved based on real-world experiments.
Keywords: OCS, service deployment, non-functional requirements, requirements balancing.
1. Introduction
During the design and deployment of services provided by a telecom operator using an OCS [1],
one important aspect should be considered: the NFR1 of service provision.
It is an established fact that any system, and the services run on it, shall be developed
based not only on functional requirements, defining software functions (inputs, behavior,
outputs), but on non-functional ones as well. Meeting non-functional requirements is very
important in the telecom industry, especially for real-time systems. Generally, non-functional
parameters can be classified as follows: Performance (Response Time,
Throughput, Utilization, Static Volumetric); Scalability; Capacity; Availability; Reliability;
1 Non-functional requirements
Recoverability; Maintainability; Serviceability; Security; Regulatory; Manageability; Environmental; Data Integrity; Usability; Interoperability.
Non-functional requirements specify a system’s “quality characteristics” or “quality attributes”. If non-functional requirements are not considered at the design level, then the
provided service may actually be useless in practice.
Currently, NFR are not considered within the perspective of the full list of services provided
by a Telecom Operator. The main problem is that legacy methods can design a service according
to NFR, but cannot model the influence of concurrent services on a particular NFR that arises
from the collaboration between services.
This means that the Operator has no tool that allows flexible balancing between services
run on the OCS. Balancing makes it possible to model system behavior for a determined (requested) list
of services, in order to analyze how this configuration meets the NFR.
This paper describes a novel NFR balancing method, focusing on the collaboration between
functional and non-functional requirements, allowing service planning stages to be automated
and the time and costs of OCS adaptation in general to be reduced.
The paper is structured as follows: Section 2 contains a state-of-the-art analysis of methods
and approaches to considering NFR; furthermore, NFR analysis methods are described.
Section 3 introduces the NFR balancing method, focusing on functional and non-functional
requirements collaboration. The evaluation, applied using a real-world scenario
within a telecommunication company, is presented in Section 4. Section 5 concludes
the work with a summary and an outlook on future work.
2. State of the art and non-functional testing
Errors due to omission of NFR, or to not properly dealing with them, are among the most
expensive and most difficult to correct. Recent work [2] points out that early-phase
requirements engineering should address organizational and non-functional requirements,
while later-phase engineering focuses on completeness, consistency and automated verification
of requirements.
There are reports [3, 4] showing that not properly dealing with NFR has led to considerable delays in projects and consequently to a significant increase of the final cost.
There are many reasons for delays and significant increases of costs, but one of the
most important is that performance was neglected during software
development, leading to several changes in both hardware and software architecture, as well
as in software design and code [5, 6, 7].
There could be a situation in which the system has to be deactivated just after its deployment because, among other reasons, many non-functional requirements were neglected during the system development, such as: reliability (vehicle location), cost (emphasis on the
best price), usability (poor control of information on the screen), and performance (the system did what it was supposed to do, but performance was unacceptable). As mentioned
above, an OCS shall provide all functionality to charge telecom services (GPRS, voice,
SMS, MMS, VAS2) using the Event Charging with Unit Reservation, Session Charging with Unit
Reservation, and Immediate Event Charging mechanisms. Each service consumes a strictly
predefined volume of system resources (memory, processor time, etc.) and has an influence on
the non-functional requirements to be supported.
2 Value added services
2.1. NFR framework
NFR are considered at the design level, and there are several approaches that can help to
model NFR within the scope of the developed service. The NFR framework [7] is a methodology that guides the system to accommodate change with replaceable components. The NFR
framework is a goal-oriented and process-oriented quality approach guiding NFR modeling. Non-functional requirements such as security, accuracy, performance and cost are
used to drive the overall design process and choose design alternatives. It helps developers
express NFR explicitly, deal with them systematically and use them to drive the development
process rationally [8]. In the NFR Framework, each NFR is called an NFR softgoal (depicted by a cloud), while each development technique to achieve the NFR is called an operationalizing softgoal or design softgoal (depicted by a dark cloud). Design rationale is
represented by a claim softgoal (depicted by a dashed cloud). Goal refinement can take
place along the Type or the Topic. These three kinds of softgoals are connected by links to
form the SIG3 that records the design considerations and shows the interdependencies among
softgoals.
2.2. KAOS
Another methodology for considering NFR is KAOS [9, 10]. KAOS is a methodology
for requirements engineering enabling analysts to build requirements models and to derive
requirements documents from KAOS models. KAOS has been designed:
− to fit problem descriptions by allowing you to define and manipulate concepts relevant to the problem description;
− to improve the problem analysis process by providing a systematic approach for discovering and structuring requirements;
− to clarify the responsibilities of all the project stakeholders;
− to let the stakeholders communicate easily and efficiently about the requirements.
KAOS is independent of the development model type (waterfall, iterative, incremental),
but it also does not take into account the collaboration between FR4 and NFR.
The legacy software tools, for instance NFR-Assistant CASE [11] and ARIS [12], do not
provide the requested functionality to model non-functional requirements and compare their
influence on functionality.
2.3. Non-functional testing
Testing of non-functional requirements is another issue. Non-functional testing [13] is
concerned with the non-functional requirements and is designed to evaluate the readiness of
a system according to several criteria not covered by functional testing. Non-functional testing covers:
− Load and Performance Testing;
− Ergonomics Testing;
− Stress & Volume Testing;
− Compatibility & Migration Testing;
− Data Conversion Testing;
− Security / Penetration Testing;
− Operational Readiness Testing;
3 Softgoal interdependency graph
4 Functional requirements
− Installation Testing;
− Security Testing (Application Security, Network Security, System Security).
It enables the measurement and comparison of the non-functional attributes of
software systems. The cost of catching and correcting errors related to non-functional
requirements is very high and can cause a full redesign of the developed service (system). Testing
does not have to wait until the code has been delivered. It can start early, with analyzing
the requirements and creating test criteria for what needs to be tested. The process for doing
this is called the “V” model [9] (Fig. 1).
It decomposes requirements and testing. It allows testing and coding as parallel activities,
which enables changes to occur more dynamically. NFR have a high influence on the testing
process, and any service that does not meet NFR can cause a rollback of the development
process to its initial phases.
Figure 1. V-Model
3. NFR balancing method
The proposed NFR balancing method is based on creating an FR and NFR collaboration
model. The implementation of functional requirements is represented by the listed FBs5. Each FB is
responsible for a particular logical function. The proposed method includes the following
main stages:
− NFR Catalogue development;
− FR decomposition;
− NFR mapping;
− FB distribution;
− Balancing;
− Target deployment model.
The NFR balancing method uses the NFR Catalogue and the Functional Requirements to be
implemented, and creates a collaboration model between them. The main stages of the concept
are represented below.
5 Functional Block
3.1. Catalogue of NFR
NFR are usually complex, global, conflicting and numerous. Aside from that, both software
engineers and stakeholders are not used to recognizing NFR. Because of that, a
knowledge base will be used to present NFR in the form of catalogues, to guide
requirements engineering through the possibly needed NFR, where the possible operationalizations
for each NFR can be found. Thus we can operate with catalogues for performance and serviceability. These catalogues will be updated with further operationalizations to keep the catalogues of NFR up to date. Such an approach will facilitate future reuse of the knowledge acquired
during NFR elicitation.
3.2. FR decomposition
The next stage is creating the FR decomposition model. The FR decomposition shall describe all services with their features’ influence on NFR. This means that each service shall
be split into functional blocks. A functional block is a logical unit responsible for providing
some strictly defined functionality (for instance, sending a notification, bonus system registration, etc.). What is more, the services and the features they provide will be depicted for each
functional block (functional requirements).
The total distribution of functional blocks between all services run on the OCS is represented
in Table 1.
Table 1. FR decomposition
Service Functional Block Functional Requirement
Service1 FB1.1 or FB1.2 FR1, FR2
Service1 FB2.1 and FB2.2 FR3, FR1
Service2 FB1.1 FR5, FR6
Service2 FB3 FR1, FR7
3.3. NFR mapping
Each call of an FB requests a defined amount of each system resource (memory, processor
time, network, etc.) and has a list of characteristics: response time, availability, etc. All of
these characteristics shall be mapped to NFR from the catalogue, with values that specify how
well the exact FB meets the particular NFR (graded from 0 to 100 – Table 2).
Table 2. NFR mapping
Functional block/ NFR Availability Performance Security
FB1.1 90 80 10
FB1.2 80 70 20
FB2.1 50 10 10
FB2.2 5 20 30
FBs with the same first number (FB1.1, FB1.2) provide the same functionality, but in
different ways. This means that from a functional point of view there is no difference between these two blocks. The difference lies only in how each FB meets the NFR.
To understand and reason about the different alternatives involved in these tradeoffs between functional blocks, it is required to clarify some NFR operationalizations and to negotiate which NFR should be denied or partially denied, prejudicing another NFR.
To build the NFR model, it is necessary to go through every service and connect it to all
the functional blocks needed to cover the requested functionality.
3.4. Functional blocks distribution
Using the NFR catalogues and the FR decomposition, the distribution of functional blocks can be
realized as represented in Fig. 2.
Fig. 2 represents the use of functional blocks by services. The influence of each connection between a Service and an FB on NFR is determined in Table 2. According to this, the input can lead
to different deployment configurations. Fig. 2 shows that FR1 and FR2 from Table 1 can
be implemented either by FB1.1 or by FB1.2. The choice of implementation depends on the NFR
specification for a particular case.
Figure 2. Functional blocks distribution
3.5. Balancing and target model
The target model is obtained by balancing between the NFR and the approaches to
implementing a particular functionality with FBs. This tradeoff can be continued until a
target deployment configuration, based on the requested NFR, is received. If the requested NFR cannot be attained with the existing list of services, then some services should be excluded from the deployment scheme. For instance, suppose the Customer demands that the service shall support
the highest availability, with no specified requirement for security and performance.
Such a case can be realized by the model represented in Fig. 3. This is a simple situation;
in practice there are usually combinations of NFR. Thus, a priority should be assigned to
every requirement, which will be considered during target model development.
Figure 3. Target deployment model
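The balancing step can be sketched as a choice, for each required function, of the alternative functional block with the best priority-weighted NFR score. This is an illustrative sketch only; the paper does not fix an exact scoring rule, so the weighted sum below is our assumption.

```python
def balance(alternatives, nfr_scores, priorities):
    """Pick, for each required function, the functional block whose
    priority-weighted NFR score is highest.

    alternatives: {function: [candidate FB names]}
    nfr_scores:   {FB: {NFR: grade 0..100}}  (as in Table 2)
    priorities:   {NFR: weight}
    """
    def weighted(fb):
        return sum(priorities.get(nfr, 0) * grade
                   for nfr, grade in nfr_scores[fb].items())
    return {fn: max(fbs, key=weighted) for fn, fbs in alternatives.items()}

# Table 2 grades for the two interchangeable blocks, with availability
# as the only prioritized NFR (the Customer's demand in the example above)
scores = {"FB1.1": {"Availability": 90, "Performance": 80, "Security": 10},
          "FB1.2": {"Availability": 80, "Performance": 70, "Security": 20}}
choice = balance({"FR1+FR2": ["FB1.1", "FB1.2"]}, scores,
                 {"Availability": 1.0})
print(choice)  # {'FR1+FR2': 'FB1.1'}
```

With a different priority vector (e.g. security only), the same call would select FB1.2 instead, which is the flexibility the balancing stage is meant to provide.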
4. Charging of GPRS service
Evaluation of the proposed method is demonstrated using a real-world scenario within a
telecommunication company. Charging of a GPRS service at the design level, requested by a
Telecom Operator from the OCS, is described as an example. Its FR decomposition is depicted
in Table 3.
Table 3. FR decomposition of GPRS service
Service Functional Block Functional Requirement
GPRS LBS1.1 or LBS1.2 Location Base Charging
GPRS RF2.1 and RF2.2 Step Charging
GPRS NB3.1 or NB3.2 or NB3.3 User notification
Assuming that the Customer takes into account the availability of the GPRS service and the delay
caused by the service as the main NFR, and according to statistical data and the knowledge base, all
FB characteristics are estimated in Table 4.
Table 4. NFR mapping of GPRS service
Functional block/ NFR Availability Delay
LBS1.1 – location-based module implemented as internal cache in OCS 90 80
LBS1.2 – using external Home Zone Billing (HZB) platform 50 10
RF2.1 – internal Rating 50 20
RF2.2 – external Rating 5 15
NB3.1 – notification via SMS 40 50
NB3.2 – online notification via USSD 50 40
NB3.3 – offline notification via email 50 10
Finally, the target model for the GPRS service, using the balancing method to obtain an optimal
deployment configuration, can be created (Fig. 4). The model supposes that the configuration
will be applied to provide the service at the highest availability with minimal delay.
Figure 4. Target model for GPRS service
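Applied to the grades of Table 4, the same selection idea yields a target configuration. This is a sketch: treating higher grades as better for both columns, and breaking ties on availability by the delay grade, are our assumptions.

```python
# NFR grades from Table 4 (higher = meets the requirement better)
table4 = {
    "LBS1.1": (90, 80), "LBS1.2": (50, 10),   # (availability, delay)
    "NB3.1": (40, 50), "NB3.2": (50, 40), "NB3.3": (50, 10),
}
# alternatives per functional requirement; RF2.1 and RF2.2 are both
# required ("and" in Table 3), so there is no choice to make for them
alternatives = {"Location Base Charging": ["LBS1.1", "LBS1.2"],
                "User notification": ["NB3.1", "NB3.2", "NB3.3"]}

# tuple comparison: availability decides first, delay breaks ties
target = {fr: max(fbs, key=lambda fb: table4[fb])
          for fr, fbs in alternatives.items()}
print(target)  # {'Location Base Charging': 'LBS1.1', 'User notification': 'NB3.2'}
```

NB3.2 wins over NB3.3 only through the tie-break: both have availability 50, but the USSD notification has the better delay grade.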
5. Summary and outlook
The proposed method can be applied at both the service design and deployment stages.
The method can be realized within a software tool used for the design and realization of
service provision software. It is also necessary to foresee the possibility of its usage during
service monitoring to obtain specific statistical data. This data shall be used to evaluate how
each functional block meets a particular NFR. The method increases the efficiency of the development
process in the testing and deployment phases and allows quick system reconfiguration on
customer demand. In the future, the method will be extended to consider possible changes of
the NFR list and their priorities during different time periods (e.g. periods with high load,
service upgrading), and to take into account changing priorities between services.
References
[1] 3GPP TS 32.296 Online Charging System (OCS): Application and interfaces, 88 p.
[2] Abdukalykov R., Hussain I., Kassab M., Ormandjieva O.: Quantifying the Impact of
Different Non-functional Requirements and Problem Domains on Software Effort Estimation. 9th International Conference on Software Engineering Research, Management
and Applications (SERA), 2011.
[3] National Institute of Standards and Technology: Software Errors Cost U.S. Economy
$59.5 Billion Annually (NIST 2002-10).
http://www.nist.gov/public_affairs/releases/n02-10.htm (2002).
[4] Lindstrom D.R.: Five Ways to Destroy a Development Project. IEEE Software, September 1993, pp. 55-58.
[5] Boehm B., In H.: Identifying Quality-Requirement Conflicts. IEEE Software, March
1996, pp. 25-35.
[6] Breitman K.K., Leite J.C.S.P., Finkelstein A.: The World's Stage: A Survey on Requirements Engineering Using a Real-Life Case Study. Journal of the Brazilian Computer Society, No. 1, Vol. 6, Jul. 1999, pp. 13-37.
[7] Chung L.: Representing and Using Non-Functional Requirements: A Process-Oriented
Approach. Ph.D. Thesis, Dept. of Computer Science, University of Toronto, June 1993.
Also Tech. Rep. DKBS-TR-91-1.
[8] Chung L., Nixon B.A., Yu E., Mylopoulos J.: Non-Functional Requirements in Software Engineering. Kluwer Academic Publishers, Boston, 2000.
[9] http://www.info.ucl.ac.be/research/projects/AVL/ReqEng.html
[10] http://www.objectiver.com/
[11] Tran Q.: NFR-Assistant: tool support for achieving quality. Proceedings of the 1999
IEEE Symposium on Application-Specific Systems and Software Engineering and
Technology (ASSET '99), 1999.
[12] http://www.softwareag.com/corporate/products/aris_platform/default.asp
[13] Page A., Johnston K., Rollison B.: How We Test Software at Microsoft, Microsoft Press
– December 10, 2008, 448 p.
Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 58-70
ISSN 2299-2634 http://www.jtacs.org
Donor limited hot deck imputation: effects on parameter estimation
Dieter William Joenssen, Udo Bankhofer
Technische Universität Ilmenau, Germany
{Dieter-William.Joenssen, Udo.Bankhofer}@TU-Ilmenau.de
Abstract: Methods for dealing with missing data in the context of large surveys or data mining projects are limited by the computational complexity that they may exhibit. Hot deck imputation methods are computationally simple, yet effective for creating complete data sets from which correct inferences may be drawn. All hot deck methods draw values for the imputation of missing values from the data matrix that will later be analyzed. The object from which these available values are taken for imputation within another is called the donor. This duplication of values may lead to the problem that using any donor “too often” will induce incorrect estimates. To mitigate this dilemma, some hot deck methods limit the number of times any one donor may be selected. This study answers which conditions influence whether or not any such limitation is sensible for six different hot deck methods. In addition, five factors that influence the strength of any such advantage are identified and possibilities for further research are discussed.
Keywords: hot deck imputation, missing data, non-response, imputation, simulation
1. Introduction
Dealing with missing observations when estimating parameters or extracting information from empirical data remains a challenge for scientists and practitioners alike. Failures in either manual or automated data collection or editing, such as aggregating information from different sources [18] or outlier removal [22], cause missing observations. Some missing data may be resolved through manual or automatic logical inference when values may be inferred directly from existing data (e.g. a missing passport number when the respondent has no passport, missing age when the date of birth is known). If missing data cannot be resolved in this way (e.g. cost restraints, lack of domain knowledge), it must be compensated in light of the missingness mechanism.
Rubin [25] first treated missing data indicators as random variables. Based on the indicators' distribution, he defined three basic mechanisms, MCAR, MAR, and NMAR, that govern which missing data methods are appropriate. With MCAR (missing completely at random), missingness is independent of any data values, missing or observed. Thus under MCAR, observed values represent a subsample of the intended sample. Under MAR (missing at random), whether or not data is missing depends on some observed data's values. A MAR mechanism would be present if response rates for an item differ between two groups of respondents, e.g. survey respondents with a higher education level are less likely to answer a question on income than respondents exhibiting a lower education level. Finally, under NMAR (not missing
Donor limited hot deck imputation. . . 59
at random), the presence of missing data depends on the values of the variable that is itself subject to missingness. NMAR missingness is present when, for example, data is less likely to be transmitted by a temperature sensor if the temperature rises above a certain threshold.
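The three mechanisms can be illustrated with a small generator of artificial missingness. The function names, rates, and threshold below are illustrative assumptions that merely mirror the definitions above, not the generator used in this study; NaN marks a missing value:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mcar(x, rate):
    """MCAR: every value is deleted with the same probability,
    independently of any data values (NaN marks a missing value)."""
    x = np.asarray(x, dtype=float).copy()
    x[rng.random(x.shape) < rate] = np.nan
    return x

def make_mar(x, group, rate, delta=0.1):
    """MAR: the deletion probability for x depends on a fully observed
    binary variable 'group', e.g. rates 10% above/below the base rate."""
    x = np.asarray(x, dtype=float).copy()
    rates = np.where(np.asarray(group) == 1, rate + delta, rate - delta)
    x[rng.random(x.shape) < rates] = np.nan
    return x

def make_nmar(x, rate, threshold):
    """NMAR: values above a threshold are more likely to go missing,
    like a sensor dropping readings at high temperatures."""
    x = np.asarray(x, dtype=float).copy()
    rates = np.where(x > threshold, 2 * rate, 0.5 * rate)
    x[rng.random(x.shape) < rates] = np.nan
    return x
```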
With missingness present, conventional methods cannot simply be applied to the data without further measures. Explicit provisions must be made before or within the analysis, and they must be chosen based on the identified missingness mechanism. Principally, two strategies for dealing with missing data are appropriate in the data mining or large survey context: elimination and imputation. Elimination procedures remove objects or attributes with missingness from the analysis. These lead to a data set from which accurate inferences may be made only if the missingness mechanism is MCAR, and correctly identified as such. But even if the mechanism is MCAR, eliminating records with missing values is an inferior strategy, especially when many records must be eliminated due to unfavorable missingness patterns or data collection schemes (e.g. asynchronous sampling). Imputation methods replace missing values with estimates ([17], [1]) and can be suitable under the less stringent assumptions of MAR. Some techniques can even lead to correct inferences under the non-ignorable NMAR mechanism ([3], [19]). Replacing missing values with reasonable ones not only ensures that all information gathered can be used, but also broadens the spectrum of available analyses. Imputation methods differ in how they define these reasonable values. The simplest imputation techniques, so far the state of the art for data mining [18], replace missing values with eligible location parameters. Beyond that, multivariate methods, such as regression or classification methods, may be used to identify imputation values. The interested reader may find a more complete description of missingness mechanisms and methods for dealing with missing data in [3], [19], or [14].
A category of imputation techniques appropriate for imputation in the context of mining large amounts of data and large surveys, due to its computational simplicity (c.p. [22], [14], [20]), is hot deck imputation. Ford [11] defines a hot deck procedure as one where missing items are replaced using values from one or more similar records within the same classification group. Partitioning the records into disjoint, homogeneous groups is done so that the selected, good records that supply the imputation values (the donors) follow the same distribution as the bad records (the recipients). Due to this, and the replication property, all hot deck imputed data sets contain only plausible values, which cannot be guaranteed by most other methods. Traditionally, a donor is chosen at random, but other methods, such as ordering by covariates when sequentially imputing records, or nearest neighbor techniques utilizing distance metrics, are possible; these improve estimates at the expense of computational simplicity (c.p. [11], [19]).
The replication of values leads to the central problem in question here. Any donor may, fundamentally, be chosen to accommodate multiple recipients. This poses the inherent risk that "too many" or even all recipients are imputed with the same value or values from a single donor. Due to this, some variants of hot deck procedures limit the number of times any one donor may be selected for donating its values. This inevitably leads to the question under which conditions a limitation is sensible and whether or not some appropriate limit value exists. This study aims to answer these questions. An overview of the basic mechanics of hot deck methods is presented in chapter 2. Chapter 3 discusses current empirical and theoretical research on this topic. Chapter 4 describes the simulation study design, while results are reported and discussed in chapter 5. A conclusion and possibilities for further research are presented in chapter 6.
60 Dieter William Joenssen, Udo Bankhofer
2. Overview of Hot Deck Methods

Ford [11] describes hot deck methods as processes in which a reported value is duplicated to represent a value missing from the sample. Sande [26] extends this to define hot deck imputation procedures as methods for completing incomplete responses using values from one or more records in the same file. Thus, from a procedural standpoint, hot deck methods clearly match donors and recipients within the same data matrix, whereby observations are duplicated to resolve either all the recipient's missingness simultaneously or on an attribute-sequential basis. Simultaneous resolution of all the recipient's missing data may better preserve the associations between the variables, while sequential resolution ensures a larger donor pool. Since, theoretically, any procedure may be iteratively applied to all attributes exhibiting missing values, hot deck methods are better classified by how donors and recipients are matched. The two primary possibilities for donor matching are:
— Randomly. A donor is selected at random to accommodate any recipient. This method is, computationally, the simplest. It preserves the overall distribution of the data and leads to correct mean and variance estimation [2] under the MCAR mechanism. When data is not missing MCAR, this method can be modified in various ways. Most often, imputation classes are formed by stratifying auxiliary variables or by applying common clustering procedures to the data, in an effort to achieve MCAR missingness within the classes. The random matching of donor and recipient is then performed within these classes.
Another variant of the random hot deck applies weights to the selection probabilities [27]. This guarantees that donors more similar to the recipient have a higher chance of being selected.
The last and most widely used (random) method is the so-called sequential hot deck, a procedure developed by the U.S. Census Bureau [7]. Based on partitioning the data into imputation classes, each record in the data set is considered in turn. If a record is missing a value, this value is replaced by one saved in a register. If the record is complete, the register's value is updated. Initial values for this register are taken either from a previous survey, from the class, or randomly from the variables' domain. The sequential hot deck yields results equivalent to the random hot deck if the data set's ordering is random. An advantage may be attained when the ordering is nonrandom, such as when the data set is sorted by covariates. This, however, is seldom done purposefully, as it requires not only computationally intensive sorting but also the identification of strong covariates. Usually, in any sequential hot deck application, any order in the data set is due to data entry procedures and thus is unlikely to ensure substantially better results.
— Deterministically. This class of hot decks deterministically matches recipients to their respective donors. These procedures, usually of the nearest neighbor type, are state of the art for many statistical institutes and bureaus around the world. For example, nearest neighbor hot decks are used by the US Bureau of the Census in the CPS1, SIPP2, and ACS3 surveys, the UK Office for National Statistics used them for the 2001/2011 Censuses4, and Statistics Canada utilizes nearest neighbor hot decks in 45% of all active surveys exhibiting missing data, such as the SLID and LFS5.
1 http://www.census.gov/cps/methodology/
2 http://www.census.gov/sipp/editing.html
3 http://www.census.gov/acs/www/methodology/item_allocation_rates_definitions/
4 http://www.ons.gov.uk/ons/guide-method/index.html
5 http://www23.statcan.gc.ca/imdb-bmdi/pub/index-eng.htm
The nearest neighbor is usually defined by minimizing simple distance functions such as the Manhattan or Chebyshev distances. These hot decks guarantee that the same donor is always chosen, given a static data set, ensuring consistency when multiple independent analyses are performed on the data after a public release. While distance matrix computation tends to become prohibitively expensive for large amounts of data, this limit is reached later for the nearest neighbor hot deck methods, as neither the simultaneous nor the sequential version requires a full distance matrix. Rather, only the distances between all donors and all recipients need to be calculated.
All hot deck methods guarantee, by virtue of the duplication property, that the imputed data set contains only naturally occurring values, without the need to round or transform categorical values. Hot decks also conserve unique distribution features, such as discontinuities or spikes. Their low cost of implementation and execution is, however, offset by the fact that little is known about their theoretical properties.
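The register logic of the sequential hot deck described above can be sketched in a few lines. The record layout, the None marker for missing values, and the per-class initial register values are assumptions of this illustration, not part of the Census Bureau procedure's specification:

```python
def sequential_hot_deck(records, attribute, classes, initial):
    """Walk the file in order, keeping one register per imputation class:
    a missing value (None) is replaced by the register's content, an
    observed value updates the register."""
    register = dict(initial)  # starting values, e.g. from a previous survey
    for record, cls in zip(records, classes):
        if record[attribute] is None:
            record[attribute] = register[cls]  # impute from the register
        else:
            register[cls] = record[attribute]  # update the register
    return records
```

If the file order is random, this reduces to a random hot deck; sorting the file by a strong covariate beforehand is what yields the potential advantage noted above.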
Literature further detailing the mechanics of hot deck imputation methods includes [11], [19], [14], [15], and [6].
3. Review of Literature

The theoretical effects of a donor limit were first investigated by Kalton and Kish [16]. Based on combinatorics, they come to the conclusion that selecting donors from the donor pool without replacement leads to a reduction in the imputation variance, the precision with which any parameter is estimated from the post-imputation data matrix. A possible effect on an imputation-introduced bias was not discussed. Two more arguments in favor of a donor limit are made. First, the risk of exclusively using one donor for all imputations is removed [26]. Second, the probability of using one donor with an extreme value or values "too often" is reduced ([3], [28]). Based on these arguments and sources, recommendations are made in [15], [21], [28], and [10].
In contrast, Andridge and Little [2] reason that imposing a donor limit inherently reduces the ability to choose the most similar, and therefore most appropriate, donor for imputation. Not limiting the number of times a donor can be chosen may thus increase data quality. Generally speaking, a donor limit makes results dependent on the order in which objects are imputed. Usually, the imputation order will correspond to the sequence of the objects in the data set. This property is undesirable, especially in deterministic hot decks. Thus, from a theoretical point of view, it is not clear whether a donor limit has a positive or negative impact on the post-imputation data's quality.
The literature on this subject provides only studies that compare hot deck imputation methods with other imputation methods. These studies draw donors from the donor pool either only with replacement ([4], [24], [29]) or only without replacement ([13]).
It becomes apparent, based on this review of literature, that the consequences of imposing adonor limit have not been sufficiently examined.
4. Study Design

Considering the possible theoretical advantages of a donor limit, and possible effects that have not been investigated to date, the following questions will be answered by this study:
1. Are the true parameters of a hot deck imputed data matrix estimated with higher precision when a donor limit is used?
2. Does a donor limit lead to less biased post-imputation parameter estimation?
3. What factors influence whether a hot deck with a donor limit creates better results?
A series of factors that might influence whether or not a donor limit affects parameter estimates was identified by considering papers where authors chose similar approaches ([23], [24], [28]) and through further deliberation. The factors varied are the following:
— Imputation class count: Imputation classes are assumed to be given prior to imputation, and data is generated as determined by the class structure. Factor levels are two and seven imputation classes.
— Objects per imputation class: The number of objects characterizing each imputation class is varied. Factor levels of 50 and 250 objects per class are considered.
— Class structure: To differentiate between well- and ill-chosen imputation classes, data are generated with a relatively strong and a relatively weak class structure. A strong class structure is achieved by having classes overlap by 5% and an inner-class correlation of .5. A weak class structure is achieved by an intra-class overlap of 30% and no inner-class correlation.
— Data matrices: Data matrices of nine multivariate normal variables are generated depending on the given class structure. Three of these variables are then transformed to a discrete uniform distribution with either five or seven possible values, simulating an ordinal scale. The next three variables are converted to a nominal scale so that 60% of all objects are expected to take the value one, with the remaining values being set to zero. General details on this NORTA-type transformation are described by Cario and Nelson [8].
— Portion of missing data: Factor levels include 5, 10, and 20% missing data points, and every object is assured to have at least one data point available (no subject non-response).
— Missingness mechanism: The missingness mechanisms considered are MCAR, MAR, and NMAR. These are generated as follows: under MCAR, a set number of values is chosen without replacement to be missing. Under MAR, missing data is generated as under MCAR but using two different rates based on the value of one binary variable, which is not itself subject to missingness. The differing rates of missingness are either 10% higher or lower than the rates under MCAR. NMAR modifies the MAR mechanism to also allow missingness of the binary variable. To forgo possible problems with the simultaneous imputation methods and the donor limitation of once, it was guaranteed that at least 50% of all objects within one class were complete in all attributes.
— Hot deck methods: The six hot deck methods considered are named "SeqR," "SeqDW," "SeqDM," "SimR," "SimDW," and "SimDM" according to the three properties that they exhibit. The prefixes denote whether attributes are considered sequentially (Seq) or simultaneously (Sim) for imputation. The postfixes indicate a random (R) or a distance-based deterministic (D) hot deck and the type of adjustment made to compensate for missingness when computing the distances. "W" indicates a reweighting type of compensation, which assumes that the missing components supply an average deviation to the distance. "M" denotes that an imputation of relevant location estimates is performed before distance calculation, which assumes that the missing component is close to the average for this attribute. To account for variability and importance, variables are weighted with the inverse of their range prior to aggregating the Manhattan distances.
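The two distance adjustments ("W" and "M") can be sketched as follows. The function and argument names are illustrative, not taken from the study's implementation, and a complete donor vector is assumed:

```python
import numpy as np

def distance(recipient, donor, ranges, col_means, how="W"):
    """Range-weighted Manhattan distance between a recipient (may contain
    NaN) and a complete donor. how="W": the distance over the observed
    components is scaled up to the full variable count, i.e. missing
    components are assumed to contribute the average deviation.
    how="M": the recipient's missing components are first replaced by the
    column means, then all components enter the distance."""
    w = 1.0 / np.asarray(ranges, dtype=float)  # inverse-range weights
    recipient = np.asarray(recipient, dtype=float)
    donor = np.asarray(donor, dtype=float)
    observed = ~np.isnan(recipient)
    if how == "M":
        filled = np.where(observed, recipient, col_means)
        return float(np.sum(w * np.abs(filled - donor)))
    # "W": reweight the partial distance over the observed components
    partial = np.sum(w[observed] * np.abs(recipient[observed] - donor[observed]))
    return float(partial * recipient.size / observed.sum())
```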
Next to the previously mentioned factors, two static and two dynamic donor limits are evaluated. The two static donor limits allow a donor to be chosen either once or an unlimited number of times. For the dynamic cases, the limit is set to either 25% or 50% of the recipient count.
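A random hot deck within one imputation class, with either a static or a dynamic donor limit, can be sketched as follows; the names and the exhaustion rule are illustrative assumptions, not the study's code:

```python
import random

def random_hot_deck(donor_values, n_recipients, limit=None, seed=0):
    """Random hot deck within one imputation class: each recipient receives
    the value of a randomly drawn donor. A donor whose usage count reaches
    'limit' leaves the pool; limit=1 is donor selection without replacement,
    limit=None leaves usage unlimited. Assumes the pool never runs empty,
    i.e. len(donor_values) * limit >= n_recipients."""
    rng = random.Random(seed)
    pool = list(range(len(donor_values)))   # indices of still-eligible donors
    counts = [0] * len(donor_values)
    imputed = []
    for _ in range(n_recipients):
        i = rng.choice(pool)
        imputed.append(donor_values[i])
        counts[i] += 1
        if limit is not None and counts[i] >= limit:
            pool.remove(i)                  # donor has exhausted its usages
    return imputed
```

A dynamic limit is obtained by passing e.g. limit=max(1, round(0.25 * n_recipients)).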
To evaluate imputation quality, a set of location, variability, and contingency measures is considered (c.p. [21]). For the quantitative variables, the mean, variance, and correlation are computed; for the ordinal variables, the median, quartile distance, and rank correlation; and for the binary variables, the relative frequency of the value one and the normalized contingency coefficient.
100 data matrices are simulated for every factor level combination of "imputation class count", "objects per imputation class", "class structure", and "ordinal variables' scale". For every complete data matrix, the set of true parameters is computed. Each of these 1600 data matrices is then subjected to each missingness mechanism, generating three different amounts of missing data. All of the matrices with missing data are then imputed by all six hot deck methods using all four donor limits. Repeating this process ten times creates 3.456 million imputed data matrices, for which each parameter set is calculated again.
Considering every parameter in the set, the relative deviation ∆p between the true parameter value p_T and the estimated parameter value p_I, based on the imputed data matrix, is calculated as follows:

∆p = (p_I − p_T) / p_T   (1)
To analyze the impact of different donor limits on the quality of imputation, the differences in the absolute values of ∆p that can be attributed to the change in donor limitation are considered. Due to the large amounts of data generated in this simulation, statistical significance tests on these absolute relative deviations are not considered appropriate. As an alternative, Cohen's d measure of effect size ([9], [5]) is chosen as a qualitative criterion. The calculation of Cohen's d for this case is as follows:

d = (|∆p_1| − |∆p_2|) / √((s_1² + s_2²) / 2)   (2)
∆p_1 and ∆p_2 are the means of all relative deviations calculated via (1) for two different donor limits; s_1² and s_2² are the corresponding variances of the relative deviations. Using absolute values for ∆p_1 and ∆p_2 allows interpreting the sign of d: a positive sign means that the second case of donor limitation performed better than the first, while a negative sign means the converse. As with any qualitative interpretation of results, thresholds are quite arbitrary and dependent on the investigator's frame of reference. Recommendations ([9], [12]) are to consider deviations larger than 10% of a standard deviation as meaningful, and thus the threshold for considering effects nontrivial is set to |d| ≥ .1.
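Equations (1) and (2) translate directly into code. Whether the pooled variances in (2) refer to the signed or the absolute deviations is not spelled out above, so this sketch assumes the absolute deviations:

```python
import statistics

def rel_dev(p_imputed, p_true):
    """Relative deviation (1) of a post-imputation parameter estimate."""
    return (p_imputed - p_true) / p_true

def cohens_d(devs1, devs2):
    """Effect size (2): compares the mean absolute relative deviations of
    two donor-limit settings against their pooled standard deviation.
    A positive d means the second setting estimated more accurately."""
    a1 = [abs(d) for d in devs1]
    a2 = [abs(d) for d in devs2]
    pooled = ((statistics.pvariance(a1) + statistics.pvariance(a2)) / 2) ** 0.5
    return (statistics.fmean(a1) - statistics.fmean(a2)) / pooled
```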
5. Results

Based on the simulation's results, the research questions formulated in section 4 are now answered. Section 5.1 deals with whether or not minimum imputation variance is always achieved, independent of the data and chosen hot deck procedure, when the most stringent donor limit is applied. Section 5.2 deals with whether or not a donor limit will introduce a bias. Influencing factors are analyzed for each hot deck method separately in section 5.3.
Table 1. Frequency distribution of minimum imputation variance

                                            Donor limit
Evaluated parameter            once      25%       50%       unlim.
Quantitative   Mean            68.52%    15.47%    7.95%     8.06%
variables      Var.            67.25%    15.74%    8.56%     8.45%
               Corr.           48.84%    19.98%    15.24%    15.93%
Ordinal        Med.            74.54%    11.38%    7.62%     6.46%
variables      Q. dist.        85.88%    5.71%     4.96%     3.45%
               Rank corr.      62.27%    14.47%    11.75%    11.52%
Binary         Rel. freq.      78.36%    8.41%     6.96%     6.27%
variables      Cont. coef.     61.64%    14.29%    11.71%    12.37%
Table 2. Frequency distribution of minimum imputation bias

                                            Donor limit
Evaluated parameter            once      25%       50%       unlim.
Quantitative   Mean            42.71%    20.22%    18.48%    18.60%
variables      Var.            54.05%    17.79%    13.04%    15.12%
               Corr.           38.08%    19.60%    18.96%    23.36%
Ordinal        Med.            46.41%    21.53%    14.47%    17.59%
variables      Q. dist.        56.83%    16.24%    12.94%    13.99%
               Rank corr.      40.63%    20.99%    20.24%    18.15%
Binary         Rel. freq.      49.42%    18.94%    15.07%    16.57%
variables      Cont. coef.     63.10%    15.83%    9.81%     11.27%
5.1. Donor Limitation Impact on Precision

The theoretical reduction in imputation variance through donor selection without replacement, as put forth by Kalton and Kish [16], is investigated empirically at this point. Table 1 shows how often, over all simulated situations, a certain donor limit leads to the least imputation variance in the parameter estimate.
Clearly, a donor limit of one leads to minimal imputation variance in most cases and thus can be expected to yield the highest precision in parameter estimation. Estimation precision also tends to increase with the stringency of the donor limit. Variables with a lower scale type, binary and ordinal, favor donor selection without replacement even more strongly than the quantitative variables. Nonetheless, this recommendation does not hold for all situations. Some situations demand using donors more often, while others require protection against overusage.
5.2. Donor Limitation Impact on Imputation Bias

To answer the question, more pressing for the practitioner, of whether implementing a donor limit also leads to a reduction in imputation bias, the recorded data was evaluated in a similar fashion. Table 2 shows the percentage of situations in which a certain donor limit yields the least bias, as measured by the mean relative deviations.
The values indicate, just as with the imputation variance previously discussed, that in most cases donor selection without replacement leads to the best expected parameter estimation. Minimal imputation bias is mostly achieved by limiting donor usage to a single time, but even more so than for the imputation variance, there are situations where other donor limits improve hot deck performance. Measures of variability are more strongly affected than those of location,
Table 3. Effect sizes for each factor

                                 Quantitative variables    Ordinal variables               Binary variables
                                 Mean    Var.    Corr.     Med.    Q. dist.  Rank corr.   Rel. freq.  Cont. coef.
Imputation        2              .000    -.068   -.003     -.001   -.029     -.014        -.072       -.065
class count       7              .000    -.147   -.052     -.003   -.115     -.054        -.090       -.118
Objects per       50             .000    -.112   -.005     -.001   -.073     -.019        -.028       -.116
imputation class  250            .000    -.090   -.162     -.005   -.041     -.145        -.141       -.146
Class             Strong         .000    -.092   -.004     -.001   -.072     -.013        -.072       -.088
structure         Weak           .000    -.094   -.008     -.001   -.045     -.019        -.080       -.102
Portion of        5%             .000    -.025   -.002     .000    -.013     -.003        -.011       -.020
missing data      10%            .000    -.071   -.002     .000    -.037     -.010        -.051       -.061
                  20%            .000    -.148   -.008     .000    -.100     -.027        -.129       -.156
Missingness       MCAR           .001    -.088   -.004     -.001   -.053     -.015        -.065       -.087
mechanism         MAR            .000    -.100   -.006     .000    -.066     -.017        -.086       -.101
                  NMAR           .001    -.091   -.003     .000    -.058     -.013        -.077       -.089
Hot deck          SimDW          -.001   .153    -.008     -.002   .025      .024         .075        -.147
method            SimDM          -.004   -.339   -.018     .005    -.214     -.058        -.338       -.222
                  SeqDW          .001    -.007   .000      -.003   .000      .002         -.005       -.057
                  SeqDM          .000    -.088   -.004     .010    -.133     -.006        -.041       -.078
                  SimR           .000    -.001   .000      -.001   -.004     .001         .000        -.007
                  SeqR           .000    -.001   .000      .000    -.001     .002         -.003       -.002
which means that in some cases donor limitation will lead to less accurate confidence intervals. Contingency measures are affected less than both location and variability measures, signifying that the choice of donor limit is even more important if the association between variables is of interest.
5.3. Analysis of Donor Limit Influencing Factors

Cohen's d is used to analyze which of the factors influence whether or not a donor limitation is beneficial. The tables in the following sections first highlight main effects, followed by between-factor effects on any donor limit advantages. Effect sizes are calculated between the two extreme cases, donor selection without and with replacement. Effects exceeding the threshold value of .1 are considered meaningful, with negative values indicating an advantage for the most stringent donor limit.
5.3.1. Analysis of Main Effects
Table 3 shows the cross-classification between all factors and factor levels and all parameters analyzed.
The first conclusion that can be reached upon investigation of the results is that, independent of the chosen factors, there are no meaningful differences between using a donor limit and using no donor limit for mean and median estimation. This result is congruent with the results of the previous section. In contrast, parameters measuring variability are more heavily influenced by the variation of the chosen factors. Especially data matrices with a high proportion of missing data, as well as those imputed with SimDM, profit significantly from a donor limitation. Correlation measures are influenced mainly by the number of objects per imputation class. All effects related to the binary variables are negative, indicating that especially these types of variables profit from donor selection without replacement. Also, a high number of imputation classes tends to speak for a limit on donor usage.
The class structure and the random hot deck procedures, as well as SeqDW, have no influence on whether a donor limit is advantageous. Fairly conspicuous is the fact that SimDW leads to partially positive effect sizes, meaning that leaving donor usage unlimited is favorable. This leads to interesting higher-order effects, detailed in the following section.
5.3.2. Analysis of Interactions
Based on the findings in the previous section, effects are investigated stratified by the hot deck methods SimDW, SimDM, and SeqDM. Results for the parameters mean and median, for the quantitative and ordinal variables respectively, are omitted because no circumstance considered yielded meaningful differences. The values for the remaining parameters are shown in table 4.
As in the analysis of main effects, this table clearly shows that using SimDW with no donor limit is advantageous in most cases. If solely the estimation of the association between binary variables is of interest, limiting donor usage to once is always appropriate. Furthermore, the other two methods, SimDM and SeqDM, show only negative values. Thus, the advantage of using a hot deck with a donor limit is strongly dependent upon the imputation method used.
For all three portrayed methods, a high number of imputation classes and a high percentage of missing data show meaningful effects, indicating an increased tendency for whichever strategy of choosing a donor limit is advantageous. The number of objects per imputation class shows no homogeneous effect on the parameters; rather, it seems to strengthen the advantage that donor limitation or non-limitation has, with the parameters variance and quartile distance reacting inversely to the other four.
The other factors seemingly do not influence the effects, as their variation does not lead to great differences in the effect sizes, making their absolute level dependent only on the variable's scale or the imputation method.
Besides the results shown in table 4, further cross-classifications between factors may be calculated. These effect sizes further highlight the additive nature of the factors systematically varied in this study. Some strikingly large effects arise when considering large amounts of missingness and imputation classes. For example, the factor level combination of 20% missing data, a high number of imputation classes, and a low number of objects per imputation class leads to effects of up to -1.7 in variance, up to -1.9 in quartile distance, -3.6 in correlation, -2.9 in rank correlation, and -2.5 in the coefficient of contingency when imputing with the SimDM algorithm. Maximum effects when imputing with the SimDW method are reached with 20% missing data, seven imputation classes, and a low number of objects per imputation class.
Effect sizes of up to -3 are calculated for the relative frequency of the binary variable when the number of imputation classes is large, each class has many objects, and many values are missing. This signifies a large advantage for donor selection without replacement when using SimDM. On the other hand, when using SimDW, the largest effects are calculated when the number of classes is high but the number of objects is low, while having a high rate of missingness. Even though this only leads to effects of up to .6 and .34 for variance and quartile distance respectively, the effect is noticeable and relevant for donor selection with replacement. Conspicuous nonetheless is the fact that especially the combination of hot deck variant,
Table 4. Interactions between imputation method and other factors

                                 Var.    Q. dist.  Rel. freq.  Corr.   Rank corr.  Cont. coef.
SimDW
Imputation        2              .097    .025      .081        -.005   .020        -.120
class count       7              .287    .033      .075        .084    .106        -.176
Objects per       50             .182    .082      .034        -.009   .029        -.177
imputation class  250            .143    .056      .140        .337    .314        .012
Class             Strong         .144    .006      .071        -.007   .018        -.145
structure         Weak           .153    .048      .078        .001    .033        -.154
Portion of        5%             .065    -.012     .031        -.006   .008        -.054
missing data      10%            .148    .006      .077        -.004   .023        -.137
                  20%            .203    .061      .101        -.006   .034        -.216
Missingness       MAR            .151    .025      .079        -.011   .023        -.150
mechanism         MCAR           .153    .023      .067        -.005   .026        -.143
                  NMAR           .154    .029      .077        -.004   .022        -.148
SimDM
Imputation        2              -.247   -.101     -.300       -.015   -.050       -.156
class count       7              -.521   -.382     -.424       -.185   -.213       -.278
Objects per       50             -.426   -.284     -.132       -.021   -.074       -.280
imputation class  250            -.319   -.131     -.684       -.505   -.473       -.445
Class             Strong         -.338   -.269     -.313       -.014   -.049       -.217
structure         Weak           -.339   -.156     -.362       -.033   -.073       -.233
Portion of        5%             -.084   -.057     -.045       -.007   -.010       -.048
missing data      10%            -.262   -.162     -.213       -.011   -.034       -.159
                  20%            -.558   -.345     -.600       -.028   -.108       -.369
Missingness       MAR            -.355   -.226     -.372       -.021   -.064       -.235
mechanism         MCAR           -.326   -.204     -.296       -.017   -.055       -.212
                  NMAR           -.334   -.213     -.344       -.015   -.054       -.220
SeqDM
Imputation        2              -.066   -.082     -.049       -.002   -.008       -.065
class count       7              -.130   -.217     -.031       -.051   -.003       -.090
Objects per       50             -.111   -.196     -.004       -.004   -.007       -.104
imputation class  250            -.088   -.047     -.098       -.130   -.086       -.089
Class             Strong         -.085   -.132     -.040       -.003   -.006       -.067
structure         Weak           -.091   -.135     -.042       -.008   -.006       -.096
Portion of        5%             -.013   -.028     -.004       .000    .001        -.011
missing data      10%            -.039   -.073     -.010       -.002   .001        -.033
                  20%            -.168   -.233     -.085       -.007   -.015       -.147
Missingness       MAR            -.107   -.152     -.058       -.004   -.010       -.101
mechanism         MCAR           -.075   -.119     -.025       -.003   -.003       -.063
                  NMAR           -.081   -.125     -.038       -.004   -.004       -.068
number of imputation classes, objects per imputation class, and portion of missing data leads to strong effects, indicating strong advantages both for and against donor limitation.
6. Conclusions

The simulation conducted shows distinct differences between different levels of donor limits. Contrary to what Kalton and Kish [16] suggested, the smallest imputation variance is not always achieved when donors are selected from the pool without replacement. Their suggestion is thus limited to a subset of the many possible combinations of situations and hot deck types. When imputation bias is taken into account, it becomes apparent that there are many more situations where overly limiting donor usage is ill advised. For most parameters, the chances are less than 50/50 that the most extreme donor limit is advisable.
Further, there are some subsets of situations in which both imputation variance and bias are minimal when one of the two dynamic donor limits is chosen. This indicates that neither donor selection with nor without replacement is always superior, but that there is indeed a trade-off between protection from donor overusage and the ability to choose the most similar donor. Thus, the truth lies between the arguments presented in section 3.
These findings show that the most influential factor in deciding whether or not to impute using donor selection without replacement is the hot deck method used. When using random hot deck methods, the question of choosing a donor limit is moot; implementing a donor limit into an existing system would not be worth the effort. When considering nearest neighbor hot decks, not only the method of compensating for the missing data prior to calculating the dissimilarity measure is influential, but also whether variables are processed simultaneously or sequentially. With distance calculation assisted by mean imputation, donor selection without replacement always denotes the superior strategy. If a reweighting scheme is chosen, parameter estimation (excluding the contingency coefficient for binary variables) is never worse when donors may be chosen an unlimited number of times: sequential processing of variables leads to trivial differences, while simultaneous processing leads to noticeable advantages when allowing unlimited donor usage. Beyond that, the overall magnitude of the advantage of any donor usage tactic is determined by the factors objects per imputation class, number of imputation classes, and proportion of missing data. These results, in conjunction with the intended post-imputation analyses, dictate which donor limit, with or without replacement, is most suitable. For example, if a decision tree is to be constructed with a CHAID algorithm, imputation should use donor selection without replacement, because the contingency coefficient is best estimated that way.
In conclusion, some interesting questions can be answered with this research, while others remain open. Results from sections 5.1 and 5.2 indicate that there may be a situation-dependent optimal donor limit which may be dynamically determined from the data at hand. Hence, a hot deck method with a data-driven donor limit selection method may have desirable properties. Finally, the large number of situations under which donor selection without replacement is the superior strategy raises questions. Since imputing without donor replacement generally makes the results dependent on the sequence of the recipients, results of hot deck imputation could be further improved if donor selection were performed not to minimize the distance at each step, but to minimize the sum of distances between all donors and recipients. Thus, further research pertaining to hot deck imputation and donor selection schemes remains worthwhile.
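Minimizing the sum of distances over all donor-recipient pairs, rather than each recipient's distance in turn, is an assignment problem. The toy sketch below (hypothetical names; brute-force search, so only suitable for tiny instances) contrasts the sequence-dependent greedy selection without replacement against the globally optimal assignment:

```python
from itertools import permutations

def greedy_assignment(dist):
    """Assign each recipient (row) in sequence to its nearest unused donor."""
    used, total, pairs = set(), 0.0, []
    for i, row in enumerate(dist):
        j = min((j for j in range(len(row)) if j not in used),
                key=lambda j: row[j])
        used.add(j)
        total += row[j]
        pairs.append((i, j))
    return pairs, total

def optimal_assignment(dist):
    """Brute-force assignment minimizing the total recipient-donor distance."""
    n, m = len(dist), len(dist[0])
    best = min(permutations(range(m), n),
               key=lambda p: sum(dist[i][p[i]] for i in range(n)))
    return list(enumerate(best)), sum(dist[i][best[i]] for i in range(n))

# Two recipients (rows), two donors (columns). The greedy pass lets the
# first recipient grab donor 0, forcing the second onto a distant donor;
# the optimal assignment swaps the pairing and does not depend on order.
dist = [[1.0, 2.0],
        [1.5, 9.0]]
greedy = greedy_assignment(dist)    # total distance 10.0
optimal = optimal_assignment(dist)  # total distance 3.5
```

The gap between the two totals illustrates why sequence-independent, globally optimized donor selection is a promising direction for further research.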