Journal of Theoretical and Applied Computer Science


Journal of Theoretical and Applied

Computer Science

Vol. 6, No. 3, 2012

QCA & CQCA: QUAD COUNTRIES ALGORITHM AND CHAOTIC QUAD COUNTRIES ALGORITHM

M. A. Soltani-Sarvestani, Shahriar Lotfi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

EFFECTIVENESS OF MINI-MODELS METHOD WHEN DATA MODELLING WITHIN A 2D-SPACE IN AN

INFORMATION DEFICIENCY SITUATION

Marcin Pietrzykowski . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

SMARTMONITOR: RECENT PROGRESS IN THE DEVELOPMENT OF AN INNOVATIVE VISUAL

SURVEILLANCE SYSTEM

Dariusz Frejlichowski, Katarzyna Gościewska, Paweł Forczmański, Adam Nowosielski,

Radosław Hofman . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

NONLINEARITY OF HUMAN MULTI-CRITERIA IN DECISION-MAKING

Andrzej Piegat, Wojciech Sałabun . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

METHOD OF NON-FUNCTIONAL REQUIREMENTS BALANCING DURING SERVICE DEVELOPMENT

Larisa Globa, Tatiana Kot, Andrei Reverchuk, Alexander Schill . . . . . . . . . . . . . . . . . . . . . . . 50

DONOR LIMITED HOT DECK IMPUTATION: EFFECTS ON PARAMETER ESTIMATION

Dieter William Joenssen, Udo Bankhofer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58


Journal of Theoretical and Applied Computer Science. Scientific quarterly of the Polish Academy of Sciences, The Gdańsk Branch, Computer Science Commission.

Scientific advisory board:

Chairman:

Prof. Henryk Krawczyk, Corresponding Member of Polish Academy of Sciences,

Gdansk University of Technology, Poland

Members:

Prof. Michał Białko, Member of Polish Academy of Sciences, Koszalin University of Technology, Poland

Prof. Aurélio Campilho, University of Porto, Portugal

Prof. Ran Canetti, School of Computer Science, Tel Aviv University, Israel

Prof. Gisella Facchinetti, Università del Salento, Italy

Prof. André Gagalowicz, The National Institute for Research in Computer Science and Control (INRIA), France

Prof. Constantin Gaindric, Corresponding Member of Academy of Sciences of Moldova, Institute of Mathematics and Computer

Science, Republic of Moldova

Prof. Georg Gottlob, University of Oxford, United Kingdom

Prof. Edwin R. Hancock, University of York, United Kingdom

Prof. Jan Helmke, Hochschule Wismar, University of Applied Sciences, Technology, Business and Design, Wismar, Germany

Prof. Janusz Kacprzyk, Member of Polish Academy of Sciences, Systems Research Institute, Polish Academy of Sciences, Poland

Prof. Mohamed Kamel, University of Waterloo, Canada

Prof. Marc van Kreveld, Utrecht University, The Netherlands

Prof. Richard J. Lipton, Georgia Institute of Technology, USA

Prof. Jan Madey, University of Warsaw, Poland

Prof. Kirk Pruhs, University of Pittsburgh, USA

Prof. Elisabeth Rakus-Andersson, Blekinge Institute of Technology, Karlskrona, Sweden

Prof. Leszek Rutkowski, Corresponding Member of Polish Academy of Sciences, Czestochowa University of Technology, Poland

Prof. Ali Selamat, Universiti Teknologi Malaysia (UTM), Malaysia

Prof. Stergios Stergiopoulos, University of Toronto, Canada

Prof. Colin Stirling, University of Edinburgh, United Kingdom

Prof. Maciej M. Sysło, University of Wrocław, Poland

Prof. Jan Węglarz, Member of Polish Academy of Sciences, Poznan University of Technology, Poland

Prof. Antoni Wiliński, West Pomeranian University of Technology, Szczecin, Poland

Prof. Michal Zábovský, University of Zilina, Slovakia

Prof. Quan Min Zhu, University of the West of England (UWE), Bristol, United Kingdom

Editorial board:

Editor-in-chief:

Dariusz Frejlichowski, West Pomeranian University of Technology, Szczecin, Poland

Managing editor:

Piotr Czapiewski, West Pomeranian University of Technology, Szczecin, Poland

Section editors:

Michaela Chocholata, University of Economics in Bratislava, Slovakia

Piotr Dziurzański, West Pomeranian University of Technology, Szczecin, Poland

Paweł Forczmański, West Pomeranian University of Technology, Szczecin, Poland

Przemysław Klęsk, West Pomeranian University of Technology, Szczecin, Poland

Radosław Mantiuk, West Pomeranian University of Technology, Szczecin, Poland

Jerzy Pejaś, West Pomeranian University of Technology, Szczecin, Poland

Izabela Rejer, West Pomeranian University of Technology, Szczecin, Poland

ISSN 2299-2634

The on-line edition of JTACS can be found at: http://www.jtacs.org. The printed edition is to be considered the primary one.

Publisher:

Polish Academy of Sciences, The Gdańsk Branch, Computer Science Commission

Address: Waryńskiego 17, 71-310 Szczecin, Poland

http://www.jtacs.org, email: [email protected]


Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 3-20

ISSN 2299-2634 http://www.jtacs.org

QCA & CQCA: Quad Countries Algorithm and Chaotic

Quad Countries Algorithm

M. A. Soltani-Sarvestani¹, Shahriar Lotfi²

1 Computer Engineering Department, University College of Nabi Akram, Tabriz, Iran

2 Computer Science Department, University of Tabriz, Tabriz, Iran

[email protected], [email protected]

Abstract: This paper introduces an improved evolutionary algorithm based on the Imperialist Competitive Algorithm (ICA), called the Quad Countries Algorithm (QCA), and, with a small change, the Chaotic Quad Countries Algorithm (CQCA). The Imperialist Competitive Algorithm is inspired by the socio-political process of imperialistic competition in the real world and has shown reliable performance in optimization problems. The algorithm converges quickly, but it easily gets stuck in a local optimum when solving high-dimensional optimization problems. In the ICA, the countries are classified into two groups, Imperialists and Colonies, where Imperialists absorb Colonies. In the proposed algorithm two other kinds of countries, namely Independent and Seeking Independence countries, are added to the collection of countries, which improves exploration. In the suggested algorithm, Seeking Independence countries move in the direction opposite to the Imperialists, and Independent countries move arbitrarily; in this paper two different movements are considered for this group: random movement (QCA) and chaotic movement (CQCA). On the other hand, in the ICA the Imperialists' positions are fixed, while in the proposed algorithm Imperialists move if they can reach a better position than their previous one. The proposed algorithm was tested on well-known benchmarks, and comparing the results of the QCA and CQCA with those of the ICA, the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), the Particle Swarm inspired Evolutionary Algorithm (PS-EA) and the Artificial Bee Colony (ABC) shows that the QCA performs better than all the mentioned algorithms. Over all cases, the QCA, ABC and PSO achieve the best performance in about 50%, 41.66% and 8.33% of cases, respectively.

Keywords: optimization, Imperialist Competitive Algorithm (ICA), Independent country, Seeking Independence country, Quad Countries Algorithm (QCA), Chaotic Quad Countries Algorithm (CQCA)

1. Introduction

Evolutionary algorithms (EA) [1, 2] are algorithms that are inspired by nature and have

many applications to solving NP problems in various fields of science. Some of the famous

Evolutionary Algorithms proposed for optimization problems are: the Genetic Algorithm

(GA) [2, 3, 4], first proposed by Holland in 1962 [3], and the Particle Swarm Optimization algorithm (PSO) [5], first proposed by Kennedy and Eberhart in 1995 [5]. In 2007, Atashpaz and Lucas proposed an algorithm known as the Imperialist Competitive Algorithm (ICA) [6, 7], which was inspired by a socio-human phenomenon. Since 2007, attempts have been made to increase the efficiency of the ICA. Zhang, Wang and Peng proposed an approach based on the concept of small probability perturbation to enhance the movement of colonies toward the imperialist, in 2009 [8]. Faez, Bahrami and Abdechiri, in 2010, proposed a new method using the

chaos theory to adjust the angle of colonies movement toward the Imperialist’s positions

(CICA: Imperialist Competitive Algorithm using Chaos Theory for Optimization) [9], and in

another paper in the same year, they proposed another algorithm that applies the probability

density function to adapt the angle of colonies movement towards imperialist’s position

dynamically, during iterations (AICA: Adaptive Imperialist Competitive Algorithm) [10].

In the Imperialist Competitive Algorithm (ICA), there are only two different types of countries: Imperialists, and the Colonies that the Imperialists absorb. In the real world, however, there are also Independent Countries which are neither Imperialists nor Colonies. Some of the Independent Countries are at peace with the Imperialists, and the others challenge the Imperialists to secure their independence. In the ICA, only the Colonies' movements toward

Imperialists are considered while in the real world each Imperialist moves in order to pro-

mote its political and cultural position. In the Quad Countries Algorithm (QCA) and Chaotic

Quad Countries Algorithm (CQCA), countries are divided into four categories: Imperialist,

Colony, Seeking Independent and Independent as each category has its special movement

compared to the others. In the QCA and CQCA, as in the real world, an Imperialist will move if doing so brings it to a better position than its current one.

The rest of this paper is arranged as follows. Section two reviews related work. Section three presents a brief description of the Imperialist Competitive Algorithm. Section four explains the proposed algorithm. In section five, the results are analyzed and the performance of the algorithms is evaluated. Section six presents the conclusions.

2. Related Works

In 2009, Zhang, Wang and Peng [8] noted that the original Imperialist Competitive Algorithm becomes difficult to implement in practice as the dimension of the search space increases, owing to the ambiguous definition of the "random angle" in the optimization process. Compared to the original algorithm, their approach based on the concept of small probability perturbation is simpler to implement, especially for high-dimensional optimization problems. Furthermore, their algorithm was extended to constrained optimization problems, using a classical penalty technique to handle constraints.

In 2010, Faez, Bahrami and Abdechiri [9] introduced a new Imperialist Competitive Al-

gorithm using chaotic maps (CICA). In their algorithm, the chaotic maps were used to adapt

the angle of colonies movement towards imperialist’s position to enhance the escaping ca-

pability from a local optima trap.

In the same year, Faez, Bahrami and Abdechiri [10] introduced an algorithm in which the Absorption Policy is changed dynamically to adapt the angle of the colonies' movement towards the imperialist's position. They noted that the ICA easily gets stuck in a local optimum when solving high-dimensional multi-modal numerical optimization problems. To overcome this shortcoming, they used a probabilistic model that utilizes the information of the colonies' positions to balance the exploration and exploitation abilities of the imperialistic competitive algorithm. Using this mechanism, the exploration capability of the ICA is enhanced.


3. The Imperialist Competitive Algorithm (ICA)

Imperialist Competitive Algorithm (ICA) was proposed for the first time by Atashpaz

and Lucas in 2007 [6]. ICA is a new evolutionary algorithm in the Evolutionary Computa-

tion (EC) field based on the human socio-political evolution. The algorithm starts with an

initial random population called countries; then some of the best countries in the population are selected to be the imperialists and the rest form the colonies of these imperialists. The colonies are divided between the imperialists according to imperial power. In an N_var-dimensional optimization problem, a country is a 1×N_var array, defined as below:

country = [p_1, p_2, \ldots, p_{N_{var}}].    (1)

The cost of a country is found by evaluating the cost function f at the variables (p_1, p_2, \ldots, p_{N_{var}}):

c_i = f(country_i) = f(p_1, p_2, \ldots, p_{N_{var}}).    (2)

The algorithm starts with N_pop initial countries, and the N_imp most powerful of them are chosen as imperialists. The remaining countries are colonies, each belonging to an imperialist according to the imperialists' powers. To distribute the colonies among the imperialists proportionally, the normalized cost of an imperialist is defined as follows:

C_n = \max_i \{ c_i \} - c_n,    (3)

where c_n is the cost of the nth imperialist and C_n is its normalized cost. An imperialist with a higher cost value has a lower normalized cost value. Having the normalized cost, the normalized power of each imperialist is calculated as below, and based on this the colonies are distributed among the imperialist countries:

P_n = C_n / \sum_{i=1}^{N_{imp}} C_i,    (4)

where P_n is the normalized power of an imperialist. On the other hand, the normalized power of an imperialist is reflected in the number of its colonies. The initial number of colonies of an empire is therefore

NC_n = \mathrm{round} \{ P_n \cdot N_{col} \},    (5)

where NC_n is the initial number of colonies of the nth empire and N_col is the total number of colonies.
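A minimal sketch of equations (3)-(5) in Python follows (function and variable names are illustrative choices, not the authors' implementation):

```python
import numpy as np

def form_empires(costs, n_imp, rng=np.random.default_rng(0)):
    """Sketch of eqs. (3)-(5): split countries into imperialists and colonies.

    costs : 1-D array with the cost of every country (lower is better).
    n_imp : number of imperialists.
    Returns the imperialist indices and one array of colony indices per empire.
    """
    order = np.argsort(costs)                    # best (lowest cost) countries first
    imp_idx, col_idx = order[:n_imp], order[n_imp:]
    C = costs[imp_idx].max() - costs[imp_idx]    # eq. (3): normalized cost
    P = C / C.sum() if C.sum() > 0 else np.full(n_imp, 1.0 / n_imp)  # eq. (4)
    NC = np.round(P * len(col_idx)).astype(int)  # eq. (5): colonies per empire
    NC[-1] = len(col_idx) - NC[:-1].sum()        # simple correction so the counts sum exactly
    shuffled = rng.permutation(col_idx)          # colonies are assigned randomly
    return imp_idx, np.split(shuffled, np.cumsum(NC)[:-1])
```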

To distribute the colonies, NC_n of the colonies are selected randomly and assigned to the nth imperialist. The imperialist countries absorb the colonies towards themselves using the absorption policy. The absorption policy forms the main core of this algorithm and causes the countries to move towards their minima; this policy is shown in Fig. 1. In the absorption policy, the colony moves towards the imperialist by x units. The direction of movement is the vector from colony to imperialist, as shown in Fig. 1. In this figure, the distance between the imperialist and the colony is denoted by d, and x is a random variable with uniform distribution:

x \sim U(0, \beta \times d),    (6)


where β is greater than 1 and close to 2; in [6] it is mentioned that a proper choice can be β = 2. In the ICA, to search different points around the imperialist, a random amount of deviation is added to the direction of the colony's movement towards the imperialist. In Fig. 1, this deflection angle is shown as θ, which is chosen randomly from a uniform distribution:

\theta \sim U(-\gamma, \gamma).    (7)

While moving toward the imperialist, a colony may reach a position that is better than the imperialist's; in that case the colony and the imperialist exchange positions.

Figure 1. Moving colonies toward their imperialist [6] (the colony moves x units along the colony-imperialist vector of length d, deflected by the angle θ)
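For illustration, here is a sketch of the absorption move of equations (6)-(7), under the assumption that the deviation θ rotates the colony-to-imperialist direction within a randomly chosen plane; parameter names such as beta and gamma are illustrative, not taken from the paper:

```python
import numpy as np

def absorb(colony, imperialist, beta=2.0, gamma=np.pi/4, rng=np.random.default_rng()):
    """Move a colony toward its imperialist: x ~ U(0, beta*d), theta ~ U(-gamma, gamma)."""
    direction = imperialist - colony
    d = np.linalg.norm(direction)
    if d == 0.0:
        return colony.copy()
    x = rng.uniform(0.0, beta * d)            # eq. (6): step length
    theta = rng.uniform(-gamma, gamma)        # eq. (7): deviation angle
    unit = direction / d
    # build a random perpendicular component to realise the deviation angle
    perp = rng.standard_normal(colony.size)
    perp -= perp.dot(unit) * unit
    norm = np.linalg.norm(perp)
    if norm > 0.0:
        unit = np.cos(theta) * unit + np.sin(theta) * (perp / norm)
    return colony + x * unit
```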

The imperialists absorb these colonies towards themselves with respect to their power, as described in (8). The total power of each empire is determined by both of its parts: the power of the imperialist plus a percentage of the average power of its colonies:

TC_n = cost(imperialist_n) + \xi \cdot mean\{ cost(colonies\ of\ empire_n) \},    (8)

where TC_n is the total cost of the nth empire and ξ is a positive number which is considered to be less than one. In the ICA, the imperialistic competition plays an important role. During the imperialistic competition, weak empires lose their power and their colonies. To model this competition, first the probability of possessing all the colonies is calculated for each empire, considering the total cost of the empire:

NTC_n = \max_i \{ TC_i \} - TC_n,    (9)

where TC_n is the total cost of the nth empire and NTC_n is its normalized total cost. Having the normalized total cost, the possession probability of each empire is calculated as below:

p_{p_n} = NTC_n / \sum_{i=1}^{N_{imp}} NTC_i.    (10)

After a while all the empires except the most powerful one will collapse and all the colo-

nies will be under the control of this unique empire.
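A short sketch of equations (8)-(10), computing the total cost of each empire and the possession probabilities that drive the imperialistic competition (names and the handling of empires without colonies are assumptions made for the sketch):

```python
import numpy as np

def possession_probabilities(imp_costs, colony_costs_per_empire, xi=0.1):
    """Eqs. (8)-(10): total cost, normalized total cost and possession probability."""
    TC = np.array([
        ic + xi * np.mean(cc) if len(cc) else ic      # eq. (8)
        for ic, cc in zip(imp_costs, colony_costs_per_empire)
    ])
    NTC = TC.max() - TC                               # eq. (9)
    if NTC.sum() == 0:                                # all empires equally strong
        return np.full(len(TC), 1.0 / len(TC))
    return NTC / NTC.sum()                            # eq. (10)
```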


4. Quad Countries Algorithm (QCA)

In this paper, a new Imperialist Competitive Algorithm is proposed which is called Quad

Countries Algorithm where two new categories of countries are added to the collection of

countries; Independent and Seeking Independence countries. In addition, in the new algo-

rithm Imperialists can also move like the other countries. In the main ICA, there are only

two categories of countries, Imperialist and Colony, and the only movement that exists there

is the Colonies’ movement towards Imperialists, while in the proposed algorithm, there are

four categories of countries with different movements. With only the absorption movement, the primary ICA may fall into a local minimum trap during the search process and end up far from the global optimum. With the changes made to the ICA, a new algorithm called the QCA is obtained, whose power of exploration in the search space is substantially increased, preventing it from sticking in local traps.

4.1. Independent Country

In the real world, there have always been countries which are neither Colonies nor Imperialists. These countries may perform any movement that serves their own interests, trying to improve their current situation. In the proposed algorithm, some countries are defined as Independent countries, which explore the search space randomly. As illustrated in

Fig. 2, if during the search process an Independent country reaches a better position com-

pared to an Imperialist, they definitely exchange their positions. The Independent country

changes to a new Imperialist and will be the owner of old Imperialist’s Colonies, and the

Imperialist changes to an Independent Country and will start to explore the search space like

these kinds of countries.
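One possible way to code this exchange is sketched below. It is a simplified reading in which the Independent country is compared with the weakest Imperialist (an assumption; the paper only says "an Imperialist"), and the helper names are illustrative:

```python
import numpy as np

def independent_step(position, cost_fn, imperialists, imp_costs, step=1.0,
                     rng=np.random.default_rng()):
    """Random move of an Independent country (QCA variant); swap roles with an
    Imperialist if the new position is better than that Imperialist's position."""
    new_pos = position + rng.uniform(-step, step, size=position.size)
    new_cost = cost_fn(new_pos)
    worst = int(np.argmax(imp_costs))          # weakest imperialist
    if new_cost < imp_costs[worst]:
        # the Independent country becomes the new Imperialist of that empire,
        # and the old Imperialist starts exploring as an Independent country
        old_pos, old_cost = imperialists[worst].copy(), imp_costs[worst]
        imperialists[worst], imp_costs[worst] = new_pos, new_cost
        return old_pos, old_cost
    return new_pos, new_cost
```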

As mentioned, the Independent countries can perform any movements in the algorithm

and their movements are arbitrary. In this paper, two different kinds of movements are con-

sidered for the Independent countries. One is a completely random movement. With this

kind of movement, the Independent countries move completely randomly in different direc-

tions, and also independently from each other; this variant is named the QCA. In the second kind of movement, these countries move based on Chaos Theory; this variant is named the CQCA and is explained in the next part.

4.1.1. Definition of Chaotic movement for Independent Countries (CQCA)

In this approach, the Independent countries move according to Chaos Theory. In this

kind of movement, the angle of movement is changed in a Chaotic way during the search

process.


Figure 2. Replacing an Empire with an Independent

This chaotic behaviour of the Independent countries' movements in the CQCA creates the proper conditions for the algorithm to explore more and to escape from local peaks; we introduce this approach as the Chaotic Quad Countries Algorithm (CQCA). Chaos variables are usually generated by some well-known chaotic maps [11, 12]. Table 1 shows some of the chaotic maps used for adjusting the parameter θ (the angle of the Independent countries' movement).

Table 1. Chaotic maps

CM1 (logistic map):     θ_{n+1} = α θ_n (1 − θ_n)
CM2 (sinusoidal map):   θ_{n+1} = α θ_n^2 sin(π θ_n)
CM3 (circle map):       θ_{n+1} = (θ_n + b − (α / 2π) sin(2π θ_n)) mod 1

In Table 1, α is a control parameter and θ_n is the chaotic variable in the nth iteration, which belongs to the interval (0, 1). During the search process, no value of θ is repeated.
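For instance, CM1 (the logistic map) can generate the sequence of movement angles. A small sketch follows; the choice α = 4, which keeps the sequence chaotic on (0, 1), and the scaling of θ to an angle are assumptions, not values fixed by the paper:

```python
import math

def logistic_angles(theta0=0.37, alpha=4.0, n=1000):
    """CM1: theta_{n+1} = alpha * theta_n * (1 - theta_n), theta in (0, 1)."""
    theta = theta0
    for _ in range(n):
        theta = alpha * theta * (1.0 - theta)
        yield theta

# usage: scale each chaotic value to a movement angle, e.g. angle = 2 * math.pi * theta
```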



4.2. Seeking Independence Countries

Seeking Independence Countries are countries which are in conflict with the Imperialists and try to stay away from them. In the main ICA, the only movement is the Colonies' movement toward the Imperialists; in fact, there is only the Absorption policy. By defining the Seeking Independence Countries in the proposed algorithm, a Repulsion policy is added alongside the Absorption policy. Fig. 3 illustrates the Repulsion policy.

Figure 3. Different movement policies: a) Absorption policy only; b) Absorption and Repulsion policies together (empires, colonies, an independent country and the global optimum are marked)

As can be seen in Fig. 3.a, there is only the Absorption policy, which corresponds to the ICA. As the figure shows, applying only the Absorption policy causes the countries' positions to get closer to each other, so the space they surround gradually shrinks and the global optimum might be lost. In Fig. 3.a the algorithm is converging to a local optimum. Fig. 3.b illustrates the process of the proposed algorithm. The black squares represent the Seeking Independence Countries; as can be seen, these countries can steer the search process towards regions which the other countries do not cover. This shows that using the Absorption and Repulsion policies together leads to a better coverage of the search space.

To apply the Repulsion policy in the QCA, first the sum of the differences between the Seeking Independence Country's position and the Imperialists' positions is calculated as a 1×N vector named Center, as in (11):

Center_i = \sum_{j=1}^{N_{imp}} (a_i - p_{ji}),   i = 1, 2, \ldots, N,    (11)

where Center_i is the sum of the differences in the ith component over all Imperialists, p_{ji} is the ith component of the jth Imperialist, a_i is the ith component of the Seeking Independence Country and N indicates the problem dimension. The Seeking Independence Country then moves in the direction of the obtained vector, as in (12):

D = \delta \times Center,   \delta \in (0, 1),    (12)


where δ is the relocation factor and D is the relocation vector whose components are added, component by component, to the Seeking Independence Country's position to obtain its new position.
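A sketch of this Repulsion step, equations (11)-(12), could look as follows (names are illustrative; δ is drawn uniformly from (0, 1) as an assumption):

```python
import numpy as np

def repulsion_step(a, imperialists, rng=np.random.default_rng()):
    """Eqs. (11)-(12): move a Seeking Independence country 'a' away from the imperialists.

    a            : 1-D position of the Seeking Independence country.
    imperialists : 2-D array, one imperialist position per row.
    """
    center = (a - imperialists).sum(axis=0)   # eq. (11): Center_i = sum_j (a_i - p_ji)
    delta = rng.uniform(0.0, 1.0)             # relocation factor, delta in (0, 1)
    D = delta * center                        # eq. (12): relocation vector
    return a + D                              # new position, added component by component
```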

4.3. Imperialists Movement

In the real world, all countries, including Imperialists, make ongoing efforts to improve their current situation, while in the main ICA Imperialists never move; this fixed situation sometimes leads to the loss of global optima or prevents better solutions from being reached.

Fig.4 illustrates this problem clearly. Fig.4 could be a final state of running the ICA, when

only one Imperialist has remained. Since in the ICA Imperialists have no motion, solution 1

is the answer that the ICA returns. In the proposed approach, a random movement is as-

sumed for Imperialists in each iteration and the cost of this hypothetical position will be

calculated. If the cost of the new position is less than the cost of the previous one, the Impe-

rialist will move to the new position, otherwise the Imperialist will not move. As can be

seen in Fig.4, using this method leads to solution 2 which is a better solution than solution 1.

Figure 4. A final state of ICA and QCA

To apply this policy in the QCA, first a set of random values, one per problem dimension, is generated as in (13):

\alpha_i = Rand \times I,    (13)

where I is an arbitrary value that depends on the problem size. The new position of the Imperialist is then obtained as in (14):

(P_1, \ldots, P_{N_{var}}) = (P_1 + \alpha_1, \ldots, P_{N_{var}} + \alpha_{N_{var}})   if  f(P_1 + \alpha_1, \ldots, P_{N_{var}} + \alpha_{N_{var}}) < f(P_1, \ldots, P_{N_{var}}),
(P_1, \ldots, P_{N_{var}}) = (P_1, \ldots, P_{N_{var}})   otherwise,    (14)

where the α_i are the numbers obtained in Equation (13) and P_i is the value of the ith dimension of the country. In fact, equation (14) states that if the new position of the Imperialist is better than its current position, the Imperialist moves to the new position; otherwise, it remains in its current position.
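A minimal sketch of this greedy Imperialist move, equations (13)-(14), is given below; the symmetric range used for Rand is an assumption, since the paper does not fix its sign:

```python
import numpy as np

def imperialist_step(position, cost_fn, I=1.0, rng=np.random.default_rng()):
    """Eqs. (13)-(14): try a random move and keep it only if the cost improves."""
    alpha = rng.uniform(-I, I, size=position.size)    # eq. (13): one value per dimension
    candidate = position + alpha
    if cost_fn(candidate) < cost_fn(position):         # eq. (14): accept only improvements
        return candidate
    return position
```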

With the actions of the Seeking Independence and Independent countries now specified, adding these policies and actions to the ICA produces a new algorithm called the Quad Countries Algorithm (QCA); defining a chaotic movement for the Independent Countries produces another algorithm, named the Chaotic Quad Countries Algorithm (CQCA). Both have better performance compared to the ICA.

5. Evaluation and Experimental Results

In this paper, two new algorithms based on the Imperialist Competitive Algorithm

(ICA), called Quad Countries Algorithm (QCA) and Chaotic Quad Countries Algorithm

(CQCA) are introduced and were applied to some well-known benchmarks in order to veri-

fy their performance and compare to ICA. These benchmark functions are presented in Ta-

ble 2.

The simulation was made to evaluate the rate of convergence and the quality of the optima found by the proposed algorithms in comparison to the ICA, with all benchmarks tested for minimization. Both algorithms were applied under identical conditions in 2, 10, 30 and 50 dimensions. The number of countries in both algorithms was 125: 10 Imperialists and 115 Colonies in the ICA, and 10 Imperialists, 80 Colonies, 18 Seeking Independence countries and 17 Independent countries in the QCA and CQCA. Each algorithm was run 100 times with 1000 generations per run, and the averages over these runs are recorded in Table 3.

Table 2. Benchmarks for simulation (D denotes the problem dimension)

Ackley:              f(x) = -20 \exp(-0.2 \sqrt{\frac{1}{D} \sum_{i=1}^{D} x_i^2}) - \exp(\frac{1}{D} \sum_{i=1}^{D} \cos(2\pi x_i)) + 20 + e,   range [-32.768, 32.768]
Griewank:            f(x) = \frac{1}{4000} \sum_{i=1}^{D} x_i^2 - \prod_{i=1}^{D} \cos(x_i / \sqrt{i}) + 1,   range [-600, 600]
Rastrigin:           f(x) = \sum_{i=1}^{D} (x_i^2 - 10 \cos(2\pi x_i) + 10),   range [-15, 15]
Sphere:              f(x) = \sum_{i=1}^{D} x_i^2,   range [-600, 600]
Rosenbrock:          f(x) = \sum_{i=1}^{D-1} (100 (x_{i+1} - x_i^2)^2 + (x_i - 1)^2),   range [-15, 15]
Symmetric Griewank:  f(x) = -(\frac{1}{4000} \sum_{i=1}^{D} x_i^2 - \prod_{i=1}^{D} \cos(x_i / \sqrt{i}) + 1),   range [-600, 600]
Symmetric Rastrigin: f(x) = -\sum_{i=1}^{D} (x_i^2 - 10 \cos(2\pi x_i) + 10),   range [-600, 600]
Symmetric Sphere:    f(x) = -\sum_{i=1}^{D} x_i^2,   range [-600, 600]
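For reference, a sketch of a few of the benchmarks of Table 2 as they are commonly defined, assuming (as the table suggests) that the "Symmetric"/"Inverse" variants are simply the negated functions:

```python
import numpy as np

def sphere(x):
    return np.sum(x**2)

def rastrigin(x):
    return np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)

def griewank(x):
    i = np.arange(1, x.size + 1)
    return np.sum(x**2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))) + 1.0

def ackley(x):
    d = x.size
    return (-20.0 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / d))
            - np.exp(np.sum(np.cos(2.0 * np.pi * x)) / d) + 20.0 + np.e)

# e.g. the "Griewank Inverse" benchmark used in Figures 5-8 would be -griewank(x)
```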

The experiments started with the Griewank Inverse function. Griewank Inverse is a hill-like function and its global optima are located in the corners of the search space. Both algorithms were applied 100 times under identical conditions with randomly selected starting points. Figures 5-8 illustrate the averaged results of these 100 runs of Griewank Inverse in different dimensions, with 1000 generations in each run.


In Figures 5.a, 6.a, 7.a and 8.a the horizontal axis indicates the number of iterations.

These graphs show the obtained results in each iteration for each algorithm. And in Figures

5.b, 6.b, 7.b and 8.b the horizontal axis indicates the number of generation. These graphs

illustrate the convergence of algorithms. As mentioned, two different kinds of motions are

defined for Independent countries: Chaotic and random motions which are named CQCA

and QCA respectively. So there are three curves in all graphs in these Figures, ICA, QCA

and CQCA.

Figure 5.a illustrates the results of 100 runs of the algorithms on Griewank Inverse with two dimensions. In 79 of the 100 runs the QCA and CQCA achieve better results than the ICA. As can be seen in Figures 6.a, 7.a and 8.a, when the function's dimension is increased to 10, 30 and 50, the QCA and CQCA achieve better results than the ICA in all 100 runs. Figures 5.b, 6.b, 7.b and 8.b illustrate the average convergence of the algorithms; as can be seen, in addition to the quality of the results, the convergence of the QCA and CQCA is also faster than that of the ICA. As the problem dimension increases, the performance of the ICA decreases, while the QCA and CQCA maintain their performance. It is worth noting that the results of the two kinds of Independent countries' movement are so close to each other that their curves coincide.

The results observed when applying the algorithms to the rest of the benchmarks in Table 2 were approximately similar to Griewank Inverse; they are shown in Table 3. Table 3 has 14 columns. From left to right: the 1st column indicates the benchmark's name, the 2nd the range of the function's parameters, the 3rd the function's dimension and the 4th the optimum of the benchmark. The 5th column indicates the best result obtained by the QCA, and the 8th and 11th columns give the best results of the CQCA and the ICA, respectively. The 6th, 9th and 12th columns indicate the average of the results over 100 runs of the QCA, CQCA and ICA, respectively, and the 7th, 10th and 13th columns indicate the standard deviation (SD) of the QCA, CQCA and ICA. The 14th column indicates the rate of improvement of the QCA in comparison to the ICA.

As can be seen, the QCA and CQCA results are better than those of the ICA in all cases except Schwefel, where all algorithms achieve the same results. The results recorded in Table 3 show that, as the problem dimension increases, the performance of the QCA and CQCA improves relative to the ICA.

The results of the QCA and CQCA are remarkably close to each other. Each function in Table 3 was run 100 times, with up to 1000 generations per run, using the same starting points, in 2, 10, 30 and 50 dimensions.

In a further comparison, the results are compared to the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), PS-EA and the Artificial Bee Colony (ABC) in Table 4. As can be seen, the results of the proposed algorithm are better than GA and PS-EA in 100 percent of cases. In the comparison with ABC and PSO the situation is different: in 50 percent of cases the QCA has better performance than ABC and PSO; the best results are highlighted in Table 4. ABC and PSO have the best performance in 41.66 and 8.33 percent of cases, respectively. There is, however, a doubt about the ABC results. As can be observed in all results, increasing the problem dimension decreases an algorithm's performance, so the result obtained for a function with higher dimension should normally be equal to or larger than that for the same function with lower dimension. Considering Griewank in Table 4, it can be observed that ABC behaves inversely in this case: the result of applying the algorithm to the 30-dimensional function is smaller than for the 10-dimensional one, which appears to be a mistake. If this paradox is treated as a mistake, the performance of the QCA, PSO and ABC changes to 58.33, 16.66 and 25 percent.


Figure 5. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 2 dimensions: (a) stability (cost versus run number); (b) convergence (cost versus generation)


Figure 6. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 10 dimensions: (a) stability (cost versus run number); (b) convergence (cost versus generation)


Figure 7. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 30 dimensions: (a) stability (cost versus run number); (b) convergence (cost versus generation)


Figure 8. The result of applying the ICA, QCA and CQCA on Griewank Inverse with 50 dimensions: (a) stability (cost versus run number); (b) convergence (cost versus generation)


Table 3. The results of applying the benchmarks to the QCA, CQCA and ICA with 2, 10, 30 and 50 dimensions.
Each row: Dim | Optimum | QCA (Best, Mean, SD) | CQCA (Best, Mean, SD) | ICA (Best, Mean, SD) | Imp. (improvement of QCA over ICA)

Sphere, range [-600, 600]:
2   0         | 1.6384E-26  7.4682E-20  2.7799E-19 | 1.1889E-26  2.6167E-19  2.0530E-18 | 2.0568E-20  1.371E-10   1.1761E-9  | ≈100%
10  0         | 4.6801E-15  1.8719E-11  3.9881E-11 | 1.7152E-14  3.1369E-11  6.4424E-11 | 2.5493E-12  3.0484E-8   6.445E-8   | 99.94%
30  0         | 2.559E-9    7.1833E-7   2.2583E-6  | 3.7622E-9   5.2950E-7   1.2766E-6  | 1.0972E-6   3.2491E-5   3.6956E-5  | 98.37%
50  0         | 7.3234E-7   3.9662E-5   1.0098E-4  | 6.6159E-7   2.8669E-5   5.3403E-5  | 2.6172E-4   0.0031      0.003      | 99.07%

Sphere Inv., range [-600, 600]:
2   -7.2E+5   | -7.2E+5     -7.2E+5     0.2526     | -7.2E+5     -7.2E+5     0.2536     | -7.2E+5     -7.1998E+5  14.8687    | 0.003%
10  -3.6E+6   | -3.5995E+6  -3.5983E+6  783.7695   | -3.5994E+6  -3.5981E+6  847.4918   | -3.5821E+6  -3.5689E+6  6.2142E+3  | 0.82%
30  -1.08E+7  | -1.0761E+7  -1.0734E+7  1.4222E+4  | -1.0759E+7  -1.0731E+7  1.4091E+4  | -1.0506E+7  -1.0358E+7  5.4485E+4  | 3.63%
50  -1.8E+7   | -1.7866E+7  -1.7755E+7  4.0419E+4  | -1.7859E+7  -1.7756E+7  3.8404E+4  | -1.6950E+7  -1.6706E+7  9.4520E+4  | 6.29%

Rastrigin, range [-15, 15]:
2   0         | 0           0           0          | 0           0           0          | 0           1.1358E-13  6.652E-13  | 100%
10  0         | 0           1.3269E-14  4.0851E-14 | 0           1.6129E-14  3.8613E-14 | 4.464E-12   5.5944E-9   1.5154E-8  | 99.99%
30  0         | 3.6981E-10  1.4274E-8   2.4467E-8  | 2.6805E-10  1.5004E-7   1.3607E-6  | 1.6195E-4   0.3899      0.5083     | ≈100%
50  0         | 7.5566E-7   0.0599      0.2362     | 1.054E-6    0.0203      0.1393     | 1.0452      5.3211      1.7154     | 99.62%

Rastrigin Inv., range [-600, 600]:
2   -7.2E+5   | -7.2E+5     -7.2E+5     0.3104     | -7.2E+5     -7.2E+5     0.6883     | -7.2E+5     -7.1999E+5  13.0936    | 0.002%
10  -3.6E+6   | -3.5995E+5  -35983E+6   906.4164   | -3.5956E+6  -3.5983E+6  833.7654   | -3.5861E+6  -3.5692E+6  6.4272E+3  | 0.82%
30  -1.08E+7  | -1.0767E+7  -1.0732E+7  1.3192E+4  | -1.0756E+7  -1.0732E+7  1.3110E+4  | -1.0486E+7  -1.0348E+7  4.7617E+4  | 3.71%
50  -1.8E+7   | -1.7830E+7  -1.7757E+7  3.4901E+4  | -1.7844E+7  -1.7753E+7  4.0589E+4  | -1.7019E+7  -1.6707E+7  1.0234E+5  | 6.28%

Griewank, range [-600, 600]:
2   0         | 0           0           0          | 0           0           0          | 0           3.7356E-13  2.7586E-12 | 100%
10  0         | 8.9106E-13  9.3103E-9   2.2949E-8  | 1.3518E-11  1.2774E-8   6.6488E-8  | 1.4433E-9   6.8886E-6   1.9415E-5  | 99.86%
30  0         | 4.8241E-4   0.0144      0.0220     | 4.3573E-4   0.0155      0.0224     | 0.0040      0.0721      0.0522     | 80.03%
50  0         | 0.0747      0.3832      37.9352    | 0.1251      0.35        34.6526    | 0.1402      0.4227      41.8484    | 17.2%

Griewank Inv., range [-600, 600]:
2   -180.0121 | -179.0827   -178.9388   0.0877     | -179.0931   -178.9193   0.1022     | -179.0774   -178.8674   0.0832     | 0.04%
10  -901      | -898.5777   -897.6309   0.5023     | -898.515    -897.6836   0.5302     | -893.9284   -890.7051   1.6002     | 0.79%
30  -2.701E+3 | -2.6883E-3  -2.6812E+3  3.6252     | -2.6887E+3  -2.6817E-3  3.3025     | -2.6205E+3  -2.5886E+3  10.4053    | 3.6%
50  -4501     | -4.4599E+3  -4.439E+3   9.2071     | -4.4597E+3  -4.4401E+3  9.6358     | -4.2549E+3  -4.1779E+3  28.4462    | 6.28%

Ackley, range [-32.768, 32.768]:
2   0         | 8.8818E-16  8.4754E-13  2.0295E-12 | 4.4409E-15  7.6213E-13  1.7966E-12 | 3.4195E-13  5.2040E-8   4.5546E-7  | 99.99%
10  0         | 1.2632E-9   2.5552E-8   4.8522E-8  | 2.3995E-10  2.4881E-8   4.2524E-8  | 1.4476E-7   2.4681E-6   5.4074E-6  | 98.99%
30  0         | 1.0459E-6   4.3273E-6   2.5904E-6  | 1.0613E-6   4.2235E-6   2.0663E-6  | 3.9508E-5   1.7145E-4   1.0189E-4  | 97.54%
50  0         | 3.7308E-5   9.9126E-5   4.1411E-5  | 3.3848E-5   9.0414E-5   3.2841E-5  | 4.5648E-4   0.0014      7.0191E-4  | 93.54%

Schwefel, range [-500, 500]:
2, 10, 30 and 50 dimensions: all entries are 0 for the QCA, CQCA and ICA; improvement 0.


Table 4. The results of GA, PSO, PS-EA, ABC, ICA, QCA and CQCA.
Each row: D | GA [14] (Mean, SD) | PSO [14] (Mean, SD) | PS-EA [14] (Mean, SD) | ABC [13] (Mean, SD) | ICA (Mean, SD) | QCA (Mean, SD) | CQCA (Mean, SD)

Griewank:
10 | 0.05023  0.02952 | 0.07939   0.033451  | 0.222366  0.0781    | 0.00087   0.002535 | 6.889E-6  1.941E-5 | 9.31E-9   2.295E-8 | 1.277E-8  6.649E-8
20 | 1.0139   0.02697 | 0.03056   0.025419  | 0.59036   0.2030    | 2.01E-08  6.76E-08 | 0.0052    0.0079   | 1.206E-4  1.989E-4 | 1.753E-4  4.092E-4
30 | 1.2342   0.11045 | 0.01115   0.014209  | 0.8211    0.1394    | 2.87E-09  8.45E-10 | 0.0721    0.0522   | 0.0144    0.0220   | 0.0155    0.0224

Rastrigin:
10 | 1.3928   0.76319 | 2.6559    1.3896    | 0.43404   0.2551    | 0         0        | 5.594E-9  1.515E-8 | 1.33E-14  4.09E-14 | 1.61E-14  3.86E-14
20 | 6.0309   1.4537  | 12.059    3.3216    | 1.8135    0.2551    | 1.45E-08  5.06E-08 | 2.154E-4  0.0016   | 3.31E-11  6.26E-11 | 6.11E-11  1.61E-10
30 | 10.4388  2.6386  | 32.476    6.9521    | 3.0527    0.9985    | 0.033874  0.181557 | 0.3899    0.5083   | 1.427E-8  2.447E-8 | 1.5E-7    1.361E-6

Ackley:
10 | 0.59267  0.22482 | 9.85E-13  9.62E-13  | 0.19209   0.1951    | 7.8E-11   1.16E-09 | 2.468E-6  5.407E-6 | 2.555E-8  4.852E-8 | 2.488E-8  4.252E-8
20 | 0.92413  0.22599 | 1.178E-6  1.5842E-6 | 0.32321   0.097353  | 1.6E-11   1.9E-11  | 3.033E-5  1.916E-5 | 4.719E-7  3.782E-7 | 4.311E-7  3.544E-7
30 | 1.0989   0.24956 | 1.492E-6  1.8612E-6 | 0.3771    0.098762  | 3E-12     5E-12    | 1.715E-4  1.019E-4 | 4.327E-6  2.59E-6  | 4.224E-6  2.066E-6

Schwefel:
10 | 1.9519   1.3044  | 161.87    144.16    | 0.32037   1.6185    | 1.27E-09  4E-12    | 0         0        | 0         0        | 0         0
20 | 7.285    2.9971  | 543.07    360.22    | 1.4984    0.84612   | 19.83971  45.12342 | 0         0        | 0         0        | 0         0
30 | 13.5346  4.9534  | 990.77    581.14    | 3.272     1.6185    | 146.8568  82.3144  | 0         0        | 0         0        | 0         0


6. Conclusions

In this paper, two improved imperialist algorithms are introduced which are called re-

spectively the Quad Countries Algorithm (QCA) and the Chaotic Quad Countries Algorithm

(CQCA). In the QCA and CQCA, we define four categories of countries including Imperial-

ist, Colony, Independent, and Seeking Independent country so that each group of countries

has special motion and moves differently compared to the others. The difference between

QCA and CQCA is related to the Independent countries’ movement. In the QCA Independ-

ent countries move completely randomly, but in the CQCA they move with chaotic maps. In

the primary ICA there are only two categories, Colony and Imperialist, and the only motion

is the Colonies’ movement toward Imperialists which is applied through Absorption policy.

Whereas by adding Independent countries in the QCA, a new policy which is called Repul-

sion policy is also added. The empirical results were found by applying the proposed algo-

rithm to some famous benchmarks, indicating that the quality of global optima solutions and

the convergence speeds towards the optima have remarkably increased in the proposed algo-

rithms, in comparison to the primary ICA. In experiments it can be clearly seen that, when

the ICA sticks into a local optimum trap the QCA and CQCA find global optima. In cases

when the ICA found a solution near to the global optima, the QCA and CQCA discovered

an equal or better solution than the ICA’s solution. Through the increase of the problem

dimensions, the performance of the QCA and CQCA increase considerably when compared

to the ICA. In comparison with the QCA, CQCA, GA, PSO, PS-EA and ABC, it was ob-

served that in 100 percent of cases the proposed algorithms has better performance than GA

and PS-EA, but in comparison with ABC and PSO, in 50 percent of cases the QCA has bet-

ter performance than ABC and PSO. ABC and PSO have better performance about 41.66

and 8.33percent of cases. Overall, the performed experiments showed that the QCA and

CQCA have considerably better performance in comparison with the primary ICA and also

the other evolutionary algorithms such as GA, PSO, PS-EA and ABC.

The Quad Countries Algorithm (QCA) already performs well on optimization problems, but by changing the countries' movements and defining new movement policies its performance can be increased further. In fact, by defining new movement policies both the exploration ability and the overall performance of the algorithm increase.

References

[1] Sarimveis H., Nikolakopoulos A.: A Line Up Evolutionary Algorithm for Solving Nonlinear Constrained Optimization Problems. Computers & Operations Research, 32(6), pp. 1499-1514 (2005)
[2] Mühlenbein H., Schomisch M., Born J.: The Parallel Genetic Algorithm as Function Optimizer. Proceedings of the Fourth International Conference on Genetic Algorithms, University of California, San Diego, pp. 270-278 (1991)
[3] Holland J. H.: ECHO: Explorations of Evolution in a Miniature World. In: Farmer J. D., Doyne J. (eds.), Proceedings of the Second Conference on Artificial Life (1990)
[4] Melanie M.: An Introduction to Genetic Algorithms. MIT Press, Massachusetts (1999)
[5] Kennedy J., Eberhart R.C.: Particle Swarm Optimization. In: Proceedings of IEEE, pp. 1942-1948 (1995)
[6] Atashpaz-Gargari E., Lucas C.: Imperialist Competitive Algorithm: An Algorithm for Optimization Inspired by Imperialistic Competition. IEEE Congress on Evolutionary Computation (CEC 2007), pp. 4661-4667 (2007)
[7] Atashpaz-Gargari E., Hashemzadeh F., Rajabioun R., Lucas C.: Colonial Competitive Algorithm: A Novel Approach for PID Controller Design in MIMO Distillation Column Process. International Journal of Intelligent Computing and Cybernetics (IJICC), Vol. 1, No. 3, pp. 337-355 (2008)
[8] Zhang Y., Wang Y., Peng C.: Improved Imperialist Competitive Algorithm for Constrained Optimization. International Forum on Computer Science-Technology and Applications (2009)
[9] Bahrami H., Faez K., Abdechiri M.: Imperialist Competitive Algorithm Using Chaos Theory for Optimization (CICA). Proceedings of the 12th International Conference on Computer Modelling and Simulation (2010)
[10] Bahrami H., Faez K., Abdechiri M.: Adaptive Imperialist Competitive Algorithm (AICA). Proceedings of the 9th IEEE International Conference on Cognitive Informatics (ICCI'10) (2010)
[11] Karaboga D., Basturk B.: A Powerful and Efficient Algorithm for Numerical Function Optimization: Artificial Bee Colony (ABC) Algorithm. Journal of Global Optimization, Vol. 39, Issue 3, pp. 459-471 (2007)
[12] Srinivasan D., Seow T.H.: Evolutionary Computation. CEC '03, 8-12 Dec. 2003, Vol. 4, Canberra, Australia, pp. 2292-2297 (2003)
[13] Schuster H.G.: Deterministic Chaos: An Introduction. 2nd revised ed., Physik-Verlag GmbH, Weinheim, Federal Republic of Germany (1988)
[14] Zheng W.M.: Kneading Plane of the Circle Map. Chaos, Solitons & Fractals, 4:1221 (1994)
[15] Soltani-Sarvestani M.A., Lotfi S., Ramezani F.: Quad Countries Algorithm (QCA). In: Proc. of the 4th Asian Conference on Intelligent Information and Database Systems (ACIIDS 2012), Part III, LNAI, pp. 119-129 (2012)


Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 21-27

ISSN 2299-2634 http://www.jtacs.org

Effectiveness of mini-models method when data

modelling within a 2D-space in an information deficiency

situation

Marcin Pietrzykowski

Faculty of Computer Science and Information Technology, West Pomeranian University of Technology,

Szczecin, Poland

[email protected]

Abstract: This paper examines mini-models method and its effectiveness when data modelling in an

information deficiency situation. It also compares the effectiveness of mini-models with var-

ious methods of modelling such as neural networks, the KNN-method and polynomials. The

algorithm concentrates only on local query data and does not construct a global model dur-

ing the learning process when it is not necessary. It is characterized by a high efficacy and

a short calculation time. The article briefly describes the method by means of four variants:

linear heuristic, nonlinear heuristic, mini-models based on linear regression, and mini-

models based on polynomial approximation. The paper presents the results of experiments

that compare the effectiveness of mini-models with selected methods of modelling in an in-

formation deficiency situation.

Keywords: mini-models, modelling, parameter of minimum number of samples, leave one out error,

information gap

1. Introduction

The concept of mini-models method was developed by Piegat [1], [2]. In contrast to

most well-known methods of modelling such as neural networks, neuro-fuzzy networks and

polynomial approximation, the method does not create a global model when it is not neces-

sary [3]. Mini-models method, similarly to the method of k-nearest neighbours, operates

only on data from the local neighbourhood of a query [4], [5]. This is a consequence of the

fact that in the modelling process we are generally only interested in an answer to a specific

query, such as: "What does the compressive strength of 28-day concrete amount to when the quantity of cement amounts to 163 kg/m³, water to 180 kg/m³, coarse aggregate to 843 kg/m³ and fine aggregate to 746 kg/m³?" The answer to this question requires only the data close to "cement about 163 kg/m³, water about 180 kg/m³", etc. This approach frees us from

the time consuming process of creating a global model. Moreover, when a new sample is

acquired the global model becomes outdated and re-learning is required. Mini-model meth-

ods calculate the answer to the query point “ad-hocly”, which allows them to work in situa-

tions where new data points are continuously being received. It is also possible to build a

global model in order to learn the value of a modelled variable across an entire domain. This

can be done very simply by adding together mini-models for subsequent query points.

The main aim of this paper is to compare the effectiveness of mini-models with selected

methods of modelling in information deficiency situations. The article only briefly describes


the methods. Results of experiments on datasets that don't contain information gaps and the

details of mini-models method have been described more comprehensively in previous

works by the author [6], [7].

Mini-models in 2D-space form the basis for mini-models operating in spaces with a

greater number of dimensions. In 2D-space mini-models can take the form of a line segment

for linear models, or either a polynomial curve or an arc of a circle for nonlinear mini-

models. In 3D-space mini-models take the form of a polygon. In 4D-space mini-models take

the form of a polyhedron [1]. However, regardless of the dimensionality of the space in

which mini-models operate it is necessary to define the query point and the mini-model's

local neighbourhood. The query point is a set consisting of some independent variables with

known values and a dependent variable with an unknown value. For example, for the simple

query in a 2D-space: "What does the compressive strength of 28-day concrete amount to when the quantity of cement amounts to 163 kg/m³?", the dependent variable y is the compressive strength and the independent variable x is the quantity of cement. Query points will therefore take the following form: x* = 163, y* = ?; or simply x* = 163.

For proper operation of a mini-models method the query point and its local neighbour-

hood must first be defined. The local neighbourhood may take various forms that depend

on: the shape of the modelled data, the type of mini-model it is, and the location of the que-

ry point. The main parameter in defining the local neighbourhood is the minimum number of samples, n_min. This parameter is closely related to the mini-model's limit points, which in 2D-space are the graphical end points of a line or curve segment. It is assumed that the parameter n_min satisfies

n_min ≥ d + 1,    (1)

where d is the number of dimensions of the modelled space; in 2D-space, n_min ≥ 3. The parameter n_min can be defined globally for the entire domain or locally, for either a selected range or a selected group of learning points. Unfortunately, there is no simple rule for choosing the optimal value of n_min for a particular problem; finding it requires an extensive search through all possible values. In this way the solution may locally adapt to the modelled data. A test method based on leave-one-out cross-validation is used in the process of testing the effectiveness of mini-models for selected values of the parameter n_min.
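A hedged sketch of how such a search could look is given below; the mini-model builder `build_mini_model` is a hypothetical placeholder (any of the variants from Section 2 could be plugged in), not code from the paper:

```python
import numpy as np

def loo_error(xs, ys, n_min, build_mini_model):
    """Leave-one-out error of a mini-model method for one value of n_min."""
    errors = []
    for i in range(len(xs)):
        train_x = np.delete(xs, i)
        train_y = np.delete(ys, i)
        prediction = build_mini_model(train_x, train_y, query=xs[i], n_min=n_min)
        errors.append(abs(prediction - ys[i]))
    return np.mean(errors)

def best_n_min(xs, ys, build_mini_model, d=2):
    """Search all admissible values n_min >= d + 1 (eq. (1)) and keep the best one."""
    candidates = range(d + 1, len(xs))
    return min(candidates, key=lambda n: loo_error(xs, ys, n, build_mini_model))
```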

2. Details of the method

A mini-models method works on a training dataset which consists of points P_i and is sorted in ascending order with respect to the variable x:

P_i = (x_i, y_i),    (2)

L = {P_1, P_2, P_3, ...},  where x_1 ≤ x_2 ≤ x_3 ≤ ...,  card(L) ≥ d + 1.    (3)

The local neighbourhood of the query point x* is defined by boundaries, or limit points: the lower limit x_L and the upper limit x_U:

x_L, x_U ∈ R,    (4)

x* ∈ ⟨x_L; x_U⟩.    (5)

We call S the set of points on which the mini-model operates:

S = {P_i ∈ L : x_i ∈ ⟨x_L; x_U⟩},    (6)

card(S) ≥ n_min.    (7)

There are two basic variants of the method: linear and nonlinear. As the name suggests line-

ar mini-models form the shape of a line segment in response to query point data, and non-

linear mini-models take the shape of a curve segment after the learning process has

completed.

2.1. Linear mini-models

The simplest linear mini-models are based on linear regression. The learning algorithm for these is as follows:

1. choose a set of points S_j that satisfies properties (4), (5), (6) and (7),
2. calculate the function f_j of the local mini-model using linear regression on the set S_j,
3. calculate the error E_j committed by the model f_j, using the following formula:

E_j = (1 / card(S_j)) \sum_{i=1}^{card(S_j)} | y_i - f_j(x_i) |,    (8)

4. repeat steps 1-3 until all combinations have been checked,
5. select the model f_j that produced the minimal value of the error E.

In order to obtain a valid solution, an extensive search through all possible combinations is required to define the local neighbourhood while satisfying properties (4), (5), (6) and (7).

Note that the error E (8) is also the estimated value of the error that can be committed by the model when calculating the answer to a query point. For example, for the error E = 0.09 and the answer y* = 0.43 it is assumed that y* = 0.43 ± 0.09. This estimation of the error value also applies to the other versions of mini-models presented later in this article.
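A minimal sketch of this exhaustive search for the linear-regression variant follows. It assumes the data are sorted by x and that error (8) is the mean absolute deviation over the local set; all names are illustrative, not the author's implementation:

```python
import numpy as np

def linear_regression_mini_model(xs, ys, x_query, n_min=3):
    """Fit a local line on every admissible neighbourhood containing x_query
    and return the prediction of the neighbourhood with the smallest error (8)."""
    order = np.argsort(xs)
    xs, ys = xs[order], ys[order]
    best = (np.inf, None)
    n = len(xs)
    for lo in range(n):
        for hi in range(lo + n_min - 1, n):              # property (7): at least n_min points
            x_loc, y_loc = xs[lo:hi + 1], ys[lo:hi + 1]
            if not (x_loc[0] <= x_query <= x_loc[-1]):   # properties (4)-(5)
                continue
            a, b = np.polyfit(x_loc, y_loc, 1)            # local linear regression
            err = np.mean(np.abs(y_loc - (a * x_loc + b)))  # error (8)
            if err < best[0]:
                best = (err, a * x_query + b)
    return best[1], best[0]                               # prediction and its estimated error
```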

The second type of linear mini-model is trained heuristically. Unlike linear regression

mini-models, there is no problem in defining the local neighbourhood of a query point. The

neighbourhood is instead created “ad-hoc” during the training of the mini-model. Heuristic

learning is done by cyclic movement of the limits x_L and x_U along the x- and y-axes. When a change in the location of one limit point does not improve the results, we change the location of the second point; we then repeat the whole operation again with the first limit point, and so on. This whole operation is repeated until the stop condition has been reached. Searching along the y-axis is done by "moving" the limit point by a value of Δ in the desired direction. Searching along the x-axis is done in a similar way, but the limit point must take the value x_i of the nearest point P_i in the desired direction; the variable does not have to take any intermediate values, since these would not affect the number of points included in the mini-model and thus the error committed by it. We should remember that after each operation along the x-axis the limit points have to satisfy the properties (4), (5), (6) and (7) above. After each shift of the limit points we calculate the equation of the mini-model from the equation of a straight line passing through two points on a plane:

y = ((y_U - y_L) / (x_U - x_L)) (x - x_L) + y_L,    (9)

and we then calculate the error E committed by the mini-model, as in (8). The mini-model with the smallest error value becomes the starting point for the next cycle of operations. In the end, we select the best model, with the smallest value of the error E.
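Each candidate pair of limit points defines a line by equation (9); a tiny sketch of evaluating one such candidate is shown below (illustrative only, assuming x_l < x_u):

```python
import numpy as np

def limit_point_line_error(x_l, y_l, x_u, y_u, x_loc, y_loc):
    """Eq. (9): line through the two limit points, plus its error (8) on the local set."""
    slope = (y_u - y_l) / (x_u - x_l)        # assumes distinct x coordinates
    y_hat = slope * (x_loc - x_l) + y_l
    return np.mean(np.abs(y_loc - y_hat))    # error (8) of this candidate mini-model
```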

2.2. Nonlinear mini-models

The first variant of nonlinear mini-models is based on polynomial approximation. This

type of mini-model works in a similar way to the linear equivalent. The only difference is

that a polynomial approximation of the second order is used instead of linear regression.

There is no need to use polynomial approximation of the higher order, and this would only

increase the complexity of the algorithm. Mini-models are able to model the complex shape

of a function of a few mini-models so long as they have relatively simple shape.

The second variant of nonlinear mini-models is the heuristic mini-models. The initial

stage of these mini-models' learning process is the same as the learning process of heuristic

linear mini-models. After finding the best solution, the model takes the form of a circular

arc when represented graphically. This can curve either “up” or “down” depending on the

type of the modelled data. The results of numerical experiments have shown that those mini-

models which were curved in the process of determining the locations of the limit points

achieve worse results than the mini-models presented above. Training of mini-models with a

higher number of degrees of freedom is more difficult and such models often reach local

minima.

3. Experiments and results

In order to test the effectiveness of mini-models in an information deficiency situation

and to compare them with other commonly used methods of modelling, experiments were

performed on the following specially prepared data sets:

• a dataset containing an “information hole” with a width of 10% of the interval,

• a dataset containing an “information hole” with a width of 20% of the interval,

• a dataset with 30% random sample removal.

These experiments were performed with optimal values for all parameters, for all tested

methods. Two types of tests were made. Firstly, the algorithms were tested using a test

method based on leave-one-out cross validation using datasets with information loss. Sec-

ondly, the tested methods were trained with datasets with information loss and their effec-

tiveness was checked against data sets consisting of “lost data”. For example, methods were

tested with data from “information holes” that were not involved in the learning process.

The experiments were conducted on 11 different data sets: Compressive Strength of 28-day

Concrete, Concrete Slump Test, Unemployment Rate in Poland, Housing Value Concerns in

the Suburbs of Boston, Computer Hardware Performance, Concentration of NO2 Measured

at Alnabru in Oslo, Sold Production of Industry with Inflation in Poland, Sleep in Mammals,

Air Pollution to Mortality, Fuel Consumption in Miles per Gallon, and Determinants of

Wages. It should be noted that although the learning datasets are multi-dimensional, the mini-models operate within a 2D-space. It was possible to perform 37 different numerical

experiments for each type of modification of the learning datasets. A summary comparison

of these tested methods is shown in Table 1.


Table 1. The total number of experiments in which the tested methods achieved the best results (i.e.

achieved a better result than the other tested methods)

method                                      | 10% gap                  | 20% gap                  | 30% random loss
                                            | LOO1         TT2         | LOO          TT          | LOO          TT
mini-models with a global parameter of the minimal number of points
  heuristic linear mini-model               | 0 (0,0%)    4 (10,5%)    | 0 (0,0%)    2 (5,3%)     | 0 (0,0%)    3 (7,9%)
  mini-model based on linear regression     | 0 (0,0%)    3 (7,9%)     | 0 (0,0%)    3 (7,9%)     | 0 (0,0%)    6 (15,8%)
  heuristic nonlinear mini-model            | 0 (0,0%)    2 (5,3%)     | 0 (0,0%)    6 (15,8%)    | 0 (0,0%)    3 (7,9%)
  mini-model based on polynomial approx.    | 1 (2,6%)    1 (2,6%)     | 0 (0,0%)    1 (2,6%)     | 0 (0,0%)    2 (5,3%)
mini-models with a local parameter of the minimal number of points
  heuristic linear mini-model               | 18 (47,4%)  3 (7,9%)     | 18 (47,4%)  2 (5,3%)     | 18 (46,1%)  3 (7,9%)
  mini-model based on linear regression     | 0 (0,0%)    3 (7,9%)     | 1 (2,6%)    3 (7,9%)     | 1 (2,6%)    5 (13,2%)
  heuristic nonlinear mini-model            | 12 (31,6%)  5 (13,2%)    | 12 (31,6%)  1 (2,6%)     | 11 (28,2%)  3 (7,9%)
  mini-model based on polynomial approx.    | 6 (15,8%)   4 (10,5%)    | 6 (15,8%)   1 (2,6%)     | 8 (20,5%)   1 (2,6%)
other methods
  k-nearest neighbours                      | 1 (2,6%)    3 (7,9%)     | 1 (2,6%)    8 (21,0%)    | 0 (0,0%)    2 (5,2%)
  polynomial approximation of degree n      | 0 (0,0%)    4 (10,5%)    | 0 (0,0%)    5 (13,2%)    | 0 (0,0%)    3 (7,9%)
  feed forward neural network               | 0 (0,0%)    5 (13,2%)    | 0 (0,0%)    6 (15,8%)    | 1 (2,6%)    4 (10,5%)
  General Regression Neural Network [8]     | 0 (0,0%)    1 (2,6%)     | 0 (0,0%)    0 (0,0%)     | 0 (0,0%)    3 (7,9%)
summary comparison
  mini-models with a global parameter       | 1 (2,6%)    10 (26,3%)   | 0 (0,0%)    12 (31,6%)   | 0 (0,0%)    14 (36,8%)
  mini-models with a local parameter        | 36 (94,7%)  15 (39,5%)   | 37 (97,4%)  7 (18,4%)    | 38 (97,4%)  12 (31,6%)
  other methods                             | 1 (2,6%)    13 (34,5%)   | 1 (2,6%)    19 (50,0%)   | 1 (2,6%)    12 (31,6%)
  all mini-models                           | 37 (97,4%)  25 (65,5%)   | 37 (97,4%)  19 (50,0%)   | 38 (97,4%)  26 (68,4%)

Each cell gives the count of experiments won and, in parentheses, the percentage of experiments.

1 Test method based on leave-one-out cross validation
2 Testing using the "lost data", e.g. from an information "hole"


4. Discussion of results

It should be noted that the information gap does not significantly affect the results of ex-

periments using test methods based on leave-one-out cross validation. Mini-models method

was the most effective and the effectiveness of different types of mini-models only varied

slightly. Mini-models were less efficient in the tests using the "lost data" than in the leave-one-out tests, but their advantage is still significant. In the tests with datasets containing a

“hole” with a width of 10%, mini-models were the most efficient and achieved best results

in 65% of the tests. For datasets containing an information “hole” with a width of 20%,

mini-models achieved best results in 50% of the tests. The KNN method also achieved good

results with these datasets. The KNN method is considered as the main competitor to the

mini-models method. It should be remembered that the KNN method in an “information

hole” situation is effective only for datasets where samples are not evenly distributed (Fig-

ure 1c). The method does not work very well with datasets that have a clearly visible trend

line; the "steps behaviour" of the method is clearly visible in Figure 2c. For

the datasets with 30% of the random loss of samples, the mini-models achieved the best

results in 68% of tests. It should be noted that other methods (except KNN mentioned

above) gained no more than several percent across all tests. Mini-models with a global value of the parameter of the minimal number of points performed as well across the entire range as mini-models with a local value of this parameter.

Figure 1. a) Original data of compressive strength of 28-day concrete depending on fine aggregate. b) Global model built with heuristic linear mini-models with a global value of the parameter (best mini-model MAE=0,1599) for a dataset with a "hole" in the interval [0.5; 0.7]. c) Global model built with the k-nearest neighbours method (best result MAE=0,1598) for a dataset with a "hole" in the interval [0.5; 0.7]

Figure 2. a) Original data of unemployment rate in Poland depending on the money supply. b) Global model built with heuristic linear mini-models with a global value of the parameter (best result MAE=0,0374) for a dataset with a "hole" in the interval [0.3; 0.5]. c) Global model built with the k-nearest neighbours method (worst result MAE=0,1611) for a dataset with a "hole" in the interval [0.3; 0.5]


5. Conclusions

The results of the experiments have shown the advantages of mini-models over other methods of modelling in information deficiency situations. Their advantage is not as great as in the situation of testing with the leave-one-out cross validation method on the original data, but it still remains significant. The irregularity of the global models created by the mini-models method, combined with their high efficiency, raises the question of the validity of the theory of regularization. In future research, the authors should move towards the use of mini-models in spaces with a higher number of dimensions and examine their relevance to the theory of regularization.

References

[1] Piegat A., Wąsikowska B., Korzeń M.: Differences between the method of mini-models

and of the k-nearest neighbors on example of modeling of unemployment rate in Poland

in Information Systems in Management IX. Business Intelligence and Knowledge

Management, Warsaw, 2011, pp. 34-43.

[2] Piegat A., Wąsikowska B., Korzeń M.: Zastosowanie samouczącego się

trzypunktowego minimodelu do modelowania stopy bezrobocia w Polsce, Studia

Informatica, no. 27, pp. 45-58, 2011.

[3] Rutkowski L.: Metody i techniki sztucznej inteligencji. Warszawa: PWN, 2009.

[4] Fix E., Hodges J. L.: Discriminatory analysis, nonparametric discrimination:

Consistency properties, Randolph Field, Texas, 1951.

[5] Kordos M., Blachnik M., Strzempa D.: Do We Need Whatever More than k-NN?, in

Proceedings of the 10th International Conference on Artificial Intelligence and Soft

Computing, Zakopane, 2010.

[6] Pietrzykowski M.: Comparison of effectiveness of linear mini-models with some

methods of modelling, in Młodzi naukowcy dla Polskiej Nauki, Kraków, 2011.

[7] Pietrzykowski M.: The use of linear and nonlinear mini-models in process of data

modelling in a 2D-space, in Nowe trendy w naukach inżynieryjnych., 2011.

[8] Specht D. F.: A General Regression Neural Network, IEEE Transactions on Neural

Networks, pp. 568-576, 1991.

Witten I. H., Frank E.: Data Mining. San Francisco: Morgan Kaufmann Publishers,

2005.

[10] Pluciński M.: Nonlinear ellipsoidal mini-models – application for the function approx-

imation task, paper accepted for ACS Conference, 2012

[11] Pluciński M.: Application of the information-gap theory for evaluation of nearest

neighbours method robustness to data uncertainty, paper accepted for ACS Confer-

ence, 2012


Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 28–35, ISSN 2299-2634, http://www.jtacs.org

SmartMonitor: recent progress in the development of an innovative visual surveillance system

Dariusz Frejlichowski 1, Katarzyna Gościewska 1,2, Paweł Forczmański 1, Adam Nowosielski 1, Radosław Hofman 2

1 Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, Poland
2 Smart Monitor sp. z o.o., Szczecin, Poland

{dfrejlichowski, pforczmanski, anowosielski}@wi.zut.edu.pl, {katarzyna.gosciewska, radekh}@smartmonitor.pl

Abstract: This paper describes recent improvements in developing SmartMonitor — an innovative security system based on existing traditional surveillance systems and video content analysis algorithms. The system is being developed to ensure the safety of people and assets within small areas. It is intended to work without the need for user supervision and to be widely customizable to meet an individual's requirements. In this paper, the fundamental characteristics of the system are presented including a simplified representation of its modules. Methods and algorithms that have been investigated so far alongside those that could be employed in the future are described. In order to show the effectiveness of the methods and algorithms described, some experimental results are provided together with a concise explanation.

Keywords: SmartMonitor, visual surveillance system, video content analysis

1. Introduction

Existing monitoring systems usually require supervision by a responsible person whose role

it is to observe multiple monitors and report any suspicious behaviour. The existing intelligent surveillance systems that have been built to perform additional video content analysis tend to be very specific, narrowly targeted and expensive. For example, the Bosch IVA 4.0 [1], an advanced surveillance system with VCA functionality, is designed to help operators of CCTV monitoring and is applied primarily for the monitoring of public buildings or larger areas, hence making it unaffordable for personal use. In turn, SmartMonitor is being designed for individual customers and home use, and user interaction will only be necessary during system calibration. SmartMonitor's aim is to satisfy the needs of a large number of people who want to ensure the safety of both themselves and their possessions. It will allow for the monitoring of buildings (e.g. houses, apartments, small enterprises, etc.) and their surroundings (e.g. yards, gardens, etc.), where only a small number of objects need to be tracked. Moreover, it will utilize only commonly available and inexpensive hardware such as a personal computer and digital cameras. Another intelligent monitoring system, described in [2], analyses human location, motion trajectory and velocity in an attempt to classify the type of behaviour. It requires both the participation of a qualified employee and the preparation of a large database during the learning process. These steps are unnecessary with the SmartMonitor system due to a simple calibration mechanism and feature-based methods. Moreover, a precise calibra-


tion can improve a system's effectiveness and allow the system's sensitivity to be adjusted to situations that do not require any system reaction. The customization ability offered by SmartMonitor is very advantageous. In [3], the problem of automatic monitoring systems with object classification was described. It was assumed that the background model used for foreground subtraction does not change with time. This is a crucial limitation caused by the background variability of real videos. Therefore, and due to planned system scenarios, the model that best adapts to changes in the scene will be utilized.

SmartMonitor will be able to operate in four independent modes (scenarios) that will provide home/surroundings protection against unauthorized intrusion, allow for supervision of people who are ill, detect suspicious behaviours and sudden changes in object trajectory and shape, and detect smoke or fire. Each scenario is characterized by a group of performed actions and conditions, such as movement detection, object tracking, object classification, region limitation, object size limitation, object feature change, weather conditions and work time (with artificial lighting required at night). A more detailed explanation of system scenarios and parameters is provided in [4].

The rest of the paper is organised as follows: Section 2 contains the description of the main system modules; algorithms and methods that are utilised in each module are briefly described in Section 3; Section 4 contains selected experimental results; and Section 5 concludes the paper.

2. System Modules

SmartMonitor will be composed of six main modules: background modelling, object

tracking, artefacts removal, object classification, event detection and system response. Some of these are common to the intelligent surveillance systems that were reviewed in [5]. A simplified representation of these system modules is displayed in Fig. 1.

Figure 1. Simplified representation of system modules

Background modelling detects movement through use of background subtraction methods. Foreground objects that are larger than a specified size and coherent are extracted as objects of interest (OOI). The second module, object tracking, tracks object locations across consecutive video frames. When multiple objects are tracked, each object is labelled accordingly. Every object moves along a specified path called a trajectory. Trajectories can be compared and analysed in order to detect suspicious behaviours. The third module, artefacts removal, is an important step preceding classification and should be performed correctly. In this, all


artefacts, such as shadows, reflections or false detection results, enlarge the foreground region and usually move with the actual OOI. The fourth module, object classification, will allow for simple classification using object parameters and object templates. The template base will be customizable so that new objects can be added. A more detailed classification will also be possible using more sophisticated methods. The key issue of the fifth, i.e. the event detection module, is to detect changes in object features. The system will react to both sudden changes (mainly in shape) and a lack of movement. The final module defines how the system responds to detected events. By eliminating the human factor it is important to determine which situations should set off alarms or cause information to be sent to the appropriate services.

3. Employed Methods and Algorithms

For each module we investigated the existing approaches, and modified them to apply the

best solution for the system. Below we present a brief description and explanation of this. Background modelling includes models that utilize static background images [3], background images averaged in time [6] and background images built adaptively, e.g. using Gaussian Mixture Models (GMM) [7, 8]. Since the backgrounds of real videos tend to be extremely variable in time, we decided to use a model based on GMM. This builds a per-pixel background image that is updated with every frame, and is also sensitive to sudden changes in lighting which can cause false detections, mainly by shadows. It was stated in [9] that shadows only affect the image brightness and not the hue. By comparing foreground images constructed using both the Y component of the YIQ colour scheme and the H component of the HSV colour scheme, it is possible to exclude false detections that are caused by shadows. Following this, morphological operations are applied to the resulting binary mask. Erosion allows for the elimination of small objects composed of one or a few pixels (such as noise) and the reduction of the region. Later the dilation process fills in the gaps.
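A rough sketch of this processing chain, using OpenCV's MOG2 implementation of the improved GMM [8]. Parameter values are illustrative only; the grayscale image is used as a stand-in for the Y component of YIQ, and keeping only pixels detected as foreground in both the Y- and H-based images is one possible reading of the shadow-removal step described above:

    import cv2
    import numpy as np

    # One adaptive GMM background model per colour component (illustrative parameters).
    bg_y = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16, detectShadows=False)
    bg_h = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16, detectShadows=False)
    kernel = np.ones((3, 3), np.uint8)

    def foreground_mask(frame_bgr):
        """Foreground mask with shadow suppression and morphological clean-up."""
        y = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)          # stand-in for the Y component
        h = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)[:, :, 0]  # H component of HSV
        fg_y = bg_y.apply(y)
        fg_h = bg_h.apply(h)
        # Shadows change the brightness (Y) but hardly the hue (H) [9]:
        # keep only pixels detected as foreground in both images.
        mask = cv2.bitwise_and(fg_y, fg_h)
        mask = cv2.erode(mask, kernel)    # remove single-pixel noise
        mask = cv2.dilate(mask, kernel)   # fill in the gaps
        return mask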

ground images averaged in time [6] and background images built adaptively, e.g. usingGaussian Mixture Models (GMM) [7, 8]. Since the backgrounds of real videos tend to beextremely variable in time, we decided to use a model based on GMM. This builds per-pixelbackground image that is updated with every frame, and is also sensitive to sudden changesin lighting which can cause false detections, mainly by shadows. It was stated in [9] thatshadows only affects the image brightness and not the hue. By comparing foreground imagesconstructed using both the Y component of the YIQ colour scheme and the H component ofthe HSV colour scheme, it is possible to exclude false detections that are caused by shadows.Following this, morphological operations are applied to the resulting binary mask. Erosionallows for the elimination of small objects composed of one or few pixels (such as noise) andthe reduction of the region. Later the dilation process fills in the gaps.

For the object tracking stage we investigated three possible implementations, namely theKalman filter [10], Mean Shift and Camshift [11, 12] algorithms. The Mean Shift algorithmis simple and appearance-based. It requires one or more feature, such as colour or edge datato be selected for tracking purposes. This can cause several problems with object localizationwhen particular features change. The Camshift algorithm is simply a version of the MeanShift algorithm that continuously adapts to the variable size of tracked objects. Unfortunately,the described solution is not optimal since it increases the number of computations. More-over, both methods are effective only when certain assumptions are met, such as that trackedobjects will differ from the background (e.g. through variations in colour). The Kalmanfilter algorithm was therefore selected to overcome these drawbacks. This constitutes a set ofmathematical equations that define a predictor-corrector type estimator. The main task was toestimate future values in two steps: prediction based on known values, and correction basedon new measurements. It is assumed that objects can move uniformly and in any direction butwill not change direction suddenly and unpredictably.

After tracking the objects are classified (labelled) as either human or not human. A boostedcascade of Haar-like features [13] connected using the AdaBoost algorithm [14] can be uti-lized. However, at this stage, we replaced the AdaBoost classification with a simpler one.Objects can now be classified using their binary masks and the threshold values of two oftheir properties: area size and minimum bounding rectangle aspect ratio.

A specific and detailed classification can be performed using a Histogram of OrientedGradients (HOG) [15]. A HOG descriptor localises and extracts objects from static scenes


through use of specified patterns. Despite its high computational complexity, the HOG algorithm can be applied to a system under several conditions such as those with limited regions or time intervals.
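The HOG-based matching used in the experiments of Section 4 (a template HOG vector compared against sliding windows by Euclidean distance) might be sketched as follows; the HOG parameters and helper names are assumptions, not the authors' configuration:

    import numpy as np
    from skimage.feature import hog

    def hog_vector(patch):
        """HOG feature vector of a grayscale patch (illustrative parameters)."""
        return hog(patch, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

    def scan_row(frame_gray, template_gray, row, step=4):
        """Scan one horizontal line of the frame with the template and return the
        Euclidean distances between HOG vectors (0 = identical, as in the depth maps)."""
        th, tw = template_gray.shape
        t_vec = hog_vector(template_gray)
        dists = []
        for x in range(0, frame_gray.shape[1] - tw, step):
            window = frame_gray[row:row + th, x:x + tw]
            dists.append(np.linalg.norm(hog_vector(window) - t_vec))
        return np.array(dists)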

4. Experimental Conditions and Results

In this section we present some experimental results from employing the algorithms for

object localization, extraction and tracking that have given the best results so far. In order to ensure the experiments were performed under realistic conditions, a set of test video sequences corresponding to certain system scenarios was prepared. These include scenes recorded both inside and outside the buildings, with different types of moving objects. A database also had to be created due to the lack of free, universal video databases that matched the planned scenarios.

The results of employing both the GMM algorithm and the methods for removing false objects are presented in Fig. 2. The first row contains the sample frame and background images for the Y and H components. The second row shows the respective foreground images for the Y and H components alongside the foreground object's binary mask after false objects removal. It is noticeable that the foregrounds constructed using the different colour components strongly differ and that, by subtracting one image from another, we can eliminate false detections.

Figure 2. Results of employing the GMM algorithm and false objects removal methods

Specific objects can be localised and extracted using the HOG descriptor. This detects objects using predefined patterns and extracted feature vectors. Below we present the results of the experiments utilizing the HOG descriptor. The first experiment was performed using a fixed template size and two sample frames, the second one utilized various template sizes and one sample frame.

The results of the first experiment are pictured in Fig. 3. The figure contains: a sample frame with a chosen template (left column) and two frames (middle column) from the same video sequence which were scanned horizontally in an attempt to identify the matching regions. The depth maps (right column) show the results of the HOG algorithm — the darker the colour the more similar the region is. Black regions indicate a Euclidean distance between two feature vectors of zero.


Figure 3. Results of the experiment utilizing the HOG descriptor with a fixed template size

In the next experiment, devoted to an investigation of the HOG descriptor, various template sizes were tested. The left column of Fig. 4 presents a frame with a chosen template marked by a white rectangle, the central column contains a frame that was scanned horizontally using two different template sizes (dark rectangles in the top left corners define the size of the rescaled template) and the right column provides the respective results of the HOG algorithm. Clearly, the closer the template size is to the object size, the more accurate the depth map is.

Figure 4. Results of the experiment utilizing the HOG descriptor with a variable template size

As mentioned in the previous section, we investigated three tracking methods. The first one, the Mean Shift algorithm, uses part of an image to create a fixed template model. In this case we converted images to the HSV colour scheme. Fig. 5 presents three sample frames from the tracking process (first row) and their corresponding binary masks (second row). The white masked regions indicate those regions that are similar to the template, the dark rectangle determines the template and the light points within the rectangle create the object's trajectory.

Camshift was the second tracking method investigated. This uses the HSV colour scheme and a variable template model. The first row in Fig. 6 presents sample frames from the tracking process: the starting frame with the chosen template, the central frame with an enlarged template and the finishing frame where the moving object leaves the scene. The second row in Fig. 6 shows corresponding binary masks for each frame. Both tracking methods, thanks


Figure 5. Results of the experiment utilizing the Mean Shift algorithm

to their local application, were effective despite the presence of many regions similar to the template.

Figure 6. Results of the experiment utilizing the Camshift algorithm

Fig. 7 shows a result of employing the third algorithm, the Kalman filter, to track a person walking in a garden. Light asterisks are obtained for object positions that were estimated using a moving object detection algorithm and dark circles are positions predicted by the Kalman filter.

5. Summary and Conclusions

In this paper, recently achieved results from the SmartMonitor system during the develop-

ment process were described. We provided basic information about system characteristics and properties, and system modules. Investigated methods and algorithms were briefly described. Selected experimental results on utilizing various solutions were presented.

SmartMonitor will be an innovative surveillance system based on video content analysis and targeted at individual customers. It will operate in four independent modes which are fully customizable (and will also be combinable to make custom modes). This allows for individual safety rules to be set based on different system sensitivity degrees. Moreover, SmartMonitor will utilize only commonly available hardware. It will almost eliminate human involvement,


Figure 7. Results of the experiment utilizing the Kalman filter

being only required for the calibration process. Our system will analyse a small number of moving objects over a limited region, which could additionally improve its effectiveness.

Currently, there are no similar systems on the market. Modern surveillance systems are usually expensive, specific and need to be operated by a qualified employee. SmartMonitor will eliminate these factors by offering less expensive software, making it more affordable for personal use and requiring less effort to use.

Acknowledgements

The project Innovative security system based on image analysis — SmartMonitor prototype construction (original title: Budowa prototypu innowacyjnego systemu bezpieczeństwa opartego o analizę obrazu — SmartMonitor) is a project co-funded by the European Union (project number PL: UDA-POIG.01.04.00-32-008/10-01, Value: 9.996.604 PLN, EU contribution: 5.848.800 PLN, realization period: 07.2011-04.2013). European Funds — for the development of innovative economy (Fundusze Europejskie — dla rozwoju innowacyjnej gospodarki).

References

[1] Bosch IVA 4.0 Commercial Brochure, http://resource.boschsecurity.com/documents/Commercial Brochure enUS 1558886539.pdf
[2] Robertson N., Reid I.: A general method for human activity recognition in video. Computer Vision and Image Understanding 104, 232–248 (2006)
[3] Gurwicz Y., Yehezkel R., Lachover B.: Multiclass object classification for real-time video surveillance systems. Pattern Recognition Letters 32, 805–815 (2011)
[4] Frejlichowski D., Forczmański P., Nowosielski A., Gościewska K., Hofman R.: SmartMonitor: An Approach to Simple, Intelligent and Affordable Visual Surveillance System. In: Bolc, L. et al. (eds.) ICCVG 2012. LNCS, vol. 7594, pp. 726–734. Springer, Heidelberg (2012)


[5] Forczmański P., Frejlichowski D., Nowosielski A., Hofman R.: Current trends in the development of intelligent visual monitoring systems (in Polish). Methods of Applied Computer Science 4/2011(29), 19–32 (2011)
[6] Frejlichowski D.: Automatic Localisation of Moving Vehicles in Image Sequences Using Morphological Operations. 1st IEEE International Conference on Information Technology, 439–442 (2008)
[7] Stauffer C., Grimson W. E. L.: Adaptive background mixture models for real-time tracking. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2–252 (1999)
[8] Zivkovic Z.: Improved adaptive Gaussian mixture model for background subtraction. Proceedings of the 17th International Conference on Pattern Recognition 2, 28–31 (2004)
[9] Forczmański P., Seweryn M.: Surveillance Video Stream Analysis Using Adaptive Background Model and Object Recognition. In: Bolc, L. et al. (eds.) ICCVG 2010, Part I. LNCS, vol. 6374, pp. 114–121. Springer, Heidelberg (2010)
[10] Welch G., Bishop G.: An Introduction to the Kalman Filter. UNC-Chapel Hill, TR 95-041 (24 July 2006)
[11] Cheng Y.: Mean Shift, Mode Seeking, and Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(8), 790–799 (1995)
[12] Comaniciu D., Meer P.: Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(5), 603–619 (2002)
[13] Viola P., Jones M.: Rapid Object Detection Using a Boosted Cascade of Simple Features. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 1, 511–518 (2001)
[14] Avidan S.: Ensemble Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(2), 261–271 (2007)
[15] Dalal N., Triggs B.: Histograms of oriented gradients for human detection. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 1, 886–893 (2005)


Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 36-49

ISSN 2299-2634 http://www.jtacs.org

Nonlinearity of human multi-criteria in decision-making

Andrzej Piegat, Wojciech Sałabun

Faculty of Computer Science and Information Technology, West Pomeranian University of Technology,

Szczecin, Poland

{apiegat, wsalabun}@wi.zut.edu.pl

Abstract: In most cases, known methods of multi-criteria decision-making are used in order to make a linear aggregation of human preferences. Authors of these methods seem not to take into account the fact that linear functional dependences occur rather rarely in real systems. Linear functions rather imply a global character of multi-criteria. This paper shows several examples of human nonlinear multi-criteria that are purely local. In these examples, a nonlinear approach based on fuzzy logic is used. It allows for a better understanding of how important the nonlinear aggregation of human multi-criteria is. The paper also contains a proposal of an indicator of the nonlinearity degree of the criteria. The presented results are based on investigations and experiments realized by the authors.

Keywords: Multi-criteria analysis, multi-criteria decision-analysis, non-linear multi-criteria, fuzzy multi-criteria, indicator of nonlinearity.

1. Introduction

On a daily basis and in professional life we frequently have to make decisions. Then we

use some criteria that depend on our individual preferences or, in the case of group-decisions, on the preferences of the group. Further on, criteria representing the preferences of a single person will be called individual criteria and criteria representing a group will be called group-criteria. Group-criteria can be achieved by aggregation of individual ones. Therefore, further on, the nonlinearity problem of criteria will be analyzed on examples of individual criteria, because properties of individual criteria are transferred to the group ones. Individual human multi-

criteria are “programmed” in our brains and special methods for their elicitation and math-

ematical formulation are necessary. Multi-criteria (M-Cr for short) of different persons are more or less different and therefore it would not be reasonable to assume one and the same type of mathematical formula for a certain criterion representing thousands of different people, e.g. for the individual criterion of car attractiveness. However, in the case of M-Crs the most frequently used criterion type is the linear M-Cr form (1).

K = w_1 K_1 + w_2 K_2 + \dots + w_n K_n ,   (1)

where: w_i – weight coefficients of the particular component criteria, \sum_{i=1}^{n} w_i = 1; K_i – the component criteria aggregated by the M-Cr (i = 1, \dots, n). They are mostly used, also in this paper,

in the form which is normalized to interval [0,1]. The linear criterion-function in the space

2D is represented by a straight line, in the space 3D by a plane, Fig.1, and in the space nD

by a hyper-plane.
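For illustration, the linear aggregation (1) amounts to a weighted sum of normalized component criteria (a minimal sketch, assuming the weighted-sum form of (1) without an intercept):

    import numpy as np

    def linear_multicriterion(weights, criteria):
        """Linear M-Cr of form (1): K = w1*K1 + ... + wn*Kn, with sum(wi) = 1
        and every Ki already normalized to [0, 1]."""
        weights, criteria = np.asarray(weights), np.asarray(criteria)
        assert np.isclose(weights.sum(), 1.0)
        return float(weights @ criteria)

    # Example: three component criteria with weights 0.5, 0.3, 0.2.
    print(linear_multicriterion([0.5, 0.3, 0.2], [0.8, 0.4, 0.1]))   # 0.54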


Figure 1. A linear criterion function in space 2D (Fig.1a) and in space 3D (Fig.1b)

Let us notice that in the linear criterion-function K the particular component-criteria Ki influence the superior criterion in a mutually independent and uncorrelated way. Apart from

this, the influence strength of particular component criteria Ki is of the global, constant and

unchanging character in the full criterion-domain. Both above features are great disad-

vantages of the linear M-Cr, because human M-Cr are in most cases nonlinear and signifi-

cance of component criteria Ki is not constant, is not independent from other criteria and

varies in particular, local sub-domains of the global MCr. Unfortunately, linear multi-

criteria are used in many world-known methods of the multi-criteria decision-analysis. Fol-

lowing examples illustrating the above statement can be given: the method SAW (Simple

Additive Weighting) [4,15], the well-known and widely used AHP-method of Saaty (the

Analytic Hierarchy Process) [11,15,18], the ANP-method (Analytic Network Process),

[12,13]. Other known MCr-methods such as TOPSIS [15,16], ELECTRE [2] and PROMETHEE

[1,2] are not strictly linear ones. However, they assume global weight-coefficients wi, con-

stant for the full MCr-domain and in certain steps of their algorithms they also use the line-

ar, weighted aggregation of alternatives. The next part will present the simplest examples of

nonlinear criterion-functions in 2D-space.

2. Nonlinear human criterion-functions in 2D-space

An example of a very simple human nonlinear criterion-function can be the dependence

between the coffee taste (CT), CT ∈ [0,1], and the sugar quantity S, S ∈ [0,5] expressed in

number of sugar spoons, Fig.2. Coffee taste represents inner human preference.

The criterion function of the coffee taste can be identified by interviewing a given per-

son or more exactly, experimentally, by giving the person coffees with different amount of

sugar and asking he/she to evaluate the coffee taste or to compare tastes of pairs of coffees

with different amount of sugar. The achieved taste evaluations can be processed with vari-

ous MCr-methods previously cited or with the method of characteristic objects proposed by

one of the paper authors. However, even without scientific investigations it is easy to under-

stand that the criterion-function shown in Fig.2 is qualitatively correct. This function repre-

sents preferences of the author-AP. He does not like coffee with too great amount of sugar

(more than 3 coffee-spoons) and evaluates its taste as CT≈0. The taste of coffee without

sugar (S=0) he also evaluates as a poor one. The best taste he feels when cup of coffee con-


tains 2 spoons of sugar (Sopt=2). For other persons the optimal sugar amount will be differ-

ent. Thus, this criterion-function is not an “objective” (what does it mean?) function of all

people in the world but an individual criterion-function of the AP-author of the paper. It is

very important to differentiate between individual criteria and group-criteria, which repre-

sent small or greater group of people. Similar in character as the function in Fig.2 is also

other one-component human criterion function: e.g. dependence of the text-reading easiness

from the light intensity.

Figure 2. Criterion function representing dependence of the coffee taste CT from number of sugar

spoons S (felt by an individual person, the paper author-AP)

3. Nonlinear, human, multi-criterion function in 3D-space and a method

of its identification

Already in the 1960s and 1970s the American scientists D. Kahneman and A. Tversky, Nobel Prize winners from 2002, drew the attention of the scientific community to the nonlinearity of human multi-criteria [5] through their investigation results on hu-

man decisions based on an M-Cr. In their experiment some component criteria were aggregated: the value of a possible profit, the probability of the possible profit value, the value of a possible loss, and the probability of the possible loss value. Further on, a similar but simplified problem will be presented: evaluation of the individual play acceptability-degree K in dependence on a possible winnings-value K1[$] and a possible loss-value K2[$]. Both values are not great. The interviewed person has to make decisions in the problem described

below.

Among 25 plays shown in Table 1, with different winnings K1[$] and losses K2[$] (if you

don’t win you will have to pay a sum equal to the loss K2) at first find all plays (K1,K2) which certainly are not accepted by you (K=0), and next all plays which are certainly ac-cepted by you (K=1). For rest of the plays determine a rank with the method of pair-tournament (pair comparisons). Probability of winnings and losses are the same and equal to 0.5.


Table 1 gives values of possible winnings and losses (K1,K2) in particular plays. It also

indicates for which plays the AP-author declares full acceptance (full readiness to take up the game), which means K=1, and for which plays he does not accept them at all (zero readiness to take up the game), which means K=0. The acceptability degree plays the role of the mul-

ti-criterion in the shown decision-problem.

The acceptability degree of plays marked with question mark will be determined with

the tournament-rank method. The investigated person chooses from each play-pair the more

acceptable play (inserting the value 1 in the table for this play), which means the win. If the

person is not able to decide which of two plays is better, then she/he inserts the value 0.5 for

both plays of the pair, which means the draw.

Summarized scores from Table 2 are shown in Table 3 for particular plays (K1,K2).

Table 1. Winnings K1[$] and losses K2[$] in particular 25 plays and first decisions of the interviewed

person : determining the unacceptable plays (acceptation degree K=0) and the fully acceptable plays

(K=1) which certainly would be played by the person. Plays with question marks are plays of a par-

tial (fractional) acceptation that is to be determined.

Value of losses K2 [$] \ Value of winnings K1 [$]:    0.0   2.5   5.0   7.5   10.0
 0.0    0   1   1   1   1
 2.5    0   0   ?   ?   ?
 5.0    0   0   0   ?   ?
 7.5    0   0   0   0   ?
10.0    0   0   0   0   0

Table 2. Tournament results of particular play-pairs. The value 1 means the win of a play, the value

0.5 means the draw. A single play is marked by (K1,K2).

Points   Play (K1, K2) [$]   Play (K1, K2) [$]   Points
0        (5.0, 2.5)          (7.5, 2.5)          1
0        (5.0, 2.5)          (10.0, 2.5)         1
0.5      (5.0, 2.5)          (7.5, 5.0)          0.5
0        (5.0, 2.5)          (10.0, 5.0)         1
0.5      (5.0, 2.5)          (10.0, 7.5)         0.5
0        (7.5, 2.5)          (10.0, 2.5)         1
1        (7.5, 2.5)          (7.5, 5.0)          0
0.5      (7.5, 2.5)          (10.0, 5.0)         0.5
1        (7.5, 2.5)          (10.0, 7.5)         0
1        (10.0, 2.5)         (7.5, 5.0)          0
1        (10.0, 2.5)         (10.0, 5.0)         0
1        (10.0, 2.5)         (10.0, 7.5)         0
0        (7.5, 5.0)          (10.0, 5.0)         1
0.5      (7.5, 5.0)          (10.0, 7.5)         0.5
1        (10.0, 5.0)         (10.0, 7.5)         0

Table 3. Scores of particular plays (K1,K2) and rank places assigned to particular plays with

fractional acceptation degree K (multi-criterion) of the investigated person

Play (K1,K2):    (10.0, 2.5)  (10.0, 5.0)  (7.5, 2.5)  (10.0, 7.5)  (5.0, 2.5)  (7.5, 5.0)
Score(K1,K2):    5            3.5          3.5         1            1           1
Rank(K1,K2):     I            II           II          III          III         III

Analysis of Table 3 shows that in the end we have 3 play types with differentiated val-

ues of the multi-criterion K. Apart from 6 plays with fractional acceptation given in Table 3

we also have 15 plays with the zero-acceptability K=0 and 4 plays with the full acceptability


K=1, see Table 1. Applying the indifference principle of Laplace [2], we can assume that the

full difference of acceptation value relating to plays from Table 3, Kmax - Kmin= 1 - 0 = 1

should be partitioned in 4 equal differences ∆K = ¼. The plays (5, 2.5), (7.5, 5), (10,7.5)

achieve the M-Cr value K=1/4 (the third place in the rank). The plays (7.5, 2.5) and (10, 5)

achieve K=2/4 (the second place in the rank). The play (10,2.5) achieves K=3/4 (the first

place in the rank of fractional-acceptability of plays). Resulting values of the M-Cr K de-

termined for particular plays with the tournament-rank method are given in Table 4.

Table 4. Resulting values of the multi-criterion K= f(K1,K2), which represents the acceptability de-

gree of particular plays (K1,K2) for the investigated person.

Value of losses K2 [$] \ Value of winnings K1 [$]:    0.0   2.5   5.0    7.5    10.0
 0.0    0   1   1     1     1
 2.5    0   0   1/4   2/4   3/4
 5.0    0   0   0     1/4   2/4
 7.5    0   0   0     0     1/4
10.0    0   0   0     0     0
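The whole procedure — summing the tournament points from Table 2 and mapping the resulting rank places to fractional K values with the Laplace indifference principle — can be sketched as follows. The mapping used here is a generalization chosen so that it reproduces the ΔK = 1/4 partition of this example; the paper itself only states the principle:

    from collections import defaultdict

    # Pairwise results from Table 2: (play_a, play_b, points_a, points_b).
    results = [
        ((5.0, 2.5), (7.5, 2.5), 0, 1),     ((5.0, 2.5), (10.0, 2.5), 0, 1),
        ((5.0, 2.5), (7.5, 5.0), 0.5, 0.5), ((5.0, 2.5), (10.0, 5.0), 0, 1),
        ((5.0, 2.5), (10.0, 7.5), 0.5, 0.5),((7.5, 2.5), (10.0, 2.5), 0, 1),
        ((7.5, 2.5), (7.5, 5.0), 1, 0),     ((7.5, 2.5), (10.0, 5.0), 0.5, 0.5),
        ((7.5, 2.5), (10.0, 7.5), 1, 0),    ((10.0, 2.5), (7.5, 5.0), 1, 0),
        ((10.0, 2.5), (10.0, 5.0), 1, 0),   ((10.0, 2.5), (10.0, 7.5), 1, 0),
        ((7.5, 5.0), (10.0, 5.0), 0, 1),    ((7.5, 5.0), (10.0, 7.5), 0.5, 0.5),
        ((10.0, 5.0), (10.0, 7.5), 1, 0),
    ]

    scores = defaultdict(float)
    for a, b, pa, pb in results:
        scores[a] += pa
        scores[b] += pb

    # Laplace indifference principle: partition Kmax - Kmin = 1 into equal steps
    # between K = 0 and K = 1 (here 4 steps of 1/4 for 3 distinct rank places).
    places = sorted(set(scores.values()), reverse=True)
    k_value = {s: (len(places) - i) / (len(places) + 1) for i, s in enumerate(places)}
    for play, s in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(play, s, k_value[s])   # reproduces the values 3/4, 2/4, 1/4 of Table 4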

On the basis of Table 4 a visualization of the investigated multi-criterion K of the play

acceptability-degree can be realized, Fig. 3 and 4.

Figure 3. Visualization of the 25 analyzed plays (K1,K2) as 25 characteristic objects regularly placed

in the decisional domain K1 K2 of the problem

Each of the 25 characteristic plays (decisional objects) can be interpreted as a crisp rule,

e.g.:

IF (K1 = 7.5) AND (K2 = 5) THEN (K = ¼) (2)

However, if K1 is not exactly equal to 7.5 and K2 is not exactly equal to 5.0, then rule (2) can be transformed into a fuzzy rule (3) based on the tautology Modus Ponens [8, 9].

IF (K1 close to 7.5) AND (K2 close to 5.0) THEN (K close to ¼) (3)


In this way 25 fuzzy rules of type (4) were achieved, one on the basis of each characteristic object (play) given in Table 4. The rules enable calculating values of the nonlinear multi-criterion K for any values of the component criteria K1i and K2j, i, j = 1, …, 5.

IF (K1 close to K1i) AND (K2 close to K2j) THEN (K close to Kij) (4)

The complete rule base is given in Table 4. To enable calculation of the fuzzy M-Cr-

function K it is necessary to define membership functions µK1i ( close to K1i ), µK2j (close to

K2j) and µKij (close to Kij). These functions are shown in Fig.4.

Figure 4. Membership functions µK1i (close to K1i), µK2j (close to K2j) of the component criteria and

µKij (close to Kij) of the aggregating multi-criterion K

On the basis of the rule base (Table 4) and of the membership functions from Fig.4 it is easy to visualize the function-surface K = f(K1,K2) of the individual multi-criterion of the play acceptation. As a visualization tool one can also use the fuzzy logic toolbox from MATLAB or one's own knowledge about fuzzy modeling [8, 9]. The functional surface is shown in Fig.5.
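One simple way to evaluate the rule base (4) with triangular "close to" membership functions of the kind shown in Fig. 4 is a weighted average of the rule conclusions, which effectively interpolates the values of Table 4 (a sketch only; the exact inference scheme of [8, 9] may differ):

    import numpy as np

    levels = np.array([0.0, 2.5, 5.0, 7.5, 10.0])    # K1 and K2 grid values
    K_table = np.array([                              # Table 4: rows = K2 (losses)
        [0, 1,    1,    1,    1   ],
        [0, 0,    0.25, 0.50, 0.75],
        [0, 0,    0,    0.25, 0.50],
        [0, 0,    0,    0,    0.25],
        [0, 0,    0,    0,    0   ],
    ])

    def close_to(x, center):
        """Triangular membership function 'x close to center' on the 2.5-spaced grid."""
        return max(0.0, 1.0 - abs(x - center) / 2.5)

    def multicriterion(k1, k2):
        """Evaluate the 25 fuzzy rules (4) and aggregate with a weighted average."""
        num = den = 0.0
        for j, c2 in enumerate(levels):          # K2 (losses)
            for i, c1 in enumerate(levels):      # K1 (winnings)
                w = close_to(k1, c1) * close_to(k2, c2)   # rule activation (product AND)
                num += w * K_table[j, i]
                den += w
        return num / den

    print(multicriterion(7.5, 5.0))   # exactly a characteristic object -> 0.25
    print(multicriterion(9.0, 3.0))   # interpolated value between neighbouring rules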

As Fig.5 shows, the functional surface of the human multi-criterion K=f(K1,K2) is

strongly nonlinear. This surface represents the M-Cr of one person. However, in case of

other persons surfaces of this multi-criterion are qualitatively very similar (an investigation

was realized on approximately 100 students of Faculty of Computer Science of West Pom-

eranian University of Technology in Szczecin and of Faculty of Management and Economy

of University of Szczecin). Quantitative differences of the multi-criterion K between partic-

ular investigated persons were mostly not considerable. All identified surfaces were strongly

nonlinear.

The second co-author WS of the paper used the method of characteristic objects in in-

vestigation of the attractiveness degree of color. In the experiment two attributes occur:

• the degree of brightness green (in short G),

• the degree of brightness blue (in short B).


Figure 5. Functional surface of the individual multi-criterion K=f(K1,K2) of the play acceptability

with possible winnings K1[$] and losses K2[$], probability of winnings and losses are identical and

equal to 0.5. This particular surface represents the AP-author of the paper.

The degree of red was fixed at a constant brightness level of 50%. The brightness level of each component was normalized to the range [0,1]. The first step was to define linguistic values for the G and B components, presented in Fig. 6 and 7.

Figure 6. Definitions of linguistic values for the component G

Figure 7. Definitions of linguistic values for the component B


Membership functions presented in Fig. 6 are described by formula (5):

\mu_L = \frac{0.5 - G}{0.5}, \quad \mu_{ML} = \frac{G - 0}{0.5}, \quad \mu_{MR} = \frac{1 - G}{0.5}, \quad \mu_H = \frac{G - 0.5}{0.5} ,   (5)

where: L – low, ML – medium left, MR – medium right, H – high, G – the level of brightness of the green component.

Membership functions presented in Fig. 7 are described by formula (6):

\mu_L = \frac{0.5 - B}{0.5}, \quad \mu_{ML} = \frac{B - 0}{0.5}, \quad \mu_{MR} = \frac{1 - B}{0.5}, \quad \mu_H = \frac{B - 0.5}{0.5} ,   (6)

where: L – low, ML – medium left, MR – medium right, H – high, B – the level of brightness of the blue component.

Linguistic values of attributes generate 9 characteristic objects. Their distribution in the

problem space is presented by Fig.8.

Figure 8. Characteristic objects Ri in the space of the problem

Attribute values of the characteristic Ri objects, their names and colors are given in Ta-

ble 5.

Table 5. Complex color and their rules

Rule [R, G, B] Color

R1 [0.5, 0.0, 0.0]

R2 [0.5, 0.0, 0.5]

R3 [0.5, 0.0, 1.0]

R4 [0.5, 0.5, 0.0]

R5 [0.5, 0.5, 0.5]

R6 [0.5, 0.5, 1.0]

R7 [0.5, 1.0, 0.0]

R8 [0.5, 1.0, 0.5]

R9 [0.5, 1.0, 1.0]

The interviewed person has to make decisions described below.

In the survey, please indicate, which color of the pair of colors is more attractive (please mark this color by X). If both colors have similar or identical level of attractiveness, please mark a draw. Attractiveness of color is telling you which color you prefer more from the pair of colors.


Evaluation of characteristic objects is determined with the tournament-rank method. If one color of a pair is preferred, then this color receives 1 point and the second color receives 0 points. If the interviewed person marks a draw, both colors receive 0.5 point. Next, all the points assigned to each object are added. On the basis of the sums the ranking of objects is established. Applying the indifference principle of Laplace we can assume that the full difference value K_max − K_min = 1 − 0 = 1 should be partitioned into m − 1 equal differences (K_max − K_min)/(m − 1), where m is the number of places in the ranking. Experimental identification of surfaces of the multi-criterion showed that, for all interviewed people, these surfaces were strongly nonlinear. Fig. 9 shows the multi-criterion surface for a randomly chosen person.

For comparison, Fig. 10 shows the multi-criterion surface for co-author WS of the arti-

cle.

Figure 9. Functional surface of the individual multi-criterion of the resulting color-attractiveness

achieved by mixing 2 component colors with different proportion-rates.

Figure 10. Functional surface of the individual multi-criterion of attractiveness of the resulting color

achieved by mixing 2 component colors with different proportion-rates (WS)


The realized investigation also showed that functional surfaces of the multi-criterion of

all persons were strongly nonlinear. Fig. 9 presents the functional, M-Cr-surface of one of

the persons taking part in the investigation. For other interviewed people, these M-Cr-

surfaces were also highly nonlinear. (Identification of M-Cr-surfaces has been performed

for a group of 307 selected people).

4. Nonlinearity indicator of the functional surface of a multi-criterion

In the case of the 2-component multi-criterion K = f(K1,K2) it is possible to visualize the functional surface of the M-Cr and to evaluate its nonlinearity degree approximately and visually or, at least, to evaluate whether the surface is a linear or a nonlinear one. However, in the case of higher-dimensional multi-criteria K = f(K1,K2, … ,Kn) visualization and visual evaluation of nonlinearity become more and more difficult, though they can be realized e.g. with the method of lower-dimension cuts [7]. Therefore it would be very useful to construct a quantitative indicator of nonlinearity N-IndK of a model of the multi-criterion K. First, for a better understanding of the problem, let us analyze the simplest criterion-model K = f(K1), the criterion of the lowest dimension, identified with the method of characteristic objects (Ch-Ob-method). Let us assume that after the realized investigations we have at our disposal m objects, each of which is described by the pair (K1,K) of coordinate values and can be interpreted as a measurement sample that can be used for identification of a functional dependence. Let us assume that the characteristic objects are distributed in the coordinate-system space as shown in Fig.11a.

coordinate-system space as shown in Fig.11a.

Figure 11. An example placement of characteristic objects (K_{1i}, K_i), i = 1, …, m, in the space K_1 × K, Fig.11a, and a nonlinear, fuzzy model approximating the characteristic objects, Fig.11b

The nonlinearity of the fuzzy model approximating the criterion-function K=f(K1) will be the smaller, the smaller the sum of differences (Ki – KLi) between corresponding points lying on the fuzzy and on the linear approximation of the criterion function. Information about this sum is delivered by the proposed nonlinearity indicator N-IndK, formula (7).

N\text{-}Ind_K = \frac{\sum_{i=1}^{m} |K_i - K_{Li}|}{0.5\, m\, (K_{max} - K_{min})} = \frac{\sum_{i=1}^{m} |K_i - (w_0 + w_1 K_{1i})|}{0.5\, m\, (K_{max} - K_{min})}   (7)


The denominator 0.5·m·(Kmax − Kmin) in formula (7) realizes normalization of the indicator to the interval [0,1]. Fig. 12a presents a distribution of characteristic objects for which the nonlinearity indicator equals zero. Fig. 12b presents the inverse situation, when the indicator assumes a value equal to 1.

Figure 12. Distribution of characteristic objects (K1i, Ki), i = 1, …, m, for which the nonlinearity indicator

N-IndK is equal to zero, Fig.12a, and distribution for which the indicator assumes the maximal

value 1, Fig.12b

If we use a multi-criterion K aggregating n component criteria Ki, then the linear approx-

imation KL of K has the form (8) and the nonlinearity indicator N-IndK is expressed by for-

mula (9).

K_L = w_0 + w_1 K_1 + w_2 K_2 + \dots + w_n K_n   (8)

N\text{-}Ind_K = \frac{\sum_{i=1}^{m} |K_i - K_{Li}|}{0.5\, m\, (K_{max} - K_{min})}   (9)

The linear approximation KL of a M-Cr can be determined e.g. with the method of the

minimal sum of square errors for which many program-tools can be found, e.g. in

MATLAB and STATISTICA. As an example, the nonlinearity indicator was determined for

the multi-criterion K = f(K1,K2) aggregating winnings and losses of a play, see Fig.5 and

Table 4. The achieved value of the indicator was N-IndK = 0.35. The linear model KL of the multi-criterion K, obtained with the least squares method, is presented in Fig. 13a, and in Fig. 13b, for comparison, the fuzzy model of this criterion, obtained with the characteristic objects method, is shown.
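A sketch of the computation of N-IndK according to (8)–(9), with the linear approximation obtained by least squares; applied to the 25 characteristic objects of Table 4 it gives a value of approximately 0.35, consistent with the value reported above:

    import numpy as np

    def nonlinearity_indicator(K, K_components):
        """N-Ind_K from (9): compare characteristic-object values K_i with a
        least-squares linear approximation KL = w0 + w1*K1 + ... + wn*Kn."""
        K = np.asarray(K, dtype=float)
        X = np.column_stack([np.ones(len(K)), np.asarray(K_components, dtype=float)])
        w, *_ = np.linalg.lstsq(X, K, rcond=None)     # w0, w1, ..., wn
        K_lin = X @ w
        return np.sum(np.abs(K - K_lin)) / (0.5 * len(K) * (K.max() - K.min()))

    # Example: the 25 play-acceptability objects (K1 = winnings, K2 = losses, K from Table 4).
    levels = [0.0, 2.5, 5.0, 7.5, 10.0]
    table4 = [[0, 1, 1, 1, 1],
              [0, 0, 0.25, 0.5, 0.75],
              [0, 0, 0, 0.25, 0.5],
              [0, 0, 0, 0, 0.25],
              [0, 0, 0, 0, 0]]
    K1, K2, K = [], [], []
    for j, loss in enumerate(levels):
        for i, win in enumerate(levels):
            K1.append(win); K2.append(loss); K.append(table4[j][i])
    print(nonlinearity_indicator(K, np.column_stack([K1, K2])))   # approx. 0.35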

Another example of determining the nonlinearity indicator N-IndK is given for the multi-

criterion of attractiveness of the resulting color achieved by mixing 2 component colors

with different proportion-rates, which was presented in part 3. The indicator N-IndK was

calculated for the nonlinear models from Fig. 9 and Fig. 10.


Figure 13. Comparison of the linear model KL = w0 + w1K1 +w2K2, Fig.13a, and of the nonlinear

model K = f(K1,K2) of the multi-criterion of acceptability of plays on the basis of their winnings

K1($) and losses K2($). In Fig.13b the nonlinear model obtained with the method of characteristic

objects with the nonlinearity indicator N-IndK = 0.35.

The linear model in Fig. 14a was identified by the method of least squares. For comparison

the nonlinear model presented in Fig. 14b was determined. For this model the nonlinearity

indicator is equal to 0.49. This means a higher nonlinearity degree than in case of the play-

problem presented in Fig. 13 where this value was equal to 0.35.

Figure 14. Comparison of the linear model KL = w0 + w1G +w2B, Fig.14a, and of the nonlinear model

K = f(G, B) of the multi-criterion of attractiveness of the resulting color achieved by mixing 2 com-

ponent colors with different proportion-rates, Fig.14b. The nonlinear model obtained with the meth-

od of characteristic objects is characterized by the nonlinearity indicator N-IndK = 0.49.

Fig. 15a presents the linear model of this same multi-criterion for co-author WS of the

article. This model was identified with the method of least squares. After comparing the linear

model with the fuzzy model presented in Fig. 15b the nonlinearity indicator 0.54 was

achieved. This means the highest degree of the multi-criterion nonlinearity in all presented

cases.


Figure 15. Comparison of the linear model KL = w0 + w1G +w2B, Fig.15a, and of the nonlinear model

K = f(G, B) of the multi-criterion of attractiveness of the resulting color achieved by mixing 2 com-

ponent colors with different proportion-rates, Fig.15b. The nonlinear multi-criterion was identified

with the method of characteristic objects. Its nonlinearity indicator equals N-IndK = 0.54.

5. Conclusions

Human multi-criteria representing human preferences are usually not of a linear but of a nonlinear character. Linearity is an idealized feature and it occurs rather seldom in reality. The paper presented a few examples of nonlinear human multi-criteria – a considerably greater number could easily be presented. In modeling human multi-criteria, scientists should move from linear to nonlinear models (approximations) of these criteria. The paper

presented the method of characteristic objects, which enables identification of more precise,

nonlinear models of human multi-criteria. Because it is difficult to visualize high-

dimensional multi-criteria a nonlinearity indicator was proposed. This indicator allows for

error-evaluation of linear, simplified models of human multi-criteria. The method of charac-

teristic objects and the nonlinearity indicator were conceived by Andrzej Piegat.

References

[1] Brans J.P., Vincke P.: A preference ranking organization method: the PROMETHEE method for MCDM. Management Science, 1985.

[2] Burdzy K.: The search for certainty. World Scientific, New Jersey, London, 2009.

[3] Figueira J. et al.: Multiple criteria decision analysis: state of the arts surveys. Springer

Science + Business Media Inc, New York, 2005.

[4] French S. et al.: Decision behavior, analysis and support. Cambridge, New York, 2009.

[5] Hwang Cl., Yoon K.: Multiple attribute decision making: methods and applications. Springer-Verlag, Berlin, 1981.

[6] Kahneman D., Tversky A.: Choices, values and frames. Cambridge University Press,

Cambridge, New York, 2000.

[7] Lu Jie et al.: Multi-objective group decision-making. Imperial College Press, London,

Singapore, 2007.

[8] Piegat A.: Materials for the lecture Methods of Artificial Intelligence. Faculty of Computer Science, West Pomeranian University of Technology, Szczecin, Poland, unpublished.

[9] Piegat A.: Fuzzy modeling and control. Springer-Verlag, Heidelberg, New York, 2001.


[10] Rao C.R.: Linear Models: Least Squares and Alternatives., Rao C.R.(eds), Springer

Series in Statistics, 1999.

[11] Rutkowski L.: Metody i techniki sztucznej inteligencji (Methods and techniques of arti-ficial intelligence)

[12] Saaty T.L.: How to make a decision: the analytic hierarchy process. European Journal

of Operational Research, vol.48, no1, pp.9-26, 1990.

[13] Saaty T.L.: Decision making with dependence and feedback: the analytic network pro-cess. RWS Publications, Pittsburg, Pennsylvania, 1996.

[14] Saaty T.L., Brady C.: The encyclicon, volume 2: a dictionary of complex decisions us-ing the analytic network process. RWS Publications, Pittsburgh, Pennsylvania, 2009.

[15] Stadnicki J.: Teoria i praktyka rozwiązywania zadań optymalizacji (Theory and practice of solving optimization problems). Wydawnictwo Naukowo-Techniczne, Warszawa,

[16] Zarghami M., Szidarovszky F.: Multicriteria analysis. Springer, Heidelberg, New

York, 2011.

[17] Zeleny M.: Compromise programming. In Cochrane J.L., Zeleny M.,(eds). Multiple

criteria decision-making. University of South Carolina Press, Columbia, pp. 263-301,

1973.

[18] Zimmermann H.J.: Fuzzy set theory and its applications. Kluwer Academic Publishers,

Boston/Dordrecht/London, 1991.


Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 50-57

ISSN 2299-2634 http://www.jtacs.org

Method of non-functional requirements balancing during

service development

Larisa Globa 1, Tatiana Kot 1, Andrei Reverchuk 2, Alexander Schill 3

1 National Technical University of Ukraine «Kyiv Polytechnic Institute», Ukraine
2 SITRONICS Telecom Solutions, Czech Republic a.s.
3 Technische Universität Dresden, Fakultät Informatik, Germany

{lgloba, tkot}@its.kpi.ua, [email protected], [email protected]

Abstract: Today, the list of telecom services, their functionality and requirements for Service Execu-

tion Environment (SEE) are changing extremely fast. Especially when it concerns require-

ments for charging as they have a high influence on business. This results in the need for

constant adaptation and reconfiguration of Online Charging System (OCS) used in mobile

operator networks. Moreover any new functionality requested from a service can have an

impact on system behavior (performance, response time, delays) which are in general non-

functional requirements. Currently, this influence and reconfiguration strategies are poorly

formalized and validated. Current state-of-the-art approaches are methodologies that can model non-functional or functional requirements, but these approaches do not take into account the interaction between functional and non-functional requirements and the collaboration between services. All this results in time- and money-consuming service development and testing, and causes delays during service deployment. The balancing method

proposed in this paper fills this gap. It employs a well-defined workflow with predefined

stages for development and deployment process for OCS. The applicability of this novel ap-

proach is described in a separate section which contains an example of GPRS service

charging. A tool, based on this method will be developed, providing automation of service

functionality influence on non-functional requirements and allowing to provide a target de-

ployment model for a particular customer. The reduction of development time and thus nec-

essary financial input has been proved based on real-world experiments.

Keywords: OCS, service deployment, non-functional requirements, requirements balancing.

1. Introduction

During the design and deployment of services provided by a telecom operator using an OCS [1], one important aspect should be considered: the non-functional requirements (NFR) on service provision.

It is an established fact that any system, and the services run on it, shall be developed based not only on functional requirements, which define software functions (inputs, behavior, outputs), but on non-functional ones as well. Meeting non-functional requirements is very important in the telecom industry, especially for real-time systems. Generally, non-functional parameters can be classified as follows: Performance (Response Time, Throughput, Utilization, Static Volumetric); Scalability; Capacity; Availability; Reliability; Recoverability; Maintainability; Serviceability; Security; Regulatory; Manageability; Environmental; Data Integrity; Usability; Interoperability.

Non-functional requirements specify a system's "quality characteristics" or "quality attributes". If non-functional requirements are not considered at the design level, the provided service may turn out to be useless in practice.

Currently, NFR are not considered from the perspective of the whole list of services provided by a telecom operator. The main problem is that legacy methods can design a service according to NFR, but cannot model the influence of concurrent services on a particular NFR that arises from the collaboration between services.

This means that the operator has no tool that allows flexible balancing between the services run on the OCS. Balancing makes it possible to model system behavior for a determined (requested) list of services and to analyze how this configuration meets the NFR.

This paper describes a novel NFR balancing method, focusing on the collaboration between functional and non-functional requirements, which automates the service planning stages and reduces the time and cost of OCS adaptation in general.

The paper is structured as follows: Section 2 contains a state-of-the-art analysis of methods and approaches to considering NFR; furthermore, NFR analysis methods are described. Section 3 introduces the NFR balancing method, focusing on the collaboration between functional and non-functional requirements. The evaluation, carried out using a real-world scenario within a telecommunication company, is presented in Section 4. Section 5 concludes the work with a summary and an outlook on future work.

2. State of the art and non-functional testing

Errors due to the omission of NFR, or to not dealing with them properly, are among the most expensive and most difficult to correct. Recent work [2] points out that early-phase requirements engineering should address organizational and non-functional requirements, while later-phase engineering focuses on completeness, consistency and automated verification of requirements.

There are reports [3, 4] showing that not dealing properly with NFR has led to considerable project delays and, consequently, to a significant increase in the final cost.

There are many reasons for delays and significant cost increases, but one of the most important is that performance is neglected during software development, leading to changes in both the hardware and software architecture, as well as in the software design and code [5, 6, 7].

A system may even have to be deactivated just after its deployment because, among other reasons, many non-functional requirements were neglected during its development, such as: reliability (vehicle location), cost (emphasis on the best price), usability (poor control of information on the screen), and performance (the system did what it was supposed to do, but its performance was unacceptable). As mentioned above, an OCS shall provide all the functionality needed to charge telecom services (GPRS, voice, SMS, MMS, value added services) using the Event Charging with Unit Reservation, Session Charging with Unit Reservation, and Immediate Event Charging mechanisms. Each service consumes a strictly predefined volume of system resources (memory, processor time, etc.) and influences the non-functional requirements to be supported.

2.1. NFR framework

NFR are considered at the design level, and several approaches can help to model NFR within the scope of the developed service. The NFR framework [7] is a methodology that guides the system to accommodate change with replaceable components. The NFR framework is a goal-oriented and process-oriented quality approach guiding NFR modeling. Non-functional requirements such as security, accuracy, performance and cost are used to drive the overall design process and to choose design alternatives. It helps developers express NFR explicitly, deal with them systematically and use them to drive the development process rationally [8]. In the NFR framework, each NFR is called an NFR softgoal (depicted by a cloud), while each development technique to achieve the NFR is called an operationalizing softgoal or design softgoal (depicted by a dark cloud). Design rationale is represented by a claim softgoal (depicted by a dashed cloud). Goal refinement can take place along the Type or the Topic. These three kinds of softgoals are connected by links to form the softgoal interdependency graph (SIG), which records the design considerations and shows the interdependencies among softgoals.

2.2. KAOS

Another methodology for considering NFR is KAOS [9, 10]. KAOS is a requirements engineering methodology that enables analysts to build requirements models and to derive requirements documents from KAOS models. KAOS has been designed:
− to fit problem descriptions by allowing the analyst to define and manipulate concepts relevant to the problem description;
− to improve the problem analysis process by providing a systematic approach for discovering and structuring requirements;
− to clarify the responsibilities of all the project stakeholders;
− to let the stakeholders communicate easily and efficiently about the requirements.
KAOS is independent of the development model type (waterfall, iterative, incremental), but it also does not take into account the collaboration between functional requirements (FR) and NFR.
The legacy software tools, for instance NFR-Assistant CASE [11] and ARIS [12], do not provide the functionality needed to model non-functional requirements and to compare their influence on functionality.

2.3. Non-functional testing

Testing of non-functional requirements is another issue. Non-functional testing [13] is concerned with the non-functional requirements and is designed to evaluate the readiness of a system according to several criteria not covered by functional testing. Non-functional testing covers:
− Load and Performance Testing;
− Ergonomics Testing;
− Stress & Volume Testing;
− Compatibility & Migration Testing;
− Data Conversion Testing;
− Security / Penetration Testing;
− Operational Readiness Testing;
− Installation Testing;
− Security Testing (Application Security, Network Security, System Security).

It enables the measurement and comparison of the non-functional attributes of software systems. The cost of catching and correcting errors related to non-functional requirements is very high and may require a full redesign of the developed service (system). Testing does not have to start only once the code has been delivered; it can start early, with analyzing the requirements and creating test criteria for what needs to be tested. The process for doing this is described by the "V" model [9] (Fig. 1).

It decomposes requirements and testing and allows testing and coding to proceed as parallel activities, which enables changes to be handled more dynamically. NFR have a high influence on the testing process, and any service that does not meet the NFR can cause a rollback of the development process to its initial phases.

Figure 1. V-model

3. NFR balancing method

The proposed NFR balancing method is based on creating a collaboration model between FR and NFR. The implementation of functional requirements is represented by a list of functional blocks (FB), each of which is responsible for a particular logical function. The proposed method includes the following main stages:
− NFR catalogue development;
− FR decomposition;
− NFR mapping;
− FB distribution;
− Balancing;
− Target deployment model.
The NFR balancing method uses the NFR catalogue and the functional requirements to be implemented, and creates a collaboration model between them. The main stages of the concept are presented below.

3.1. Catalogue of NFR

NFR are usually complex, global, conflicting and numerous. Aside from that, both software engineers and stakeholders are not used to recognizing NFR. Because of that, a knowledge base will be used to present NFR in the form of catalogues, to guide requirements engineering through the NFR that may be needed and the possible operationalizations for each NFR. Thus we can operate with catalogues for performance and serviceability. These catalogues will be updated with further operationalizations to keep the NFR catalogues up to date. Such an approach will facilitate future reuse of the acquired knowledge on NFR elicitation.

3.2. FR decomposition

The next stage is creating the FR decomposition model. The FR decomposition shall describe all services together with the influence of their features on NFR. This means that each service shall be split into functional blocks. A functional block is a logical unit responsible for providing some strictly defined functionality (for instance, sending a notification, bonus system registration, etc.). Moreover, the services and the features they provide will be depicted for each functional block (functional requirements).

The total distribution of functional blocks between all services run on the OCS is represented in Table 1.

Table 1. FR decomposition

Service Functional Block Functional Requirement

Service1 FB1.1 or FB1.2 FR1, FR2

Service1 FB2.1 and FB2.2 FR3, FR1

Service2 FB1.1 FR5, FR6

Service2 FB3 FR1, FR7

3.3. NFR mapping

Each call of an FB requests a defined amount of each system resource (memory, processor time, network, etc.) and has a list of characteristics: response time, availability, etc. All of these characteristics shall be mapped to the NFR from the catalogue, with values that specify how well the particular FB meets each NFR (graded from 0 to 100 – Table 2).

Table 2. NFR mapping

Functional block/ NFR Availability Performance Security

FB1.1 90 80 10

FB1.2 80 70 20

FB2.1 50 10 10

FB2.2 5 20 30

FBs with the same first number (FB1.1, FB1.2) provide the same functionality but in different ways. This means that from a functional point of view there is no difference between these two blocks; the difference lies only in how each FB meets the NFR.

To understand and reason about the different alternatives involved in these tradeoffs between functional blocks, it is necessary to clarify some NFR operationalizations and to negotiate which NFR should be denied, or partially denied, in favor of another NFR.

To build the NFR model, it is necessary to go through every service and connect it to all

needed functional blocks to cover the requested functionality.
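To make the mapping tangible, the grades of Table 2 can be held in a simple lookup structure and checked against requested NFR levels. The following Python sketch is purely illustrative: the dictionary layout and the threshold rule are assumptions for this example, not part of the method's specification.

# Hypothetical NFR mapping (grades 0-100), following the shape of Table 2.
nfr_mapping = {
    "FB1.1": {"availability": 90, "performance": 80, "security": 10},
    "FB1.2": {"availability": 80, "performance": 70, "security": 20},
    "FB2.1": {"availability": 50, "performance": 10, "security": 10},
    "FB2.2": {"availability": 5,  "performance": 20, "security": 30},
}

def meets(block, thresholds):
    """Return True if the functional block reaches every requested NFR grade."""
    grades = nfr_mapping[block]
    return all(grades.get(nfr, 0) >= level for nfr, level in thresholds.items())

# FB1.1 and FB1.2 implement the same functionality; only their NFR grades differ.
print(meets("FB1.1", {"availability": 85}))  # True
print(meets("FB1.2", {"availability": 85}))  # False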


3.4. Functional blocks distribution

Using the NFR catalogues and the FR decomposition, the functional block distribution can be realized as represented in Fig. 2.

Fig. 2 represents the use of functional blocks by services. The influence of each connection between a service and an FB on the NFR is determined in Table 2. According to this, the same input could lead to different deployment configurations. Fig. 2 shows that FR1 and FR2 from Table 1 can be implemented either by FB1.1 or by FB1.2; the implementation chosen depends on the NFR specification for a particular case.

Figure 2. Functional blocks distribution

3.5. Balancing and target model

The target model is obtained by balancing between the NFR and the possible ways of implementing a particular functionality with FBs. This tradeoff can be continued until a target deployment configuration satisfying the requested NFR is obtained. If the requested NFR cannot be achieved with the existing list of services, then some service should be excluded from the deployment scheme. For instance, suppose the customer demands that the service support the highest availability, while no specific requirement is given for security and performance. Such a case can be realized by the model represented in Fig. 3. This is a simple situation; in practice there are usually combinations of NFR. Thus, a priority should be assigned to every requirement and considered during target model development.

Figure 3. Target deployment model


4. Charging of GPRS service

The evaluation of the proposed method is demonstrated using a real-world scenario within a telecommunication company. Charging of a GPRS service at the design level, requested by a telecom operator from the OCS, is described as an example. Its FR decomposition is depicted in Table 3.

Table 3. FR decomposition of GPRS service

Service   Functional Block                 Functional Requirement
GPRS      LBS1.1 or LBS1.2                 Location Base Charging
GPRS      RF2.1 and RF2.2                  Step Charging
GPRS      NB3.1 or NB3.2 or NB3.3          User notification

Assuming that the customer takes into account the availability of the GPRS service and the delay caused by the service as the main NFR, and according to statistical data and the knowledge base, all FB characteristics are estimated in Table 4.

Table 4. NFR mapping of GPRS service

Functional block / NFR                                                 Availability   Delay
LBS1.1 – location based module implemented as internal cache in OCS        90          80
LBS1.2 – using external Home Zone Billing (HZB) platform                   50          10
RF2.1 – internal Rating                                                     50          20
RF2.2 – external Rating                                                      5          15
NB3.1 – notification via SMS                                                40          50
NB3.2 – online notification via USSD                                        50          40
NB3.3 – offline notification via email                                      50          10

Finally, the target model for the GPRS service can be created using the balancing method to obtain the optimal deployment configuration (Fig. 4). The model assumes that the configuration will be applied to provide the service with the highest availability and minimal delay.

Figure 4. Target model for GPRS service
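For illustration, the selection underlying such a target model can be automated directly over the grades of Table 4. The decision rule in this sketch (take the alternative with the highest availability grade, then the better delay grade) is an assumption made for the example, not the authors' exact algorithm; RF2.1 and RF2.2 are connected by "and" in Table 3 and are therefore both required rather than chosen.

# Grades from Table 4; per Section 3.3 each value states how well the block
# meets the NFR (0-100, higher is better). Only "or"-connected alternatives
# are listed, since they represent an actual choice.
candidates = {
    "Location Base Charging": {"LBS1.1": (90, 80), "LBS1.2": (50, 10)},
    "User notification": {"NB3.1": (40, 50), "NB3.2": (50, 40), "NB3.3": (50, 10)},
}

def pick(alternatives):
    # Assumed rule: highest availability grade first, better delay grade as tie-breaker.
    return max(alternatives.items(), key=lambda kv: kv[1])[0]

target = {fr: pick(alts) for fr, alts in candidates.items()}
print(target)   # {'Location Base Charging': 'LBS1.1', 'User notification': 'NB3.2'}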

5. Summary and outlook

The proposed method can be applied at both the service design and the deployment stages. The method could be realized within a software tool used for the design and realization of service provision software. It is also worth foreseeing the possibility of its use during service monitoring to obtain specific statistical data. These data shall be used to evaluate how well each functional block meets a particular NFR. The method increases the efficiency of the development process in the testing and deployment phases and allows quick system reconfiguration on customer demand. In the future, the method will be extended to consider possible changes of the NFR list and of their priorities during different time periods (e.g. periods with high load, or service upgrading) and also to take into account changing priorities between services.

References

[1] 3GPP TS 32.296: Online Charging System (OCS): Application and interfaces, 88 p.
[2] Abdukalykov R., Hussain I., Kassab M., Ormandjieva O.: Quantifying the Impact of Different Non-functional Requirements and Problem Domains on Software Effort Estimation. 9th International Conference on Software Engineering Research, Management and Applications (SERA), 2011.
[3] National Institute of Standards and Technology: Software Errors Cost U.S. Economy $59.5 Billion Annually (NIST 2002-10). http://www.nist.gov/public_affairs/releases/n02-10.htm (2002).
[4] Lindstrom D.R.: Five Ways to Destroy a Development Project. IEEE Software, September 1993, pp. 55-58.
[5] Boehm B., In H.: Identifying Quality-Requirement Conflicts. IEEE Software, March 1996, pp. 25-35.
[6] Breitman K.K., Leite J.C.S.P., Finkelstein A.: The World's Stage: A Survey on Requirements Engineering Using a Real-Life Case Study. Journal of the Brazilian Computer Society, No. 1, Vol. 6, Jul. 1999, pp. 13-37.
[7] Chung L.: Representing and Using Non-Functional Requirements: A Process Oriented Approach. Ph.D. Thesis, Dept. of Computer Science, University of Toronto, June 1993. Also Tech. Rep. DKBS-TR-91-1.
[8] Chung L., Nixon B.A., Yu E., Mylopoulos J.: Non-Functional Requirements in Software Engineering. Kluwer Academic Publishers, Boston, 2000.
[9] http://www.info.ucl.ac.be/research/projects/AVL/ReqEng.html
[10] http://www.objectiver.com/
[11] Tran Q.: NFR-Assistant: Tool Support for Achieving Quality. Proceedings of the 1999 IEEE Symposium on Application-Specific Systems and Software Engineering and Technology (ASSET '99), 1999.
[12] http://www.softwareag.com/corporate/products/aris_platform/default.asp
[13] Page A., Johnston K., Rollison B.: How We Test Software at Microsoft. Microsoft Press, December 10, 2008, 448 p.


Journal of Theoretical and Applied Computer Science Vol. 6, No. 3, 2012, pp. 58–70

ISSN 2299-2634 http://www.jtacs.org

Donor limited hot deck imputation: effects on parameter estimation

Dieter William Joenssen, Udo Bankhofer

Technische Universität Ilmenau, Germany

{Dieter-William.Joenssen, Udo.Bankhofer}@TU-Ilmenau.de

Abstract: Methods for dealing with missing data in the context of large surveys or data mining projects are limited by the computational complexity they may exhibit. Hot deck imputation methods are computationally simple, yet effective for creating complete data sets from which correct inferences may be drawn. All hot deck methods draw the values used for imputing missing values from the data matrix that will later be analyzed. The object from which these available values are taken for imputation within another is called the donor. This duplication of values may lead to the problem that using any donor "too often" induces incorrect estimates. To mitigate this dilemma, some hot deck methods limit the number of times any one donor may be selected. This study answers which conditions influence whether or not such a limitation is sensible for six different hot deck methods. In addition, five factors that influence the strength of any such advantage are identified, and possibilities for further research are discussed.

Keywords: hot deck imputation, missing data, non-response, imputation, simulation

1. Introduction

Dealing with missing observations when estimating parameters or extracting information from empirical data remains a challenge for scientists and practitioners alike. Failures in either manual or automated data collection or editing, such as aggregating information from different sources [18] or outlier removal [22], cause missing observations. Some missing data may be resolved through manual or automatic logical inference, when values can be inferred directly from existing data (e.g. a missing passport number when the respondent has no passport, or a missing age when the date of birth is known). If missing data cannot be resolved in this way (e.g. due to cost restraints or a lack of domain knowledge), it must be compensated for in light of the missingness mechanism.

Rubin [25] first treated missing data indicators as random variables. Based on the indicators' distribution, he defined three basic mechanisms, MCAR, MAR, and NMAR, that govern which missing data methods are appropriate. With MCAR (missing completely at random), missingness is independent of any data values, missing or observed. Thus, under MCAR, the observed values represent a subsample of the intended sample. Under MAR (missing at random), whether or not data is missing depends on the values of some observed data. A MAR mechanism would be present if response rates for an item differ between two groups of respondents, e.g. survey respondents with a higher education level are less likely to answer a question on income than respondents with a lower education level. Finally, under NMAR (not missing at random), the presence of missing data depends on the values of the variable that is itself subject to missingness. NMAR missingness is present when, for example, data is less likely to be transmitted by a temperature sensor if the temperature rises above a certain threshold.

With missingness present, conventional methods cannot simply be applied to the data without proxy. Explicit provisions must be made before or within the analysis. The provisions to deal with the missing data must be chosen based on the identified missingness mechanism. Principally, two strategies to deal with missing data in the data mining or large survey context are appropriate: elimination and imputation. Elimination procedures remove objects or attributes with missingness from the analysis. These only lead to a data set from which accurate inferences may be made if the missingness mechanism is MCAR, and correctly identified as such. But even if the mechanism is MCAR, eliminating records with missing values is an inferior strategy, especially when many records need to be eliminated due to unfavorable missingness patterns or data collection schemes (e.g. asynchronous sampling). Imputation methods replace missing values with estimates ([17], [1]) and can be suitable under the less stringent assumptions of MAR. Some techniques can even lead to correct inferences under the non-ignorable NMAR mechanism ([3], [19]). Replacing missing values with reasonable ones not only assures that all information gathered can be used, but also broadens the spectrum of available analyses. Imputation methods differ in how they define these reasonable values. The simplest imputation techniques, and so far the state of the art for data mining [18], replace missing values with eligible location parameters. Beyond that, multivariate methods, such as regression or classification methods, may be used to identify imputation values. The interested reader may find a more complete description of missingness mechanisms and methods for dealing with missing data in [3], [19], or [14].

A category of imputation techniques appropriate for imputation in the context of mining large amounts of data and large surveys, due to its computational simplicity (cf. [22], [14], [20]), is hot deck imputation. Ford [11] defines a hot deck procedure as one where missing items are replaced by using values from one or more similar records within the same classification group. Partitioning records into disjoint, homogeneous groups is done so that the selected, good records that supply the imputation values (the donors) follow the same distribution as the bad records (the recipients). Due to this, and the replication property, all hot deck imputed data sets contain only plausible values, which cannot be guaranteed by most other methods. Traditionally, a donor is chosen at random, but other methods, such as ordering by a covariate when sequentially imputing records, or nearest neighbor techniques utilizing distance metrics, are possible; these improve estimates at the expense of computational simplicity (cf. [11], [19]).

The replication of values leads to the central problem in question here. Any donor may, fundamentally, be chosen to accommodate multiple recipients. This poses the inherent risk that "too many" or even all recipients are imputed with the same value or values from a single donor. Due to this, some variants of hot deck procedures limit the number of times any one donor may be selected for donating its values. This inevitably leads to the question under which conditions a limitation is sensible and whether or not some appropriate limit value exists. This study aims to answer these questions. An overview of the basic mechanics of hot deck methods is presented in chapter 2. Chapter 3 discusses current empirical and theoretical research on this topic. Chapter 4 highlights the simulation study design, while results are reported and discussed in chapter 5. A conclusion and possibilities for further research are presented in chapter 6.


2. Overview of Hot Deck Methods

Ford [11] describes hot deck methods as processes in which a reported value is duplicated to represent a value missing from the sample. Sande [26] extends this to define hot deck imputation procedures as methods for completing incomplete responses using values from one or more records in the same file. Thus, from a procedural standpoint, hot deck methods clearly match donors and recipients within the same data matrix, whereby observations are duplicated to resolve either all of the recipient's missingness simultaneously or on an attribute-sequential basis. Simultaneous resolution of all the recipient's missing data may better preserve the associations between the variables, while sequential resolution ensures a larger donor pool. Since, theoretically, any procedure may be applied iteratively to all attributes exhibiting missing values, hot deck methods are better classified by the way donors and recipients are matched. The two primary possibilities for donor matching are:
— Randomly. A donor is selected at random to accommodate any recipient. This method is, computationally speaking, the simplest. It preserves the overall distribution of the data and leads to correct mean and variance estimation [2] under the MCAR mechanism. When data is not missing MCAR, this method can be modified in various ways. Most often, imputation classes are formed by stratifying on auxiliary variables or by applying common clustering procedures to the data, in an effort to achieve MCAR missingness within the classes. The random matching of donor and recipient is then performed within these classes.
Another variant of the random hot deck applies weights to the selection probabilities [27]. This guarantees that donors more similar to the recipient have a higher chance of being selected.
The last and most widely used (random) method is the so-called sequential hot deck, a procedure developed by the U.S. Census Bureau [7]. Based on partitioning the data into imputation classes, each record in the data set is considered in turn. If a record is missing a value, this value is replaced by one saved in a register. If the record is complete, the register's value is updated. Initial values for this register are taken either from a previous survey, from the class, or randomly from the variables' domain. The sequential hot deck yields results equivalent to the random hot deck if the data set's ordering is random. An advantage may be attained when the ordering is nonrandom, such as when the data set is sorted by covariates. This is, however, seldom done purposefully, as it requires not only computationally intensive sorting but also the identification of strong covariates. Usually, in any sequential hot deck application, any order in the data set is due to data entry procedures and thus is unlikely to ensure substantially better results.

— Deterministically. This class of hot decks matches recipients to their respective donors. These procedures, usually of the nearest neighbor type, are state of the art for many statistical institutes and bureaus around the world. For example, nearest neighbor hot decks are used by the US Bureau of the Census in the CPS¹, SIPP², and ACS³ surveys, the UK Office for National Statistics used them for the 2001/2011 Censuses⁴, and Statistics Canada utilizes nearest neighbor hot decks in 45% of all active surveys exhibiting missing data, such as the SLID and LFS⁵.

1 http://www.census.gov/cps/methodology/
2 http://www.census.gov/sipp/editing.html
3 http://www.census.gov/acs/www/methodology/item allocation rates definitions/
4 http://www.ons.gov.uk/ons/guide-method/index.html
5 http://www23.statcan.gc.ca/imdb-bmdi/pub/index-eng.htm


The nearest neighbor is usually defined by minimizing simple distance functions such as the Manhattan or Chebyshev distances. These hot decks guarantee that the same donor is always chosen given a static data set, ensuring consistency when multiple independent analyses are performed on the data after a public release. While distance matrix computation tends to become prohibitively expensive for large amounts of data, this limit is reached later for the nearest neighbor hot deck methods, as neither the simultaneous nor the sequential version requires a full distance matrix; only the distances between all donors and all recipients need to be calculated.

All hot deck methods guarantee, by virtue of the duplication property, that the imputed data set contains only naturally occurring values, without the need to round or transform categorical values. Hot decks also conserve unique distribution features, such as discontinuities or spikes. Their low cost of implementation and execution is, however, offset by the fact that little is known about their theoretical properties.

Literature further detailing the mechanics of hot deck imputation methods includes [11], [19], [14], [15], and [6].
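To make the mechanics above concrete, the following minimal Python sketch shows a register-based sequential hot deck for a single attribute and a nearest neighbor hot deck with an optional donor limit inside one imputation class. The distance definition, the handling of mixed scales and the tie-breaking are simplified assumptions for illustration and do not reproduce the exact procedures evaluated later in this study.

import numpy as np

def sequential_hot_deck(values, start):
    """Register-based sequential hot deck for one attribute within one
    imputation class: a missing entry (None) is replaced by the register
    value, a complete entry updates the register."""
    register, out = start, []
    for v in values:
        if v is None:
            out.append(register)          # impute the last observed value
        else:
            register = v                  # update the register
            out.append(v)
    return out

def nn_hot_deck(data, donor_limit=None):
    """Nearest neighbor hot deck within one imputation class. Missing values
    are np.nan; complete rows act as donors. Distances are Manhattan distances
    over the recipient's observed attributes, and donor_limit caps how often
    any one donor may be chosen (None = unlimited)."""
    data = data.copy()
    complete = ~np.isnan(data).any(axis=1)
    donors = list(np.where(complete)[0])
    recipients = np.where(~complete)[0]
    usage = {d: 0 for d in donors}
    for r in recipients:
        observed = ~np.isnan(data[r])
        eligible = [d for d in donors
                    if donor_limit is None or usage[d] < donor_limit]
        # With a donor limit of one, this assumes at least as many donors as recipients.
        dists = [np.abs(data[d, observed] - data[r, observed]).sum() for d in eligible]
        best = eligible[int(np.argmin(dists))]
        data[r, ~observed] = data[best, ~observed]   # duplicate the donor's values
        usage[best] += 1
    return data

print(sequential_hot_deck([3, None, 5, None, None, 8], start=0))   # [3, 3, 5, 5, 5, 8]
x = np.array([[1.0, 2.0], [2.0, 2.5], [5.0, 9.0],
              [1.1, np.nan], [4.8, np.nan]])
print(nn_hot_deck(x, donor_limit=1))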

3. Review of Literature

The theoretical effects of a donor limit were first investigated by Kalton and Kish [16]. Based on combinatorics, they come to the conclusion that selecting a donor from the donor pool without replacement leads to a reduction in the imputation variance, the precision with which any parameter is estimated from the post-imputation data matrix. A possible effect on an imputation-introduced bias was not discussed. Two more arguments in favor of a donor limit are made. First, the risk of exclusively using one donor for all imputations is removed [26]. Second, the probability of using one donor with an extreme value or values "too often" is reduced ([3], [28]). Based on these arguments and sources, recommendations are made in [15], [21], [28], and [10].

In contrast, Andridge and Little [2] reason that imposing a donor limit inherently reduces the ability to choose the most similar, and therefore most appropriate, donor for imputation. Not limiting the times a donor can be chosen may thus increase data quality. Generally speaking, a donor limit makes results dependent on the order of object imputation. Usually, the imputation order will correspond to the sequence of the objects in the data set. This property is undesirable, especially in deterministic hot decks. Thus, from a theoretical point of view, it is not clear whether or not a donor limit has a positive or negative impact on the post-imputation data's quality.

The literature on this subject provides only studies that compare hot deck imputation methods with other imputation methods. These studies consider either only drawing the donor from the donor pool with replacement ([4], [24], [29]) or only drawing without replacement ([13]).

It becomes apparent, based on this review of literature, that the consequences of imposing a donor limit have not been sufficiently examined.

4. Study Design

Considering the possible theoretical advantage of a donor limit, and possible effects that have not been investigated to date, the following questions will be answered by this study:
1. Are the true parameters of a hot deck imputed data matrix estimated with higher precision when a donor limit is used?
2. Does a donor limit lead to less biased post-imputation parameter estimation?
3. What factors influence whether a hot deck with a donor limit creates better results?

A series of factors that might influence whether or not a donor limit affects parameter estimates are identified by considering papers where authors chose similar approaches ([23], [24], [28]) and by further deliberation. The factors varied are the following:

— Imputation class count: Imputation classes are assumed to be given prior to imputation, and data is generated as determined by the class structure. Factor levels are two and seven imputation classes.
— Objects per imputation class: The number of objects characterizing each imputation class is varied. Factor levels of 50 and 250 objects per class are considered.
— Class structure: To differentiate between well- and ill-chosen imputation classes, data are generated with a relatively strong and a relatively weak class structure. A strong class structure is achieved by having classes overlap by 5% and an inner-class correlation of .5. A weak class structure is achieved by an intra-class overlap of 30% and no inner-class correlation.
— Data matrices: Data matrices of nine multivariate normal variables are generated depending on the given class structure. Three of these variables are then transformed to a discrete uniform distribution with either five or seven possible values, simulating an ordinal scale. The next three variables are converted to a nominal scale so that 60% of all objects are expected to take the value one, with the remaining values being set to zero. General details on this NORTA-type transformation are described by Cario and Nelson [8].
— Portion of missing data: Factor levels include 5, 10, and 20% missing data points, and every object is assured to have at least one data point available (no subject non-response).
— Missingness mechanism: The missingness mechanisms considered are MCAR, MAR, and NMAR. These are generated as follows (see also the sketch after this list): under MCAR, a set amount of values is chosen without replacement to be missing. Under MAR, missing data is generated as under MCAR but using two different rates based on the value of one binary variable, which is not itself subject to missingness. The rates of missingness are either 10% higher or lower than the rates under MCAR. NMAR modifies the MAR mechanism to also allow missingness of the binary variable. To forgo possible problems with the simultaneous imputation methods and the donor limit of once, it was guaranteed that at least 50% of all objects within one class were complete in all attributes.
— Hot deck methods: The six hot deck methods considered are named "SeqR," "SeqDW," "SeqDM," "SimR," "SimDW," and "SimDM" according to the three properties that they exhibit. The prefixes denote whether attributes are considered sequentially (Seq) or simultaneously (Sim) for imputation. The postfixes indicate a random (R) or a distance-based deterministic (D) hot deck and the type of adjustment made to compensate for missingness when computing the distances. "W" indicates a reweighting type of compensation, which assumes that the missing components contribute an average deviation to the distance. "M" denotes that an imputation of relevant location estimates is performed before the distance calculation, which assumes that the missing component is close to the average for this attribute. To account for variability and importance, variables are weighted with the inverse of their range prior to aggregating the Manhattan distances.

In addition to the previously mentioned factors, two static and two dynamic donor limits are evaluated. The two static donor limits allow a donor to be chosen either once or an unlimited number of times. For the dynamic cases, the limit is set to either 25% or 50% of the recipient count.

To evaluate imputation quality, a set of location, variability, and contingency measures is considered (cf. [21]). For the quantitative variables, the mean, variance, and correlation are computed; for the ordinal variables, the median, quartile distance, and rank correlation; and for the binary variables, the relative frequency of the value one and the normalized coefficient of contingency.

100 data matrices are simulated for every factor level combination of "imputation class count", "object count per imputation class", "class structure", and "ordinal variable scale". For every complete data matrix, the set of true parameters is computed. Each of these 1600 data matrices is then subjected to each missingness mechanism, generating three different amounts of missing data. All of the matrices with missing data are then imputed by all six hot deck methods using all four donor limits. Repeating this process ten times creates 3.456 million imputed data matrices, for which each parameter set is calculated again.

Considering every parameter in the set, the relative deviation ∆p between the true parameter value p_T and the estimated parameter value p_I, based on the imputed data matrix, is calculated as follows:

\Delta p = \frac{p_I - p_T}{p_T}    (1)

To analyze the impact of different donor limits on the quality of imputation, the differences in the absolute values of ∆p that can be attributed to the change in donor limitation are considered. Due to the large amounts of data generated in this simulation, statistical significance tests on these absolute relative deviations are not considered appropriate. As an alternative, Cohen's d measure of effect ([9], [5]) is chosen as a qualitative criterion. The calculation of Cohen's d for this case is as follows:

d = \frac{|\Delta p_1| - |\Delta p_2|}{\sqrt{(s_1^2 + s_2^2)/2}}    (2)

∆p_1 and ∆p_2 are the means of all relative deviations calculated via (1) for two different donor limits, and s_1^2 and s_2^2 are the corresponding variances of the relative deviations. Using absolute values of ∆p_1 and ∆p_2 allows interpreting the sign of d: a positive sign means that the second case of donor limitation performed better than the first, while a negative sign means the converse. As with any qualitative interpretation of results, thresholds are quite arbitrary and depend on the investigator's frame of reference. Recommendations ([9], [12]) are to consider deviations larger than 10% of a standard deviation as meaningful, and thus the threshold for considering effects nontrivial is set to |d| ≥ .1.
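For clarity, criteria (1) and (2) translate directly into code. In the following sketch the deviation vectors are placeholder values; the sign convention follows the interpretation above, so a negative d favors the first donor-limit setting.

import numpy as np

def relative_deviation(p_imputed, p_true):
    # Equation (1): relative deviation of a post-imputation estimate.
    return (p_imputed - p_true) / p_true

def cohens_d(dev1, dev2):
    # Equation (2): absolute mean relative deviation per donor-limit setting,
    # pooled with the variances of the deviations; negative d favours dev1.
    num = abs(dev1.mean()) - abs(dev2.mean())
    pooled = np.sqrt((dev1.var(ddof=1) + dev2.var(ddof=1)) / 2)
    return num / pooled

# Placeholder deviations for two donor limits (e.g. "once" vs. "unlimited").
dev_once = np.array([0.02, -0.03, 0.01, 0.04])
dev_unlim = np.array([0.05, -0.06, 0.03, 0.07])
print(cohens_d(dev_once, dev_unlim))   # negative here: the stricter limit does better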

5. Results

Based on the simulation's results, the research questions formulated in section 4 are now answered. Section 5.1 deals with whether or not minimum imputation variance is always achieved, independent of the data and the chosen hot deck procedure, when the most stringent donor limit is applied. Section 5.2 deals with whether or not a donor limit will introduce a bias. Influencing factors are analyzed for each hot deck method separately in section 5.3.


Table 1. Frequency distribution of minimum imputation variance

                                            Donor limit
Evaluated parameter              once      25%      50%     unlim.
Quantitative   Mean            68.52%   15.47%    7.95%     8.06%
variables      Var.            67.25%   15.74%    8.56%     8.45%
               Corr.           48.84%   19.98%   15.24%    15.93%
Ordinal        Med.            74.54%   11.38%    7.62%     6.46%
variables      Q. dist.        85.88%    5.71%    4.96%     3.45%
               Rank corr.      62.27%   14.47%   11.75%    11.52%
Binary         Rel. freq.      78.36%    8.41%    6.96%     6.27%
variables      Cont. coef.     61.64%   14.29%   11.71%    12.37%

Table 2. Frequency distribution of minimum imputation bias

                                            Donor limit
Evaluated parameter              once      25%      50%     unlim.
Quantitative   Mean            42.71%    2.22%   18.48%    18.60%
variables      Var.            54.05%   17.79%   13.04%    15.12%
               Corr.           38.08%   19.60%   18.96%    23.36%
Ordinal        Med.            46.41%   21.53%   14.47%    17.59%
variables      Q. dist.        56.83%   16.24%   12.94%    13.99%
               Rank corr.      40.63%    2.99%    2.24%    18.15%
Binary         Rel. freq.      49.42%   18.94%   15.07%    16.57%
variables      Cont. coef.     63.10%   15.83%    9.81%    11.27%

5.1. Donor Limitation Impact on Precision

The theoretical reduction in imputation variance through donor selection without replacement, as put forth by Kalton and Kish [16], is investigated empirically at this point. Table 1 shows how often, over all simulated situations, a certain donor limit leads to the least imputation variance in the parameter estimate.

Clearly, a donor limit of one leads to minimal imputation variance in most cases and can thus be expected to yield the highest precision in parameter estimation. Estimation precision also tends to increase with the stringency of the donor limit. Variables with a lower scale type, binary and ordinal, favor donor selection without replacement even more strongly than the quantitative variables. Nonetheless, this recommendation does not hold for all situations: some situations demand using donors more often, while others require protection against donor overusage.

5.2. Donor Limitation Impact on Imputation Bias

To answer the question, more pressing for the practitioner, of whether or not implementing a donor limit also leads to a reduction in imputation bias, the recorded data was evaluated in a similar fashion. Table 2 shows the percentage of situations in which a certain donor limit has the least bias, as measured by the mean relative deviations.

The values indicate, just as with the imputation variance previously discussed, that in most cases donor selection without replacement leads to the best expected parameter estimation. Minimal imputation bias is mostly achieved by limiting donor usage to just one time, but even more so than for the imputation variance, there are situations where other donor limits improve hot deck performance. Measures of variability are more strongly affected than those of location, which means that in some cases donor limitation will lead to less accurate confidence intervals. Contingency measures are affected less than both location and variability measures, signifying that the choice of donor limit is even more important if the association between variables is of interest.


Table 3. Effect sizes for each factor

                                        Quantitative variables        Ordinal variables              Binary variables
Factor               Level          Mean     Var.    Corr.       Med.   Q. dist.  Rank corr.    Rel. freq.  Cont. coef.
Imputation           2              .000    -.068    -.003      -.001    -.029      -.014         -.072       -.065
class count          7              .000    -.147    -.052      -.003    -.115      -.054         -.090       -.118
Objects per          50             .000    -.112    -.005      -.001    -.073      -.019         -.028       -.116
imputation class     250            .000    -.090    -.162      -.005    -.041      -.145         -.141       -.146
Class                Strong         .000    -.092    -.004      -.001    -.072      -.013         -.072       -.088
structure            Weak           .000    -.094    -.008      -.001    -.045      -.019         -.080       -.102
Portion of           5%             .000    -.025    -.002       .000    -.013      -.003         -.011       -.020
missing data         10%            .000    -.071    -.002       .000    -.037      -.010         -.051       -.061
                     20%            .000    -.148    -.008       .000    -.100      -.027         -.129       -.156
Missingness          MCAR           .001    -.088    -.004      -.001    -.053      -.015         -.065       -.087
mechanism            MAR            .000    -.100    -.006       .000    -.066      -.017         -.086       -.101
                     NMAR           .001    -.091    -.003       .000    -.058      -.013         -.077       -.089
Hot deck             SimDW         -.001     .153    -.008      -.002     .025       .024          .075       -.147
method               SimDM         -.004    -.339    -.018       .005    -.214      -.058         -.338       -.222
                     SeqDW          .001    -.007     .000      -.003     .000       .002         -.005       -.057
                     SeqDM          .000    -.088    -.004       .010    -.133      -.006         -.041       -.078
                     SimR           .000    -.001     .000      -.001    -.004       .001          .000       -.007
                     SeqR           .000    -.001     .000       .000    -.001       .002         -.003       -.002


5.3. Analysis of Donor Limit Influencing Factors

Cohen's d is used to analyze which of the factors influence whether or not a donor limitation is beneficial. The tables in the following sections first highlight main effects, followed by between-factor effects on any donor limit advantages. Effect sizes are calculated between the two extreme cases, donor selection without and with replacement. Effects exceeding the threshold value of .1 are printed in bold, with negative values indicating an advantage for the most stringent donor limit.

5.3.1. Analysis of Main Effects

Table 3 shows the cross classification between all factors and factor levels for all parameters analyzed.

The first conclusion that can be reached upon investigation of the results is that, independent of the chosen factors, there are no meaningful differences between using a donor limit and using no donor limit in mean and median estimation. This result is congruent with the results of the previous section. In contrast, parameters measuring variability are more heavily influenced by the variation of the chosen factors. In particular, data matrices with a high proportion of missing data, as well as those imputed with SimDM, will profit significantly from a donor limitation. Correlation measures are influenced mainly by the number of objects per imputation class. All effects related to the binary variables are negative, indicating that especially these types of variables profit from donor selection without replacement. A high number of imputation classes also tends to speak for a limit on donor usage.

The class structure, the random hot deck procedures and SeqDW have no influence on whether a donor limit is advantageous. Fairly conspicuous is the fact that SimDW leads to partially positive effect sizes, meaning that leaving donor usage unlimited is favorable. This leads to interesting higher-order effects, detailed in the following section.

5.3.2. Analysis of Interactions

Based on the findings in the previous section, effects are investigated stratified by the hot deck methods SimDW, SimDM and SeqDM. Results for the parameters mean and median, for the quantitative and ordinal variables respectively, are omitted because no circumstance considered yielded meaningful differences. The values for the remaining parameters are shown in Table 4.

As in the analysis of main effects, this table clearly shows that using SimDW with no donor limit is advantageous in most cases. If solely the estimation of association between binary variables is of interest, limiting donor usage to once is always appropriate. Furthermore, the other two methods, SimDM and SeqDM, show only negative values. Thus, the advantage of using a hot deck with a donor limit is strongly dependent upon the imputation method used.

For all three portrayed methods, a high number of imputation classes and a high percentage of missing data show meaningful effects, indicating an increased tendency for whichever strategy of choosing a donor limit is advantageous. The number of objects per imputation class shows no homogeneous effect on the parameters; rather, it seems to strengthen the advantage that donor limitation or non-limitation has, with the parameters variance and quartile distance reacting inversely to the other four.

The other factors seemingly do not influence the effects, as their variation does not lead to great differences in the effect sizes, making their absolute level dependent only on the variable's scale or the imputation method.

Besides the results shown in Table 4, further cross classifications between factors may be calculated. These effect sizes further highlight the additive nature of the factors systematically varied in this study. Some strikingly large effects arise when considering large amounts of missingness and many imputation classes. For example, the factor level combination of 20% missing data, a high number of imputation classes, and a low number of objects per imputation class leads to effects of up to -1.7 in variance, up to -1.9 in quartile distance, -3.6 in correlation, -2.9 in rank correlation, and -2.5 in the coefficient of contingency when imputing with the SimDM algorithm. Maximum effects when imputing with the SimDW method are likewise reached with 20% missing data, seven imputation classes, and a low number of objects per imputation class.

Effect sizes of up to -3 are calculated for the relative frequency of the binary variable when the number of imputation classes is large, each class has many objects and many values are missing. This signifies a large advantage for donor selection without replacement when using SimDM. On the other hand, when using SimDW the largest effects are calculated when the number of classes is high but the number of objects is low, while the rate of missingness is high. Even though this only leads to effects of up to .6 and .34 for variance and quartile distance respectively, the effect is noticeable and relevant for donor selection with replacement. Conspicuous nonetheless is the fact that it is especially the combination of hot deck variant, number of imputation classes, objects per imputation class, and portion of missing data that leads to strong effects, indicating strong advantages both for and against donor limitation.


Table 4. Interactions between imputation method and other factors

Method   Factor               Level      Var.   Q. dist.  Rel. freq.   Corr.   Rank corr.  Cont. coef.
SimDW    Imputation           2          .097     .025       .081      -.005      .020       -.120
         class count          7          .287     .033       .075       .084      .106       -.176
         Objects per          50         .182     .082       .034      -.009      .029       -.177
         imputation class     250        .143     .056       .140       .337      .314        .012
         Class structure      Strong     .144     .006       .071      -.007      .018       -.145
                              Weak       .153     .048       .078       .001      .033       -.154
         Portion of           5%         .065    -.012       .031      -.006      .008       -.054
         missing data         10%        .148     .006       .077      -.004      .023       -.137
                              20%        .203     .061       .101      -.006      .034       -.216
         Missingness          MAR        .151     .025       .079      -.011      .023       -.150
         mechanism            MCAR       .153     .023       .067      -.005      .026       -.143
                              NMAR       .154     .029       .077      -.004      .022       -.148
SimDM    Imputation           2         -.247    -.101      -.300      -.015     -.050       -.156
         class count          7         -.521    -.382      -.424      -.185     -.213       -.278
         Objects per          50        -.426    -.284      -.132      -.021     -.074       -.280
         imputation class     250       -.319    -.131      -.684      -.505     -.473       -.445
         Class structure      Strong    -.338    -.269      -.313      -.014     -.049       -.217
                              Weak      -.339    -.156      -.362      -.033     -.073       -.233
         Portion of           5%        -.084    -.057      -.045      -.007     -.010       -.048
         missing data         10%       -.262    -.162      -.213      -.011     -.034       -.159
                              20%       -.558    -.345      -.600      -.028     -.108       -.369
         Missingness          MAR       -.355    -.226      -.372      -.021     -.064       -.235
         mechanism            MCAR      -.326    -.204      -.296      -.017     -.055       -.212
                              NMAR      -.334    -.213      -.344      -.015     -.054       -.220
SeqDM    Imputation           2         -.066    -.082      -.049      -.002     -.008       -.065
         class count          7         -.130    -.217      -.031      -.051     -.003       -.090
         Objects per          50        -.111    -.196      -.004      -.004     -.007       -.104
         imputation class     250       -.088    -.047      -.098      -.130     -.086       -.089
         Class structure      Strong    -.085    -.132      -.040      -.003     -.006       -.067
                              Weak      -.091    -.135      -.042      -.008     -.006       -.096
         Portion of           5%        -.013    -.028      -.004       .000      .001       -.011
         missing data         10%       -.039    -.073      -.010      -.002      .001       -.033
                              20%       -.168    -.233      -.085      -.007     -.015       -.147
         Missingness          MAR       -.107    -.152      -.058      -.004     -.010       -.101
         mechanism            MCAR      -.075    -.119      -.025      -.003     -.003       -.063
                              NMAR      -.081    -.125      -.038      -.004     -.004       -.068



6. Conclusions

The simulation conducted shows distinct differences between different levels of donor limits. Contrary to what Kalton and Kish [16] suggested, the smallest imputation variance is not always achieved when donors are selected from the pool without replacement. Their suggestion is thus limited to a subset of the many possible combinations of situations and hot deck types. When imputation bias is taken into account, it becomes apparent that there are many more situations where overly limiting donor usage is ill advised. For most parameters, the chances are less than 50/50 that the most extreme donor limit is advisable.

Further, there are some subsets of situations in which both imputation variance and bias are minimal when one of the two dynamic donor limits is chosen. This indicates that neither donor selection with replacement nor donor selection without replacement is always superior, but that there is indeed a tradeoff between protection from donor overusage and the ability to choose the most similar donor. Thus the truth lies between the arguments presented in section 3.

These findings show that the most influential factor in deciding whether or not to impute using donor selection without replacement is the hot deck method used. When using random hot deck methods, the question of choosing a donor limit is of little consequence; implementing a donor limit into an existing system would not be worth the effort. When considering nearest neighbor hot decks, not only the method of compensating for the missing data prior to calculating the dissimilarity measure is influential, but also whether variables are processed simultaneously or sequentially. With distance calculation assisted by mean imputation, donor selection without replacement is always the superior strategy. If a reweighting scheme is chosen, parameter estimation (excluding the contingency coefficient for binary variables) is never worse when donors may be chosen an unlimited number of times. Sequential processing of variables leads to trivial differences, but simultaneous processing of variables leads to noticeable advantages when allowing unlimited donor usage. Beyond that, the overall magnitude of the advantage for any donor usage tactic is determined by the factors objects per imputation class, number of imputation classes, and proportion of missing data. These results, in conjunction with the intended post-imputation analyses, dictate which donor limit, with or without replacement, is most suitable. For example, if a decision tree should be constructed with a CHAID algorithm, donor selection without replacement should be used for imputation, because the coefficient of contingency is best estimated when donor selection is performed without replacement.

In conclusion, some interesting questions can be answered with this research, while others remain open. Results from sections 5.1 and 5.2 indicate that there may be a situation-dependent optimal donor limit which may be determined dynamically from the data on hand. Hence, a hot deck method with a data-driven donor limit selection method may have desirable properties. Finally, the large number of situations under which donor selection without replacement is the superior strategy raises questions. Since imputing without donor replacement generally makes the results dependent on the sequence of the recipients, the results of hot deck imputation could be improved further if donor selection were performed not to minimize the distance at each step, but to minimize the sum of distances between all donors and recipients. Thus, further research pertaining to hot deck imputation and donor selection schemes remains worthwhile.


References

[1] Allison P.D.: Missing Data. Sage University Papers Series on Quantitative Applications in the Social Sciences, Thousand Oaks, 2001.
[2] Andridge R.R., Little R.J.A.: A Review of Hot Deck Imputation for Survey Non-Response. International Statistical Review, 78, 1, pp. 40–64, 2010.
[3] Bankhofer U.: Unvollständige Daten- und Distanzmatrizen in der Multivariaten Datenanalyse. Eul, Bergisch Gladbach, 1995.
[4] Barzi F., Woodward M.: Imputations of Missing Values in Practice: Results from Imputations of Serum Cholesterol in 28 Cohort Studies. American Journal of Epidemiology, 160, pp. 34–45, 2004.
[5] Bortz J., Döring N.: Forschungsmethoden und Evaluation für Human- und Sozialwissenschaftler. Springer, Berlin, 2009.
[6] Brick J.M., Kalton G.: Handling Missing Data in Survey Research. Statistical Methods in Medical Research, 5, pp. 215–238, 1996.
[7] Brooks C.A., Bailar B.A.: An Error Profile: Employment as Measured by the Current Population Survey. Statistical Policy Working Paper 3, U.S. Government Printing Office, Washington, D.C., 1978.
[8] Cario M.C., Nelson B.L.: Modeling and Generating Random Vectors with Arbitrary Marginal Distributions and Correlation Matrix. Northwestern University, IEMS Technical Report, 50, pp. 100–150, 1997.
[9] Cohen J.: A Power Primer. Quantitative Methods in Psychology, 112, pp. 155–159, 1992.
[10] Durrant G.B.: Imputation Methods for Handling Item-Nonresponse in Practice: Methodological Issues and Recent Debates. International Journal of Social Research Methodology, 12, pp. 293–304, 2009.
[11] Ford B.: An Overview of Hot-Deck Procedures. In: W. Madow, H. Nisselson, I. Olkin (Eds.): Incomplete Data in Sample Surveys, 2, Theory and Bibliographies. Academic Press, pp. 185–207, 1983.
[12] Fröhlich M., Pieter A.: Cohen's Effektstärken als Mass der Bewertung von praktischer Relevanz – Implikationen für die Praxis. Schweizerische Zeitschrift für Sportmedizin und Sporttraumatologie, 57, 4, pp. 139–142, 2009.
[13] Kaiser J.: The Effectiveness of Hot-Deck Procedures in Small Samples. Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 523–528, 1983.
[14] Kalton G., Kasprzyk D.: Imputing for Missing Survey Responses. Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 22–31, 1982.
[15] Kalton G., Kasprzyk D.: The Treatment of Missing Survey Data. Survey Methodology, 12, pp. 1–16, 1986.
[16] Kalton G., Kish L.: Two Efficient Random Imputation Procedures. Proceedings of the Survey Research Methods Section 1981, pp. 146–151, 1981.
[17] Kim J.O., Curry J.: The Treatment of Missing Data in Multivariate Analysis. Sociological Methods and Research, 6, pp. 215–240, 1977.
[18] Kim W., Choi B., Hong E., Kim S., Lee D.: A Taxonomy of Dirty Data. Data Mining and Knowledge Discovery, 7, 1, pp. 81–99, 2003.
[19] Little R.J., Rubin D.B.: Statistical Analysis with Missing Data. Wiley, New York, 1987.
[20] Marker D.A., Judkins D.R., Winglee M.: Large-Scale Imputation for Complex Surveys. In: Groves R.M., Dillman D.A., Eltinge J.L., Little R.J.A. (Eds.): Survey Nonresponse. John Wiley and Sons, pp. 329–341, 2001.
[21] Nordholt E.S.: Imputation: Methods, Simulation Experiments and Practical Examples. International Statistical Review, 66, pp. 157–180, 1998.
[22] Pearson R.: Mining Imperfect Data. Society for Industrial and Applied Mathematics, Philadelphia, 2005.
[23] Roth P.L.: Missing Data in Multiple Item Scales: A Monte Carlo Analysis of Missing Data Techniques. Organizational Research Methods, 2, pp. 211–232, 1999.
[24] Roth P.L., Switzer III F.S.: A Monte Carlo Analysis of Missing Data Techniques in a HRM Setting. Journal of Management, 21, pp. 1003–1023, 1995.
[25] Rubin D.B.: Inference and Missing Data (with discussion). Biometrika, 63, pp. 581–592, 1976.
[26] Sande I.: Hot-Deck Imputation Procedures. In: W. Madow, H. Nisselson, I. Olkin (Eds.): Incomplete Data in Sample Surveys, 3, Theory and Bibliographies. Academic Press, pp. 339–349, 1983.
[27] Siddique J., Belin T.R.: Multiple Imputation Using an Iterative Hot-Deck with Distance-Based Donor Selection. Statistics in Medicine, 27, 1, pp. 83–102, 2008.
[28] Strike K., Emam K.E., Madhavji N.: Software Cost Estimation with Incomplete Data. IEEE Transactions on Software Engineering, 27, pp. 890–908, 2001.
[29] Yenduri S., Iyengar S.S.: Performance Evaluation of Imputation Methods for Incomplete Datasets. International Journal of Software Engineering and Knowledge Engineering, 17, pp. 127–152, 2007.