JAMRIS 2015 Vol 9 No 4

pISSN 1897-8649 (PRINT) / eISSN 2080-2145 (ONLINE) VOLUME 9 N° 4 2015 www.jamris.org


scientific, peer-reviewed quarterly JOURNAL of AUTOMATION, MOBILE ROBOTICS, and INTELLIGENT SYSTEMS

Transcript of JAMRIS 2015 Vol 9 No 4

Page 1: JAMRIS 2015 Vol 9 No 4


Page 2: JAMRIS 2015 Vol 9 No 4


JOURNAL OF AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS

Publisher: Industrial Research Institute for Automation and Measurements PIAP

Editor-in-Chief: Janusz Kacprzyk (Polish Academy of Sciences, PIAP, Poland)

Advisory Board: Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA)

Kaoru Hirota (Japan Society for the Promotion of Science, Beijing Office)

Jan Jabłkowski (PIAP, Poland)

Witold Pedrycz (ECERF, University of Alberta, Canada)

Co-Editors: Roman Szewczyk (PIAP, Warsaw University of Technology)

Oscar Castillo (Tijuana Institute of Technology, Mexico)

Marek Zaremba (University of Quebec, Canada)


Executive Editor: Anna Ładan [email protected]

Associate Editors: Jacek Salach (Warsaw University of Technology, Poland), Maciej Trojnacki (PIAP, Poland)

Statistical Editor: Małgorzata Kaliczynska (PIAP, Poland)

Language Editors: Grace Palmer (USA), Urszula Wiaczek

Typesetting: Ewa Markowska, PIAP

Webmaster: Piotr Ryszawa, PIAP

Editorial Office: Industrial Research Institute for Automation and Measurements PIAP, Al. Jerozolimskie 202, 02-486 Warsaw, POLAND, Tel. +48-22-8740109, [email protected]

Copyright and reprint permissions: Executive Editor

The reference version of the journal is the electronic version. Printed in 300 copies.

If in doubt about the proper edition of contributions, please contact the Executive Editor. Articles are reviewed, excluding advertisements and descriptions of products.

All rights reserved ©

Editorial Board:
Chairman: Janusz Kacprzyk (Polish Academy of Sciences, PIAP, Poland)
Plamen Angelov (Lancaster University, UK)
Adam Borkowski (Polish Academy of Sciences, Poland)
Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany)
Chin Chen Chang (Feng Chia University, Taiwan)
Jorge Manuel Miranda Dias (University of Coimbra, Portugal)
Andries Engelbrecht (University of Pretoria, Republic of South Africa)
Pablo Estévez (University of Chile)
Bogdan Gabrys (Bournemouth University, UK)
Fernando Gomide (University of Campinas, São Paulo, Brazil)
Aboul Ella Hassanien (Cairo University, Egypt)
Joachim Hertzberg (Osnabrück University, Germany)
Evangelos V. Hristoforou (National Technical University of Athens, Greece)
Ryszard Jachowicz (Warsaw University of Technology, Poland)
Tadeusz Kaczorek (Bialystok University of Technology, Poland)
Nikola Kasabov (Auckland University of Technology, New Zealand)
Marian P. Kazmierkowski (Warsaw University of Technology, Poland)
Laszlo T. Kóczy (Szechenyi Istvan University, Gyor and Budapest University of Technology and Economics, Hungary)
Józef Korbicz (University of Zielona Góra, Poland)
Krzysztof Kozłowski (Poznan University of Technology, Poland)
Eckart Kramer (Fachhochschule Eberswalde, Germany)
Rudolf Kruse (Otto-von-Guericke-Universität, Magdeburg, Germany)
Ching-Teng Lin (National Chiao-Tung University, Taiwan)
Piotr Kulczycki (AGH University of Science and Technology, Cracow, Poland)
Andrew Kusiak (University of Iowa, USA)

Mark Last (Ben-Gurion University, Israel)
Anthony Maciejewski (Colorado State University, USA)
Krzysztof Malinowski (Warsaw University of Technology, Poland)
Andrzej Masłowski (Warsaw University of Technology, Poland)
Patricia Melin (Tijuana Institute of Technology, Mexico)
Fazel Naghdy (University of Wollongong, Australia)
Zbigniew Nahorski (Polish Academy of Sciences, Poland)
Nadia Nedjah (State University of Rio de Janeiro, Brazil)
Duc Truong Pham (Cardiff University, UK)
Lech Polkowski (Polish-Japanese Institute of Information Technology, Poland)
Alain Pruski (University of Metz, France)
Rita Ribeiro (UNINOVA, Instituto de Desenvolvimento de Novas Tecnologias, Caparica, Portugal)
Imre Rudas (Óbuda University, Hungary)
Leszek Rutkowski (Czestochowa University of Technology, Poland)
Alessandro Saffiotti (Örebro University, Sweden)
Klaus Schilling (Julius-Maximilians-University Wuerzburg, Germany)
Vassil Sgurev (Bulgarian Academy of Sciences, Department of Intelligent Systems, Bulgaria)
Helena Szczerbicka (Leibniz Universität, Hannover, Germany)
Ryszard Tadeusiewicz (AGH University of Science and Technology in Cracow, Poland)
Stanisław Tarasiewicz (University of Laval, Canada)
Piotr Tatjewski (Warsaw University of Technology, Poland)
Rene Wamkeue (University of Quebec, Canada)
Janusz Zalewski (Florida Gulf Coast University, USA)
Teresa Zielinska (Warsaw University of Technology, Poland)

Page 3: JAMRIS 2015 Vol 9 No 4

JOURNAL OF AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS
VOLUME 9, N° 4, 2015
DOI: 10.14313/JAMRIS_4-2015

CONTENTS

Development of Vibratory Part Feeder for Material Handling in Manufacturing Automation: A Survey
Udhayakumar Sadasivam
DOI: 10.14313/JAMRIS_4-2015/27, p. 3

Preliminary Study of Hydrodynamic Load on an Underwater Robotic Manipulator
Waldemar Kolodziejczyk
DOI: 10.14313/JAMRIS_4-2015/28, p. 11

Face Recognition Using Canonical Correlation, Discrimination Power, and Fractional Multiple Exemplar Discriminant Analyses
Mohammadreza Hajiarbabi, Arvin Agah
DOI: 10.14313/JAMRIS_4-2015/29, p. 18

Improving Self-Localization Efficiency in a Small Mobile Robot by Using a Hybrid Field of View Vision System
Marta Rostkowska, Piotr Skrzypczynski
DOI: 10.14313/JAMRIS_4-2015/30, p. 28

Design and Movement Control of a 12-Legged Mobile Robot
Jacek Rysinski, Bartlomiej Gola, Jerzy Kopec
DOI: 10.14313/JAMRIS_4-2015/31, p. 39

ICS System Supporting the Water Networks Management by Means of Mathematical Modelling and Optimization Algorithms
Jan Studzinski
DOI: 10.14313/JAMRIS_4-2015/32, p. 48

Development of Graphene Based Flow Sensor
Adam Kowalski, Marcin Safinowski, Roman Szewczyk, Wojciech Winiarski
DOI: 10.14313/JAMRIS_4-2015/33, p. 55

Multiaspect Text Categorization Problem Solving: A Nearest Neighbours Classifier Based Approaches and Beyond
Slawomir Zadrozny, Janusz Kacprzyk, Marek Gajewski
DOI: 10.14313/JAMRIS_4-2015/34, p. 58

Page 4: JAMRIS 2015 Vol 9 No 4

Journal of Automation, Mobile Robotics & Intelligent Systems VOLUME 9, N° 4 2015


Development of Vibratory Part Feeder for Material Handling in Manufacturing Automation: A Survey

Udhayakumar Sadasivam

Submitted: 13th June 2015; accepted: 28th August 2015

DOI: 10.14313/JAMRIS_4-2015/27

Abstract: In manufacturing automation, material handling plays a significant role. Material handling is the process of loading, placing, or manipulating material. Materials are to be handled efficiently, safely, accurately, and in a timely manner, so that the right parts arrive in the right quantities at the right locations, at low cost and without damage. Material handling devices are generally designed around standard production machinery and integrated with specially made feeders. In assembly and processing operations, parts need to be presented in a preferred orientation, which is achieved using part feeders. Vibratory feeders are a typical example; they are commonly used in industry for sorting and orienting parts before assembly. This paper surveys the literature on the design and development of part feeders, ranging from sensorless vibratory feeders to vision-based flexible part feeders.

Keywords: vibratory part feeder, flexible part feeder, conveying velocity

1. Introduction

Part feeders play a vital role in manufacturing industries. A part feeder has three major functions: storing, aligning, and feeding. Feeders are used to make production faster, more convenient, and less expensive. They are designed to supply a specific type of material as part of the production process, and they help maintain the flow of product needed for the next stage of the process. A part feeder takes in parts of arbitrary orientation and outputs them in a uniform orientation. Presenting parts in a preferred orientation is very useful in assembly and processing operations, and this can easily be achieved with part feeders. Vibratory feeders are typical examples; they are commonly used in industry for sorting and orienting parts before assembly [1]. The ease of controlling the flow of bulk materials and their adaptability to processing requirements make vibratory feeders popular in manufacturing industries. Vibratory feeders provide a suitable alternative to manual labour, saving the manufacturer's time and cost. Further, labour can then be used for value-adding activities rather than non-value-adding activities such as segregating and stacking. Designing an industrial part feeder consumes considerable time and is a trial-and-error process. The designer has to take into account critical aspects such as the part to be fed, the number of parts, the material of the feeder, etc. [2]. This paper surveys published work in the area of design and development of part feeding systems. The natural resting orientation of a part is the way in which the part comes to rest naturally on a horizontal surface [3]. Fore-knowledge of the probabilities of the feasible natural resting orientations of a part is critical in developing an efficient part feeder [1], [4]. Hence, the first step was a literature survey on methods for determining the probability of the natural resting orientations of parts. Then, the literature on the design and development of part feeding devices and flexible part feeding systems was reviewed.

2. Determining the Probability of Natural Resting Orientations

Parts are to be oriented in a desired manner for automated assembly operations [4]. If the most probable natural resting orientation of the part is chosen as the preferred orientation, the need to re-orient parts is minimized. The most probable natural resting orientation is the one with the highest probability of occurrence. The greater the number of parts in the preferred orientation, the higher the efficiency of the part feeder [1]. Ngoi et al. [5] stated that components have to be fed and aligned in a proper orientation at high speed in automated assembly. They also emphasized that, for continuous feeding of parts through vibratory feeding, the parts should be fed in the most probable natural resting orientation. They determined the probabilities of the natural resting orientations of parts using drop tests.

Moll and Erdmann [6] focused on orienting parts with minimal sensing and manipulation. A new approach to orienting parts through manipulation of pose distributions was elaborated. The pose distribution of a part dropped from an arbitrary height onto an arbitrary surface was determined through dynamic simulation. They analyzed the effect of drop height and the shape of the support surface on pose distributions. They also derived a condition on the pose and velocity of a planar object in contact with a sloped surface which enabled them to determine the final resting orientation of the part, and they validated the dynamic simulation results against experimental results.

Page 5: JAMRIS 2015 Vol 9 No 4

The experimental method of finding the most probable natural resting orientation is time consuming, and hence industry needs mathematical models to predict it from part geometry [2]. The theoretical methods commonly discussed in the literature for determining the probability of the natural resting orientations of parts are the energy barrier method, the centroid solid angle method, the stability method and the critical solid angle method. These methods are discussed below.

2.1. Energy Barrier Method

This method was proposed by Boothroyd et al. [3]. The probability of a part coming to rest in a particular orientation is a function of the energy tending to prevent a change of part orientation and the amount of energy possessed by the part when it falls into that resting orientation. For complex parts with more than two natural resting orientations, the energy barrier is difficult to compute, so this method is preferred only for simple parts with a constant cross section and two natural resting orientations [2].

2.2. Centroid Solid Angle Method

This method was proposed by Lee et al. [7]. A solid angle of one steradian is subtended by a part of a spherical surface whose area equals the square of the radius of the sphere. The centroid solid angle is the solid angle subtended from the centroid of a part. The centroid solid angle method is based on the assumption that the probability of a component resting in a specific orientation is directly proportional to the magnitude of the centroid solid angle and inversely proportional to the height of its centroid above that orientation.

The following steps are used to determine the solid angle of a part:
1. Assume the part is resting on a flat surface in any orientation.
2. Locate the centroid of the part (Figure 1).
3. Construct a pyramid with the centroid as the apex and the base of the part as its base (Figure 1).
4. Construct a sphere of arbitrary radius R with the centroid as its centre. The radius of the sphere should not exceed the part height (Figure 1).
5. The intersected volume of the pyramid and the sphere is called the enveloped volume, from which the centroid solid angle can be found (Figure 2).

The centroid solid angle of orientation 'i' (Wi) is computed from the enveloped volume Vi and the sphere radius R as

Wi = 3 Vi / R^3 (1)

If a part has n natural resting orientations, then the probability of natural resting orientation 'i' is obtained from Equation (2):

pi = (Wi / hi) / Σj (Wj / hj), j = 1, ..., n (2)

where:
- pi is the probability of the part resting in orientation 'i',
- n is the number of natural resting orientations,
- Wi is the centroid solid angle subtended by orientation 'i' from the centroid, sr,
- hi is the height of the centroid above orientation 'i', mm,
- Wj is the centroid solid angle subtended by orientation 'j' from the centroid, sr,
- hj is the height of the centroid above orientation 'j', mm.

Figure 1. Creation of pyramid with centroid as apex

Figure 2. Solid angle generation

The set of these probabilities is called the Static Probability Profile.
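As an illustrative sketch (not from the paper), the Static Probability Profile of Equation (2) can be computed directly once the centroid solid angles Wi and centroid heights hi have been measured from the part geometry; the numeric values below are made up for illustration.

```python
# Sketch: Static Probability Profile via the centroid solid angle method.
# p_i = (W_i / h_i) / sum_j (W_j / h_j), one (W, h) pair per natural
# resting orientation.

def static_probability_profile(solid_angles, heights):
    """Return the probability of each natural resting orientation."""
    if len(solid_angles) != len(heights):
        raise ValueError("one (W, h) pair per orientation is required")
    ratios = [w / h for w, h in zip(solid_angles, heights)]
    total = sum(ratios)
    return [r / total for r in ratios]

# Hypothetical three-orientation part (values made up for illustration):
W = [2.0, 1.0, 0.5]    # centroid solid angles, sr
h = [5.0, 10.0, 20.0]  # centroid heights, mm

profile = static_probability_profile(W, h)
print(profile)  # probabilities sum to 1; the first orientation dominates
```

The profile sums to one by construction, and the orientation with the largest W/h ratio is the most probable natural resting orientation.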

2.3. Stability Method

The stability method is based on logical analysis and was elaborated by Chua and Tay [8]. The larger the contact area with the base, the greater the stability; similarly, the lower and nearer to the base the part's centre of gravity, the higher the stability. The stability method is based on these two aspects. Stability S is a function of the magnitude of the contact area (Ar) and the distance of the centre of gravity (y) from the base: S is proportional to Ar and inversely proportional to y. The generalized expression for the probability of orientation 'i' is given by Equation (3):

pi = (Ni Ari / yi) / Σj (Nj Arj / yj) (3)

where:
- pi is the probability of orientation 'i',
- N is the number of surfaces identical to, and inclusive of, the contacting surface,
- Ar is the contact area, mm²,
- y is the distance from the base to the centre of gravity, mm.

2.4. Critical Solid Angle Method

This method is based on the hypothesis that the probability of a part resting in a particular orientation is proportional to the difference between the centroid solid angle subtended by that orientation and the critical solid angle of that orientation for changing to a neighbouring orientation, and is inversely proportional to the height of the centre of gravity for that orientation [9].

Page 6: JAMRIS 2015 Vol 9 No 4

The critical solid angle is the solid angle subtended by the resting orientation of the part with respect to the point that lies on the line normal to that orientation and passing through the centre of gravity, at a height equal to the distance between the centre of gravity and the edge shared by that orientation and its neighbouring orientation. In other words, the solid angle at the critical position of the part that is least required for a change in its orientation (with the part resting on its edge) is termed the critical solid angle. Whenever the part tries to shift to one of its neighbouring orientations, a new critical solid angle applies. Hence, the probability that a part comes to rest in an orientation is proportional to the difference between the centroid solid angle subtended by that resting orientation and the average of the critical solid angles of that orientation with respect to its neighbouring orientations, and inversely proportional to the height of the centre of gravity for that orientation. The probability of occurrence of each orientation is given by Equation (4):

pi = ((Wi - Wci) / hi) / Σj ((Wj - Wcj) / hj) (4)

where Wci is the average of the critical solid angles of orientation 'i' with respect to its neighbouring orientations.

Udhayakumar et al. [10] determined the most probable natural resting orientation for a family of sector-shaped parts using drop tests and theoretical methods (the centroid solid angle, stability and critical solid angle methods). The effect of the initial orientation of the part during the drop, and of the drop height, on the natural resting orientation was also studied [11]. Pearson's χ² test for goodness of fit between the drop test and theoretical method results revealed that the null hypothesis could not be accepted at the 95% confidence level.
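A goodness-of-fit check of the kind referred to above can be sketched as follows (not the authors' code; all counts and probabilities below are made up for illustration): drop-test counts per orientation are compared against the frequencies expected from a theoretical probability profile using the Pearson χ² statistic.

```python
# Sketch: Pearson chi-squared goodness-of-fit between observed drop-test
# counts and a theoretical probability profile.

def chi_squared_statistic(observed_counts, expected_probs):
    total = sum(observed_counts)
    expected = [p * total for p in expected_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))

observed = [420, 310, 270]       # drop-test counts per orientation (made up)
predicted = [0.45, 0.30, 0.25]   # theoretical probability profile (made up)

chi2 = chi_squared_statistic(observed, predicted)

# Standard table value: chi-squared critical value at the 0.05 significance
# level for n - 1 = 2 degrees of freedom.
CHI2_CRIT_2DF = 5.991
print(chi2, "reject H0" if chi2 > CHI2_CRIT_2DF else "cannot reject H0")
```

Comparing the statistic against the tabulated critical value for n - 1 degrees of freedom decides whether the theoretical profile is consistent with the drop-test data at the chosen confidence level.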

The next section provides a survey of the literature in the area of part feeders.

3. Vibratory Part Feeders

This section deals with the literature related to the design and development of part feeders. About 50% of manufacturing cost and 40% of the workforce is dedicated to production assembly [12]. A part feeder takes in identical parts of arbitrary orientation and outputs them in a uniform orientation. In the assembly process, parts have to be shifted from one orientation to another, and the most human-like way of doing that is gripping the part and then shifting it to another orientation [13]. Vibratory feeders are commonly used for orienting parts, most notably in industries such as food processing, plastic component manufacturing and automobiles. The most important factor to consider when selecting a part feeder is the type of parts to be fed. Feeder sizes and types are determined by a variety of factors, such as part size and configuration, part abrasiveness, the condition of the part when handled, and the required feed rate. The design of industrial part feeders is a trial-and-error process that can take several months [2].

Berkowitz and Canny [14] developed a tool to test feeder designs. The behaviour of the system was evaluated using a Markov model: the probability that a part in any random orientation ends up in a desired/preferred orientation was computed using Markov analysis. For each gate, the probabilities of converting each incoming orientation to each outgoing orientation were computed, and from these the feeder efficiency was calculated. They used the tool to simulate a feeder with an edge riser, and concluded that future work was required to determine the accuracy of the simulation results on an actual feeder.
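The Markov analysis described above can be sketched as follows (an assumption-laden illustration, not Berkowitz and Canny's actual tool): each gate is a transition matrix mapping an incoming orientation distribution to an outgoing one, and chaining the gates gives the probability that a random part leaves the feeder in the preferred orientation (here, orientation 0). The gate matrices are hypothetical.

```python
# Sketch: modelling a sequence of feeder gates as a Markov chain.

def apply_gate(distribution, transition):
    """Propagate an orientation distribution through one gate.

    transition[i][j] = P(part leaves in orientation j | it arrived in i).
    """
    n = len(distribution)
    return [sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)]

# Two hypothetical gates over three orientations (each row sums to 1).
gate_a = [[1.0, 0.0, 0.0],   # preferred orientation passes unchanged
          [0.6, 0.4, 0.0],   # the gate reorients some misfed parts
          [0.3, 0.0, 0.7]]
gate_b = [[1.0, 0.0, 0.0],
          [0.5, 0.5, 0.0],
          [0.5, 0.0, 0.5]]

dist = [1 / 3, 1 / 3, 1 / 3]  # parts enter in a random orientation
for gate in (gate_a, gate_b):
    dist = apply_gate(dist, gate)

print("efficiency:", dist[0])  # fraction ending in the preferred orientation
```

Because each gate only redistributes probability mass, the output distribution always sums to one, and the first entry is the feeder efficiency that the Markov analysis reports.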

Lim [15] performed a dynamic analysis of the vibratory feeder. A theoretical analysis of feeding on a track vibrating with simple harmonic motion was presented. Based on his analysis, the factors affecting the conveying velocity of a part on a vibratory feeder are the excitation frequency, the amplitude of vibration, the coefficient of friction and the track angle. A model was developed to predict the conveying velocity from these factors; the results of the model followed the same pattern as the experimental results.

The motion of parts on a planar part feeder consisting of a longitudinally vibrating flat plate was discussed by Reznik et al. [16]. They stated that the feed rate arises because the plate moves forward for a longer time than backward in each cycle, combined with the non-linear nature of Coulomb friction. They developed analytical expressions for the feed rate. Though the analytical results deviated from rigid-body dynamic simulation results, they followed the same pattern.

Many methods of sensorless part feeding are discussed in the literature, including orienting and positioning using push forces, fences, etc. Akella and Mason [17] discussed the use of pushing actions for orienting and translating objects. Their paper covered:
1. sequences of linear normal pushes for orienting and positioning polygonal objects,
2. the existence of a sequence of linear normal pushes to move any polygon from a start pose to a goal pose,
3. a polynomial-time pose planner that generates a push sequence to orient any polygon to the goal pose.

Berretty et al. [18] demonstrated that a polygonal part can be oriented using fences placed along a conveyor belt: by the end of the conveyor, a part in any pose has been converted to a unique final pose by the fences. They developed an algorithm for computing a fence design of minimal length (i.e., the fewest fences). The results yielded good fence designs for parts with acyclic left and right environments, but could not be generalized to arbitrary parts.

Lynch [19] augmented a 1JOC (Joint Over Conveyor) with a prismatic joint that allows the fence to move vertically, naming the result the 2JOC. He attempted the feeding of 3D parts on the conveyor by combining toppling with the ability of the 1JOC to perform conveyor-plane feeding.

Page 7: JAMRIS 2015 Vol 9 No 4

He also proposed the idea of developing inexpensive part feeders using toppling and pushing actions, and derived the mechanical conditions for toppling.

Bohringer et al. [20] developed programmable equipment that used a vibrating surface for positioning and orienting parts without sensor feedback or force closure, based on the dynamic modes of the vibrating surface. They demonstrated the apparatus using planar objects, and also developed polynomial-time algorithms to generate sequences of force fields.

Manipulating a planar rigid part on a conveyor belt using a robot with just one joint was explored by Akella et al. [21]. Their approach provides a simple and flexible method for feeding parts in industry. The 1JOC approach uses a fixed-velocity conveyor along with a single servoed joint to obtain the diversity of motions required for planar manipulation. They proved that the 1JOC is capable of useful planar manipulation: any polygon is controllable from a broad range of initial configurations to any goal chosen from a broad range of goal configurations. They also demonstrated that the sensorless 1JOC can position and orient polygons without sensing.

Berretty et al. [22] discussed sequences of mechanical devices, such as wiper blades and grooves, that filter polygonal parts on a track; they termed these 'traps'. Several trap gates were discussed, such as the balcony, gap, canyon and slot. A trap consists of a series of mechanical barriers (gates) which either reject disoriented parts or reorient them to the desired orientation; the former type is known as a passive trap and the latter as an active trap. Active traps are preferred over passive traps, since the efficiency of an active trap is 100%. These traps are mounted at the exit of the vibratory feeder. Vibratory bowl feeders are suitable for smaller parts, whereas linear vibratory feeders can be used for handling larger parts. The sequence of placement of passive and active devices depends on the orientation to be obtained as output.

Wiendahl and Rybarczyk [23] presented the possibilities and potentials of aerodynamic part feeding, e.g. a permanent air field which forces a part into the desired orientation without the need for any sensor control. They elaborated on three different aerodynamic part feeding methods. The orientation method is based on the behaviour of workpieces in an air flow field; the usable part characteristics for this method are the overall air resistance and the centre of gravity. The tipping method is applied for orienting workpieces about an axis of rotation parallel to the direction of transport; the relevant part characteristics are local air resistance, projected shape and centre of gravity. The rotating method was developed for orienting workpieces about an axis of rotation perpendicular to the direction of transport; the possible part characteristics are air resistance, centre of gravity and projected shape.

Jiang et al. [24] developed 3D simulation software for part feeding and orienting in a vibratory bowl feeder. A mathematical model of the part's motion and its behaviour in the orienting mechanism was determined, and based on this model, the 3D simulation software was developed in Java. The computer simulation results were in good agreement with the experimental results.

Force analysis and dynamic modelling of a vibratory feeder were presented by Richard et al. [25]. The vibratory feeder was treated as a three-legged parallel mechanism and the geometric properties of the feeder were determined. The effect of the leaf-spring legs was converted to forces and moments acting on the base of the bowl. A dynamic model that relates the angular displacement of the bowl to the displacement of the leaf-spring legs was developed, and Newtonian and Lagrangian approaches were used to verify the model.

Goemans et al. [26], [27] introduced a new class of geometric primitives, called blades, to feed a class of 3D parts without sensors by reorienting parts and rejecting all but a desired orientation. The blade receives identical polyhedral parts in arbitrary orientations as input and outputs parts in one single orientation. The blade is a horizontally mounted convex polygonal metal plate attached to the feeder wall; this plate is parallel to the track and has a triangular segment and a rectangular segment. The three parameters characterizing a blade are the blade angle, blade height and blade width.

Vose et al. [28] used force fields for sensorless part orientation. They developed a large family of programmable frictional force fields by vibrating a rigid plate, and stated that the field strength and the squeeze line were easily controllable in a six-degree-of-freedom implementation.

Ramalingam and Samuel [29] investigated the behaviour of a linear vibratory feeder used for conveying small parts. A rotating drum with radial fins was designed and developed for carrying out the experiments, and a tumbling barrel hopper was developed for feeding the components onto the track. They considered the parameters affecting the feed rate and conveying velocity of the part, such as the barrel dimensions, the amplitude and angle of vibration, the coefficient of friction and the operating frequency, and they determined the influence of these parameters experimentally.

Three different types of sensorless part feeding devices for handling asymmetric parts were discussed by Udhayakumar et al. [30]. They inferred that the efficiency of the feeder increases with the number of passes. They also determined the effect of excitation frequency and amplitude of vibration on the velocity of a part on a vibratory feeder, and presented a model to determine the part velocity.

A trap-based vibratory part feeder for conveying brake liners was developed by Udhayakumar et al. [31]. The trap had an efficiency of 100%. An expression relating the conveying velocity of the part to the excitation frequency, vibration amplitude and trap inclination angle was obtained through regression analysis. The developed set-up reduced the time taken for stacking 80 parts by 13.5%.
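A regression of the kind used in such studies can be sketched as follows (not the authors' model; the data points, coefficients and variable choices are made up for illustration): a linear model v = b0 + b1*f + b2*a + b3*theta for conveying velocity as a function of excitation frequency f, vibration amplitude a and trap inclination angle theta, fitted via the least-squares normal equations.

```python
# Sketch: multiple linear regression for conveying velocity via the normal
# equations X^T X b = X^T y, solved by Gaussian elimination with pivoting.

def fit_linear(X, y):
    """Least-squares fit of y = X b; returns the coefficient vector b."""
    n, p = len(X), len(X[0])
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         for i in range(p)]
    rhs = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    for i in range(p):                      # forward elimination
        piv = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        rhs[i], rhs[piv] = rhs[piv], rhs[i]
        for r in range(i + 1, p):
            m = A[r][i] / A[i][i]
            for c in range(i, p):
                A[r][c] -= m * A[i][c]
            rhs[r] -= m * rhs[i]
    coef = [0.0] * p
    for i in reversed(range(p)):            # back substitution
        coef[i] = (rhs[i] - sum(A[i][j] * coef[j]
                                for j in range(i + 1, p))) / A[i][i]
    return coef

# Rows: [1 (intercept), frequency (Hz), amplitude (mm), trap angle (deg)].
X = [[1, 40, 0.2, 5], [1, 45, 0.2, 5], [1, 50, 0.3, 5],
     [1, 40, 0.3, 10], [1, 45, 0.4, 10], [1, 50, 0.4, 15]]
# Velocities generated from v = 1 + 0.4*f + 10*a + 0.2*theta, so the fit
# should recover exactly these coefficients.
y = [20.0, 22.0, 25.0, 22.0, 25.0, 28.0]

print(fit_linear(X, y))  # close to [1.0, 0.4, 10.0, 0.2]
```

Since the synthetic data is exactly linear, the recovered coefficients match the generating model; with real measurements the same fit would yield a predictive expression of the kind reported in [31].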

To investigate the dynamic behaviour of the fed part, a 2D numerical model based on the discrete element method was developed by Ashrafizadeh and Ziaei-Rad [32]. The fed part was assumed to be rectangular, with three degrees of freedom. Through simulation, good agreement between the calculated and experimental data was observed. They concluded that the coefficient of friction plays a critical role in the sliding regime but not in the hopping regime. The proposed model was capable of demonstrating both the periodic and the chaotic behaviour of the part.

Page 8: JAMRIS 2015 Vol 9 No 4

A novel vibratory feeder called the decoupled vibratory feeder (DVF) was discussed by Liang Han et al. [33]. In the DVF, excitation is provided in two mutually perpendicular directions. The governing parameters, such as the vibration angle, excitation frequency, waveform of the driving signals, and phase angle between the vertical and horizontal excitations, are adjusted through software. They also developed a test system to evaluate the performance of the electromagnets.

The prediction of appropriate parameters for conveying brake pads on a vibratory feeder was discussed by Suresh et al. [34]. They determined the optimal frequency, trap angle and track angle using a linear regression model.

4. Flexible Part Feeders

This section covers the literature on non-vision-based and vision-based flexible part feeding systems. Boehlke et al. [35] stated that 50% of failures in automation systems are attributed to custom-built vibratory feeders. Janeja and Lee [36] stated that if the existing orienting elements could be adjusted rapidly, a rigid design would become a flexible one without any sacrifice in efficiency, able to handle a family of similar parts. Flexible feeders have the capability to accommodate most of the parts of one or more families, with minimum changeover time [37]. Flexible part feeders can generally be classified into the following two classes:
1. non-vision-based feeders that rely on simple sensors or reconfigurable gates to handle the parts of one or more families;
2. vision-based feeders that depend on vision cameras to handle the parts of one or more families.

4.1. Non-vision Based Flexible Part Feeders

The concept of using an LED (Light Emitting Diode) sensor to determine part orientation was presented by Akella and Mason [38]. The LED sensor was used to measure the resting diameter or width of polygonal parts: an array of LEDs was arranged on one side of the conveyor and a set of photoresistors on the opposite side, and from which LED-photoresistor pairs were blocked by the part, its resting diameter or width was identified. Based on this partial information from the sensor, a robot was programmed to execute a sequence of push-align operations to orient the part.

Sim et al. [39] stated that programmable part feeders that can handle parts of one or more part families with short changeover times are highly needed. They developed a neural-network-based pattern recognition algorithm for recognizing parts: three fibre-optic sensors mounted on a vibratory bowl feeder were used to scan the surface of each fed part, and the scanned signature was used as input to neural network models to identify the part.

Tay et al. [37] developed a flexible and programmable vibratory bowl feeding system for use in a flexible manufacturing system. The feeding system was capable of identifying the orientation of non-rotational parts and re-orienting them into the desired orientation. It was equipped with programmable passive and active orienting devices, which allowed it to handle a variety of parts. Nine specially designed stations were present along the track of the feeder for feeding non-rotational parts; these stations were controlled by both a computer sub-system and a PLC (Programmable Logic Controller) sub-system. The orientation of the part was identified using neural networks, with optical sensors used to identify internal features such as holes and pockets. Three types of neural network architectures were tried for pattern recognition and classification of the feed orientation of parts in the feeder.

Chua [40] stated that the flexibility of an assembly system is critical for survival in a competitive manufacturing market. He also discussed the need for feeding systems to handle asymmetric parts with high efficiency. He developed a part feeding system to handle cylindrical parts of different aspect ratios. His system included a singularity unit, a V-belt orientator, a transfer mechanism with an aluminium plate, and an unloading module with a delivery chute and re-orientation.

Udhayakumar et al. [41] developed an adaptive part feeder able to accommodate a family of sector-shaped parts. Capacitive sensors were employed to determine the size of the part. Based on the size of the part, the feeding system was modified accordingly to convey the part. A regression model was developed to determine the conveying velocity based on excitation frequency and amplitude of vibration.
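A regression model of this kind can be sketched as a least-squares fit of velocity against frequency and amplitude. The linear model form and the data points below are assumptions for illustration, not values from [41]:

```python
# Sketch of fitting v = b0 + b1*f + b2*A for conveying velocity from
# excitation frequency f and vibration amplitude A, in the spirit of [41].
# Model form and data are illustrative assumptions, not the paper's values.

def solve3(a, b):
    """Solve a 3x3 linear system a x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        # Partial pivoting for numerical stability.
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit(samples):
    """Least-squares fit of v = b0 + b1*f + b2*A via the normal equations."""
    rows = [(1.0, f, amp) for f, amp, _ in samples]
    v = [s[2] for s in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atv = [sum(r[i] * vi for r, vi in zip(rows, v)) for i in range(3)]
    return solve3(ata, atv)

# Synthetic, noise-free data generated from v = 2 + 0.5*f + 3*A.
data = [(f, a, 2 + 0.5 * f + 3 * a) for f in (20, 30, 40) for a in (0.5, 1.0)]
b0, b1, b2 = fit(data)
print(round(b0, 6), round(b1, 6), round(b2, 6))  # recovers 2.0 0.5 3.0
```

With noisy measured data the same normal-equations fit yields the best-fit coefficients rather than an exact recovery.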

A part feeding system based on piezoelectric vibratory conveyors was developed by Urs Leberle and Jürgen Fleischer [42]. A variety of parts, including very delicate parts, could be fed by this conveyor. The design and commissioning of the conveyor set-up were discussed.

4.2. Vision Based Flexible Part Feeders

Causey et al. [43] presented the design and development of a flexible parts feeding system. They proposed three conveyors working together. The first, inclined conveyor was used to lift parts from a bulk hopper. From the first conveyor, the parts were transferred to a horizontally mounted second conveyor. An under-lit window presented a silhouette image of the parts to the vision system. Based on this, the pose of each part was determined and a robotic arm was used to acquire it. Parts in orientations that could not be inferred, or overlapping parts, were returned to the bulk hopper by a third conveyor. Guidelines for improving a part feeding system's performance were discussed.

Gudmundsson and Goldberg [44] analyzed the use of vision cameras in robots for part feeding. They found that the throughput of a part feeding system could be limited by starvation (no part is visible to the camera) and by saturation (so many parts are visible to the camera that they prevent the robot from identifying part orientations and grasping).

Page 9: JAMRIS 2015 Vol 9 No 4

Journal of Automation, Mobile Robotics & Intelligent Systems VOLUME 9, N° 4 2015

Articles8

Chen et al. [45] developed a smart machine vision system for inspection of solder paste on printed circuit boards (PCBs). Machine vision was chosen for its advantages over traditional manual inspection in efficiency and accuracy. The proposed system included two modules, LIF (Learning Inspection Features) and OLI (On-Line Inspection). The LIF module learnt the inspection features from the CAD files of a PCB. The OLI module inspected the PCBs on-line. The accuracy of detection exceeded 97% when deployed in the manufacturing line.

Sumi and Kawai [46] proposed a new method for 3D object recognition in a cluttered environment, which used segment-based stereo vision. Based on the position and orientation of the object, a robot was signaled to pick and manipulate it. Differently shaped objects (planar figures, polyhedra, free-form objects) were used to demonstrate the concept.

Khan et al. [47] used a vision set-up to inspect for defects based on the size, shape, color and dimensions of parts arriving on a conveyor. The camera was mounted over the conveyor belt. Based on the output from the vision system, a lever attached to a stepper motor directed each part to the accepted or rejected tray. The accuracy of the system was found to be about 95%.

An overview of vision based systems was given by Han et al. [48]. He stated that conventional part feeders were effective for specific types of parts, but had limitations where part families (similar in shape but varying in size) were to be handled. He described the current design and retooling of feeders as a black art. A vision based vibratory feeder was developed in which the major feeding parameters, such as vibration angle, frequency, amplitude and phase difference, could be adjusted on-line by software. The system was capable of handling a wide range of parts without retooling. The best operating frequency was determined automatically through frequency response analysis. It was also capable of eliminating part jamming.

Mahalakshmi et al. [49] stated that template matching has created a revolution in the field of computer vision and has provided a new dimension to image processing. They discussed the significance of various template matching algorithms.

A flexible vibratory feeding system based on a vision camera was proposed by Liang Han and Huimin Li [50]. The developed system was capable of identifying parts in the preferred orientation; otherwise, the part was sent back to the bowl. Auto vision software was used to identify the parts.

5. Conclusions

Automation is growing at a rapid pace in today's world. Having understood the significance of automation for the success and growth of an industrial set-up, many companies are investing in bringing the latest technologies to their processes. Factory automation aims at minimizing manual and personnel-related work in industrial, production and manufacturing processes. Vibratory feeders are suitable for feeding parts to subsequent processes on special machines in the mechanical, electrical, pharmaceutical, bearing, optical, fastener and many other industries. This paper surveyed the literature on the design and development of vibratory part feeders. The scope of the survey ranged from identifying the most probable natural resting orientation to the development of flexible part feeders. From the literature survey, it can be seen that many more advancements could be made in vibratory part feeding technology so that feeders become extremely flexible as well as cheap. Further research could focus on flexible part feeders that can handle a variety of parts without retooling, at an optimum feeding rate. The conveying velocity of parts on the feeder should be predictable in order to maintain continuous flow. More research is required on predictive models for the conveying velocity, and part behavior on feeders should be studied extensively.

AUTHOR

Udhayakumar Sadasivam – Department of Mechanical Engineering, PSG College of Technology, Coimbatore – 641004, Tamilnadu, INDIA. Phone: +91-422-4344271. E-mail: [email protected].

REFERENCES

[1] Lee S.G., Ngoi B.K.A., Lye S.W., Lim L.E.N., "An analysis of the resting probabilities of an object with curved surfaces", Int. J. Adv. Manuf. Technol., vol. 12, no. 5, 1996, 366–369. DOI: 10.1007/BF01179812.

[2] Cordero A.S., "Analyzing the parts behavior in a vibratory bowl feeder to predict the Dynamic Probability Profile", Master of Science thesis, Mayaguez Campus, University of Puerto Rico, 2004.

[3] Boothroyd G., Poli C.R., Murch L.E., Automatic Assembly, Marcel Dekker, 1982.

[4] Ngoi K.A., Lye S.W., Chen J., "Analysing the natural resting aspect of a prism on a hard surface for automated assembly", Int. J. Adv. Manuf. Technol., vol. 11, no. 6, 1996, 406–412. DOI: 10.1007/BF01178966.

[5] Ngoi B.K.A., Lim L.E.N., Ee J.T., "Analysis of natural resting aspects of parts in a vibratory bowl feeder – validation of drop test", Int. J. Adv. Manuf. Technol., vol. 13, 1997, 300–310. DOI: 10.1007/BF01179612.

[6] Moll M., Erdmann M., "Manipulation of pose distributions", International Journal of Robotics Research, vol. 21, no. 3, 2002, 277–292. DOI: 10.1177/027836402320556449.

[7] Lee S.S.G., Ngoi B.K.A., Lim L.E.N., Lye S.W., "Determining the probabilities of the natural resting aspects of parts from their geometries", Assembly Automation, vol. 17, no. 2, 1997, 137–142. DOI: 10.1108/01445159710171356.


[8] Chua P.S.K., Tay M.L., "Modeling the natural resting aspect of small regular shaped parts", Trans. ASME, vol. 120, 1998, 540–546.

[9] Ngoi K.A., Lye S.W., Chen J., "Analyzing the natural resting aspect of a complex shaped part on a hard surface for automated parts feeding", Proc. Instn. Mech. Engrs., vol. 211, part B, 1997, 435–442.

[10] Udhayakumar S., Mohanram P.V., Keerthi Anand P., Srinivasan R., "Determining the most probable natural resting orientation of sector shaped parts", Assembly Automation, vol. 33, no. 1, 2013, 29–37. DOI: 10.1108/01445151311294649.

[11] Udhayakumar S., Mohanram P.V., Krishnakumar M., Yeswanth S., "Effect of initial orientation and height of drop on natural resting orientation of sector shaped components", Journal of Manufacturing Engineering, vol. X, no. 2, 2011, 05–07.

[12] Boothroyd G., Assembly Automation and Product Design, CRC Press, Taylor and Francis, 2005.

[13] Rao A., Kriegman D., Goldberg K., "Complete algorithm for feeding polyhedral parts using pivot grasps", IEEE Transactions on Robotics and Automation, vol. 12, no. 2, 1996, 331–342. DOI: 10.1109/70.488952.

[14] Berkowitz D.R., Canny J., "Designing parts feeders using dynamic simulation". In: Proceedings of IEEE International Conference on Robotics and Automation, 1996, 1127–1132. DOI: 10.1109/ROBOT.1996.506859.

[15] Lim G.H., "On the conveying velocity of a vibratory feeder", Computers and Structures, vol. 62, no. 1, 1997, 197–203.

[16] Reznik D., Canny J., Goldberg K., "Analysis of part motion on a longitudinally vibrating plate". In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 1, 1997, 421–427.

[17] Akella S., Mason M.T., "Posing polygonal objects in the plane by pushing", International Journal of Robotics Research, vol. 17, no. 3, 1998, 70–88.

[18] Berretty R.P., Goldberg K., Overmars M.H., van der Stappen A.F., "Computing fence designs for orienting parts", Computational Geometry, vol. 10, no. 4, 1998, 249–262. DOI: 10.1016/S0925-7721(98)00010-8.

[19] Lynch K.M., "Inexpensive conveyor based parts feeding", Assembly Automation, vol. 19, no. 3, 1999, 209–215. DOI: 10.1108/01445159910280074.

[20] Bohringer K.F., Bhatt V., Donald B.R., Goldberg K., "Algorithms for sensorless manipulation using a vibrating surface", Algorithmica, vol. 26, 2000, 389–429.

[21] Akella S., Huang W.H., Lynch K.M., Mason M.T., "Parts feeding on a conveyor with a one joint robot", Algorithmica, vol. 26, 2000, 313–344.

[22] Berretty R.P., Goldberg K.Y., Overmars M.H., van der Stappen A.F., "Trap design for vibratory bowl feeders", The International Journal of Robotics Research, vol. 20, no. 11, 2001, 891–908. DOI: 10.1177/02783640122068173.

[23] Wiendahl H.P., Rybarczyk A., "Using air streams for part feeding systems – innovative and reliable solutions for orientation and transport", Journal of Materials Processing Technology, vol. 138, 2003, 189–195.

[24] Jiang M.H., Chua P.S.K., Tan F.L., "Simulation software for parts feeding in a vibratory bowl feeder", International Journal of Production Research, vol. 41, no. 9, 2003, 2037–2055. DOI: 10.1080/0020754031000123895.

[25] Silversides R., Dai J.S., Seneviratne L., "Force analysis of a vibratory bowl feeder for automatic assembly", J. Mech. Des., vol. 127, no. 4, 2004, 637–645. DOI: 10.1115/1.1897407.

[26] Goemans O.C., Goldberg K., van der Stappen A.F., "Blades for feeding 3D parts on vibratory tracks", Assembly Automation, vol. 26, no. 3, 2006, 221–226.

[27] Goemans O.C., Goldberg K., van der Stappen A.F., "Blades: a new class of geometric primitives for feeding 3D parts on vibratory tracks". In: Proceedings of IEEE International Conference on Robotics and Automation, 2005, 1730–1736. DOI: 10.1109/ROBOT.2006.1641956.

[28] Vose T.H., Umbanhowar P., Lynch K.M., "Vibration-induced frictional force fields on a rigid plate". In: IEEE International Conference on Robotics and Automation, 2007, 660–667. DOI: 10.1109/ROBOT.2007.363062.

[29] Ramalingam M., Samuel G.L., "Investigation on the conveying velocity of a linear vibratory feeder while handling bulk-sized small parts", Int. J. Adv. Manuf. Technol., vol. 44, no. 3–4, 2009, 372–382. DOI: 10.1007/s00170-008-1838-1.

[30] Udhayakumar S., Mohanram P.V., Deepak S., Gobalakrishnan P., "Development of sensorless part feeding system for handling asymmetric parts", The International Journal for Manufacturing Science and Production, vol. 10, no. 3–4, 2009, 267–277. DOI: 10.1515/IJMSP.2009.10.3-4.265.

[31] Udhayakumar S., Mohanram P.V., Keerthi Anand P., Srinivasan R., "Trap based part feeding system for stacking sector shaped parts", Journal of the Brazilian Society of Mechanical Sciences and Engineering, vol. 36, no. 2, 2014, 421–431. DOI: 10.1007/s40430-013-0086-y.

[32] Ashrafizadeh H., Ziaei-Rad S., "A numerical 2D simulation of part motion in vibratory bowl feeders by discrete element method", Journal of Sound and Vibration, vol. 332, no. 13, 2013, 3303–3314. DOI: 10.1016/j.jsv.2013.01.020.

[33] Han L., Wu W.-Z., Bian Y.-H., "An Experimental Study on the Driving System of Vibratory Feeding", TELKOMNIKA Indonesian Journal of Electrical Engineering, vol. 11, no. 10, 2013, 5851–5859. DOI: 10.11591/telkomnika.v11i10.3415.

[34] Suresh M., Jagadeesh K.A., Sakthivel J., "Prediction of Parameters using Linear regression for trap in a vibratory part feeder", International Journal of Research in Mechanical Engineering, vol. 2, no. 1, 2014, 43–47.

[35] Boehlke D., Teschler L., "Smart design for flexible feeding", Machine Design, vol. 66, no. 23, 1994, 132–134.


[36] Joneja A., Lee N., "A modular, parametric vibratory feeder: A case study for flexible assembly tools for mass customization", IIE Transactions, vol. 30, no. 10, 1998, 923–931. DOI: 10.1080/07408179808966546.

[37] Tay M.L., Chua P.S.K., Sim S.K., Gao Y., "Development of a flexible and programmable parts feeding system", Int. J. Prod. Econ., vol. 98, no. 2, 2005, 227–237. DOI: 10.1016/j.ijpe.2004.05.019.

[38] Akella S., Mason M.T., "Using partial sensor information to orient parts", International Journal of Robotics Research, vol. 18, no. 10, 1999, 963–997. DOI: 10.1177/02783649922067663.

[39] Sim S.K., Chua P.S.K., Tay M.L., Yun G., "Incorporating pattern recognition capability in a flexible vibratory bowl feeder using a neural network", International Journal of Production Research, vol. 41, no. 6, 2003, 1217–1237.

[40] Chua P.S.K., "Novel design and development of an active feeder", Assembly Automation, vol. 27, no. 1, 2007, 31–37.

[41] Udhayakumar S., Mohanram P.V., Yeshwanth S., Ranjan B.W., Sabareeswaran A., "Development of an Adaptive Part Feeder for Handling Sector Shaped Parts", Assembly Automation, vol. 34, no. 3, 2014, 227–236.

[42] Leberle U., Fleischer J., "Automated Modular and Part-Flexible Feeding System for Micro Parts", Int. J. of Automation Technology, vol. 8, no. 2, 2014, 282–290.

[43] Causey G.C., Quinn R.D., Barendt N.A., Sargent D.M., Newman W.S., "Design of a flexible parts feeding system". In: Proceedings of IEEE International Conference on Robotics and Automation, vol. 2, 1997, 1235–1240. DOI: 10.1109/ROBOT.1997.614306.

[44] Gudmundsson D., Goldberg K., "Tuning robotic part feeder parameters to maximize throughput", Assembly Automation, vol. 19, no. 3, 1999, 216–221.

[45] Chen J.X., Zhang T.Q., Zhou Y.N., Murphey Y.L., "A smart machine vision system for PCB inspection", Engineering of Intelligent Systems, Lecture Notes in Computer Science, vol. 2070, 2001, 513–518. DOI: 10.1007/3-540-45517-5_57.

[46] Sumi Y., Kawai Y., "3D object recognition in cluttered environments by segment-based stereo vision", International Journal of Computer Vision, vol. 46, no. 1, 2002, 5–23.

[47] Khan U.S., Iqbal J., Khan M.A., "Automatic inspection system using machine vision". In: Proceedings of 34th Applied Imagery and Pattern Recognition Workshop, 2005, 211–217.

[48] Han L., Wang L.Y., Hu G.P., "A study on the vision-based flexible vibratory feeding system", Advanced Materials Research, vol. 279, 2011, 434–439. DOI: 10.4028/www.scientific.net/AMR.279.434.

[49] Mahalakshmi T., Muthaiah R., Swaminathan P., "Overview of template matching technique in image processing", Research Journal of Applied Sciences, Engineering and Technology, vol. 4, no. 29, 2012, 5469–5473.

[50] Liang H., Huimin L., "A study on Flexible Vibratory feeding system based on smart camera", International Symposium on Computers and Informatics, 2015, 1316–1321.


Preliminary Study of Hydrodynamic Load on an Underwater Robotic Manipulator

Waldemar Kolodziejczyk

Submitted: 21st June 2015; accepted: 12th August 2015

DOI: 10.14313/JAMRIS_4-2015/28

Abstract:
The objective of this study was to obtain the hydrodynamic load on an underwater three-link robotic arm subjected to different current speeds at several arm configurations under steady-state conditions. CFD simulations were performed in order to assess torque requirements when hydrodynamic effects have to be compensated by motors in order to maintain the position of the arm.

Keywords: underwater manipulator, CFD, hydrodynamic load

1. Introduction

Remotely operated manipulators are nowadays standard equipment for several underwater ROVs (Remotely Operated Vehicles), as they offer underwater robots more flexibility and a wider range of applications, e.g. picking up objects from the bed, joining parts, drilling. Industrial robots and manipulators operate in the atmosphere, which is much lighter than a rigid body. In underwater applications the density of water is comparable with the density of the manipulator, and additional hydrodynamic forces appearing in the system have to be taken into consideration, especially for fast, high-performance manipulators, for which large hydrodynamic forces and torques may develop, inducing unwanted motions [1]. The hydrodynamic effects on the manipulator are significant and affect the ability to achieve precise control [2]. The control of underwater robots and manipulators is, moreover, extremely difficult due to additional complex hydrodynamic loads including currents and wakes caused by nearby structures.

In the context of automatic control, the hydrodynamic contribution to the forces acting on a system cannot be obtained from the continuity equation and the Navier-Stokes equations of motion, because they are ill-suited for on-line calculations. Hydrodynamic forces are instead taken into account through the so-called "added mass" contribution, computed from strip theory as the quotient of the hydrodynamic force divided by the acceleration of the body [3]. The added mass approach means that there is also an added Coriolis and an added centripetal contribution.

Strip theory originates from a potential flow background for 2D inviscid flows, and was extended semi-empirically to three dimensions [4]. Under the strip theory approach, the solid body is divided into multiple narrow slices, which can be considered as airfoils. Viscous effects of the fluid cause drag and an additional (beyond inviscid) lift force on the body, taken into consideration through simplified models including coefficients dependent on the Reynolds number, without taking into account, for example, the configuration of the arm. However, there are results showing that drag and lift coefficients are not configuration independent [5].

The modeling of underwater manipulators has been studied in many works [4, 6, 7, 8, 9]. Underwater arms were modeled mostly as consisting of cylindrical links in order to simplify the calculation of added mass, drag and lift forces. An underwater manipulator changes its geometry during work, and consequently it is important to include the hydrodynamic effects of all links of the kinematic chain on the dynamics of the whole manipulator and the ROV.

The lumped approach to the hydrodynamic load on underwater manipulators, mentioned in this section, is of limited accuracy, and there are some controversies as to how the added mass effect should be included, for example, for the wakes [10]. Fluid-structure interaction (FSI) or computational fluid dynamics (CFD) methods enable more accurate results to be achieved. The fast development of computers, CFD methods and software makes it possible to compute the results in more reasonable time than a short time ago, but naturally not in real time, as needed for control applications, for which, however, the obtained CFD results can be harnessed as useful data.

The objective of this paper was to examine the 3D steady-state hydrodynamics of the flow around a three-link manipulator placed in a current of incompressible water by using CFD methods. The present study concerned a stationary three-link manipulator at different angles of the last link to the current. Seven robotic arm configurations were considered, subjected to four different current speeds. This will enable us to compute the torques exerted on each joint of the manipulator at any configuration and at any velocity within the examined range as an interpolation function between the values obtained, and consequently to make it possible to utilize the results in control applications for slow motion of the upper link or for slow currents of water.

2. Modeling of the Flow Around the Robotic Arm. Case Study

The manipulator under consideration, shown in Fig. 1, consists of three links with diameters of


8.4 cm. The lowest link is 0.43 m long, the middle one 0.45 m, and the upper link has a cylindrical part of length 0.4 m. The two lower links of the manipulator were kept unchanged. The manipulator configuration modes are characterized by different arrangements of the third, upper link, inclined at seven angles θ3 to the second (vertical) link: –135°, –90°, –45°, 0°, 45°, 90°, and 135°. A positive value of the angle θ3 is measured in the counter-clockwise direction with respect to the z3 axis. The location of the arm with reference to the free stream of water is presented in Figs. 1 and 2. In a way, the angle θ3 then becomes an indicator of the arm position relative to the velocity of the current, which is directed opposite to the x axis of the external system of coordinates (Fig. 1).

The computational domain, in the shape of a box, has been bounded only by a flat base 8 m long and 3 m wide, considered as a solid wall. The arm is attached to the base in the middle of the width of the base, at a distance of 2.5 m from the free current inlet, as shown in Fig. 2. The 1/7th power law was used to specify the turbulent velocity profile at the inlet to the domain. The other sides of the computational domain, of height 2.5 m, were in contact with the surrounding flowing water, i.e. backflow into the domain may occur, with its direction determined using the direction of the flow in the cell layer adjacent to the boundary.

The Gulf Stream, Kuroshio, Agulhas, Brazil, and East Australian Currents flow at speeds up to 2.5 m/s. The strongest tidal current in the world, the Saltstraumen, flows at speeds reaching 41 km/h (11.4 m/s). It was decided to limit the range of velocities in the present considerations to 1.5 m/s. Calculations were performed for four free current speeds: 0.1 m/s, 0.5 m/s, 1.0 m/s and 1.5 m/s. The Reynolds numbers computed with respect to the link diameters and

Fig. 1. Coordinate frame arrangement of the robotic arm (external and local reference frames)


Fig. 2. The location of the manipulator in the computational domain for the intermediate configuration mode described by θ3 = –22.5° and vortex structures shedding from the arm at V = 0.75 m/s: a) for the computational domain of size 8 m × 3 m × 2.5 m; b) for the reference domain of size 11 m × 5 m × 3 m


current speeds were equal to 8 400, 42 000, 84 000 and 126 000, respectively.
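These Reynolds numbers follow directly from Re = V·d/ν with the 8.4 cm link diameter; the kinematic viscosity of water, ν ≈ 1.0·10⁻⁶ m²/s (an assumed standard value, approx. 20 °C), reproduces the quoted figures:

```python
# Reynolds numbers for the four free current speeds, Re = V * d / nu.
# nu = 1.0e-6 m^2/s is an assumed standard value for water (approx. 20 C);
# d is the 8.4 cm link diameter given in the text.

D = 0.084    # link diameter [m]
NU = 1.0e-6  # kinematic viscosity of water [m^2/s]

speeds = [0.1, 0.5, 1.0, 1.5]                  # free current speeds [m/s]
reynolds = [round(v * D / NU) for v in speeds]
print(reynolds)  # -> [8400, 42000, 84000, 126000]
```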

The steady-state, incompressible viscous flow around a manipulator is described by the continuity equation and the Navier-Stokes equations of motion. Direct numerical simulation of the N–S equations, in which all the scales of the turbulent motion are resolved, exceeds the capacity of currently existing computers, and the governing equations therefore have to be transformed into the Reynolds Averaged Navier-Stokes (RANS) equations:

$$\frac{\partial u_i}{\partial x_i} = 0, \qquad (1)$$

$$\rho u_j \frac{\partial u_i}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \frac{\partial}{\partial x_j}\left(\mu \frac{\partial u_i}{\partial x_j} - \rho \overline{u_i' u_j'}\right), \qquad (2)$$

where $x_i$, $x_j$ are the Cartesian coordinates, $u_i$, $u_j$ are the mean velocity components in the X, Y and Z directions, $u_i'$, $u_j'$ are the fluctuating velocity components, ρ is the density of the fluid, p is the pressure, and μ is the viscosity.

The terms $-\rho \overline{u_i' u_j'}$, called the Reynolds stresses, must be modeled in order to close the problem. Usually they are modeled using the Boussinesq hypothesis:

$$-\rho \overline{u_i' u_j'} = \mu_t \left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right) - \frac{2}{3}\rho k\,\delta_{ij}, \qquad (3)$$

where μt is the turbulent viscosity, k is the turbulence kinetic energy, and δij is the Kronecker delta.

The ways in which the turbulent viscosity μt and the turbulence kinetic energy k are computed are called turbulence models. In the present study the standard k–ε turbulence model was applied, for its robustness, economy, and reasonable accuracy for fully turbulent flows. The standard k–ε model consists of two transport equations, for the turbulence kinetic energy (k) and its dissipation rate (ε):

$$\frac{\partial}{\partial x_i}(\rho k u_i) = \frac{\partial}{\partial x_j}\left[\left(\mu + \frac{\mu_t}{\sigma_k}\right)\frac{\partial k}{\partial x_j}\right] + G_k - \rho\varepsilon, \qquad (4)$$

and

$$\frac{\partial}{\partial x_i}(\rho \varepsilon u_i) = \frac{\partial}{\partial x_j}\left[\left(\mu + \frac{\mu_t}{\sigma_\varepsilon}\right)\frac{\partial \varepsilon}{\partial x_j}\right] + C_{1\varepsilon}\frac{\varepsilon}{k}G_k - C_{2\varepsilon}\rho\frac{\varepsilon^2}{k}, \qquad (5)$$

where C1ε = 1.44, C2ε = 1.92, σk = 1.0 and σε = 1.3 are the model constants. The term Gk represents the generation of turbulence kinetic energy due to the mean velocity gradients, evaluated as:

$$G_k = \mu_t S^2, \qquad (6)$$

where $S = \sqrt{2 S_{ij} S_{ij}}$ is the modulus of the mean rate-of-strain tensor.

In this model the turbulent viscosity is computed as follows:

$$\mu_t = \rho C_\mu \frac{k^2}{\varepsilon}, \qquad (7)$$

where Cμ = 0.09 is a constant.

ANSYS CFD (ANSYS Inc., Canonsburg, Pennsylvania, USA) software was used to perform the simulations. For the computational domain with the different manipulator configurations, a set of eight meshes of approx. 9 500 000 ÷ 11 500 000 elements was generated using the cut-cell method. Figure 3 shows an example of the computational grid near the manipulator for the configuration mode described by θ3 = 135°.

Simulations were carried out in Parallel Fluent 16.0 (which implements the control volume method) with twelve parallel processes, utilizing the SIMPLE algorithm (Semi-Implicit Method for Pressure Linked Equations), second order spatial pressure discretization, and second order upwind discretization schemes for the momentum equations and for the turbulence model.

This research has focused on the calculation of the torques exerted by the current of water about the three z axes of the local reference frames assigned to the arm links, as shown in Fig. 1. Going from top to bottom, the torque τ3 was calculated taking into account the pressure and shear stress distributions along the surface of the upper link about the z3 axis. The torque τ2 includes the hydrodynamic effects (due to pressure and shear stresses) on the two upper links with respect to the z2 axis, and the torque τ1 describes the action of the water on the whole manipulator about the z1 axis. They can be considered as the joint torques experienced by the manipulator placed in the current of water, which have to be compensated by the motors in order to maintain the positions of the links.

The moment (torque) of the pressure and viscous forces about a specified axis is determined as the dot product of a unit vector in the direction of the axis with the net moment, which is computed by summing, over each boundary cell face of the discretized surface of the arm, the cross product of the position vector of the force origin (taken with respect to the moment center) with the pressure and viscous force vectors.
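The moment summation just described can be sketched numerically. The two sample faces below are invented values, not CFD results; in the actual workflow the per-face forces would come from the solver:

```python
# Sketch of computing the torque about a given axis: sum the cross
# products r x F over all boundary cell faces, then project the net
# moment onto the unit vector of the axis. Face data are invented.

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torque_about_axis(faces, axis_unit, moment_center):
    """faces: list of (position, force) tuples, one per boundary cell face."""
    net = [0.0, 0.0, 0.0]
    for pos, force in faces:
        r = [p - c for p, c in zip(pos, moment_center)]  # lever arm
        m = cross(r, force)                              # face moment
        net = [n + mi for n, mi in zip(net, m)]
    return sum(a * n for a, n in zip(axis_unit, net))    # projection on axis

# Two sample faces; each force is the sum of pressure and viscous parts.
faces = [((0.1, 0.0, 0.5), (0.0, 2.0, 0.0)),
         ((-0.1, 0.0, 0.5), (0.0, -1.0, 0.0))]
z_axis = (0.0, 0.0, 1.0)
print(round(torque_about_axis(faces, z_axis, (0.0, 0.0, 0.0)), 9))  # -> 0.3
```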

Fig. 3. Examples of the computational grid close to the manipulator for the configuration mode described by θ3 = 135°


In order to investigate the effect of the size of the domain and of the computational mesh resolution on the results of the simulations, domain and grid independence studies were conducted.

Domain dependence was checked quantitatively in a series of simulations carried out for the particular arm configuration described by θ3 = –22.5°, for current speed V = 0.75 m/s, and for different sizes of the domain: the length l, the width w and the height h shown in Tab. 1. The domain dependence factor was defined as:

$$\delta_j^{(i)} = \frac{\left|\tau_j^{(i)} - \tau_j^{(r)}\right|}{\left|\tau_j^{(r)}\right|} \cdot 100\%, \qquad (8)$$

where τ1, τ2, τ3 are the torques obtained for the different sizes of the domain, j is an indicator of the torque (1, 2 or 3), i stands for the serial number of the domain (see Tab. 1), and τj(r) is the "j" torque computed for the reference "r = 5" domain of maximum size 11 m × 5 m × 3.5 m. The domain selected for computation is indicated by No. 1 in Tab. 1 (size: 8 m × 3 m × 2.5 m). The relative difference of the torques for the actual domain, computed with reference to those obtained for the domain of maximum size, was equal to 5.75% for τ1, and was equal to or less than 2.5% for τ2 and τ3. The most important geometrical feature of the domain was its length. It was selected as a compromise between the need to capture all the structures of the flow and the capacity of the available computers. The vortex structures (Fig. 2) forming the wakes shedding from the manipulator for the actual computational domain and for the reference

one are very similar in shape, and the length of the wake is almost the same (vorticity contours were drawn at the same locations in both cases), so it can be stated that the actual computational domain was long enough to capture all the features of the flow.
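Definition (8) can be checked against the torques reported in Tab. 1; the short script below reproduces the quoted dependence factors for domain No. 1:

```python
# Reproducing the domain dependence factors of Tab. 1 from definition (8):
# delta_j = |tau_j(i) - tau_j(r)| / |tau_j(r)| * 100%.

def dependence_factor(tau_i, tau_ref):
    return abs(tau_i - tau_ref) / abs(tau_ref) * 100.0

# Torques [N m] for domain No. 1 (8x3x2.5 m) and reference domain No. 5,
# taken from Tab. 1.
tau_domain1 = (-1.048, 10.259, 4.375)
tau_ref = (-0.991, 10.522, 4.478)

deltas = [round(dependence_factor(t, r), 2) for t, r in zip(tau_domain1, tau_ref)]
print(deltas)  # -> [5.75, 2.5, 2.3], matching the table
```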

The grid independence study was performed for the position of the arm indicated by θ3 = 45° and for current speed V = 1 m/s, comparing the resulting torques τ1, τ2, τ3 obtained for meshes of different resolutions, as shown in Tab. 2. The grid independence factor was defined in the same way as the domain dependence factor (8), except that i stands for the serial number of the mesh (Tab. 2) and τj(r) is the "j" torque computed for the reference "r = 4" grid with the maximum number of cells. As can be seen in Tab. 2, the grid independence factor decreases steadily with increasing number of cells, and for the two finest meshes, of 6 142 455 and 9 751 800 cells, the relative differences of the torques were less than 1.6%. In order to better capture the flow structures, the finest mesh (No. 4) was selected and, consequently, the number of cells for all computational cases was kept in the range of 9 500 000 ÷ 11 500 000 cells.


3. Results and Discussion

The results of the calculations are summarized in Tab. 3 for the four velocities of the current and the seven configuration modes of the robotic arm. The obvious conclusion is that the largest torques appear for the greatest current speed (1.5 m/s), but the effect of the configuration mode of the manipulator is not so evident. All the configurations of the manipulator induce negative moments about the lower link (z1 axis). The highest negative τ1 is observed for θ3 = –45° and –135°, that is, when the upper arm is inclined upstream at an angle of 45° towards the top or the bottom of the free stream.

All the torques τ2 computed from the pressure and shear stress distributions along the two upper links are positive. The highest τ2 occur in the range of θ3 between –45° and +45°. The lowest torque τ2 appears for θ3 = –135°, that is, when the upper link is inclined upstream towards the bottom. The torque τ3 changes its direction depending on the position of the upper link and the current speed. It remains positive in almost all cases for θ3 between –45° and +90°, and negative in almost all cases for θ3 = –90°, –135° and 135°.

The obtained magnitudes of joint torques can be used as interpolation points in procedures generating the interpolation functions for computing t1, t2 and t3 at intermediate values of current speed and at intermediate positions of the upper link.

Table 1. Domain dependence study

No.  Domain size l×w×h [m×m×m]   Number of cells N   t1 [N m]   t2 [N m]   t3 [N m]   δ1 [%]   δ2 [%]   δ3 [%]
1    8×3×2.5                     11 450 290          –1.048     10.259     4.375      5.75     2.50     2.30
2    6×3×2.5                     10 898 000          –0.931     10.881     4.666      6.05     3.41     4.19
3    11×3×2.5                    12 227 519          –0.945     10.429     4.525      4.64     0.88     1.05
4    8×4×3                       12 632 671          –0.941     10.490     4.562      5.05     0.30     1.88
5    11×5×3.5                    15 297 509          –0.991     10.522     4.478      (reference)

Table 2. Grid independence study

i    Number of cells N   t1 [N m]   t2 [N m]   t3 [N m]   δ1 [%]   δ2 [%]   δ3 [%]
1    2 992 942           –1.455     11.949     3.836      3.00     1.79     2.70
2    5 065 890           –1.494     11.556     3.696      0.40     1.55     1.04
3    6 142 455           –1.506     11.561     3.714      0.40     1.51     0.56
4    9 751 800           –1.500     11.738     3.735      (reference)

Page 16: JAMRIS 2015 Vol 9 No 4

Journal of Automation, Mobile Robotics & Intelligent Systems VOLUME 9, N° 4 2015

Articles 15

The distributions of joint torques in the space spanned by the angle q3 and the current velocity V are shown in Figs. 6÷8. In Fig. 8 the areas of positive and negative moments are separated by thicker zero-torque isolines, in order to show the relationship between them more clearly and to indicate when the motor has to change its direction of rotation.

The results of the present calculations allow assessing how much the hydrodynamic forces impact the torques that must be supplied by the motors. In control applications, the joint moments to be compensated due to hydrodynamic loads can be obtained using simple interpolation procedures utilizing, for example, bicubic 2D splines [11].
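As an illustration of such a procedure, the sketch below fits a bicubic spline to the t2 values of Tab. 3 and evaluates it at an intermediate point. SciPy's `RectBivariateSpline` is an assumed stand-in for the spline routines of [11], not the authors' implementation:

```python
# Sketch: bicubic-spline interpolation of joint torque t2 over (q3, V),
# using the values of Tab. 3. SciPy is an assumed tool (not the paper's code).
import numpy as np
from scipy.interpolate import RectBivariateSpline

q3 = np.array([-135.0, -90.0, -45.0, 0.0, 45.0, 90.0, 135.0])  # [deg]
V = np.array([0.1, 0.5, 1.0, 1.5])                             # [m/s]
# t2 [Nm] from Tab. 3, rows ordered by increasing q3
t2 = np.array([
    [0.004, 0.136,  1.199,  3.919],   # q3 = -135
    [0.027, 0.804,  3.846,  7.985],   # q3 = -90
    [0.128, 4.099, 16.708, 39.146],   # q3 = -45
    [0.134, 3.631, 16.629, 39.232],   # q3 = 0
    [0.084, 2.604, 11.738, 26.213],   # q3 = 45
    [0.028, 1.023,  3.938,  9.371],   # q3 = 90
    [0.036, 0.982,  4.046,  8.649],   # q3 = 135
])
spline = RectBivariateSpline(q3, V, t2, kx=3, ky=3)  # cubic in both directions
val = float(spline(-22.5, 0.75)[0, 0])
print(round(val, 3))
```

With this spline variant the value at (q3 = –22.5°, V = 0.75 m/s) falls near the interpolated and CFD values reported in Tab. 4 for t2.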

The hydrodynamic torques are caused by the pressure and shear stress distributions along the surfaces of the manipulator links. The effect of pressure is much greater than that of the shear stresses. Figures 4 and 5 present the pressure and wall shear contours on the robotic arm, obtained for current speed V = 1 m/s and for all considered configuration modes. In each figure the contours are viewed from the direction most convenient for that case. The external coordinate system placed near the arm indicates the position of the manipulator in relation to the current.

Generally, the pressure is highest on the surfaces facing the current, and lowest on the sides transversal to the current and on the sharp edges of the arm, that is, in the regions of maximum velocity gradients and separation. These are also the areas of maximum shear stresses, as is clearly seen in

Figs. 4 and 5. The biggest pressure difference for current speed V = 1 m/s was found to be approximately 1200 Pa. Maximum values of positive gauge pressure were about 350÷450 Pa (depending on the configuration mode) on the upstream sides of the arm, and the greatest absolute value of the negative gauge pressure reached about 1000 Pa on the sharp edges of the third link, where the maximum velocities and shear stresses appeared (approx. 15 N/m²).

Wake formation in the flow around the manipulator strongly affects the hydrodynamic forces and torques. The strip theory used to compute the added mass, drag and lift forces oversimplifies the flow patterns and the interaction effects caused by the changing geometry of the arm during its work. In the present simulations, different wake patterns were observed for different configurations. One of them, for the intermediate configuration q3 = –22.5° and current speed V = 0.75 m/s, is presented in Fig. 2 as contours of vorticity shedding from the links.

4. Conclusions

CFD analysis has been performed to investigate the

flow around the three-link manipulator placed in a current of water. ANSYS Fluent software was used to predict the flow structure near the manipulator arm and to compute the hydrodynamic torques for several configurations of the underwater manipulator and for several velocities of the current flowing around it.

The hydrodynamic torques computed in this study may be applied as external loads to the dynamic model of the manipulator in order to obtain a more accurate and

Mode: a) q3 = 135° b) q3 = 90° c) q3 = 45° d) q3 = 0° e) q3 = –45° f) q3 = –90° g) q3 = –135°

Fig. 4. Gauge pressure distribution on the surface of the arm at speed of the current V = 1 m/s

Mode: a) q3 = 135° b) q3 = 90° c) q3 = 45° d) q3 = 0° e) q3 = –45° f) q3 = –90° g) q3 = –135°

Fig. 5. Shear stress distribution on the surface of the arm at speed of the current V = 1 m/s

Page 17: JAMRIS 2015 Vol 9 No 4


more realistic simulation of the manipulator motion. The results can be applied in robotic models to define control strategies that take into account the hydrodynamic forces computed for different arm configurations and current velocities, with the application of interpolation functions. Table 4 presents the joint torques computed with the CFD approach and with bicubic 2D splines [11] for an intermediate position of the last link (q3 = –22.5°), when the upper arm is slightly inclined upstream as seen in Fig. 2, and for the intermediate current speed V = 0.75 m/s. Relative differences between the CFD calculations and the values interpolated from the data in Tab. 3 are less than 10%.

There is also the possibility of obtaining the lift and drag forces, and consequently the added mass, for each link of the manipulator more accurately than from the strip theory, and of utilizing them in modeling the dynamics of the manipulator.

Underwater manipulators are usually sturdier than the one presented here, symmetrical in shape in most cases, and usually with non-cylindrical links. The manipulator under investigation is based on the UR5, with cylindrical links and a non-symmetrical shape. These features offer several benefits for our investigations. Firstly, the non-symmetrical shape of the arm allows us to investigate the effect of hydrodynamic loads in a more general way. Secondly, the cylindrical links will enable us (in future work) to compare the hydrodynamic loads computed with the numerical approach against results obtained via standard added-mass calculations, which are best suited to cylindrically shaped links.

This paper presents just the first step in the understanding of hydrodynamic loads on the

Table 3. Joint torques due to hydrodynamic effects

θ3      | τ1 [Nm], V = 0.1 / 0.5 / 1.0 / 1.5 m/s | τ2 [Nm], V = 0.1 / 0.5 / 1.0 / 1.5 m/s | τ3 [Nm], V = 0.1 / 0.5 / 1.0 / 1.5 m/s
135°    | –0.010 / –0.226 / –0.717 / –0.847      | 0.036 / 0.982 / 4.046 / 8.649          | –0.009 / –0.284 / –1.912 / –4.154
90°     | –0.008 / –0.161 / –0.955 / –1.682      | 0.028 / 1.023 / 3.938 / 9.371          | –0.005 / 0.093 / 0.262 / 1.249
45°     | –0.006 / –0.301 / –1.500 / –3.511      | 0.084 / 2.604 / 11.738 / 26.213        | 0.021 / 0.798 / 3.735 / 8.039
0°      | –0.009 / –0.252 / –1.186 / –2.804      | 0.134 / 3.631 / 16.629 / 39.232        | 0.050 / 1.373 / 6.034 / 13.931
–45°    | –0.014 / –0.474 / –2.414 / –4.937      | 0.128 / 4.099 / 16.708 / 39.146        | 0.067 / 1.875 / 7.372 / 17.622
–90°    | –0.005 / –0.143 / –0.613 / –1.264      | 0.027 / 0.804 / 3.846 / 7.985          | –0.002 / –0.042 / 0.030 / –0.362
–135°   | –0.009 / –0.281 / –1.463 / –4.226      | 0.004 / 0.136 / 1.199 / 3.919          | –0.052 / –1.504 / –5.873 / –9.732

Fig. 6. Interpolation surface for joint torque t1

Fig. 7. Interpolation surface for joint torque t2

Fig. 8. Interpolation surface for joint torque t3

Table 4. Joint torques in the intermediate position of the arm and at intermediate current speed V = 0.75 m/s

                                     t1 [Nm]   t2 [Nm]   t3 [Nm]
CFD calculations                     –1.048    10.258    4.375
Interpolation (bicubic 2D splines)   –1.029     9.667    4.012
Relative difference                   0.018     0.058    0.083

Page 18: JAMRIS 2015 Vol 9 No 4


underwater robotic arm via numerical simulations, because it concerns only the steady-state flow around different configurations of the last link of the arm. In the future, we will focus on determining how the motion of the arm may affect the magnitude and direction of the joint torques, which in turn may give us information about the range of current speeds and velocities of the last link for which the flow may be considered steady-state.

ACKNOWLEDGEMENTS

This work was supported by the Bialystok University of Technology under grant No. S/WM/1/2012.

AUTHOR

Waldemar Kołodziejczyk – Bialystok University of Technology, Faculty of Mechanical Engineering, Department of Automatic Control and Robotics, ul. Wiejska 45 c, 15-351 Bialystok, Poland. E-mail: [email protected].

REFERENCES

[1] Antonelli G., Underwater Robots, Springer Tracts in Advanced Robotics, Second edition, Springer, 2006.

[2] Farivarnejad H., Moosavian S.A., "Multiple Impedance Control for object manipulation by a dual arm underwater vehicle-manipulator system", Ocean Engineering, vol. 89, 2014, 82–98. DOI: 10.1016/j.oceaneng.2014.06.032.

[3] Fossen T.I., Guidance and Control of Ocean Vehicles, John Wiley & Sons, Chichester, United Kingdom, 1994.

[4] McLain T.W., Rock S.M., "Development and Experimental Validation of an Underwater Manipulator Hydrodynamic Model", The International Journal of Robotics Research, vol. 17, 1998, 748–759.

[5] Leabourne K.N., Rock S.M., “Model Development of an Underwater Manipulator for Coordinated Arm-Vehicle Control”. In: Proceedings of the OCEANS ’98 Conference, Nice, France, no. 2, 1998, 941–946.

[6] Richard M.J., Levesque B., “Stochastic dynamic modelling of an open-chain manipulator in a fluid environment”, Mech. Mach. Theory, vol. 31, no. 5, 1996, 561–572.

[7] Rivera C., Hinchey M., “Hydrodynamics loads on subsea robots”, Ocean Engineering, vol. 26, no. 8, 1999, 805–812. DOI: 10.1016/S0029-8018(98)00031-6.

[8] Vossoughi G.R., Meghdari A., Borhan H., "Dynamic modeling and robust control of an underwater ROV equipped with a robotic manipulator arm". In: Proceedings of 2004 JUSFA, Japan-USA Symposium on Flexible Automation, Denver, Colorado, July 19–21, 2004.

[9] Pazmino R.S., Garcia C.E., Alvarez Arocha C., Santoja R.A., "Experiences and results from designing and developing a 6DOF underwater parallel robot", Robotics and Autonomous Systems, vol. 59, 2011, 101–112.

[10] Williamson C.H.K., Govardhan R., "A brief review of recent results in vortex-induced vibrations", Journal of Wind Engineering and Industrial Aerodynamics, vol. 96, no. 6–7, 2008, 713–735. DOI: 10.1016/j.jweia.2007.06.019.

[11] Press W.H., Teukolsky S.A., Vetterling W.T., Flannery B.P., Numerical Recipes in C: The Art of Scientific Computing, Second edition, Cambridge University Press, 1992.

Page 19: JAMRIS 2015 Vol 9 No 4


A biometric system is a system with an automated measuring component that is robust and can distinguish physical characteristics usable to identify a person. Robust means that the features should not change significantly with the passing of years; for example, iris recognition is more robust than other biometric systems because the iris does not change much over time. Due to matters of security, budgets for implementing biometric systems have increased [25]. A face biometric system can use both visual images and infra-red images, each with their own properties [19]. Face biometric systems can be divided into three categories based on the implementation used:

1. Appearance-based methods: these methods use statistical approaches to extract the most important information from the image.

2. Model-based methods: a model is fitted to the test images and, by computing some parameters, the person can be recognized. The elastic bunch graph [34], the Active Appearance Model (AAM) [6] and the 3D morphable model are examples of model-based methods [1, 18].

3. Template-based methods: these methods first locate each part of the face (for example, eyes, nose, etc.) and then recognize the face by computing the correlation between parts of the training images and the test images [4].

All face biometric systems should also include a face detection part to find the location of the face in the image. Viola used the AdaBoost algorithm to find faces in an image [33]; Rowley used neural networks [24]. In both the Viola and Rowley methods, a window is moved over the image in order to find a face. Newer methods use color images: Hsu [17] first used color images and skin detection to find faces in the image, and in [14] faces were detected using correlation and skin segmentation [15].

2. Appearance-Based Methods

Appearance-based methods start with the concept of image space. A two-dimensional image can be viewed as a point, or vector, in a high-dimensional space called the image space. In this image space, each dimension corresponds to a pixel of the image. In general, an image with m rows and n columns defines a point in an N-dimensional space, where N = m × n. For example, an image with 20 rows and 20 columns describes a point in a 400-dimensional space. One important characteristic of image space is that exchanging the pixels of one image with each other does not

Face Recognition Using Canonical Correlation, Discrimination Power, and Fractional Multiple Exemplar Discriminant Analyses

Mohammadreza Hajiarbabi, Arvin Agah

Submitted: 14th August 2015; accepted: 17th September 2015

DOI: 10.14313/JAMRIS_4-2015/29

Abstract: Face recognition is a biometric identification method which, compared to other methods such as fingerprint, speech, signature, handwriting and iris recognition, has been shown to be more noteworthy both theoretically and practically. Biometric identification methods have various applications, such as in film processing and access control in networks, among many others. The automatic recognition of a human face has become an important problem in pattern recognition due to (1) the structural similarity of human faces, and (2) the great impact of factors such as illumination conditions, facial expression and face orientation. These make face recognition one of the most challenging problems in pattern recognition. Appearance-based methods are among the most common methods in face recognition, and can be categorized into linear and nonlinear methods. In this paper, face recognition using Canonical Correlation Analysis is introduced, along with a review of the linear and nonlinear appearance-based methods. Canonical Correlation Analysis finds the linear combinations between two sets of variables which have maximum correlation with one another. Discrimination Power Analysis and Fractional Multiple Exemplar Discriminant Analysis have been used to extract features from the image. The results provided in this paper show the advantage of this method compared to other methods in this field.

Keywords: face recognition, Canonical Correlation Analysis, Discrimination Power Analysis, Multiple Exemplar Discriminant Analysis, Radial Basis Function neural networks

1. Introduction

Recognizing the identity of humans is of great importance. Humans recognize each other based on physical characteristics such as face, voice and gait. In past centuries, the first systematic methods for identification were invented and used in police stations for recognizing criminals; these methods measured different parts of the body. After the discovery that fingerprints are unique to each person, fingerprinting became the preferred method for identifying humans. In recent decades, with the advent of high-speed computers, a good opportunity has been provided for researchers to work on different methods and to find reliable methods for recognizing humans based on unique patterns.

Page 20: JAMRIS 2015 Vol 9 No 4


change the image space. The image space can also show the connection between a set of images [31]. The image space is a space of high dimension. Appearance-based methods extract the most important information from the image and lower the dimension of the image space. The subspace produced in this way is called the feature space or face space [31].

The origin of appearance-based methods dates back to 1991, when Turk and Pentland introduced the Eigenface algorithm, based on a well-known mathematical method, Principal Component Analysis [32]. This was the start of appearance-based methods. In 2000, Schölkopf expanded the concept of appearance-based methods into the non-linear domain by introducing kernel principal component analysis (Kernel Eigenface). Appearance-based methods are robust to noise, defocusing, and similar issues [10]. Appearance-based methods are classified into two categories, linear and non-linear; in the following sections these methods are described.

2.1. Linear Discriminant Analysis

In the face space, which is of dimension m × n with m, n the image dimensions, X = (X_1, X_2, ..., X_n) ⊂ ℝ^{m×n} is a matrix containing the images of the training set, where X_i is an image converted to a column vector. LDA maximizes the ratio of the between-class scatter matrix to the within-class scatter matrix [8]. The between-class scatter matrix is calculated as:

S_B = \sum_{i=1}^{c} n_i (\bar{X}^i - \bar{X})(\bar{X}^i - \bar{X})^T

where \bar{X} = \frac{1}{n} \sum_{j=1}^{n} X_j is the mean of the images in the training set, \bar{X}^i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_j^i is the mean of class i, and c is the number of classes (the total images belonging to one person form a class). The within-class scatter matrix is calculated as:

S_W = \sum_{i=1}^{c} \sum_{X_j \in n_i} (X_j - \bar{X}^i)(X_j - \bar{X}^i)^T

The optimal subspace is calculated by:

E_{optimal} = \arg\max_E \frac{|E^T S_B E|}{|E^T S_W E|} = [c_1, c_2, ..., c_{c-1}]

where [c_1, c_2, ..., c_{c-1}] is the set of generalized eigenvectors of S_B and S_W corresponding to the c − 1 greatest generalized eigenvalues λ_i, i = 1, 2, ..., c − 1:

S_B c_i = \lambda_i S_W c_i, \quad i = 1, 2, ..., c - 1

Thus, the most discriminant response for face images X is [8]:

P = E_{optimal}^T \cdot X

In order to avoid the singularity problem, one first has to reduce the dimension of the problem and then apply LDA. Principal component analysis (PCA) is the most common method used for dimension reduction. In this paper we applied principal component analysis to the images prior to the other methods discussed. In addition to PCA, there are other effective methods that can be used for dimension reduction prior to LDA, such as the Discrete Cosine Transform (DCT) [12].

Some researchers have observed that applying PCA to reduce the dimension of the space can cause another problem: the elimination of some useful information from the null space. The 2FLD algorithm was introduced to address this problem, as well as the computational cost that applying PCA produces. But 2FLD introduces other problems: the output of the 2FLD method is a matrix, whose dimension for an m × n image can be n × n. This high dimension causes issues when a neural network is used for classification, since a two- or higher-dimensional matrix cannot be fed to a neural network. If the matrix is converted into a vector, a vector of size n² is produced, and because of the small number of samples per face the network cannot be trained well. A direct LDA method that does not need the PCA step before applying LDA has been proposed, but this method is time-inefficient [36]. A fuzzy version of LDA has also been proposed [20]. Shu et al. designed a linear discriminant analysis method that also preserves local geometric structures [29]. In [9] discriminant information was added into sparse neighborhoods.

2.2. Fractional Multiple Exemplar Discriminant Analysis

The problem of face recognition differs from other pattern recognition problems and therefore requires discriminant methods different from LDA. In LDA the classification of each class is based on just one sample: the mean of the class. Because of the shortage of samples in face recognition applications, it is better to use all the samples of each class for classification instead of the class mean. Rather than minimizing the within-class distance while maximizing the between-class distance, multiple exemplar discriminant analysis (MEDA) finds the projection directions along which the within-class exemplar distance (i.e., the distances between exemplars belonging to the same class) is minimized, while the between-class exemplar distance (i.e., the distances between exemplars belonging to different classes) is maximized [37].

In MEDA the within-class scatter matrix is calculated by:

S_W = \sum_{i=1}^{C} \frac{1}{n_i^2} \sum_{k=1}^{n_i} \sum_{l=1}^{n_i} (X_k^i - X_l^i)(X_k^i - X_l^i)^T

where X_j^i is the jth image of the ith class. Compared with the within-class scatter matrix of LDA, it can be seen that in this method all the images in a class participate in forming the within-class scatter matrix, instead of using just the mean of the class, as in the LDA method. The between-class scatter matrix is computed by:

Page 21: JAMRIS 2015 Vol 9 No 4


S_B = \sum_{i=1}^{C} \sum_{j=1, j \neq i}^{C} \frac{1}{n_i n_j} \sum_{k=1}^{n_i} \sum_{l=1}^{n_j} (X_k^i - X_l^j)(X_k^i - X_l^j)^T

Unlike LDA, in which the mean of each class and the mean of all samples form the between-class scatter matrix, in MEDA all the samples of one class are compared with all the samples of the other classes. The computation of E_optimal is the same as in LDA.
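The all-pairs between-class exemplar scatter can be sketched directly from its definition (a NumPy illustration on toy data; names are ours):

```python
# Sketch of the MEDA between-class exemplar scatter matrix above
# (NumPy-only; toy data are illustrative, not from the paper).
import numpy as np

def meda_between_scatter(classes):
    """classes: list of (d, n_i) arrays whose columns are class exemplars."""
    d = classes[0].shape[0]
    S_B = np.zeros((d, d))
    for i, Xi in enumerate(classes):
        for j, Xj in enumerate(classes):
            if i == j:
                continue
            ni, nj = Xi.shape[1], Xj.shape[1]
            for k in range(ni):          # every exemplar of class i ...
                for l in range(nj):      # ... against every exemplar of class j
                    diff = (Xi[:, k] - Xj[:, l])[:, None]
                    S_B += diff @ diff.T / (ni * nj)
    return S_B

rng = np.random.default_rng(0)
S = meda_between_scatter([rng.normal(0, 1, (4, 3)), rng.normal(5, 1, (4, 3))])
print(S.shape)  # (4, 4)
```

As a sum of outer products of difference vectors, the result is symmetric and positive semi-definite.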

There is a drawback common to both LDA and MEDA: in the between-class scatter matrix S_B there is no difference whether the samples are close to or far from each other. However, it is clear that for classes that are closer to each other the probability of collision is higher than for the other classes.

When the idea was first proposed [21], it was used for LDA and was not applied to face recognition databases. Later [13], the algorithm was combined with MEDA and applied to face recognition. This algorithm reduces the dimension of the problem step by step, and in each iteration the samples that are closer are pushed farther apart from each other. For this purpose a weight function has been introduced:

w(d_{X_1 X_2}) = d_{X_1 X_2}^{-p}, \quad p = 3, 4, ...

where d_{X_1 X_2} denotes the distance between the centers of two classes [21]; for MEDA it should be taken as the distance between each pair of samples [13]. The between-class scatter matrix in fractional MEDA is defined as:

S_B = \sum_{i=1}^{C} \sum_{j=1, j \neq i}^{C} \frac{1}{n_i n_j} \sum_{k=1}^{n_i} \sum_{l=1}^{n_j} w(d_{X_k^i X_l^j}) (X_k^i - X_l^j)(X_k^i - X_l^j)^T

The within class scatter matrix is the same as MEDA.

The fractional algorithm is shown in Table 1 [21]. In the pseudocode, r is the number of fractional steps used to reduce the dimensionality by 1 [21].

Table 1. Fractional algorithm [21]

Set W = I (the n × n identity matrix)
for k = n down to (m + 1), step −1
    for j = 0 to (r − 1), step 1
        Project the data using W as y = W^T x
        Apply the scaling transformation to obtain z = φ(y, α)
        For the z patterns, compute the k × k between-class scatter matrix S_b
        Compute the ordered eigenvalues λ_1, λ_2, ..., λ_k and corresponding eigenvectors φ_1, φ_2, ..., φ_k of S_b
        Set W = W F, where F = [φ_1, φ_2, ..., φ_k]
    end for
    Discard the last (kth) column of W
end for

The scaling transformation compresses the last component of y by a factor α, with α < 1, i.e. Ψ: ℝ^k → ℝ^k; y → z = φ(y, α), such that:

z_i = y_i, \quad i = 1, 2, ..., k - 1
z_k = \alpha y_k

Some explanations about this algorithm are [21]:
• In the rth step, the reduction factor is α^{r−1}. This stipulates that a dimension is removed through the scales 1, α, α², ..., α^{r−1}.
• When the number of steps is smaller, α should be chosen larger, and vice versa.
• The weighting functions should be chosen as d^{−3}, d^{−4}, and so on.

The FMEDA algorithm is shown in Table 2 [13].

Table 2. FMEDA algorithm [13]

1. Apply PCA on the training set.
2. Compute the within-class scatter matrix S_W (as in MEDA).
3. Compute the between-class scatter matrix S_B (fractional form, with the weight function w).
4. Apply the fractional-step dimensionality reduction algorithm.
5. Compute the optimal subspace:

E_{optimal} = \arg\max_E \frac{|E^T S_B E|}{|E^T S_W E|} = [c_1, c_2, ..., c_{c-1}]

6. Compute the most discriminant vectors:

P = E_{optimal}^T \cdot X

2.3. Kernel Methods

Kernel methods are more recent than the linear algorithms [3]. A kernel method finds the higher-order correlations between instances, as described in this section. It is assumed that patterns x ∈ ℝ^N are available, and that most of the information lies in dth-order relations of the pattern x.

One manner of extracting all the features from data is to extract the relations between all the elements of a vector. In computer vision applications, where images are converted to vectors, this feature extraction captures the relations between all pixels of the image. For example, in ℝ² (an image) all the second-order relations can be mapped into a non-linear space:

F: ℝ² → ℝ³
[x_1, x_2] → [x_1², x_2², x_1 x_2]

Page 22: JAMRIS 2015 Vol 9 No 4


This method is useful for low-dimensional data but can cause problems for high-dimensional data. For N-dimensional data there are

N_F = \frac{(N + d - 1)!}{d!(N - 1)!}

different combinations, which make a feature space of dimension N_F. For example, a 16×16 image with d = 5 has a feature space of dimension on the order of 10^10. By using kernel methods there is no need to compute these relations explicitly.

For computing the dot products (F(x) · F(x')), the kernel is defined as follows:

k(x, x') = (F(x) \cdot F(x'))

which allows the dot product in F to be computed without any need to apply the map F explicitly. In this method, first used in [3], if x is an image then the kernel (x · x')^d (or any other kernel) can be used to map onto a new feature space. This feature space is called the Hilbert space. In the Hilbert space all the relations between any vectors can be expressed using dot products. The input space is denoted by χ, the feature space by F, and the map by φ: χ → F. Any function that returns the inner product of two points x_i ∈ χ and x_j ∈ χ in the F space is called a kernel function.

Some of the popular kernels include [21]:

Polynomial kernel: k(x, y) = (x \cdot y)^d

RBF kernel: k(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)

Sigmoid kernel: k(x, y) = \tanh(\kappa (x \cdot y)^d + \theta), \quad d \in \mathbb{N}, \; \kappa > 0, \; \theta < 0

Kernels can also be combined using these methods in order to produce new kernels:

k(x, y) = \alpha k_1(x, y) + \beta k_2(x, y)

k(x, y) = k_1(x, y) \, k_2(x, y)

2.3.1. Kernel Principal Component Analysis

Given m instances x_k with zero mean, x_k = [x_{k1}, x_{k2}, ..., x_{kn}]^T ∈ ℝ^n, principal component analysis finds the new axes in the directions of maximum variance of the data, which is equivalent to finding the eigenvalues of the covariance matrix C:

\lambda w = C w

for eigenvalues λ ≥ 0 and eigenvectors w ∈ ℝ^n. In kernel principal component analysis, each vector x is mapped from the input space ℝ^n to a high-dimensional feature space ℝ^f using a nonlinear mapping function F: ℝ^n → ℝ^f, f > n. In ℝ^f the eigenvalue problem is:

\lambda w^F = C^F w^F

where C^F is the covariance matrix. The eigenvalues λ ≥ 0 and eigenvectors w^F ∈ ℝ^f \ {0} (the eigenvectors with eigenvalues that are not zero) must be determined so as to satisfy λ w^F = C^F w^F. Using it:

\lambda (F(x_k) \cdot w^F) = (F(x_k) \cdot C^F w^F), \quad k = 1, 2, ..., m

Also, coefficients α_i exist such that:

w^F = \sum_{i=1}^{m} \alpha_i F(x_i)

By combining the last three equations and introducing the m × m matrix K:

K_{ij} := (F(x_i) \cdot F(x_j))

the equation

m \lambda K \alpha = K^2 \alpha \;\equiv\; m \lambda \alpha = K \alpha

is reached, and kernel principal component analysis becomes an eigenvalue problem for K, where α is a column vector with entries α_1, ..., α_m [27]. For normalizing the eigenvectors in F, that is (w^k · w^k) = 1, the equation used is:

\lambda_k (\alpha^k \cdot \alpha^k) = 1

For extracting the principal components of a test instance x, whose projection in the ℝ^f space is F(x), only the projection of F(x) onto the eigenvectors w^k in the feature subspace F must be computed [27]:

(w^k \cdot F(x)) = \sum_{i=1}^{m} \alpha_i^k \, k(x_i, x)

It should be noted that none of these equations need F(x_i) explicitly: the dot products are calculated using the kernel function, without the need to apply the map F. In face recognition each vector x represents a face image, which is why the non-linear principal component method is called kernel eigenface in the face recognition domain.

The kernel principal component analysis algorithm is shown in Table 3 [27].

Page 23: JAMRIS 2015 Vol 9 No 4


Table 3. KPCA algorithm [27]

1. Calculate the gram matrix:

K_{training} = \begin{pmatrix} k(x_1, x_1) & k(x_1, x_2) & \dots & k(x_1, x_m) \\ k(x_2, x_1) & k(x_2, x_2) & \dots & k(x_2, x_m) \\ \dots & \dots & \dots & \dots \\ k(x_m, x_1) & k(x_m, x_2) & \dots & k(x_m, x_m) \end{pmatrix}

2. Solve m \lambda \alpha = K \alpha and compute α.

3. Normalize α^n using \lambda_n (\alpha^n \cdot \alpha^n) = 1.

4. Calculate the principal component coefficients for test data x using:

(w^k \cdot \phi(x)) = \sum_{i=1}^{m} \alpha_i^k \, k(x_i, x)

Classical principal component analysis is a special case of kernel principal component analysis in which the kernel function is a first-order polynomial. Kernel principal component analysis is therefore a generalized form of principal component analysis that uses different kernels for nonlinear mapping.

Another important matter is using data with zero mean in the new subspace, which can be accomplished using:

\tilde{F}(x_i) = F(x_i) - \frac{1}{m} \sum_{j=1}^{m} F(x_j)

As the data are not available in explicit form in the new space, the following method is used [26]. Considering the m × m matrix 1_m with (1_m)_{ij} = 1/m for each i and j, the above formula can be rewritten as [26]:

\tilde{K} = K - 1_m K - K 1_m + 1_m K 1_m

For the Kernel Fisherface method, kernel principal component analysis is first applied to the image, and then LDA is applied to the resulting vector [35].

2.4. Canonical Correlation Analysis

Canonical Correlation Analysis (CCA) is one mechanism for measuring the linear relationship between two multidimensional variables. This method was first introduced in [16], and although it has been known as a standard tool in pattern recognition, it has rarely been used in signal processing and biometric identification systems. CCA has had various applications in economics, medical studies and meteorology.

It is assumed that X is a matrix of dimension m × n consisting of m samples of an n-dimensional random variable x. The correlation coefficient ρ_{ij}, which expresses the correlation between x_i and x_j, is defined by:

\rho_{ij} = \frac{C_{ij}}{\sqrt{C_{ii} C_{jj}}}

where C_{ij} is the covariance between x_i and x_j, computed by:

C_{ij} = \frac{1}{m - 1} \sum_{k=1}^{m} (X_{ki} - \mu_i)(X_{kj} - \mu_j)

μ_i is the mean of the x_i values. A_x is the centered matrix of X, whose elements are:

a_{ij} = X_{ij} - \mu_j

Therefore, the covariance matrix is defined by:

C = \frac{1}{m - 1} A_x^T A_x

It has to be considered that correlation coefficients measure only the linear dependence between two variables. When two variables are uncorrelated (i.e., their correlation coefficient is zero), it merely states that there is no linear function describing the connection between the two variables.

The aim of CCA is to determine the correlation between two sets of variables. CCA attempts to find basis vectors for two sets of multidimensional variables in such a way that the linear correlation between the projections of the variables onto these basis vectors is mutually maximized.

The CCA method attempts to find basis vectors for two sets of vectors, one for x and one for y, such that the correlation between the projections of these variables onto the basis vectors is maximized. Assuming zero-mean vectors X and Y, the CCA method finds vectors α and β such that the correlation between the projections a_1 = α^T X and b_1 = β^T Y is maximized. The projections a_1 and b_1 are called the first canonical variables. Then the second pair of canonical variables a_2 and b_2, uncorrelated with a_1 and b_1, is computed, and this process is continued.

Considering ω_1, ω_2, ..., ω_c as the classes and the training data space defined as Ω = {ξ | ξ ∈ ℝ^N}, with A = {x | x ∈ ℝ^p} and B = {y | y ∈ ℝ^q}, x and y are feature vectors extracted from one instance ξ using two different feature extraction methods. The goal is to calculate the canonical correlations between x and y. α_1^T x and β_1^T y are the first pair of vectors, α_2^T x and β_2^T y the second pair, and so on, which can be written as:

X^* = (\alpha_1^T x, \alpha_2^T x, ..., \alpha_d^T x)^T = W_x^T x

Y^* = (\beta_1^T y, \beta_2^T y, ..., \beta_d^T y)^T = W_y^T y


Z_1 = \begin{pmatrix} X^* \\ Y^* \end{pmatrix} = \begin{pmatrix} W_x^T x \\ W_y^T y \end{pmatrix} = \begin{pmatrix} W_x & 0 \\ 0 & W_y \end{pmatrix}^T \begin{pmatrix} x \\ y \end{pmatrix}

And the transform matrix is:

W = \begin{pmatrix} W_x & 0 \\ 0 & W_y \end{pmatrix}, \quad W_x = (\alpha_1, \alpha_2, \ldots, \alpha_d), \quad W_y = (\beta_1, \beta_2, \ldots, \beta_d)

The directions α_i and β_i are called the i-th Canonical Projective Vectors (CPV), and for the feature vectors x and y, α_i^T x and β_i^T y are the i-th canonical correlation features. W_x and W_y are called the Canonical Projective Matrices (CPM), Z_1 is the Canonical Correlation Discriminant Feature (CCDF), and the method is called the Feature Fusion Strategy (FFS) [2, 30].

For determining the CCA coefficients, it is assumed that x and y are two random variables with zero means. The total covariance matrix is defined by:

C = E \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}^T \right\} = \begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix}

where C_xx and C_yy are the within-set covariance matrices of x and y, and C_xy = C_yx^T is the between-set covariance matrix. The correlation between x and y is defined as [30]:

C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx} \alpha = \rho^2 \alpha

C_{yy}^{-1} C_{yx} C_{xx}^{-1} C_{xy} \beta = \rho^2 \beta

where ρ^2 is the squared correlation and the eigenvectors α and β are the normalized basis correlation vectors.
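The eigenproblem above can be solved directly for the first canonical pair. The sketch below is a minimal NumPy illustration under the zero-mean assumption, not the paper's implementation; the function name is hypothetical:

```python
import numpy as np

def cca_first_pair(X, Y):
    # Solves Cxx^-1 Cxy Cyy^-1 Cyx alpha = rho^2 alpha for the
    # leading eigenpair. X: (m, p) and Y: (m, q) zero-mean data.
    m = X.shape[0]
    Cxx = X.T @ X / (m - 1)
    Cyy = Y.T @ Y / (m - 1)
    Cxy = X.T @ Y / (m - 1)
    # M = Cxx^-1 Cxy Cyy^-1 Cyx, built without explicit inverses.
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    w, V = np.linalg.eig(M)
    k = int(np.argmax(w.real))
    alpha = V[:, k].real                      # first canonical vector
    rho = float(np.sqrt(max(w[k].real, 0.0))) # first canonical correlation
    return alpha, rho

# Y is an exact linear map of X, so the first canonical
# correlation is 1 up to numerical error.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
X -= X.mean(axis=0)
Y = X @ np.array([[1.0, 0.5], [0.2, 1.0]])
alpha, rho = cca_first_pair(X, Y)
```

Solving `np.linalg.solve(Cxx, Cxy)` instead of forming Cxx^-1 explicitly is the numerically preferred way to build the product matrix.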

3. Experimental Results

In order to test the described algorithms, the Sheffield (UMIST) [28] and ORL [23] databases were utilized in the experiments. The Sheffield database contains 575 images of 20 people with a variety of head poses, from frontal view to profile. For training, 10 images were used from each person and the rest were used as the test set. Figure 1 shows a sample of this database.

Fig. 1. Sheffield database [28]

The ORL database contains 400 images of 40 people, with variety in the scale and pose of the head. From every person, five images were used as the training set and the rest as the test set. Figure 2 shows a sample of this database.

Fig. 2. ORL database [23]

3.1. Linear Methods

In order to establish a baseline, the linear algorithms were utilized. Matlab was used for the simulation [22]. For the neural network, the number of network inputs equals the feature vector's dimension. For the output, two approaches can be used. The first one is the bit method, in which the class number is encoded in bits, with each output neuron equivalent to one bit. For instance, 000110 represents class 6 and 001001 represents class 9. The output of an RBF network is a real number between 0 and 1. The other method is to assign one neuron to each class: if there are 40 classes, then there are also 40 nodes in the output layer. The second method produced better results and was used in all simulations. However, in cases with a large number of classes, the first method may be preferred. An additional neuron can also be reserved for images that do not belong to any class.
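The two output encodings can be sketched as follows (hypothetical helper names; NumPy is used here for illustration although the paper's simulations were done in Matlab):

```python
import numpy as np

def binary_target(label, n_bits):
    """Bit method: encode the class number in binary, one output
    neuron per bit (e.g. 6 -> 000110, 9 -> 001001)."""
    return np.array([(label >> i) & 1 for i in reversed(range(n_bits))])

def one_hot_target(label, n_classes):
    """One-neuron-per-class method: 40 classes -> 40 outputs,
    with a 1 at the position of the true class."""
    t = np.zeros(n_classes)
    t[label] = 1.0
    return t
```

The bit method needs only ceil(log2(C)) outputs for C classes, which is why it may be preferred when the number of classes is large.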

It should be noted that two other important neural network classifiers, the back-propagation neural network and the probabilistic neural network, performed worse than RBF neural networks in these experiments. The back-propagation neural network needs significantly more training time than the RBF neural network, and its memory requirements are also much larger. The experiments also show that the results obtained with the back-propagation neural network are of lower quality than those of the RBF neural network. The probabilistic neural network's performance is equivalent to that of distance-based classifiers.

For the linear methods, principal component analysis, linear discriminant analysis, fuzzy linear discriminant analysis [20], and multiple exemplar linear discriminant analysis were used. Results are shown as a function of the number of extracted features. Figure 3 illustrates the results for the linear methods using the RBF neural network [5, 10]. For all algorithms in this paper the distance-based classifier was also used, and in most cases the RBF neural networks outperformed the distance-based classifiers; only when the number of extracted features was low did the distance-based classifiers give better results.


Fig. 3. Linear based algorithms using RBF classifier on ORL database

As the figures show, multiple exemplar discriminant analysis has stronger discriminant capabilities than the other methods. Figure 4 shows the results for the Sheffield database.

Fig. 4. Linear based algorithms using RBF classifier on Sheffield database

Figures 5 and 6 show the results of the FMEDA algorithm compared with the LDA and MEDA algorithms. The results indicate that the FMEDA algorithm has a better recognition rate than LDA, MEDA, and the other linear methods.

Fig. 5. FMEDA algorithm using RBF classifier on ORL database

Fig. 6. FMEDA algorithm using RBF classifier on Sheffield database

3.2. Non-Linear Methods

For the nonlinear methods, kernel principal component analysis and kernel linear discriminant analysis were used. For kernel linear discriminant analysis, kernel principal component analysis is first applied to the images and then linear discriminant analysis is applied to the resulting vectors. A second-order polynomial is used as the kernel function. Figures 7 and 8 display the results.
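The KPCA step with a second-order polynomial kernel can be sketched as follows (an illustrative NumPy reimplementation, not the paper's Matlab code; the +1 offset in the kernel is an assumption). In the KLDA pipeline described above, ordinary LDA would then be applied to the returned projections:

```python
import numpy as np

def kernel_pca(X, n_components, degree=2):
    """Kernel PCA with a polynomial kernel k(a, b) = (a.b + 1)^degree.

    Returns the projections of the rows of X onto the leading
    kernel principal components."""
    m = X.shape[0]
    K = (X @ X.T + 1.0) ** degree                     # kernel matrix
    one = np.full((m, m), 1.0 / m)
    Kc = K - one @ K - K @ one + one @ K @ one        # center in feature space
    w, V = np.linalg.eigh(Kc)                         # ascending eigenvalues
    idx = np.argsort(w)[::-1][:n_components]          # keep the largest ones
    alphas = V[:, idx] / np.sqrt(np.maximum(w[idx], 1e-12))
    return Kc @ alphas                                # projected training samples

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
Z = kernel_pca(X, n_components=4)
```

The columns of the returned projection matrix are mutually orthogonal, as expected for principal components.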

Fig. 7. Non-linear based algorithms using RBF classifier on ORL database

Fig. 8. Non-linear based algorithms using RBF classifier on Sheffield database


As the figures show, kernel linear discriminant analysis gives better results than kernel principal component analysis, and kernel principal component analysis in turn gives better results than the Eigenface method.

Comparing these results with the linear algorithms confirms that the non-linear methods are not substantially better than the linear methods. The reason may be that the between-class distances do not increase when the data are mapped to a higher-dimensional space.

3.3. Evaluating CCA

Combining information is a powerful technique in data processing. This combination can be done at three levels: pixel level, feature level, and decision level, similar to combining classifiers. CCA combines information at the feature level.

One of the advantages of combining features is that feature vectors calculated using different methods capture different characteristics of the pattern. By combining them, the useful discriminant information from the vectors is kept while the redundant information is discarded.

For this experiment, CCA was applied to two different feature vectors, produced by two methods that each extract features from the image using a different technique. One of the methods is FMEDA, which had better results than the other linear and non-linear appearance-based methods. The other method is Discrimination Power Analysis (DPA). CCA is applied to the features extracted by these two methods.

A method based on the DCT has been introduced that extracts features with a better capability to discriminate faces [7]. As mentioned before, in conventional DCT the coefficients are chosen in a zigzag manner, and some of the low-frequency coefficients, located in the upper-left part of the image, are discarded because they contain the illumination information. Some coefficients have more discrimination power than others, so extracting these features yields a higher true recognition rate. Therefore, instead of choosing the coefficients in a zigzag manner, [7] searched for the coefficients with the most power to discriminate between images. Unlike methods such as PCA and LDA, which use between-class and within-class scatter matrices and try to maximize discrimination in the transformed domain, DPA searches for the best discriminating features in the original domain.

The DPA algorithm is as follows [7]. Assuming that the DCT has been applied to an image, the coefficients are X:

X = \begin{pmatrix} x_{11} & x_{12} & \ldots & x_{1N} \\ x_{21} & x_{22} & \ldots & x_{2N} \\ \ldots & \ldots & \ldots & \ldots \\ x_{M1} & x_{M2} & \ldots & x_{MN} \end{pmatrix}_{M \times N}

where the number of people in the database (the number of classes) is C, and for each person there are S training images, giving C·S training images in total. Table 4 shows how to calculate the DP of each coefficient x_ij:

Table 4. DPA algorithm [7]

1. Construct a large matrix containing all the DCT coefficients from the training images.

2. Calculate the mean and variance of each class.

3. Calculate the variance of all classes.

4. Calculate the mean and variance of all training samples.

5. For location (i, j) calculate the DP.
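These steps can be sketched as follows, using one common formulation of the discrimination power as the ratio of between-class to within-class variance at each coefficient location (the exact formula in [7] may differ; the helper name is hypothetical):

```python
import numpy as np

def discrimination_power(coeffs, labels):
    """coeffs: (n_samples, M, N) array of DCT coefficients,
    labels: (n_samples,) class index per training image.
    Returns an (M, N) map D; larger values indicate more
    discriminative coefficient locations."""
    classes = np.unique(labels)
    class_means = np.stack([coeffs[labels == c].mean(axis=0) for c in classes])
    within = np.stack([coeffs[labels == c].var(axis=0) for c in classes]).mean(axis=0)
    between = class_means.var(axis=0)       # spread of the class means
    return between / (within + 1e-12)

# Toy data: location (0, 0) separates the two classes, (0, 1) is noise.
rng = np.random.default_rng(2)
labels = np.repeat([0, 1], 20)
coeffs = rng.normal(size=(40, 2, 2))
coeffs[:, 0, 0] += 5.0 * labels             # class-dependent shift
D = discrimination_power(coeffs, labels)
```

In the toy data the shifted location (0, 0) receives a much higher DP value than the pure-noise location (0, 1).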

Higher values in D indicate locations with higher discrimination ability. Table 5 shows the procedure for recognizing faces:

Table 5. Procedure for recognizing faces [7]

1. Compute the DCT of the training images, and normalize the results.

2. Use a mask to discard some of the low and high frequencies.

3. Calculate the DP for the coefficients inside the mask.

4. Find and mark the n largest coefficients, and set the remaining coefficients to zero. The resulting matrix is an M × N matrix having n elements that are not zero.

5. Multiply the DCT coefficients by the matrix calculated in the previous step, and convert the resulting matrix into a vector.

6. Train a classifier using the training vectors. Apply the same process to the test images.


Figures 9 and 10 show the comparison between FMEDA, DPA, and CCA on the ORL and Sheffield databases, respectively. The results illustrate that applying CCA to the features can increase the recognition rate for human faces.

Fig. 9. Comparing CCA with FMEDA and DPA using RBF classifier on ORL database

Fig. 10. Comparing CCA with FMEDA and DPA using RBF classifier on Sheffield database

4. Conclusion

In this paper, several linear and non-linear appearance-based methods were discussed, and the methods were applied to two popular face recognition databases. Among the linear methods, FMEDA had better results than the other linear methods, and among the non-linear methods, KLDA outperforms KPCA. The experiments also show that the linear methods have recognition rates similar to the non-linear methods. A new method for face recognition was also introduced that outperforms the existing linear and non-linear methods. Canonical Correlation Analysis (CCA) is a strong tool for combining information at the feature level. Fractional Multiple Exemplar Discriminant Analysis (FMEDA) and Discrimination Power Analysis (DPA) were used as the feature extraction techniques. This paper's experimental results show that CCA using DPA and FMEDA exhibits improved results compared to other related methods.

AUTHORS

Mohammadreza Hajiarbabi* – Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, USA. E-mail: [email protected]

Arvin Agah – Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, USA. E-mail: [email protected]

*Corresponding author

REFERENCES

[1] Blanz V. S., Vetter T., "Face identification across different poses and illuminations with a 3D morphable model". In: IEEE International Conference on Automatic Face and Gesture Recognition, 2002, 202–207. DOI: 10.1109/AFGR.2002.1004155.

[2] Borga M., Learning multidimensional signal processing, Department of Electrical Engineering, Linköping University, Linköping Studies in Science and Technology Dissertations, no. 531, 1998.

[3] Boser B. E., Guyon I. M., Vapnik V. N., “A training algorithm for optimal margin classifiers.” In: D. Haussler, editor, Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, 1992, 144–152. DOI: 10.1145/130385.130401.

[4] Brunelli R., Poggio T., “Face recognition: Features versus templates”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, 1993, 1042–1053. DOI: 10.1109/34.254061.

[5] Chen S., Cowan C. F. N., Grant P. M., "Orthogonal least squares learning algorithm for radial basis function networks", IEEE Transactions on Neural Networks, vol. 2, no. 2, 1991, 302–309. DOI: 10.1109/72.80341.

[6] Cootes T. F., Edwards G. J., Taylor C. J., "Active appearance models", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, 2001, 681–685. DOI: 10.1109/34.927467.

[7] Dabbaghchian S., Ghaemmaghami M., Aghagolzadeh A., "Feature extraction using discrete cosine transform and discrimination power analysis with a face recognition technology", Pattern Recognition, vol. 43, no. 4, 2010, 1431–1440. DOI: 10.1016/j.patcog.2009.11.001.

[8] Fukunaga K., Introduction to statistical pattern recognition, 2nd ed., San Diego, CA: Academic Press, 1990, 445-450.

[9] Gui J., Sun Z., Jia W., Hu R., Lei Y., Ji S., "Discriminant sparse neighborhood preserving embedding for face recognition", Pattern Recognition, vol. 45, no. 8, 2012, 2884–2893. DOI: 10.1016/j.patcog.2012.02.005.

[10] Gupta J. L., Homma N., Static and dynamic neural networks from fundamentals to advanced theory, John Wiley & Sons, 2003.

[11] Hajiarbabi M., Askari J., Sadri S., Saraee M., “The Evaluation of Camera Motion, Defocusing and


Noise Immunity for Linear Appearance Based Methods in Face Recognition". In: IEEE Conference WCE 2007/ICSIE 2007, vol. 1, 2007, 656–661.

[12] Hajiarbabi M., Askari J., Sadri S., Saraee M., “Face Recognition Using Discrete Cosine Transform plus Linear Discriminant Analysis”. In: IEEE Conference WCE 2007/ICSIE 2007, vol. 1, 2007, 652–655.

[13] Hajiarbabi M., Askari J., Sadri S., "A New Linear Appearance-based Method in Face Recognition", Advances in Communication Systems and Electrical Engineering, Lecture Notes in Electrical Engineering, vol. 4, Springer, 2008, 579–587. DOI: 10.1007/978-0-387-74938-9_39.

[14] Hajiarbabi M., Agah A., "Face Detection in color images using skin segmentation", Journal of Automation, Mobile Robotics and Intelligent Systems, vol. 8, no. 3, 2014, 41–51.

[15] Hajiarbabi M., Agah A., "Human Skin Color Detection using Neural Networks", Journal of Intelligent Systems, under review, 2014.

[16] Hotelling H., “Relations between two sets of variates”, Biometrika, vol. 28, no. 3–4, 1936, 321–377. DOI: 10.2307/2333955.

[17] Hsu R., Abdel-Mottaleb M., Jain A., "Face Detection in Color Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, 2002, 696–706.

[18] Huang J., Heisele B., Blanz V., "Component-based Face Recognition with 3D Morphable Models". In: Proceedings of the 4th International Conference on Audio- and Video-based Biometric Person Authentication, chapter 4, Surrey, UK, 2003. DOI: 10.1007/3-540-44887-X_4.

[19] Kong S., Heo J., Abidi B., Paik J., Abidi M., "Recent Advances in Visual and Infrared Face Recognition – A Review", Computer Vision and Image Understanding, vol. 97, no. 1, 2005, 103–135. DOI: 10.1016/j.cviu.2004.04.001.

[20] Kwak K.C., Pedrycz W., “Face recognition using a fuzzy Fisher face classifier”, Pattern Recognition, vol. 38, 2005, 1717–1732.

[21] Lotlikar R., Kothari R., "Fractional-step dimensionality reduction", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 6, 2000, 623–627. DOI: 10.1109/34.862200.

[22] Math Works, 2015: www.mathworks.com.

[23] ORL Database, 2015: http://www.camorl.co.uk.

[24] Rowley H., Baluja S., Kanade T., "Neural network-based face detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, 1998, 22–38.

[25] Sarfraz M., Computer Aided Intelligent Recognition Techniques and Applications, John Wiley & Sons, 2005, 1–10.

[26] Scholkopf B., Statistical learning and kernel methods, Microsoft Research Limited, February 29, 2000.

[27] Scholkopf B., Smola A., Muller K. R., "Non-linear component analysis as a kernel eigenvalue problem", Neural Computation, vol. 10, no. 5, 1998, 1299–1319.

[28] Sheffield (UMIST) Database, 2015: http://www.sheffield.ac.uk/eee/research/iel/research/face.

[29] Shu X., Gao Y., Lu H., "Efficient linear discriminant analysis with locality preserving for face recognition", Pattern Recognition, vol. 45, no. 5, 2012, 1892–1898.

[30] Sun Q. S., Zeng S. G., Liu Y., Heng P. A., Xia D. S., "A new method of feature fusion and its application in image recognition", Pattern Recognition, vol. 38, no. 12, 2005. DOI: 10.1016/j.patcog.2004.12.013.

[31] Turk M., "A Random Walk through Eigenspace", IEICE Transactions on Information and Systems, vol. 84, no. 12, 2001.

[32] Turk M., Pentland A., "Eigenfaces for recognition", Journal of Cognitive Neuroscience, vol. 3, 1991, 71–86.

[33] Viola P., Jones M. J., "Robust real-time object detection". In: Proceedings of IEEE Workshop on Statistical and Computational Theories of Vision, 2001.

[34] Wiskott L., Fellous J. M., Kruger N., Malsburg C., "Face recognition by elastic bunch graph matching", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, 1997, 775–779.

[35] Yang J., Jin Z., Yang J., Zhang D., Frangi A. F., "Essence of Kernel Fisher discriminant: KPCA plus LDA", Pattern Recognition, vol. 37, no. 10, 2004, 2097–2100. DOI: 10.1016/j.patcog.2003.10.015.

[36] Yu H., Yang J., "A Direct LDA algorithm for high dimensional data with application to face recognition", Pattern Recognition, vol. 34, no. 10, 2001, 2067–2070.

[37] Zhou S. K., Chellappa R., "Multiple-Exemplar discriminant analysis for face recognition", Center for Automation Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742, 2003.


gorithms require data from highly precise sensors, such as laser scanners [28], or have high computing power demands if less precise data (e.g., from passive cameras) are used [8]. Thus, the SLAM approach is rather unsuitable for small mobile robots, such as our SanBot [19], which have quite limited resources with respect to on-board sensing, computing power, and communication bandwidth. Such a robot requires an approach to self-localization that does not need to construct a map of the environment, or that uses a simple and easy-to-survey representation of the known area. Moreover, the self-localization system should use data from compact and low-cost sensors.

In the context of navigation, CCD/CMOS cameras are the most compact and low-cost sensors for mobile robots [6]. However, most passive vision-based localization methods fail under natural environmental conditions due to occlusions, shadows, changing illumination, etc. Therefore, in practical applications of mobile robots, artificial landmarks are commonly employed. They are objects purposefully placed in the environment, such as visual patterns or reflecting tapes. Landmarks enhance the efficiency and robustness of vision-based self-localization [29]. It was also demonstrated that simple artificial landmarks are a valuable extension to visual SLAM [3]. An obvious disadvantage is that the environment has to be engineered. This problem can be alleviated by using simple, cheap, expendable and unobtrusive markers, which can be easily attached to walls and various objects. In this research we employ simple landmarks printed in black and white that are based on the matrix QR (Quick Response) codes commonly used to recognize packages and other goods.

In our recent work [21] we evaluated the QR code landmarks as self-localization aids in two very different configurations of the camera-based perception system: an overhead camera that observed a landmark attached to the top of a mobile robot, and a front-view camera attached to a robot, which observed landmarks freely placed in the environment. Both solutions enable the robot to be localized in real time with sufficient accuracy, but both have important practical drawbacks. The overhead camera provides an inexpensive means to localize a group of a few small mobile robots in a desktop application, but cannot be easily scaled up for larger mobile robots operating in a real environment. The front-view camera with on-board image processing is a self-contained solution for self-localization, which enables the robot to work autonomously, making it independent from possible com-

Improving Self-localization Efficiency In a Small Mobile Robot by Using a Hybrid Field of View Vision System

Marta Rostkowska, Piotr Skrzypczyński

Submitted: 27th August 2015; accepted 18th September 2015

DOI: 10.14313/JAMRIS_4-2015/30

Abstract: In this article a self-localization system for small mobile robots based on inexpensive cameras and unobtrusive, passive landmarks is presented and evaluated. The main contribution is the experimental evaluation of the hybrid field of view vision system for self-localization with artificial landmarks. The hybrid vision system consists of an omnidirectional, upward-looking camera with a mirror, and a typical, front-view camera. This configuration is inspired by the cooperation of peripheral and foveal vision in animals. We demonstrate that the omnidirectional camera enables the robot to quickly detect landmark candidates and to track the already known landmarks in the environment. The front-view camera, guided by the omnidirectional information, enables precise measurements of the landmark position over extended distances. The passive landmarks are based on QR codes, which makes it possible to easily include in the landmark pattern additional information relevant for navigation. We present an evaluation of the positioning accuracy of the system mounted on a SanBot Mk II mobile robot. The experimental results demonstrate that the hybrid field of view vision system and the QR code landmarks enable the small mobile robot to navigate safely along extended paths in a typical home environment.

Keywords: self-localization, artificial landmark, omnidirectional camera

1. Introduction

An important requirement for any mobile robot is to figure out where it is within its environment. The pose of a wheeled robot (position and orientation x_R = [x_R y_R θ_R]^T) can be estimated by means of odometry, but this method alone is insufficient [27], and the pose has to be corrected using measurements from external sensors. Although many approaches to self-localization are known from the literature, nowadays Simultaneous Localization and Mapping (SLAM) is considered the state-of-the-art approach to obtaining information about the robot pose [7].
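The odometric pose estimate mentioned above can be sketched, for a differential-drive robot, as follows (a generic textbook update rule, not code from the SanBot; parameter names are illustrative):

```python
import math

def odometry_update(x, y, theta, d_left, d_right, wheel_base):
    """Propagate the pose x_R = [x_R, y_R, theta_R]^T from
    incremental left/right wheel displacements. Measurement
    errors in d_left/d_right accumulate over time, which is
    why odometry alone is insufficient and the pose has to
    be corrected with external measurements."""
    d = (d_left + d_right) / 2.0              # distance travelled by the center
    d_theta = (d_right - d_left) / wheel_base # heading change
    x += d * math.cos(theta + d_theta / 2.0)  # midpoint integration
    y += d * math.sin(theta + d_theta / 2.0)
    theta = (theta + d_theta + math.pi) % (2.0 * math.pi) - math.pi
    return x, y, theta

# Driving straight ahead for 1 m leaves the heading unchanged.
pose = odometry_update(0.0, 0.0, 0.0, 1.0, 1.0, wheel_base=0.2)
```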

The SLAM algorithms estimate both the robot pose and the environment map from the sensory measurements; thus, they do not need a predefined map of the workspace. This is an important advantage, because obtaining a map of the environment that is suitable for self-localization is often a tedious and time-consuming task. However, the known SLAM al-


munication problems. However, the landmarks are detectable and decodable only over a limited range of viewing configurations. Thus, the robot has to turn the front-mounted camera towards the area of the landmark location before it starts to acquire an image. In a complicated environment with possible occlusions, this approach may lead to a lot of unnecessary motion. Eventually, the robot can get lost if it cannot find a landmark before the odometry drifts too much.

In this paper we propose an approach that combines, to some extent, the advantages of the overhead camera and the front-view camera for self-localization with passive landmarks, avoiding the aforementioned problems. We designed an affordable hybrid field of view vision system, which takes inspiration from nature and resembles the peripheral and foveal vision in animals. The system consists of a low-cost omnidirectional camera and a typical, front-view camera. The omnidirectional component, employing an upward-looking camera and a profiled mirror, provides the robot with an analog of the peripheral vision in animals. It gives the robot the ability to quickly detect interesting objects over a large field of view. In contrast, the front-view camera provides an analog of foveal vision: the robot can focus on details of already detected objects in a much narrower field of view. The cooperation of these two subsystems makes it possible to track in real time many landmarks located in the environment, without the need to move the robot platform, while still allowing precise measurement of the distances and viewing angles to the already found landmarks.

The remainder of this paper is organized as follows: in the next Section we analyze the most relevant related work. Section 3 introduces the concept and design of the hybrid vision system, whereas the landmarks based on QR codes and the image processing algorithms used in self-localization are described in Section 4. The experimental results are presented in Section 5. Section 6 concludes the paper and presents an outlook on further research.

2. Related Work

The advantages of biologically-inspired vision for robot self-localization have been demonstrated in a few papers. For instance, Siagian and Itti [25] have shown that extracting the "gist" of a scene to produce a coarse localization hypothesis, and then refining this hypothesis by locating salient landmark points, enables the Monte Carlo localization algorithm to work robustly in various indoor/outdoor scenarios. However, in this work both the global and the local characteristics of the scene were extracted from typical perspective-view images. One example of a system that is more similar to our approach, mimicking the cooperation between the peripheral vision and the foveal vision in humans, is given by Menegatti and Pagello [16]. They investigate cooperation between an omnidirectional camera and a perspective-view camera in the framework of a distributed vision system, with RoboCup Soccer as the target application. Only simple geometric and color features of the scene are considered in this system. An integrated, self-contained hybrid field of view vision system called HOPS (Hybrid Omnidirectional Pin-hole Sensor), which is quite similar in concept to our design, is presented in [5], where a calibration procedure is described that enables this sensor to be used for 3D measurements of the scene. Unfortunately, [5] gives no real application examples. Also, Adorni et al. [1] describe the use of a combined peripheral/foveal vision system including an omnidirectional camera in the context of mobile robot navigation. Their system uses both cameras in a stereo vision setup and implements obstacle detection and avoidance, but not self-localization.

Although the bioinspired vision solutions in mobile robot navigation mostly extract natural salient features, in many practical applications artificial landmarks are employed in order to simplify and speed up the image processing and to make the detection and recognition of features more reliable [15]. Visual self-localization algorithms are susceptible to errors due to unpredictable changes in the environment [11], and require much computing power to process natural features, e.g. by employing local visual descriptors [24]. The need to circumvent these problems in a small mobile robot that is used for education, requires reliable self-localization, and offers only limited computing resources motivated us to enhance the scene with artificial landmarks. Although active beacons, such as infra-red LEDs [27], can be employed, most artificial visual landmarks are passive. This greatly simplifies deployment of the markers and makes them independent of any power source. Depending on the robot application and the characteristics of the operational environment, very different designs of passive landmarks have been proposed [9, 22]. In general, simple geometric shapes can be quickly extracted from the images, particularly if they are enhanced by color [3]. A disadvantage of such simple landmarks is that only very limited information (usually only the landmark ID) can be embedded in the pattern. In contrast, employing the idea of the barcode in landmark design, either one-dimensional [4] or two-dimensional [12], makes it possible to easily encode additional information. In particular, matrix codes, which have proliferated recently due to their use in smartphone-based applications, enable the fabrication of much more information-rich landmarks. Moreover, landmarks based on matrix codes are robust to partial occlusion or damage of the content. Landmarks based on matrix codes are unobtrusive; their size can be adapted to the requirements of the particular application and environment. As they are monochromatic, they can be produced in a color matching the surroundings, partially blending into the environment. The robotics and computer vision literature provides examples of successful applications of QR codes for mobile robot self-localization. Introducing QR codes into the environment has improved the robustness and accuracy of the 3D-vision-based Monte Carlo self-localization algorithm in a dynamic environment, as demonstrated in [14]. The information-carrying capability of matrix codes can be efficiently used for self-localization and communication in a system of many mobile robots [18] and in an intelligent home space for service robot


[13]. The applicability of QR codes for navigation and object labelling has also been demonstrated in [10] on the NAO humanoid robot.

3. Hybrid Field of View Vision System

3.1. Concept and Components

Most mobile robots that employ vision for navigation use typical perspective cameras. A perspective camera can observe landmarks located at relatively large distances and positioned arbitrarily in the environment within the camera's horizontal field of view. The distance to the robot and the orientation of the landmark can be calculated from a single image taken by the perspective camera. Due to practical considerations, working indoors, we assume that the landmarks are attached to vertical surfaces, such as the walls that dominate man-made environments. Thus, we consider only the angle α between the camera's optical axis and the normal to the landmark's plane in 2D (Fig. 1). In the same camera coordinates, the position of the landmark is defined by the distance z_y measured along the camera's optical axis, which is assumed to be coincident with the robot's y_R axis, and the distance d along the robot's x_R axis, computed as the offset between the center of the image (i.e., the optical axis) and the center of the landmark. The distance at which the landmark can be detected and recognized depends on the camera resolution and the physical size of the landmark [21]. The information about the actual landmark size, as well as its position and orientation in the global reference frame x_L = [x_L y_L θ_L]^T, is encoded in the QR code of the landmark itself, so the robot does not need to keep a map of known landmarks in memory. Therefore, if at least one landmark can be recognized and decoded, the position and orientation of the robot can be computed. However, in order to find landmarks in the surroundings, the robot has to constantly change its heading, which is inconvenient.
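Under the planar geometry of Fig. 1, recovering the robot pose from one decoded landmark can be sketched as below. This is only an illustration: the sign and angle conventions are assumptions, the helper name is hypothetical, and the paper's actual derivation is given in [21]:

```python
import math

def robot_pose_from_landmark(x_l, y_l, theta_l, d, z_y, alpha):
    """Recover the robot pose from one decoded landmark.

    (x_l, y_l, theta_l): landmark position and outward normal
    direction in the global frame (decoded from the QR code).
    d, z_y: landmark offset and distance in the robot frame;
    alpha: angle between the optical axis and the landmark
    normal. All sign/angle conventions here are assumptions."""
    # The camera looks against the outward normal, rotated by alpha.
    theta_r = (theta_l + math.pi - alpha) % (2.0 * math.pi)
    # Express the landmark position in the global frame and subtract:
    # y_R is the forward (optical) axis, x_R the rightward axis.
    fx, fy = math.cos(theta_r), math.sin(theta_r)    # forward axis y_R
    rx, ry = math.sin(theta_r), -math.cos(theta_r)   # right axis x_R
    x_r = x_l - (d * rx + z_y * fx)
    y_r = y_l - (d * ry + z_y * fy)
    return x_r, y_r, theta_r

# Landmark at (2, 0) facing the robot (normal along -x), seen head-on
# at distance 2 m: the robot stands at the origin, heading +x.
pose = robot_pose_from_landmark(2.0, 0.0, math.pi, d=0.0, z_y=2.0, alpha=0.0)
```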

Fig. 1. Geometry of landmark measurements by using the perspective camera

The omnidirectional subsystem combines a standard upward-looking camera with an axially symmetric mirror located above this camera and provides a 360ᵒ field of view in the horizontal plane. This type of omnidirectional sensor is called catadioptric [23] and can be implemented using mirrors of different vertical profiles: parabolic, hyperbolic, or elliptical. The omnidirectional sensor used in this research was designed and built within a project run by students, which imposed limitations on the costs and the technology used. The mirror was fabricated in a workshop from a single piece of aluminium using a simple milling machine, which limited the achievable curvature of the profile. Thus, a mirror of conical shape with a rounded, parabolic tip was designed (Fig. 2). This profile could be fabricated at acceptable cost using typical workshop equipment. The mirror is held by a highly transparent acrylic tube over the lens of an upward-looking webcam.

Fig. 2. Conical mirror: a – design of the conical mirror with rounded tip for the omnidirectional vision sensor, b – the fabricated mirror

Omnidirectional camera images represent a geometrically distorted view of the environment: straight lines appear as arcs, and squares are deformed. For this reason it is difficult to find the characteristic elements which are needed in the localization process. It is therefore necessary to transform the images using the single effective viewpoint [26]. Unfortunately, the chosen shape of the mirror makes it hard to achieve the single effective viewpoint property in the sensor. While for hyperbolic or elliptical mirrors this is simply achieved by placing the camera lens at a proper distance from the mirror (at one of the foci of the hyperbola/ellipse), for a parabolic mirror an orthographic lens must be interposed between the mirror and the camera [2]. This was impossible in our simple sensor design, which uses a fixed-lens webcam as the camera. Therefore, it is impossible to rectify the images captured by our omnidirectional camera to geometrically correct planar perspective images [26]. While the captured pictures may be mapped to flat panoramic images covering the 360ᵒ field of view, these images are still distorted along their vertical axis, i.e. they do not map all the distances between the objects and the sensor correctly into the vertical pixel locations. However, there are no distortions along the horizontal axis, which makes it possible to recover the angular location of the observed objects with respect to the sensor. In the context of landmark-based positioning this means that while the landmarks can be detected in the omnidirectional images, only their angular locations, but not their distances with respect to the robot, can be determined precisely, particularly for more distant landmarks. Moreover, the internal content of the landmark (QR code) cannot be decoded reliably from the distorted images. Ultimately, while the omnidirectional camera is capable of observing the whole proximity of the robot without unnecessary motion, it requires high computing power to rectify the whole images, still giving no guarantee that the geometric measurements of landmark positions are precise enough for self-localization.

Fig. 3. Exemplary view from the omnidirectional vision component with artificial landmarks in the field of view

The aforementioned properties and limitations of the two camera subsystems resemble the characteristics of foveal and peripheral vision in animals. This provides a strong argument for combining both systems. If the perspective camera and the omnidirectional camera subsystems are coupled for landmark perception, their drawbacks can be mutually compensated to a great extent. The omnidirectional camera can provide a 360ᵒ view with detection of landmarks, and then guide the perspective camera to the angular coordinates of the found landmarks. The perspective camera can be pointed directly at the landmark at the known angular coordinates, and then can precisely measure its location and read the QR code. It should be noted that in this cooperation scheme neither full rectification of the omnidirectional images nor perspective correction of the front-view camera images is needed, which significantly decreases the required computing power.

3.2. Experimental System on the Mobile Robot

The experimental mobile robot with the hybrid field of view vision system is shown in Fig. 4. It is based on the small, differential-drive mobile platform SanBot Mk II [19]. The robot is equipped with the front-view camera and the omnidirectional camera.

The front-view camera is mounted directly to the upper plate of the robot's chassis. It is a Logitech 500 webcam, providing images at a resolution of 1280x1024. The Microsoft LifeCam webcam is used in the omnidirectional sensor. This particular camera has been chosen due to its compact size, high resolution (1280x720), and an easy-to-use API. The mirror has a diameter of 6 cm and is located 11 cm above the camera lens. The omnidirectional camera is positioned precisely above the front-view camera. Both cameras stream images at 15 FPS through the USB interface.

In the current experimental setup image processing takes place on a notebook PC. The simple controller board of the SanBot robot receives only the calculated positions of the landmarks that are necessary to compute the motion commands. These data are transferred via a serial (COM) port. The robot schedules the sequence of motions to execute in order to follow the planned path. The robot stops for a moment when taking images, and then obtains the outcome of the calculations related to landmark-based self-localization.

4. Landmarks and Self-localization

4.1. Passive Landmarks with Matrix Codes

There are many possibilities to design a passive landmark, but if a CCD/CMOS camera is to be used as the sensor, the landmark should have the following basic properties:
• it should be recognizable and decodable over a wide range of viewing distances and angles;
• it should be easily recognizable under changing environmental conditions (e.g. variable lighting, partial occlusions);
• its geometry should allow easy and quick extraction from an image;
• it should be easy to prepare, preferably printable in one color;
• it should be unique within the robot's working area, e.g. by containing an encoded ID.

For small mobile robot positioning we formulate a further requirement related to the limited computing power of the system: the landmarks should be able to carry additional information relevant to self-localization and navigation, such as the position of the landmark in the global frame, object labels, or guidance hints. Such information, easily and robustly decodable from the landmark's image, helps the robot navigate without building a map of the environment in memory.
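To make this requirement concrete, a landmark payload can be a short, structured string. The following sketch is purely illustrative (the paper does not specify the authors' payload format); the field layout `id;x;y;theta` and both function names are assumptions:

```python
# Illustrative sketch (not the authors' actual format): a landmark's ID and
# global pose packed into a short text payload that fits easily in a QR code.

def encode_payload(landmark_id, x_l, y_l, theta_l):
    """Serialize landmark ID and global pose (cm, degrees) as 'id;x;y;theta'."""
    return f"{landmark_id};{x_l:.1f};{y_l:.1f};{theta_l:.1f}"

def decode_payload(payload):
    """Parse the payload back into (id, x, y, theta)."""
    fields = payload.split(";")
    return int(fields[0]), float(fields[1]), float(fields[2]), float(fields[3])

# Example round trip for landmark '1' at (235, 250) cm, oriented 210 deg:
payload = encode_payload(1, 235.0, 250.0, 210.0)
lid, x, y, th = decode_payload(payload)
```

A few dozen bytes of such text are well within the capacity of even a small QR code, leaving room for the error-correction overhead.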

Fig. 4. SanBot Mk II with the hybrid field of view vision system: CAD drawing (a), and a photo of the assembled robot (b)


All of the above-listed requirements are met by matrix codes. In our previous work [20] we experimentally evaluated four types of commercially used matrix codes as candidates for landmarks. The results revealed that among these code types the most suitable for navigation are QR codes. A QR code contains three position markers (upper left, upper right, and lower left), which are additionally separated from the data by a white frame. This pattern allows the code orientation to be recovered easily. Compared to the other considered variants, the QR code is also characterized by a large size of a single module (i.e. a white/black cell). This is an important advantage, which ensures proper measurements even at long distances. In addition, QR codes are capable of partial error correction if they are damaged or occluded.

Fig. 5. Exemplary QR code based landmark: QR code encoding value ‘1’ (a), 16.5x16.5 cm landmark with frame (b), recognized landmark in the environment (c)

As a result, our landmark is designed around a standard QR code, which is placed in the center. The code is surrounded by a black frame, which is used for initial sorting of landmark candidates in the images and for reducing the number of potential objects that are subject to decoding and further processing. Landmarks are monochromatic (usually black-and-white), because they should be extremely low-cost and printable on any material, not only paper. An example of a QR code, in which the ID '1' has been encoded, is shown in Fig. 5a. A complete landmark with the added black frame is depicted in Fig. 5b. The same landmark, recognized and decoded in the environment, is shown in Fig. 5c.

The program processing the data from both cameras has been created using the C# programming language and Microsoft Visual Studio 2010. The basic image processing part, covering the conversion of the omnidirectional images, the extraction of landmark candidates from these images, and the computation of the geometric measurements, is implemented with the EmguCV library [30], which is a C# wrapper of the well-known OpenCV. The decoding of extracted QR codes is accomplished using the specialized MessagingToolkit library [32].

4.2. Localization of Landmarks on Perspective Camera Images

We assume that landmarks are mounted on rigid surfaces, so that they will not bend and deform the square frames. Thus, the images from the perspective camera are assumed to be undistorted. The algorithm for recognition and localization of landmarks observed by the perspective camera is shown in Fig. 6. The image processing begins with acquiring images from the front-view camera. The images are filtered, and then the thick frames around the QR codes are searched for by extracting candidate rectangles.

If the surface of a landmark is roughly parallel to the camera sensor's surface, there is no perspective deformation of the QR code, and it can be directly processed by the appropriate MessagingToolkit routine. If such a landmark is found, the distance to the robot's camera is calculated. Whenever the camera's optical axis intersects the center of the landmark (i.e. it is located horizontally in the center of the image) the distance is calculated in a simple way:

z = f · hL / hI ,  (1)

where z is the distance between the landmark and the camera, f is the camera’s focal length, hL is the known vertical dimension (height) of the landmark, and hI is the observed object’s vertical dimension on the image. The viewing angle can be computed from the formula:

α = arccos((wI · hL) / (wL · hI)) ,  (2)

where wL is the known horizontal dimension (width) of the landmark, and wI is the observed object's horizontal dimension on the image.
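The two measurements can be sketched as follows. This is a hedged reconstruction, not the authors' code: Eq. (1) is the standard similar-triangles relation, and for Eq. (2) the form alpha = acos((wI·hL)/(wL·hI)) is assumed from the foreshortening of the landmark width relative to its height:

```python
import math

# Hedged sketch of Eqs. (1)-(2): distance from similar triangles, viewing
# angle from the foreshortening of the landmark width (assumed form).

def landmark_distance(f, h_l, h_i):
    """Eq. (1): z = f * h_L / h_I (f, h_I in pixels; h_L in cm -> z in cm)."""
    return f * h_l / h_i

def landmark_viewing_angle(w_l, w_i, h_l, h_i):
    """Assumed Eq. (2): the observed width shrinks by cos(alpha), so
    alpha = acos((w_I * h_L) / (w_L * h_I)); returned in degrees."""
    ratio = (w_i * h_l) / (w_l * h_i)
    return math.degrees(math.acos(min(1.0, ratio)))

# A 20x20 cm landmark seen 100 px tall and 87 px wide with f = 1000 px:
z = landmark_distance(1000.0, 20.0, 100.0)              # 200 cm
alpha = landmark_viewing_angle(20.0, 87.0, 20.0, 100.0)
```

The clamp to 1.0 guards against a measured width slightly exceeding the frontal width due to pixel noise.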

However, if the landmark is not located in the center of the image (cf. Fig. 1), the distance between the camera and the landmark is calculated from the right-angle triangle formed by the distance zy measured along the camera's optical axis (which is assumed to be coincident with the robot's yR axis), and the distance d along the robot's xR axis, computed as the offset between the center of the image and the center of the landmark:

d = zy · dp / f ,  (3)

Fig. 6. Landmarks detection and decoding algorithm for the perspective camera


where dp is the distance in pixels between the center of the image and the center of the landmark's bounding frame.

The viewing angle between the camera’s optical axis and the vector normal to the landmark surface is calculated as:

α = arctan(d / zy) ,  (4)

where zy is the perpendicular distance from the camera to the landmark, and d is the distance calculated from (3).
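The off-center case can be sketched as below. The form d = zy·dp/f assumed for Eq. (3) projects the pixel offset back into the scene at depth zy, and the distance follows from the right-angle triangle mentioned in the text; function names are illustrative:

```python
import math

# Sketch of Eqs. (3)-(4) for a landmark off the image center (assumed forms).

def lateral_offset(z_y, d_p, f):
    """Assumed Eq. (3): d = z_y * d_p / f, i.e. the pixel offset d_p scaled
    to scene units at depth z_y (f, d_p in pixels; z_y, d in cm)."""
    return z_y * d_p / f

def distance_and_angle(z_y, d):
    """Distance via the right-angle triangle (z_y, d) and the viewing angle
    of Eq. (4): alpha = atan(d / z_y), returned in degrees."""
    z = math.hypot(z_y, d)
    alpha = math.degrees(math.atan2(d, z_y))
    return z, alpha

d = lateral_offset(200.0, 150.0, 1000.0)   # 30 cm lateral offset
z, alpha = distance_and_angle(200.0, d)
```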

Fig. 7. Spatial distribution of errors in QR code-based landmark localization by the perspective view camera: distance to landmark z errors (a), and viewing angle α errors (b)

However, if in the given viewing configuration of the perspective camera the surface of a landmark is not parallel to the camera sensor's surface, the perspective deformation of the landmark's image has to be corrected before decoding the QR code and calculating the distances and angle from (1)–(4). In such a case the relation between the locations of the characteristic points (corners) in 3D and the image plane has to be found in order to properly calculate the landmark's position and rotation. Computations of this relation are described in more detail in [21]. We omit these calculations here, because such situations should not occur when self-localization uses both the perspective and omnidirectional cameras, as the perspective camera is set to a proper angular position before taking an image of the landmark. Thus, the viewing angle of the landmark never exceeds 15ᵒ. Results of our earlier experiments [21] provide evidence that for such small viewing angles the correction of perspective brings no improvement in the landmark localization, while this procedure is computationally intensive.

Quantitative results for the measurements of an exemplary passive landmark (size 20x20 cm) are shown in Fig. 7. The landmark was observed by the perspective camera from distances up to 2 meters and at viewing angles up to 60ᵒ. In this experiment the camera was positioned in such a way that the optical axis always intersected the center of the landmark, thus the offset d was zero. As could be expected, the distance measurement error grows with larger distances, but it also grows slightly for large viewing angles (Fig. 7a), which can be attributed to the uncorrected perspective deformation. As can be seen from the plot in Fig. 7b, the measured viewing angle is less precise for large and for very small distances. This is probably caused by the procedure searching for the thick black frame, which for very large images of a landmark (small distances) occasionally finds the inner border of the frame instead of the outer one. The average precision of the measurements (over all distances and viewing angles) turned out to be 1.3 cm for the distance and 2ᵒ for the viewing angle.

4.3. Recognition of Landmarks on the Omnidirectional Images

As described in Section 3, the low-cost omnidirectional camera geometry and optics do not permit full rectification of the distorted images. Therefore, we use images from the omnidirectional camera only to find potential landmark candidates in the robot's vicinity, and to track the known landmarks.

At the beginning, in order to reduce the amount of information to process, the color image from the camera is converted into a black-and-white image. The data processing starts by cropping and unwinding the omnidirectional image. Cropping the image consists in selecting the part of the picture which is necessary for recognition of the landmarks. The unwinding procedure is a simple cylinder unrolling. First, the algorithm sets the height and width of the unrolled picture:

H = R2 − R1 ,  W = 2π · R2 ,  (5)

where R2 is the radius of the outer circle, and R1 is the radius of the inner circle marked in Fig. 3. Next, the algorithm computes a new position for each pixel in the unrolled image. This procedure is shown in pseudo-code:

Listing 1. Pixel position calculation procedure

y = H - 1;
for (x = 0; x < W; x++)
{
    for (i = 0; i < H; i++)
    {
        r = (y / H) * (R2 - R1) + R1;
        theta = (x / W) * 2 * PI;
        xs = Cx + r * Sin(theta);
        ys = Cy + r * Cos(theta);
        mapx.Data[y, x] = xs;
        mapy.Data[y, x] = ys;
        y--;
    }
    y = H - 1;
}

The two operations described above give the same result for each processed image, so they are executed only once at the beginning of the program. Afterwards, the program unwinds each picture using the EmguCV cvRemap function. This function transforms the source image using the specified map (in our case mapx and mapy from the algorithm in Listing 1).
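The precompute-once pattern of Listing 1 can be sketched in plain Python as follows. This is an illustrative reconstruction, not the authors' C# code; the function name and parameter values are assumptions:

```python
import math

# Sketch of the precomputed unwrapping maps from Listing 1: for each pixel
# (x, y) of the unrolled panorama, mapx/mapy store the source coordinates on
# the omnidirectional image. The maps depend only on the sensor geometry, so
# they are built once; a remap routine (cvRemap in EmguCV, cv2.remap in
# Python) can then apply them to every incoming frame.

def build_unwrap_maps(r1, r2, cx, cy):
    """Sampling maps for the annulus between radii R1 and R2 centred at
    (cx, cy); H = R2 - R1 and W = 2*pi*R2 as in Eq. (5)."""
    h = int(r2 - r1)
    w = int(round(2 * math.pi * r2))
    mapx = [[0.0] * w for _ in range(h)]
    mapy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        r = (y / h) * (r2 - r1) + r1          # row -> radius on mirror image
        for x in range(w):
            theta = (x / w) * 2 * math.pi     # column -> bearing
            mapx[y][x] = cx + r * math.sin(theta)
            mapy[y][x] = cy + r * math.cos(theta)
    return mapx, mapy

# Assumed geometry: annulus radii 80/240 px, image centre (320, 240).
mapx, mapy = build_unwrap_maps(r1=80, r2=240, cx=320, cy=240)
```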

After the image is unwrapped, its end parts are duplicated at the opposite ends in order to obtain a continuous image in the areas where landmarks can appear. The unrolled image is shown in Fig. 8. This image undergoes morphological erosion in order to remove noise. Next, the Canny operator is used to find edges in the image. Among the found edges, those that are connected into rectangular shapes are selected. Then, the algorithm eliminates all nested rectangles, i.e. those which are located inside other rectangles. The found landmark candidates are marked on the image.
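The nested-rectangle elimination step can be sketched as a simple containment filter over bounding boxes. The helper names and rectangle representation (x, y, w, h) are assumptions for illustration:

```python
# Sketch of the candidate-filtering step: after edge detection yields
# rectangle candidates (x, y, w, h), discard every rectangle fully nested
# inside another one, keeping only the outermost frames.

def contains(outer, inner):
    """True if rectangle `inner` lies fully inside rectangle `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

def drop_nested(rects):
    """Keep only rectangles not contained in any other candidate."""
    return [r for r in rects
            if not any(o != r and contains(o, r) for o in rects)]

# Two landmark frames, one with its inner QR border detected as well:
cands = [(10, 10, 50, 50), (15, 15, 40, 40), (100, 20, 30, 30)]
outer = drop_nested(cands)   # -> [(10, 10, 50, 50), (100, 20, 30, 30)]
```

Discarding the inner borders matters because, as noted in Section 4.2, mistaking the inner frame border for the outer one corrupts the size-based distance measurement.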

Fig. 8. Results from the omnidirectional camera: a – cropped and unwrapped image, b – extended image, c – unwrapped and extended image with marked candidates

The viewing angle of the landmark with respect to the robot’s heading is calculated as:

α = ((xs − Wd) / W) · 360ᵒ ,  (6)

where xs and ys define the center of the landmark, W is the width of the unwrapped image, and Wd is the width of the duplicated part of the picture.
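Since the unwrapped image maps columns linearly to bearings, Eq. (6) reduces to a one-line conversion. The sketch below assumes that form and additionally wraps the result to [-180ᵒ, 180ᵒ); the function name is illustrative:

```python
# Sketch of Eq. (6) under the assumed linear column-to-bearing mapping.

def omni_bearing_deg(x_s, w, w_d):
    """Bearing (degrees, wrapped to [-180, 180)) of a landmark whose centre
    column is x_s, in an unwrapped image of width w with a duplicated
    margin of w_d columns."""
    angle = ((x_s - w_d) / w) * 360.0
    return (angle + 180.0) % 360.0 - 180.0

# A landmark centred at column 500 of a 1508-column panorama with a
# 100-column duplicated margin:
phi = omni_bearing_deg(500, 1508, 100)
```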

Afterwards, the program makes a list of potential landmarks and their relative angles. The algorithm for finding and localizing landmarks in the omnidirectional images is shown in Fig. 9.

4.4. Self-localization with the Hybrid System

The self-localization algorithm based on data from both the omnidirectional and the perspective camera is shown in Fig. 10. At the beginning, the program processes only an image from the omnidirectional camera. If the algorithm described in Subsection 4.3 finds a landmark candidate at a viewing angle smaller than ±15ᵒ, the program starts processing the image from the front-view camera. This way the robot does not need to aim the perspective camera directly at the landmark, which speeds up the self-localization process.

When the landmark is seen in the angular sector of ±15ᵒ, the image from the perspective camera is processed. The program searches for the landmark, decodes it, and calculates the robot's position and orientation in the external reference frame (cf. Fig. 1). The orientation of the robot θR is a concatenation of the landmark's orientation in the global coordinates θL and the robot's orientation with regard to the landmark α. The orientation is calculated as:

θR = θL′ + α ,  (7)

where θL′ is θL – 180ᵒ and α is the angle calculated from (4). The robot’s position in the global reference frame is calculated as:

xR = xL ± d ,  yR = yL ± zy ,  (8)

where xL and yL define the landmark's position, zy is the perpendicular distance between the camera and the landmark, and d is the distance calculated from (3). In (8) the plus sign is used to compute the position along the x-axis when the landmark is located to the right of the robot, and the minus sign when it is on the left. At the beginning of the calculations the algorithm assumes that the landmark is located in front of the robot, and uses the plus sign in (8) to compute the position along the y-axis. But if the robot has to rotate 180ᵒ to decode the landmark, the algorithm uses the minus sign.

Fig. 9. Landmark detection algorithm for the omnidirectional camera
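The pose computation of Eqs. (7)–(8) can be sketched as follows. This is a hedged reconstruction under the sign conventions described above; the function name and the boolean flags encoding the sign choices are assumptions:

```python
# Hedged sketch of Eqs. (7)-(8): the robot orientation concatenates the
# landmark's global orientation (shifted by 180 deg) with the measured
# viewing angle; the position offsets the landmark's global position by
# d and z_y with the signs chosen as described in the text.

def robot_pose(x_l, y_l, theta_l, z_y, d, alpha,
               landmark_right=True, rotated_180=False):
    """Return (x_R, y_R, theta_R) in the global frame; the flags select the
    +/- signs of Eq. (8) (landmark to the right / robot rotated 180 deg)."""
    theta_r = (theta_l - 180.0) + alpha                   # Eq. (7)
    x_r = x_l + d if landmark_right else x_l - d          # Eq. (8), x-axis
    y_r = y_l + z_y if not rotated_180 else y_l - z_y     # Eq. (8), y-axis
    return x_r, y_r, theta_r

# Landmark 3 of Tab. 2 at (120, 442) cm, theta_L = 180 deg, observed at
# z_y = 160 cm, d = 8 cm, alpha = 5 deg (measurement values assumed):
pose = robot_pose(120.0, 442.0, 180.0, 160.0, 8.0, 5.0)
```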

If a candidate landmark is found at a viewing angle larger than ±15ᵒ, the robot turns around until the angle becomes smaller than ±15ᵒ. The most common situation is that the algorithm finds more than one potential landmark. In such a case the robot turns towards the nearest landmark. Although the front-view camera is capable of recognizing landmarks that are visible at angles up to ±60ᵒ, to ensure robustness the QR codes are decoded only when they are visible at an angle of at most ±15ᵒ. Images from the front-view camera are processed only if the omnidirectional camera finds a potential landmark. If no landmark candidates can be found in the unwrapped image, the robot does not localize itself, and tries to continue using odometry.
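The decision logic above can be summarized as a single control step. This sketch is illustrative: the action labels are invented, and the smallest absolute bearing is used as a proxy for the nearest landmark:

```python
# Sketch of one decision step of the hybrid self-localization loop:
# the omnidirectional image is processed first, and the front-view camera
# is consulted only when a candidate lies within the +/-15 deg sector.

SECTOR_DEG = 15.0

def localization_step(candidate_bearings):
    """Given landmark-candidate bearings (degrees) from the omnidirectional
    camera, return the action the robot should take next."""
    if not candidate_bearings:
        return ("odometry", None)                 # no landmark: dead-reckon
    nearest = min(candidate_bearings, key=abs)    # proxy for nearest landmark
    if abs(nearest) <= SECTOR_DEG:
        return ("decode_front_camera", nearest)   # QR decodable head-on
    return ("turn_towards", nearest)              # rotate, then retry

action = localization_step([40.0, -10.0])
```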

5. Experiments and Results

In order to verify the accuracy of the landmark-based self-localization and the usability of the hybrid vision system in practical scenarios, we have performed several experiments in a typical home environment.

In these experiments we used one SanBot Mk II equipped with the hybrid field of view vision system. The ground truth data about the covered path was collected by manually measuring the robot's 2D position with respect to the planned path, which was marked on the floor with scotch tape. Here we present the quantitative results for the longest path, spanning three rooms (Fig. 11). During this experiment the robot covered the planned path ten times, which enabled us to assess the repeatability of the measurements carried out by our vision system.

The test environment contains seven landmarks, which encode their positions and orientations with respect to the external coordinate system. Using only one landmark, its data, and trigonometric relations, the robot can calculate its position. For this reason, the landmarks in the environment are arranged so that the robot can always see at least one of them. At the beginning, the program searches for potential landmarks in images from the omnidirectional camera. If the algorithm finds a potential landmark and its absolute viewing angle is less than ±15ᵒ, the program starts processing data from the front-view camera. The robot stops near a detected landmark. If the algorithm finds a landmark candidate, but the angle is larger than ±15ᵒ, the robot turns towards the landmark until the angle is

Fig. 11. Robot's path during the experiment. Small squares represent points at which the robot stops and takes images

Fig. 10. Landmark detection and decoding algorithm for the hybrid system

Tab. 1. Viewing angle determination results for the omnidirectional camera

Robot stop no. | α [ᵒ] | αg [ᵒ] | Δα [ᵒ]
1 | 71.98 | 72.00 | 0.02
2 | -9.19 | -10.00 | 0.81
3 | 10.52 | 9.00 | 1.52
4 | 21.33 | 20.00 | 1.33
5 | 4.39 | 5.00 | 0.61
6 | 0.35 | 0.00 | 0.35
7 | -37.69 | -38.00 | 0.31
8 | -3.56 | -3.00 | 0.56
9 | -49.88 | -50.00 | 0.12
10 | -3.40 | -4.00 | 0.60
11 | 47.84 | 50.00 | 2.16
12 | -2.91 | -3.00 | 0.09


smaller than ±15ᵒ. Then, the robot updates its pose from the computed localization data and continues to the next via-point on the planned path.

In this experiment the average error of determining the position of the robot was 3 cm along the x-axis and 5 cm along the y-axis, and the orientation error was 4ᵒ when using the front-view camera. For the omnidirectional camera the orientation error was only 1ᵒ. This makes it possible to compensate for the degraded orientation accuracy of the robot pose by using data from

Tab. 2. Robot self-localization results for the hybrid system

L.no. | xL [cm] | yL [cm] | αL [ᵒ] | xR [cm] | yR [cm] | αR [ᵒ] | xgR [cm] | ygR [cm] | αgR [ᵒ] | ΔxR [cm] | ΔyR [cm] | ΔαR [ᵒ] | σxR [cm] | σyR [cm] | σαR [ᵒ]
1 | 0.00 | 144.00 | 90.00 | 93.90 | 105.86 | -75.62 | 97.00 | 108.00 | -65.00 | 3.10 | 2.14 | 10.62 | 2.00 | 3.63 | 9.34
2 | 235.00 | 250.00 | 210.00 | 123.86 | 167.89 | 34.28 | 120.00 | 160.00 | 30.00 | 3.86 | 7.89 | 4.28 | 2.55 | 6.97 | 5.06
3 | 120.00 | 442.00 | 180.00 | 112.45 | 282.54 | 0.94 | 110.00 | 288.00 | 5.00 | 2.45 | 5.46 | 4.06 | 1.74 | 4.42 | 3.09
4 | 45.00 | 605.00 | 180.00 | 56.75 | 438.32 | 4.77 | 54.00 | 442.00 | 3.00 | 2.75 | 3.68 | 1.77 | 1.35 | 4.08 | 2.98
5 | -135.00 | 544.00 | 90.00 | 15.85 | 529.85 | -81.30 | 18.00 | 522.00 | -85.00 | 2.15 | 7.85 | 3.70 | 1.2 | 6.90 | 1.05
6 | -40.00 | 390.00 | 270.00 | -82.49 | 429.76 | 98.51 | -78.00 | 434.00 | 98.00 | 4.49 | 4.24 | 0.51 | 3.10 | 5.70 | 1.51
7 | -220.00 | 222.00 | 90.00 | -125.83 | 245.82 | -81.79 | -128.00 | 240.00 | -85.00 | 2.17 | 5.82 | 3.21 | 1.40 | 4.92 | 2.10

Fig. 12. Exemplary images of a measurement taken at a single robot stop: a – cropped and unwrapped image where the algorithm finds a potential landmark but the angle is larger than 15ᵒ, b – cropped and unwrapped image where the algorithm finds a potential landmark and the angle is less than 15ᵒ, c – images from the perspective camera with the marked landmark

the omnidirectional system. Sample images from the measurements are presented in Fig. 12. Results for the omnidirectional camera are shown in Tab. 1, where α denotes the measured angle and αg the known ground-truth angle. Final results for the perspective camera-based self-localization guided by the omnidirectional camera data are shown in Tab. 2, where xL, yL, αL describe the landmark position in the global frame, xR, yR, αR denote the computed robot's pose in the same global frame, xgR, ygR, αgR are the ground-truth coordinates of the robot, ΔxR, ΔyR, ΔαR define the absolute localization errors, and σxR, σyR, σαR the standard deviations of the localization measurements. Both tables contain average results from 10 runs along the same path. These results demonstrate that the system based on a combination of the omnidirectional camera and the perspective camera provides localization accuracy that is satisfactory for home environment navigation, and improves the results in comparison to a system using only the front-view camera.

6. Conclusions

This paper presents a new approach to mobile robot self-localization with passive visual landmarks. Owing to the hybrid field of view vision system, even a small and simple robot can use passive vision for global self-localization, achieving both high accuracy and robustness against problems that are common in vision-based navigation: occlusions, limited field of view of the camera, and limited range of landmark recognition. The proposed approach enables the use of low-cost hardware components and simplifies the image processing by avoiding full rectification and geometric correction of the images. The experiments conducted using a mobile robot demonstrated that the omnidirectional component can in most cases determine the viewing angle of a landmark with an accuracy better than 1ᵒ, using a partially rectified image. The positional accuracy of robot localization using the hybrid field of view system was in most cases better than 5 cm, which is satisfactory for home or


office navigation.

However, an omnidirectional camera that provides the single effective viewpoint geometry should allow us to extend the applications of the hybrid system beyond artificial landmarks. This is a matter of ongoing development. Another direction of further research is a model of measurement uncertainty for the omnidirectional camera. Such a model should enable optimal fusion of the localization data from both cameras (e.g. by means of Kalman filtering), and more efficient planning of the positioning actions [27].

ACKNOWLEDGEMENTS

This work was supported by the Poznań University of Technology, Faculty of Electrical Engineering, grant DS-MK-141 in the year 2015.

AUTHORS

Marta Rostkowska* – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland. E-mail: [email protected]

Piotr Skrzypczyński – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland. E-mail: piotr.skrzypczyński@put.poznan.pl

*Corresponding author

REFERENCES

[1] Adorni G., Bolognini L., Cagnoni S., Mordonini M., "A Non-traditional Omnidirectional Vision System with Stereo Capabilities for Autonomous Robots", LNCS 2175, Springer, Berlin, 2001, 344–355. DOI: 10.1007/3-540-45411-X_36.

[2] Bazin J., Catadioptric Vision for Robotic Applications, PhD Dissertation, Korea Advanced Institute of Science and Technology, Daejeon, 2010.

[3] Baczyk R., Kasinski A., "Visual simultaneous localisation and map-building supported by structured landmarks", Int. Journal of Applied Mathematics and Computer Science, vol. 20, no. 2, 2010, 281–293. DOI: 10.2478/amcs-2014-0043.

[4] Briggs A., Scharstein D., Braziunas D., Dima C., Wall P., "Mobile Robot Navigation Using Self-Similar Landmarks". In: Proc. IEEE Int. Conf. on Robotics and Automation, San Francisco, 2000, 1428–1434. DOI: 10.1109/ROBOT.2000.844798.

[5] Cagnoni S., Mordonini M., Mussi L., "Hybrid Stereo Sensor with Omnidirectional Vision Capabilities: Overview and Calibration Procedures". In: Proc. Int. Conf. on Image Analysis and Processing, Modena, 2007, 99–104. DOI: 10.1109/ICIAP.2007.4362764.

[6] DeSouza G., Kak A. C., "Vision for Mobile Robot Navigation: A Survey", IEEE Trans. on Pattern Anal. and Machine Intell., vol. 24, no. 2, 2002, 237–267. DOI: 10.1109/34.982903.

[7] Durrant-Whyte H. F., Bailey T., "Simultaneous localization and mapping (Part I)", IEEE Robotics & Automation Magazine, vol. 13, no. 2, 2006, 99–108. DOI: 10.1109/MRA.2006.1638022.

[8] Davison A., Reid I., Molton N., Stasse O., "MonoSLAM: Real-time single camera SLAM", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, 2007, 1052–1067. DOI: 10.1109/TPAMI.2007.1049.

[9] Fiala M., "Designing highly reliable fiducial markers", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 32, no. 7, 2010, 1317–1324. DOI: 10.1109/TPAMI.2009.146.

[10] Figat J., Kasprzak W., "NAO-mark vs. QR-code Recognition by NAO Robot Vision". In: Progress in Automation, Robotics and Measuring Techniques, vol. 2 Robotics (R. Szewczyk et al., eds.), AISC 351, Springer, Heidelberg, 2015, 55–64. DOI: 10.1007/978-3-319-15847-1_6.

[11] Lemaire T., Berger C., Jung I.-K., Lacroix S., "Vision-based SLAM: Stereo and monocular approaches", Int. Journal of Computer Vision, vol. 74, no. 3, 2007, 343–364. DOI: 10.1007/s11263-007-0042-3.

[12] Lin G., Chen X., "A robot indoor position and orientation method based on 2D barcode landmark", Journal of Computers, vol. 6, no. 6, 2011, 1191–1197. DOI: 10.4304/jcp.6.6.1191-1197.

[13] Lu F., Tian G., Zhou F., Xue Y., Song B., "Building an Intelligent Home Space for Service Robot Based on Multi-Pattern Information Model and Wireless Sensor Networks", Intelligent Control and Automation, vol. 3, no. 1, 2012, 90–97. DOI: 10.4236/ica.2012.31011.

[14] McCann E., Medvedev M., Brooks D., Saenko K., "Off the Grid: Self-Contained Landmarks for Improved Indoor Probabilistic Localization". In: Proc. IEEE Int. Conf. on Technologies for Practical Robot Applications, Woburn, 2013, 1–6. DOI: 10.1109/TePRA.2013.6556349.

[15] Martínez-Gomez J., Fernández-Caballero A., García-Varea I., Rodríguez L., Romero-Gonzalez C., "A Taxonomy of Vision Systems for Ground Mobile Robots", Int. Journal of Advanced Robotic Systems, vol. 11, 2014. DOI: 10.5772/58900.

[16] Menegatti E., Pagello E., "Cooperation between Omnidirectional Vision Agents and Perspective Vision Agents for Mobile Robots", Intelligent Autonomous Systems 7 (M. Gini et al., eds.), IOS Press, Amsterdam, 2002, 231–135.

[17] Potúcek I., Omni-directional image processing for human detection and tracking, PhD Dissertation, Brno University of Technology, Brno, 2006.

[18] Rahim N., Ayob M., Ismail A., Jamil S., "A comprehensive study of using 2D barcode for multi robot labelling and communication", Int. Journal on Advanced Science Engineering Information Technology, vol. 2, no. 1, 1998, 80–84.

[19] Rostkowska M., Topolski M., Skrzypczynski P., "A Modular Mobile Robot for Multi-Robot Applications", Pomiary Automatyka Robotyka, vol. 17, no. 2, 2013, 288–293.

[20] Rostkowska M., Topolski M., "Usability of matrix barcodes for mobile robots positioning", Postępy Robotyki, Prace Naukowe Politechniki Warszawskiej, Elektronika (K. Tchon, C. Zielinski, eds.), vol. 194, no. 2, 2014, 711–720. (in Polish)

[21] Rostkowska M., Topolski M., "On the Application of QR Codes for Robust Self-Localization of Mobile Robots in Various Application Scenarios". In: Progress in Automation, Robotics and Measuring Techniques (R. Szewczyk et al., eds.), AISC, Springer, Zürich, 2013, 243–252. DOI: 10.1007/978-3-319-15847-1_24.

[22] Rusdinar A., Kim J., Lee J., Kim S., "Implementation of real-time positioning system using extended Kalman filter and artificial landmarks on ceiling", Journal of Mechanical Science and Technology, vol. 26, no. 3, 2012, 949–958. DOI: 10.1007/s12206-011-1251-9.

[23] Scaramuzza D., Omnidirectional vision: from calibration to robot motion estimation, PhD Dissertation, ETH Zürich, 2008.

[24] Schmidt A., Kraft M., Fularz M., Domagala Z., "The comparison of point feature detectors and descriptors in the context of robot navigation", Journal of Automation, Mobile Robotics & Intelligent Systems, vol. 7, no. 1, 2013, 11–20.

[25] Siagian C., Itti L., "Biologically Inspired Mobile Robot Vision Localization", IEEE Trans. on Robotics, vol. 25, no. 4, 2009, 1552–3098. DOI: 10.1109/TRO.2009.2022424.

[26] Scharfenberger Ch. N., Panoramic Vision for Automotive Applications: From Image Rectification to Ambiance Monitoring and Driver Body Height Estimation, PhD Dissertation, Institute for Real-Time Computer Systems at the Munich University of Technology, Munich, 2010.

[27] Skrzypczynski P., "Uncertainty Models of the Vision Sensors in Mobile Robot Positioning", Int. Journal of Applied Mathematics and Computer Science, vol. 15, no. 1, 2005, 73–88.

[28] Skrzypczynski P., "Simultaneous Localization and Mapping: A Feature-Based Probabilistic Approach", Int. Journal of Applied Mathematics and Computer Science, vol. 19, no. 4, 2009, 575–588. DOI: 10.2478/v10006-009-0045-z.

[29] Yoon K.-J., Kweon I.-S., "Artificial Landmark Tracking Based on the Color Histogram". In: Proc. IEEE/RSJ Conf. on Intelligent Robots and Systems, Maui, 2001, 1918–1923. DOI: 10.1109/IROS.2001.976354.

[30] EmguCV, http://www.emgu.com/wiki/index.php/Main
[31] OpenCV Documentation, http://docs.opencv.org
[32] MessagingToolkit, http://platform.twit88.com

Journal of Automation, Mobile Robotics & Intelligent Systems VOLUME 9, N° 4 2015

Design and Movement Control of a 12-legged Mobile Robot

Jacek Rysiński, Bartłomiej Gola, Jerzy Kopeć

Submitted: 9th September 2015; accepted: 21st September 2015

DOI: 10.14313/JAMRIS_4-2015/31

Abstract: In the present paper, the design and performance of a 12-legged walking robot are described. A complete technical specification was developed for the proposed solution, and the stability of the robot's movements was analyzed. Communication between the robot and the operator is based on remote-control procedures performed by means of in-house software, written in versions for smartphones and for desktop computers. The desktop version has additional useful features, i.e. monitoring of the robot's working area via a wireless camera mounted on the front of the robot.

Keywords: mobile robot, design, control, kinematic analysis

1. Introduction

Specialized mobile robots are produced and utilized all over the world. Their typical range of applications includes: monitoring, repair routines, inspection of chemically contaminated (or contamination-threatened) areas, extinguishing fires, detecting and removing bombs, as well as various counter-terrorism actions. A separate but intensively developed area of application of such robots is the detection and removal of landmines and associated tasks [2], [3], [4]. Mobile inspection robots are additionally used for tasks performed in pits and coal mines, as well as in similar locations where human life (e.g. that of a machine operator) is in danger. In Poland, there are no robots designed and/or manufactured for such applications, although robots for anti-terrorist actions in particular are needed there, the demand for these devices being on par with other countries.

All these devices have one common feature, i.e. a movement/drive system which enables their locomotion.

The goal of the present work was to design and manufacture a walking robot together with its co-operating elements, e.g. a control system in which control routines are performed via a modern phone (iPhone), tablet, notebook or desktop computer.

2. Mechanical Subsystem of the Robot

Within the design phase of the DUODEPED robot, motion concepts based upon wheels or caterpillar tracks were excluded. Instead, the mechanism of special legs presented in 2005 in Austria by the Dutch physicist Theo Jansen was applied. The concept of the design solution is based upon simple geometric figures which are mutually connected by means of nodes (Fig. 1). The point marked in blue (1') is fixed on the rotation axis; the element which transmits power is marked in green (2'). The device is driven into a motion which enables the performance of consecutive steps via the node of the triangle which has contact with the ground.

The time of contact with the ground for the orange node (3') is equal to the time of a 120° rotation of the green node around the driving shaft. Taking into account that one full rotation is equivalent to 360°, and aiming for sufficient stability of the device, the number of leg pairs (Fig. 1c) assigned to one rotation cycle was set to 3. Their driving bands are therefore mounted on the driving shaft, fixed at every 120°. This design solution allows for perma-

a) b) c)

Fig. 1. Concept of robot legs


nent contact of the legs with the ground and, moreover, creates the basis for stable walking.
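The 120° phasing described above can be sketched numerically. The following snippet is an illustration only (not the DUODEPED firmware); the assumption that each leg set's contact window starts exactly at its own offset angle is ours:

```python
def legs_in_contact(crank_deg, n_sets=3, contact_arc=120.0):
    """Return the indices of the leg sets whose ground-contact arc
    contains the current crank angle.  Each of the n_sets leg sets is
    keyed to the driving shaft with a 360/n_sets degree offset and
    touches the ground for a 120-degree portion of the rotation."""
    spacing = 360.0 / n_sets
    active = []
    for k in range(n_sets):
        phase = (crank_deg - k * spacing) % 360.0  # angle inside this set's cycle
        if phase < contact_arc:
            active.append(k)
    return active
```

With three sets spaced 120° apart and a 120° contact arc, exactly one set per shaft is on the ground at every crank angle, which is what provides the continuous support described in the text.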

2.1. Legs, Their Number and Motion Trajectory

The robot legs consist of triangles connected by bands. Their special shape (geometrical properties) produces a leg movement trajectory in which, during a step, point P6 moves parallel to the ground, giving an impression of smooth motion as well as assuring the stability of the performed robot steps.

Each robot leg consists of 40 movable elements. To reduce friction, the legs are mutually separated by means of separators (washers) made of Teflon, 0.5 mm thick. Moreover, to reduce size, sliding bearings (brass rings mounted on shafts) were utilized instead of ball bearings.

2.2. Design Solution of the Robot Leg

Using the ideas described above, a design solution was proposed which assures a minimal number of support points. Twelve legs were therefore designed, grouped in four sets of three legs. The design has the property that, in a special case of forward motion and a particular arrangement of the legs, the number of supports increases to eight, while the remaining four legs are in the middle of the cycle of displacing the support point over the ground.

The design solution following Theo Jansen's idea allows permanent ground contact for a minimal number of legs. However, because the center of gravity is placed relatively high and the support points are distributed close to each other, the construction is not fully stable. To improve stability, the leg mounting points were spaced 194 mm apart (Fig. 3).

Due to the change of the leg fixing positions, the drive module was designed as a system of two co-operating crankshafts placed symmetrically (180°). Their consecutive cranks are mutually shifted by an angle of 120° around the main rotation axis. The drive system was coupled with the gear wheels of a gear with a ratio of 2.29. The advantages of the proposed 12-legged walking system are: high load capacity, relatively high velocity of motion, and the ability to overcome obstacles while preserving the stability of the whole structure.

2.3. Kinematical Analysis of the Robot Leg

The analysis of robot movement is performed under a few assumptions: every linkage is rigid (non-compliant), all kinematic pairs are free of backlash, and all movements are projected onto one plane. All activi-

Fig. 3. Design solution for fixing robot legs

Fig. 2. Kinematical scheme of single robot leg and trajectory of wheel axis


ties were performed using this kinematical scheme of the mechanism.

The first stage of the analysis was finding the configurations of the linkages for selected positions of the driving crank. The results are depicted in Fig. 5. The path of movement of point P6 (which corresponds to the axis of the ground wheel) was obtained by connecting the subsequent positions of point P6. Notice that this tra-

Fig. 4. Main dimensions of leg and its drive mechanism

Fig. 5. Single leg movement, for selected positions of crank


jectory was drawn up with respect to point Pc. An additional drawing (Fig. 6) shows positions of the 3D CAD model of the mechanism. Using this methodology, the design of the linkages was optimized, and the risk of part collisions and other design troubles was prevented.

The second stage of the analysis was determining the velocities in all joints of the robot leg. The graphical method was used, and the results are shown in Fig. 7 and Fig. 8.

The results of the analysis (i.e. the obtained shape of the trajectory, knowledge of the accurate positions of the linkages and the ground wheel, and the velocity plans) confirmed the selected design and facilitated its refinement. The velocities collected in Table 1 were determined for a crank angular velocity of ω0 = 5.09 rad/s.
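The graphical position analysis can also be reproduced numerically: in a Jansen-type linkage, each joint is the intersection of two circles whose radii are the link lengths. The sketch below uses illustrative dimensions (crank_r, rocker, coupler and pivot are hypothetical values chosen so the circles always meet, not the dimensions of Fig. 4) to trace a foot point over one crank revolution:

```python
import math

def circle_intersection(p0, r0, p1, r1, sign=1):
    """One of the two intersection points of the circles (p0, r0) and
    (p1, r1); 'sign' selects the branch.  Raises if they do not meet."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    d = math.hypot(dx, dy)
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        raise ValueError("circles do not intersect")
    a = (r0 ** 2 - r1 ** 2 + d ** 2) / (2 * d)  # distance from p0 along the centre line
    h = math.sqrt(max(r0 ** 2 - a ** 2, 0.0))   # perpendicular offset
    mx, my = p0[0] + a * dx / d, p0[1] + a * dy / d
    return (mx - sign * h * dy / d, my + sign * h * dx / d)

def foot_trajectory(crank_r=15.0, rocker=50.0, coupler=55.8,
                    pivot=(-38.0, -7.8), steps=36):
    """Trace a foot joint driven by a rotating crank: the joint lies at
    distance 'coupler' from the crank pin and 'rocker' from a fixed
    frame pivot, so each position is a circle-circle intersection."""
    path = []
    for i in range(steps):
        t = 2 * math.pi * i / steps
        pin = (crank_r * math.cos(t), crank_r * math.sin(t))
        path.append(circle_intersection(pin, coupler, pivot, rocker, sign=-1))
    return path
```

Applying the same construction joint by joint reproduces the configurations of Fig. 5; connecting the returned points yields a closed trajectory analogous to that of point P6.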

Figure 9 shows the selected position of the leg with marked centre of gravity (CG). The full path of mechanism motion was traced.

Fig. 6. Free positions of robot legs driving mechanism

Fig. 7. Velocities of joints for position +0° Fig. 8. Velocities of joints for position +180°


Figure 10 shows how the position of the gravity centre of the single-leg mechanism changes during motion. The simulation was performed for crank angles from 0 to 360 deg. The results were obtained numerically from the CAD model of the mechanism, under the assumption that the mechanism is planar.

According to the computer simulation, Figure 11 shows the obtained variability of the reduced mass moment of inertia with respect to the crank axis. This is the plot for a single leg.

2.4. Stability of Motion

Multi-leg constructions usually move with a statically stable gait. During such walking, the projection of the robot's center of gravity always lies inside the support polygon (Fig. 12a). The stability margin is defined as the distance between the projection of the center of gravity and an edge of the support polygon, measured along the current motion vector of the center of gravity (Fig. 12b).

For statically stable walking, the stability margin should not fall below a value called the minimal margin. The margin should be set so that all neglected dynamical effects and the action of external forces cannot cause a loss of robot stability. Ideally, this stability margin is determined experimentally. The velocities of motion of contemporary walking robots are relatively low, usually below several kilometers per hour (sometimes even below 1 kilometer per hour); therefore, the assumed simplifications are acceptable.
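The margin defined above reduces to a ray-edge intersection test: cast a ray from the CG ground projection along the motion direction and take the nearest crossing of the support-polygon boundary. A minimal sketch (foot positions and motion vector are supplied by the caller; this is an illustration, not the robot's software):

```python
def stability_margin(cg, direction, polygon):
    """Distance from the CG ground projection 'cg' to the support-polygon
    boundary, measured along the motion 'direction' (the static stability
    margin of the text).  Returns None if the ray hits no edge ahead of
    the CG, i.e. the CG projection lies outside the polygon."""
    best = None
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        ex, ey = bx - ax, by - ay                # edge vector
        dx, dy = direction
        denom = dx * ey - dy * ex                # ray x edge cross product
        if abs(denom) < 1e-12:                   # ray parallel to this edge
            continue
        t = ((ax - cg[0]) * ey - (ay - cg[1]) * ex) / denom  # along the ray
        u = ((ax - cg[0]) * dy - (ay - cg[1]) * dx) / denom  # along the edge
        if t >= 0.0 and 0.0 <= u <= 1.0:
            best = t if best is None else min(best, t)
    return best
```

Comparing the returned distance against the minimal margin decides whether the current 4-support phase is safe.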

Table 1. Sample numerical values of joint velocities [mm/s]

Velocity    Position 0°    Position 180°
v0          86.5           86.5
v1          46.3           102.1
v2          46.3           102.1
v3          0.0            0.0
v4          130.2          154.7
v5          121.6          98.8
v6          52.1           67.1

Fig. 9. Exemplary position of gravity centre for leg during the motion (angle coordinate 90°)

Fig. 10. Changes of gravity center of single leg mechanism

Fig. 11. Reduced mass moment of inertia with respect to crank axes

Fig. 12. a) Projection of the center of gravity during motion through an inclined slope, b) Static stability margin for 3-support walking


As can be seen, the stability margin criterion takes into account the configuration (shape) of the machine as well as the properties of the ground. The sufficient stability margin is defined taking into account a possible reduction of the expected support polygon due to the loss of one support point (loss of contact for one arbitrary leg). If the machine (robot) is supported by n legs, then, besides the proper polygon, other support polygons are created and considered in the prepared software. These polygons correspond to all possible phases (variants) of support by (n − 1) legs. The sufficient support polygon is built as the common area of all (n − 1)-leg polygons. The sufficient stability margin is measured in the way described above; it can therefore be considered for machines or robots having more than four legs.

The statically stable walking of the designed 12-legged robot consists of three 4-support phases (Fig. 13). This solution assures permanent contact of 4 legs with the ground, so there is no threat of loss of stability. The crankshafts which drive the robot legs are connected via the gear; in consequence, only one type of gait is possible, and its only variable is velocity. Due to the motion of the robot legs, the center of gravity slightly changes its position. The changes are small because, among other reasons, the legs' mass is relatively low in relation to the whole mass of the construction (20% of the total mass), and because the crankshafts are mutually rotated by 180°, so the legs of the opposite sides balance each other. The situation is slightly different for the stability margin, which varies essentially during the motion of the robot legs; this is caused by the changes of the distance between the legs' ground-contact points and the robot's center of gravity [8], [9], [10].

The measure of the energetic stability of a particular robot position is the minimal work which has to be done by any disturbing factor or action to destabilize the current position. It is the work needed to displace the robot's center of gravity to a new position in which the center of gravity lies in the vertical plane above an edge of the support polygon. Such a position is the marginally stable one (having ultimate stability). In consequence, any

a)

b)

Fig. 14. Layout / functional scheme

Fig. 13. a) L1 … L6, P1 … P6 – Designation of legs; b) diagram of a robot foot contact with the ground


smallest disturbance will cause the robot to overturn. This work is, in the physical sense, equal to the difference of the potential energy of the gravity center between the start and end positions of the robot.
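The energetic criterion has a compact closed form: rotating the robot about a support edge raises the CG from its current height to the length of the perpendicular lever arm between the edge line and the CG, so the minimal destabilizing work is the corresponding potential-energy difference. A sketch under that assumption (coordinates and masses below are illustrative):

```python
import math

def tipping_energy(cg, edge_a, edge_b, mass, g=9.81):
    """Minimal work needed to tip the robot over the support-polygon edge
    edge_a-edge_b: the potential-energy difference between the current
    pose and the pose with the CG vertically above the edge.  cg is the
    3-D centre of gravity; the edge points lie in the ground plane z = 0."""
    ex, ey = edge_b[0] - edge_a[0], edge_b[1] - edge_a[1]
    elen = math.hypot(ex, ey)
    ex, ey = ex / elen, ey / elen                   # unit vector along the edge
    rx, ry, rz = cg[0] - edge_a[0], cg[1] - edge_a[1], cg[2]
    along = rx * ex + ry * ey                       # CG component along the edge
    px, py, pz = rx - along * ex, ry - along * ey, rz
    lever = math.sqrt(px * px + py * py + pz * pz)  # perpendicular arm to the edge
    # rotating about the edge, the CG peaks at a height equal to the arm length
    return mass * g * (lever - cg[2])
```

The edge of the support polygon that minimizes this energy is the most likely tipping direction; a value of zero corresponds to the marginally stable position described above.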

3. Electronic Subsystem of the Walking Robot

Robot control routines should be intuitive and easy to perform for persons of different age groups. In general, control should be simple and, additionally, its visualization should be possible; nowadays, for some people, it is even unimaginable to design such a device without control visualization [7].

Aiming for comfort and ease of robot control, the control system was based upon wireless data transmission via Bluetooth. Thanks to this solution, no cables are needed, and remote control is possible from a relatively large distance, up to about 100 m in an open area, depending on the propagation of the radio waves. The robot can be controlled from a device running Android 2.3.6 or a later version, e.g. a smartphone or tablet, or from a device based on the Windows operating system, e.g. a notebook or desktop computer [6]. The only requirement is that the device is equipped with a Bluetooth module.

The central subsystem of DUODEPED is the PIC24HJ256GP610 microcontroller made by the Microchip company [1]. The control program was written in the C language and compiled with the C30 compiler. The microcontroller communicates over a UART serial link with the Bluetooth BTM-222 module, which provides the radio connection with the control device [4]. The functional scheme covering all elements, i.e. from the control device up to the motors, is presented in Fig. 14.

The robot control application was written using the Basic4Android software. It allows control of the robot in three working modes.

The first available control mode is the simplest one (Fig. 15a). It provides direction arrows, as well as hidden arrows responsible for motion along curves. Depending on the chosen direction, the robot moves ahead, turns back on the spot, or turns along an arc.

The second control mode (Fig. 15b) utilizes the accelerometer available in a smartphone. By tilting the device, information about the phone's orientation is transmitted. If the robot motors have been started with the START button, the robot performs motions in accordance with the tilt of the control device. The third control mode (Fig. 15c), also implemented on Android, utilizes the touch screen: taps on the screen make the robot move, provided that the START option, which powers the robot motors, has been selected earlier.
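The accelerometer mode can be illustrated by a simple tilt-to-drive mapping. The deadband, the 45° saturation angle and the differential mixing below are our assumptions for illustration, not the actual Basic4Android application logic:

```python
def tilt_to_drive(pitch_deg, roll_deg, max_speed=100, deadband=5.0):
    """Map phone tilt to (left, right) drive commands as percentages:
    pitch drives forward/backward, roll steers.  Angles inside the
    deadband are ignored; 45 degrees gives full speed."""
    def shape(angle):
        if abs(angle) < deadband:          # ignore small hand tremble
            return 0.0
        return max(-1.0, min(1.0, angle / 45.0))

    forward = shape(pitch_deg)
    turn = shape(roll_deg)
    left = max(-max_speed, min(max_speed, (forward + turn) * max_speed))
    right = max(-max_speed, min(max_speed, (forward - turn) * max_speed))
    return round(left), round(right)
```

Tilting the phone fully forward drives both sides at full speed, while pure roll produces an in-place turn, matching the behaviour described for the DUODEPED control modes.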

Robot control software for desktop computer (PC) or notebook.

The DUODEPED robot can also be controlled from a desktop computer or a notebook on which the inTOUCH software, version 10.1, is installed (Fig. 16).

a) b) c)

Fig. 15. Screens dedicated to the control system: a) manual control; b) control using accelerometer; c) control using touch screen

Fig. 16. Screen „Preview & Control”


Moreover, the DASMBSerial.2 component of the ArchestrA package provides the communication system used for control purposes.

After starting the application, a user must log in using one of the available accounts. The login options and passwords are gathered in the field „PIERWSZE KROKI” (first steps). For service activities, the password is hidden and is the same as the login.

The robot is controlled in an intuitive way. It is enough to log in as a user, and the „Preview & Control” button becomes active. This button opens the control window, in which we can view the measurement of the voltage of the battery mounted in the robot, as well as the measurement of the electric current flowing through the DC motor controller to the motors.

In the upper part of this window there is a monitoring field, shown in the right part of the discussed window. Moreover, monitoring is available visually, as a moving representation of the construction.

The visualization window covers distances from 0 up to 50 cm; beyond this distance, the pictogram of the construction on the visualization sub-window is placed at the right side of the control window. The obstacle detection option is active by default. Therefore, if during its motion the robot approaches an obstacle to a distance lower than the second threshold, its velocity in this direction is diminished. If the distance to the obstacle diminishes further and crosses threshold 1, the drive mechanism: (a) stops the robot, (b) changes the direction of rotation, and (c) moves the robot back with a low velocity (reduced to 20%). When threshold 2 is crossed again, i.e. when the robot is back at a safe distance, the robot stops and waits for further orders. To start a movement, the motors must first be activated, i.e. made ready to switch the power on. When the robot is not used for a long time, the motors should be deactivated for safety reasons, preventing accidental use of the robot at an unexpected moment.
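The two-threshold behaviour described above amounts to a small state machine. The threshold values (20 cm and 50 cm) and the 50% slow-down factor in this sketch are assumed for illustration; only the 20% retreat speed is taken from the text:

```python
STOP_ZONE = 20.0   # threshold 1 [cm] - assumed value
SLOW_ZONE = 50.0   # threshold 2 [cm] - assumed value

def obstacle_step(state, distance_cm, requested_speed):
    """One control tick of the obstacle-handling rule.  'state' is either
    'drive' or 'retreat'; returns (new_state, speed_command), where a
    negative speed means backing away at 20% power."""
    if state == 'retreat':
        if distance_cm >= SLOW_ZONE:       # back in the safe zone: stop, wait
            return 'drive', 0.0
        return 'retreat', -0.2 * abs(requested_speed)
    if distance_cm < STOP_ZONE:            # too close: stop and reverse
        return 'retreat', -0.2 * abs(requested_speed)
    if distance_cm < SLOW_ZONE:            # approach zone: reduce speed
        return 'drive', 0.5 * requested_speed
    return 'drive', requested_speed
```

Keeping the retreat state until threshold 2 is re-crossed gives the hysteresis that prevents the robot from oscillating at the boundary between the two zones.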

4. Final Remarks

Mobile robots are more and more frequently utilized by people for various purposes. Within recent years, mechanisms applied in robotized constructions, as well as autonomous robots, have been successfully used in the army, in medicine and in industrial plants. A dynamic development of robotics is observed, and people are increasingly interested in the design and programming of so-called personal assistant robots. All over the world, robots are used for versatile tasks, e.g. monitoring, production, safety and protection; it can be stated that nowadays robots are everywhere. Several times a year, presentation events and robot contests are organized in Poland and abroad, in which robots designed by private persons as well as by companies take part. It is worth noting that the number of participants of such tournaments increases very fast every year.

The manufactured 12-legged walking robot, DUODEPED, has won several prizes for the design itself and for its innovative design solutions. Its name is derived from the Latin words DUODE (twelve) and PEDES (legs). It is a new name which had not been used before, as can be confirmed via searches in popular web browsers. The construction is very solid, as confirmed by its ability to carry relatively high loads (approx. 100 kg). The robot has many admirers and a circle of fans, i.e. students as well as kids, who can play with it endlessly.

AUTHORS

Jacek Rysiński* – Faculty of Mechanical Engineering and Computer Science, University of Bielsko-Biala, Bielsko-Biała, Poland. E-mail: [email protected].

Bartłomiej Gola – Faculty of Mechanical Engineering and Computer Science, University of Bielsko-Biala, Bielsko-Biała, Poland. E-mail: [email protected].

Jerzy Kopeć – Faculty of Mechanical Engineering and Computer Science, University of Bielsko-Biala, Bielsko-Biała, Poland. E-mail: [email protected].

*Corresponding author

REFERENCES

[1] Di Jasio L., Programming 16-bit PIC Microcontrollers in C: Learning to Fly the PIC24, Newnes, Burlington 2007.

[2] Zielińska T., Walking machines, basics, mechanical design, control and biological aspects, Polish Scientific Publishers PWN, Warszawa 2003 (in Polish).

[3] Tchon K., Mazur A., Dulęba I., Hossa R., Muszynski R., Manipulators and mobile robots, Akademicka Oficyna Wydawnicza, Warszawa 2000 (in Polish).

Fig. 17. DUODEPED – 12-legged walking robot

[4] Giergiel M., Malka P., “Wireless communication systems in the control of robots”, Modelowanie Inżynierskie, Gliwice, vol. 36, 2008, 95–102 (in Polish).

[5] Maslowski A., „Intervention-inspection mobile robots”, IPPT PAN, Warszawa 1999 (in Polish).

[6] Frank A. W., Sen R., King Ch., “Android in action”, Helion S.A., Gliwice 2011 (in Polish).

[7] Pa P.S., Wu C.M., “Design hexapod robot with a servo control and man-machine interface”, Robotics and Computer-Integrated Manufacturing, vol. 28, 2012, 351–358. DOI: 10.1016/j.rcim.2011.10.005.

[8] Hasan A., Soyguder S., “Kinetic and dynamic analysis of hexapod walking-running-bounding gaits robot and control actions”, Computers and Electrical Engineering, vol. 38, 2012, 444–458. DOI: 10.1016/j.compeleceng.2011.10.008.

[9] Parhi D.R., Pradhan S.K., Panda A.K., Behera R.K., “The stable and precise motion control for multiple mobile robots”, Applied Soft Computing, vol. 9, 2009, 477–487. DOI: 10.1016/j.asoc.2008.04.017.

[10] Ferrell C., “A comparison of three insect-inspired locomotion controllers”, Robotics and Autonomous Systems, vol. 16, 1995, 132–159. DOI: 10.1016/0921-8890(95)00147-6.

[11] Pa P.S., “Design of a modular assembly of four-footed robots with multiple functions”, Robotics and Computer-Integrated Manufacturing, vol. 25, 2009, 804–809. DOI: 10.1016/j.rcim.2008.12.001.


ICT System Supporting the Water Networks Management by Means of Mathematical Modelling and Optimization Algorithms

Jan Studzinski

Submitted: 2nd September 2015; accepted: 24th September 2015

DOI: 10.14313/JAMRIS_4-2015/32

Abstract: In this paper, a concept of an integrated information system for the complex management of water networks is presented. The ICT system has been under development at the Systems Research Institute (IBS PAN) in Warsaw for a couple of years, and it is being gradually tested in several Polish communal waterworks of differentiated size. Several waterworks management tasks requiring mathematical modelling, optimization and approximation algorithms can be solved using this system. Static optimization and multi-criteria algorithms are used for solving the more complicated tasks, such as calibration of the water net hydraulic model, water net optimization and planning, control of pumps in the water net pump stations, etc. [4]. However, some of the management tasks are simpler and can be performed by means of repetitive simulation runs of the water net hydraulic model. Water net simulation, planning of the SCADA system, calculation of water age and chlorine concentration in the water net, localization of hidden water leaks occurring in the network, and planning of water net revitalization works are examples of such tasks executed by the ICT system. They are described in this paper.

Keywords: drinking water distribution system, water net hydraulic model, hydraulic optimization, water net management

1. Introduction

The current world trend in the computerization of waterworks is the implementation of integrated information systems for the complex management of whole enterprises or of their key objects, among them water networks, which is the simplest venture from the technical, organizational and financial points of view. An integrated management system for a communal water network usually consists of a GIS (Geographical Information System), a SCADA (Supervisory Control and Data Acquisition) system and a CIS (Customer Information System), which are tightly integrated with modeling, optimization and approximation algorithms [7]. Due to this tight cooperation of several programs, all water net management tasks concerning technical, organizational, administrative and economic problems can be executed automatically or with computer support [8]. Three essential goals that can be reached by the computer-aided management of municipal water networks are the reduction of costs and the simplification of waterworks operation, as well as improving the quality of the drinking water supplied to the city. The main problems connected with water network management are water losses caused by network damage, unsuitable water pressures at the end-user nodes caused by inappropriate operation of the pump stations installed on the network or by wrong planning of the water net, and bad quality of the produced water caused by incorrect control of the network or by inaccurate planning of the water net revitalization. All these problems can be solved in a relatively simple way by using new information technologies, and this idea has led to the concept of an integrated ICT system for the complex management of communal water networks. The system developed at IBS PAN is now being tested in some Polish waterworks.

2. ICT System Description

Following the mentioned trend in waterworks computerization, an integrated ICT system for complex water network management has been developed at the Systems Research Institute; its structure is shown in Fig. 1. The system is built in modular form and consists of the following components:
• GIS – for generating the numerical maps of the investigated water net;
• SCADA – for monitoring the water net parameters, i.e. pressures and flows of the water;
• CIS – for recording the water consumption of the end users of the water net;
• 20 computing programs with algorithms of mathematical modeling, optimization and approximation for solving the water net management tasks.

Fig. 1. Block diagram of the ICT system for water networks management

The components GIS, SCADA and CIS are adopted from other firms and integrated with the computing


programs via data files or data tables. The computing programs are responsible for the realization of all management tasks by means of the water net hydraulic model and optimization algorithms. Some functions realized by the programs are as follows:
1. Hydraulic modeling of water nets;
2. Optimal planning of SCADA systems for water nets;
3. Automatic calibration of hydraulic models;
4. Optimization and planning of water nets;
5. Control of pump stations in water nets;
6. Control of pumps installed in pump stations;
7. Detection and localization of leakage points in water nets;
8. Calculation of water age in water nets;
9. Calculation of chlorine concentration in water nets;
10. Planning of water net revitalization;
11. Control of network valves changing the distribution of water flows in water nets.

The programs realizing the functions specified above work with the water net hydraulic model and, while realizing the tasks concerning model calibration, water net optimization and planning, pump control and SCADA planning, they use a heuristic algorithm of multi-criteria optimization [6]. For the solution of the other water net management tasks, only multiple simulations of the hydraulic model under different working conditions of the water net are executed [8]. The functions realized in this way by the ICT system are:
12. Calculation of height coordinates for the water net nodes;
13. Drawing the maps of water flow and pressure distributions in water nets;
14. Drawing the maps of water net sensitivity to leakage events occurring in water nets;
15. Drawing the maps of water age distribution in water nets;
16. Drawing the maps of the distribution of chlorine concentration in water nets;
17. Drawing the maps of value distributions for some environmental parameters, like temperature, in the area of the water network.

The programs that realize these functions use kriging approximation algorithms, which make it possible to picture in graphical form the value distributions of parameters connected with water nets and their operation [1]. The last part of the management functions realized by the ICT system concerns the calculation of mathematical models for forecasting the hydraulic load of water nets and of their end-user nodes. This is done by means of the following time-series methods [2]:
18. The least squares method of Kalman;
19. The generalized least squares method of Clarke;
20. The maximum likelihood method.

Due to the cooperation of several programs while solving different management tasks, a synergy effect arises, which essentially boosts the efficiency of the running programs.

In the following, some algorithms supporting water net management and implemented in the ICT system are described.

3. Algorithms of Modelling and Optimization3.1. Hydraulic Model Calibration

The calibration procedure for water nets usually consists in changing the roughness values of the network pipes in such a way that the flows and pressures measured and calculated are as close as possible at the net points where the sensors of the SCADA system have been installed. This changing is normally done by hand, for in the waterworks there are no appropriate programs that could support this action by automatic computing [9]. The algorithm presented executes the calibration procedure in the three following steps:
1. Preparation of the initial data, consisting in division of all network pipes into groups depending on pipe diameter, age and material;
2. Changing the roughness of pipes with regard to the pipe groups and not to individual pipes;
3. If the roughness change in a group exceeds the given range of values, changing the nominal pipe diameters there; this change also occurs within a given range of values.
In this way the algorithm has two phases of calculation, regarding the roughness and the diameter changes, that follow one after another.
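The three calibration steps can be sketched as a search over one roughness multiplier per pipe group. The `simulate` function below is a hypothetical stand-in for a hydraulic solver, and plain random search stands in for the genetic algorithm actually used; bounds, values and the toy model are illustrative.

```python
import random

# Hypothetical stand-in for a hydraulic solver: maps one roughness
# value per pipe group to the predicted readings at the sensor points.
def simulate(group_roughness):
    g1, g2 = group_roughness
    return [0.8 * g1 + 0.2 * g2, 0.3 * g1 + 0.7 * g2]

MEASURED = [1.10, 1.25]  # flows/pressures reported by the SCADA sensors

def error(group_roughness):
    pred = simulate(group_roughness)
    return sum((p - m) ** 2 for p, m in zip(pred, MEASURED))

def calibrate(initial, bounds, iters=2000, seed=0):
    """Phase 1 of the algorithm: adjust roughness per pipe group.
    Phase 2, the diameter adjustment, would repeat the same loop over
    nominal diameters once roughness hits the bounds of its range."""
    rnd = random.Random(seed)
    best, best_err = list(initial), error(initial)
    for _ in range(iters):
        cand = [min(max(r + rnd.gauss(0, 0.05), lo), hi)
                for r, (lo, hi) in zip(best, bounds)]
        e = error(cand)
        if e < best_err:
            best, best_err = cand, e
    return best, best_err

best, err = calibrate([1.0, 1.0], [(0.5, 2.0), (0.5, 2.0)])
```

The objective is exactly the calibration criterion stated above: minimize the mismatch between measured and calculated values at the sensor locations.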

Fig. 2. Exemplary water net model calibrated

Fig. 3. Preparation of data for the exemplary water net model calibration

In Figures 2 and 3 the exemplary water net model and the data preparation for its calibration are shown. The net consists of 25 pipes of the same age, made of the same material. Measuring devices for flow and pressure are installed on two pipes and at two nodes of the net. In Fig. 3 one can see the diagrams of calculated and measured flow and pressure values


designed for 24 hours, shown for one pipe and one node before the calibration run. The pipes are divided into 2 groups with regard to their diameters. The calibration will change only the roughness values in the two pipe groups.

Fig. 4. Differences between measurement data (in grey) and calculation values (in green) in node 2 (pressure values – left) and in pipe 5 (flow values) before calibration

Fig. 5. Results of calibration shown for node 2 (pressure values – left) and pipe 5 (flow values)

The results of the calibration done by means of a genetic algorithm are shown in Fig. 5. One can see there that the pressure values in the node and the flow values in the pipe (the same node and pipe as before) are practically identical for the calculation results and for the measurement data, which allows one to consider the calibration algorithm very effective.

3.2. Water Net Hydraulic Optimization

Another algorithm supporting water net management concerns the hydraulic optimization of the water network by means of exchanging particular network pipes and/or controlling the pumps in the water intake stations or in the works raising the water pressure within the water net. In small and medium waterworks the calculation can be done for all pump stations in one run, for there are no more than several pumps in such enterprises. In the case of big waterworks the situation is more complicated: there are many pump stations, each with many pumps, and finding the control schemes for all devices simultaneously is practically impossible. Because of that, the proposed algorithm consists of two stages: in the 1st stage the controls are calculated for the pump stations seen as single generalized pumps, and in the 2nd stage the calculation is done for each pump station and for its pumps individually. Such a division of the hydraulic optimization task into two separate stages makes the problem solvable from the computational point of view.

In the following, the realization of the 2nd stage of the algorithm for a real pump station of a Polish waterworks is described. In this object only 1 pump works, and 2 pipes ending in 2 nodes go out from it. The pressure values in these nodes are too small compared with the values calculated in the 1st stage of the algorithm. The problem is to find pipes with new diameters and to calculate the pump velocity in such a way that the obtained node pressures fit into the value ranges indicated earlier.

Fig. 6. View of the calculated pump station and the parameters of the pump to be controlled

Fig. 7. Parameters of the two nodes being the outputs of the calculated pump station

In Figures 6 and 7 the scheme of the pipe connections in the investigated pump station and the characteristics of the pump and of the nodes concerned are shown. In Figures 8 and 9 the screens of the program developed at the IBS PAN, prepared for introducing the input data for changing the pump velocity and the pipe diameters, are shown [5]. In this example the pump velocity can be changed in the range between 60% and 100% of its nominal speed, the new pipes can have diameters between 100 mm and 1200 mm, and the acceptable or preferable node pressures lie between 20 m and 70 m or 28 m and 36 m for one node, and between 10 m and 70 m or 37 m and 45 m for the other one, respectively. To solve the problem a genetic optimization algorithm using fuzzy sets while calculating the node pressures is applied; the fuzzy sets serve to make a distinction between the acceptable and the preferable value ranges of the node pressures.
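The acceptable/preferable distinction can be modelled with a trapezoidal membership function. The sketch below uses the bounds of the first node from the example above (20–70 m acceptable, 28–36 m preferred); the trapezoidal shape itself is an assumption, not necessarily the fuzzy sets used in the program.

```python
def pressure_fitness(p, acc_lo=20.0, pref_lo=28.0, pref_hi=36.0, acc_hi=70.0):
    """Trapezoidal membership: 0 outside the acceptable range,
    1 inside the preferred range, linear in between."""
    if p <= acc_lo or p >= acc_hi:
        return 0.0
    if pref_lo <= p <= pref_hi:
        return 1.0
    if p < pref_lo:
        return (p - acc_lo) / (pref_lo - acc_lo)
    return (acc_hi - p) / (acc_hi - pref_hi)
```

Such a membership value can enter the fitness of a genetic-algorithm individual directly, rewarding preferred pressures over merely acceptable ones.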

Fig. 8. Preparation of input data for the pump control

Fig. 9. Preparation of input data for the exchange of pipes

In Fig. 10 the results of the hydraulic optimization made for the single pump station are given. The pressure values at the end nodes of the pump station rose from 11.82 m to 35.86 m and to 37.24 m, respectively, which is the consequence of the change of the pipe diameters from 600 mm to 488 mm for one pipe and from 800 mm to 1048 mm for the other one, and of the change of the pump velocity from 76% to 89% of its nominal speed.

Fig. 10. Results of the hydraulic optimization calculated for the pump station

3.3. Water Net Revitalization

Water net revitalization belongs to the planning tasks, which can be divided into 3 kinds: hydraulic optimization, designing new networks or extending old ones, and revitalization or renovation. In the first two kinds of tasks, computer simulation of the water net hydraulic model as well as optimization algorithms must be used to secure the right hydraulic conditions of water net operation, meaning adequate water pressures in the end user nodes of the network and possibly high water velocities in the network pipes. In the case of revitalization the network works correctly from the hydraulic point of view, and the reason to undertake the action is the old age of water net objects, mostly pipes, or their bad technical state causing a risk of failures. In older municipal waterworks the susceptibility of water nets to accidents can cause water losses reaching up to 30% of the water production, which means essential financial losses for the enterprise [3].

In the presented algorithm the revitalization task means the exchange of several pipes in the water net, because of their bad technical state, for pipes with the same diameters. When planning the revitalization with such an approach, only multiple simulation runs of the network hydraulic model are needed, for the exchange of old pipes for new ones with reduced roughness values does not worsen but improves the hydraulic conditions of the water net. The goal of the algorithm is to reduce the liability of the network to break down and, as a result, to reduce the potential water losses in the water net.

While planning the revitalization one must decide which pipes are to be exchanged so as to minimize the water net susceptibility to accidents and at the same time to secure proper functioning of the whole network. The following factors are taken into consideration when choosing the set of pipes to be replaced:
• Technical state of the pipes, characterized by their roughness.
• Current durability of the pipes, calculated as the difference between the year of pipe construction and the normative pipe durability.
• Pipe liability to break down, in percent, defined on the basis of historical data concerning pipe damages.
• Risk of water losses, calculated as the pressure in the pipe modified by the pipe diameter: p * (1 + d/500).
• Costs of the pipe revitalization, which consist of two components: the cost of pipe installation and the cost of buying the new pipes.
After the indicators are calculated for all pipes, a ranking list is prepared according to diminishing indicator values. Depending on the financial funds at the management's disposal, one can choose the set of pipes for exchange, taking the pipes from the top of the ranking list and summing the costs of their revitalization up to the funds limit.

In order to select the pipes for revitalization from the whole set of water net pipes, the revitalization indicator is calculated from the following formula:

IR = wc * Cn + wt * (1.0 - Tn) + wa * An + ws * Sn (1)

where wc, wt, wa and ws are weight coefficients, Cn is the pipe roughness, Tn the current pipe durability, An the pipe liability to break down and Sn the risk of water losses defined for the pipe concerned. The weight coefficients can be chosen arbitrarily by the program user, and all factors in the formula are normalized to the standardized range of values from 0.0 to 1.0.
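Formula (1) together with the budget-limited ranking described above can be sketched as follows; the weight values and the pipe records are illustrative, and the normalization of all factors to [0.0, 1.0] is assumed to have been done beforehand.

```python
def revitalization_indicator(C_n, T_n, A_n, S_n,
                             w_c=0.3, w_t=0.3, w_a=0.2, w_s=0.2):
    """IR = wc*Cn + wt*(1.0 - Tn) + wa*An + ws*Sn, formula (1);
    all factors already normalized to [0.0, 1.0]; weights assumed."""
    return w_c * C_n + w_t * (1.0 - T_n) + w_a * A_n + w_s * S_n

def select_pipes(pipes, budget):
    """Rank pipes by diminishing IR and take them from the top of the
    list until the revitalization costs exhaust the budget."""
    ranked = sorted(pipes, key=lambda p: revitalization_indicator(
        p["C"], p["T"], p["A"], p["S"]), reverse=True)
    chosen, spent = [], 0.0
    for p in ranked:
        if spent + p["cost"] > budget:
            break
        chosen.append(p["id"])
        spent += p["cost"]
    return chosen

# Illustrative pipe records (factors normalized, costs hypothetical)
pipes = [
    {"id": 1, "C": 0.9, "T": 0.1, "A": 0.8, "S": 0.7, "cost": 50.0},
    {"id": 2, "C": 0.2, "T": 0.9, "A": 0.1, "S": 0.1, "cost": 50.0},
    {"id": 3, "C": 0.8, "T": 0.2, "A": 0.6, "S": 0.5, "cost": 60.0},
]
chosen = select_pipes(pipes, budget=110.0)
```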

When the pipes to be exchanged are selected, the effects of the planned revitalization can be verified by performing the hydraulic calculation for the whole water net with roughness values equal to zero for the selected pipes. After the revitalization action is done, the vulnerability of the water net to accidents will be reduced, and the water pressures in some end user nodes as well as the flow velocities in some pipes will increase.

Fig. 11. Water net investigated before (up) and after (down) hydraulic calculation

Fig. 12. Pressure (up) and flow (down) distributions in the water net after its hydraulic calculation

The hydraulic graph of the investigated water net before and after the hydraulic calculation is shown in Fig. 11. The network is supplied with water by 2 pump stations located at its bottom-left side. At the right side of the network, and above, 2 retention tanks are installed. The graph of the water net consists in total of 280 nodes and 398 pipes. The distributions of flows and pressures in the water net before the revitalization action are shown in Fig. 12. The flow and pressure values are highest in the areas where the pump stations and tanks are situated.

Fig. 13. The graph of the water net with the pipes selected for revitalization

In Fig. 13 the pipes selected for exchange are marked in green. According to formula (1) and the assumed data concerning all relevant factors, 31 pipes out of 398, i.e. 8% of the whole, have been taken for replacement.

The effects of the revitalization, after performing the hydraulic calculation for the whole water net with roughness values equal to zero for the selected pipes, are shown in Figures 14 and 15. In Fig. 14 the curves obtained before the revitalization are marked in blue.

Fig. 14. Comparison of water flows (up) and pressures (down) before and after the water net revitalization performed for 31 pipes

One can see from Figures 14 and 15 that, in accordance with expectations, the values of pressures and flows in the water net have increased after the revitalization. Nevertheless, the changes of the pressure values are very small and insignificant compared with the changes of the flows. In the latter case not only the values but also the flow directions changed as a result of the revitalization.

Fig. 15. Pressure (up) and flow (down) distributions in the water net after its revitalization performed for 31 pipes

Fig. 16. Comparison of water flows (up) and pressures (down) before and after the water net revitalization performed for all pipes

Table 1. Comparison of flow and pressure values before and after the water net revitalization performed for all pipes

Nr | Flow before revital. | Flow after revital. | Pressure before revital. | Pressure after revital.
 1 |  10.1323 |  16.2095 | 20.90 | 20.91
 2 | -15.6321 | -19.6183 | 30.40 | 30.45
 3 |  17.9016 |  29.9229 | 29.31 | 29.36
 4 |  28.0094 |  37.7194 | 31.40 | 31.44
 5 | -22.4987 | -34.7968 | 33.70 | 33.73
 6 |   4.5943 |  13.1449 | 35.10 | 35.20
 7 |   0.0599 |   9.8470 | 35.09 | 35.20
 8 |   2.3047 |   5.8978 | 32.11 | 32.17
 9 | -17.4930 | -22.9548 | 35.04 | 35.16
10 |   7.7127 |   5.7549 | 39.02 | 39.15
11 | -38.4735 | -44.7111 | 37.10 | 37.20
12 | -49.7825 | -62.1092 | 37.11 | 37.20
13 |   5.1874 |   8.3333 | 36.11 | 36.21
14 | -10.1803 |  -6.5060 | 36.41 | 36.54
15 |   1.8669 |  11.0067 | 37.21 | 37.32
16 |  -9.3078 | -17.5968 | 39.12 | 39.22
17 |  13.2529 |  14.9158 | 40.09 | 40.26
18 |  -9.9838 | -10.2465 | 31.07 | 31.23
19 |  10.4993 |  13.3058 | 36.59 | 36.71
20 |  -6.1282 |   3.0865 | 38.10 | 38.20
21 |  -9.2890 |   4.3611 | 32.11 | 32.24
22 | -10.7361 |   3.4721 | 36.69 | 36.81
23 | 274.2455 | 341.5322 | 42.14 | 42.29
24 | 256.7525 | 318.5774 | 32.48 | 32.56
25 | 175.2549 | 224.9842 | 20.90 | 20.91


To show this better, in another step of revitalization all pipes of the water net have been replaced. The hydraulic results obtained are shown in Fig. 16 and in Table 1 for exemplary pipes and nodes. Once again one can see that the pressure values increased, but to a very small and practically marginal degree. In contrast, the flows increased their values essentially, and also the flow directions changed in many pipes.

4. Conclusions

In the paper some algorithms supporting the management of municipal water networks have been presented. Among the many algorithms developed for waterworks there are several that use in their calculations only the hydraulic model of the water net; with simulation runs of this model several useful management tasks can be realized. These tasks are connected only with planning the water net, like the SCADA planning and revitalization algorithms, and with informing about the water net functioning, like the calculations of network hydraulics, water age and chlorine concentration, but they are nevertheless important for correct water net operation. More complicated tasks, like calibration of the water net hydraulic model, water net optimization, or pump and tank control, need for their solution more sophisticated methods like multi-criteria optimization algorithms. An important condition of effective operation of the algorithms described is, however, their use in strict cooperation with GIS and SCADA systems within a unified ICT system. Such a solution is more expensive than the individual use of water net hydraulic models alone, but it ensures that the management tasks will be done fast, easily, suitably and faultlessly. Such a system for waterworks has for a long time been under development at the Systems Research Institute of the Polish Academy of Sciences, and some versions of it have already been made and tested in some communal waterworks in Poland.

AUTHOR

Jan Studziński – Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01–447 Warszawa, Poland. E-mail: [email protected].

REFERENCES

[1] Bogdan L., Studzinski J., "Modeling of water pressure distribution in water nets using the kriging algorithms". In: Industrial Simulation Conference ISC'2007 (J. Ottjes and H. Vecke, eds.), TU Delft, Delft, Netherlands, 2007, 52–56.

[2] Hryniewicz O., Studzinski J., "Development of computer science tools for solving the environmental engineering problems". In: Enviroinfo'2006 Conference, Graz, 2006.

[3] Saegrov S., Care-W – Computer Aided Rehabilitation for Water Networks, IWA Publishing, Alliance House, London, 2005.

[4] Stachura M., Fajdek B., Studzinski J., "Model based decision support system for communal water networks". In: ISC'2012 Conference, Brno, 2012.

[5] Sluzalec A., Studzinski J., Ziolkowski A., "MOSKAN-W – the web application for modelling and designing of water supply system". In: Simulation in Umwelt- und Geowissenschaften, Reihe: Umweltinformatik, ASIM-Mitteilung AM 150, Workshop Osnabrück 2014 (J. Wittmann, Hrsg.), Shaker Verlag, Aachen 2014, 143–153.

[6] Straubel R., Holznagel B., "Mehrkriteriale Optimierung für Planung und Steuerung von Trink- und Abwasser-Verbundsystemen". In: Wasser•Abwasser, 140, no. 3, 1999, 191–196.

[7] Studzinski J., "Computer aided management of waterworks". In: Proceedings of QRM'2007 (R.A. Thomas, ed.), Oxford 2007, 254–258.

[8] Studzinski J., "Rechnerunterstützte Entscheidungshilfe für kommunale Wasserwerke mittels mathematischer Modelle, Kriging-Approximation und Optimierung". In: Modellierung und Simulation von Ökosystemen, Workshop Kölpinsee (A. Gnauck, Hrsg.), Shaker Verlag, Aachen 2012.

[9] Wojtowicz P., Pawlak A., Studzinski J., "Preliminary results of hydraulic modelling and calibration of the Upper Silesian Waterworks in Poland". In: 11th International Conference on Hydroinformatics HIC 2014, New York City, USA, 2014.


Development of Graphene Based Flow Sensor

Adam Kowalski, Marcin Safinowski, Roman Szewczyk, Wojciech Winiarski

Submitted: 31st August 2015; accepted: 22nd September 2015

DOI: 10.14313/JAMRIS_4-2015/33

Abstract: This paper presents research on a flow sensor based on graphene. The presented results show a linear relation between the voltage induced on the graphene layer and the flow velocity. The measurements show that the signal level is relatively low and that it is highly correlated with the time the sample has been submerged in water. A significant temperature dependency has been shown, which indicates the necessity to develop a compensation system for the sensor. Moreover, the induced voltage is related to the ion concentration of the liquid, so the sensor must be recalibrated for every working environment. The most important finding of the research is that, although the voltage signal itself is highly inconsistent, the difference between its value in the steady state and for flowing liquid is always visible and correlated with the flow value – this property can be used in further development. A big advantage of the sensor is also its scalability, which opens so far unexplored possibilities of applications.

Keywords: graphene, flow, sensor, voltage

1. Introduction

Graphene is characterized by a wide range of remarkable properties, both electrical and mechanical, which makes it a very promising material in many branches of technology. It is an excellent current [1] and heat [2] conductor, and owing to its atypical dispersion relation [3] it provides electron flow with about 1/300 of the speed of light. Despite its very small thickness, its tensile strength is more than a hundred times higher than that of construction steel or Kevlar [4].

Probably the biggest and most remarkable Polish achievement in this field is the development of an innovative manufacturing technology. In 2011 a team led by professor Włodzimierz Strupiński from Instytut Technologii Materiałów Elektronicznych (Institute of Electronic Materials Technology) invented a method of producing thin graphene layers on SiC [6], which was granted a patent the very same year. The presented results are the effect of the research made in the FlowGraf project, whose final purpose is to design, build and deploy a flow sensor based on graphene. The underlying research focused on examining the influence of various factors on the voltage induced in the graphene sample, the main one being the velocity of the flowing liquid, and the others being quantities which can disturb the consistency of the voltage level: in this case temperature and concentration of sodium chloride.

The currently known methods of flow measurement (e.g. ultrasonic, electromagnetic, Coriolis, vortex, etc.) do not provide proper measurement of liquid flow at low speeds.

The research showed that the graphene sensor can be used in the measurement of low flow rates.

2. Possibility of Using Graphene as a Part of Flow Sensor

A flow sensor based on graphene has to meet several requirements, i.a.:
a) the induced voltage is related to the flow velocity,
b) changes of the voltage level are consistent and mathematically describable in a relatively simple manner,
c) the signal dynamics is high enough – the sensor is reasonably sensitive and can work in a wide range of flow velocities,
d) the signal-to-noise ratio is high enough.
Proper measurements, needed to check how the sensor meets the requirements mentioned above, have

Fig. 1. Laboratory stand for graphene sensor measurements


been carried out at Przemysłowy Instytut Automatyki i Pomiarów (Industrial Research Institute for Automation and Measurements) in Warsaw, on a laboratory stand built specifically for that purpose (Fig. 1).

The stand made it possible to control the flow velocity using a proportional valve, to stop the flow at either the inlet or the outlet of the tube using on-off valves, and to examine the behavior of the sensor (mounted as shown in Fig. 2).

Fig. 2. Mounting of graphene sensor in the tube

Fig. 3. Transient of voltage for different flow velocities

The first thing that needed to be determined was the relation between the flow velocity and the voltage signal. It was examined using once-deionized water. During the research an attempt was made to simulate a differential system – using the proportional valve a certain flow value was set and the measurement was made, and then the same measurement was repeated in the steady state, when the water flow was stopped. An exemplary voltage transient is shown in Fig. 3.

As can be seen, the signal level is higher for flowing liquid than in the steady state. Another thing to note is that the signal for non-zero flow becomes stable after a certain amount of time, which is probably associated with inertial effects – charging and discharging of the capacitance of the sample or fluctuations of the flow velocity. The liquid velocity was estimated based on the indications of a reference flow sensor; from this the relation between the induced voltage and the velocity was obtained for two series of measurements (Fig. 4).

The voltage level is higher for the 1st series (black) than for the 2nd one (blue), which is related to a constant drop of the voltage in time. It shows again the phenomenon of discharging of the graphene layer: every relation determined one after another, with the sample constantly submerged in liquid, will be lower than the previous one. The results presented above, as well as others obtained during our work, were inconsistent as far as the voltage level is concerned, but the sensitivity is always of the same order of magnitude – about 10 nV/(mm/s) – and there is always a visible and measurable difference between the signal level for flowing liquid and in the steady state, which can be useful in further research (see Conclusions).
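The differential reading described above can be turned into a velocity estimate by dividing the flow/steady-state voltage difference by the sensitivity. The 10 nV/(mm/s) figure is only the order of magnitude reported here; in practice it would have to be recalibrated per liquid and per session.

```python
SENSITIVITY_NV_PER_MM_S = 10.0  # ~order of magnitude from the measurements

def estimate_velocity(u_flow_nv, u_steady_nv,
                      sensitivity=SENSITIVITY_NV_PER_MM_S):
    """Estimate flow velocity [mm/s] from the difference between the
    voltage with flowing liquid and the steady-state voltage [nV].
    The differential scheme cancels the slow drift (layer discharge)
    common to both readings."""
    return (u_flow_nv - u_steady_nv) / sensitivity
```

The design choice here mirrors the experiment: only the difference is used, since the absolute level drifts as the layer discharges.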

3. Influence of Liquid Characteristics on Electrical Signal Generation

3.1. Influence of Temperature

In order to determine how temperature influences the voltage of the graphene sample, the liquid was heated up to a certain temperature and flowed through the sample at a constant flow rate. The research was made for temperatures within the 20–47 °C range, every 3–4 °C, and the voltage was measured after reaching the desired temperature, which resulted in a voltage-temperature relation (Fig. 5).

The voltage difference increases with temperature, which can be explained by the growth of charge mobility resulting in higher potential differences in the graphene layer. The order of magnitude of the temperature sensitivity can be estimated as δT = 100 nV/°C. The voltage changes caused by temperature are not at all negligible – a change of temperature by 1 °C causes a change of voltage similar to that caused by a change of flow velocity by 10 mm/s. This shows that a final flow sensor construction requires temperature compensating systems (e.g. thermistors) or taking temperature into account in the software algorithm assigning flow velocities to voltage values.
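The software variant of the compensation mentioned above can be sketched as subtracting the temperature term before converting voltage to velocity. The linear model and the reference temperature of 20 °C are assumptions, consistent with the ~100 nV/°C and ~10 nV/(mm/s) sensitivities estimated in this work.

```python
TEMP_SENSITIVITY_NV_PER_C = 100.0    # ~δT estimated above (assumed linear)
FLOW_SENSITIVITY_NV_PER_MM_S = 10.0  # ~flow sensitivity from Section 2
T_REF_C = 20.0                       # assumed calibration temperature

def compensated_velocity(u_diff_nv, temp_c):
    """Remove the linear temperature contribution from the differential
    voltage [nV], then convert the remainder to velocity [mm/s]."""
    u_corrected = u_diff_nv - TEMP_SENSITIVITY_NV_PER_C * (temp_c - T_REF_C)
    return u_corrected / FLOW_SENSITIVITY_NV_PER_MM_S
```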

Fig. 4. Relation between voltage and liquid velocity

Fig. 5. Voltage-temperature relation for differential voltage


3.2. Influence of Sodium Chloride Concentration

The flow sensor is expected to work also with water solutions of various compounds. Thus, we examined how the concentration influences the voltage level. In this research a different sample was used, so the results are not directly comparable to the ones presented before, but they give some knowledge about the scale of the phenomenon. We used sodium chloride solutions with concentrations between 0 and 3%, in steps of 0.3%. The results are shown in Fig. 6.

Again we obtained a constant sensitivity, of the order of 100 µV/%NaCl. We can try to compare it to the other results by scaling. For this sample the voltage is of the order of 0.1–1 mV, while before it was 0.01 mV, so 1–2 orders of magnitude higher here. We can therefore estimate that for the sample examined before the sensitivity would be of the order of 1–10 µV/%NaCl. This is a very significant value compared with the previously estimated sensitivity values for velocity and temperature, which shows that the sensor behaves differently for every liquid, depending on its concentration.

4. Conclusion and Further Research Directions

It turned out that the voltage changes in the graphene sensor caused by liquid flow are inconsistent. On the other hand, there is always a significant change of the voltage level, which can indicate whether there is a flow of liquid or not, or whether the flow increased or decreased, because the direction of the change always stays the same. This feature has already been used at the Industrial Research Institute for Automation and Measurement: the graphene sensor is used at one of the laboratory stands as a leak detector.

Further work is planned on the commercial application of the leak sensor. Recipients of such a leak detector can be valve manufacturers and water companies such as APATOR Powogaz S.A., Broen S.A., Gazomet Sp. z o.o., Norson, MPWiK Wrocław, etc.

A big advantage of the presented sensor is its size – flowmeters already present on the market are very big, which automatically excludes many applications. The graphene sensor is easily scalable and, supported by relevant research, could be used in micro scale – for instance in the human circulatory system to detect and prevent blood congestions.

ACKNOWLEDGEMENT

This work has been supported by the National Centre for Research and Development (NCBiR) within the GRAF-TECH programme (no. GRAF-TECH/NCBR/02/19/2012), project "Graphene based, active flow sensors" (acronym FlowGraf).

AUTHORS

Adam Kowalski, Marcin Safinowski*, Roman Szewczyk, Wojciech Winiarski – Industrial Research Institute for Automation and Measurements PIAP, Warsaw, Poland. E-mails: [email protected], msafinowski, wwiniarski, [email protected].

*Corresponding author

REFERENCES

[1] Wallace P. R., "The band theory of graphite", Physical Review, vol. 71, no. 9, 1947, 622. DOI: dx.doi.org/10.1103/PhysRev.71.622.

[2] Murali R., Yang Y., Brenner K., Beck T., Meindl J. D., "Breakdown current density of graphene nanoribbons", Applied Physics Letters, 94, 2009, 243114.

[3] Ghosh S., Calizo I., Teweldebrhan D., Pokatilov E. P., Nika D. L., Balandin A. A.,

[4] Bao W., Miao F., Lau C. N., "Extremely high thermal conductivity of graphene: Prospects for thermal management applications in nanoelectronic circuits", Applied Physics Letters, vol. 92, no. 15, 2008, 151911. DOI: dx.doi.org/10.1063/1.2907977.

[5] Taisuke O., Bostwick A., Seyller T., Horn K., Rotenberg E., "Controlling the electronic structure of bilayer graphene", Science, 301, 2006, 952.

[6] Strupinski W., “Graphene epitaxy by chemical vapor deposition on SiC”, Nano Letters, vol. 11, no. 4, 2011, 1786. DOI: dx.doi.org/10.1021/nl200390e.

Fig. 6. Relation between voltage and sodium chloride concentration

Fig. 7. Measuring station for testing the graphene leak sensors (components labelled in the figure: hydraulic press, test valve, leak converter, two FG leak sensors)


Multiaspect Text Categorization Problem Solving: A Nearest Neighbours Classifier Based Approach

Submitted: 20th September 2015; accepted: 26th October 2015

Sławomir Zadrożny, Janusz Kacprzyk, Marek Gajewski

DOI: 10.14313/JAMRIS_4-2015/34

Abstract: We deal with the problem of the multiaspect text categorization, which calls for the classification of documents with respect to two, in a sense, orthogonal sets of categories. We briefly define the problem, mainly referring to our previous work, and study the application of the k-nearest neighbours algorithm. We propose a new technique meant to enhance the effectiveness of this algorithm when applied to the problem in question. We show some experimental results confirming the usefulness of the proposed approach.

Keywords: text categorization, intelligent system, nearest neighbour classifiers, topic tracking and detection, fuzzy majority

1. Introduction

An important feature desired for intelligent systems is the capability to deal with textual information. Despite many efforts and success stories, this area still poses many challenges to the research community. Natural language processing is an example of a domain where much has been achieved, but the machines are still behind a human being and his capability to understand text in its full meaning. Even the domain of information retrieval, setting for itself more modest goals with respect to textual information processing, calls for further research to address the tremendous growth of information to be processed, as well as the ambition to assist a human user in tackling more and more complex problems, so far reserved for a human being. In this paper we study one of such problems, motivated by some real life applications, and try to propose and extend some well known techniques to deal with it.

Our starting point is the concept of the multiaspect text categorization (MTC) which we introduced earlier in a series of papers [9, 22, 23, 25]. The motivation is a real, practical problem of managing collections of documents for the purposes of an organization, notably a public institution, which has to be carried out following formal regulations imposed by the state. A part of this problem is the well-known concept of text categorization (TC) [16], and thus relevant techniques and tools are readily applicable. Another part is, however, more challenging. Although it also can be interpreted as a TC problem, its characteristics make it a more difficult task – first of all due to a limited number of training documents available, but also due to the different motives underlying the grouping of documents.

We have studied the MTC problem in a number of papers, cited above, and proposed some solutions to it. Here we study the use of the k nearest neighbours classifier (k-nn) and propose a new algorithm inspired by this study. The starting point is the study of Yang et al. [20], which concerns a similar problem of topic detection and tracking (TDT) [1] and proposes some extensions to the basic k-nn algorithm in order to deal properly with the specificity of the problem at hand.

The structure of this paper is the following. The next section briefly introduces the MTC problem. In Section 3 we recall the work of Yang et al. on the use of the k-nn classifier for the purposes of the TDT problem solution and their extensions to the basic algorithm. In subsection 3.2 we present our algorithm, inspired by the work of Yang et al. and combining somehow the paradigms of the nearest neighbour classifier and the profile based classifiers [16]. Section 4 shows the results of our computational experiments meant to compare the discussed methods, and Section 5 concludes and discusses some ideas for further research.

2. MTC Problem Descrip on2.1. The Problem

The multiaspect text categorization (MTC) problem may be considered as a twofold standard multiclass single-label classification. Thus, a collection of documents is assumed:

D = {d1, . . . , dn}   (1)

These documents are, on one hand, assigned to the set of predefined categories

C = {c1, . . . , ct}   (2)

On the other hand, they are also assigned to sequences of documents, referred to as cases, within their own categories. The cases, generally, are not predefined and are established based on the documents arriving to the classification system. We will assume that at the beginning there are some cases already formed. Some of them may be treated as closed, i.e., no new document should be assigned to them, and some of them are on-going, i.e., they are the candidates for the new documents to be assigned (classified) to. Each document d belongs to exactly one category and one case within this category.

The cases will be denoted as σ and their set as Σ:

σk = ⟨dk1, . . . , dkl⟩   (3)

Σ = {σ1, . . . , σp}   (4)


When a new document d∗ arrives, it has to be properly added to the collection D, i.e., d∗ has to be classified to a proper category and assigned to a proper case within this category. We consider the task of the system as of the decision support type, i.e., a human user should be assisted by the system in choosing a proper category c ∈ C and a proper case σ ∈ Σ for the document d∗, but he or she is responsible for performing these actions. Several ways of assigning documents to categories/cases may be conceived; cf., e.g., [23, 25]. We follow here the line of conduct presented in the latter paper, i.e., a two-stage assignment: first to a category and then to a case. A set of categories is prespecified and each of them may be assumed to be represented by a sufficient number of documents in the collection D. Thus, standard text categorization techniques may be employed [16]. On the other hand, the cases may be quite short and, moreover, emerge dynamically during the lifetime of the document management system. Moreover, it should be assumed that the organization of the documents into categories is based on some top-level thematic grouping. For example, in the structure of the document collection of a company, one category may comprise documents concerning relations of the company with public administration institutions, another category may gather documents related to the activity of this company's supervisory board, while still another category may concern all matters related to human resources. On the other hand, documents are grouped into cases based on some business process they are related to, e.g., hiring a new employee.

Thus, documents belonging to the cases within the same category are in general thematically similar, and what should decide on assigning a new document to one of them is somehow different from the clue on document assignment to a category. One of the aspects which may be helpful in making the decision on assigning a document to a case is the fact that the documents are arranged within the case in a specific order. This order is based on the logic of the business process related to a given case and reflects the chronology of the running of this process. We assume that the documents arrive for classification exactly in this order and thus this order may be exploited during the classification.

2.2. Related Work

The MTC problem has been formulated in our previous papers; cf., e.g., [22]. It belongs to the broad class of text categorization problems. The most similar problem well known in the literature is Topic Detection and Tracking (TDT) [1], which may be very briefly described as follows. Topic detection and tracking concerns a stream of news on a set of topics. The basic task is to group together news stories on the same topic. A story in TDT corresponds to a document in our MTC problem definition, while a topic is a counterpart of a case. Categories as such are not considered in the original formulation of the TDT problem, although later on the concept of hierarchical TDT has been introduced [8], which brings the TDT and the MTC even closer.

Topics, similarly to cases, are not predefined, and new topics have to be detected in the stream of stories and then tracked, i.e., all subsequent stories concerning this topic should be recognized and properly classified. A subtask of first story detection is distinguished, which consists in deciding if a newly arrived story belongs to one of the earlier recognized topics or is starting a new topic.

Although the MTC and TDT problems share many points, they are still different. In the former, categories and cases are considered, while only topics are presumed in the latter (even in the hierarchical TDT, mentioned earlier, the relation between MTC's categories and cases is not reflected, as the hierarchy of topics is there meant in the standard text categorization sense, i.e., the categories at different levels of a hierarchy are just themes considered at different levels of abstraction and do not follow a different principle of grouping such as theme versus business process, as is assumed for the MTC). Moreover, cases in MTC are sequences of documents, while topics in TDT are just sets of stories. Again, even if stories in TDT are timestamped, their succession within a topic is not assumed to carry any semantic information, and the use of this temporal information to solve the tasks of the TDT, if any, is limited to reducing the influence of the older stories on the classification decision. Finally, the practical context is different: for TDT this is news stream analysis, while for MTC this is business document management. The reader is referred to our forthcoming paper [9] for a more in-depth analysis of the relations between the TDT and MTC problems.

The solution approaches to the TDT problem belong to mainstream information retrieval. Standard representation, most often in the framework of the vector space model, is assumed for the stories. The notion of the similarity/dissimilarity of stories represented as vectors in a multidimensional space is employed to detect and track topics. Often, various cluster analysis techniques are used to group stories, interpret the clusters as topics and represent them by the centroids of these clusters. A new story is compared against the centroids of particular topics to decide where it belongs. If there is no centroid similar enough, then a new topic is established and the newly arrived document is assigned to it.

In our previous papers we have proposed a number of solutions to the MTC problem. We also most often adopt the vector space model as the starting point. The matching of a document and a case was computed as the weighted average of the fuzzy subsethood degrees of the fuzzy set representing the document to a fuzzy set representing a given case and a fuzzy set representing the category of this case, respectively. This way, the assignment of a document to the case for which the highest matching was obtained, which implied also the assignment to its category, was based on the combination of the matching of this document with respect to the case as well as to the whole category.

Then we proposed to model the cases and proceed with the classification of the documents in the framework of hidden Markov models and sequence mining [22], using the concepts of computational intelligence [23], or employing support vector machines [24]. We also pursued other paths, including semantic representation of documents, finding a parallel of the MTC with text segmentation, studying the asymmetry of similarity [13,14], devising new cluster analysis techniques [11], or investigating the applicability of the concepts related to coreference detection in data schemas [17].

In this paper we follow the line of research on combining some approaches related to the classification task and computational intelligence tools to propose a new approach, and also study the applicability of well-known techniques to the problem of multiaspect text categorization.

3. The Techniques Employed
3.1. Basic k-nn Technique and Its Extensions to Topic Tracking

In this paper we study the use of the k-nearest neighbours technique (k-nn) to solve the multiaspect text categorization problem. From the point of view of statistical pattern classification theory, this method belongs to the group of nonparametric techniques. This type of approach seems to be the most promising for the task at hand due to the limited set of assumptions which have to be adopted to apply it. One of the characteristic MTC features is the sparse training data present, and thus, e.g., assuming a specific family of (conditional) probability distributions of data and estimating its parameters may be difficult, if possible at all. The k-nn technique proved to be effective for many different classification tasks, including the topic tracking and detection problem [20], which is closely related to our MTC problem as discussed in Section 2.2.

Basically, the k-nn technique may be described in the context considered here as follows. A set of categories C and a set of training documents D (cf. Section 2.1), for which the category assignment is known, are assumed. For a new document d∗ to be classified, the k documents in D most similar to it are found. The similarity measure is usually defined as the inverse of some distance measure; usually the Euclidean distance is adopted. The category to which the majority of the k closest documents belong is assigned to d∗. Formally, using the notation introduced in (1)-(4), the category c∗ assigned to the document d∗ is defined as follows:

c∗ = argmax_{ci} |{d ∈ D : (Category(d) = ci) ∧ (d ∈ NNk(d∗))}|   (5)

where Category(d) denotes the category c ∈ C assigned to a training document d and NNk(d∗) denotes the set of k documents d ∈ D which are the closest to d∗, i.e.,

NNk(d∗) = {dδ(1), dδ(2), . . . , dδ(k)}

where δ is such a permutation of the set {1, . . . , n} that dδ(j) is the j-th most similar document to d∗ in the set D.
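The majority vote of (5) can be sketched as follows. This is a minimal illustration in Python, not the authors' R implementation; the toy document vectors, category labels, and the Euclidean-distance-based ranking are our assumptions.

```python
from collections import Counter
import math

def euclidean(u, v):
    # Euclidean distance between two document vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def knn_category(d_star, training, k):
    """Basic k-nn of Eq. (5): training is a list of (vector, category) pairs;
    the category winning the majority vote among the k closest documents wins."""
    neighbours = sorted(training, key=lambda dc: euclidean(d_star, dc[0]))[:k]
    votes = Counter(cat for _, cat in neighbours)
    return votes.most_common(1)[0][0]

# toy usage: two categories in a 2-dimensional "keyword" space
training = [([0.9, 0.1], "c1"), ([0.8, 0.2], "c1"), ([0.1, 0.9], "c2")]
print(knn_category([0.85, 0.15], training, k=3))
```

With k = 3 all three training documents vote, so the majority category "c1" is returned.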

The k-nn is a very popular classifier, often used also in the context of text categorization; cf., e.g., [10, 19]. An inspiration for our work is in particular the paper by Yang et al. [20] on the application of the k-nn technique for the purposes of topic tracking and detection. The authors adopt the standard document representation of the stories within the framework of the vector space model, using a variant of the tf × IDF keyword weighting scheme.

Yang et al. study in particular the use of the k-nn for the solution of the topic tracking problem of the TDT (cf. Section 2.2). They proposed some modifications to the basic algorithm which proved to yield better results on some benchmark datasets. Namely, in [20] the following improvements to the basic k-nn algorithm have been proposed. First of all, instead of the multiclass problem they consider t binary classification problems, one for each category c ∈ C (cf. (2)). Moreover, instead of simply counting the number of documents belonging to that category among the k most similar documents NNk(d∗), they compute the following index, called kNN.sum (we slightly modify the original notation used in [20] to adjust it to the context of our MTC problem):

r(d∗, c, k, D) = Σ_{d ∈ P^c_k} sim(d∗, d) − Σ_{d ∈ Q^c_k} sim(d∗, d)   (6)

where P^c_k = {d ∈ NNk(d∗) : Category(d) = c}, i.e., it is the subset of the documents most similar to d∗ which belong to the category c (the positive examples), Q^c_k = NNk(d∗) \ P^c_k, i.e., it is the subset of documents being negative examples with respect to the category c, and sim(d∗, d) denotes the similarity measure between the documents, which is assumed to be the cosine of the angle between the vectors representing the documents in question. Then, the document is assigned to the category for which this index is the highest, provided it exceeds some threshold value (otherwise the document is treated as starting a new topic). From this point of view this approach is a kind of weighted k-nn technique [6].
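A direct reading of (6), with the cosine similarity assumed above, might look as follows. This is an illustrative Python sketch, not the implementation from [20]; vectors are plain lists and the threshold test for starting a new topic is omitted.

```python
import math

def cosine(u, v):
    # cosine of the angle between two document vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def knn_sum(d_star, training, c, k):
    """kNN.sum index of Eq. (6): summed similarity to the positive neighbours
    minus summed similarity to the negative ones, within the k nearest documents."""
    nn = sorted(training, key=lambda dc: cosine(d_star, dc[0]), reverse=True)[:k]
    pos = sum(cosine(d_star, d) for d, cat in nn if cat == c)
    neg = sum(cosine(d_star, d) for d, cat in nn if cat != c)
    return pos - neg
```

A positive value of the index indicates that, within the neighbourhood, the evidence for category c outweighs the evidence against it.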

Yang et al. notice the difficulty in setting an appropriate value of the parameter k. In case of the TC problem, i.e., when there is usually a large enough number of positive examples for each category, an experimentally verified recommended value for k is rather large, higher than 30 and less than 200. When the number of positive examples for a given category is small, as is the case for topic tracking in TDT or assigning a document to a case in our MTC, this recommendation is not valid. If k is high, the set NNk(d∗) will be dominated by negative examples, as their number in D is much greater than the number of positive examples. However, choosing a low k may also lead to the set NNk(d∗) comprising only negative examples, unless the document to be classified d∗ is very similar to some positive examples. Yang et al. [20] proposed to overcome this difficulty by introducing modified versions of the index (6).


The first version is called kNN.avg1 and is defined as follows:

r′(d∗, c, k, D) = (1 / |P^c_k|) Σ_{d ∈ P^c_k} sim(d∗, d) − (1 / |Q^c_k|) Σ_{d ∈ Q^c_k} sim(d∗, d)   (7)

where | · | denotes the cardinality of a set. In this case the similarity to the positive and to the negative examples is averaged, and thus even for a large k the dominance of the negative examples in the neighbourhood of the classified document d∗ does not pose a problem.

The second modified version of the kNN.sum technique (6) is called kNN.avg2 and is defined as follows:

r′′(d∗, c, k, D) = (1 / |U^c_{kp}|) Σ_{d ∈ U^c_{kp}} sim(d∗, d) − (1 / |V^c_{kn}|) Σ_{d ∈ V^c_{kn}} sim(d∗, d)   (8)

In this case the kp positive examples (i.e., belonging to the category c) most similar to d∗ and the kn negative examples most similar to d∗ are considered, and they form the sets U^c_{kp} and V^c_{kn}, respectively. The similarity between d∗ and the documents from these two sets is averaged as in the case of kNN.avg1. Thanks to that, a small number of the nearest training examples may be taken into account and there is no risk that the negative examples will dominate. The kNN.avg2 technique gives a higher flexibility, making it possible to independently choose the parameters kp and kn, but these two parameters have to be tuned instead of just one k as in kNN.avg1. When kNN.avg2 is going to be applied to the MTC problem, there is a risk that there are not enough positive examples, i.e., their number is lower than the value of kp. In particular, if c corresponds to a very short case and is considered as a candidate for the assignment of d∗, then for a reasonable value of kp this may easily happen. In such a situation, our implementation of kNN.avg2 reduces the value of kp to the number of existing positive examples.
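The two averaged indices (7) and (8) can be sketched in the same style as above. Again, this is an illustrative Python sketch under the same assumptions (cosine similarity, list-based vectors); the reduction of kp to the number of available positives described in the text is included.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def knn_avg1(d_star, training, c, k):
    """kNN.avg1, Eq. (7): averaged similarity to positives minus averaged
    similarity to negatives among the k nearest neighbours."""
    nn = sorted(training, key=lambda dc: cosine(d_star, dc[0]), reverse=True)[:k]
    pos = [cosine(d_star, d) for d, cat in nn if cat == c]
    neg = [cosine(d_star, d) for d, cat in nn if cat != c]
    avg = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return avg(pos) - avg(neg)

def knn_avg2(d_star, training, c, kp, kn):
    """kNN.avg2, Eq. (8): the kp nearest positive and the kn nearest negative
    examples are averaged separately; kp shrinks if positives are scarce."""
    pos = sorted((cosine(d_star, d) for d, cat in training if cat == c), reverse=True)
    neg = sorted((cosine(d_star, d) for d, cat in training if cat != c), reverse=True)
    kp = min(kp, len(pos))  # not enough positive examples: reduce kp
    avg = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return avg(pos[:kp]) - avg(neg[:kn])
```

Note how the averaging keeps the index meaningful even when negatives vastly outnumber positives in the neighbourhood.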

Yang et al. [20] offer some recommendations as to the tuning of the parameters k, or kp and kn. Their main concern is the limited number of training documents, which makes the usual splitting of the training data set into a genuine training set and a validation set challenging. Thus, they devise the tuning in the framework of an ensemble of tested classifiers comprising kNN.avg1 and kNN.avg2 as well as a Rocchio-type classifier belonging to the class of profile-based classifiers [16]. The details can be found in [20]. In the current paper we test a more standard way of tuning the parameters and check how effective it is; cf. Section 3.2.

3.2. Our Approach to Improving the k-nn Classifier for the Purposes of the MTC Problem Solution

In our previous work [25] on the solution of the MTC problem, we already proposed to use the k-nn technique for the first stage of the solution, i.e., deciding on the category to which the newly classified document d∗ belongs, as well as for the second stage, i.e., assigning the document d∗ to a case. Here we study some extensions to the basic k-nn procedure and compare them experimentally with the approaches proposed by Yang et al. [20], which are presented in Section 3.1.

In the TDT problem, more specifically the topic tracking problem considered by Yang et al. [20], the documents related to a topic are assumed to arrive over some time, but the order in which they come is not essential. Namely, some documents may describe exactly the same aspect of a topic/event but come from different sources, and thus it should not be expected that the order in which they appear on the input carries some information useful for their classification. In the MTC problem the situation is different, and the order of the documents within a case may be assumed to convey some extra information which may be exploited for their proper classification. In [25] we have shown that documents can be quite successfully classified to cases by their comparison just to the last document of the candidate cases. Thus, it seems to confirm that the similarity to the most recent documents in a case should influence the classification decision the most. This observation is reminiscent of what has been confirmed in some computational experiments on the comparison of the paths in the tree of an XML document [17]: the similarity of the last segments of these paths is decisive for establishing the coreference of the respective XML elements.

Here, we develop this idea and propose to take into account during the comparison all documents belonging to a candidate case, but with different weights: the closer to the end of the case a given document is located, the higher its weight. Formally, we propose to use the following index to evaluate the matching of the document d∗ against a candidate case σ ∈ Σ. Let us first cast it in a strict but linguistically expressed form as the truth value of the following linguistically quantified proposition [12,21]:

The document d∗ is similar to most of the important documents of the case σ   (9)

According to Zadeh's calculus of linguistically quantified propositions, the truth value of the proposition (9) for a document to be classified d∗ and a candidate case σ is computed as follows:

m(d∗, σ) = µQ( Σ_{d ∈ σ} min(sim(d∗, d), imp(d)) / Σ_{d ∈ σ} imp(d) )   (10)

where sim(·, ·) denotes a similarity measure between documents and imp(·) denotes the importance of the document d belonging to the case σ. The linguistic quantifier Q in (10) represents the concept of linguistically expressed majority, exemplified in (9) with the word "most". Such a quantifier may be formally represented in many different ways (cf., e.g., [5]) and in what follows we adopt the original Zadeh's approach [21]. Thus, a linguistically quantified proposition fits one of the generic templates:

Q X's are A, or   (11)
Q B X's are A   (12)

and expresses that, e.g., for Q = most, "most of the elements of a universe X possess a property A", in case of (11), or that "most of the elements of a universe X possessing a property B possess also a property A", in case of (12). Properties A and B are in general fuzzy and are represented by their membership functions defined on the universe of discourse X. A linguistic quantifier Q is formally represented as a fuzzy set in the interval [0,1]. For example, the membership function of Q = most may be expressed as follows:

µQ(x) = 1 for x ≥ 0.8
µQ(x) = 2x − 0.6 for 0.3 < x < 0.8   (13)
µQ(x) = 0 for x ≤ 0.3
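As a quick sanity check, the piecewise definition (13) of the quantifier "most" can be written down directly (an illustrative sketch; the function name is ours):

```python
def mu_most(x):
    """Membership function of the linguistic quantifier "most", Eq. (13)."""
    if x >= 0.8:
        return 1.0
    if x > 0.3:
        return 2 * x - 0.6
    return 0.0
```

For example, if half of the documents of a case are similar to d∗, the proposition "most are similar" holds to degree 2 · 0.5 − 0.6 = 0.4.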

The value of the membership function µQ(x) = y is interpreted as meaning that if x ∗ 100% of the elements of X possess the property A, then the truth value of (11) equals y, or that if x ∗ 100% of the elements of X possessing the property B possess also the property A, then the truth value of (12) equals y.

General formulae for the truth values of (11) and (12) are thus the following, respectively:

truth(Q X's are A) = µQ( (Σ_{x ∈ X} µA(x)) / n )   (14)

truth(Q B X's are A) = µQ( Σ_{x ∈ X} min(µA(x), µB(x)) / Σ_{x ∈ X} µB(x) )   (15)

Notice that the definition of our indicator of matching between the document d∗ and a case σ, as expressed with (9), may be rephrased as:

Most of the important documents of σ are similar to d∗   (16)

and thus fits the general template (a protoform) (12) [12]. It may also be easily seen that the formula (10) is an instantiation of the formula (15), where X is the set of documents belonging to the case σ (treated in the following formulae as a set of documents), A is a fuzzy property of a document d ∈ σ with the membership function:

µA : σ → [0, 1],   µA(d) = sim(d, d∗)

and the fuzzy property B corresponds to the importance of document d with respect to the case σ, i.e.:

µB : σ → [0, 1],   µB(d) = imp(d)

The index (10) introduced here is used to assign a new document d∗ to a case in a straightforward way (we assume that before that d∗ is assigned to a category c using, e.g., the basic k-nn algorithm, as in our previous paper [25]):

1) the matching index m defined by (10) is computed for all candidate (on-going) cases belonging to the category c; the set of such cases is denoted as Σc,

2) the document d∗ is assigned to the case σ∗ such that:

σ∗ = arg max_{σi ∈ Σc} m(d∗, σi)
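The matching index (10) and the two-step assignment above can be sketched as follows. This is an illustrative Python sketch: the function names, the imp(i, n) signature, and the dictionary of candidate cases are our assumptions; since any strictly increasing µQ leaves the argmax unchanged, the identity is used as a default quantifier.

```python
def match_case(d_star, case_docs, sim, imp, mu_q):
    """Matching index m(d*, sigma) of Eq. (10).
    case_docs: document vectors of one case, in their within-case order;
    sim(u, v): similarity in [0, 1]; imp(i, n): importance of position i of n;
    mu_q: membership function of the linguistic quantifier Q."""
    n = len(case_docs)
    num = sum(min(sim(d_star, d), imp(i + 1, n)) for i, d in enumerate(case_docs))
    den = sum(imp(i + 1, n) for i in range(n))
    return mu_q(num / den)

def assign_to_case(d_star, candidate_cases, sim, imp, mu_q=lambda x: x):
    """Step 2: pick the on-going case sigma* maximizing m(d*, sigma)."""
    return max(candidate_cases,
               key=lambda name: match_case(d_star, candidate_cases[name], sim, imp, mu_q))
```

For example, with linear importance imp(i, n) = i/n and a dot-product similarity on normalized vectors, a document close to the recent documents of a case receives a high matching degree.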

Thus, our approach may be treated as another way of using the k-nn technique for classification of documents, although to some extent it may also be interpreted as a kind of profile-based classification [16]. We employ the weighted similarity of the document d∗ with respect to the documents of a case, similarly as kNN.avg1 and kNN.avg2 do. However, we compute the weighted average with respect to all the training documents comprising a particular candidate case. Moreover, in our approach two different types of weights are involved: one related to the similarity sim(d, d∗) and another related to the importance imp(d) of a document within a case.

In order to use the introduced index (10) effectively, we need to devise a way to set its parameters, i.e.:
1) the form of the quantifier Q,
2) the form of the similarity measure sim used therein,
3) the importance weights assigned to particular documents of a case.

Concerning the linguistic quantifier employed, the very nature of the proposed index m (10) suggests the use of a quantifier expressing the concept of (fuzzy) majority, such as "most". More generally, a so-called Regular Increasing Monotonic (RIM) quantifier should be used [18], i.e., one with a monotone increasing membership function µQ, such as, e.g., (13):

∀x, y   x < y ⇒ µQ(x) ≤ µQ(y)   (17)

Thus, the choice of a specific quantifier seems to be of limited importance. The replacement of a linguistic quantifier Q1 with Q2 in (10) may change the assignment of a document to a case only due to the assumed weak monotonicity of the membership function µQ (cf. (17)), i.e., if µQi(x) = µQi(y) and µQj(x) < µQj(y), where i, j ∈ {1, 2} and x < y. Thus, we assume the unitary linguistic quantifier in (10), i.e., defined by the membership function:

µQ(x) = 1, ∀x ∈ [0, 1]

It has to be noted that the choice of a linguistic quantifier may play a more important role if a threshold value of the index (10) is set and meant to decide if the document d∗ may be assigned to a given case or should start a new case, i.e., when the first story detection problem is considered. However, this goes beyond the scope of this paper.

The similarity measure sim in (10) may be defined in many ways. In our previous work we most often use the Euclidean distance between the vectors representing the documents under comparison. Here we adopt it also and take its complement as the measure of similarity. We assume the vectors representing documents to be normalized in such a way that their Euclidean norms are equal to 1. Thus, the highest possible Euclidean distance between two vectors representing documents equals √2, and the similarity between two documents d = [d1, . . . , dl] and d∗ = [d∗1, . . . , d∗l], assuming the number of keywords used to represent the documents to be equal to l, may be expressed as follows:

sim(d, d∗) = (√2 − √(Σ_{i=1}^{l} (di − d∗i)²)) / √2   (18)

and sim(d, d∗) ∈ [0, 1].

Finally, the importance of each document in a case is assumed to be an increasing function of the position of the document in the case. The extreme cases are the following:
- only the last document in the case is important; cf. our paper [25] studying this case;
- all documents are equally important, i.e., effectively the importance is not taken into account.

In the current paper we consider and test experimentally the following options (x denotes the position of a document in the case, len denotes the length of this case, and a and b are the parameters):

1) linear importance:

imp(x) = x / len   (19)

2) quadratic importance:

imp(x) = a ∗ (x / len)² + 1 − a   (20)

3) radical importance:

imp(x) = a ∗ √(x / len) + 1 − a   (21)

4) piecewise linear importance:

imp(x) = 0 if x ≤ a;  (x − a) / (b − a) if a < x ≤ b;  1 if x > b   (22)

All options assume that the importance degree of the last document of the case, i.e., the one most recently assigned to this case, equals 1, i.e., is the highest. Also, all are monotone: documents located later in a case get importance not lower than those located earlier.

The first option is the simplest one, making the last document most important and gradually reducing the importance of earlier documents at a constant rate. No parameters have to be set. The second option, the quadratic importance, for high values of the parameter a ∈ [0, 1] reduces, relative to the linear importance, the importance of the documents before the last document of the case. This reduction is highest for the documents located in the middle of the case. For small values of a the importance of documents in the case is increased relative to the linear importance. This increase is highest for the documents at the beginning of the case. The radical importance, for any value of the parameter a ∈ [0, 1], increases the importance of all documents in the case. The smaller the value of a, the higher this increase. The increase is relatively highest for the documents located at the beginning of the case. Finally, the piecewise linear importance makes it possible to set the importance of the documents located at the beginning of a case to 0, which results in ignoring them during the computation of the index (10). At the same time, the documents located closer to the end of the case can get an importance degree equal to 1; the highest importance degree is thus no longer reserved for the last document of the case. This option requires setting two parameters, a and b, which decide what proportion of the documents will get importance degrees equal to 0 and 1, respectively.
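The four options (19)-(22) can be collected in one place. This is an illustrative Python sketch (the function names are ours; positions x are 1-based, as in the text):

```python
import math

def imp_linear(x, length):
    return x / length                               # Eq. (19)

def imp_quadratic(x, length, a):
    return a * (x / length) ** 2 + 1 - a            # Eq. (20)

def imp_radical(x, length, a):
    return a * math.sqrt(x / length) + 1 - a        # Eq. (21)

def imp_piecewise(x, a, b):
    if x <= a:                                      # Eq. (22)
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return 1.0
```

Note that for x = length the first three options all return 1, and the piecewise option returns 1 whenever the case is longer than b, matching the stated properties.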

An option for the importance degree may be chosen based on the experience of the user, or may be tuned using the training dataset. In our experiments reported in Section 4 we follow the latter way.

3.3. Oversampling of Short Cases to Circumvent the Imbalance of Classes

In the previous section we have introduced a variant of the k-nn method which is meant to better account for the relation of the subsequent documents in a case. This is a rather far-going modification of the original algorithm which, in a sense, replaces the comparison of the document to be classified d∗ against training documents with the comparison of d∗ against whole cases, with a proper account for the sequential character of the documents forming a case. In this section we propose another, more modest, modification of the original k-nn algorithm which is expected to improve its working for the MTC problem.

Namely, usually when a document d∗ is going to be assigned to a case, the particular candidate (on-going) cases are of different lengths. Some have just started and comprise a small number of documents, while others are already well developed and may comprise tens of documents. Thus, when each case is treated as a separate class, we usually have to deal with classes of imbalanced sizes in the training dataset. We propose to duplicate documents of short candidate cases, thus increasing their visibility during the execution of the regular k-nn method and its variants described earlier, including our approach presented in Section 3.2.

Formally, a threshold caselen is set, and all cases in a given category whose length is shorter than this threshold are replicated. Several strategies may be applied: all documents of such a short case may be replicated the same number of times, or the number of replicated copies may depend on the location of the document within the case. Following a similar reasoning as in Section 3.2, we may put more emphasis on the most recent documents in the case and replicate them more times. In our experiments in Section 4 we try a few variants.
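The uniform variant of this replication step might look as follows. This is an illustrative sketch only; the replication factor (enough copies to reach the threshold) is our assumption, and the text mentions further variants weighted toward recent documents.

```python
def oversample_cases(cases, caselen):
    """Duplicate the documents of every case shorter than the threshold
    `caselen` until the case reaches at least that length, so that short
    cases gain visibility in the k-nn neighbourhood (uniform replication)."""
    resampled = {}
    for name, docs in cases.items():
        if 0 < len(docs) < caselen:
            factor = -(-caselen // len(docs))  # ceil(caselen / len(docs))
            resampled[name] = docs * factor    # every document copied equally
        else:
            resampled[name] = list(docs)
    return resampled
```

The oversampled collection is then fed to the regular k-nn classifier unchanged.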


This technique is basically meant for the original k-nn algorithm. Its variants proposed by Yang et al. [20] and described in Section 3.1 are somewhat resistant to this problem thanks to averaging the similarity (and dissimilarity) over the documents neighbouring the document d∗. It is also easy to notice that oversampling corresponds to considering the importance in our approach presented in Section 3.2 and eventually boils down to the weighted averaging of the similarities of the documents neighbouring d∗.

In the experiments discussed in the next section, we employ oversampling in case of the regular k-nn technique to compare its effectiveness with that of the apparently more sophisticated methods discussed in Sections 3.1 and 3.2.

4. Computational Experiments
4.1. Data and Software Used

The are no benchmark datasets yet for the mul-tiaspect text categorization problem dealt with here.In our work we are using a collection of papers aboutcomputational linguistics which have been structuredusing XML and made available on the Internet as theACLAnthologyReference Corpus (ACLARC) [4]. In ourexperiments we use 113 papers forming a subset ofthe ACLARC.We group papers to obtain categories (cf.section 2). After some trials we decided to look for 7categories (clusters) using the standard k-means clus-tering algorithm. The clustering is applied to the pa-pers represented according to the vector space model(cf., e.g., [2]) ignoring the XML markup. In particular,the following operations are executed to obtain theinal representation (cf. also our earlier papers, e.g.,[25]). The text of the papers is normalized, i.e., thepunctuation, numbers and multiple white spaces areremoved, stemming is applied, the case is changed tothe lower case, stopwords and words shorter than 3characters are dropped. The document-termmatrix iscreated for the whole set of the papers using tf × IDFterms weighting scheme. Next, the keywords presentin less than 10% of the papers are removed from thedocument-termmatrix. The vectors representing par-ticular papers are normalized by dividing each coor-dinate by the Euclidean norm of the whole vector andthus the Euclidean norm of each vector equals 1.

Next, we produce a set of cases based on the pa-pers, in the following way. The papers are originallypartitioned into sections (segments) and each sec-tion forms the content of the XML element Section.We treat each paper as a case while its sections areconsidered to be documents of this case, preservingtheir original order within the document. This waywe obtain a collection of 113 cases comprising 1453documents, cf. also [22–25]. The documents, i.e., thesections of the original papers, are represented us-ing the vector space model. Thus, again the opera-tions such as the punctuation, numbers and multiplewhite spaces removal, stemming, changing all charac-ters to the lower case, stopwords and words shorterthan 3 characters elimination are applied to the docu-ments. A document-termmatrix is constructed for the

above set of documents using the tf × IDF term weighting scheme. Again, sparse keywords appearing in less than 10% of the documents are removed from this matrix and, as a result, 125 keywords are used to represent the documents. The vectors representing documents are normalized in the same way as in the case of the papers, i.e., their Euclidean norm equals 1.

The dataset obtained this way is then split into training and testing datasets. To this aim, a number of cases are randomly chosen as the on-going cases, which are thus the candidate cases for the document d∗ to be assigned to. In each on-going case a cut point is selected randomly: the document located at the cut point and all subsequent documents are removed from the case and serve as the testing dataset. All remaining documents from the collection serve as the training dataset.
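The split can be sketched as follows; this is an illustrative Python version (the paper's experiments use R), assuming cases are a mapping from case id to an ordered list of documents and that on-going cases must retain at least one training document:

```python
import random

def split_cases(cases, n_ongoing, rng=None):
    """Split a {case_id: [doc, ...]} collection into training and testing parts.

    For each randomly chosen on-going case a cut point is drawn; the document
    at the cut point and all subsequent documents become the test data, the
    preceding documents (the case prefix) stay in the training data.
    """
    rng = rng or random.Random(0)
    # Only cases with at least two documents can serve as on-going cases.
    eligible = sorted(c for c in cases if len(cases[c]) >= 2)
    ongoing = set(rng.sample(eligible, n_ongoing))
    train, test = {}, {}
    for cid, docs in cases.items():
        if cid in ongoing:
            cut = rng.randrange(1, len(docs))  # keep at least one training document
            train[cid] = docs[:cut]
            test[cid] = docs[cut:]
        else:
            train[cid] = list(docs)
    return train, test
```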

All computations are carried out using the R platform [15] with the help of several packages. In particular, the text processing operations are implemented using the tm package [7]. The FNN package [3] is employed to classify documents to cases with the use of the original k-nn algorithm. The algorithms mentioned in sections 3.1 and 3.2 we implemented ourselves in the form of an R script.

4.2. The Goals of the Experiments and How the Parameters Are Chosen

Our goal is to compare the effectiveness of the basic k-nn algorithm and its variants discussed in sections 3.1 and 3.2 for the solution of the MTC problem presented in section 2, and in particular of its second stage, consisting in assigning the document d∗ to the proper case. Of special interest is, of course, our approach presented in section 3.2. Thus, we assume here the representation of the collection of documents described in section 4.1, and we assume the two-stage approach, with the two stages consisting in assigning the document d∗ to a category and to a case within this category, respectively.

We run a number of experiments and based on the results of each run we evaluate the effectiveness of the assignment of documents to cases. As the evaluation measure we use the microaveraged recall, i.e.,

accuracy = (number of documents properly assigned to their cases) / (number of all documents being classified)    (23)

The cardinalities of the sets of test documents belonging to particular cases do not differ extremely, and thus microaveraging seems to be the measure best illustrating the quality of the particular classification algorithms under consideration.

Each experiment consists in choosing on-going cases and classifying all documents located behind the cut points in these cases.

An important aspect of the successful application of a classifier is the question of tuning its parameters. Thus, in the first series of our experiments we


tune the parameters of the five classifiers under comparison (cf. section 3):

1) the regular k-nn algorithm with the parameter tuned being k;

2) the kNN.avg1 technique proposed by Yang et al. [20] with the parameter tuned being k;

3) a simplified variant of kNN.avg1, cf. r′ given by (7), which may be expressed as follows:

r′0(d∗, c, k, D) = (1 / |P^c_k|) Σ_{d ∈ P^c_k} sim(d∗, d)    (24)

i.e., instead of taking into account the similarity of the document d∗ to the closest documents of the training data set, both positive and negative, as r′ does, the index r′0 takes into account only the closest positive neighbours. The simplified variant is more in the spirit of the basic k-nn technique, and we would like to check if taking into account also the similarity of d∗ with respect to the negative examples really increases the effectiveness of the classification; here, again, tuning concerns choosing the value of k;

4) the kNN.avg2 technique proposed by Yang et al. [20] with the parameters tuned being kp and kn; for this technique the variant simplified along the lines proposed for kNN.avg1 may also be of interest, but for large k's it becomes identical with r′0 introduced above, and in our preliminary tests it turned out to be inferior to the original kNN.avg2 technique;

5) our approach given by (10) with the parameter tuned being the importance function imp.
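The simplified index (24) can be sketched in Python as follows, assuming P^c_k denotes the positive documents (those belonging to case c) among the k nearest training neighbours of d∗; the similarity function shown, a decreasing transform of the Euclidean distance, is only an illustrative stand-in:

```python
import math

def euclid_sim(u, v):
    """Illustrative similarity: inverse of (1 + Euclidean distance)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + d)

def r0(d_star, case_id, k, training, sim=euclid_sim):
    """Simplified kNN.avg1 index (24): mean similarity of d_star to the
    documents labelled case_id among its k nearest training neighbours
    (the set P^c_k of closest positive neighbours).

    `training` is a list of (vector, case_id) pairs."""
    ranked = sorted(training, key=lambda dc: -sim(d_star, dc[0]))[:k]
    positives = [sim(d_star, d) for d, c in ranked if c == case_id]
    return sum(positives) / len(positives) if positives else 0.0
```

A case with no representatives among the k nearest neighbours scores 0, mirroring an empty P^c_k.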

Tuning the Parameter k. One may choose and fix the parameter values based on his or her experience, some extra knowledge concerning the characteristics of the problem at hand, or on historical data. It is also possible to dynamically and automatically choose the parameter values each time a classification decision is to be made, again based on the available data. In the first series of experiments we check if such dynamic tuning really helps in comparison to using a fixed value for the parameter k. We compare the results obtained for all five algorithms under consideration: basic k-nn, kNN.avg1 and its modified variant defined by (24), and kNN.avg2, with the tuning of the parameter k and without tuning it, using fixed values k = 1, 5, and 10 for k-nn, and k = 5 and 10 for kNN.avg1 and its modified variant (for k = 1 these two latter methods coincide with 1-nn). For kNN.avg2 all 9 combinations for kp, kn ∈ {1, 5, 10} are investigated.

An issue to consider for the dynamic tuning is which part of the data set to use. Basically, it should be a separate validation data set. However, such a set is difficult to obtain due to the limited size of the classes (cases) and the requirement that the training data set has to be formed of the prefixes of the cases (i.e., original cases up to a cut-off point). Thus, for the purposes of the tuning we have employed training and testing datasets

formed as for the original dataset, but assuming that all the cut-off points in the on-going cases are one position earlier than in the original dataset (if such a new cut-off point happens to be the first position of the case, then such a case is not used during the tuning). Then, we compared two tuning procedures, both based on adopting subsequent values of k from the interval [1, 10] and checking if the testing documents are assigned to proper cases, but differing in the number of testing documents taken into account. In the simpler procedure the testing set comprises only documents located at the new cut-off points, while in the second procedure the preceding documents are also used, down to the document located at the second position in a given case. The former procedure, to be called simple in what follows, may benefit from employing for tuning a dataset most similar, in terms of the length of cases and their content, to the actual test dataset. This procedure is also computationally cheaper, as fewer documents are classified. The latter procedure, to be called complex in what follows, may better reflect the properties of the cases belonging to a given category but is more expensive.
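The core of both tuning procedures is the same grid search over k; only the set of tuning documents differs (cut-off documents only for the simple procedure, all preceding documents down to position two for the complex one). An illustrative sketch, where `classify(doc, k)` stands for any of the classifiers under comparison and `tuning_pairs` are (document, true case) pairs built from the shifted cut-off points:

```python
def tune_k(classify, tuning_pairs, k_range=range(1, 11)):
    """Pick the k from k_range maximizing accuracy on the tuning pairs.

    `classify(doc, k)` is assumed to return the predicted case id for `doc`
    when the underlying algorithm is run with parameter k."""
    def accuracy(k):
        hits = sum(classify(doc, k) == case for doc, case in tuning_pairs)
        return hits / len(tuning_pairs)
    return max(k_range, key=accuracy)
```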

In Tables 1-4 we show the results of the tuning of the parameter k for all the methods under comparison (or kp and kn in the case of the kNN.avg2 technique). Table 1 shows that for the basic k-nn technique the best results are obtained for the fixed value of k equal to 1 and for the simple tuning procedure. The former is, however, much cheaper computationally, and thus we will use k fixed to 1 for the basic k-nn in our further comparisons with the other techniques discussed in this paper. In the case of the kNN.avg1 technique the results obtained for various values of k are more uniform, as Table 2 shows (k = 1 is omitted as the method then coincides with 1-nn). This seems to be the effect of the averaging employed by the kNN.avg1 technique. For further comparisons we choose the complex tuning procedure. The same happens for our simplified version of the kNN.avg1 technique, and we again choose the complex tuning. In the case of the kNN.avg2 technique the best results are obtained for the largest tested value of kp, i.e., kp = 10 (in some extra tests for even higher values of kp we have not obtained better results). The value of kn does not make much difference, so we choose the setting (kp, kn) = (10, 1) for further comparisons.

Tab. 1. The averaged results of 100 runs of the basic k-nn algorithm for the following fixed values of k: 1, 5, 10, and for the values tuned using the simple and complex procedures. The first row shows the mean value of the accuracy over all the runs, while the second row shows the standard deviation

k
1        5        10       simple   complex
0.6338   0.5186   0.4566   0.6077   0.5741
0.0656   0.0607   0.0524   0.0641   0.0656


Tab. 2. The averaged results of 100 runs of the kNN.avg1 algorithm for the following fixed values of k: 5, 10, and for the values tuned using the simple and complex procedures. The first row shows the mean value of the accuracy over all the runs, while the second row shows the standard deviation

k
5        10       simple   complex
0.6079   0.5961   0.6164   0.6196
0.0564   0.0582   0.0599   0.0572

Tab. 3. The averaged results of 100 runs of the modified (simplified) kNN.avg1 algorithm for the following fixed values of k: 5, 10, and for the values tuned using the simple and complex procedures. The first row shows the mean value of the accuracy over all the runs, while the second row shows the standard deviation

k
5        10       simple   complex
0.6150   0.6020   0.6307   0.6329
0.0596   0.0634   0.0615   0.0553

Choosing the Importance Function. This parameter applies only to our technique proposed in section 3.2. We have to choose one of the importance functions (19)-(22). Besides choosing the function itself, with the exception of the linear importance, we also have the freedom to choose its parameters. In the case of the quadratic and radical functions (20)-(21) there is only one parameter a ∈ [0, 1], which we sample every 0.1. In the case of the piecewise importance two parameters a, b ∈ [0, 1] are to be selected.
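The importance functions (19)-(22) are defined earlier in the paper and not reproduced here. As a hedged illustration, the radical family can be sketched as below, assuming the form imp(x) = a·sqrt(x/len) + b for a document at position x in a case of length len; this assumption is consistent with the example imp(x) = 0.5·sqrt(x/len) + 0.5 mentioned later for (a, b) = (0.5, 0.5), and the linear and quadratic forms would be analogous:

```python
import math

def radical_importance(a, b):
    """Assumed radical importance family (21): imp(x) = a*sqrt(x/len) + b,
    where x is the position of a document in a case of length len.

    Newer documents (larger x) receive higher importance, up to a + b
    for the newest document (x = len)."""
    def imp(x, length):
        return a * math.sqrt(x / length) + b
    return imp
```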

In our experiments we tune this parameter by comparing some fixed settings and the dynamic tuning procedure (using the simple procedure, as described earlier in the case of tuning the parameter k for the other techniques under comparison). The fixed settings are the following:

1) the linear importance (19),

2) the quadratic (20) and radical (21) importances with the parameters (a, b) set to (0.1, 0.9), (0.5, 0.5), (0.9, 0.1) each,

3) the piecewise linear importance with the parameters (a, b) set to (0.0, 0.5), (0.3, 0.8), (0.5, 1.0).

The dynamic tuning checks the whole space of possible settings for the importance function. In particular, all four importance functions are taken into account and tested on the test dataset formed following the simple procedure, i.e., making the cut-off points one position earlier than in the original test data set and using only the documents located at these new cut-off points for tests. During this testing the linear importance does not need any parameters, the quadratic and radical importance functions are tested with all the pairs from U = {(a, b) : a = 0.1, 0.2, . . . , 1.0 and b = 1 − a}, while in the case of the piecewise importance all the pairs from U = {(a, b) : a = b − 1.0, b − 0.9, . . . , b − 0.1 and b = 0.1, 0.2, . . . , 1.0} are tested. The combination of the parameters, i.e., the importance function together with the setting of the parameters a and b, where applicable, is chosen for the actual classification of a newly arrived document.

Table 5 shows the results of the tuning of the importance function parameter. Several combinations give equally good results. Also, using our approach with no importance, which is equivalent to assigning the highest importance of 1.0 to all documents of the case in question, yields good results. In the latter case, the index (10) underlying our approach boils down to averaging the similarity of the document to be classified over all documents of a candidate case, which makes it close to the kNN.avg1 and kNN.avg2 techniques of Yang et al. [20]. For the further comparisons of our approach with the other techniques we choose the radical importance function with the parameters (a, b) = (0.5, 0.5), which corresponds to the importance function imp(x) = 0.5·√(x/len) + 0.5.

All five techniques under consideration employ a similarity measure. In our experiments we adopted the Euclidean distance, also for kNN.avg1 and kNN.avg2, which are originally defined in [20] with the use of the cosine measure. We leave experiments with other similarity measures for future research.

Oversampling Variants. In section 3.3 we propose to use oversampling of documents from short cases in the training data set to remedy the imbalance of case sizes. In our experiments we have applied the oversampling to the cases of length lower than 3, i.e., effectively only the cases in the training data set comprising one or two documents are affected. We have tested the following three variants:

over1 in which the oldest documents in the case are oversampled more, i.e., effectively the first document in a short case is tripled while the second (if it exists) is doubled,

over2 in which the newest documents in the case are oversampled more, i.e., effectively the first document in a short case is doubled while the second (if it exists) is tripled; this is a strategy in line with our general assumption that the newest documents, located closer to the end of the case, matter the most for successful classification,

over3 in which all documents in short cases are equally oversampled; effectively the first and the second document in a short case are doubled.
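The three variants can be sketched as a single replication rule; this illustrative Python version assumes a case is an ordered list of documents and leaves cases of length 3 or more untouched:

```python
def oversample(case_docs, variant):
    """Oversample the documents of a short case (fewer than 3 documents).

    over1: oldest document weighted more (first tripled, second doubled);
    over2: newest document weighted more (first doubled, second tripled);
    over3: all documents doubled.  Longer cases are returned unchanged."""
    if len(case_docs) >= 3:
        return list(case_docs)
    # Replication counts for (first, second) document per variant.
    reps = {"over1": (3, 2), "over2": (2, 3), "over3": (2, 2)}[variant]
    out = []
    for i, doc in enumerate(case_docs):
        out.extend([doc] * reps[i])
    return out
```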

4.3. The Results

After choosing the fixed parameters or their dynamic tuning, as described earlier, we finally compare the effectiveness of the techniques discussed in sections 3.1 and 3.2. In Table 6 we show the accuracy of the following 20 variants of the earlier discussed algorithms, averaged over 200 runs:

1) the basic 1-nn technique,

2) the basic 5-nn technique,


Tab. 4. The averaged results of 100 runs of the kNN.avg2 algorithm for the following fixed values of (kp, kn): (1,1), (1,5), (1,10), (5,1), (5,5), (5,10), (10,1), (10,5), (10,10), and for the values tuned using the simple and complex procedures. The first rows show the mean value of the accuracy over all the runs, while the second rows show the standard deviation

(kp, kn)
(1,1)    (1,5)    (1,10)   (5,1)    (5,5)    (5,10)
0.6264   0.6339   0.6307   0.6552   0.6534   0.6495
0.0562   0.0565   0.0570   0.0564   0.0569   0.0564

(kp, kn)
(10,1)   (10,5)   (10,10)  simple   complex
0.6682   0.6639   0.6614   0.6429   0.6411
0.0546   0.0533   0.0551   0.0548   0.0571

Tab. 5. The averaged results of 100 runs of our algorithm for the following fixed choices of the importance function: linear, quadratic with (a, b) = (0.1, 0.9), (0.5, 0.5), (0.9, 0.1), radical with (a, b) = (0.1, 0.9), (0.5, 0.5), (0.9, 0.1), piecewise with (a, b) = (0.0, 0.5), (0.3, 0.8), (0.5, 1.0), dynamically tuned, and with importance identically equal to 1.0 (no importance). The first rows show the mean value of the accuracy over all the runs, while the second rows show the standard deviation

Importance functions
linear   quadratic                       radical
         (0.1,0.9) (0.5,0.5) (0.9,0.1)  (0.1,0.9) (0.5,0.5)
0.6332   0.6545    0.6577    0.6164     0.6532    0.6595
0.0599   0.0651    0.0593    0.0640     0.0642    0.0623

Importance functions
radical     piecewise                       tuned    no importance
(0.9,0.1)   (0.0,0.5) (0.3,0.8) (0.5,1.0)
0.6557      0.6525    0.5888    0.5368     0.6279   0.6518
0.0609      0.0610    0.0624    0.0663     0.0653   0.0636

3) kNN.avg1 with dynamically tuned value of k using complex tuning,

4) kNN.avg1 with k fixed and equal to 5,

5) the simplified version of the kNN.avg1 technique with dynamically tuned value of k using complex tuning,

6) the simplified version of the kNN.avg1 technique with k fixed and equal to 5,

7) kNN.avg2 with (kp, kn) fixed and set to (10,1),

8) our algorithm presented in section 3.2,

9) 5-nn with oversampling in variant 1 (cf. section 3.3),

10) kNN.avg1 with k fixed and equal to 5 and with oversampling in variant 1,

11) the simplified version of the kNN.avg1 technique with k fixed and equal to 5 and with oversampling in variant 1,

12) kNN.avg2 with (kp, kn) fixed and set to (10,1) and with oversampling in variant 1,

13) 5-nn with oversampling in variant 2,

14) kNN.avg1 with k fixed and equal to 5 and with oversampling in variant 2,

15) the simplified version of the kNN.avg1 technique with k fixed and equal to 5 and with oversampling in variant 2,

16) kNN.avg2 with (kp, kn) fixed and set to (10,1) and with oversampling in variant 2,

17) 5-nn with oversampling in variant 3,

18) kNN.avg1 with k fixed and equal to 5 and with oversampling in variant 3,

19) the simplified version of the kNN.avg1 technique with k fixed and equal to 5 and with oversampling in variant 3,

20) kNN.avg2 with (kp, kn) fixed and set to (10,1) and with oversampling in variant 3.

Our goal was to compare our method (10) with the other techniques, check how the simplified version of kNN.avg1 compares with its original form, and check if the oversampling discussed in section 3.3 increases the effectiveness of the techniques to which it is applicable, and if there is a difference between its variants. Of course, as the tests have been executed on one dataset, any far-reaching conclusions are not fully justified.

The best results are obtained for kNN.avg2 (for all parameter settings tested). Our approach produces slightly, but statistically significantly, worse results according to the paired Wilcoxon signed-rank test at the 0.05 significance level (we use this statistical test in what follows, too). In particular, in 108 runs out of the 200 reported in Table 6 the kNN.avg2 without oversampling (algorithm no. 7 in Tab. 6) was better than ours, while ours was better in 63 runs. The third best is the simple k-nn algorithm with k = 1, i.e., 1-nn, which is, however,


Tab. 6. The averaged results of 200 runs of the compared algorithms. The first rows show the mean value of the accuracy over all the runs, while the second rows show the standard deviation

The algorithm
1        2        3          4          5              6
1-nn     5-nn     kNN.avg1   kNN.avg1   simp kNN.avg1  simp kNN.avg1
                  tuned      k=5        tuned          k=5
0.6309   0.5230   0.6264     0.6100     0.6253         0.6053
0.0566   0.0611   0.0567     0.0582     0.0576         0.0592

The algorithm
7            8             9        10         11             12
kNN.avg2     our approach  5-nn     kNN.avg1   simp kNN.avg1  kNN.avg2
kp=10,kn=1                 over1    k=5 over1  k=5 over1      kp=10,kn=1 over1
0.6667       0.6569        0.6084   0.6125     0.6081         0.6671
0.0558       0.0585        0.0620   0.0576     0.0579         0.0560

The algorithm
13       14         15             16                17       18
5-nn     kNN.avg1   simp kNN.avg1  kNN.avg2          5-nn     kNN.avg1
over2    k=5 over2  k=5 over2      kp=10,kn=1 over2  over3    k=5 over3
0.6084   0.6124     0.6079         0.6665            0.6084   0.6130
0.0620   0.0578     0.0580         0.0558            0.0620   0.0580

The algorithm
19             20
simp kNN.avg1  kNN.avg2
k=5 over3      kp=10,kn=1 over3
0.6085         0.6667
0.0583         0.0558

significantly worse than the two previously mentioned algorithms, while it is better than the kNN.avg1 algorithm and its simplified version.

Concerning the simplification of the kNN.avg1 algorithm which we have considered, our experiments seem to show a statistically significant reduction of the quality of the classification due to its use for most of the parameter settings, i.e., when the pairs of algorithms (4,6), (10,11), (14,15) and (18,19) in Table 6 are compared. However, for the setting where both techniques perform the best, i.e., when the parameter k is dynamically tuned (pair (3,5)), there is no statistically significant difference between the original kNN.avg1 technique and its simplified version.

Concerning the oversampling, the most striking effect is visible in the case of the basic 5-nn algorithm. Its performance without oversampling is poor, while coupled with oversampling, in any of the variants over1, over2 or over3, it produces results not significantly worse than, e.g., the kNN.avg1 technique. For kNN.avg1 itself and its simplified version, adding oversampling also produces significantly better results, again in the case of any variant. For kNN.avg2 no significant impact of oversampling is visible.

5. Conclusions

We have studied the application of the k-nn technique and its variants to the problem of multiaspect text categorization (MTC), in particular with respect to the classification of a document to a case. One

of the variants known from the literature [20] proved to be the best when applied to the data set we prepared for our experiments with the solutions to MTC. We also proposed our own technique, which makes it possible to take into account the importance of the documents within a case in an intuitively appealing way. This approach also yields good results in our experiments.

We have also studied various ways of tuning the parameters of the classifiers employed, and we have checked if the oversampling of data may help to increase the accuracy of the classification. The results are mixed in this respect: for some classifiers the dynamic tuning of the parameters works, while for others there is no improvement. The oversampling supports better classification, but the results are convincing mainly for the basic 5-nn classifier.

Further research is surely needed concerning the tuning of the considered techniques. Our experiments with the ACL ARC dataset have confirmed some limited usefulness of parameter tuning. However, in another setting, adjusting the parameters to a given collection may turn out to be worth considering. Thus, it may be important to devise the tuning algorithms in a computationally optimal way. In our experiments the dynamic tuning has been performed by a direct repetition of the functions implementing particular techniques. This can surely be improved. Ways to more efficiently sample the parameter space should be looked for, and combining the sampling with the implementation of a given technique may be advantageous. We have also discussed the question of the form of the test/validation dataset, and here there is also some room for further investigations.

AUTHORS
Sławomir Zadrożny∗ – Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland, e-mail: [email protected].
Janusz Kacprzyk – Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland, e-mail: [email protected].
Marek Gajewski – Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland, e-mail: [email protected].
∗Corresponding author

ACKNOWLEDGEMENTS
This work is supported by the National Science Centre (contract no. UMO-2011/01/B/ST6/06908).

REFERENCES
[1] J. Allan, ed., Topic Detection and Tracking: Event-based Information, Kluwer Academic Publishers, 2002.

[2] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, ACM Press and Addison Wesley, 1999.

[3] A. Beygelzimer, S. Kakadet, J. Langford, S. Arya, D. Mount, and S. Li. FNN: Fast Nearest Neighbor Search Algorithms and Applications, 2013. R package version 1.1.

[4] S. Bird, R. Dale, B. Dorr, B. Gibson, M. Joseph, M.-Y. Kan, D. Lee, B. Powley, D. Radev, and Y. Tan, "The ACL anthology reference corpus: A reference dataset for bibliographic research in computational linguistics". In: Proc. of Language Resources and Evaluation Conference (LREC 08), Marrakesh, Morocco, 1755–1759.

[5] M. Delgado, M. D. Ruiz, D. Sanchez, and M. A. Vila, "Fuzzy quantification: a state of the art", Fuzzy Sets and Systems, vol. 242, 2014, 1–30, http://dx.doi.org/10.1016/j.fss.2013.10.012.

[6] S. A. Dudani, "The distance-weighted k-nearest-neighbor rule", IEEE Transactions on Systems, Man, and Cybernetics, vol. 6, no. 4, 1976, 325–327, http://dx.doi.org/10.1109/TSMC.1976.5408784.

[7] I. Feinerer, K. Hornik, and D. Meyer, "Text mining infrastructure in R", Journal of Statistical Software, vol. 25, no. 5, 2008, 1–54, http://dx.doi.org/10.18637/jss.v025.i05.

[8] A. Feng and J. Allan, “Hierarchical topic detectionin tdt-2004”.

[9] M. Gajewski, J. Kacprzyk, and S. Zadrozny, “Topicdetection and tracking: a focused survey and anew variant”, Informatyka Stosowana, to appear.

[10] E. Han, G. Karypis, and V. Kumar, "Text categorization using weight adjusted k-nearest neighbor classification". In: D. W. Cheung, G. J. Williams, and Q. Li, eds., Knowledge Discovery and Data Mining - PAKDD 2001, 5th Pacific-Asia Conference, Hong Kong, China, April 16-18, 2001, Proceedings, vol. 2035, 2001, 53–65.

[11] J. Kacprzyk, J. W. Owsinski, and D. A. Viattchenin,“A new heuristic possibilistic clustering algo-rithm for feature selection”, Journal of Automa-tion, Mobile Robotics & Intelligent Systems, vol.8, no. 2, 2014, http://dx.doi.org/10.14313/JAMRIS_2-2014/18.

[12] J. Kacprzyk and S. Zadrozny. "Power of linguistic data summaries and their protoforms". In: C. Kahraman, ed., Computational Intelligence Systems in Industrial Engineering, volume 6 of Atlantis Computational Intelligence Systems, 71–90. Atlantis Press, 2012. http://dx.doi.org/10.2991/978-94-91216-77-0_4.

[13] D. Olszewski, J. Kacprzyk, and S. Zadrozny. "Time series visualization using asymmetric self-organizing map". In: M. Tomassini, A. Antonioni, F. Daolio, and P. Buesser, eds., Adaptive and Natural Computing Algorithms, volume 7824 of Lecture Notes in Computer Science, 40–49. Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37213-1_5.

[14] D. Olszewski, J. Kacprzyk, and S. Zadrozny. "Asymmetric k-means clustering of the asymmetric self-organizing map". In: L. Rutkowski, M. Korytkowski, R. Scherer, R. Tadeusiewicz, L. Zadeh, and J. Zurada, eds., Artificial Intelligence and Soft Computing, volume 8468 of Lecture Notes in Computer Science, 772–783. Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-07176-3_67.

[15] R Core Team. R: A Language and Environment forStatistical Computing. R Foundation for Statisti-cal Computing, Vienna, Austria, 2014.

[16] F. Sebastiani, "Machine learning in automated text categorization", ACM Computing Surveys, vol. 34, no. 1, 2002, 1–47, http://dx.doi.org/10.1145/505282.505283.

[17] M. Szymczak, S. Zadrozny, A. Bronselaer, andG. D. Tre, “Coreference detection in an XMLschema”, Information Sciences, vol. 296, 2015,237 – 262, http://dx.doi.org/10.1016/j.ins.2014.11.002.

[18] R. Yager, "Quantifier guided aggregation using OWA operators", International Journal of Intelligent Systems, vol. 11, 1996, 49–73, http://dx.doi.org/10.1002/(SICI)1098-111X(199601)11:1%3C49::AID-INT3%3E3.0.CO;2-Z.

[19] Y. Yang, “An evaluation of statistical approachesto text categorization”, Information Retrieval, vol.1, no. 1-2, 1999, 69–90, http://dx.doi.org/10.1023/A:1009982220290.


[20] Y. Yang, T. Ault, T. Pierce, and C. W. Lattimer, "Improving text categorization methods for event tracking". In: SIGIR, 2000, 65–72, http://dx.doi.org/10.1145/345508.345550.

[21] L. Zadeh, "A computational approach to fuzzy quantifiers in natural languages", Computers and Mathematics with Applications, vol. 9, 1983, 149–184, http://dx.doi.org/10.1016/0898-1221(83)90013-5.

[22] S. Zadrozny, J. Kacprzyk, M. Gajewski, and M. Wysocki, "A novel text classification problem and its solution", Technical Transactions. Automatic Control, vol. 4-AC, 2013, 7–16.

[23] S. Zadrozny, J. Kacprzyk, and M. Gajewski, "A novel approach to sequence-of-documents focused text categorization using the concept of a degree of fuzzy set subsethood". In: Proceedings of the Annual Conference of the North American Fuzzy Information Processing Society NAFIPS'2015 and 5th World Conference on Soft Computing 2015, Redmond, WA, USA, August 17-19, 2015, 2015.

[24] S. Zadrozny, J. Kacprzyk, and M. Gajewski. "A new approach to the multiaspect text categorization by using the support vector machines". In: G. De Tre, P. Grzegorzewski, J. Kacprzyk, J. W. Owsinski, W. Penczek, and S. Zadrozny, eds., Challenging Problems and Solutions in Intelligent Systems, to appear. Springer, Heidelberg New York, 2016.

[25] S. Zadrozny, J. Kacprzyk, and M. Gajewski, "A new two-stage approach to the multiaspect text categorization". In: 2015 IEEE Symposium on Computational Intelligence for Human-like Intelligence, CIHLI 2015, Cape Town, South Africa, December 8-10, 2015, to appear, 2015.
