Self Modelling Approach in Robot Manipulators Leading to Eye Hand Coordination


A Self Model Learning Robotic Manipulator Leading to Eye-Hand Coordination

By

Michael Jacob Mathew
Enrolment No: 30EE12A12005

    Submitted in Partial Fulfilment of the Requirements

    for the Degree of Master of Technology in

    Mechatronics of Academy of Scientific and Innovative Research (AcSIR)

    CSIR-Central Mechanical Engineering Research Institute

    M.G. Avenue, Durgapur-713209, West Bengal

    July 2014


    CERTIFICATE

This is to certify that the thesis titled A Self Model Learning Robot Manipulator Leading to Eye-Hand Coordination, submitted in partial fulfilment of the requirements for the award of the degree of Master of Technology in Mechatronics of the Academy of Scientific and Innovative Research, is an authentic record of the work carried out by me under the supervision of Dr. S. Majumder at CSIR-Central Mechanical Engineering Research Institute, Durgapur.

The work presented in this thesis has not been submitted to any other University/Institute for the award of any degree.

    4 July 2014 Michael Jacob Mathew

    It is certified that the above statement by the candidate is correct to the best of my

    knowledge and belief.

    4 July 2014 Dr. S. Majumder

    Chief Scientist

    CSIR-CMERI

    Surface Robotics Laboratory


    ACKNOWLEDGEMENT:

It gives me great pleasure to present this project and thesis for the fulfilment of the Master of Technology in Mechatronics at the Council of Scientific and Industrial Research-Central Mechanical Engineering Research Institute, Durgapur, West Bengal, India. I express my sincerest thanks to the Director, CSIR-CMERI, and the DG, CSIR, for giving me the opportunity to pursue my masters programme with AcSIR. I am equally grateful to Prof. S.N. Shome, Dean, School of Mechatronics, for his continuous support and valuable suggestions.

I express my immense gratitude to my guide Dr. S. Majumder for his constant advice, insight and dedicated support. He not only advised me on the technical aspects of my research but also taught me the art of progressing, writing and presenting research. My research work would have been futile without him. I deeply appreciate the freedom he has given me in this work, as well as the invaluable opportunity to use all of his lab facilities. I am indeed fortunate to have such an excellent advisor, whose energy and enthusiasm for research galvanized me. I also thank Ms. S. Datta for her constructive and critical feedback on my research work. Equally important was Dr. Dip Narayan's help in procuring and building my experimental setup, without which I couldn't have completed this thesis.

Rachit Sapra, my colleague, batch mate and avid supporter, cannot be thanked in a few words. His work and suggestions helped me improve my research a great deal. His dedication and patience motivated me to stay calm at times of failure and in bleak situations. Manvi Malik, my senior batch mate and colleague, helped me in organising this thesis; its layout evolved from her suggestions and feedback. I thank her from the bottom of my heart, since proofreading is not at all an interesting task. Equally important was Bijo Sebastian, my junior batch mate, whose timely support and assessment greatly aided the progress of this research. I would also like to express profuse thanks to my colleagues in the SR Lab: Suman Sen, Sukanta Bhattacharjee, Arijit Chowdhury, and especially Anjan Lakra and Nalin Paul, for developing the hardware needed for my experimental test setup. The fact that I could rely on the support of Anjan and Nalin at any time is something I deeply appreciate.

Last but not least, I thank God, whose hand was profoundly felt at stages where progress was negligible, often turning zero-progress weeks into propitious ones. I would also like to deeply thank my parents P A Mathew and Sherly Jacob, my sister Mary Mathew, and my dearest, most trusted friend Arya Sree for their prayers, motivation and steady backing. They deserve the entire credit for my emotional well-being, without which this thesis would have remained just a title.


    Contents

Abstract
List of Figures
List of Tables
0.1 Abbreviations

1 Introduction
1.1 Learning from human behaviour
1.2 Eye-Hand Coordination in robots using human behaviour
1.3 Using visual feedback to estimate manipulator model
1.4 Summary
1.5 Organisation of Thesis

2 Motivation and Objective
2.1 Problem Statement
2.2 Need behind self-modelling robotic manipulator
2.3 Various Challenges in the problem
2.4 Summary

3 Study of Existing Methods
3.1 Self Modelling
3.1.1 Evolutionary Robotics Approach
3.1.2 Continuous Self Modelling Approach
3.1.3 Self Recalibrating Approach
3.1.4 Human Inspired Self Modelling
3.1.5 Using Bayesian Networks for Learning Robot Models
3.1.6 Learning Based on Imitation
3.2 Eye Hand Coordination
3.2.1 Using Discrete Kinematic Mapping
3.2.2 Principle of Automated Tracking
3.2.3 Principle of Visual Servoing
3.2.4 Coordination Using Fuzzy Logic
3.2.5 Calibration Using Active Vision
3.2.6 Un-calibrated Stereo Vision in Coordination
3.3 Summary

4 Proposed Approach to the Problem
4.1 Approach
4.2 Steps in Detail
4.2.1 Acquiring Images and Manipulator Control
4.2.2 Image Processing
4.2.3 Curve Fitting and Finding DH Parameter
4.2.4 Optimize DH Parameters
4.2.5 Validating the Model
4.3 Summary

5 Stereo Image Processing and Obtaining Raw Data
5.1 Stereo Vision Fundamentals
5.1.1 Calibration of stereo camera
5.2 Image Processing
5.2.1 Image Segmentation
5.2.2 Refining ROI: Morphological Operations
5.2.3 Computing Centroid of ROI
5.3 Raw Data
5.4 Summary

6 Data Analysis
6.1 Statistical Learning
6.2 Regression Analysis
6.2.1 Circular Regression
6.2.2 Elliptical Regression
6.3 Non-Linear Optimization
6.3.1 Nelder Mead Approach
6.4 Simulation
6.4.1 Data generation
6.4.2 Results of simulation
6.4.3 Comments on simulation
6.5 Summary

7 Experiment Setup
7.1 Hardware used
7.1.1 The Invenscience LC ARM 2.0
7.1.2 Stereo Camera
7.1.3 Other Hardware
7.2 Detailed Steps of the Procedure
7.3 Design of Control Design Software
7.3.1 Software Architecture
7.4 Design of Analysis Tool Box
7.5 Summary

8 Results and Discussion
8.1 Results
8.1.1 Validation of model
8.2 Discussions

9 Conclusion and Future Work
9.1 Conclusion
9.2 Future Work and Scope

A Robotic Manipulators
A.0.1 Important Definitions
A.0.2 Denavit Hartenberg Notation
A.0.3 3D Transformations and Forward Kinematics
A.0.4 Inverse Kinematics

B Software Tools Used
B.0.5 Platforms
B.0.6 Libraries

Bibliography
List of Publications


    Abstract

The evolution and behaviour expressed by natural intelligence has always baffled scientists and researchers. This has led to the constant study of animals, including humans, to understand and mimic similar behaviour in machines. The intelligence of a machine comes from a set of mathematical formulations that approximate the model of the machine and of the environment it occupies. Apart from this, the model of a robot is essential for many reasons. The performance of every robotic system often depends on the accuracy of the model representing the actual system. Traditionally, the robot model is fed into the system by the designer and later calibrated to an adequate degree. This work presents an approach where the model of the system is evaluated from a little basic a priori information, with the rest of the information derived using visual feedback. The ability of the robot to self model will assist it to recover from partial damage, as well as create an opportunity to recalibrate the system using the on-board sensors. The model thus learned can be further refined over more trials using a learning based approach, which could lead to perfecting the coordination between the eyes and hands of the robotic manipulator system.

The thesis takes inspiration from the biological world for deriving the self model of the robot manipulator. The self model derived by the robotic system is used for developing eye-hand coordination in the robotic manipulator. The core technique of eye-hand coordination is the mapping from the visual sensors to the robotic manipulator. The visual data turns the open chain of the robotic manipulator into a closed chain. Many of the traditional methods were developed for robotic manipulators in structured environments with a camera that overlooks the entire workspace. An effective coordination between the visual sensor and the end effector will also help in manipulating objects in cases of partial occlusion.

This thesis suggests an approach to self model the manipulator by tracking the end effector position of the manipulator under various actuation commands and then deriving the characteristics of each link and joint from the information extracted from the visual data. The approach uses tools from statistical learning, like regression, and a numerical optimization technique. The approach was tested using an Invenscience robotic manipulator and a Point Grey binocular camera. The results show that it is possible to generate an approximate self model of the system with a reasonable degree of accuracy. The generated model is validated by taking some random configurations with the manipulator.


    List of Tables

8.1 Table showing coordinate locations of base joint
8.2 Table showing link lengths learned after analysis
8.3 Table showing various joint angles learned after analysis
8.4 Table showing link offsets learned after analysis
A.1 DH Parameters of Puma 560 manipulator


    Notations

The main abbreviations used in this thesis are listed below.

    0.1 Abbreviations

    1. DOF - Degree of Freedom

    2. PWM - Pulse Width Modulation

    3. LSB - Least Significant Bit

    4. MSB - Most Significant Bit

    5. RHS - Right Hand Side

    6. LHS - Left Hand Side

    7. SDK - Software Development Kit

    8. SMPS - Switch Mode Power Supply

9. V - Volt [unit of electric potential]

10. A - Ampere [unit of electric current]

    11. ROI - Region Of Interest

    12. API - Application Programming Interface


    Chapter 1

    Introduction

Animals, including human beings, have evolved in both physical and mental abilities, which made them capable of surviving in the dynamic environment of the earth. This evolution took many thousands of years to mature, and this immense ability to sustain and prevail robustly in unpredictable environments influenced researchers and scientists to create similar machines that could make our lives better. Thus the field of robotics emerged with the glorious dream of creating machines that can learn, think and behave like animals and ultimately humans [52]. This attempt to create artificially intelligent machines to match natural intelligence motivates many researchers to study human behaviour, decision making and learning processes. Finding an answer to questions of how human beings think, learn, coordinate and behave may show a path to this goal. In this respect, how our motor system is coordinated with the sensory system is a fascinating topic that has been investigated by many researchers.

    1.1 Learning from human behaviour

From an engineering perspective the human body is a marvellous, highly robust, carefully manufactured autonomous machine, a beautiful synergy of sensors, intelligence and actuators. The sensors are used to sense changes in the environment; intelligence analyses these changes and infers information, leading to decision making, based on which the actuators (muscles) are triggered accordingly.

Figure 1.1: An infant curious about his actions [50]

Though we are the most intelligent species on earth, human infants are among the least capable offspring at birth. They develop their instincts, knowledge and behaviour from the way they are nourished and nurtured. The influence of self learning is unquestionable: human infants learn relations and patterns from their actions and the corresponding reactions in the environment.

This work is primarily inspired by human development from infancy. There are several aspects of human behaviour that are interesting; however, this work concentrates on the development of the strong relationship between human hands and their eyes.

    Figure 1.2: A player playing tennis [60]

The tremendous relationship between the eyes and hands has coined the phrase "eye-hand coordination" in the literature. Human infants do not know how each of their muscles is connected or how they function; the same is the case with their visual system. Still, they figure out a relationship between their eyes and hands in due course of growth. A human infant's first acquaintance with an object leads to observation and subsequent manipulations in numerous ways (most of them incorrect). Finally, the infant learns either by himself or by observing a fellow human's ability to handle the object correctly. Gradually, the key to correctly reaching for and holding that particular object is generated. In spite of several failures and confusions, infants are able to continuously learn by making observations about their actions. The self-efficacy of infants increases gradually by understanding and learning the relationship between their motor behaviour, visual feedback and their interaction with the environment.

Self-efficacy can be defined as the possession of knowledge of how one's actions can affect the environment [26]. Humans tend to react to and manipulate an object based on this knowledge. Usually, the way of grasping an object is determined on the fly if we have prior experience in handling that particular object.

    Figure 1.3: A player hitting the baseball [32]

When games like tennis, badminton and cricket are played, the player concentrates only on the target (tennis ball / shuttlecock) rather than on his manipulator (hands). Figures 1.2, 1.3 and 1.4 show this explicit relationship between our hands and eyes. The player's hand gets automatically adjusted for a particular strike, which he plans dynamically based on the trajectory of the target. In the literature this is described as the eye-hand coordination of the player [24]. The player perfects this relationship through hours of practice (experience). This kind of self modelling, which results in the reaching and positioning of an arm while the eyes stay focused only on the object, is what this work attempts.

1.2 Eye-Hand Coordination in robots using human behaviour

Like humans, the robot manipulator should also be able to exploit the posture of manipulation as a solution to an obvious problem. It should be able to have an intuitive understanding of the problem and find its solution using prior experience of handling similar problems. When a new problem arises, it should be able to find at least an approximate solution, with the potential to perfect it using feedback and add it to its knowledge base. Even a human can mishandle an object due to initial inexperience or lack of supervision. Humans are able to see these tasks as obvious problems because their brain can correlate them with similar experiences, making use of the immense knowledge attained in due course of life. It has been observed that an imperfect model of the system results in a rather complicated control algorithm. A robotic system model can be imperfect for many reasons, such as noise, kinematic parameter errors, friction at the joints, elasticity of the links, backlash at the joints and other un-modelled errors.

Figure 1.4: The Robonaut robot by NASA undergoing calibration [5]

By making the robot self model, it attains the capability to correct its model, since mechanical imperfections can creep into the system as time progresses. Once the system goes out of sync with its sensor measurements, it often needs to be recalibrated to recover its prior performance [47]. The performance of a robotic platform depends heavily on the system model that is fed into its software. For the software, the entire system is a set of mathematical equations which it manipulates to bring out solutions to the current problem. The presence of robots in unstructured and uncertain environments makes it complicated to obtain an accurate model that is in sync with the onboard sensors. Self modelling can be an alternative to pre-programming of the robot model, since it uses direct sensory inputs [26]. Even after pre-programming, robotic systems are usually recalibrated in accordance with the sensory input. For example, industrial manipulators are recalibrated with vision data after a specific number of cycles to guarantee consistent performance.


    1.3 Using visual feedback to estimate manipulator model

Although a large number of works exist on making robots capable of learning and modelling the environment, relatively little is known about the capability of a robot to learn its own morphology from direct observation. This work tries to bring about a human-behaviour-inspired self modelling of the robotic system. Here the robot learns the relationships of its body by applying various actuations and then analyzing their effect on the environment. The software builds a coarse model of the robot by repeated observations and then refines the model by minimizing, through an iterative procedure, the error between the posture predicted by the coarse model and the observed posture. The method described here is a rather intuitive way of self modelling. This work concentrates on estimating the model of a manipulator using visual feedback. The typical approach of learning the kinematic model, then calibrating the sensors, and then learning the effect of both on the environment is avoided; rather, the robot by itself derives the sensor-actuator relationship by analyzing the result of its direct interaction with the environment. In this work, the robot analyzes the trace of a marker attached to the end-effector, produced by tracking the marker using image processing techniques. The trace of the end-effector will be a partial segment of some standard curve like a circle, an ellipse or a straight line, depending on the type of joint. Hence, finding the curve of which the trace is a part helps derive the parameters required for the kinematic representation of the manipulator. The kinematic representation thus obtained will contain a relationship between input and output, and will incorporate all the mechanical uncertainties ignored in usual modelling. The relationship between input and output is derived by treating the manipulator as a black box.
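To make the curve-fitting idea concrete, the sketch below fits a circle to a partial trace by algebraic least squares (the Kasa formulation). This is only an illustrative stand-in for the circular regression developed in chapter 6, and the synthetic arc replaces real marker data.

    import numpy as np

    def fit_circle(points):
        """Algebraic (Kasa) least-squares circle fit.

        points: (N, 2) array of x, y positions traced by the end effector.
        Solves x^2 + y^2 + a*x + b*y + c = 0 linearly, then recovers the
        centre (cx, cy) and radius r of the best-fit circle.
        """
        x, y = points[:, 0], points[:, 1]
        A = np.column_stack([x, y, np.ones_like(x)])
        rhs = -(x ** 2 + y ** 2)
        (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
        cx, cy = -a / 2.0, -b / 2.0
        return cx, cy, np.sqrt(cx ** 2 + cy ** 2 - c)

    # A noisy quarter arc stands in for a partial end-effector trace.
    theta = np.linspace(0.0, np.pi / 2, 50)
    arc = np.column_stack([3 + 2 * np.cos(theta), 1 + 2 * np.sin(theta)])
    arc += np.random.normal(scale=0.01, size=arc.shape)
    print(fit_circle(arc))   # approximately (3, 1, 2)

An elliptical fit follows the same pattern with the general conic model ax^2 + bxy + cy^2 + dx + ey + f = 0.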

    1.4 Summary

The field of robotics is moving towards the creation of robust architectures similar to biological machines like animals. The ability of humans to sense, learn and adapt through various actions has motivated researchers to try similar approaches in robots. The perfect coordination between human hands and visual sensors still remains a holy grail in robotic manipulator research. The relationship between the eyes and hands is derived in due course of growth from infancy, through observation of various actions and reactions of oneself and of the environment. A similar approach, deriving the relationship between the visual sensors (camera) and the manipulator without the use of any prior approximate model, would help solve many problems, such as the need for re-calibration, enable widespread usage of robotic manipulators in human-inaccessible unstructured environments, and ultimately improve the resilience and robustness of the system under slight damage.


    1.5 Organisation of Thesis

The work tries to relate the action and reaction of the manipulator using various data analysis techniques, which are explained in detail in the subsequent chapters.

    Chapter 2

Chapter 2 deals with the motivation and objective of this thesis work. The chapter explains the problem statement, the need behind a self modelling robotic manipulator, the various challenges to be addressed, and the approaches that could solve them. It lays out the possible subsections of the problem.

    Chapter 3

The literature survey done in regard to the thesis topic is presented in a concise fashion. Various existing approaches for self modelling and eye-hand coordination are discussed. The chapter has sections ranging from evolutionary robotics to human-inspired self modelling. The main techniques used in industry as well as in research to bring about eye-hand coordination between the robot manipulator and the camera are discussed.

    Chapter 4

Chapter 4 gives an insight into the proposed approach to solve the problem. The chapter builds on the literature survey and suggests a new way of self modelling using an intuitive method and regression. It also discusses the various steps of the experiment used to validate the derived model.

    Chapter 5

One of the major parts of this project is the stereo vision processing for self modelling. A stereo camera is the only sensor used to understand the manipulator configurations. The chapter explains the fundamentals of stereo vision and the calibration techniques of the stereo camera. It further details the various image segmentation algorithms employed in this work to derive data from the images.

    Chapter 6

This chapter consists of details of the various statistical learning tools available and some basic knowledge of the field. The chapter explains in detail the concept of regression and the algorithms specifically used in this work. The algorithms are simulated using generated data; the criteria used for the generation and the validation are also discussed.


    Chapter 7

Chapter 7 is about the experimental setup used to validate the method suggested by this thesis. It also describes the hardware selected for the manipulator as well as for the camera. Two different pieces of software were designed to learn the model of the manipulator; this chapter gives details about their architecture.

    Chapter 8

This is the most important chapter of the thesis. The results section displays screenshots of the various outputs obtained during the experiment. The discussion section presents the inferences derived from the method and the validation of the model learned by the manipulator.

    Chapter 9

This is the last chapter of the thesis and describes various techniques that can be implemented to further improve the system. It also contains a conclusion section which summarises the entire work and the results obtained.


    Chapter 2

    Motivation and Objective

Modelling of a robotic system is an age-old problem. A robot system performs various actions by manipulating its system model using various inputs and feedback measurements from the sensors. Thus the degree of closeness of the system model to the actual system determines the performance of the system. The following sections explain various aspects of self modelling of a robot system and the other objectives.

    2.1 Problem Statement

Deriving the entire model of a fully fledged robot system is beyond the scope of this work. This thesis concentrates on making a camera-manipulator robot system derive a relationship between its visual sensors and its manipulator, so as to achieve coordination between its eyes and hands. The robot system observes the reaction of its manipulator on the environment using visual sensors. A stereo camera is used as the visual sensor, since humans learn and perceive using binocular vision to obtain 3D information. Though absolute measurements are not possible with the human visual system, a relative measurement usually suffices for our visual cortex to learn from the environment. The problem statement can be illustrated using the schematic shown in figure 2.1.

    2.2 Need behind self-modelling robotic manipulator

Some of the aspects that motivate solving this problem are listed below.

Unavailability of human intervention when robots are sent on operations to remote planets or space stations.

To survive minor injuries in a foreign environment.

Re-calibration of robots in complex unstructured environments is not an easy task.

Figure 2.1: Overview of the problem statement

It helps towards the goal of making completely autonomous robots.

A robot system capable of re-modelling itself will be more robust.

The system can review its performance and modify itself accordingly, so that problems like wear, tear, friction and other mechanical constraints that accumulate over time are automatically incorporated.

The system can be made to perfect the model by evaluating its performance with each action-reaction pair.

The robot system becomes closer to its natural counterparts.

A huge amount of time can be saved.

The system automatically derives its own constraints.

As mentioned before, robotics research started with the intention of making machines that could either replace or help human beings in their daily tiresome chores. A system that is capable of modelling and re-modelling itself would completely avoid the need for a human operator or programmer. The system would be able to re-model itself in case of minor damages or additional mechanical constraints that may arise over time. The present robot manipulator industry manufactures manipulators for specific tasks, but humans, on the other hand, use two hands for all the tasks put in front of them. To impart similar behaviour to robot manipulators, the system should be able to learn about itself, its environment and its constraints.


    2.3 Various Challenges in the problem

There are several problems to be addressed in making the robot capable of learning from its actions so as to infer its own model. The various challenges identified in this problem, each to be solved individually, are the following.

Study of existing solutions and methods: This involves an extensive literature survey of current approaches used to address similar problems. Equally important is the need to thoroughly understand the basic concepts of robot manipulators and stereo vision.

Selection of hardware: This involves market research and finding a robotic manipulator that fits the problem. The manipulator chosen should have an adequate number of sensors and should be remotely controllable. Remote control lets the programmer sit far from the manipulator, thereby avoiding any proximity hazards. Apart from the robotic manipulator, the study and procurement of a stereo camera is also needed for stereo visual feedback. The camera chosen should have minimal distortion and lens aberration, so that the images contain little noise. The success of the entire experiment depends on the choice of the stereo camera, hence its importance.

Making the hardware operational: This step involves studying data sheets, software libraries and sample code so as to interface the camera and the manipulator with a computer. Knowledge of hardware interfacing and programming is required.

Principles of computer vision: Images obtained from the stereo camera are to be processed so as to infer logically plausible information from the camera data. This involves the study and implementation of various image processing and computer vision algorithms.

Elements of statistical learning: The raw data derived using the image processing algorithms are to be analyzed using learning algorithms so that the system model can be built from the visual data. Algorithms that can infer such information have to be selected or created. For example, for implementing curve fitting using regression there are multiple algorithms such as gradient descent, Newton-Raphson and Levenberg-Marquardt. The choice of algorithm depends on the performance parameters the designer considers, such as speed of execution, accuracy and repeatability (a toy sketch follows this list).

Design of control software: Ultimately, all the above steps have to be combined into a single piece of software that can self-learn the model of a manipulator. This step involves selecting the programming language and platform, and requires sufficient knowledge of the command signals to the hardware.
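To make the algorithm choice above concrete, here is plain gradient descent fitting a straight line by least squares; the data are synthetic and this is only an illustrative sketch, not code from the thesis.

    import numpy as np

    # Toy gradient descent for the least-squares line fit y = w*x + b,
    # the simplest of the algorithm choices listed above.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 100)
    y = 2.0 * x + 0.5 + rng.normal(scale=0.05, size=x.size)

    w, b, lr = 0.0, 0.0, 0.5
    for _ in range(2000):
        err = w * x + b - y                 # residuals
        w -= lr * 2 * np.mean(err * x)      # gradient of mean squared error w.r.t. w
        b -= lr * 2 * np.mean(err)          # ... and w.r.t. b
    print(w, b)                             # approximately 2.0 and 0.5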


    2.4 Summary

The main problem statement of this thesis has been explained in detail. The objective and the motivation behind solving this problem are clear from the chapter. Solving the problem consists of several sub-challenges, and the chapter gives an overview of the various problems to be addressed.


    Chapter 3

    Study of Existing Methods

Kinematic and dynamic models of the robot's own body are among the most important tools for the control and manipulation of robots. The process of studying the basic underlying mechanisms of the robot using its own actions on the world is known as model learning [46]. The availability of a large number of data analysis tools in statistics allows straightforward and accurate model approximation. The missing knowledge of the robot's model is completed by observing the robot's actions and reactions on the environment. This capability will help the robot to self model in unstructured, uncertain environments that humans cannot reach, like space.

This thesis aims at achieving eye-hand coordination using a self modelling approach. The thesis objective can be divided into two main problems: (i) self modelling and (ii) eye-hand coordination. Eye-hand coordination can be stated as the ability to reach a point seen by the camera in the workspace of the robot using its manipulator. For perfect eye-hand coordination the robot has to have a perfect model of itself. Since the robot is not pre-programmed with the model, it has to derive the input-output relationship by observing its own actions. This is known as self modelling or model learning.

    3.1 Self Modelling

Self modelling is the approach involuntarily practised by the natural world to learn the relationship between eyes and motor system. This model may not be in the form of an explicit mathematical relationship between input and output, as seen in the case of many robots. Natural counterparts un-learn, re-learn and improve the relationship between their eyes and hands as their experience in similar tasks increases. This is evident from the simple observation that only the sportsperson who practises excels in the field. It is also what allows natural counterparts to do a task in a different way, or to operate their limbs, even after an injury. The resilience and robustness seen in the natural world are highly desirable for engineered systems, since such capability ensures longer life and endurance. This thesis concentrates on a robotic manipulator system capable of self modelling. While most industrial manipulators are employed in structured environments for repetitive tasks, achieving robust performance under uncertainty must also be addressed. A few of the relevant approaches are listed in the following subsections.


    robotic manipulator system capable of self modelling. Most of the industrial manipulators

    are employed in structured environments for repetitive tasks, achieving robust performance

    under uncertainty must also be addressed. A few of the relevant approaches are listed in the

    following sub-sections.

    3.1.1 Evolutionary Robotics Approach

This is one of the earliest approaches by researchers to bypass the problem of modelling the robot controller, given the impossibility of foreseeing every problem the robot might face [48]. The evolutionary robotics approach is also called an automatic design procedure, and is based on the concept of survival of the fittest. The concept can be explained as deriving the overall body plan of the robot by progressively adapting to a specific environment and the specific problems confronted, through an artificial selection process that eliminates ill-behaving individuals in a population while favouring the reproduction of better adapted competitors [44].

The approach is implemented using evolutionary procedures like genetic algorithms [21], evolution strategies [59] or genetic programming [37]. The approach takes place in two steps. Initially, simulations performed on a specific robot model are downloaded to the actual system to check their fitness. In some cases, evaluations are performed on the actual robot itself rather than in simulation, to capture real-world constraints. Finally, the selected model is used to modify the so-called evolvable hardware [56].

Though evolutionary robotics is a good approach, one obvious problem is that it involves a large amount of trial and error, and hence experimentation will be expensive [28].

    3.1.2 Continuous Self Modelling Approach

Josh Bongard et al. in their work [10] show how the resilience of machines can be improved by continuous modelling. The work describes how an engineered system can fail in an unexpected situation or under damage, and presents an approach to autonomously recover the system from such damage. The work uses a four-legged platform that derives its own model indirectly, using its own actuation-sensation relationships. The model thus learned is used for forward locomotion of the robot. The platform is capable of self-adaptation once a leg part is removed. The algorithm developed consists of three stages: self-model synthesis, exploratory action synthesis and target behaviour synthesis. The algorithm is outlined in figure 3.1. The approach is based on the concept of generating multiple internal competing models and generating actions that maximize the disagreement between the predictions of these models.


Figure 3.1: Algorithm outline of continuous self modelling machines [10]

    3.1.3 Self Recalibrating Approach

Calibration is the technique of making modifications to an existing known model so as to match certain performance metrics like accuracy, precision and repeatability. Traditionally, robots were calibrated by a human operator; this later evolved into giving the robot the ability to calibrate itself based on a comparison of set points and sensor measurements. The error between the set value and the observed value is minimized by modifying the known system model.

Self calibration techniques can be classified into two categories: (i) the redundant sensor approach and (ii) the motion constraint approach. Vision-based sensors are used for accurate calibration in robots. The calibration of the robot Robonaut by NASA (figure 1.4) using visual feedback is an interesting work. In this work [47] the robot is calibrated by attaching a spherical marker to the end-effector of the manipulator and visually monitoring the marker position. A required configuration is set on the manipulator, which the manipulator achieves using the existing known system model. Using the existing model, the position the end effector should attain is predicted, and the actually attained end effector position is measured using a stereo camera. The error between the predicted and measured positions is minimized by modifying the robot model. The system automatically undergoes daily calibration procedures to meet the performance requirements.

A similar work is done by Alemu et al. in [6], which involves the self calibration of a camera-equipped SCORBOT [54]. The work claims that there is no need for multiple precise 3D feature points on the reference coordinate system. The kinematic parameters of the manipulator are found by measuring the joint parameters over various trajectories which are planned and simulated; the measured values are then compared with the simulated values to modify the system model. The paper uses a non-linear optimization algorithm to improve robustness. The usage of visual sensors avoids a drawback of sensor-based approaches, namely that some of the kinematic parameters depend on the error model; such a dependency would result in a cumulative error in the system model. The problem with the motion constraint approach is that errors in the locked joints may become unobservable. Other similar works are reported in [18], [45], [43] and [65].

    3.1.4 Human Inspired Self Modelling

In [26] Justin W. Hart et al. describe robotic self models inspired by human development. The work concentrates on two aspects: the ecological self, i.e., the agent's ability to encompass its sensorimotor capabilities, and self-efficacy, i.e., its ability to impact the environment. The ecological self can be described as a cohesive, egocentric model of the body and its relationship to the environment, arising from the coordination between the senses [55]. The sense of self-efficacy describes the causal relationship between the motor actions, the body and the environment.

The approach used for deriving the relationship is called kinematic learning, in which the structure of a robot arm in the visual field of a calibrated stereo setup is used. The principle is implemented on a 2 DOF manipulator of a humanoid robot that was modelled using the concept of kinematic learning [27]. Since the output is directly observed, the error model of the system is automatically incorporated.

    3.1.5 Using Bayesian Networks for Learning Robot Models

Antony Dearden et al., in their work titled "Learning forward models for robots" [17], describe the usage of a Bayesian network for learning and representing the forward models of robots. The approach is outlined in figure 3.2. The feedback about the motor system comes from the visual sensors the system possesses. The system learns the model by observing the results of the actuation produced by sending motor babbling commands.

    Figure 3.2: Learning forward models of robots using Bayesian Networks [17]


    3.1.6 Learning Based on Imitation

This approach deals with the ability to imitate an action performed by another robot or by an individual in front of the system. Leitner et al. in [42] describe a method adopted to teach a humanoid platform named iCub to learn its model. The humanoid robot is trained using another manipulator, which moves an object to different locations that are not known a priori. The locations lie within the work envelope of the humanoid robot. The humanoid uses a neural network [58] to relate its pose and visual inputs for object localization. In this work too, a prior model is available, which is refined to a more accurate model using principles of machine learning [8].

In short, self modelling makes the robot learn its model and recalibrate itself. The model can be continuously modified so as to attain better performance with each iteration. Most of this literature tries to derive the system model by looking at the system as a black box and deriving the model from the input and the observed output.

    3.2 Eye Hand Coordination

The intention of this work is to achieve eye-hand coordination using a self modelling approach. Eye-hand coordination is an important skill that a person performs effortlessly [64]. The non-quantitative nature of human vision makes this a skill that links perception to action in an iterative manner.

    3.2.1 Using Discrete Kinematic Mapping

M. Xie in [64] describes a developmental approach to robotic hand-eye coordination. The work tries to develop a computational principle for mapping visual sensory data into the hand's motion parameters in Cartesian space. Eye-hand coordination involves the development of the kinematic model of the manipulator and its inverse kinematic equations. M. Xie proposes a new method, called discrete kinematic mapping, to directly map coordinates in task space into coordinates in joint space at different levels of granularity, which can be gradually refined through development. The approach is briefly illustrated by figure 3.3.

    3.2.2 Principle of Automated Tracking

Another interesting work is by Peter K. Allen et al. [7], in which automated tracking and grasping of a moving object with a robotic hand-eye system is performed. The work focuses on three main aspects of eye-hand coordination: (i) fast computation of 3D motion parameters from vision, (ii) predictive control of a moving robotic arm, and (iii) interception and grasping using the end effector of the manipulator. Principles of optical flow from computer vision, Kalman filter concepts and probabilistic noise models were employed in the control algorithms used for the task.

Figure 3.3: M. Xie's developmental approach [64]

    3.2.3 Principle of Visual Servoing

The principle of visual servoing is to first find the target location to be reached and then incrementally move the manipulator to the desired location. Tracking of the arm is performed so as to correct the trajectory while the arm moves towards the target. The tracking module and the robot control module together form a closed loop in visual servoing. Visual servoing is divided into two types: (i) eye-in-hand and (ii) stand-alone. Eye-in-hand systems have a camera attached to the end effector that moves with it towards the goal, while stand-alone systems have a camera that overlooks the manipulator and the target. In the case of stand-alone systems, the work envelope of the manipulator is limited by the visual field of the camera. There are also methods which use two cameras, one on the end effector while the other overlooks the workspace. These are shown in figure 3.4.

Figure 3.4: Principle of visual servoing [38]
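The closed loop described above is often summarized by the classical image-based servoing law v = -lambda * pinv(L) * e, where e is the image-feature error and L the interaction matrix. The sketch below is that textbook update, not the controller of any specific work cited here; L is assumed known and constant for simplicity.

    import numpy as np

    def servo_step(L, s, s_star, gain=0.5):
        """One iteration of the classical image-based servoing law.

        L      : (k, n) interaction (image Jacobian) matrix, assumed known.
        s      : current image features; s_star : desired image features.
        Returns the commanded velocity v = -gain * pinv(L) @ (s - s_star).
        """
        return -gain * np.linalg.pinv(L) @ (s - s_star)

    # Toy loop: 4 feature coordinates controlled through 2 degrees of freedom.
    L = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.5], [0.2, 1.0]])
    s = np.array([10.0, 4.0, 12.0, 6.0])
    s_star = s - L @ np.array([2.0, 1.0])      # error inside col(L), so fully correctable
    for _ in range(20):
        s = s + L @ servo_step(L, s, s_star)   # feature motion caused by the command
    print(np.round(s - s_star, 3))             # error driven to zero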


    3.2.4 Coordination Using Fuzzy Logic

Sukir et al. in their work [39] propose the use of fuzzy logic for developing coordination between the eyes and hand of a robotic manipulator. The eye-hand coordination of a robotic manipulator is described as an ill-posed inverse mapping problem. The algorithm implemented uses an inverse homogeneous transformation technique to find the pose of the work piece with respect to the world coordinate system, given the pose of the end effector, the 2D image with the work piece position, the camera parameters and the pose of the robot's end-effector.

Usually a change in the camera position or manipulator position demands re-calibration of the entire system. To avoid this, the authors calculated the projective transformation at the two extreme vertical positions of the camera (the range of vertical motion possible) and used fuzzy logic to determine the change in coefficients. The implementation of fuzzy logic consists of (i) determination of the fuzzification input (the squared distance between two known points in the image), (ii) selection of fuzzy inference rules for reasoning [36], and (iii) defuzzification and computation of outputs (calculation of the required values of the coefficients). Though the algorithm claims to remove the need for re-calibration, the method can only be used in the case of a vertical displacement of the camera.

    3.2.5 Calibration Using Active Vision

The principle of active vision is similar to visual servoing. In this method, multiple cameras (usually more than two) are used to overlook the workspace, and a camera is also attached to the end-effector. All the cameras that overlook the workspace are calibrated and fixed. The advantage of using multiple cameras is that it avoids visual occlusion and also gives better trajectory planning capabilities. The disadvantage of this system is that mobility of the manipulator platform is not possible; that is, the base of the manipulator must stay in a fixed location. This approach is used in industries where robots perform manufacturing jobs. The last image in figure 3.4 is an example of active vision. The same principle is used in the calibration of Robonaut by NASA [47]. Current industrial manipulators undergo calibration using active vision after a particular interval of task execution. This is very much needed in the case of precision placement and repetitive tasks. The method is well suited to structured indoor environments.

    3.2.6 Un-calibrated Stereo Vision in Coordination

Nicholas H. et al. in [31] implement an affine stereo algorithm to estimate the position and orientation to be reached. Affine active contour models are used to track, in real time, the motion of the robot end-effector across the two images. The affine contours used are similar to B-spline snakes with discrete sampling points. The shape of the object itself is used as the tracking feature. In the algorithm implemented, at each time step the tracker moves and deforms to minimize the sum of squared offsets between the model and the image edges.

3.3 Summary

Although some approaches exist to incorporate changes into the model using parametric identification, this remains difficult in the case of complex machines. To sum up, model learning can be a better substitute for manual re-programming for eye-hand coordination of the robot, since the robot tries to derive the model from the measured data directly. Model learning can be divided into two parts: (i) deducing the behaviour of the system from observed quantities, and (ii) determining how to manipulate the system using the inferred data [46]. Many of the relevant approaches in the literature are cited above. Human-inspired self modelling shows a clear departure from the traditional model learning paradigms and appears fruitful if pursued.

Eye-hand coordination is still considered a task yet to be perfected, since a long gap remains in comparison with the natural counterparts. The problem of mechanical constraints poses a serious limitation and argues for the need for re-calibration after a certain number of cycles. One important thing to note is that many of these methods deal with manipulators that are either fixed or belong to a structured environment, and hence most of them cannot be implemented in unpredictable environments.

All these studies call for a new way to acquire eye-hand coordination in robotic manipulator-vision systems, which would improve their resilience and robustness and could be a step towards the attainment of singularity.


    Chapter 4

    Proposed Approach to the Problem

From the literature survey it is clear that black box imitation learning can at best produce the desired behaviour [46]. Deriving a relationship between the input and output without seeking details of the internal system automatically takes into account the mechanical constraints, friction and other parameters which are ignored or only approximated in usual modelling.

    The following section deals with the important theory behind the approach that is used

in this thesis. The approach is based on the assumption that the most common joints in a robot

manipulator are either rotary (R) joints or prismatic (P) joints. Other joints, like the cylindrical joint (R-P) or the spherical joint (R-R-R) [35], can be expressed as combinations of these joints. When a link attached by an R or a P joint moves, the trace of the tip of the link will either be a circle (or an ellipse) or a straight line. To find the parameters of

the partial curve produced by the trace, one needs curve fitting techniques such as

    regression analysis. The necessary values that are required to learn the coarse model of

    the manipulator can be obtained from the best fit curve to the raw data produced from the

    trace of the tip of the link. The end effector of a manipulator can be related to the base

of the manipulator using either the Denavit-Hartenberg (DH) convention or the Product of Exponentials. This work uses the DH convention since it requires far fewer parameters than the Product of Exponentials. The details of the methods employed in this work are explained

    in detail as follows.

    4.1 Approach

The approach suggested by this thesis is illustrated in figure 4.1. The solution starts with operating the robotic manipulator using specific control commands wirelessly transmitted to it, while simultaneously recording the changes in the end-effector position using a stereo camera. After the images are acquired, they are processed to obtain the manipulator end-effector position, which is used as input to the curve fitting stage, which tries to


    find out the relationship between the raw data points.

    Figure 4.1: Flow chart of the proposed solution

Using the curve fitting results and stereo vision principles (section 5.1), the various DH parameters (A.0.2) are derived. From the DH parameters of the manipulator, a coarse model of the system is formed. This coarse model is then refined into a finer model using a numerical optimization technique. The model is finally validated on random configurations.


    4.2 Steps in Detail

The various steps involved in the approach are described in detail in the subsequent sections.

    4.2.1 Acquiring Images and Manipulator Control

Since the manipulator works in a 3D world, depth perception is essential, and hence a stereo camera providing binocular vision needs to be used. The camera should also be calibrated so that real-world values can be computed from the images. Camera calibration provides values like the centre of the image, the focal length of the lens, the distortion parameters and the extrinsic parameters between the two cameras of the stereo rig. The manipulator is controlled wirelessly so as to avoid any proximity hazards for the programmer. Only one joint of the manipulator is moved at a time, so that the trace of the end-effector corresponding to that particular joint can be obtained.

    4.2.2 Image Processing

The images obtained from the camera are processed to extract useful information. The images have to be undistorted, rectified, pre-processed and segmented to locate the marker attached to the end-effector of the robot manipulator. The centroid of the segmented marker is computed to obtain the centre location of the marker in the image. The image locations from both the left and right cameras are then used to compute the 3D world position of the manipulator end-effector.

    4.2.3 Curve Fitting and Finding DH Parameter

The raw data obtained from the image processing stage has to be analyzed using the various statistical learning tools available, so as to derive relationships within the data. As mentioned earlier, a revolute joint moves in a circular fashion, and hence the path taken by the end-effector will be part of a circle. If the manipulator end-effector is moving in a plane parallel to the image plane, its image trace will be circular. If the end-effector is moving in a plane that is not parallel to the image plane, the image trace will be elliptical. So, by using circular and elliptical curve fitting, one can deduce the complete parameters of the curve traced by the manipulator. These are then used to derive the DH parameters of the manipulator.

    Checking Validity

Using the details provided in A.0.2, the transformation matrix can be calculated as follows. The general transformation matrix from frame i to frame i-1 is given


as equation 4.1 [35].

$$
{}^{i-1}T_i =
\begin{bmatrix}
\cos\theta_i & -\sin\theta_i\cos\alpha_i & \sin\theta_i\sin\alpha_i & a_i\cos\theta_i \\
\sin\theta_i & \cos\theta_i\cos\alpha_i & -\cos\theta_i\sin\alpha_i & a_i\sin\theta_i \\
0 & \sin\alpha_i & \cos\alpha_i & d_i \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{4.1}
$$

The details of how to derive this transformation are discussed in Chapter 2. The parameters with subscript i belong to frame i and those with subscript i-1 belong to frame i-1; a_i is the link length of link i and d_i is the offset of frame i from frame i-1. The inverse transformation, from frame i-1 to frame i, is given by equation 4.2.

$$
{}^{i}T_{i-1} =
\begin{bmatrix}
\cos\theta_i & \sin\theta_i & 0 & -a_i \\
-\sin\theta_i\cos\alpha_i & \cos\theta_i\cos\alpha_i & \sin\alpha_i & -d_i\sin\alpha_i \\
\sin\theta_i\sin\alpha_i & -\cos\theta_i\sin\alpha_i & \cos\alpha_i & -d_i\cos\alpha_i \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{4.2}
$$

By substituting the computed manipulator values into the above matrices and applying transformation fundamentals, the validity of the model can be checked: commands are given to the manipulator, the 3D location is predicted using the internal sensor readings, and the prediction is compared with the actual position reached, measured at the end-effector.
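To make this check concrete, here is a minimal Python sketch of the forward-kinematics prediction step; the three-link DH table and joint readings below are illustrative placeholders, not the identified parameters of the experimental manipulator.

```python
import numpy as np

def dh_transform(theta, alpha, a, d):
    """Homogeneous transform from frame i to frame i-1, as in equation 4.1."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_table, joint_angles):
    """Chain the per-link transforms to get the base-to-end-effector pose."""
    T = np.eye(4)
    for (alpha, a, d), theta in zip(dh_table, joint_angles):
        T = T @ dh_transform(theta, alpha, a, d)
    return T

# Hypothetical 3-link DH table of (alpha, a, d) per link -- placeholder values.
dh_table = [(np.pi / 2, 0.0, 0.10), (0.0, 0.15, 0.0), (0.0, 0.12, 0.0)]
T = forward_kinematics(dh_table, [0.1, 0.3, -0.2])
print("Predicted end-effector position:", T[:3, 3])
```

The predicted position from this model is then compared with the position measured by the stereo camera to obtain the model error.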

    4.2.4 Optimize DH Parameters

    The DH parameters obtained in the previous section will provide only a coarse model of

the manipulator, due to the noise present in the data. The DH parameters obtained are

    optimized and refined by using a numerical optimization technique. The previous subsec-

    tion illustrates how to check the validity of the obtained coarse model. The computed error

    between the predicted value and observed value is to be minimized to refine the DH table.

The error is minimized by adjusting the DH parameters. This step produces a highly refined result within the tolerance limit.

    4.2.5 Validating the Model

    The model finally obtained is to be validated. Initially the experiment started with no knowl-

    edge of the DH table of the manipulator. The final section provides the DH parameters that

    are required to describe the manipulator. The model can be validated by making the ma-

nipulator reach random goal locations using the obtained model. The random locations are supplied using a marker attached to a rod that the programmer moves in the workspace; the marker is identified and reached by the manipulator end-effector.


    4.3 Summary

    This chapter discusses the proposed solution to the problem posed in the previous chapters.

    The chapter also explains various steps that are required to implement the experiment to

obtain the model of the manipulator. The experiment starts with only basic knowledge of the manipulator, such as the number and type of joints present, and with no knowledge of the DH parameters of the robot. The steps explain how the suggested approach leads to a solution, which is discussed further in the later chapters.


    Chapter 5

    Stereo Image Processing and Obtaining

    Raw Data

    5.1 Stereo Vision Fundamentals

Animals, including humans, see the world through sophisticated visual systems. The two eyes, separated by a distance, enable the brain to accurately produce a 3D representation of the world [14]. This is also the basis of various 3D imaging systems, as can be seen from figure 5.1. Stereo vision works with the basic assumption that there is considerable overlap

    between the images formed on both the left and right camera. The principle behind the

working of stereo vision is that points in a scene seen by both cameras are imaged differently due to the difference in their relative positions. The image in one frame, say the left frame, will be shifted by a few pixels compared to the image in the right image plane. This

    difference between the corresponding pixels on both images of the same 3D world point

    is called the disparity of the pixel. By calculating a disparity map for the entire image,

    the relative depth map of the world can be calculated provided one knows the internal

    Figure 5.1: Schematic view of a 3D camera system


and external parameters of the camera. The depth of a 3D world point is calculated using equation 5.1.

$$
Z = \frac{fT}{d}
\tag{5.1}
$$

where f is the focal length of the camera in pixels, T is the baseline between the two cameras in meters, and d is the disparity of the given point in pixels.
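As a quick numerical illustration of equation 5.1, the sketch below converts a disparity value to depth; the focal length, baseline and disparity used are arbitrary example values.

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Depth of a point via Z = f*T/d (equation 5.1)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return f_px * baseline_m / disparity_px

# Example: f = 700 px, baseline = 0.06 m, disparity = 21 px  ->  Z = 2.0 m
print(depth_from_disparity(700.0, 0.06, 21.0))
```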

To find the correspondences between the pixels of both images, several pre-processing steps have to be performed, such as image undistortion, rectification and alignment [12].

    5.1.1 Calibration of stereo camera

The mathematical model that is used to describe image formation is called weak perspective [12]. Imperfections in the camera lens lead to distorted images which, if not corrected, can give false readings during inference. This is avoided by calibrating the camera. The process of finding the intrinsic and extrinsic parameters of the camera to correct the images is known as camera calibration. Intrinsic parameters include lens distortions like radial distortion, tangential distortion and spherical aberrations. Extrinsic parameters consist of external camera quantities like translation and rotation. In the case of a stereo

(a) Left Image (b) Right Image (c) Left Image (d) Right Image
Figure 5.1: Left and right image pairs captured during stereo camera calibration

camera, both cameras present are to be calibrated. Usually camera calibration is performed using a checkerboard, which is captured in both left and right camera images. Then, using the algorithm described in [30], the various parameters are derived. A MATLAB [23] toolbox


is available for stereo calibration and can be found in [11]. A few of the images captured during calibration are shown in figure 5.1.
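As an aside, the same calibration can be scripted with OpenCV's Python bindings instead of the MATLAB toolbox; the minimal sketch below assumes a 9x6 checkerboard with 25 mm squares and file names of the form left_*.png / right_*.png, all of which are illustrative assumptions.

```python
import glob
import cv2
import numpy as np

# Assumed checkerboard: 9x6 inner corners, 25 mm squares (example values).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 0.025

obj_pts, left_pts, right_pts = [], [], []
for lf, rf in zip(sorted(glob.glob("left_*.png")), sorted(glob.glob("right_*.png"))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    okl, cl = cv2.findChessboardCorners(gl, pattern)
    okr, cr = cv2.findChessboardCorners(gr, pattern)
    if okl and okr:  # keep only pairs where the board is seen by both cameras
        obj_pts.append(objp)
        left_pts.append(cl)
        right_pts.append(cr)

size = gl.shape[::-1]  # (width, height) of the images

# Intrinsics of each camera, then the extrinsics (R, T) between the two.
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
_, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, d1, K2, d2, size,
    flags=cv2.CALIB_FIX_INTRINSIC)
print("Baseline between the cameras (m):", np.linalg.norm(T))
```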

5.2 Image Processing

The images obtained through the stereo camera need to be processed to infer data from them. Processing is done in three steps. First is a pre-processing step that involves undistorting and rectifying the camera images. Second is the filtering and smoothing of the images. Third is the segmentation of the images, followed finally by the calculation of the centroid of the segmented region to obtain the position of the marker attached to the end-effector.

Image smoothing is performed to remove or suppress the high-frequency noise present in the system. A median filter is used for effective removal of salt-and-pepper noise.

    5.2.1 Image Segmentation

Segmenting an image means extracting a region of interest from it. Several methods exist for the segmentation of images [51]. All methods rely on some prominent feature of the image, like colour, texture, or scale-invariant features such as corners, edges, entropy changes, etc. Since segmentation is not the chief aim of this thesis, a relatively simple method was chosen.

In this thesis, data from the image is derived using a simple colour thresholding technique. Since colour encoding in the RGB domain is strongly dependent on external light intensity, a less susceptible encoding, the HSV colour space, is used instead. An example of this method is illustrated in figure 5.2. The technique can be expressed by equation 5.2.

$$
I(x,y) =
\begin{cases}
1 & \text{if } I(x,y) \geq \text{threshold value} \\
0 & \text{otherwise}
\end{cases}
\tag{5.2}
$$

    (a) Image (b) HSV Image (c) Segmented Image

    Figure 5.2: Colour thresholding based image segmentation
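A minimal OpenCV sketch of this HSV thresholding step is given below; the HSV bounds and the file name are placeholders to be tuned for the actual marker colour.

```python
import cv2
import numpy as np

def segment_marker(bgr_image, lower_hsv, upper_hsv):
    """Binary mask of pixels inside the given HSV range (equation 5.2)."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    return cv2.inRange(hsv, lower_hsv, upper_hsv)

# Placeholder bounds for a red-ish marker; tune for the actual marker colour.
mask = segment_marker(cv2.imread("left_frame.png"),
                      np.array([0, 120, 70]), np.array([10, 255, 255]))
```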


    5.2.2 Refining ROI: Morphological Operations

Morphology in image processing is based on set theory. These operations are performed on binary images, where the sets are the objects. Erosion and dilation are the two fundamental operations in morphological image processing.

Erosion: With A and B sets in $\mathbb{Z}^2$, the erosion of A by B is denoted by equation 5.3:

$$
A \ominus B = \{ z \mid (B)_z \subseteq A \}
\tag{5.3}
$$

This means that the erosion of A by B is the set of all points z such that B, translated by z, is contained in A. B is usually a structuring element, and it does not share any information with the background. Erosion can equivalently be represented by equation 5.4:

$$
A \ominus B = \{ z \mid (B)_z \cap A^{c} = \emptyset \}
\tag{5.4}
$$

where $A^{c}$ is the complement of A and $\emptyset$ is the empty set.

Dilation: With A and B as sets in $\mathbb{Z}^2$, the dilation of A by B, denoted $A \oplus B$, is defined by equation 5.5:

$$
A \oplus B = \{ z \mid (\hat{B})_z \cap A \neq \emptyset \}
\tag{5.5}
$$

Morphological closing is another operation in image processing, in which dilation followed by erosion with the same structuring element is performed. An illustration can be seen in figure 5.3.

(a) Thresholded Binary Image (b) Image after morphological close operation (c) Image after morphological open operation
Figure 5.3: Morphological Operations
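Continuing from the segmentation sketch above, these operations reduce to one-liners in OpenCV; the 5x5 square structuring element below is an assumed size, not the one used in the experiments.

```python
import cv2
import numpy as np

kernel = np.ones((5, 5), np.uint8)  # assumed 5x5 square structuring element

# Closing (dilation then erosion) fills small holes inside the marker blob;
# opening (erosion then dilation) removes small speckles around it.
closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
cleaned = cv2.morphologyEx(closed, cv2.MORPH_OPEN, kernel)
```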

    5.2.3 Computing Centroid of ROI

The end-effector of the manipulator should be described by a single point in the image. Since the segmented region contains more than one pixel, the centroid of the region is calculated to denote the exact position of the end-effector. The centroid of the segmented image


is calculated by computing the moments of the image. A moment of an image is defined as a weighted average of the pixels contained in the image. Mathematically, it can be represented as equation 5.6.

$$
M_{ij} = \sum_{x} \sum_{y} x^{i} y^{j} I(x,y)
\tag{5.6}
$$

The centroid is computed from the moments of the segmented image, as in equation 5.7, where $\mu_{pq}$ denotes the central moments.

$$
\mu_{pq} = \sum_{x}\sum_{y} (x-\bar{x})^{p} (y-\bar{y})^{q} I(x,y), \qquad
\text{Centroid}_X = \frac{M_{10}}{M_{00}}, \quad
\text{Centroid}_Y = \frac{M_{01}}{M_{00}}
\tag{5.7}
$$
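A short sketch of this computation using OpenCV's moments, continuing from the cleaned mask of the previous sketch:

```python
import cv2

def marker_centroid(binary_mask):
    """Centroid (M10/M00, M01/M00) of the segmented marker (equation 5.7)."""
    m = cv2.moments(binary_mask, binaryImage=True)
    if m["m00"] == 0:
        return None  # marker not visible in this frame
    return m["m10"] / m["m00"], m["m01"] / m["m00"]

# Continuing from the morphology sketch above:
# print(marker_centroid(cleaned))
```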

    5.3 Raw Data

Before using any algorithm to find patterns in raw data, it is always better to analyze the raw data manually. This is not always feasible: if the amount of raw data is huge, it is practically impossible to find a pattern or to comment on the correctness of the data by manual observation. Here, however, the amount of data is small, with at most 41 data points per joint. The raw data plots obtained from the tracking are shown below. The x axis contains the row coordinates of the image pixels and the y axis the column coordinates.

Figure 5.5 consists of plots taken from the RHS camera of the stereo rig, while figure 5.4 consists of plots taken from the LHS camera.

    Comments on the raw data obtained

Since each of the joints present in the manipulator is revolute, we expect an arc as the trace of the marker. The arc can be part of a circle or an ellipse: a circle, when viewed in a plane skewed relative to the image plane of the camera, appears as an ellipse.

(a): The raw data show the trace of the end-effector marker as seen by the left and right cameras of the stereo rig. The DOF 1 data from both cameras look like points on an incomplete ellipse.

(b): The DOF 2 data from both cameras look like part of a circle.

(c): The DOF 3 data from both cameras look like part of a circle.

(d): The DOF 4 data from both cameras also look like part of a circle.

The raw data obtained contain little noise and are expected to fit well with the curve fitting methods explained in the next chapter.


(a): DOF 1 Data (b): DOF 2 Data (c): DOF 3 Data (d): DOF 4 Data
Figure 5.4: Raw data plots of LHS Camera

(a): DOF 1 Data (b): DOF 2 Data (c): DOF 3 Data (d): DOF 4 Data
Figure 5.5: Raw data plots of RHS Camera


    5.4 Summary

The fundamentals of stereo vision are explained, along with the need for calibration and the method of calibrating the camera so that useful information can be derived from the images. The acquired images are segmented using a simple thresholding technique, and the centroid of the segmented region is calculated by computing the moments of the image. The centroid thus calculated gives the position of the end-effector of the robotic manipulator in the image. The raw data plots are drawn against the X and Y coordinates; the data visually resemble the standard curves of an ellipse and a circle. Further analysis of the data can be found in later chapters.


    Chapter 6

    Data Analysis

6.1 Statistical Learning

The usage of various tools of statistics like regression, classification and clustering for

    qualitative and quantitative analysis is known as Statistical Learning. This domain can be

    classified into supervised learning and unsupervised learning. Supervised learning can be

defined as predicting or drawing inferences from a set of inputs and corresponding outputs.

    Unsupervised learning can be defined as deriving structure or relationship in the input data

without using a supervisory output. All statistical analysis problems can be collectively classified into three kinds: (i) regression problems, (ii) classification problems, and (iii) clustering problems.

Supervised learning involves the estimation of a function that approximates the actual function using input observations and corresponding outputs. The parameters on which the data depend are called predictors or variables. Suppose there are p variables, denoted $X_1, X_2, X_3, \ldots, X_p$, for an output Y; then the actual relationship between input and output can be written as equation 6.1.

$$
Y = f(X_1, X_2, \ldots, X_p) + \epsilon
\tag{6.1}
$$

where $\epsilon$ denotes the error in the function, taken with zero mean. The function can be estimated by $\hat{Y} = \hat{f}(X_1, X_2, \ldots, X_p)$. The degree of closeness of $\hat{f}(X_1, X_2, \ldots, X_p)$ to the true f in 6.1 determines the performance of the model. The observations and outputs that are

used to estimate the function are called the training data, and the inputs used to predict or estimate outputs are called the test data. The model is derived using the training data, and the accuracy of the model is evaluated on the test set. Supervised learning mainly consists

of the regression problem and the classification problem [29]. The regression problem can be defined as the prediction of a quantitative value based on an estimated model; for example, predicting the exact wage of a person using features such as experience, qualification, etc.

The classification problem deals with qualitative inference from the data: for example, predicting the survival of a cancer patient based on the survival and death records of previous cancer patients. Here we are interested only in the categorical aspect, whether the person will die or survive, and not in some exact numerical value.

Unsupervised learning involves deducing patterns or relationships in the data so as to group the observations into different clusters. This enables finding groups with similar characteristics. The difficulty with clustering is that there is no way to know the number of clusters beforehand. Here we are not predicting or inferring any information from the observations. For example, to know which genes cause cancer, clustering algorithms are used to group genes that have similar gene expression.

This thesis involves only a few techniques of supervised learning, all of which involve the estimation of a function. Function estimation is generally done using two kinds of methods: parametric methods and non-parametric methods. A parametric method involves two steps. The first step is to make an assumption about the possible function that will best estimate the given data; for example, if the analyst thinks a linear model will best fit the data, then an equation like $Y = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p$ can be chosen. Estimation of the function then reduces to finding the parameters $\beta_1$ to $\beta_p$. The second step involves estimation of the parameters using the available

training data. Non-parametric methods make no such assumption and try several functions to find the one best suited to the input data. Parametric methods are simpler, since a function model is already assumed, but they have the disadvantage that the chosen model may not adequately fit the data [29]. Non-parametric methods have the advantage that they estimate a function that comes very close to the actual function (within the allowed tolerance limit), but they can be computationally expensive due to the lack of an idea of the best-fit function.
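As a toy illustration of the two parametric steps, the sketch below assumes a linear model and estimates its $\beta$ coefficients by ordinary least squares; the synthetic data and coefficient values are assumptions made purely for illustration.

```python
import numpy as np

# Step 1 (assumed model): y = beta_1*x1 + beta_2*x2.
# Synthetic training data generated with beta = (2, -3) plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 2.0 * X[:, 0] - 3.0 * X[:, 1] + 0.1 * rng.normal(size=100)

# Step 2: estimate the parameters from the training data.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [2, -3]
```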

As mentioned, the performance of the estimated function depends on the mean squared error calculated between the data and the approximating function. The mean squared error consists of three components: the bias, the variance and the irreducible error. The bias and variance form the reducible part of the error, since choosing a better approximation technique can result in a smaller error; details of some relevant approximation techniques are covered later in this chapter. The remaining, irreducible error stems from the noise term and cannot be completely eliminated. Variance is defined as the deviation that will arise in the approximated function when

    a different training set is used. Ideally this difference should be very low. But many a time

    flexible functions are chosen for better fit to the data. Highly flexible function approxima-

    tions will have larger deviation since they are highly sensitive to the errors in training data.

    Hence a slight variation in training set can result in a different approximated function. In

short, the more flexible the function, the higher its variance. Bias is the error that gets introduced when a real-life problem is approximated by a model. Approximation always introduces error, and this is reduced by choosing functions that are more flexible, so that they fit the data more precisely. It is seen that as flexibility increases, bias reduces. Initially, the


    rate of decrease of bias is greater than the rate of increase of variance.

In short, the total error present in the system depends on the flexibility of the function. However, the actual performance of the approximated function is to be assessed by calculating the mean squared error on the test data rather than the training data. Though bias reduces with an increase in the flexibility of the model, it is observed that this influence does not last forever. The effect of flexibility levels off beyond a point, and from then on the error is dominated by the increasing variance. Hence the flexibility of the function is chosen as an optimal value based on the relative rates of change of variance and bias.

    6.2 Regression Analysis

    Regression analysis can be defined as a statistical tool that is used to derive relationship

    between variables from a set of data. We use regression analysis for curve fitting. There

    are different types of regression analysis based on the type of curve we try to fit into the

    available data. Here we use circular regression and elliptical regression.

    6.2.1 Circular Regression

    The most accurate and robust fit minimizes geometric (orthogonal) distances from the ob-

    served points to the fitting curve. Fitting a circle to scattered data involves minimization

of [15]

$$
f(a,b,R) = \sum_{i=1}^{n} \left[ \sqrt{(x_i - a)^2 + (y_i - b)^2} - R \right]^2
\tag{6.2}
$$

where $x_i, y_i$ are the data x- and y-coordinates, and (a, b) and R are the centre coordinates and radius respectively. This is a relatively simple expression, and we use the Levenberg-Marquardt (LM) procedure for fitting the circle. The algorithm is given as Algorithm 1 [15].

Algorithm 1: LM procedure for fitting a circle

1: Initialize $(a_0, b_0, R_0)$ and $\lambda$; compute $F_0 = F(a_0, b_0, R_0)$
2: Assuming $a_k, b_k, R_k$ are known, compute $r_i, u_i, v_i$ for all $i$
3: Assemble the matrix N and the vector $J^T g$
4: Compute the matrix $N_\lambda = N + \lambda I$
5: Solve the system $N_\lambda h = J^T g$ for h by Cholesky factorization
6: If $\|h\|/R_k < \varepsilon$ (a small tolerance), terminate the procedure
7: Use $h = (h_1, h_2, h_3)$ to update the parameters: $a_{k+1} = a_k + h_1$, $b_{k+1} = b_k + h_2$, $R_{k+1} = R_k + h_3$
8: Compute $F_{k+1} = F(a_{k+1}, b_{k+1}, R_{k+1})$
9: If $F_{k+1} \geq F_k$ or $R_{k+1} \leq 0$, increase $\lambda$ and return to Step 4; otherwise, increment k, decrease $\lambda$ and return to Step 2


$a_k, b_k, R_k$ denote the centre coordinates and radius at the k-th iteration. $r_i$ is the distance of the i-th data point from the centre, and $u_i$ and $v_i$ come from the partial derivatives of (6.2) with respect to a and b respectively, where

$$
r_i = \sqrt{(x_i - a)^2 + (y_i - b)^2}
\tag{6.3}
$$

$$
u_i = -\frac{\partial r_i}{\partial a} = \frac{x_i - a}{r_i}
\tag{6.4}
$$

$$
v_i = -\frac{\partial r_i}{\partial b} = \frac{y_i - b}{r_i}
\tag{6.5}
$$

$$
N = \begin{bmatrix}
\overline{uu} & \overline{uv} & \bar{u} \\
\overline{uv} & \overline{vv} & \bar{v} \\
\bar{u} & \bar{v} & 1
\end{bmatrix}
\tag{6.6}
$$

$$
J = \begin{bmatrix}
u_1 & v_1 & 1 \\
\vdots & \vdots & \vdots \\
u_n & v_n & 1
\end{bmatrix}
\tag{6.7}
$$

$$
J^T g = -n \begin{bmatrix}
R\bar{u} - \overline{ur} \\
R\bar{v} - \overline{vr} \\
R - \bar{r}
\end{bmatrix}
= -n \begin{bmatrix}
R\bar{u} + a - \bar{x} \\
R\bar{v} + b - \bar{y} \\
R - \bar{r}
\end{bmatrix}
\tag{6.8}
$$

$$
g = [r_1 - R, \ldots, r_n - R]^T
\tag{6.9}
$$

where a bar denotes a sample mean over the data points, e.g.

$$
\overline{uv} = \frac{1}{n}\sum_{i=1}^{n} u_i v_i
\tag{6.10}
$$

The performance of the LM method depends on its initial estimate. LM combines the fast convergence of the Newton-Raphson method with the stability of the gradient descent method. The initial estimate is found by the method explained in Algorithm 2.

Algorithm 2: Finding the initial estimate of the circle

1: For every combination of three points in the data set, do the following:
2: Check whether the three points chosen are collinear
3: If not collinear, calculate the circumcentre of the triangle formed by the three vertices
4: Store the centre coordinates and substitute them into the standard equation of the circle to get the radius
5: The x coordinate, y coordinate and radius are stored in three arrays
6: Repeat for all $\binom{n}{3}$ combinations, where n is the number of data points
7: The mean of all x coordinates, y coordinates and radii is returned
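A compact way to prototype this geometric fit is to let a library LM-style solver minimize equation 6.2 directly; the sketch below uses scipy and, for brevity, a simple centroid-based seed in place of Algorithm 2, with the test arc being assumed example data.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_circle(x, y):
    """Geometric circle fit: minimize sum of (r_i - R)^2 (equation 6.2)."""
    def residuals(p):
        a, b, R = p
        return np.hypot(x - a, y - b) - R

    # Crude seed: centroid of the points and mean distance to it.
    a0, b0 = x.mean(), y.mean()
    R0 = np.hypot(x - a0, y - b0).mean()
    result = least_squares(residuals, x0=[a0, b0, R0], method="lm")
    return result.x  # (a, b, R)

# Example: noisy points sampled from a known arc centred at (3, 1), radius 2.
t = np.linspace(0.2, 2.5, 41)
x = 3.0 + 2.0 * np.cos(t) + 0.01 * np.random.randn(t.size)
y = 1.0 + 2.0 * np.sin(t) + 0.01 * np.random.randn(t.size)
print(fit_circle(x, y))  # approximately (3, 1, 2)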


    6.2.2 Elliptical Regression

A revolute joint can produce an elliptical end-effector trace if the end-effector is moving in a plane that is not parallel to the camera plane. Elliptical regression analysis is used to fit an ellipse to such a trace. An ellipse can be described by the following general conic equation [61]:

$$
a m_1^2 + b m_1 m_2 + c m_2^2 + d m_1 + e m_2 + f = 0
\tag{6.11}
$$

where $\theta = (a, b, c, d, e, f)$ is such that $a^2 + b^2 + c^2 > 0$ and the discriminant $\Delta = b^2 - 4ac < 0$. The equation of the ellipse in matrix form can be written as

$$
\theta^T u(x) = 0 \quad \text{with the constraint} \quad \theta^T F \theta > 0
$$

where

$$
F = \begin{bmatrix}
0 & 0 & 2 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 \\
2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\tag{6.12}
$$

$$
u(x) = [m_1^2, m_1 m_2, m_2^2, m_1, m_2, 1]^T
\tag{6.13}
$$

$$
\theta = [a, b, c, d, e, f]^T
\tag{6.14}
$$

The value of $\theta$ that minimizes the cost function described in (6.15) represents the best-fit ellipse for the data points. An ellipse can thus be fit to the data using the ellipse fitting algorithm proposed by Zygmunt L. Szpak et al. in [61]. The algorithm also uses the LM method for optimizing the merit function, with a seed value obtained using the procedure described in [61]. This method is a combination of algebraic and geometric

    fit. The algorithm can be summarized is given i