
ANN training – the analysis of the selected procedures in Matlab environment

Jacek Bartman, Zbigniew Gomółka, Bogusław Twaróg

University of Rzeszow, Department of Computer Engineering, 35-310 Rzeszow, Pigonia 1, Poland

e-mails: {jbartman, zgomolka, btwarog}@univ.rzeszow.pl

Abstract. The article presents the development of artificial neural networks in the Matlab environment. It comprises a description of the information stored in the variable representing the neural network and an analysis of the functions used to train artificial neural networks.

Keywords: artificial intelligence, neural network, Matlab, ANN training

1 Introduction

Matlab is a programming environment dedicated primarily to calculations and computer simulations; of course, it can be applied in other fields as well. The main component of the environment is a command interpreter that lets you work in batch mode and in interactive mode, by issuing single commands in the command line. An integral, but optional, part of Matlab are libraries (so-called toolboxes) – sets of m-files dedicated to applications in a narrow specialty, e.g. NNet groups functions in the field of artificial neural networks, Fuzzy in the field of fuzzy sets. Some libraries need to be installed before others, as they use functions contained in the former.

Simplicity, intuitiveness and the graphical presentation of results make Matlab a very widely used tool. Extensive thematic libraries facilitate the development of programs, as happens, e.g., in the case of the NNet library, which is dedicated to artificial neural networks. By selecting parameters, a programmer has an impact on virtually every element of the proposed neural network: its architecture, the activation functions of neurons, the training method together with its parameters, the method of assessing the progress of training, and the training set together with its division into training, testing and validating subsets. This means that Matlab is very flexible, as users can customize it to their own needs.

An apparent disadvantage of the package is the great number of service functions, which makes it difficult to create universal and unified programs. While it is very easy to construct functions for a particular task in Matlab, it is very complicated to create functions that fully follow the philosophy of the package, use its full capabilities and behave the same as the original functions. The main difficulty here is the huge number of invoked functions with different parameters, whose names and configuration change between versions.

2 Matlab – creating multilayer feedforward neural networks

For creating artificial neural networks the package offers a few commands [2, 3]:
• newff – creates a multilayer feedforward neural network,
• newfftd – creates a multilayer feedforward neural network with a time delay vector,
• newp – creates a single layer network consisting of perceptrons,
• newlin – creates a single layer network consisting of linear neurons,
• newlind – designs a single layer network consisting of linear neurons.

Before creating the network it is necessary to define the matrices of:
• a training set of data: P = [0 0 1 1; 0 1 0 1];
• a set of expected data: T = [1 0 0 0];

With the data sets defined in this way we can create, at a later stage, a variable net (any other name can be used) representing the neural network. In this variable, which is formally a structure, all the information about the construction of the created network is stored. For constructing the network the newff.m function, creating a multilayer feedforward neural network, will be used. The syntax of this function is as follows:

net = newff(P,T,S,TF,BTF,BLF,PF,IPF,OPF,DDF)

where:
P – a set of training data;
T – a set of expected results;
Si – the number of neurons in particular hidden layers, hence the index "i";
TFi – the names of the activation functions for particular layers. The default activation function for the hidden layers is the hyperbolic tangent function (tansig), and the linear function (purelin) for the output layer;
BTF – the name of the network training method, the Levenberg-Marquardt algorithm (trainlm) by default;
BLF – the name of the function used for the modification of weights, learngdm by default;
PF – the goal function, the mean squared error (mse) by default;
IPF – a row cell array of the input processing functions, by default: fixunknowns, removeconstantrows, mapminmax;
OPF – a row cell array of the output processing functions, by default: removeconstantrows, mapminmax;
DDF – the function dividing the training data set into the proper training, validating and testing sets, dividerand.m by default;
net – the created artificial neural network.
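Putting the parameters above together, a minimal sketch of a typical call might look as follows (the layer size is our own example value; omitted trailing arguments take the defaults listed above):

```matlab
P = [0 0 1 1; 0 1 0 1];   % training inputs (two inputs, four vectors)
T = [1 0 0 0];            % expected answers
% one hidden layer with 4 neurons; all other arguments keep their defaults
net = newff(P, T, 4);
% the same network with the activation and training functions given explicitly
net = newff(P, T, 4, {'tansig','purelin'}, 'trainlm');
```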


2.1 The description of the neural network structure

A representative network variable net contains thorough information on the architecture of the created neural network. The values of the basic network parameters can be obtained by typing the variable name (e.g. net) directly in the command line:

net = Neural Network object:

architecture:
    numInputs: 1                – the number of network inputs
    numLayers: 2                – the number of network layers
    biasConnect: [1; 1]
    inputConnect: [1; 0]
    layerConnect: [0 0; 1 0]
    outputConnect: [0 1]
    numOutputs: 1 (read-only)
    numInputDelays: 0 (read-only)
    numLayerDelays: 0 (read-only)

subobject structures:
    inputs: {1x1 cell} of inputs
    layers: {2x1 cell} of layers
    outputs: {1x2 cell} containing 1 output
    biases: {2x1 cell} containing 2 biases
    inputWeights: {2x1 cell} containing 1 input weight
    layerWeights: {2x2 cell} containing 1 layer weight

functions:
    adaptFcn: 'trains'
    divideFcn: 'dividerand'
    gradientFcn: 'calcgrad'
    initFcn: 'initlay'
    performFcn: 'mse'
    plotFcns: {'plotperform','plottrainstate','plotregression'}
    trainFcn: 'traingd'

parameters:
    adaptParam: .passes
    divideParam: .trainRatio, .valRatio, .testRatio
        – defines the part of the data set used for:
          proper training – trainRatio (default 60%),
          validation – valRatio (default 20%),
          tests – testRatio (default 20%)
    gradientParam: (none)
    initParam: (none)
    performParam: (none)
    trainParam: .show, .showWindow, .showCommandLine, .epochs,
                .time, .goal, .max_fail, .lr, .min_grad
        – .show – the number of epochs between displays of the results,
          .showWindow – graphical presentation of training (nntraintool.m),
          .showCommandLine – generating command line output,
          .epochs – the maximum number of training epochs,
          .time – the maximum time of training the network,
          .goal – the goal function value,
          .max_fail – the maximum number of validation error increases,
          .lr – the training rate,
          .min_grad – the minimal change of the gradient

weight and bias values:
    IW: {2x1 cell} containing 1 input weight matrix
    LW: {2x2 cell} containing 1 layer weight matrix
    b: {2x1 cell} containing 2 bias vectors

other:
    name: ''
    userdata: (user information)

The values of particular parameters can be changed. To do so, one must assign a new value to the right field. For example, if we want to change the maximum number of training epochs to 1000, we need to issue the following command [1]:

net.trainParam.epochs=1000

Apart from the basic parameters, the hidden details of the network construction are saved in the net object structure as well. To obtain information about them, we need to issue the following command:

net.hint

Then a list of elements appears, mostly complex structures, from which we can learn, e.g., the size of the network input layer (net.hint.inputSizes), the size of the network output layer (net.hint.outputSizes), the transfer functions used in particular layers (net.hint.transferFcn), the indexation of synaptic weights in the input layer (net.hint.inputWeightInd{i}) and in further layers (net.hint.layerWeightInd{i,j}), the indexation of biases (net.hint.biasInd{i}), and the number of all the weight and bias values (net.hint.xLen).
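As a sketch of both mechanisms – overwriting basic parameters and reading the hidden ones – the following commands could be used (the exact set of net.hint fields depends on the toolbox version):

```matlab
net = newff([0 0 1 1; 0 1 0 1], [1 0 0 0], 4);
net.trainParam.epochs = 1000;      % maximum number of training epochs
net.trainParam.goal   = 1e-3;      % goal value of the error function
net.divideParam.trainRatio = 0.7;  % override the default 60/20/20 division
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
disp(net.hint.xLen)                % the number of all weights and biases
```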


3 Matlab – neural network training

The created neural network contains random values of weights and biases. Training can be performed using the train or adapt functions. The train function trains the neural network according to the selected training method indicated in the net.trainFcn field and the parameters included in the net.trainParam fields (the adapt function uses the analogous fields net.adaptFcn and net.adaptParam). The basic difference between the two functions is that the adapt function performs only one training epoch, while the train function trains until one of the stop conditions is met [4]:

• the error defined in the net.trainParam.goal field is achieved,
• the maximum number of training epochs, given in the net.trainParam.epochs field, is exceeded,
• the training time exceeds the value defined in the net.trainParam.time field,
• another condition, resulting from the specification of the method used for training, is met.

The syntax of both commands is equivalent:

[net, tr, Y, E] = train(net,P,T);

The input parameters are: the network to be trained (net), the matrix of input vectors (P) and the matrix of expected answers (T). The function returns the trained network (net), the record of the training process (tr), the values of the network answers (Y) and the training errors (E).
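A minimal sketch of a complete training run, combining the newff call from Section 2 with the train function (the parameter values here are example settings, not defaults):

```matlab
P = [0 0 1 1; 0 1 0 1];
T = [1 0 0 0];
net = newff(P, T, 4);
net.trainFcn = 'traingd';            % select classic gradient descent
net.trainParam.epochs = 500;         % stop condition: number of epochs
net.trainParam.goal   = 1e-3;        % stop condition: error value
[net, tr, Y, E] = train(net, P, T);  % train and collect the results
Y = sim(net, P);                     % answers of the trained network
```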

3.1 Theoretical basics of the selected training method (the classic error backpropagation method)

A basic training method of feedforward multilayer neural networks is the error backpropagation method, which defines the weight corrections using the gradient of the error:

\Delta w_{kj} = -\eta \frac{\partial E}{\partial w_{kj}}

where: E – the goal function (mean squared error); η – the training rate; w_{kj} – the value of the j-th weight of the k-th neuron.

When we assume that the correction of weights comes after presenting all training elements, the mean squared error, which constitutes the goal function, takes the form:

E = \frac{1}{2} \sum_{k=1}^{m} \left( d_k - y_k^{out} \right)^2

where: m – the number of neurons in the output layer; d_k – the value of the expected answer of the k-th neuron; y_k – the actual answer of the k-th neuron.

When we include the dependencies of the network architecture and the properties of the variables, we obtain formulas for the correction of the neuron weights. For the output layer:

\Delta w_{kj}^{out} = \eta \, (d_k - y_k) \frac{d f(u_k^{out})}{d u_k} \, y_j^{h}

And for the hidden layers:

\Delta w_{ji}^{h} = \eta \, \frac{d f(u_j^{h})}{d u_j} \left[ \sum_{k=1}^{m} (d_k - y_k) \frac{d f(u_k^{out})}{d u_k} \, w_{kj}^{out} \right] x_i^{in}

where: f – the transfer (activation) function of the neurons; d_k – the value of the expected answer of the k-th neuron; y_k – the actual answer of the k-th neuron.
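The update rules above can be sketched directly, outside the toolbox. The following is an illustrative batch update for a 2-3-1 network with tansig hidden neurons and a purelin output; all variable names here are our own, not toolbox fields:

```matlab
P = [0 0 1 1; 0 1 0 1];                        % inputs x_i
T = [1 0 0 0];                                 % expected answers d_k
eta = 0.1;                                     % training rate
W1 = rand(3,2) - 0.5; b1 = rand(3,1) - 0.5;    % hidden layer
W2 = rand(1,3) - 0.5; b2 = rand(1,1) - 0.5;    % output layer

N  = size(P,2);
u1 = W1*P + b1*ones(1,N);            % hidden layer net inputs
y1 = tansig(u1);                     % f = tansig, f'(u) = 1 - f(u)^2
y2 = W2*y1 + b2*ones(1,N);           % purelin output, f'(u) = 1

e  = T - y2;                         % (d_k - y_k)
d1 = (W2' * e) .* (1 - y1.^2);       % hidden deltas, before W2 is updated
W2 = W2 + eta * e * y1';             % output layer correction
b2 = b2 + eta * sum(e,2);
W1 = W1 + eta * d1 * P';             % hidden layer correction
b1 = b1 + eta * sum(d1,2);
```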

3.2 The analysis of the method implementation

The dependencies presented in Section 3.1 were implemented in Matlab in the traingd function. Matlab also contains implementations of other training methods, each in a separate, dedicated function. All training functions share the common prefix train*. In its basic version the error backpropagation method is quite slow, but its implementation contains all the components characteristic of training neural networks in Matlab.

The script begins with the function header line, which looks as follows:

function [net,tr] = traingd(net,tr,trainV,valV,testV,varargin)

where:
net – the object which describes the architecture of the neural network (initiated by train.m);
tr – the parameter which contains the description of the training process (initiated by the train.m function);
trainV – the training set (created by the train.m function);
valV – the validating set (created by the train.m function);
testV – the testing set (created by the train.m function);
varargin – the optional argument, which allows receiving a varying number of arguments.

Below the header line there is a description of the function; it appears after issuing the help traingd command. The comment includes information about the


formal parameters of the function, gives the default values of the network parameters determined during its creation and identifies the training algorithm.

The working part of the traingd function is divided into sections, each responsible for a particular task. The more extensive sections are divided into blocks. The names of sections are preceded by the "%%" symbol and the names of blocks by the comment symbol "%".

Below is a characterisation of the sections included in the traingd function.

Info section
The Info section contains basic information about the training method. The section content may be viewed by issuing the command:

traingd('info')

We will receive the answer:

ans =
    function: 'traingd'
    title: 'Gradient Descent Backpropagation'
    type: 'Training'
    version: 6
    training_mode: 'Supervised'
    gradient_mode: 'Gradient'
    uses_validation: 1
    param_defaults: [1x1 struct]
    training_states: [1x2 struct]

giving, among others, the file name ('traingd'), the method name ('Gradient Descent Backpropagation') and the training mode ('Supervised').

The block is also used to assign the default values of the network parameters to the info.param_defaults.* fields.

NNET 5.1 Backward Compatibility section
The next section, named NNET 5.1 Backward Compatibility, is responsible for compatibility with the previous versions of the functions. The Parameters block, which is included in the section, creates variables in which the training parameters of the network are stored:

% Parameters
epochs = net.trainParam.epochs;
goal = net.trainParam.goal;
lr = net.trainParam.lr;
max_fail = net.trainParam.max_fail;
min_grad = net.trainParam.min_grad;
show = net.trainParam.show;
time = net.trainParam.time;
gradientFcn = net.gradientFcn;

The defined variables improve the clarity of the function and reduce the size of its code. Another element of the NNET 5.1 Backward Compatibility section is the Parameter Checking block; it checks whether the values of the training parameters passed to the function (stored in the transferred network net) are acceptable and whether they make sense. Here is an example condition which checks whether the value of the variable describing the maximum number of training epochs is correct:

if (~isa(epochs,'double')) || (~isreal(epochs)) || ...
   (any(size(epochs)) ~= 1) || (epochs < 1) || ...
   (round(epochs) ~= epochs)
  error('NNET:Arguments','Epochs is not a positive integer.')
end

The instruction checks in turn whether the epochs variable:
• is not of the double-precision floating-point type: ~isa(epochs,'double'),
• does not have a real value: ~isreal(epochs),
• is not a scalar: any(size(epochs)) ~= 1,
• has a value lower than 1: epochs < 1,
• has a non-integer value: round(epochs) ~= epochs.

When any of these conditions is fulfilled, an error message is displayed and training is not performed. The other training parameters are tested similarly.

The last two blocks of the section are Initialize and Initialize Performance. The first one, Initialize, initiates five new variables:

% Initialize
Q = trainV.Q;
TS = trainV.TS;
val_fail = 0;
startTime = clock;
X = getx(net);

trainV is one of the parameters passed to the traingd function; it contains information about the data used for network training (the data on which the proper training is performed). The Q field indicates the number of training vectors, and the TS field informs about the number of time steps. The val_fail variable is used to count the number of consecutive validation failures during training, and the startTime variable saves the starting time of training the neural network. To initiate the startTime variable the built-in clock function is used; it returns a six-element vector that contains the current date and time in the form [year month day hour minute second]. The last of the initiated variables (X) is used to save the initial values of the weights and biases. They are obtained using the getx(net) function. Each weight and bias of the network has an assigned index that can be read from the hidden network parameters.
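As a sketch of how these helpers behave (getx is an internal NNet helper available when the toolbox is installed, and etime is the standard companion of clock for measuring elapsed time):

```matlab
startTime = clock;                  % [year month day hour minute second]
% ... the training loop would run here ...
elapsed = etime(clock, startTime);  % elapsed time in seconds, compared
                                    % against the net.trainParam.time limit
X = getx(net);                      % column vector of all weights and biases
length(X)                           % equals net.hint.xLen
```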

Page 8: the analysis of the selected procedu res in 2 Matlab …...Matlab environment. It comprises description of information that is stored in the variable representing the neural network

95

formal parameters of functions, gives default values of network parameters, determined during its creation and marks the training algorithm.

The working part of the traingd function was divided into sections where each is responsible for a particular task. The more extended sections were divided into blocks. The names of sections are preceded with the „%%” symbol and the names of blocks are preceded with the comment symbol „%”.

Below there is a characterisation of other sections that are included in the traingdfunction.

Info section
The Info section contains basic information about the training method. The section content may be viewed by issuing the command:

traingd('info')

We receive the answer:

ans =
           function: 'traingd'
              title: 'Gradient Descent Backpropagation'
               type: 'Training'
            version: 6
      training_mode: 'Supervised'
      gradient_mode: 'Gradient'
    uses_validation: 1
     param_defaults: [1x1 struct]
    training_states: [1x2 struct]

giving, among others, the file name ('traingd'), the method name ('Gradient Descent Backpropagation') and the training mode ('Supervised').

The block is also used to assign the default values of network parameters to the info.param_defaults.* fields.

NNET 5.1 Backward Compatibility section
The next section, NNET 5.1 Backward Compatibility, is responsible for compatibility with previous versions of the functions. The Parameters block, included in this section, creates the variables in which the training parameters of the network are stored:

% Parameters
epochs = net.trainParam.epochs;
goal = net.trainParam.goal;
lr = net.trainParam.lr;
max_fail = net.trainParam.max_fail;
min_grad = net.trainParam.min_grad;
show = net.trainParam.show;
time = net.trainParam.time;
gradientFcn = net.gradientFcn;

The defined variables improve the clarity of the function and reduce the size of its code. Another element of the NNET 5.1 Backward Compatibility section is the Parameter Checking block; it checks whether the training parameter values passed to the function (stored in the transferred net object) are acceptable and sensible. Here is an example condition which checks whether the value of the variable describing the maximum number of training epochs is correct:

if (~isa(epochs,'double')) || (~isreal(epochs)) || ...
   (any(size(epochs)) ~= 1) || (epochs < 1) || ...
   (round(epochs) ~= epochs)
  error('NNET:Arguments','Epochs is not a positive integer.')
end

The instruction checks, in turn, whether the epochs variable:
• is not of double-precision floating-point type: ~isa(epochs,'double');
• does not have a real value: ~isreal(epochs);
• is not a scalar: any(size(epochs)) ~= 1;
• has a value lower than 1: epochs < 1;
• has a fractional value: round(epochs) ~= epochs.

When any of these conditions is fulfilled, an error message is displayed and training is not performed. Similarly, other training parameters are tested.
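The same defensive style can be sketched outside Matlab. Below is a minimal Python analogue; check_epochs is a hypothetical helper (not part of NNet) implementing the positive-integer-scalar test from the condition above:

```python
def check_epochs(epochs):
    """Reject anything that is not a positive integer scalar,
    mirroring the traingd condition shown above."""
    if not isinstance(epochs, (int, float)) or isinstance(epochs, bool):
        raise ValueError('Epochs is not a positive integer.')
    if epochs < 1 or round(epochs) != epochs:
        raise ValueError('Epochs is not a positive integer.')
    return int(epochs)

check_epochs(1000)  # a valid value passes through unchanged
```

As in traingd, the check fails fast with a single error message rather than letting a malformed parameter corrupt the training loop later.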

The last two blocks of the section are Initialize and Initialize Performance. The first one, Initialize, initializes five new variables:

% Initialize
Q = trainV.Q;
TS = trainV.TS;
val_fail = 0;
startTime = clock;
X = getx(net);

trainV is one of the parameters passed to the traingd function; it contains information about the data used for network training (the data on which the actual training is performed). The Q field indicates the number of training vectors, and the TS field gives the number of time steps. The val_fail variable is used to count the number of erroneous training steps, and the startTime variable saves the starting time of the neural network training. To initialize the startTime variable the built-in clock function is used; it returns a six-element vector containing the current date and time in the form [year month day hour minute second]. The last of the initialized variables (X) is used to save the initial values of weights and biases. They are obtained with the getx(net) function. Each weight and bias of the network has an assigned index that can be read from hidden network parameters:

Page 9: the analysis of the selected procedu res in 2 Matlab …...Matlab environment. It comprises description of information that is stored in the variable representing the neural network

96

net.hint.inputWeightInd – indices of input synaptic weights;
net.hint.layerWeightInd – indices of layer synaptic weights;
net.hint.biasInd – indices of threshold (bias) values.

Current values of weights and biases can be viewed by entering:
• net.IW{i} – current values of input synaptic weights; the letter i in braces stands for the number of the layer whose current weight values the user wants to display;
• net.LW{i} – current values of layer synaptic weights;
• net.b{i} – current values of biases.

The total number of synaptic weights and biases is stored in the net.hint.xLen field of the net object. The following block is responsible for assigning the weights and biases to their particular indices:

x = zeros(net.hint.xLen,1);
for i=1:net.numLayers
  for j=find(inputLearn(i,:))
    x(inputWeightInd{i,j}) = net.IW{i,j}(:);
  end
  for j=find(layerLearn(i,:))
    x(layerWeightInd{i,j}) = net.LW{i,j}(:);
  end
  if biasLearn(i)
    x(biasInd{i}) = net.b{i};
  end
end

The block begins with a command creating a zero vector. The value of net.hint.xLen determines its dimension, i.e. how many rows the vector will contain. At a later stage the zeros are replaced by the actual values.

In the next line the for loop starts; the number of its iterations equals the number of layers of the neural network. It contains two for loops and one conditional instruction.

The for j=find(inputLearn(i,:)) loop is executed for every index at which the inputLearn field equals 1. Using the appropriate indexing, the values net.IW{i,j}(:) are assigned to the x vector. The second loop operates in an analogous manner.

The if biasLearn(i) conditional instruction is responsible for entering the threshold values at the appropriate indices.
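The overall effect of this block — gathering every trainable parameter into one flat vector — can be sketched in Python for a hypothetical toy network (all names and values here are illustrative; note also that Matlab's (:) flattens column-wise, whereas this sketch appends row-wise):

```python
def flatten_params(input_weights, layer_weights, biases):
    """Concatenate all trainable parameters into one flat vector,
    analogous to what the block above (and getx(net)) produce.
    Weight arguments are lists of matrices (lists of rows);
    biases is a list of vectors."""
    x = []
    for w in input_weights + layer_weights:
        for row in w:
            x.extend(row)
    for b in biases:
        x.extend(b)
    return x

# toy 2-layer network: one 2x2 input-weight matrix,
# one 1x2 layer-weight matrix, two bias vectors
iw = [[[0.1, 0.2], [0.3, 0.4]]]
lw = [[[0.5, 0.6]]]
b  = [[0.0, 0.0], [0.7]]
x = flatten_params(iw, lw, b)  # 9 values in a single vector
```

Keeping every parameter in one vector is what lets the later update step be written as a single vector addition.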

The second initializing block is the Initialize Performance block, which initializes the variables used to assess network performance. The calcperf2 function called in it sets the initial values of the goal function (perf), errors (El) and output values (trainV.Y).

[perf,El,trainV.Y,Ac,N,Zb,Zi,Zl] = ...
    calcperf2(net,X,trainV.Pd,trainV.Tl,trainV.Ai,Q,TS);

This call determines the initial values of the goal function (perf), errors (El) and output values (trainV.Y), using the following arguments:

• net – the already known net object; the function uses, among others, parameters such as the number of layers (net.numLayers) and the selected goal function, e.g. mse or sse (net.performFcn);
• X – the current values of synaptic weights and biases saved as a single vector, created with the getx(net) function;
• trainV.Pd – the matrix of delayed input signal samples;
• trainV.Tl – the set of expected (target) values;
• trainV.Ai – the matrix of delayed signal samples in the subsequent network layers;
• Q – the number of training vectors drawn by the dividerand.m function, on which the actual training (trainV) is performed;
• TS – the already mentioned number of time steps.
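Since net.performFcn selects the goal function, a short Python illustration of the default mse measure may help; this only restates the formula (mean of squared errors) and is not the calcperf2 implementation:

```python
def mse(targets, outputs):
    """Mean squared error: the mse goal function applied
    to one pass of targets and network outputs."""
    errors = [t - y for t, y in zip(targets, outputs)]
    perf = sum(e * e for e in errors) / len(errors)
    return perf, errors

perf, errors = mse([1.0, 0.0, 1.0], [0.8, 0.2, 0.6])
# perf is approximately (0.04 + 0.04 + 0.16) / 3, i.e. about 0.08
```

The returned pair corresponds to the (perf, El) values that calcperf2 initializes before training begins.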

Training Record section
The next section, Training Record, initializes the data fields of the tr variable.

%% Training Record
tr.best_epoch = 0;
tr.goal = goal;
tr.states = ...
   {'epoch','time','perf','vperf','tperf','gradient','val_fail'};

The tr.best_epoch field indicates the number of the epoch in which the network achieved its best training results; before training takes place it is epoch 0. The value of the goal function (net.trainParam.goal) is assigned to the tr.goal field, and the tr.states field stores the recorded statuses of network training.
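The role of tr can be sketched in Python; since the article does not show the internals of tr_update.m, the helpers make_tr and tr_update below are hypothetical illustrations of the record's shape only:

```python
def make_tr(goal):
    """Initialize a training record like the %% Training Record section."""
    states = ['epoch', 'time', 'perf', 'vperf', 'tperf',
              'gradient', 'val_fail']
    tr = {'best_epoch': 0, 'goal': goal, 'states': states}
    for s in states:
        tr[s] = []  # one history list per recorded state
    return tr

def tr_update(tr, values):
    """Append one row of state values, one entry per recorded state."""
    for name, v in zip(tr['states'], values):
        tr[name].append(v)
    return tr

tr = make_tr(goal=0.01)
tr = tr_update(tr, [0, 0.0, 0.25, 0.30, 0.28, 1.2, 0])
```

After each epoch one such row is appended, so plotting any state over time amounts to reading one list.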

Status section
The Status section is used to open a window that shows the training process (Fig. 1). The window is generated by the nntraintool.m function, which in turn is called by the nn_train_feedback.m private function started in the Status section. The call is preceded by the initialization of the status structure, used for the window description.

Fig. 1. The window presenting the training process of the neural network


Train section
The last section of the traingd.m function is the Train section; it is the section where the training of the neural network is realized. The section consists of a few blocks that are repeated iteratively. The iteration ends when the demanded number of training epochs, saved in the net.trainParam.epochs field, is reached, or when another criterion defined in the Stopping Criteria block is met.

The first block of the section is the Gradient block. In this block only one function, calcgx, is called; it computes the values of the gX vector elements and the value of the gradient. The gX vector is later used to correct the values of the weights and biases saved in the X vector.

% Gradient
[gX,gradient] = calcgx(net,X,trainV.Pd,Zb,Zi,Zl,N,Ac,El,perf,Q,TS);

The calcgx.m function requires the following arguments:
• net – the structure describing the trained neural network;
• X – the current values of synaptic weights and biases, saved as a single vector (created with the getx(net) function);
• trainV.Pd – the matrix of delayed input signal samples;
• Zb – biases;
• Zi – input weights;
• Zl – layer weights;
• N – network inputs;
• Ac – combined layer outputs;
• El – layer errors;
• perf – the value of the goal function;
• Q – the number of training vectors on which the actual training (trainV) is performed;
• TS – the number of time steps.

The second block of the Train section is the Stopping Criteria block, mentioned before. It groups all the conditions whose fulfilment stops the training process and exits the iteration:

% Stopping Criteria
current_time = etime(clock,startTime);
[userStop,userCancel] = nntraintool('check');
if userStop, tr.stop = 'User stop.'; net = best_net;
elseif userCancel, tr.stop = 'User cancel.'; net = original_net;
elseif (perf <= goal), tr.stop = 'Performance goal met.'; net = best_net;
elseif (epoch == epochs), tr.stop = 'Maximum epoch reached.'; net = best_net;
elseif (current_time >= time), tr.stop = 'Maximum time elapsed.'; net = best_net;
elseif (gradient <= min_grad), tr.stop = 'Minimum gradient reached.'; net = best_net;
elseif (doValidation) && (val_fail >= max_fail), tr.stop = 'Validation stop.'; net = best_net;
end

After the etime function determines the current training time and saves it in the current_time variable, the code checks whether the user has pressed the Stop Training or Cancel button. The block then checks whether any of the conditions for stopping the training has been met; they are tested in the following order:

• userStop – signals that the Stop Training button was pressed;
• userCancel – signals that the Cancel button was pressed;
• perf <= goal – fulfilment of the condition means that the error made by the network is smaller than the maximum acceptable error, i.e. the network has been trained;
• epoch == epochs – the maximum acceptable number of training epochs has been executed;
• current_time >= time – the training time has exceeded the acceptable value;
• gradient <= min_grad – the gradient is smaller than the acceptable minimum, which means that the network is no longer effectively learning;
• (doValidation) && (val_fail >= max_fail) – validation has been performed and the number of erroneous training steps (causing deterioration of the goal function value on the validation set) has exceeded the acceptable number.

If any of the conditions is met, a comment appropriate to the situation is assigned to the tr.stop field. If the tr.stop field is not empty (some comment has been written into it), the Stop block ends the execution of the function, and the message saved in tr.stop informs the user of the reason for stopping the training [5].
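The order of these tests can be expressed compactly in Python; check_stop is a hypothetical helper returning the tr.stop message, or an empty string when training should continue:

```python
def check_stop(perf, goal, epoch, epochs, current_time, time_limit,
               gradient, min_grad, do_validation, val_fail, max_fail,
               user_stop=False, user_cancel=False):
    """Return the stop-reason string, or '' to continue training.
    Mirrors the order of the Stopping Criteria block described above."""
    if user_stop:
        return 'User stop.'
    if user_cancel:
        return 'User cancel.'
    if perf <= goal:
        return 'Performance goal met.'
    if epoch == epochs:
        return 'Maximum epoch reached.'
    if current_time >= time_limit:
        return 'Maximum time elapsed.'
    if gradient <= min_grad:
        return 'Minimum gradient reached.'
    if do_validation and val_fail >= max_fail:
        return 'Validation stop.'
    return ''
```

Because the tests are ordered, a user request always wins over the numeric criteria, and the goal test wins over the epoch limit.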


In the next block, Training record, the fields of the tr variable are updated by calling the tr_update.m script. Before the update, a conditional instruction checks whether the logical value of the doTest variable is true. The doTest condition is true when a testing subset of the training set exists (that is, the testV.indices variable contains at least one index belonging to the testing subset).

% Training record
if doTest
  [tperf,ignore,testV.Y] = ...
     calcperf2(net,X,testV.Pd,testV.Tl,testV.Ai,testV.Q,testV.TS);
end
tr = ...
   tr_update(tr,[epoch current_time perf vperf tperf gradient val_fail]);

After the update of the tr fields, the parameters presenting the current number of training epochs, the gradient value, the value of the goal function and the training time are updated. These parameters are displayed in the nntraintool graphics window by calling the nn_train_feedback.m function with the 'update' argument.

% Feedback
nn_train_feedback('update',net,status,tr,{trainV valV testV},...
   [epoch,current_time,best_perf,gradient,val_fail]);

The Stop block, in turn, uses a conditional instruction to check whether the tr.stop field is empty. If it contains any value, the operation of the for loop is ended with the break command.

% Stop
if ~isempty(tr.stop), break, end

The next block of the Train section, Gradient Descent, is responsible for updating the weights and biases:

% Gradient Descent
dX = lr*gX;
X = X + dX;
net = setx(net,X);
[perf,El,trainV.Y,Ac,N,Zb,Zi,Zl] = ...
   calcperf2(net,X,trainV.Pd,trainV.Tl,trainV.Ai,Q,TS);

First, the correction of the weight vector is determined; it is obtained by multiplying the gX value (calculated by the calcgx.m function) by the learning rate lr (net.trainParam.lr). In the next step, the new weight values are set by adding the computed dX correction to the current X weight vector. The setx(net,X) function updates the records of the weights and biases in the net object. At the end of the block the calcperf2.m function calculates the new values of the errors, outputs and the objective function [6].
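The update rule of this block (dX = lr*gX; X = X + dX) can be sketched in Python on a hypothetical three-parameter vector; the comment that gX already points in the error-decreasing direction is an inference from the addition used in the Matlab code:

```python
def gradient_descent_step(x, gx, lr):
    """One traingd weight update: dX = lr*gX; X = X + dX.
    x and gx are plain lists standing in for the flattened weight
    vector and its search-direction vector."""
    return [xi + lr * gi for xi, gi in zip(x, gx)]

x = [0.5, -0.2, 0.1]
gx = [0.4, 0.2, -0.6]  # direction assumed already oriented so that
                       # adding it reduces the error
x = gradient_descent_step(x, gx, lr=0.05)
# x is approximately [0.52, -0.19, 0.07]
```

The learning rate lr scales how far each step moves along the computed direction, which is exactly the role of net.trainParam.lr in the listing above.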

The last block of the section is the Validation block, in which the performance on the validation set is calculated. It starts with a conditional instruction that checks whether the logical variable doValidation is true; the situation is analogous to that of the conditional instruction checking the logical value of the doTest variable.

4 Conclusions

Matlab is a widely valued calculation and simulation environment. Its capabilities may be extended by creating one's own scripts and functions that use the ready-made libraries. However, anyone who wants to use the full power of Matlab needs to explore it thoroughly. This article has presented an analysis of selected representative functions used to train artificial neural networks. The analysis allows some general conclusions to be drawn:

• the variable that describes the neural network (usually called net) is a structure, but its particular fields may hold simple values or may themselves be structures;

• the variable that describes the neural network (net) contains all the information concerning the composition and training of the neural network; some parameters are hidden;

• the training functions are divided into sections that are responsible for the realization of specific tasks;

• during the training process many highly technical helper functions are called;
• the parameters passed to a function very often receive new names and a new form in the function body.

5 Bibliography

1. Bartman J.: Reguła PID uczenia sztucznych neuronów. Metody Informatyki Stosowanej 3/2009, pp. 5-19.

2. Beale M., Hagan M., Demuth H.: Neural Network Toolbox™ User's Guide. MathWorks, 1992-2014.

3. MATLAB Programming Techniques. MathWorks, 2010.

4. Werbos P.: The Roots of Backpropagation. Wiley, New York 1994.

5. Gomółka Z., Twaróg B., Bartman J.: Improvement of Image Processing by Using Homogeneous Neural Networks with Fractional Derivatives Theorem. Dynamical Systems, Differential Equations and Applications, Vol. 1 Supplement, 2011, pp. 505-514.

6. Gomółka Z., Twaróg B.: Artificial intelligence methods for image processing. The Symbiosis of Engineering and Computer Science, Rzeszow 2010, ISBN 978-83-7338-620-4, pp. 93-124.
