Developing Synthetic Logs Using Artificial Neural Network:
Application to Knox County in Kentucky
Fedra Ghavami
Problem Report submitted to the
College of Engineering and Mineral Resources
at West Virginia University
in partial fulfillment of the requirements
for the degree of
Master of Science
in
Petroleum & Natural Gas Engineering
Committee:
Professor Samuel Ameri, Chair
Dr. Khashayar Aminian
Dr. Razi Gaskari
Department of Petroleum and Natural Gas Engineering
Morgantown, West Virginia
2011
Keywords: Reservoir; Well Logs; West Virginia; Artificial Neural Network
ABSTRACT
Developing Synthetic Logs Using Artificial Neural Network: Application to Knox County in Kentucky
Fedra Ghavami
The purpose of this study was to fill gaps in well log data and to support the prediction of oil reservoir production. Historically, log data contain gaps or missing points due to limitations of the tools and methodologies used to collect the data. To estimate values across these gaps, a back-propagation neural network was used, and the estimated data were validated against actual log data from core samples.
A methodology to generate synthetic wireline logs is presented. Synthetic logs help in analyzing reservoir properties in areas where the necessary set of logs is absent or incomplete. The approach uses artificial neural networks as the main tool, in conjunction with data obtained from conventional wireline logs. Implementing this approach aims to reduce operating costs for oil and gas companies.
The neural network model was developed using back propagation and data from five oil wells, comprising the following logs:
1. Gamma ray,
2. Density,
3. Neutron,
4. Caliper.
Synthetic logs were generated through two exercises. In Exercise 1, all five wells were combined for training, calibration, and verification of the network. In Exercise 2, four wells were used for development and training of the network, and a fifth well that was not applied during calibration and training was selected for verification. Four mixtures of inputs and outputs were selected to train the network. Mixture "A", for example, consists of the density, gamma ray, and caliper logs as the inputs and the neutron log as the output.
In conclusion, it is demonstrated that data quality plays an important role in the implementation of the neural network model, and careful quality control of the data before building the model is essential. Conversely, it is concluded that lithologic heterogeneities in the reservoir do not degrade the performance of the neural network model in generating synthetic logs.
Acknowledgments
First, I thank God for giving me the capability and the courage to finish my thesis and complete
my MS in Petroleum and Natural Gas Engineering.
I would like to dedicate this thesis to my parents, who always gave me their support and
encouragement, and most importantly, their love. It kept me going during some of the difficult
moments of this work. I appreciate you and I love you.
I don’t have the words to express my thanks and appreciation to Professor Sam Ameri, chair of
my committee, who brought me to the Petroleum and Natural Gas Engineering Department and
encouraged me to complete my master's degree. I wouldn't have been able to accomplish this without his
support and advice, not only during this work but also throughout the time I spent at the
department.
I want to extend my sincere appreciation and gratitude to my research advisor Dr. Razi Gaskari,
for introducing me to the fascinating area of Neural Networks, for his friendship, and for his
continuous guidance, encouragement, support and patience throughout this work.
Special thanks to Dr. Khashayar Aminian for being on my committee and for the enriching
contributions and comments to this work.
Finally, my deepest gratitude to Mrs. Beverly Matheny for her assistance and enthusiasm during
every semester spent in the department.
TABLE OF CONTENTS
1. INTRODUCTION
2. BACKGROUND
2.1. Geological Setting
2.1.1. Description of Knox County in Kentucky
2.2. Well logs fundamentals
2.2.1. Gamma Ray
2.2.2. Caliper log
2.2.3. Density-Porosity logs
2.2.4. Neutron Log
2.3. Artificial neural network
2.3.1. Characterization of Neural Network
2.3.2. Biological Neural Network
2.3.3. Typical Architectures
2.3.4. Single-Layer Net
2.3.5. Multilayer Net
2.3.6. Setting the Weights
2.3.7. Supervised Training
2.3.8. Unsupervised Training
2.3.9. Fixed-weight Nets
2.3.10. Common Activation Functions
3. LITERATURE REVIEW
4. METHODOLOGY
4.1. Neural network model development
4.2. Data Preparation
4.2.1. Exercise 1: Filling the Gaps (Five Wells Combined)
4.2.2. Exercise 2: Four Wells Combined, One Well Out
5. RESULTS
5.1. Exercise 1 Results
5.2. Exercise 2 Results
5.2.1. R-Squared Results for Training, Calibration, and Verification
6. CONCLUSION
7. REFERENCES
LIST OF FIGURES
Figure 2.1.1. Cambrian and Deeper Tests of Kentucky, 1999.
Figure 2.1.2. Location of the Kentucky River and Irvine-Paint Creek Fault Systems.
Figure 2.1.3. The Knox Group, a thick sequence of dolomite of Cambrian and Ordovician age that underlies the entire state of Kentucky.
Figure 2.1.4. Stratigraphic correlation chart for Cambrian rocks in the Rome Trough study area. Modified from Harris and Baranoski (1996).
Figure 2.1.5. Stratigraphic model for the Conasauga Group in the outcrop belt in eastern Tennessee.
Figure 2.1.6. Middle Cambrian paleogeography.
Figure 2.2.1. Gamma ray and sonic logs from the Alberta basin, and their response to different lithologies (adapted from Cant, 1992).
Figure 2.2.2. Gamma ray emission spectra of K-40, uranium, and thorium series (adapted from Bassiouni, 1994).
Figure 2.3.1. Biological neuron.
Figure 2.3.2. A very simple neural network.
Figure 2.3.3. A single-layer neural net.
Figure 3.2.10. Identity function.
Figure 3.2.11. Binary step function.
Figure 3.2.12. Binary sigmoid, range (0, 1).
Figure 3.2.13. Bipolar sigmoid.
Figure 3.2.14. Backpropagation neural network with one hidden layer.
Figure 4.2.2. Density-porosity log for five verified wells.
Figure 4.2.3. Caliper log for five verified wells.
Figure 4.2.4. Neutron log for five verified wells.
Figure 4.2.6. Logs of Patterson well.
Figure 4.2.7. Logs of Gibson well.
Figure 4.2.8. Logs of Partin well.
Figure 4.2.9. Logs of Gambrie well.
Figure 4.2.10. Distribution of wells used for the training/testing and production datasets in Exercise 1.
Figure 4.2.11. Distribution of wells used for the training/testing and production datasets in Exercise 2.
Figure 5.1.1. R-squared results for five wells as a function of neutron vs. depth.
Figure 5.1.2. Training R-squared in the application graph.
Figure 5.1.3. Calibration R-squared in the application graph.
Figure 5.1.4. Verification R-squared in the application graph.
Figure 5.2.1. Comparison of actual and virtual results for Well Hilton.
Figure 5.2.2. Comparison of actual and virtual results for Well Patterson.
Figure 5.2.3. Comparison of actual and virtual results for Well Partin.
Figure 5.2.4. Comparison of actual and virtual results for Well Gibson.
Figure 5.2.5. Comparison of actual and virtual results for Well Gambrie.
LIST OF TABLES
Table 2.2.1. Matrix density values for different rock types.
Table 4.2.1. Segment of the data matrix prepared for well HELTON.
Table 5.2.1. R-squared results for training, calibration, and verification of the five wells.
1. INTRODUCTION
Prediction of hydrocarbon production from geological formations using computer modeling
techniques has become very popular and widely accepted in the petroleum industry. One such
analysis tool, the artificial neural network (ANN), imitates the thought processes of
the human brain. Neural networks can be a useful tool for predicting oil, gas, or water in formations
using data from well logs and core samples. Traditional models have been used for many years
to make predictions from well logs, which are measurements of formation properties as a
function of depth. Well logging is performed with devices lowered into a well to measure
properties of the formations via electrical, nuclear or acoustic methods. Well logging is primarily
used after drilling a well or during the drilling of a well to determine the production potential in
hydrocarbon reservoirs. This data may be used to determine the feasibility of drilling additional
wells in the area, to select the final depth of the well being drilled, etc.
Moreover, the application of neural networks in the oil and gas industry has increased rapidly in
recent years. Neural network results can be interpreted in terms of models that can be compared
with real or statistical well log data.
2. BACKGROUND
2.1. Geological Setting
2.1.1. Description of Knox County in Kentucky
The Rome Trough is a basement structural feature filled with Precambrian and Paleozoic
sequences. The mostly dolomitic carbonate sequence that spans the Lower Ordovician and Upper
Cambrian periods in Kentucky is the Knox Group. The Lower Ordovician Beekmantown
Dolomite is the upper part of the Knox Group and the Upper Cambrian Copper Ridge Dolomite
is the lower part of the Knox Group (Figures 2.1.1 and 2.1.2).
Figure 2.1.1. Cambrian and Deeper Tests of Kentucky, 1999.
Due to a pre-Conasauga unconformity, the upper Rome carbonate and lower Conasauga units
are absent on the shelf between these faults. Deeper in the trough, a full Rome section is
present. The upper Rome carbonate grades laterally into shale in an interpreted intrashelf
basin in south-central Kentucky (SW).
Figure 2.1.2. Location of the Kentucky River and Irvine-Paint Creek Fault Systems.
Exploration Recommendations
Several recommendations for continued exploration in the Rome Trough area can be made based
on the results of this work.
Reservoir Trends
Sandstones in the Rome Formation and the Maryville Limestone of the Conasauga Group commonly show 6-10% porosity. In areas with a high percentage of sandstone, there is a risk that porosity development does not appear. The sandstone-percentage maps produced in this study show that two areas have the best probability of encountering porous sandstones. For the Rome Formation, the highest sandstone percentages were mapped on the structural shelf between the Kentucky River Fault System and the Irvine-Paint Creek Fault System. Sandstone content increases toward the north against the Kentucky River Fault. This trend is prospective, but it narrows east of the Isonville Fault into Carter and Boyd Counties.
A north-south sandstone trend in the Maryville interval has been mapped from the Rome Trough north into Ohio, and in the Homer Field the sandstones are porous. An area in the central part of the Irvine-Paint Creek shelf contains the highest percentage of sandstone in both the Rome and the Maryville, which increases the chances of multiple layers of good-quality sandstone in the area.
This study has greatly refined the stratigraphic framework of the Rome Trough. It has also identified areas with low sandstone potential, namely the deeper intrashelf basins and the deeper parts of the trough, as well as areas with high sandstone potential.
Because neither porous carbonates nor dolomitized zones were observed, the Cambrian carbonates are assumed to have low potential for reservoir development; hydrothermal dolomites similar to those in Ordovician carbonates were not observed in this study. Fractured carbonates are considered high risk but could have potential for reservoir development. Fractured Nolichucky shales, also considered high risk, produce in a well in Johnson County, Kentucky, in this type of reservoir. The most attractive reservoir target in the trough, however, is the abundance of porous sandstones at reasonable drilling depths on the Irvine-Paint Creek shelf.
Figure 2.1.3. The Knox Group is composed of a thick sequence of dolomite of Cambrian and Ordovician age that underlies the entire state of Kentucky.
Regional Stratigraphy and Depositional History
Figure 2.1.4 shows a modified version of the stratigraphic correlation chart by Harris and Baranoski (1996) for the central Appalachian Basin. Key changes to this interpretation follow:
1. The Conasauga Group and its member formations, as defined in the outcrop belt in eastern Tennessee, are correlated into eastern Kentucky. Much of the interval now interpreted as Conasauga was previously called Rome in Kentucky.
2. The Rome Formation is restricted to the Rome Trough and the area south of the Kentucky River Fault Zone, and does not extend farther north in Kentucky.
3. The Pumpkin Valley Shale, Rutledge Limestone, and Rogersville Shale, the oldest three formations in the Conasauga Group, are confined to the deeper parts of the Rome Trough in eastern Kentucky. These three units are absent on the shallower Irvine-Paint Creek shelf, where the Maryville Limestone unconformably overlies the Rome Formation; this unconformity is named the pre-Conasauga unconformity.
4. Figure 2.1.4 retains the Mt. Simon Sandstone as the time equivalent of the lower Maryville Limestone to the east and south. The original position of the Middle-Upper Cambrian boundary from Harris and Baranoski (1996) is shown in Figure 2.1.4; this boundary may be revised upward to the top of the Maryville Limestone in Ohio and Kentucky.
Figure 2.1.4. Stratigraphic correlation chart for Cambrian rocks in the Rome Trough study area. Modified from Harris and Baranoski (1996).
Three major transgressive/regressive cycles are present, each composed of a lower shale formation and an upper carbonate formation. The minor cycle that includes the Craig Limestone has not been observed in the Rome Trough area and is shown in Figure 2.1.5 (Rankey, 1994).
Figure 2.1.5. Stratigraphic model for the Conasauga Group in the outcrop belt in eastern Tennessee.
The Middle Cambrian paleogeography of northeast Tennessee and southwestern Virginia is shown in Figure 2.1.6. A Conasauga carbonate platform existed in the Middle Cambrian, characterized by cyclic progradation cratonward into Kentucky. As shown here, the location of the intrashelf basin has been reinterpreted to the west, in south-central Kentucky, and the shale-carbonate cycles have been correlated into the Rome Trough.
Figure 2.1.6. Middle Cambrian Paleogeography.
2.2. Well logs fundamentals
2.2.1. Gamma Ray
A gamma-ray log measures the natural gamma-ray emission of the various layers penetrated by the wellbore. This radioactivity is related to the formations' content of the radiogenic isotopes of potassium, uranium, and thorium, as shown in Figure 2.2.1.
Some tools can detect gamma-ray energies from less than 0.5 to more than 2.5 MeV.
Figure 2.2.2 shows the distinctive emission spectra of thorium, uranium, and potassium.
Thorium, uranium, and especially potassium are common in clays, evaporites, and some other minerals. In terrigenous clastic successions, high radioactivity on the API scale, averaged over a depth interval, is interpreted as the "shaliness" of the rock; conversely, "cleanness" indicates a lack of clays in the formation (Figures 2.2.1 and 2.2.3). Because of this characteristic, gamma-ray log patterns mimic vertical carbonate-content or sand-content trends.
It should be emphasized that the gamma-ray reading reflects the proportion of radioactive elements, and is not a function of grain size or carbonate content, although the readings can also be related to the proportion of shale. For example, a lime mudstone gives the same response as a grainstone, and clay-free sandstones or conglomerates with mixed sand- and pebble-sized clasts give similar responses on average.
There is a relationship between the concentration of radioactive elements and the degree of compaction in shale. Gamma rays emitted by the sedimentary formations penetrated by the wellbore are detected by the gamma ray (GR) tool. As these gamma rays pass through a formation, they undergo Compton-scattering collisions with formation atoms (Cant, 1992), causing them to lose energy (Bassiouni, 1994). The formation atoms also absorb gamma-ray energy through the photoelectric effect, and the amount of absorption is a function of the formation density. Consequently, two formations with the same amount of radioactive material per unit volume but different densities show different radioactivity levels on the GR log; the lower-density formation typically appears more radioactive. Because the GR log can be run in cased boreholes, it is useful in completion and workover operations. In some cases, for example, the SP resolution is poor in either cased or open holes; in such situations the GR log can be run as a substitute for the SP log, and, run together with a casing collar locator log, it allows accurate positioning of perforating guns. Gamma ray logs are also useful for locating source beds and for interpreting depositional environments (Ameri, 2009).
Figure 2.2.1. Gamma ray and sonic logs from the Alberta basin, and their response to different lithologies (adapted from Cant, 1992).
Figure 2.2.2. Gamma ray emission spectra of K-40, uranium, and thorium series (adapted from Bassiouni, 1994).
2.2.2. Caliper log
The caliper log is a set of measurements of the size and shape of a wellbore made during the drilling of oil or gas wells; it records deviations in borehole diameter. The tool is built with two articulated arms that press against the borehole wall and follow its profile. The arms register their movements as variations in electrical resistance, which, after calibration, are translated into changes in diameter. The caliper log is thus a continuous record of borehole diameter as a function of depth.
2.2.3. Density-Porosity logs
The formation density log measures the electron density of a formation and also acts as a porosity log, measuring the porosity of the formation. It may assist the geologist in detecting gas-bearing zones, determining hydrocarbon density, and evaluating shaly sand reservoirs and complex lithologies.
The density logging device, such as the gradiomanometer or a nuclear fluid densimeter, is a contact tool that carries a medium-energy gamma ray source. The tool emits gamma rays into the formation; the amount scattered back to the detector depends on the electron density of the rock, which is related to the density of the solid material. The measured density is determined by the matrix and the percentage of pore fluids, or by holdup, the record of the fractions of the different fluids present at different depths in the wellbore.
Formation bulk density (ρb) is a function of porosity, the density of the fluid in the pores (such as salt mud, fresh mud, or hydrocarbons), and the matrix density. To determine porosity, the density of the fluid in the wellbore and/or the density of the matrix (ρma) must be known.
Table 2.2.1 lists common values of ρma.
Rock Type Matrix density (g/cm3)
Sand or Sandstone 2.65
Limestone 2.71
Dolomite 2.87
Anhydrite 2.98
Table 2.2.1. Matrix density values for different rock types.
Depending on temperature, pressure, and salinity, the density of formation water varies from 0.95 g/cc to 1.10 g/cc. The density of oil varies over an equally wide range but is lower than these values. According to Bassiouni (1994), the tool's depth of investigation is shallow, so it mainly investigates the invaded zone, where ρf is expressed by:
ρf = Sxo ρmf + (1 − Sxo) ρh
where Sxo is the mud-filtrate saturation in the invaded zone, ρmf is the mud-filtrate density, and ρh is the invaded-zone hydrocarbon density.
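As a concrete illustration of these relations, the short Python sketch below computes ρf from the invaded-zone expression above and then a density porosity from ρb, ρma, and ρf. The standard density-porosity relation φ = (ρma − ρb) / (ρma − ρf) is assumed here (it is not stated explicitly in the text), and all numeric inputs are illustrative only.

def invaded_zone_fluid_density(s_xo, rho_mf, rho_h):
    """Bassiouni (1994): rho_f = Sxo * rho_mf + (1 - Sxo) * rho_h."""
    return s_xo * rho_mf + (1.0 - s_xo) * rho_h

def density_porosity(rho_b, rho_ma, rho_f):
    """Assumed standard relation: phi = (rho_ma - rho_b) / (rho_ma - rho_f)."""
    return (rho_ma - rho_b) / (rho_ma - rho_f)

# Illustrative numbers: limestone matrix (2.71 g/cc, Table 2.2.1), mud filtrate
# near 1.0 g/cc, light hydrocarbon at 0.8 g/cc, 90% filtrate saturation.
rho_f = invaded_zone_fluid_density(s_xo=0.9, rho_mf=1.0, rho_h=0.8)
phi = density_porosity(rho_b=2.54, rho_ma=2.71, rho_f=rho_f)
print(f"rho_f = {rho_f:.2f} g/cc, density porosity = {phi:.3f}")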
2.2.4. Neutron Log
The neutron has a high penetrating potential due to its lack of electric charge, a property that plays an important role in well logging applications. The measured concentration of epithermal neutrons indicates the concentration of hydrogen in the material, and hydrogen is abundant in water-bearing formations. In shale-free, water-bearing formations, the hydrogen concentration reflects the lithology and porosity.
The neutron log thus measures the hydrogen index, or hydrogen concentration, in the rock. The tool emits neutrons and measures the energy of the neutrons returned from the rock. Because a neutron loses energy most readily in collisions with particles of similar mass, such as hydrogen nuclei, the hydrogen concentration can be delineated.
2.3. Artificial neural network
An artificial neural network is an information-processing system that handles data in a way inspired by mathematical models of human neural biology, and it shares several assumptions with models of human cognition:
• Neurons are the simplest elements for processing information.
• Signals are passed over connections between neurons.
• Every connection link has an allocated weight, which scales the signal it transmits.
• Every neuron applies a function, usually nonlinear, to the sum of its weighted input signals to determine its output (see the sketch after this list).
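A minimal Python sketch of these four assumptions follows; the function name and the numbers are illustrative only, not part of any particular model in this study.

import math

def neuron_output(inputs, weights, bias):
    """Sum the weighted input signals, then apply a nonlinear activation."""
    net = bias + sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-net))  # logistic (binary sigmoid) activation

# Three input signals arriving over three weighted connection links
print(neuron_output([0.5, -1.2, 0.3], [0.4, 0.1, -0.7], bias=0.2))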
2.3.1. Characterization of Neural Network
A neural network is characterized by the arrangement of the connections between its neurons, which is called its architecture; by the method used to determine the weights on the connections, which is called its training or learning algorithm; and by its activation function. What distinguishes artificial neural networks from other information-processing systems is how and when they process information.
Neurons are connected to one another by directed communication links, each of which carries a weight. A neural net consists of many simple processing elements, called nodes, cells, units, or neurons. The weights represent the information the network uses to solve a problem, and a wide array of problems can be addressed with neural networks.
Each neuron has an internal state called its activation or activity level, which is a function of the inputs it has received. A neuron sends its activation as a signal to other neurons; it can send only one signal at a time, although that signal may be broadcast to several other neurons.
2.3.2. Biological Neural Network
For some, a primary concern about neural network models is the extent to which they differ from biological neural systems; for others, these differences are outweighed by the ability of the net to perform useful tasks. There is a close analogy between the structure of a biological neuron and an artificial processing element. Indeed, individual neurons from different species are much more similar to one another than are the larger neural networks they form.
A neuron is composed of a soma, or cell body, and two different types of branches: the dendrites and the axon. The cell body contains a nucleus, which holds the cell's information, and plasma, which holds the molecular equipment for producing what the neuron needs. The dendrites act as receivers of signals, while the cell body generates signals that are transmitted along the axon (the transmitter), which eventually separates into strands and substrands.
At the ends of the strands are the synapses. A synapse is a connection between a dendrite of one neuron and an axon of another. When a signal arrives at a synapse, neurotransmitters are released; these chemicals cross the synaptic gap and, depending on the kind of synapse, enhance or inhibit the receiving neuron's tendency to emit impulses. A synapse's effectiveness can be trained by the repetition of these signals. This dependence on history acts as a memory, which is possibly responsible for human memory.
In humans, the cerebral cortex is a thin layer covering the surface of each cerebral hemisphere and contains a vast number of interconnected neurons.
Neurons exchange information through series of very short pulses, usually with durations measured in milliseconds. The message is modulated by the pulse-transmission frequency, which is around a million times slower than the switching speed of electronic circuits. Nevertheless, humans are more efficient than computers at complex perceptual decisions. A good example is face recognition, which humans perform within a few hundred milliseconds, even though the operational speed of individual neurons is only a few milliseconds (Figure 2.3.1).
Figure 2.3.1. Biological neuron.
Artificial neural networks use processing elements whose features are modeled on biological neurons:
1. The processing element receives many signals, which must be processed.
2. Signals may be modified by a weight at the receiving synapse before being passed on.
3. The processing element sums the weighted inputs.
4. Given sufficient input, the neuron transmits a single output.
5. The output from a particular neuron may branch, like an axon, to reach many other neurons.
6. Memory is distributed, in two forms:
• Long-term memory resides in the weights (synapses).
• Short-term memory corresponds to the signals sent by the neurons.
7. The strength of a synapse is not fixed; it can be modified by experience.
8. Transmission at a synapse may be excitatory or inhibitory.
Artificial neural networks share with biological neural systems an ability called fault tolerance, which has two aspects:
1. They can recognize input signals that differ slightly from signals encountered before. An example is the human ability to recognize a person from a picture seen before, or to recognize a person after a long absence.
2. They can tolerate damage. Humans continue to learn despite the continuing loss of neurons, and in some cases, when neural tissue is destroyed, other neurons can be retrained to take over the functions of the destroyed cells.
Even for uses of artificial neural networks that are not intended primarily to model biological neural systems, attempts to achieve biological plausibility may lead to improved computational features.
2.3.3. Typical Architectures
Arranging neurons in layers gives a useful perspective on their behavior. Neurons with the same behavior are usually grouped in the same layer; a neuron's behavior is determined by its activation function and by the pattern of weighted connections over which it sends and receives signals. For simplicity, in many neural networks the neurons within a layer are either fully interconnected or not interconnected at all. A hidden layer is a layer of units between the input neurons and the output neurons.
The net architecture is the arrangement of neurons into layers together with the pattern of connections between the layers. In the input layer, the activation of each unit is simply the external input signal it receives. Figure 2.3.2 shows a very simple network consisting of input units, one hidden unit, and one output unit.
Figure 2.3.2. A very simple neural network.
Neural networks are usually categorized as single-layer or multilayer. The number of layers is defined by the number of slabs of neurons separated by weighted interconnection links; the input units are not counted as a layer by themselves. The weights that exist in a network contain very important information. Figure 2.3.3 shows a net with a single layer of weights.
Figure 2.3.3. A single-layer neural net.
As illustrated in Figures 2.3.2 and 2.3.3, both single-layer and multilayer nets can be feedforward nets, in which the signals flow from the input units to the output units in a forward direction.
2.3.4. Single-Layer Net
A net with just one layer of connection weights is called a single-layer net. It is distinguished by input units, which receive signals from outside the system, and output units, from which the response of the net can be read. In a single-layer net the input units are fully connected to the output units, but no two units within the same layer are connected to each other; Figure 2.3.3 shows a typical single-layer net.
A single-layer net can address two different kinds of problem, which use the net's response differently. In pattern classification, each output unit corresponds to a particular category, to which a given input vector may or may not belong, and each output unit signals whether the input belongs to its category; the weights feeding one output unit are fully separate from those feeding another (each weight is dedicated to one output unit and does not influence any other). In pattern association, the output units send the response pattern associated with the input signal that caused it to be produced.
2.3.5. Multilayer Net
A net with one or more levels (or layers) of nodes between the input units and the output units, known as hidden units, is called a multilayer net. Typically there is a layer of weights between each two adjacent levels of units, so the organization of a multilayer net is input, hidden, and output levels of units. A multilayer net is much more complicated than a single-layer net, and training it may be more difficult. In some cases, however, training is more likely to be successful, because a multilayer net can be trained to solve problems that a single-layer net cannot be trained to perform correctly at all.
2.3.6. Setting the Weights
In addition to architecture, an important characteristic that distinguishes neural nets is the method of setting the values of the weights, i.e., training. This distinction determines the kinds of task a neural net can be trained to perform, which fall into the areas of clustering, mapping, and constrained optimization. The common problems of pattern association and pattern classification may be considered special forms of mapping input vectors onto specified output vectors.
Training methods are usually labeled supervised or unsupervised; some researchers regard self-supervised training as a useful third category. The point of distinguishing training methods is that there is a useful relationship between the type of problem and the type of training suitable for that problem.
2.3.7. Supervised Training
Supervised training is accomplished by presenting a sequence of training patterns, each with an associated target output vector; the weights are adjusted according to a learning algorithm.
Pattern classification was one of the functions of the simplest, and historically earliest, neural nets: classifying an input vector as either belonging or not belonging to a given category. In this type of net the output is a bivalent element indicating whether the input vector belongs to the category or does not. Such nets are trained with a supervised algorithm.
In the mapping problem, which is another special form of pattern association, the output is not a "yes" or "no" but a pattern. A neural network trained to associate a set of input patterns with a corresponding set of output patterns is called an associative memory. If the desired output pattern is the same as the input pattern, the memory is autoassociative; if the desired output pattern differs from the input pattern, the memory is heteroassociative. After training, an associative memory can recall a stored pattern when it is given an input vector sufficiently similar to a vector it has learned.
2.3.8. Unsupervised Training
Unsupervised training is used for self-organizing neural nets, which group similar input vectors together without training data that specify what a typical member of each group looks like or to which group each vector belongs. A sequence of input vectors is provided, but no target vectors. The network modifies its weights so that the most similar input vectors are assigned to the same cluster (output) unit, and it produces a representative vector for each cluster formed, as sketched below.
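The sketch below is a minimal winner-take-all rendering of this idea (a simple form of competitive learning, offered as an illustration rather than the specific method of any net discussed here): the weight vector of the closest cluster unit is moved toward each input, so that each unit's weights become a representative vector for its cluster.

import numpy as np

rng = np.random.default_rng(1)

def competitive_clustering(vectors, n_clusters=2, alpha=0.3, epochs=20):
    """Winner-take-all training: move the winning unit's weights toward the input."""
    weights = rng.uniform(0.0, 1.0, (n_clusters, vectors.shape[1]))
    for _ in range(epochs):
        for x in vectors:
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            weights[winner] += alpha * (x - weights[winner])
    return weights  # one representative vector per cluster unit

data = np.array([[0.1, 0.2], [0.0, 0.1], [0.9, 0.8], [1.0, 0.9]])
print(competitive_clustering(data))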
2.3.9. Fixed-weight Nets
There are also fixed-weight nets, which solve constrained optimization problems. Such nets can address problems that are difficult for traditional techniques, for example problems with conflicting constraints, and in some cases the net finds a solution that, if not optimal, is at least satisfactory. For these nets the weights are set, when the net is designed, to represent the constraints and the quantity to be maximized or minimized.
2.3.10. Common Activation Functions
In most cases, a nonlinear activation function is used. Nonlinearity is required to achieve the advantages of multilayer nets over the limited capabilities of single-layer nets, because passing a signal through two or more layers of linear processing elements accomplishes nothing new: a single layer of elements with linear activation functions can obtain exactly the same results. The operation of an artificial neuron is to sum its weighted input signals and apply an output, or activation, function; the simplest case, shown in Figure 3.2.10, is the identity function. Typically the same activation function is used by all the neurons in a given layer, and the activation function is usually nonlinear.
(i) Identity function:
f(x) = x for all x.
Figure 3.2.10. Identity function.
Single-layer nets conventionally use a step function to convert the net input, a continuously valued variable, to an output that is binary (1 or 0) or bipolar (1 or −1), as shown in Figure 3.2.11. The binary step function is also known as the threshold function or Heaviside function.
(ii) Binary step function (with threshold θ):
f(x) = 1 if x ≥ θ,
f(x) = 0 if x < θ.
Figure 3.2.11. Binary step function.
Sigmoid functions (S-shaped curves) are useful activation functions. The most common are the logistic function and the hyperbolic tangent, which are widely used in neural networks trained by backpropagation. Their advantage is the simple relationship between the value of the function at a point and the value of its derivative at that point, which reduces the computational burden during training. The logistic function, which has range (0, 1), is also called the binary sigmoid; it is shown in Figure 3.2.12 for values of the steepness parameter δ.
(iii) Binary sigmoid:
f(x) = 1 / (1 + exp(−δx))
f′(x) = δ f(x) [1 − f(x)]
Figure 3.2.12. Binary sigmoid, range (0, 1).
The logistic sigmoid can also be scaled to cover any range of values; the range most often used is −1 to 1, which gives the bipolar sigmoid, shown in Figure 3.2.13 for δ = 1.
Figure 3.2.13. Bipolar Sigmoid.
(iv) Bipolar sigmoid:
g(x) = 2 f(x) − 1 = 2 / (1 + exp(−δx)) − 1 = (1 − exp(−δx)) / (1 + exp(−δx))
g′(x) = (δ/2) [1 + g(x)] [1 − g(x)]
The bipolar sigmoid is closely related to the hyperbolic tangent, whose output also ranges between −1 and 1. For δ = 1 the bipolar sigmoid is
g(x) = (1 − exp(−x)) / (1 + exp(−x)),
while the hyperbolic tangent is
h(x) = (exp(x) − exp(−x)) / (exp(x) + exp(−x)) = (1 − exp(−2x)) / (1 + exp(−2x)).
The derivative of the hyperbolic tangent is
h′(x) = [1 + h(x)] [1 − h(x)].
When the data values are binary (in the range from 0 to 1) but the bipolar sigmoid or hyperbolic tangent is preferred, the data should be converted to bipolar form.
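For reference, the activation functions of this section translate directly into code. The Python sketch below mirrors the formulas above, including the steepness parameter δ; the function names are illustrative.

import numpy as np

def identity(x):
    return x

def binary_step(x, theta=0.0):
    return np.where(x >= theta, 1.0, 0.0)

def binary_sigmoid(x, delta=1.0):
    """Logistic function, range (0, 1): f(x) = 1 / (1 + exp(-delta * x))."""
    return 1.0 / (1.0 + np.exp(-delta * x))

def binary_sigmoid_deriv(x, delta=1.0):
    f = binary_sigmoid(x, delta)
    return delta * f * (1.0 - f)  # f'(x) = delta f(x) [1 - f(x)]

def bipolar_sigmoid(x, delta=1.0):
    """g(x) = 2 f(x) - 1, range (-1, 1)."""
    return 2.0 * binary_sigmoid(x, delta) - 1.0

def bipolar_sigmoid_deriv(x, delta=1.0):
    g = bipolar_sigmoid(x, delta)
    return (delta / 2.0) * (1.0 + g) * (1.0 - g)  # g'(x) = (delta/2)[1+g][1-g]

def tanh_deriv(x):
    h = np.tanh(x)
    return (1.0 + h) * (1.0 - h)  # h'(x) = [1 + h(x)][1 - h(x)]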
3. LITERATURE REVIEW
Learning Rate
The learning rate prescribes how much the weights change as the network is trained. The amount of weight modification is the error term scaled by the learning rate; if the learning rate is set to 0.5, for example, each weight change is half the size of the corresponding error term. Thus, the larger the learning rate, the larger the weight changes during learning.
Momentum
There is a variable in neural nets called momentum, which determines the proportion of the last weight change that is added to the new weight change. This variable is applied because large learning rates alone can drive oscillations in the weight changes, with the result that the learning process never completes. A small weight-update sketch combining the learning rate and momentum follows.
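The names below are illustrative, and the error term stands for the δ-based correction defined later in this chapter.

def weight_update(w, error_term, prev_delta, alpha=0.5, momentum=0.9):
    """New change = learning rate * error term + momentum * previous change."""
    delta_w = alpha * error_term + momentum * prev_delta
    return w + delta_w, delta_w

w, dw = 0.3, 0.0
for err in [0.12, 0.08, 0.05]:   # a shrinking sequence of error terms
    w, dw = weight_update(w, err, dw)
    print(round(w, 4))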
Backpropagation Neural Net
The limitations of single-layer neural networks led to the development of a training method known as backpropagation, or the generalized delta rule. A backpropagation net is a multilayer neural network trained by backpropagation; the discovery of this general method for training multilayer networks made it possible to solve a wide variety of problems in different areas. Applications of such nets occur in virtually every field that uses neural nets for problems involving mapping a given set of inputs to a specified set of outputs. As in most neural network applications, the goal of training is to achieve a balance between correct responses to the training patterns and reasonable responses to new input that is similar, but not identical, to the training input.
Training a network by backpropagation involves three stages: the feedforward of the input training vector, the backpropagation of the associated error, and the adjustment of the weights. After training, application of the network involves only the calculations of the feedforward phase; the backpropagation phase is used only to make the training procedure effective.
One hidden layer is sufficient for many applications, although more than one hidden layer may be beneficial in some. Although a single-layer net is severely limited in the mappings it can learn, a multilayer net (with one or more hidden layers) can learn any continuous mapping to an arbitrary accuracy.
Architecture
A multilayer neural network with one layer of hidden units (the Z units) is shown in Figure 3.2.14. The bias on a typical output unit Yk is denoted w0k, and the bias on a typical hidden unit Zj is denoted v0j; these bias terms act like weights on connections from units whose output is always 1. Only the feedforward phase of operation, i.e., the direction of information flow, is shown in Figure 3.2.14; during the backpropagation phase of learning, signals are sent in the reverse direction.
Figure 3.2.14. Backpropagation neural network with one hidden layer.
Algorithm
Training a network by backpropagation involves three stages: the feedforward of the input training pattern, the backpropagation of the associated error, and the adjustment of the weights.
During feedforward, each input unit (Xi) broadcasts its received signal to each of the hidden units Z1, …, Zp. Each hidden unit then computes its activation and sends its signal (zj) to each output unit. Each output unit (Yk) computes its activation (yk), and these activations form the response of the net to the given input pattern.
During training, each output unit compares its computed activation yk with its target value ak to determine the associated error for that pattern with that unit. Based on this error, the factor δk (k = 1, …, m) is computed. δk is used to distribute the error at output unit Yk back to all units in the previous (hidden) layer that are connected to Yk; it is also used later to update the weights between the output and the hidden layer. In a similar manner, a factor δj is computed for each hidden unit Zj.
After all of the δ factors have been determined, the weights for all layers are adjusted simultaneously. The adjustment to the weight wjk, from hidden unit Zj to output unit Yk, is based on the factor δk and the activation zj of the hidden unit. The adjustment to the weight vij, from input unit Xi to hidden unit Zj, is based on the factor δj and the activation xi of the input unit.
Nomenclature
The nomenclature used in the training algorithm for the backpropagation net is as follows:
x     Input training vector: x = (x1, …, xi, …, xn).
a     Output target vector: a = (a1, …, ak, …, am).
δk    Portion of the error correction for the weight wjk, due to the error at output unit Yk; also the information about the error at unit Yk that is propagated back to the hidden units that feed into Yk.
δj    Portion of the error correction for the weight vij, due to the error information propagated back from the output layer to hidden unit Zj.
α     Learning rate.
t     Step of training.
m     Momentum parameter.
Xi    Input unit i: for an input unit, the input signal and the output signal are the same, xi.
v0j   Bias on hidden unit j.
Zj    Hidden unit j: the net input to Zj is denoted z_inj,
      z_inj = v0j + Σi xi vij,
      and the output (activation) of Zj is denoted zj,
      zj = f(z_inj).
w0k   Bias on output unit k.
Yk    Output unit k: the net input to Yk is denoted y_ink,
      y_ink = w0k + Σj zj wjk,
      and the output (activation) of Yk is denoted yk,
      yk = f(y_ink).
Activation function
An activation function for a backpropagation net should have several important characteristics: it should be continuous, differentiable, and monotonically non-decreasing. For computational efficiency, its derivative should also be easy to compute; for the most commonly used activation functions, the value of the derivative can be expressed in terms of the value of the function itself. The function is also usually expected to saturate, i.e., to approach finite maximum and minimum values asymptotically.
The binary sigmoid function is one of the most typical activation functions; it has range (0, 1) and is defined as
f1(x) = 1 / (1 + exp(−x))
with
f1′(x) = f1(x) [1 − f1(x)].
This function is illustrated in Figure 3.2.12. The bipolar sigmoid is another common activation function; it has range (−1, 1) and is defined as
f2(x) = 2 / (1 + exp(−x)) − 1
with
f2′(x) = 0.5 [1 + f2(x)] [1 − f2(x)].
The bipolar sigmoid is closely related to the hyperbolic tangent,
tanh(x) = (exp(x) − exp(−x)) / (exp(x) + exp(−x)).
Training Algorithm
Any of the activation functions defined in the previous section can be used in the standard backpropagation algorithm given here; the form of the target values is an important factor in choosing the appropriate function. Because of the simple relationship between the value of each function and the value of its derivative, the exponential function has to be evaluated only once, during the feedforward phase; no additional evaluations are required to compute the derivatives needed during the backpropagation of error.
The algorithm is as follows:
1. Initialize the weights to small random values.
2. While the stopping condition is false, do Steps 3-9.
3. For each training pair, do Steps 4-8.
Feedforward:
4. Each input unit (Xi, i = 1, …, n) receives the input signal xi and broadcasts it to all units in the layer above (the hidden units).
5. Each hidden unit (Zj, j = 1, …, p) sums its weighted input signals,
z_inj = v0j + Σ(i=1…n) xi vij,
applies its activation function to compute its output signal, zj = f(z_inj), and sends this signal to all units in the layer above (the output units).
6. Each output unit (Yk, k = 1, …, m) sums its weighted input signals,
y_ink = w0k + Σ(j=1…p) zj wjk,
and applies its activation function to compute its output signal, yk = f(y_ink).
Backpropagation of error:
7. Each output unit (Yk, k = 1, …, m) receives the target pattern corresponding to the input training pattern and computes its error information term,
δk = (ak − yk) f′(y_ink).
It then calculates the weight correction term used to update wjk later,
Δwjk(t) = α δk zj + m Δwjk(t−1), where Δwjk(t−1) = wjk(t−1) − wjk(t−2),
and the bias correction term used to update w0k later,
Δw0k(t) = α δk + m Δw0k(t−1),
and sends δk to the units in the layer below.
8. Each hidden unit (Zj, j = 1, …, p) sums its delta inputs from the units in the layer above,
δ_inj = Σ(k=1…m) δk wjk,
multiplies by the derivative of its activation function to compute its error information term,
δj = δ_inj f′(z_inj),
then calculates the weight correction term used to update vij later,
Δvij(t) = α δj xi + m Δvij(t−1),
and the bias correction term used to update v0j later,
Δv0j(t) = α δj + m Δv0j(t−1).
Update weights and biases:
Each output unit (Yk, k = 1, …, m) then updates its bias and weights (j = 0, …, p),
wjk(t) = wjk(t−1) + Δwjk(t),
and each hidden unit (Zj, j = 1, …, p) updates its bias and weights (i = 0, …, n),
vij(t) = vij(t−1) + Δvij(t).
9. Test the stopping condition.
Note that in implementing this algorithm, separate arrays should be used for the deltas of the output units (δk of Step 7) and the deltas of the hidden units (δj of Step 8).
One cycle through the entire set of training vectors is called an epoch, and many epochs are usually required to train a backpropagation neural net. The algorithm above updates the weights after each training pattern is presented; in batch updating, by contrast, the weight corrections are accumulated over an entire epoch before the weights are updated.
The mathematical basis for the backpropagation algorithm is the optimization technique known as gradient descent. Here, the function being minimized is the error, and the variables are the weights of the net; the gradient of a function gives the direction in which the function increases most rapidly, so the weights are adjusted in the opposite direction. This derivation is also the reason the weight updates should be performed only after all of the δk and δj terms have been computed, not during backpropagation.
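The algorithm translates almost line for line into code. The sketch below is a compact NumPy rendering (illustrative only: it omits the momentum term and uses the binary sigmoid with δ = 1); XOR is used as the example because it is a mapping that a single-layer net cannot learn.

import numpy as np

rng = np.random.default_rng(0)

def f(x):             # binary sigmoid with steepness 1
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(fx):      # derivative written in terms of the function value
    return fx * (1.0 - fx)

def train_backprop(X, A, n_hidden=4, alpha=0.5, epochs=5000):
    n_in, n_out = X.shape[1], A.shape[1]
    # Step 1: small random initial weights; row 0 of each array holds the biases
    V = rng.uniform(-0.5, 0.5, (n_in + 1, n_hidden))    # input -> hidden
    W = rng.uniform(-0.5, 0.5, (n_hidden + 1, n_out))   # hidden -> output
    for _ in range(epochs):                             # Steps 2-3: epochs, patterns
        for x, a in zip(X, A):
            z = f(V[0] + x @ V[1:])                     # Steps 4-5: feedforward
            y = f(W[0] + z @ W[1:])                     # Step 6
            delta_k = (a - y) * f_prime(y)              # Step 7: output error terms
            delta_j = (delta_k @ W[1:].T) * f_prime(z)  # Step 8: hidden error terms
            W[1:] += alpha * np.outer(z, delta_k)       # updates after all deltas
            W[0] += alpha * delta_k
            V[1:] += alpha * np.outer(x, delta_j)
            V[0] += alpha * delta_j
    return V, W

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
A = np.array([[0], [1], [1], [0]], dtype=float)         # XOR targets
V, W = train_backprop(X, A)
z = f(V[0] + X @ V[1:])
print(np.round(f(W[0] + z @ W[1:]), 2))                 # close to [0, 1, 1, 0]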
Initial weights and biases
The choice of initial weights influences whether the net reaches a global (or only a local) minimum of the error and, if so, how quickly it converges. The update of the weight between two units depends on both the derivative of the upper unit's activation function and the activation of the lower unit. For this reason, it is important to avoid choices of initial weights that would make either the activations or the derivatives of the activations zero. The initial weights must not be too large, or the initial net input to each hidden or output unit will fall in the region where the sigmoid saturates and its derivative is very small. On the other hand, if the initial weights are too small, the net input to a hidden or output unit will be close to zero, which also causes learning to be extremely slow.
A common procedure is to initialize the weights to random values between −0.5 and 0.5, between −1 and 1, or over some other suitable interval. The values may be positive or negative, because the final weights after training may be of either sign.
How Long to Train the Net
The goal in applying a backpropagation net is to achieve a balance between correct responses to training patterns and good responses to new input patterns, that is, a balance between memorization and generalization. It is not necessarily advantageous to continue training until the total squared error actually reaches a minimum. Hecht-Nielsen (1990) recommends using two disjoint sets of data during training: a set of training patterns and a set of training-testing patterns. Weight adjustments are based on the training patterns, but at intervals during training the error is computed using the training-testing patterns. Training continues as long as the error for the training-testing patterns decreases, and is terminated when that error starts to increase, since at that point the net is beginning to memorize the training patterns and starting to lose its ability to generalize.
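A sketch of this stopping procedure follows. The methods train_one_epoch() and error_on() are hypothetical stand-ins for the routines above, assumed here only to illustrate the control flow.

```python
def train_with_early_stopping(net, training_set, testing_set, max_epochs=10_000):
    """Hecht-Nielsen style stopping: halt when the error on the disjoint
    training-testing set begins to rise."""
    best_err = float("inf")
    for _ in range(max_epochs):
        net.train_one_epoch(training_set)  # weight adjustments: training set only
        err = net.error_on(testing_set)    # monitored error: held-out set only
        if err >= best_err:                # error rising: memorization has begun
            break
        best_err = err
    return net
```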
Application
The following is a list of neural network applications that have been developed in real life:
Oil exploration, Stock market prediction, Price forecasts, Horse racing picks, Spectral
analysis and interpretation, Disease diagnosis, Magazine sales forecasts, Legal strategies,
Commodity trading, Urinalysis results, Credit application processing, Predicting student
performance, Optimizing raw material orders, Psychiatric diagnosis, product identification,
Sales prospect selection, Quality control, Help desk applications, Chemical compound
identification, Employee selection, Screening of estimates for time and material, Glass design,
Capital markets analysis, Response instructions for alarm activity reception operator, Travel
voucher screening, workload prediction, Security risk profiling, Process control, Economic
indicator forecasts, computer-aided design, analytical chemistry applications, NFL
predictions, Troubleshooting of scientific instruments, predicting labor hours required for
industrial processes, Optimizing operation of fusion furnaces, Hypertension therapy, Mental
tests, Bacteria identification, Optimizing scheduled machine maintenance, Real estate sales
forecasting, Drug screening, Physical system modeling, Cost analysis, Fault tracing systems,
Spectral peak recognition, Teaching languages, Screening decisions regarding child abuse
and neglect, Currency movement forecasts, Ground water quality control, Heart murmurs
differential diagnosis, Inventory analysis, Optimizing results of biological experiments,
Property tax analysis, Temperature and force prediction in mills and factories, Teaching
problem solving, Selection of criminal investigation targets, Nutrition analysis, Information
reconstruction, Harness racing, Geophysical and seismological research problems, Mining
education, Factory and shop problem analysis, Travel service recommendations, Cash flow
forecasting, Predicting parolee recidivism, Botanical data analysis, Damaged product
identification, Method selection for chemical characterizations, EKG diagnosis, Optimizing
production scheduling, Molding machine operation, Other sales forecasts, computer aided
instruction, Predicating Employee retention, Purchase order screening, Agriculture
experiments, Particle recognition, Teaching AI, college application screening, Mutual fund
picks, Ecosystem evolution, Water resources management.
METHODOLOGY
4.1. Neural network model development
The objective of this research was to generate synthetic wireline logs from other conventional wireline logs using an approach based on an artificial neural network model. This technique is intended to generate an artificial log for any nonexistent log at any specific location. The synthetic logs produced in this way can then be used to compute reservoir characteristics such as fluid saturation, rock permeability, and effective porosity. Execution of this method will reduce companies' operating costs.
An artificial neural network design application, IDEA (Intelligent Data Evaluation & Analysis), was used in this study to generate the synthetic logs and to compute the best model for the given data set. IDEA implements a set of procedures for building and executing neural network models using artificial intelligence techniques such as neural networks, genetic algorithms, and fuzzy logic to solve complex problems for the oil and gas industry.
The application can create and execute a variety of neural network architectures through procedures that allow the user to prepare the input data for training, build the neural network, apply the model to new data, and analyze the outputs. The outputs are evaluated in terms of R-squared (R²). In artificial neural networks, R² is called the predictive power; its value lies between 0 and 1, and the closer it is to one, the better the model. R-squared is defined as:
R^2 = 1 - \frac{SSE}{SS_{YY}}

where

SSE = \sum (y - \hat{y})^2

SS_{YY} = \sum (y - \bar{y})^2

y = the actual value,
ŷ = the predicted value of y, and
ȳ = the mean of the y values.
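As a quick illustration, the following sketch computes this R² for a pair of NumPy arrays; the array names and sample values are illustrative, not data from this study.

```python
import numpy as np

def r_squared(actual, predicted):
    sse = np.sum((actual - predicted) ** 2)        # SSE   = sum (y - y_hat)^2
    ss_yy = np.sum((actual - actual.mean()) ** 2)  # SS_YY = sum (y - y_bar)^2
    return 1.0 - sse / ss_yy

# Example: a close match yields a value near one
print(r_squared(np.array([10.0, 12.0, 14.0]),
                np.array([10.5, 11.8, 14.2])))    # about 0.96
```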
These R² values serve as indicators of the quality of the results produced by the network; by examining them, the user can determine whether the network is working properly.
This study used data from five wells, including gamma ray, density, neutron, and caliper logs. Backpropagation models were tested until the best results were achieved, both in terms of R² and in the match between the synthetic logs generated by the network and the actual logs.
Development of the neural network model was completed with the data partitioned into three sets: training, calibration, and verification. Partitioning could be done by Intelligent Data Partitioning or by random selection; the proprietary Intelligent Data Partitioning technique makes sure that the data is partitioned in an optimum fashion and that all three partitions are statistically representative.
In the search for the best R-squared, different ranges of parameters were considered: momentum, learning rate, weight decay, and hidden layer size. The best result was achieved when the momentum was 0.8, the weight decay was 0.3, the learning rate was 0.2, and the hidden layer size was 30.
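IDEA is proprietary, so its exact training configuration cannot be reproduced here; the following is only a rough open-source approximation of the reported settings using scikit-learn's MLPRegressor. Reading "hidden layer size of 30" as a single hidden layer of 30 neurons, and using the L2 penalty alpha as a stand-in for the weight-decay setting, are assumptions.

```python
from sklearn.neural_network import MLPRegressor

model = MLPRegressor(
    hidden_layer_sizes=(30,),   # one hidden layer of 30 neurons (assumed reading)
    solver="sgd",               # gradient descent with momentum
    momentum=0.8,               # best momentum found
    learning_rate_init=0.2,     # best learning rate found
    alpha=0.3,                  # L2 penalty as a weight-decay stand-in
    max_iter=2000,
)
```

A call such as model.fit(X_train, y_train) would then train on the prepared input matrix described in the next section.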
4.2. Data Preparation
The next step was to prepare a matrix in a spreadsheet to be used in the IDEA application. The matrix for each well contained the well name, the depths, the latitude, the longitude, and the values of the density (DPOR), density caliper (CLDC), gamma ray (GRGC), and neutron (NPRL) logs. Table 4.2.1 shows an example of the arrangement of the matrices.
Table 4.2.1. Segment of the matrix prepared for well HELTON.
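The layout described above can be sketched with pandas as follows; the numeric values are placeholders for illustration, not data from the actual wells.

```python
import pandas as pd

matrix = pd.DataFrame({
    "Well":      ["HELTON", "HELTON"],   # well name
    "Depth":     [100.0, 100.5],         # ft (placeholder values)
    "Latitude":  [36.85, 36.85],
    "Longitude": [-83.85, -83.85],
    "DPOR":      [12.3, 12.1],           # density
    "CLDC":      [8.6, 8.7],             # density caliper
    "GRGC":      [95.0, 97.2],           # gamma ray
    "NPRL":      [14.8, 14.6],           # neutron
})
matrix.to_csv("helton_matrix.csv", index=False)  # exported for the IDEA application
```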
Data logs for the five wells are shown in Figures 4.2.2, 4.2.3, 4.2.4, and 4.2.5.
Figure 4.2.1. Gamma Ray log for five verified wells.
Data logs for each individual well are shown in Figures 4.2.6, 4.2.7, 4.2.8, 4.2.9, and 4.2.10.
Figure 4.2.5. Logs of Hilton well.
4.2.1. Exercise 1: Filling the Gaps (Five Wells Combined)
In this exercise, the entire data set, comprising all five wells, was used during development and training of the network, and each of these wells was then used to verify the trained network. The inputs to the network were the well locations, defined in terms of the depths, the latitude, and the longitude, together with the values of the density (DPOR), density caliper (CLDC), and gamma ray (GRGC) logs. The neutron (NPRL) log was used as the output of the model.
In this exercise the data was partitioned randomly. The coordinates and depths (XYZ) and the values of the density (DPOR), density caliper (CLDC), and gamma ray (GRGC) logs were used as inputs, and the neutron (NPRL) log value was used as the output. The percentages used for training, calibration, and verification were 65%, 15%, and 30%, respectively (Figure 4.2.10).
Figure 4.2.10. Distribution of wells used for the training/testing and production data sets in Exercise 1.
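A sketch of such a random partition follows; X and y are hypothetical arrays of the inputs and the NPRL output, and, since the stated percentages sum to more than 100%, the sketch simply assigns whatever remains after training and calibration to verification.

```python
import numpy as np

def random_partition(X, y, frac_train=0.65, frac_cal=0.15, seed=0):
    """Randomly split (X, y) into training, calibration, and verification sets."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_train = int(frac_train * len(X))
    n_cal = int(frac_cal * len(X))
    train = idx[:n_train]
    cal   = idx[n_train:n_train + n_cal]
    ver   = idx[n_train + n_cal:]        # remainder goes to verification
    return (X[train], y[train]), (X[cal], y[cal]), (X[ver], y[ver])
```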
4.2.2. Exercise 2: Four Wells Combined, One Well Out
Unlike Exercise 1, this exercise used only four wells for development and training of the network, while the fifth well was never used during training and calibration. Instead, the fifth well was reserved for verification: its synthetic logs were generated from the other four wells. In this exercise the data was partitioned manually, since the verification data set consisted of records never used during training, while the model was developed with the training and calibration data sets. The percentages used were 60% for training, 20% for calibration, and 20% for verification, as shown in Figure 4.2.11.
Therefore, wells Helton, Paternson, Gibson, and Patrin were combined to generate logs in well Gambrie; wells Helton, Paternson, Gibson, and Gambrie were combined to generate logs in well Patrin; wells Helton, Paternson, Patrin, and Gambrie were combined to generate logs in well Gibson; wells Helton, Gibson, Patrin, and Gambrie were combined to generate logs in well Paternson; and wells Paternson, Gibson, Patrin, and Gambrie were combined to generate logs in well Helton.
Figure 4.2.11. Distribution of wells used for the training/testing and production data sets in Exercise 2.
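This rotation is a leave-one-well-out scheme, sketched below; wells is a hypothetical dict mapping each well name to its (X, y) arrays, and build_and_train is a hypothetical helper that fits a backpropagation model and returns an object with a predict() method.

```python
import numpy as np

def leave_one_well_out(wells, build_and_train):
    """For each well, train on the other four and generate its synthetic log."""
    synthetic = {}
    for held_out in wells:
        X_train = np.vstack([X for name, (X, y) in wells.items()
                             if name != held_out])
        y_train = np.concatenate([y for name, (X, y) in wells.items()
                                  if name != held_out])
        model = build_and_train(X_train, y_train)        # hypothetical helper
        synthetic[held_out] = model.predict(wells[held_out][0])
    return synthetic
```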
RESULTS
As mentioned before, the neural network used in this work to generate the synthetic logs was a backpropagation neural network. During training, the network uses a data set consisting
of inputs and outputs. For calibration the data set consists of a similar number of inputs and
outputs, but in this case they are used to validate the network by verifying how well the network
is performing on data that were never seen before during the training process. In this fashion, the
partially constructed network is checked at certain intervals of training by applying the
calibration data set. Finally the verification set is used to prove the ability of the network to
provide accurate results on the unseen data. Therefore, the values of R2 obtained for each of the
dataset mentioned, reflect the performance of the network during training, calibration and
verification.
5.1. Exercise 1 - Results
In this exercise, the R-squared values for the five wells were 0.8961 for training, 0.87489 for calibration, and 0.88542 for verification. Figure 5.1.1 shows these R-squared values, and Figures 5.1.2, 5.1.3, and 5.1.4 show the application graphs.
Figure 5.1.2. Training R-Square in the application graph.
Figure 5.1.3. Calibration R-Square in the application graph.
Figure 5.1.4. Verification R-Square in the application graph.
5.2. Exercise 2 - Results
5.2.1. Results of R-Squared for Training, Calibration, and Verification
In this part, the results of the second exercise are reviewed. A summary of the results is shown in Table 5.2.1 below.
Table 5.2.1. Results of R-squared for training, calibration, and verification of the five wells.
From the table, it can be seen that the R-squared values for all wells are reasonable. However, R-squared is lower for well Patterson than for the other wells. Since R-squared is an overall result, the software output (neutron) and the real data should also be verified well by well.

Comparisons of the synthetic and actual logs for the best and worst wells are shown in Figures 5.2.1 and 5.2.2.
Figure 5.2.1. Comparison of actual and virtual results in well Hilton.
For well Hilton, which has an R-squared of 0.9053, the virtual and actual results match very well. However, as depth increases (beyond 500 ft), the actual and virtual results deviate from each other.
Figure 5.2.2. Comparison of actual and virtual results in well Patterson.
For well Patterson, which has the lowest R-squared (0.79), the deviation between the actual and virtual results escalates with increasing depth. Since the actual neutron log fluctuates severely in this well, a low R-squared is to be expected.

Like Hilton, wells Partin, Gambrie, and Gibson have reasonable R-squared values, and the actual and virtual results deviate with increasing depth. The comparisons between the synthetic and actual logs are shown in Figures 5.2.3, 5.2.4, and 5.2.5.
CONCLUSION
This study has demonstrated the generation of synthetic logs using a backpropagation neural network. A data set consisting of inputs and outputs was applied during training of the network. The calibration data set consisted of a similar number of inputs and outputs, but was used to validate the network by verifying how well it performed on data never seen during the training process; the network under development was checked at certain intervals of training by applying the calibration data set to the model. The verification data set was used to prove the ability of the network to provide reasonable results on unseen data.
Moreover, the R² values obtained for each of the data sets mentioned reflect the performance of the network during training, calibration, and verification.
The neural network model to predict the neutron log was built through Exercises 1 and 2, using different combinations of inputs and outputs.
Results indicate that the best performance was obtained for well Hilton in Exercise 2. The results for the Gambrie (R² = 0.86) and Gibson (R² = 0.90) wells were also reasonable. The log for well Patterson (R² = 0.78) was the least predictable, as a consequence of radioactive fluctuations of the source during the logging operation.
This research demonstrated that the quality of the data involved in generating synthetic logs plays a very important role in the development of the neural network model. The quality of the logs is a determining factor when logs are not available in digital format and have to be digitized. For future work, it is therefore recommended that quality control of the data be performed before a neural network model is built.
REFERENCES
Ameri, S., 2009. Notes for the Advanced Formation Evaluation course. West Virginia University,
Morgantown, West Virginia.
Aminzadeh, F., 2001. Soft computing information applications for the fusion and analysis of large data sets: Fact Inc./dGB-USA. University of Berkeley, California.
Banchs, R. and Michelena, R., 2002. From 3D seismic attributes to pseudo-well-log volumes using neural networks: Practical considerations. The Leading Edge, Vol. 21, No. 10, pp. 996-1001.
Bassiouni, Z., 1994. Theory, measurement, and interpretation of well logs. SPE textbook series,
vol. 4, 372 p.
Bhattacharya, A. K., and Venkobachar, C., 1984. Removal of Cadmium by Low Cost Adsorbents. Journal of Environmental Engineering, Vol. 110, No. 1, pp. 110-122.
Bhuiyan, M., 2001. An Intelligent System’s Approach to Reservoir Characterization in Cotton
Valley. M.S. Thesis, West Virginia University, Morgantown, West Virginia.
Boswell, R. M., and Donaldson, A. C., 1988. Depositional architecture of the Catskill delta
complex; central Appalachian basin, U.S., in Canadian Society of Petroleum Geologists
Memoir 14, pp. 65-84.
Boswell, R.M., Heim, L.R., Wrightstone, G.R., and Donaldson, A.C., 1996. Play Dvs: Upper
Devonian Venango sandstones and siltstones in Roen, J.B. and Walker, B.J. (Eds), The
Atlas of Major Appalachian Gas Plays, Publication V-25, West Virginia Geological and
Economic Survey, Morgantown, WV, pp. 63-69.
Cant, D., 1992. Subsurface facies analysis, in Walker, R. and James, N., eds. Facies Models:
response to sea level change. Geological Association of Canada, 454 p.
Fausett, L., 1994. Fundamentals of neural networks: Architectures, algorithms, and applications. Prentice Hall, Englewood Cliffs, NJ.
Gaskari, R., 1996. Virtual adsorber system for removal of lead: application of artificial neural networks to a complex environmental problem.
Hebb, D., 1949. The organization of behavior. Wiley.
Hecht-Nielsen, R., 1990. Neurocomputing. Addison-Wesley, Reading, MA.
Jaeger, H., 2002. Supervised training of recurrent neural networks. An INDY Math Booster
tutorial, AIS, September / October 2002.
Kammer, T.W., and Bjerstedt, T.W., 1988. Genetic stratigraphy and depositional systems of the Upper Devonian-Lower Mississippian Price-Rockwell delta complex in the central Appalachians, USA. Sedimentary Geology, v. 54, pp. 265-301.
Lawrence, S., Giles, C. L., and Tsoi, A., 1997. Lessons in Neural Network Training: Training
may be harder than Expected. Proceedings of the Fourteenth National Conference on
Artificial Intelligence, AAAI-97, (pp. 540-545), Menlo Park, California: AAAI Press.
McBride, P., 2004. Facies Analysis of the Devonian Gordon Stray Sandstone in West Virginia.
M.S. Thesis, West Virginia University, Morgantown, West Virginia.
Pofahl, W., Walczak, S., Rhone, E., and Izenberg, S., 1998. Use of an Artificial Neural Network to Predict Length of Stay in Acute Pancreatitis. American Surgeon, Sep. 1998, Vol. 64, Issue 9, pp. 868-872.
Poulton, M., 2002. Neural networks as an intelligence amplification tool: A review of applications. Geophysics, Vol. 67, No. 3, pp. 979-993.
Prechelt, L., 1998. Early Stopping - but when? Neural Networks: Tricks of the Trade, pp. 55-69.
Retrieved March 28, 2002 from World Wide Web:
http://wwwipd.ira.uka.de/~prechelt/Biblio/
Sarle, W., 1997. Neural Network FAQ, part 1 of 7: Introduction. Periodic posting to the Usenet
newsgroup comp.ai.neural-nets. URL: http://ftp.sas.com/pub/neural/FAQ.html
Specht, D., 1991. A General Regression Neural Network. IEEE Transactions on Neural Networks, Vol. 2, No. 6.
Tonn, T., 2002. Neural network seismic reservoir characterization in a heavy oil reservoir. The Leading Edge, Vol. 21, No. 3, pp. 309-312.
White, D., Aminian, K., Mohaghegh, S., and Esposito, P., 1995. The application of ANN for zone identification in a complex reservoir: Society of Petroleum Engineers, SPE Eastern Regional Conference & Exhibition.