Data Mining in Forecasting PVT Correlations of Crude …mlearn/Data Mining in Forecasting PVT... ·...
Transcript of Data Mining in Forecasting PVT Correlations of Crude …mlearn/Data Mining in Forecasting PVT... ·...
Data Mining in Forecasting PVT Correlations of Crude Oil Systems Based on Type-1 Fuzzy Logic Inference
Systems
1
2
3
4
5 6
Emad A. El-Sebakhy Information & Computer Science Department, College of Computer Sciences and Engineering,
King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia [email protected] and [email protected] 7
Abstract 8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
Pressure-Volume-Temperature properties are very important in the reservoir engineering computations. There are many
empirical approaches for predicting various PVT properties using regression models. Last decade, researchers utilized neural
networks to develop more accurate PVT correlations. These achievements of neural networks open the door to data mining
techniques to play a major role in oil and gas industry. Unfortunately, the developed neural networks correlations are often
limited and global correlations are usually less accurate compared to local correlations. Recently, adaptive neuro-fuzzy
inference systems have been proposed as a new intelligence framework for both prediction and classification based on fuzzy
clustering optimization criterion and ranking. This paper proposes neuro-fuzzy inference systems for estimating PVT properties
of crude oil systems. This new framework is an efficient tool for modeling the kind of uncertainty associated with vagueness and
imprecision. It is a novel hybrid computational intelligence scheme that is able to forecast/classify an output in the uncertainty
situations. We briefly describe the learning steps and the use of the Takagi Sugeno and Kang model and Gustafson–Kessel
clustering algorithm with K-detected clusters from the given database. It has featured in a wide range of medical, power control
system, and business journals, often with promising results. A comparative study will be carried out to compare their
performance of this new framework with the most popular modeling techniques, such as, neural networks, nonlinear regression,
and the empirical correlations algorithms. The results show that the performance of neuro-fuzzy systems is accurate, reliable,
and outperform most of the existing forecasting techniques. Future work can be achieved by using neuro fuzzy systems for
clustering the 3D seismic data, identification of lithofacies types, and other reservoir characterization.
Keywords –Type1 neuro-fuzzy systems; Feedforward neural networks, Empirical correlations, PVT properties; Formation
volume factor; Bubble point pressure
1
1. Introduction 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Knowing both chemical and physical properties of formation water is very important in various reservoir
engineering computations, especially in water flooding and production. Ideally, these properties should be obtained
experimentally. On some occasions, these properties are neither available nor reliable; then, empirically derived
correlations are used to predict brine Pressure-Volume-Temperature (PVT) properties. These correlations offer an
acceptable approximation of formation water properties. However, the success of such correlations in prediction
depends mainly on the range of data at which they were originally developed. These correlations were developed
using equation of state (EOS), linear/nonlinear statistical regression, or graphical techniques. The currently available
PVT simulator predicts the physical properties of reservoir fluids with vary degree of accuracy based on the type of
used model, the nature of fluid, and the prevailing conditions. Nevertheless, they all exhibit the significant
drawback of lacking the ability to forecast the quality of their answers. The equation of state is based on knowing
the detailed compositions of the reservoir fluids. The determination of such quantities is expensive and time
consuming.
The equation of state involves numerous numerical computations. On the other hand, PVT correlations are
based on easily measured field data: reservoir pressure, reservoir temperature, oil, and gas specific gravity. In the
petroleum process industries, reliable experimental data are always to be preferred over data obtained from
correlations. However, very often reliable experimental data are not available, and the advantage of a correlation is
that it may be used to predict properties for which very little experimental information is available. The importance
of accurate PVT data for material-balance calculations is well understood. It is crucial that all calculations in
reservoir performance, in production operations and design, and in formation evaluation be as good as the PVT
properties used in these calculations. The economics of the process also depends on the accuracy of such properties.
Reservoir fluid properties are very important in petroleum engineering computations, such as, material balance
calculations, well test analysis, reserve estimates, inflow performance calculations, and numerical reservoir
simulations. Ideally, these properties are determined from laboratory studies on samples collected from the bottom
of the wellbore or at the surface. Such experimental data are, however, very costly to obtain. Therefore, the
solution is to use the empirically derived correlations to predict PVT properties, see Osman et al. (2001). There are
many empirical correlations for predicting PVT properties, most of them were developed using equations of state
2
(EOS) or linear/non-linear multiple regression or graphical techniques or feedforward neural networks (FFN).
However, they often do not perform very accurately and suffer from a number of drawbacks, such as, FFN is a
black box modeling scheme that is based on the trial-and-error approach. In addition, FFN architectural parameters
have to be guessed in advance, such as, number and size of hidden layers and the type of transfer function(s) for
neurons in the various layers. Moreover, the training algorithm parameters were determined based on guessing
initial random weights, learning rate, and momentum. Although acceptable results may be obtained with effort, it is
obvious that potentially superior models can be overlooked.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
The considerable amount of user intervention not only slows down model development, but also works against
the principle of ‘letting the data speak’. Furthermore, each correlation was developed for a certain range of
reservoir fluid characteristics and geographical area with similar fluid compositions and API oil gravity. Thus, the
accuracy of such correlations is critical and not often known in advance. Among those PVT properties is the bubble
point pressure (Pb), Oil Formation Volume Factor (Bob), which is defined as the volume of reservoir oil that would
be occupied by one stock tank barrel oil plus any dissolved gas at the bubble point pressure and reservoir
temperature. Precise prediction of Bo is very important in reservoir and production computations.
The development of correlations for PVT calculations has been the subject of extensive research, resulting in a
large volume of publications. Several graphical and mathematical correlations for determining both Pb and Bo have
been proposed during the last decade. These correlations are essentially based on the assumption that Pb and Bo are
strong functions of the solution gas-oil ratio (Rs), the reservoir temperature (Tf), the gas specific gravity (Gg), and
the oil specific gravity (G0), see El-Sebakhy et al. (2007), Goda et al. (2003), and Osman et al. (2001) for more
details.
The main objective of this paper is to investigate the feasibility of Type1 neuro fuzzy inference systems
(ANFIS) in estimating the PVT properties of crude oil systems, specifically develop a new intelligence system
framework for predicting both bubble point pressure and Oil Formation Volume Factor using different databases of
four input parameters, namely, solution gas-oil ratio (Rs), reservoir temperature (Tf), oil gravity (API), and gas
relative density. The rest of this paper is organized as follows. Section 2 provides a brief literature review and
related work. Section 3 provides both data acquisition and statistical quality measures. Section 4 consists of the
3
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
adaptive neuro-fuzzy inference systems methodology and architecture. The experimental set-up is discussed in
section 5. Section 6 shows the performance of the approach by giving the experimental results.
2. Literature Review
Last six decades, engineers realized the importance of developing and using empirical correlations for PVT
properties. Studies carried out in this field resulted in the development of new correlations.
2.1. Empirical Models and Evaluation Studies
There are numerous of empirical correlations published in literature, the most popular ones can be summarized
as, (a) in 1947, the author in Standing (1947) presented Standing empirical correlations for bubble point pressure
and for oil formation volume factor. These empirical correlations were based on laboratory experiments carried out
on 105 samples from 22 different crude oils in California, (b) in 1980, Glazo (1980) developed Glaso empirical
correlation for formation volume factor using 45 oil samples from North Sea hydrocarbon mixtures, and (c) in 1992,
the author in Al-Marhoun (1992) published his second Al-Marhoun empirical correlation for oil formation volume
factor based on database of 11,728 experimentally data points for formation volume factors at, above, and below
bubble point pressure. The data set represents samples from more than 700 reservoirs from all over the world,
mostly from Middle East and North America. The reader can consider other empirical correlations, see Al-
Shammasi (1997) and El-Sebakhy et al. (2007) for more empirical correlations. In this paper, we only concentrate
on the most common three empirical correlations, see Al-Marhoun (1992), Glazo (1980) and Standing (1947) for
the sake of simplicity to do our comparative studies with these three popular empirical correlations.
2.2. Modeling PVT Properties Based on Neural Networks
Artificial neural networks (ANNs) are parallel distributed information processing models that can recognize
highly complex patterns within available data. In recent years, neural network have gained popularity in petroleum
applications. Many authors discussed the applications of neural network in petroleum engineering such as Ali
(1994), Elsharkawy (1998), Gharbi et al. (1997 and 1997a), Kumoluyi et al. (1994), Mohaghegh (1994, 1995, and
2000), and Varotsis et al. (1999). The most common widely used neural network in literature is known as the
feedforward neural networks with backpropagation training algorithm, see Ali (1994), Duda et al. (2001), and
4
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Osman et al. (2001). This type of neural networks is excellent computational intelligence modeling scheme in both
prediction and classification tasks. Few studies were carried out to model PVT properties using neural networks.
Recently, feedforward neural network serves the petroleum industry in predicting PVT correlations; see the work of
Gharbi et al. (1997 and 1997a) and Osman et al. (2001).
The author in Al-Shammasi (1997 and 2001) presented neural network models and compared their performance
to numerical correlations. He concluded that statistical and trend performance analysis showed that some of the
correlations violate the physical behavior of hydrocarbon fluid properties. In addition, he pointed out that the
published neural network models missed major model parameters to be reproduced. He uses two hidden layers
(2HL) neural networks (4-5-3-1) structure (neural network has the architecture: four input variables, two hidden
layers (the first hidden layer has five hidden nodes and the second hidden layer has three hidden nodes), and one
output variable in the output layer for predicting both properties: bubble point pressure and oil formation volume
factor. He evaluates published correlations and neural-network models for bubble point pressure and oil formation
volume factor for their accuracy and flexibility in representing hydrocarbon mixtures from different locations
worldwide. Comparative studies between the feedforward neural networks performance and the four empirical
correlations: Standing, Al-Mahroun, Glaso, and Vasquez and Beggs empirical correlation were carried out in
Gharbi et al. (1997 and 1997a) and Osman et al. (2001). The performance results were explained in details in Al-
Marhounm (1988), El-Sebakhy et al. (2007), and Osman et al. (2001). The authors in Gharbi et al. (1997 and 1997a)
published neural network models for estimating bubble point pressure and oil formation volume factor for Middle
East crude oils based on the neural system with log sigmoid activation function to estimate the PVT data for Middle
East crude oil reservoirs, while in Gharbi et al. (1997a), the authors developed a universal neural network for
predicting PVT properties for any oil reservoir. In Gharbi et al. (1997), two neural networks are trained separately
to estimate the bubble point pressure and oil formation volume factor, respectively. The input data were solution
gas-oil ratio, reservoir temperature, oil gravity, and gas relative density. They used two hidden layers (2HL) neural
networks: The first neural network, (4-8-4-2) to predict the bubble point pressure and the second neural network,
(4-6-6-2) to predict the oil formation volume factor. Both neural networks were built using a data set of size 520
observations from Middle East area. The input data set is divided into a training set of 498 observations and a
testing set of 22 observations.
5
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
17
18
20
21
22
23
24
25
26
The authors in Gharbi et al. (1997a) follow the same criterion of Gharbi et al. (1997), but on large scale
covering additional area: North and South America, North Sea, South East Asia, with the Middle East region. They
developed a one hidden layer neural network using a database of size 5434 representing around 350 different crude
oil systems. This database was divided into a training set with 5200 observations and a testing set with other 234
observations. The results of their comparative studies were shown that the FFN outperforms the conventional
empirical correlation schemes in the prediction of PVT properties with reduction in the average absolute error and
increasing in correlation coefficients. The reader can take a look at Al-Shammasi (1997) and El-Sebakhy et
al. (2007) for more utilization of other type of neural networks in predicting the PVT properties, for instance, radial
basis functions and abductive networks.
3. Data Acquisition and Statistical Quality Measures
3.1. The Acquired Databases
To demonstrate the usefulness of the Type1 Fuzzy modeling scheme, the developed calibration model based on
three distinct databases (i) database with 160 observations and (ii) database with 283 observations will be used to
predict both Pb and Bob, and (iii) the world wide database with 782 observations. The complete databases are utilized
before in distinct published research articles, the detail of these databases were explained below:
1. The first database was drawn from the article of Al-Marhounm (1988). This database has 160 data drawn from 16
69 Middle Eastern reservoirs, which published correlations for estimating bubble point pressure and oil
formation volume factor for Middle East oils.
2. The second database was drawn from articles by Al-Marhoun et al. (2002) and Osman et al. (2002 and 2005). 19
This database has 283 data points collected from different Saudi fields to predict the bubble point pressure, and
the oil formation volume factor at the bubble-point pressure for Saudi crude oils. The models were based on
neural networks with 142 training set to build FFN calibration model to predict the bubble point pressure, and
the oil formation volume factor, 71 to cross-validate the relationships established during the training process
and adjust the calculated weights, and the remaining 70 to test the model to evaluate its accuracy. The results
show that the developed Bo model provides better predictions and higher accuracy than the published
correlations.
6
3. The third database was obtained from the works of Goda et al. (2003) and Osman et al. (2001), where the 1
authors used feedforward learning scheme with log sigmoid transfer function in order to estimate the formation 2
volume factor at the bubble point pressure. This database contains 782 observations after deleting the redundant 3
21 observations from the actual 803 data points. This data set is gathered from Malaysia, Middle East, Gulf of 4
Mexico, and Colombia. The authors in Goda et al. (2003) and Osman et al. (2001) designed a one hidden layer 5
(1HL) feedforward neural network (4-5-1) with the backpropagation learning algorithm using four input 6
neurons covering the input data of gas-oil ratio, API oil gravity, relative gas density, and reservoir temperature, 7
one hidden layer with five neurons, and single neuron for the formation volume factor in the output layer. 8
To evaluate the performance of each Type1 fuzzy inference system, feedforward neural network with
backpropagation learning scheme, and the most common three empirical correlations in literature using the above
defined three distinct databases. We use the stratified criterion to divide the provided database by selecting 70% of
the it for building the calibration type1 fuzzy model (internal validation) and 30% of the data for testing/ validation
(external validation or cross-validation criterion). We repeat both internal and external validation processes for 1000
times to have a fair partition through the entire process operations.
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
3.2. Background and Implementation Process
During the implementation, the user should be aware of the input domain values to make sure that the input
data values fall in a natural domain. This step called the quality control and it is really an important step to have
very accurate and reliable results at the end. The following is the most common domains for the input/output
variables, gas-oil ratio, API oil gravity, relative gas density, reservoir temperature; bubble point pressure, and oil
formation volume factor that are used in the both input and output layers of modeling schemes for PVT analysis:
• Gas oil ratio with range between 26 and 1602, scf/stb.
• Oil formation volume factor which varies from 1.032 to 1.997, bbl/stb.
• Bubble point pressure, starting from 130, ending with 3573, psia.
• Reservoir temperature with its range from 74° F to 240° F.
• API gravity which changes between 19.4 and 44.6.
• Gas relative density, change from 0.744 to 1.367.
7
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
In this study, we utilize the data provided in Al-Marhoun et al. (2002) and Osman et al. (2002 and 2005) in
both internal and external validation process by dividing the 382 data points to 267 for building the calibration
model and the remaining 115 for cross-validation process to evaluate its accuracy and trend stability. Next, we
investigate the capability of the established calibration neuro-fuzzy type1 inference relationships to forecast both
bubble point pressure and Oil Formation Volume Factor for new unseen databases based on the same four input
parameters, namely, solution gas-oil ratio, reservoir temperature, oil gravity, and gas relative density. For both
internal and external validation processes, different quality control and statistical measures were calculated to
compare between the new intelligence framework, the feedforward neural networks with backpropagation learning
algorithm, and the most popular empirical correlations (Standing, Al-Mahroun, and Glaso empirical correlation)
performance. We repeat the same process with the other two databases as well. The obtained results of the entire
process were shown in Tables 1 through 6, respectively.
3.3. The Most Common Statistical Quality Measures
To compare the performance and accuracy of the new intelligence framework to other empirical correlations,
statistical error analysis and quality measures are performed. The most common statistical quality measures that are
utilized in both petroleum engineering and data mining journals were namely, the average percent relative error
(Er), average absolute percent relative error (Ea), minimum and maximum absolute percent error (Emin and (Ermax))
root mean square errors (Erms), standard deviation (SD), and correlation coefficient (R2), see Duda et al. (2001) and
Osman et al. (2001) for more details about their corresponding mathematical formulae.
As it is shown below in the empirical study section, the results show that the Type1 neuro-fuzzy inference
intelligence system scheme is faster and more stable than both empirical correlations and other forecasting schemes
reported in the petroleum engineering literatures. Moreover, the new data mining modeling scheme outperforms
both feedforward neural network and all the most common existing correlations models in terms of root mean
squared error, absolute average percent error, standard deviation, and correlation coefficient.
4. Neuro-Fuzzy Inference Systems
Fuzzy logic is an application of recognized softcomputing techniques. It is a design method that can be
effectively applied to problems that, because of complex, nonlinear, or ambiguous models, cannot be easily solved
8
1
2
3
4
5
6
7
8
9
10
11
12
13
using traditional engineering analytical techniques. Fuzzy logic comprises of fuzzy sets, which are a way of
representing non-statistical uncertainty and approximate reasoning, which includes the operations used to make
inferences. Fuzzy theory is a theoretical framework having fuzzy sets and fuzzy logic as its core; it started with the
fuzziness concept and its expression in the form of fuzzy, LeCun et al. (1995), and Liu et al. (2003).
The fuzzy theory has found many applications in a variety of fields such as plant process control,
autoimmunization, pattern recognition and decision-making. It is an excellent tool for modeling the kind of
uncertainty associated with vagueness, with imprecision, and/or with a lack of information regarding a particular
element of the problem at hand. Fuzzy systems perform well on uncertain information, very similar to the way
human reasoning does. Moreover, the information in prediction or pattern classification problems is imprecise rather
than precise in nature, and fuzzy set theory allows us to properly model this vague information. All the basic
concepts, such as, concepts of fuzzy set theory, including fuzzy relations, fuzzification and defuzzification,
construction of membership functions, and fuzzy arithmetic are shown in details in LeCun et al. (1995), and
Taghavi (2005). Amabeoku et al. (2005) and Taghavi (2005) propose the Fuzzy sets are defined through their
membership functions, μ which map the elements of the considered universe to the unit interval [0 . The
membership of an element
1],14
x in the crisp set A is represented by the characteristic function Aμ of A , that is, 15
1 if( )
0 ifA
x Ax
x Aμ
; ∈⎧= ⎨ ; ∉⎩ .
16
Figure1. Type-1 FLS with crisp inputs and output
9
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
Generally, the rule-based fuzzy modeling technique can be classified into three categories, namely the
linguistic (Mamdani-type), the relational equation, and the Takagi, Sugeno and Kang (TSK), see Cuddy et
al. (1998), Hambalek et al. (2003), Jong-Se et al. (2004), LeCun et al. (1995), Liu et al. (2003), and McCain et
al. (1998). In linguistic models, both the antecedent and the consequence are fuzzy sets, while in the TSK model the
antecedent consists of fuzzy sets but the consequence is made up of linear equations. Fuzzy relational equation
models aim at building the fuzzy relation matrices according to the input-output process data determined. We are
going to focus on the use of the neuro-fuzzy systems with the TSK model for predicting the PVT correlations of
crude oil systems, because of TSK needs less rules and its parameters can be estimated from numerical data using
optimization methods such as least-square algorithms, see Abdulraheem et al. (2007), LeCun et al. (1995), and
Liu et al. (2003).
4.1. Adaptive Neuro-Fuzzy Inference Systems
The neuro-fuzzy inference system Type1 is a hybrid forecasting/classification framework, which learns the
rules and membership functions from data. It is a network of nodes and directional links. Associated with the
network is a learning rule, for instance, backpropagation. It’s called adaptive because some, or all, of the nodes have
parameters which affect the output of the node. These networks are learning a relationship between inputs and
outputs. This type of networks cover number of different approaches, namely, Mamdani type and the Takagi-
Sugeno-Kang type, see Abdulraheem et al. (2007), Amabeoku et al. (2005), Cuddy et al. (1998), Hambalek et
al. (2003), Jong-Se et al. (2004), McCain et al. (1998), and Standing (1947)] for more details.
The basic architecture of a type-1 FLS with crisp inputs and output is shown in Figure 1. The TSK fuzzy
modeling method was proposed by Takagi and Sugeno as a framework for generating fuzzy if then rules from
numerical data. A TSK fuzzy model consists of a set of fuzzy rules, each describing a local linear input-output
relationship:
1 1 2 2if is and is and if isi i i p ipREL x A x A … x A: , , , 0 1 1then 1 2i i i ip py a a x … a x i … n= + + + ; = , , , ;23
where iREL is the rule; thi 1 px … x, ,
i
are the input variables; are the fuzzy sets assigned to
corresponding input variables;
1iA … A, , ip24
y represents the value of the local output; and are the model thi 0ia … a, , ip25
10
consequent parameters. The final global output of the TSK fuzzy model for a crisp input vector 1( )px … x= , ,x is
calculated using fuzzy mean-weight formula
1
( ) ( )1
1 1y
n n
i i i i ii i
x y xβ β−
= =
⎡ ⎤= ⎢ ⎥⎣ ⎦∑ ∑
, where ( )i ixβ represents the degrees of firing
(DOF) of the fuzzy rule that is defined as
2
( ) ( )1 1i i Ax Min x …β μ
i ipA pxμ⎧ ⎫⎪ ⎪⎛ ⎞⎨ ⎬⎜ ⎟
⎝ ⎠⎪ ⎪⎩ ⎭,= , . The construction of TSK fuzzy model
from numerical data proceeds in three steps: fuzzy clustering, setting the membership functions, and parameter
estimation, see Al-Marhoun et al. (2002), El-Sebakhy et al. (2007a), Osman et al. (2002 and 2005). The most
common architecture of the Type1 neuro-fuzzy inference systems in literature is shown in Figure 2.
3
4
5
6
Layer 2 Layer 5Layer 3 Layer 4 LaLayer 1
N1
∑y
N3
N2
N4
1
x2
x1
yer 6x1 x2
A1 1∏
2∏
3∏
4∏
A2
B1
B2
2
3
4
Figure 2. Adaptive neuro fuzzy Architecture for a two rule Sugeno system
4.2. Fuzzy Clustering and Partitioning Based on Gustafson Kessel Scheme 7
8 Fuzzy clustering partitioning of the input–output space is performed in the first step, using the selected
clustering method. The clustering method utilizes training data, 1{( )}i ipx … x y= , , ;D , of n input
vectors of dimension
1i …= , ,n9
p and one output. Each obtained cluster represents a certain operating region of the system,
where input–output data values are highly concentrated. The learning data, divided into these information clusters,
are then interpreted as rules. The most popular fuzzy clustering methods in the machine learning and data mining
literature are fuzzy c-means (FCM) and Gustafson–Kessel (GK), see Goda et al. (2003), LeCun et al. (1995), Liu et
al. (2003), and Taghavi (2005) for more details.
10
11
12
13
14
11
Let xj be the input vector for the jth observation over different samples, and let vi be the ith cluster centroid
(prototype). Then a typical distance norm between xj and vi is
1
( ) (22 ,T
ij j i j i j iA)D x v x v A x v= − = − − where A is a
symmetric and positive definite matrix and V = [v1,v2,...,vk] is a vector of the centroids of the fuzzy clusters
C1,C2,...,Ck. Use of the matrix Ai makes it possible for each cluster to adapt the distance norm to the geometrical
structure of the data at each iteration. Therefore, different norms can be induced by the choice of the matrix A. The
Euclidean norm is induced when A = I, where I is an identity matrix. The Mahalonobis distance (norm) is induced
when A = M–1, where M–1 is the inverse of the covariance matrix of patterns in the system. Although many clustering
methods have been studied in the literature, a common limitation of conventional methods is to use a fixed distance
norm for finding clusters; this fixed norm imposes a fixed geometrical structure and finds clusters of that shape even
if they are not present. The Euclidean norm-based methods find only spherical shape of clusters and the
Mahalanobis norm-based methods find only ellipsoidal ones even if those shapes of clusters are not present in a
dataset. Based on the norm-inducing matrices, the objective of the GK method is obtained by minimizing the
function Jm, that is,
2
3
4
5
6
7
8
9
10
11
12
13
j ijA( ) 2
1 1, , : ,
i
k n m
m ii j
J U V A X Dμ⎛ ⎞⎜ ⎟⎝ ⎠
= =
= ∑ ∑ 14
)where ( 1 1, ,..., kA A A A= is a k-tuple of the norm-inducing matrices, U, where is a fuzzy partition matrix
of X satisfying the following constraints:
( )*ij k n
U μ=15
16
- [ ]0,1 ,1 , 1ij i k j nμ ∈ ≤ ≤ ≤ ≤ , , and 1
1 for 1k
iji
j nμ=
= ≤ ≤∑17
- , where 1
0 , 1n
ijj
n i kμ=
< ≤ ≤ ≤∑ [ )1, is a weighting exponent that controls the membership degree ij
m ∈ ∞ μ of
each data point xj to the cluster Ci.
18
19
20 The choice of appropriate m value is of importance because the final clusters may vary depending on the m
value selected. As m 1, J1 produces a hard partition where µij {0,1}. As m approaches infinity, J∞ produces a
maximum fuzzy partition where µij = 1/c. To obtain a feasible solution by minimizing the additional constraint
is required for Ai, that is,
21
22 mJ
( )det ; 0; 1 , i i iA i kρ ρ= > ≤ ≤ where iρ is a cluster volume for each cluster. This constraint
guarantees that Ai is positive-definite, indicating that Ai can be varied to find the optimal shape of cluster with its
23
24
12
volume fixed. Using the Lagrange multiplier method, minimization of the function with respect to Ai, then we
obtain
mJ1
( ) 1/ -1det p
i i i iA F Fρ= ⎡ ⎤⎣ ⎦ , where the fuzzy covariance matrix of cluster Ci is defined as: 2
m( ) ( )( ) ( )1
1 1
n nm T
i ij j i j i ijj j
F x v x vμ μ−
= =
⎡ ⎤ ⎡ ⎤= − −⎢ ⎥ ⎢ ⎥⎣ ⎦ ⎣ ⎦∑ ∑ . 3
4
5
The set of fuzzy covariance matrices is represented as a k-tuple of F = (F1, F2,..., Fk). Generally, to show that Ai
satisfies the constraint of a symmetric and positive-definite matrix, assume that there are p linearly independent data
points pRξ ∈ in the dataset. Then, the matrices Tξξ are symmetric and positive semi-definite and also their
weighted sum (Fi), and hence Ai is symmetric and positive-definite, see Bezdek et al. (1999) and Dumitrescu et
al. (2000) for more details. There is no general agreement on what value to use for
6
7
iρ ; without any prior
knowledge, a rule of thumb is that many investigators use 1 for
8
iρ in practice, see Bezdek et al. (1999). Moreover,
based on the notion that
9
iρ represents the cluster volume for each cluster, in the present study we estimated iρ as 10
( )det i iF⎡ ⎤⎣ ⎦ρ =11 by exploiting the definition on the volume of fuzzy cluster Ci [37] making the Gustafson–Kessel
method to be fully operational. Therefore, under this formulation, the fixed norm 12
13
14
22iijA Aj iD x v= − calculated for the
distance between xj and vi is replaced in the Gustafson–Kessel scheme with the distance,
( ) ( ) ( ) ( )1 1
2 21/ 2 1/ 2 / 22iijA
<
( )0U
det det deti i
p p pi i j i i j iF F
D F F x v F x v− −
+= − = −⎡ ⎤ ⎡ ⎤ ⎡ ⎤⎣ ⎦ ⎣ ⎦ ⎣ ⎦15
16
17
18
19
20
By using the GK clustering algorithm with clusters on the data set , we compute the fuzzy partition
matrix, U . The process of this fuzzy clustering is an iterative process as it is shown below: For data set, , choose
the number of clusters 1 , the weighting exponent and the termination tolerant . Initialize
the partition matrix randomly.
K D
D
0K n< 1m > ε >
At the end of the iterative procedure, the membership values, kiμ and cluster centers are obtained. The
detected fuzzy clusters in the input-output product space give information on how the data points are structured in
the input space. This information, which is captured in the cluster centers and eigen values of the fuzzy covariant
kV21
22
23
13
matrices, is projected into the input axes to induce the antecedent fuzzy sets. If are the input space
coordinates of the cluster center, then the antecedent fuzzy sets of the TSK model are defined by the triangular
membership as
1iv … v, , ip1
2 thi
( ) 0 1 1i kp
ipA i ip
x Vbx M ax k … K−⎡ ⎤= , − ; = , , ,⎢ ⎥⎣ ⎦
3
4
5
μ with the center coordinates and the parameters
controlling respectively the mean and the spread of the membership function; Amabeoku et al. (2005), LeCun et
al. (1995), and McCain et al. (1998). Finally, the parameters are estimated using the least-square approximation.
ipv ipb
Step 1: Compute cluster means: 6
( )
( )
1
( ) 1m1
1
1ki
ki
n mli
l ik n
l
i
xV k
μ
μ
⎡ ⎤−⎢ ⎥⎣ ⎦
=
⎡ ⎤−⎢ ⎥⎣ ⎦
=
= ; =∑
∑… K, , ,
7
( )i kxμ where kiμ is the triangular/Gauessian bell membership function defined above, . 8
Step 2: Compute cluster covariance: 9
( ) ( )
( )
m1
1m1
1
1 .ki
ki
n Tl lk k
ik n
l
i
x VF k … K,
μ
μ
⎡ ⎤ ⎡ ⎤−⎢ ⎥ ⎢ ⎥⎣ ⎦ ⎣ ⎦
=
⎡ ⎤−⎢ ⎥⎣ ⎦
=
−= ; = ,∑
∑ 10
Step 3: Compute the distances: 11
( ) 12 1( 1)det 1T
T nki k k kkD F i …F
⎡ ⎤−+⎢ ⎥⎢ ⎥⎣ ⎦
= ; =H H n, , , 12
where ( )lk k kx V⎡ ⎤
⎢ ⎥⎣ ⎦
= −H for . 1k … K= , ,13
Step 4: Update the partition matrix ikμ as follows: 14
( ))(
( ) [ ] ( )
2 1
( )1
0 a
n
ki=
; >
∑n
l lki ki
i 1
1 if 0
0 if nd 0 1 with 1
kim
kslski
ki
DD D
D
μ
μ μ
/ −
=
⎧ ; > ;⎪⎪
= ⎨⎪⎪ ; ∈ , = ,⎩
∑
15
for ; until 1k …= , , K i 1 … n= , , ( ) ( )1l lU U ε−− < , where is the clustering distance defined in step3 and ikD16
ikμ is the triangular/Gauessian bell membership function defined above. 17
18
14
Let X denote the matrix whose row is the input vector thi ix and let Y denote the column vector with iy as
its component. Let denote the n×n real diagonal matrix that represents normalized firing strength of
the rule for the observation or sample, W x
1
2 thi
k
( )k iW x
thith( ) ( (
1k i j x
=)
K
k ij
xβ )1
iβ−
⎡ ⎤= ⎢ ⎥
⎣ ⎦
thi
∑ where . Suppose
that denote the vector of consequent parameters of the rule. In order to estimate the off-set
term, , a unitary column I is appended to the matrix,
1k … K= , , ; 1i = ,...,n3
4 0i ia … a⎡⎢⎣ , , in
⎤⎥⎦Θ ≡
0ia X , to produce the extended matrix [ ]eX
1
i eX−⎡ ⎤
⎢ ⎥⎣ ⎦. . .
X≡
e W. .
I,
T TiX Y
.
Therefore, the unknown parameters are calculated via least-square criterion, .
Therefore, the end, the output of the Type1 neuro-fuzzy logic inference systems model is approximated by
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
iΘ i eX WΘ =
iΘ .Y X≈ .
5. The Empirical Study, Discussion and Comparative Studies
We have done the quality control check on all these data sets and clean it from redundant data and un-useful
observations. To evaluate performance of each modeling scheme, entire database is divided using the stratified
criterion. Therefore, we use 70% of the data for building Type1 Fuzzy learning model (internal validation) and 30%
of the data for testing/validation (external validation or cross-validation criterion). Both internal and external
validation processes are repeated 1000 times. Therefore, data were divided into two/three groups for training and
cross validation check. Therefore, of the 782 data points, 382 were used to train the neural network models, the
remaining 200 to cross-validate the relationships established during training process and 200 to test model to
evaluate its accuracy and trend stability. For testing data, statistical summary to investigate different quality
measures corresponding to Type1 Fuzzy scheme, feedforward neural networks system, and the most popular
empirical correlations in literatures to predict both bubble point pressure and Oil Formation Volume Factor.
Generally, after training the Type1fuzzy inference systems, the calibration model becomes ready for testing and
evaluation using the cross-validation criterion. Comparative studies were carried out to compare the performance
and accuracy of the new Type1 Fuzzy model versus both the standard neural networks and the three common
published empirical correlations, namely, Standing, Al-Mahroun, and Glaso empirical correlations.
15
5.1. Parameters Initialization 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
In this study, we follow the same procedures in Al-Marhoun et al. (2002), Osman et al. (2001 and 2002) a single
hidden layer feedforward neural network based on back propagation (BP) learning algorithm with both linear and
sigmoid activation functions. The initial weights were generated randomly and the learning technique is achieved
based on 1000 epoch or 0.001 goal error and 0.01 learning rate. Each layer contains neurons that are connected to
all neurons in the neighboring layers. The connections have numerical values (weights) associated with them, which
will be adjusted during the training phase. Training is completed when the network is able to predict the given
output. For the two models, the first layer consists of four neurons representing the input values of reservoir
temperature, solution gas oil ratio, gas specific gravity and API oil gravity. The second (hidden) layer consists of
seven neurons for the Pb Model, and eight neurons for the Bo model. The third layer contains one neuron
representing the output values of either Pb or Bob. Simplified schematic of the used neural networks for Pb and Bo
models are illustrated in Al-Marhoun et al. (2002) and Osman et al. (2002). It gives the ability to monitor the
generalization performance of the network and prevent the network to over fit the training data based on repeating
the computations for 1000 times and take the average over all runs.
The implementation process started by feeding the nuero-fuzzy inference system shown in Figure 2 by the
available input data sets, one observation at a time, then the rules and membership functions have to be developed
from the available input data. We have tried both triangular and Gaussian Bell membership functions with both grid
partition and subtractive clustering with radius 0.1 based on two different learning criteria, such as, backpropagation
and least squares. A combination of both least squares method and back propagation gradient descent method were
used for training fuzzy inference system membership function parameters and is applied to emulate a given training
data set. The resulted weights for the Bo and Pb models are given below in different Tables and graphs. Moreover,
the relative importance of each input property are identified during the training process and given for Bo and Pb
models as it is shown below.
5.2. Discussion and Comparative Studies
One can investigate other common empirical correlations besides these chosen empirical correlations, see El-
Sebakhy et al. (2007) and Osman et al. (2001) for more details about these empirical correlations mathematical
16
1
2
3
4
5
6
formulas. The results of comparisons in the testing (external validation check were summarized in Tables 1 through
6, respectively. We observe from these results that the type1 Fuzzy intelligence modeling scheme outperforms both
neural and the most common published empirical correlations. The proposed model showed a high accuracy in
predicting the Bo values with a stable performance, and achieved the lowest absolute percent relative error, lowest
minimum error, lowest maximum error, lowest root mean square error, and the highest correlation coefficient
among other correlations for the used three distinct data sets.
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 21
1.5
2
Pred
icte
d B
ob
Training Performance: RMSE = 0.0056024
Actual Bob
R2 = 1
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 21
1.5
2
Pre
dict
ed B
ob
Testing Performance: RMSE = 0.014549
Actual Bob
R2 = 0.997
Figure 3: Cross plot of type1 fuzzy inference systems modeling scheme for Bo based on dataset by Al-
Marhounm (1988).
17
0 500 1000 1500 2000 2500 3000 3500 40000
1000
2000
3000
4000
Pred
icte
d Pb
Training Performance: RMSE = 18.9876
Actual Pb
R2 = 1
0 500 1000 1500 2000 2500 3000 3500 4000-2000
0
2000
4000
Pre
dict
ed P
b
Testing Performance: RMSE = 411.7358
Actual Pb
R2 = 0.943
Figure 4: Cross plot of type1 fuzzy inference systems modeling scheme for Pb based on dataset by Al-
Marhounm (1988).
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.91
1.5
2
Pred
icte
d Bo
b
Training Performance: RMSE = 0.0064111
Actual Bob
R2 = 0.999
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.91
1.5
2
Pre
dict
ed B
ob
Testing Performance: RMSE = 0.011891
Actual Bob
R2 = 0.997
Figure 5: Cross plot of type1 fuzzy inference systems modeling scheme for Bo based on data of Al-
Marhoun et al. (2002) and Osman et al. (2002 and 2005).
18
0 500 1000 1500 2000 2500 3000 35000
1000
2000
3000
4000
Pre
dict
ed P
b
Training Performance: RMSE = 45.1691
Actual Pb
R2 = 0.999
0 500 1000 1500 2000 2500 3000 35000
1000
2000
3000
4000
Pred
icte
d Pb
Testing Performance: RMSE = 85.5766
Actual Pb
R2 = 0.995
Figure 6: Cross Plot of type1 fuzzy inference systems modeling scheme for Pb based on data of Al-
Marhoun et al. (2002) and Osman et al. (2002 and 2005).
1 1.2 1.4 1.6 1.8 2 2.2 2.4 2.6 2.8 31
1.5
2
2.5
3
Pre
dict
ed B
ob
Training Performance: RMSE = 0.047403
Actual Bob
R2 = 0.9978
1 1.2 1.4 1.6 1.8 2 2.2 2.4 2.6 2.8 31
1.5
2
2.5
3
Pre
dict
ed B
ob
Testing Performance: RMSE = 0.077422
Actual Bob
R2 = 0.9971
Figure 7: Cross plot of type1 fuzzy inference systems modeling scheme for Bo based on data of Goda et
al. (2003) and Osman et al. (2001).
19
0 1000 2000 3000 4000 5000 6000 7000 80000
2000
4000
6000
8000
Pre
dict
ed P
b
Training Performance: RMSE = 301.8642
Actual Pb
R2 = 0.9792
0 1000 2000 3000 4000 5000 6000 70000
5000
10000
Pred
icte
d P
b
Testing Performance: RMSE = 436.1069
Actual Pb
R2 = 0.9781
Figure 8: Cross plot of type1 fuzzy inference systems modeling scheme for Pb based on data of Goda et
al. (2003) and Osman et al. (2001).
Figures 3-8 illustrate six scatter plots of the predicted results versus the experimental data for both Pb and Bo
values using the provided three distinct data sets. These cross plots indicates the degree of agreement between the
experimental and the predicted values based on the high quality performance of the Type1 Fuzzy modeling scheme.
The reader can compare theses patterns with the corresponding ones of the published neural networks modeling and
common empirical correlations in Al-Marhoun et al. (2002) and Osman et al. (2001 and 2002).
1
2
3
4
5
6
7
8
9
10
As it is shown in Figures 9 through 10, four scatter plots are drawn for the used three distinct databases. These
graphs show the measurements of both absolute percent relative error (EA) and correlation coefficient (R2) for type1
fuzzy logic inference systems scheme, feedforward neural networks (the used computational intelligence schemes)
and three empirical correlations. Each modeling scheme is represented by a symbol; the good forecasting scheme
should have the highest correlation and lowest absolute percent relative error.
20
Table 1 Testing results (Al-Marhoun (1988), El-Sebakhy et al. (2007), and Osman et al. (2001) data): Statistical quality measures when estimate Bo.
Correlation Er EA Emin Emax SD R2 Standing (1947) -0.170 2.724 0.008 20.180 2.5823 0.974
Glaso (1980) 1.8186 3.374 0.003 17.776 2.673 0.972 Al-Marhoun (1992) -0.115 2.205 0.003 13.179 1.2842 0.981
ANN System 0.3024 1.789 0.008 11.775 0.89835 0.988 Type1 Fuzzy Sys. 0.1501 1.322 0.002 7.4513 0.7876 0.997
Table 2 Testing results (Al-Marhoun (1988), El-Sebakhy et al. (2007), and Osman et al. (2001) data): Statistical quality measures when estimate Pb.
Correlation Er EA Emin Emax SD R2 Standing (1947) 67.60 67.73 0.1620 102.08 25.159 0.867
Glaso (1980) -1.616 18.52 0.1056 138.96 25.171 0.945 Al-Marhoun (1992) 8.008 20.01 0.0254 109.12 12.839 0.906
ANN System 8.129 21.02 0.0182 145.29 14.871 0.943 Type1 Fuzzy Sys. 7.432 14.22 0.0093 101.23 11.337 0.952
1 2
3
4
5
6
7
8
By looking at these scatter plots; we observed for example, in estimating Bo based on the data set used in
Abdulraheem et al. (2007), the symbol corresponding to Type1 fuzzy has the smallest absolute percent relative
error, EA = 1.3218%, the largest correlation coefficient, R2 = 0.9970, and the smallest standard deviation, SD =
0.7876, while neural network is below type1 fuzzy logic inference systems scheme with EA = 1.7886%, R2 =
0.9878, and SD = 0.89835. The other empirical correlations indicates higher error values with lower correlation
coefficients, for instance, Al-Marhoun (1992) has EA = 2.2053%, R2 = 0.9806, and SD = 1.2842; Standing has EA =
2.7238%, R2 = 0.9742, and SD = 2.673; and Glaso Correlation with EA = 3.3743%, R2 = 0.9715, and SD = 2.583.
Table 3 Testing results (Al-Marhoun et al. (2002) and Osman et al. (2002 and 2005) dataset): Statistical quality measures when estimate Bo.
Correlation Er EA Emin Emax SD R2 Standing (1947) -1.054 1.6833 0.066 7.7997 2.1021 0.9947
Glaso (1980) 0.4538 1.7865 0.0062 7.3839 2.1662 0.9920 Al-Marhoun (1992) -0.392 0.8451 0.0003 3.5546 1.1029 0.9972
ANN System 0.217 0.5116 0.0061 2.6001 0.6626 0.9977 Type1 Fuzzy Sys. 0.016 0.3247 0.0011 2.3365 0.3856 0.997
21
Table 4 Testing results (Al-Marhoun et al. (2002) and Osman et al. (2002 and 2005) dataset): Statistical quality measures when estimate Pb.
Correlation Er EA Emin Emax SD R2 Standing (1947) -8.441 10.4562 0.2733 47.0213 11.841 0.8974
Glaso (1980) -18.48 20.7569 2.0345 63.7634 16.160 0.9837 Al-Marhoun (1992) 0.941 8.1028 0.0935 38.085 11.41 0.9905
ANN System -0.222 5.8915 0.2037 38.1225 8.678 0.9930 Type1 Fuzzy Sys. -0.456 3.186 0.0012 24.347 4.82 0.995
Table 5 Testing results (Goda et al. (2003) and Osman et al. (2001) dataset): Statistical quality measures when estimate Bo.
Correlation Er EA Emin Emax SD R2 Standing (1947) -2.628 2.7202 0.0167 13.2922 5.7655 0.9953 Glaso (1980) -0.5529 0.9821 0.0086 6.5123 5.2274 0.9959
Al-Marhoun (1992) -0.4514 2.0084 0.0322 11.0755 5.0028 0.9935 ANN System 0.3251 1.4592 0.0083 5.3495 4.7402 0.9968
Type1 Fuzzy Sys. 0.0621 0.2456 0.0043 2.4013 4.5873 0.9971
Table 6 Testing results (Goda et al. (2003) and Osman et al. (2001) dataset): Statistical quality measures when estimate Pb.
Correlation Er EA Emin Emax SD R2 Standing (1947) 12.811 24.684 0.62334 59.038 13.696 0.8657
Glaso (1980) -18.887 26.551 0.28067 98.78 25.27 0.9675 Al-Marhoun (1992) 5.1023 8.9416 0.13115 87.989 25.015 0.9701
ANN System 4.9205 6.7495 0.16115 65.3839 22.73 0.9765 Type1 Fuzzy System 3.2218 4.0651 0.1178 52. 1921 15.821 0.9781
Similarly, for the bubble point pressure, Pb based on the used data sets used in Al-Marhoun et al. (2002), El-
Sebakhy et al. (2007), Goda et al. (2003), and Osman et al. (2001 and 2002), we observed that the symbol
corresponding to Type1 Fuzzy scheme has the smallest absolute percent relative error, EA = 14.224%, the largest
correlation coefficient, R2 = 0.9520, and the smallest standard deviation, SD = 11.337, while neural network is
below Type1 Fuzzy with EA = 21.017%, R2 = 0.943, and SD = 14.871. The other correlations indicates higher
error values with lower correlation coefficients, for instance, Al-Marhoun (1992) has EA = 20.011%, R2 = 0.906,
and SD = 12.839; Standing has EA = 67.73%, R2 = 0.867, and SD = 25.171; and Glaso empirical correlation with
EA = 18.523%, R2 = 0.945, and SD = 25.159. Overall computations, we observed that the new intelligence system
framework has a reasonable value of the relative errors, Er, especially, by looking at Tables 5 and 6, the Type1
neuro-fuzzy system has the smallest Er values compared to the other published techniques. This indicator can be
used as a very good indicator to say that the performance of the new framework is the one with the least bias.
1
2
3
4
5
6
7
8
9
10
11
22
Figure 9: Average Absolute relative errors and Correlation coefficients when we run the provided data sets under type1 fuzzy inference systems, neural network, and three empirical correlations for predicting Bob.
1
2
3
4
5
6
7
The same implementations processes may be repeated for the other statistical quality measures, but for the sake
of simplicity, we did not include it in this context. Finally, we conclude that developed type1 fuzzy inference
systems modeling scheme has better and reliable performance compared to the most published modeling schemes
and empirical correlations. The bottom line is that, the developed type1 fuzzy inference systems modeling scheme
outperforms both the standard feedforward neural networks and the most common published empirical correlations
in predicting both Pb and Bo using the four input variables: solution gas-oil ratio, reservoir temperature, oil gravity,
and gas relative density.
23
Figure 10: Average Absolute relative errors and Correlation coefficients when we run the provided data sets
under type1 fuzzy inference systems, neural network, and three empirical correlations for predicting Pb.
6. Conclusion and Recommendation 1
2
3
4
5
6
7
8
9
In this study, three distinct published data sets were used in investigating the capability of the type1 fuzzy
inference systems scheme as a new framework for predicting the PVT properties of oil crude systems. Based on the
obtained results and comparative studies, the conclusions and recommendations can be drawn as follows:
A new computational intelligence modeling scheme based on the type1 fuzzy inference systems scheme to
predict both bubble point pressure and oil formation volume factor using the four input variables: solution gas-oil
ratio, reservoir temperature, oil gravity, and gas relative density. As it is shown in the petroleum engineering
communities, these two predicted properties were considered the most important PVT properties of oil crude
systems.
24
- The developed type1 fuzzy inference systems scheme outperforms both the standard feedforward neural 1
networks and the most common published empirical correlations. Thus, the developed type1 fuzzy logic 2
inference systems scheme has better, efficient, and reliable performance compared to the most published 3
correlations. 4
- The developed type1 fuzzy inference systems scheme showed a high accuracy in predicting the Bo values with a 5
stable performance, and achieved the lowest absolute percent relative error, lowest minimum error, lowest 6
maximum error, lowest RMSE and the highest R2 among other correlations for the used three distinct data sets. 7
- The developed type1 fuzzy inference intelligence framework performance showed the smallest standard 8
deviation values all over the entire datasets utilized in the manipulation process, which indicates that the new 9
framework is more stable and robust than the most published techniques. Since the initial topology of the new
framework take into consideration the entire input domain in both GK clustering and membership functions.
Therefore, there is no any risk of over-fitting and complexity problems and then there is no lack of robustness
in prediction.
10
11
12
13
15
16
18
19
20
21
22
23
24
25
26
- The new intelligence system framework has a reasonable value of the relative errors, Er, especially, by looking 14
at Tables 5 and 6, the Type1 neuro-fuzzy system has the smallest Er values compared to the other published
techniques. This indicator indicates that the performance of the new framework is the one with the least bias.
- The type1 fuzzy inference modeling scheme is flexible, reliable, and shows a bright future in implementing it 17
for the oil and gas industry, especially permeability and porosity prediction, history matching, rock mechanics
properties, flow regimes and liquid-holdup multiphase follow, 3D seismic data, and faceis classification.
NOMENCLATURE
Bob = OFVF at the bubble- point pressure, RB/STB
Rs = oil solution gas oil ratio, SCF/STB
T = reservoir temperature, degrees Fahrenheit
γo = oil relative density (water=1.0)
γg = gas relative density (air=1.0)
Er = average percent relative error
25
Ei = percent relative error 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
Ea = average absolute percent relative error
Emax = Maximum absolute percent relative error
Emin = Minimum absolute percent relative error
RMS = Root Mean Square error
Acknowledgement
The author wishes to thank King Fahd University of Petroleum and Minerals, Mansoura University, and Cornell
University for the facilities utilized to perform the present work and for their support.
References
Abdulraheem A., El-Sebakhy E. A., Ahmad M., Vantala A., Korvin G., and Raharja I. P., 2007. “The Capability of
Neuro-Fuzzy` Systems in Predicting Permeability and Porosity from Well-Log”. SPE105350, the SPE 15th
Middle East Oil Show held in Bahrain, 11–14 March.
Ali J. K., 1994. “Neural Networks: A New Tool for the Petroleum Industry”. SPE 27561 presented at the 1994
European Petroleum Computer Conference, Aberdeen, U.K., March 15-17.
Al-Marhoun M., “PVT Correlations for Middle East Crude Oils,” Journal of Petroleum Technology, pp 650-666,
1988.
Al-Marhoun M. A., 1992. “New Correlation for formation Volume Factor of oil and gas Mixtures,” JCPT , March
22.
Al-Marhoun M.A. and Osman E. A., 2002. “Using Artificial Neural Networks to Develop New PVT Correlations
for Saudi Crude Oils”. SPE 78592, presented at the 10th Abu Dhabi International Petroleum Exhibition and
Conference (ADIPEC), Abu Dhabi, UAE, October 8-11.
Al-Shammasi A. A., 1997. “Bubble Point Pressure and Oil Formation Volume Factor Correlations”. SPE 53185
presented at the 1997 SPE Middle East Oil Show and Conference, Bahrain, March 15–18.
26
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Al-Shammasi A. A., 2001. “A Review of Bubble point Pressure and Oil Formation Volume Factor Correlations”.
SPE Reservoir Evaluation & Engineering, 146-160. This paper (SPE 71302) was revised for publication of
paper SPE 53185.
Amabeoku M., Lin O., Khalifa C., Cole A. A., Dahan J., Jarlow M. J., and Ajufo A., (2005). “Use of Fuzzy-Logic
Permeability Models to Facilitate 3D Geocellular Modeling and Reservoir Simulation: Impact on Business”.
Presented at the International Petroleum Technology Conf., 21-23 November, Doha, Qatar.
Bezdek, J., Keller, J., Krisnapuram, R., Pal, N.R., (2005). Fuzzy Models and Algorithms for Pattern Recognition
and Image Processing. Boston Kluwer Academy Publishers.
Cuddy S. J., (1998). “Litho-Facies and Permeability Prediction from Electrical Logs using Fuzzy Logic”.
SPE49470 Presented at 8th Abu Dhabi International Petroleum Exhibition and Conference, (1998).
Duda R. O., Hart P. E., and Stock D. G., (2001). , “Pattern Classification, 2nd E, John Wiley and Sons, New York.
Dumitrescu, D., Lazzerini, B., Jain, L., (2000). Fuzzy Sets and Their Applications to Clustering and Training,
(2000) , Florida CRC Press.
El-Sebakhy E. A., El-Shaltami T. R., Al-Bukhitan S. Y., Shabaan Y. M., Raharja I. P., and Khaeruzzaman Y.,
(2007). “Support Vector Machines Framework for Predicting the PVT Properties of Crude Oil Systems”, the
SPE 15th Middle East Oil Show held in Bahrain, 11–14 March 2007. SPE105698.
El-Sebakhy E. A., Shabaan Y. M., Raharja I. P., and Khaeruzzaman Y., (2007).“Neuro-Fuzzy Inference Systems in
Identifying Flow-Regimes and Liquid-Holdup in Horizontal Multiphase Flow”. ICMSAO’07. Second Int.
Conf. on Modeling, Simulation and Applied Optimization. March 24–27. The PI, Abu Dhabi, UAE.
Elsharkawy A. M., (1998). “Modeling the Properties of Crude Oil and Gas Systems Using RBF Network”. SPE
49961 presented at the SPE Asia Pacific Oil and Gas Conference, Perth, Australia, October 12-14.
Glazo O., 1980. “Generalized Pressure-Volume Temperature Correlations,” JPT May, 785.
Goda H. M., Shokir E. M., Fattah K. A., and Sayyouh M. H., 2003. ” Prediction of the PVT Data Using Neural
Network Computing Theory”. SPE 85650, the 27th Annual SPE International Conference and Exhibition,
Abuja, Nigeria, August 4-6.
Gharbi R. B. and Elsharkawy A. M., 1997. “Neural-Network Model for Estimating the PVT Properties of Middle
East Crude Oils”. SPE 37695 presented at the 1997 SPE Middle East Oil Conference, Bahrain, 15–18.
27
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
Gharbi R. B. and Elsharkawy A.M., (1997). “Universal Neural-Network Model for Estimating the PVT Properties
of Crude Oils,” SPE 38099 presented at the 1997 SPE Asia Pacific Oil & Gas Conference, Kuala Lumpur,
Malaysia, April 14-16.
Hambalek N. and Reinaldo G., (2003). “Fuzzy logic applied to lithofacies and permeability forecasting”. SPE
(81078) presented at SPE Latin America and Caribbean Petroleum Engineering Conf., Spain, Trinidad, West
Indies.
Jong-Se L. and Kim J., (2004) ”Reservoir Porosity and Permeability Estimation from Well Logs Using Fuzzy
Logic and Neural Networks”. SPE 88476. Presented at SPE Asia Pacific Oil and Gas Conference and
Exhibition. Perth, Australia.
Kumoluyi A. O. and Daltaban T. S., 1994. “High Order Neural Network in Petroleum Engineering”. SPE 27905
presented at the 1994 SPE Western Regional Meeting, Longbeach, California, USA, March 23-25.
LeCun Y., Botou L., Jackel L., Drucker H., Cortes C., Denker J., Guyon I., Muller U., Sackinger E., Simard P., and
Vapnik V., 1995. “Learning algorithms for classification: A comparison on handwritten digit recognition,”
Neural Netw., pp. 261–276.
Liu C., Nakashima K., Sako H., and Fujisawa H., 2003. “Handwritten digit recognition: Bench-marking of state-of-
the-art techniques,” Pattern Recognition, vol. 36, pp. 2271–2285.
McCain W. D. Jr., Soto R. B., Valko, P. P., and Blasingame T. A., 1998. “Correlation Of Bubble point Pressures
For Reservoir Oils - A Comparative Study”. SPE 51086 presented at the 1998 SPE Eastern Regional
Conference and Exhibition held in Pittsburgh, PA, 9–11.
Mohaghegh S. and Ameri S., 1994. "An Artificial Neural Network As A Valuable Tool for Petroleum Engineers".
SPE 29220, unsolicited paper for Society of Petroleum Engineers.
Mohaghegh S., 1995. “Neural Networks: What it Can do for Petroleum Engineers," JPT, 42.
Mohaghegh S., 2000. "Virtual Intelligence Applications in Petroleum Engineering: Part 1 - Artificial Neural
Networks”. JPT, September.
Osman E. A., Abdel-Wahhab O. A., and Al-Marhoun M. A., 2001. “Prediction of Oil Properties Using Neural
Networks,” Paper SPE 68233.
28
29
1
2
3
4
5
6
7
8
9
10
11
Osman E. A. and Abdel-Aal R., 2002. "Abductive Netwoks: A New Modeling Tool for the Oil and Gas Industry".
SPE 77882, Asia Pacific Oil and Gas Conference and Exhibition Melbourne, Australia, 8–10 October.
Osman E. and Al-Marhoun M., 2005. “Artificial Neural Networks Models for Predicting PVT Properties of Oil
Field Brines". SPE 93765, 14th SPE Middle East Oil & Gas Show and Conference in Bahrain, March.
Standing M. B., 1974. “A Pressure Volume Temperature (PVT) Correlation for Mixtures of California Oils and
Gases,” Drill&Prod. Pract., API, pp 275-87.
Taghavi, A. A., 2005. “Improved Permeability Estimation through use of Fuzzy Logic in a Carbonate Reservoir
from Southwest Iran”. Presented at SPE Middle East Oil and Gas Show and Conf., Mar 12 - 15, Bahrain.
Varotsis N., Gaganis V., Nighswander J., and Guieze P., 1999. “A Novel Non-Iterative Method for the Prediction of
the PVT Behavior of Reservoir Fluids”. SPE 56745 presented at the SPE Annual Technical Conference and
Exhibition, Houston, Texas, October 3–6.