
Alpana Bhatt

Reservoir properties from well logs using neural networks

A dissertation submitted in partial fulfilment of the requirements for the degree of Doktor Ingeniør at the Department of Petroleum Engineering and Applied Geophysics, Norwegian University of Science and Technology

November 2002


Summary

In this dissertation we have developed multiple-network systems (multi-nets, or committee machines, CM) for predicting reservoir properties, e.g. porosity, permeability and partial fluid saturation, and for identifying lithofacies from wireline and measurement-while-drilling (MWD) logs. The method is considerably more robust and accurate than a single network or the multiple linear regression method. The basic unit of a committee machine is a multilayer perceptron network (MLP), whose optimum architecture and training-set size have been determined using synthetic data for each application. The committee machine generated for each property has been successfully applied to real data for predicting reservoir properties and analysing lithofacies in the Oseberg field, one of the major fields of the North Sea. The advantage of this technique is that it can be used in real time and can thus support crucial decisions on the reservoir while drilling. The trained networks have been successfully used in bulk conversion of wireline and MWD logs to reservoir properties. All programming has been done in the MATLAB language, using functions from the Neural Network Toolbox.

For porosity prediction we made a study first with a single neural network and then with the CM approach. We have demonstrated the benefits of committee neural networks, where predictions are redundantly combined. The optimal design of the neural network modules and the size of the training set have been determined by numerical experiments with synthetic data. With three inputs, i.e. sonic, density and resistivity, the optimal number of hidden neurons for the porosity neural network has been determined to be in the range 6-10, with a sufficient number of training patterns of about 150. The network is sensitive to the fluid properties. The unconstrained optimal linear combination of Hashem (1997), with zero intercept term and based on least squares, is the most suitable ensemble approach for the porosity CM, and the accuracy is mainly limited by the accuracy of the training patterns and of the log data themselves. In application to real data the maximum standard error of the difference between prediction and helium core porosity is 0.04. The benefit of neural networks over the multiple linear regression (MLR) technique has been briefly touched upon by showing that MLR fails to reproduce the minor non-linearity embedded in the common log-to-porosity transforms, whereas the neural network reproduces the same data with high accuracy.
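The unconstrained optimal linear combination can be sketched as a least-squares problem: the member networks' predictions form the columns of a matrix, and the combination weights minimise the squared misfit to the target, with no intercept term and no constraint that the weights sum to one. A minimal sketch with synthetic numbers (the five "experts" and their noise level are illustrative, not values from the thesis):

```python
import numpy as np

# Hypothetical committee: columns of Y are five experts' porosity
# predictions for 200 patterns (synthetic, illustrative numbers).
rng = np.random.default_rng(0)
true_phi = rng.uniform(0.05, 0.35, size=200)
Y = np.column_stack([true_phi + 0.02 * rng.standard_normal(200)
                     for _ in range(5)])

# Unconstrained OLC with zero intercept: least-squares weights alpha
# minimising ||Y @ alpha - true_phi||^2 (weights need not sum to one).
alpha, *_ = np.linalg.lstsq(Y, true_phi, rcond=None)
phi_cm = Y @ alpha          # committee output
```

Because any single expert (weight 1 on one column, 0 elsewhere) is a feasible combination, the least-squares fit can never be worse than the best individual member on the fitting data.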

In permeability prediction by CM we have demonstrated the benefits of modularity by decomposing the permeability range into a number of sub-ranges to increase resolution. We have used synthetic data to evaluate the optimal architecture of the component neural networks. With the four inputs, i.e. sonic, density, gamma ray and neutron porosity, we find that the optimal number of hidden units of the permeability neural network is confined to the range 8-12, where the variance and bias are at their minima. In general, the errors steadily decrease with the number of training patterns. A practical lower limit has been set to 300, or twice the size of the training set required for the porosity network, due to the increased complexity of the underlying relationships with the log readings.

Since we use a logarithmic rather than a linear permeability scale, the success of optimal linear combination (OLC) in the porosity CM is not repeated when applied to the permeability CM; in fact, noise amplification takes place. Simple ensemble averaging is shown to be the preferred method of combining the outputs. A different training strategy must also be applied, i.e. the validation approach, which stops training when the level of minimum variance has been reached. Provided that these precautions are taken, the permeability CM is more capable of handling the non-linearity and noise than MLR or a single neural network. The benefit of range splitting, using the modularity embedded in the CM approach, has been demonstrated by resolving details in the combination of logs that otherwise would be invisible. In application to real data the minimum standard deviation of the difference between prediction and Klinkenberg-corrected air permeability is around 0.3 in logarithmic units (of mD), mainly due to limitations in the measurement techniques.
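Simple ensemble averaging, by contrast, just takes the mean of the member outputs in log-permeability units. A toy sketch (member count and noise level are illustrative, not values from the thesis):

```python
import numpy as np

rng = np.random.default_rng(1)
log_k_true = rng.uniform(-1.0, 3.0, size=100)   # log10 permeability in mD
# Nine hypothetical committee members, each with independent noise.
members = [log_k_true + 0.3 * rng.standard_normal(100) for _ in range(9)]

# Simple ensemble average in log units -- the combination found
# preferable to OLC for the permeability committee machine.
log_k_cm = np.mean(members, axis=0)

err_single = float(np.std(members[0] - log_k_true))
err_cm = float(np.std(log_k_cm - log_k_true))
```

With independent noise, the averaged error shrinks roughly as one over the square root of the number of members.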

We have developed and tested a modular artificial neural network system for predicting the fluids water, oil and gas, and their partial saturations, directly from well logs, without explicit knowledge of the fluid and rock properties normally required by conventional methods. As inputs to the networks we have used the density, sonic, resistivity and neutron porosity logs. From synthetic logs based on a realistic petrophysical model we have determined, by numerical experiments, the optimal architecture and training procedure for the partial fluid saturation networks.

The output of three saturations from a single MLP (4-10-3) shows the same accuracy as that of three individual MLPs with one output each (4-4-1). The latter has the advantage of simplicity in terms of number of neurons, which implies fewer training patterns and faster training. Moreover, simplicity in the MLP improves modularity when it is used as a building block in the multi-net system. For the optimal 4-4-1 MLP the number of training patterns should be in excess of 100 to ensure negligible errors in the case of data with moderate noise. A committee neural network for each fluid type is preferred over a single MLP realisation; each committee consists of a number of individually trained 4-4-1 MLPs connected in parallel and redundantly combined using optimal linear combination. The OLC approach implies an overall error reduction by an order of magnitude. Based on these findings we have made a modular neural network (MNN) system consisting of three CMs, one for each fluid type, where each CM contains nine MLPs connected in parallel, with outputs combined using the OLC approach. Using training patterns from CPI logs we have demonstrated its application to real data from North Sea reservoirs containing the full range of fluid types and partial saturations. The saturation predictions from the fluid CMs are further combined, in an MNN, with laboratory-measured relative permeability curves for both the oil-water and gas-oil fluid systems to generate relative permeability logs. The accuracy in predicting saturation essentially depends on the accuracy of the training patterns, which are taken from the computer processed interpretation (CPI) logs, and on the accuracy of the individual log measurements. The idea of using neural networks for fluid saturation is thus not to eliminate the careful petrophysical evaluation behind the CPI log, but to transfer the effort and expertise already embedded in the petrophysical database into the neural network for future application. Comparison of the Sw values of the neural network with those of CPI logs, in wells unknown to the network, indicates a standard error of less than 0.03.

The identification of lithofacies from well logs is a pattern recognition problem. The CM architecture is based on combining back-propagation artificial neural networks (BPANN) with a recurrent BPANN (R-BPANN) adopted from time series analysis. The recurrent BPANN exploits the fact that a facies extends over several sequential points along the well bore, thus effectively removing ambiguous or spurious classifications. The multiclass classification problem has been reduced to a set of two-class classification problems by using the modular neural network system. Ensembles of neural networks are trained on disjoint sets of patterns, using a soft overtraining approach to ensure diversity and improve the generalisation ability of the stack.
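The decomposition and the recurrent step can be caricatured as follows: K two-class scores are turned into a multiclass decision by taking the highest score at each depth, and isolated spurious labels are then removed using neighbouring depth samples. The majority filter below is only a crude stand-in for the R-BPANN, which learns this smoothing from the data; all numbers are illustrative:

```python
import numpy as np

# Hypothetical two-class committee scores for K facies at N depth samples.
rng = np.random.default_rng(2)
N, K = 60, 4
scores = rng.random((N, K))
scores[10:30, 2] += 1.0         # facies 2 dominates this interval
facies = scores.argmax(axis=1)  # multiclass decision from K two-class nets

# Crude stand-in for the recurrent step: a majority vote over neighbouring
# depth samples removes isolated, spurious classifications.
def smooth(labels, half_width=2):
    out = labels.copy()
    for i in range(len(labels)):
        lo, hi = max(0, i - half_width), min(len(labels), i + half_width + 1)
        vals, counts = np.unique(labels[lo:hi], return_counts=True)
        out[i] = vals[counts.argmax()]
    return out

facies_smoothed = smooth(facies)
```

A single mislabelled sample surrounded by a consistent facies is voted away, which is the behaviour attributed to the R-BPANN above.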

We have used synthetic logs from a realistic model with a very small layer contrast and a moderate noise level, and found excellent classification performance, with hit rates only slightly below 100%. By introducing fine layering in the model we have shown that the performance is only slightly reduced, demonstrating the excellent layer-enhancement performance of the R-BPANN, also in the case of thin layers. Classification from real data is more challenging, since the facies in the present study were initially defined by visual inspection of cores and are thus not fully compatible with the readings of the logging tools, which detect different physical properties and have coarser spatial sampling. Application to the four facies of the Ness Formation reveals an average hit rate well above 90% in wells unknown to the network. Compared with similar published classification studies, our results reveal slightly to significantly better performance.

The CM approach for porosity, permeability and water saturation has also been developed and tested on MWD data. We trained CM architectures for porosity, permeability and water saturation using MWD data. Since cores are normally not collected in horizontal wells, the training patterns for the MWD networks are predictions made by the wireline networks. The application of this technique is to predict reservoir properties while drilling.


Acknowledgements

I sincerely wish to thank Dr. Hans B. Helle for his very valuable involvement and advice throughout this work. I am also very thankful to Prof. Bjorn Ursin for his confidence in me and for the support, guidance and crucial insights he provided during this research work. I thank Norsk Hydro and the Oseberg licence group for providing technical data, financial support and office facilities. I benefited immensely from the technical discussions that I had with Brian Farrely and Dr. Jos Okkerman on various occasions; I thank them for helping me improve my approach to many issues in this work. I am grateful to Dr. Mary M. Poulton, Dr. Alv Aanestad and Professor Tor Arne Johansen for reviewing this work, and I am thankful for their critical comments and very useful suggestions for improving the quality of this thesis. I thank all employees and staff of IPT, NTNU for making my stay in the department very comfortable and satisfying. I especially want to thank my colleagues at IPT, Nam Hoai Pham and Klaus Lykke Dag, for their academic feedback, help and the pleasure of sharing some challenges together. My gratitude goes to Mr. J. S. Thakur and his noble friend, without whose benevolent intervention this work could not have been undertaken. It is my pleasure to thank my parents for their unstinted support and understanding. I thank my son Padmanabh, whose charm provided the much-needed comfort and invigorated my mind throughout this scientific pursuit. Last but not least, I am grateful to my husband for all his patience in encouraging me to do this study.

Trondheim, November 2002

    Alpana Bhatt


List of symbols

a_21: partial tortuosity for fluid flowing through the sand matrix
a_23: partial tortuosity for fluid flowing through the clay matrix
b: bias
B: geometrical factor
C: clay content
d: desired output vector
D: grain size
D_s: effective particle size of sand
D_c: effective particle size of clay
D̄: effective grain size
e: error output vector
e: error between the desired and the actual output
E: expectation operator
f: volume fraction
F: target function
FFI: free fluid index
H_c: hydrogen index of clay
H_g: hydrogen index of gas
H_o: hydrogen index of oil
H_s: hydrogen index of sand
H_w: hydrogen index of water
h(z): output from BPANN at depth z for lithofacies prediction
h̄(z): output from RBPANN at depth z for lithofacies prediction
h_k(z): output from CM for the kth lithofacies prediction
h̄_k(z): output from RBPANN for the kth lithofacies prediction
k: permeability
k_r: relative permeability
L(z): output from the combiner
m: cementation factor
M: particle mass
n: saturation exponent
r(x): real response from a CM
R_c: resistivity of clay
R_t: true resistivity of a formation
R_w: resistivity of formation water
S_g: gas saturation
S_o: oil saturation
S_w: water saturation
S_wi: irreducible water saturation
t: time
T: tortuosity
u: input to adder
w: weight vector
w^i_kj: weight on the connection between the jth input unit and the kth output unit in the ith layer
W: weight matrix
x: input vector
y: output vector
ȳ: combined output from K experts in a CM
z: depth

α: weights from the OLC approach on each expert in a CM
ε: approximation error between the desired response and the output from a CM
ξ: cost function
φ(·): activation function
φ: porosity
φ_c: porosity in clay
φ_N: neutron porosity log
φ_p: percolation porosity
φ_s: porosity in sand
γ: gamma ray log
γ_min: minimum value of gamma ray log
γ_max: maximum value of gamma ray log
β: regression parameters
ν: intermediate output
ρ: bulk density
ρ_f: effective fluid density
ρ_o: density of oil
ρ_w: density of water
ρ_g: density of gas
ρ_m: matrix density
σ: standard deviation
Δ: difference
Δt: bulk transit time
Δt_c: transit time of clay
Δt_f: effective fluid transit time
Δt_g: transit time of gas
Δt_m: matrix transit time
Δt_o: transit time of oil
Δt_w: transit time of water


List of acronyms

BPANN: back-propagation artificial neural network
CM: committee machine
CPI: computer processed interpretation
LMBP: Levenberg-Marquardt back-propagation
LMS: least mean square
mD: millidarcy
MLFF: multilayer feed-forward
MLP: multilayer perceptron
MLR: multiple linear regression
MNLR: multiple nonlinear regression
MNN: modular neural network
MWD: measurement while drilling
OLC: optimal linear combination
RBPANN: recurrent back-propagation artificial neural network


Contents

Summary
Acknowledgements
List of symbols
List of acronyms
1. Introduction
2. Neural networks
   2.1 Introduction
   2.2 Neuron model
   2.3 Network architectures
       2.3.1 Learning
       2.3.2 Perceptron architecture
       2.3.3 Adaline architecture
       2.3.4 Multilayer perceptron
   2.4 Network tasks
       2.4.1 Pattern recognition
       2.4.2 Function approximation
   2.5 Bias variance dilemma
   2.6 Advantages and disadvantages of a MLP network
   2.7 Multiple networks system
       2.7.1 Ensemble combination
       2.7.2 Modular neural network
   2.8 Multiple linear regression
       2.8.1 Analogy of MLR with neural network
3. Porosity prediction
   3.1 Introduction
   3.2 Synthetic data
   3.3 Optimal design of networks
   3.4 Training strategy
   3.5 Comparison of alternative techniques
   3.6 Real data
   3.7 Conclusions
4. Permeability prediction
   4.1 Introduction
   4.2 Errors in data
       4.2.1 Measurement conditions
       4.2.2 Resolution and spatial sampling
       4.2.3 Anisotropy
   4.3 Synthetic data
   4.4 Optimal design of networks
   4.5 Error analysis of the committee networks
   4.6 Real data
   4.7 Conclusions
5. Fluid saturation prediction
   5.1 Introduction
   5.2 Water saturation measurement on cores
       5.2.1 Retort method
       5.2.2 Dean-Stark extraction method
   5.3 Log responses to pore fluids
   5.4 Synthetic data
   5.5 Optimal design of networks
   5.6 CM architecture
   5.7 Real data
   5.8 Conclusions
6. Lithofacies prediction
   6.1 Introduction
   6.2 Multiclass classification using MNN and stacked generalisation
   6.3 Recurrent networks for enhanced layer detection
   6.4 Optimal design and training of networks for classification
       6.4.1 Back propagation neural net for classification
       6.4.2 Recurrent back propagation networks for classification
       6.4.3 Ensemble network and stacked generalisation
   6.5 Classification of logfacies in a multi-layer model with thin layers
   6.6 Real data
   6.7 Conclusions
7. Applications using MWD logs
   7.1 Introduction
   7.2 Sources of error in the data
   7.3 Porosity prediction
   7.4 Permeability prediction
   7.5 Water saturation prediction
   7.6 Conclusions
8. Conclusions and future work
References


    Chapter 1 Introduction

Many forms of heterogeneity in rock properties are present in clastic petroleum reservoirs, and understanding the form and spatial distribution of these heterogeneities is important in petroleum reservoir evaluation. Porosity, permeability and fluid saturation are the key variables for characterising a reservoir in order to estimate the volume of hydrocarbons and their flow patterns, and to optimise production of a field; these properties differ from rock to rock. Lithofacies are defined as laterally mappable subdivisions of a designated stratigraphic unit, distinguished from adjacent subdivisions on the basis of lithology, including all mineralogical and petrographical characters and those paleontological characters that influence the appearance, composition or texture of the rock. Identification of lithofacies is thus an important task in characterising the heterogeneity of a reservoir.

    Porosity is described as the ratio of the aggregate volume of interstices in a rock to its total volume whereas the permeability is defined as the capacity of a rock or sediment for fluid transmission, and is a measure of the relative ease of fluid flow under pressure gradients. Knowledge of permeability in formation rocks is crucial to oil production rate estimation and to reservoir flow simulations for enhanced oil recovery. Reliable predictions of porosity and permeability are also crucial for evaluating hydrocarbon accumulations in a basin-scale fluid migration analysis and to map potential pressure seals to reduce drilling hazards.

No well log can directly determine porosity, and hence it is measured on cores in the laboratory. This is an expensive exercise and is therefore not a routine operation in all drilled wells. Several relationships have been proposed that relate porosity to wireline readings such as the sonic transit time and density logs. However, the conversion from density and transit time to equivalent porosity values is not trivial (Helle et al., 2001). The common conversion formulas contain terms and factors that depend on the individual location and lithology, e.g. clay content, pore fluid type, grain density and grain transit time for the conversion from the density and sonic logs, respectively; these are in general unknown and thus remain to be determined from rock sample analysis.
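Two of the common conversion formulas referred to here are the density transform, φ = (ρ_ma − ρ_b)/(ρ_ma − ρ_f), and the Wyllie time-average sonic transform, φ = (Δt − Δt_ma)/(Δt_f − Δt_ma). A minimal sketch, in which typical quartz-matrix and water-fluid constants stand in for the location-dependent factors the text warns about:

```python
def density_porosity(rho_b, rho_matrix=2.65, rho_fluid=1.0):
    """Density-log transform phi = (rho_ma - rho_b) / (rho_ma - rho_f).
    Matrix and fluid densities (g/cc) are location-dependent assumptions;
    2.65 and 1.0 are typical quartz and fresh-water values."""
    return (rho_matrix - rho_b) / (rho_matrix - rho_fluid)

def sonic_porosity(dt, dt_matrix=55.5, dt_fluid=189.0):
    """Wyllie time-average transform phi = (dt - dt_ma) / (dt_f - dt_ma),
    transit times in us/ft; typical quartz and water values assumed."""
    return (dt - dt_matrix) / (dt_fluid - dt_matrix)
```

Both transforms break down exactly where the text says: when clay content, pore fluid or grain properties deviate from the assumed constants.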


Permeability is also recognised as a complex function of several interrelated factors such as lithology, pore fluid composition and porosity. It can be measured directly from pressure tests in wells and on cores in the laboratory. Although these are important methods, neither is widely used, for technical and financial reasons. The routine procedure in the oil industry has been to estimate permeability from well logs. Such estimates often rely upon porosity, e.g. through the Kozeny-Carman equation or the Wyllie and Rose model (Wyllie and Rose, 1950), which contain adjustable factors such as the Kozeny constant, varying within the range 5-100 depending on the reservoir rock and grain geometry (Rose and Bruce, 1949). Nelson (1994) gives a detailed review of these problems.
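One common form of the Kozeny-Carman relation makes the porosity dependence and the adjustable constant explicit: k ∝ d² φ³ / (c (1 − φ)²), with c the rock-dependent Kozeny constant. A sketch under that assumed form (the exact formulation and unit conventions vary between authors, so units are kept symbolic here rather than converted to mD):

```python
def kozeny_carman(phi, d_grain, c=5.0):
    """One generic Kozeny-Carman form: k = d^2 * phi^3 / (c * (1 - phi)^2).
    c is the rock-dependent Kozeny constant (the text quotes 5-100);
    d_grain is a characteristic grain size, units symbolic."""
    return d_grain**2 * phi**3 / (c * (1.0 - phi)**2)
```

The strong sensitivity of k to φ and to the choice of c is exactly why such porosity-based estimates need local calibration.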

Helle et al. (2001) demonstrated that, instead of a single log, a group of logs should be used to compute a petrophysical property, provided there is a relationship between the property and the logs. Similar views were expressed by Wendt et al. (1986), who reported a study of permeability prediction using multiple linear regression techniques. Their study demonstrated that the correlation coefficient between predicted and actual permeability increases as logs and log-derived parameters other than porosity are included in the prediction.

    Finding the distribution and composition of subsurface fluids is another main objective in hydrocarbon exploration, field development and production. Since direct sampling of underground fluids and determination of fluid saturation in the laboratory is an expensive and time-consuming procedure, indirect determination from log measurements is the common approach. The common practice in the industry is to determine water saturation from empirical formulas using resistivity, gamma ray logs and porosity estimates. The hydrocarbon saturation is then calculated from water saturation as both water and hydrocarbon form the composite pore fluid. Apart from the convenience and financial benefits, methods based on log measurements imply the technical advantage of providing a continuous record and sampling of a larger rock volume. Fluid evaluation from log data and accurate conversion of the logs to fluid saturation values thus also constitute important tasks in reservoir characterisation.
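The empirical formulas referred to here are typified by Archie's relation, S_w = (a R_w / (φ^m R_t))^(1/n), whose parameters a, m and n must be calibrated to the area. A sketch with the usual default parameter values as assumptions:

```python
def archie_sw(Rt, Rw, phi, a=1.0, m=2.0, n=2.0):
    """Archie's relation S_w = (a * R_w / (phi**m * R_t))**(1/n).
    a (tortuosity factor), m (cementation exponent) and n (saturation
    exponent) are rock-dependent and normally calibrated on cores;
    the defaults are common textbook values, not universal ones."""
    return (a * Rw / (phi**m * Rt)) ** (1.0 / n)
```

The need to calibrate a, m and n in the laboratory is precisely the dependence on auxiliary parameters that the neural network approach of this thesis is designed to avoid.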

The standard practice in industry for calculating water saturation is to use various saturation models. However, these models must be tuned to the area of interest, which requires estimating their parameters in the laboratory. It would therefore be preferable to obtain water saturation from logs using neural networks, without explicit dependence on these auxiliary parameters.

Lithofacies defines a body of rock on the basis of its distinctive lithological properties, including composition, grain texture, bedding characteristics, sedimentary structures and biological features. The common practice in the oil industry is to manually examine, from well logs, the various facies identified on cores, with the aid of graphical techniques such as cross-plotting. This method is labour intensive and becomes cumbersome as the number of logs to be analysed simultaneously increases. There are thus several advantages in computerising this method while retaining the expert reasoning of an experienced geologist.

Neural networks have been applied in a wide variety of fields to solve problems such as classification, feature extraction, diagnosis, function approximation and optimisation (e.g. Lawrence, 1994; Haykin, 1999). Although it seems clear that neural networks should not be used where an effective conventional solution exists, there are many tasks for which neural computing can offer a unique solution, in particular those where the data is noisy, where explicit knowledge of the task is not available, or where unknown non-linearity between input and output may exist. Artificial neural networks are most likely to be superior to other methods under the following conditions (Masters, 1993):

i) The data on which a conclusion is based is fuzzy, or is subject to possibly large errors. In this case the robust behaviour of neural networks is important.

ii) The patterns important to the required decision are subtle or deeply hidden. One of the principal advantages of a neural network is its ability to discover patterns in data which are so obscure as to be imperceptible to the human brain or to standard statistical methods.

iii) The data exhibits significant unpredictable non-linearity. Neural nets are marvellously adaptable to any non-linearity.

iv) The data is chaotic (in the mathematical sense). Such behaviour is devastating to most other techniques, but neural networks are generally robust with input of this type.

In many respects the above list summarises the features of conventional earth science data, and this is among the main reasons for the increasing popularity of ANNs in geoscience and petroleum engineering (Mohaghegh, 2000; Nikravesh et al., 2001).

Neural networks for quantitative analysis of reservoir properties from well logs have been demonstrated in several practical applications (e.g. Huang et al., 1996; Huang and Williamson, 1997; Zhang et al., 2000; Helle et al., 2001), where the artificial neural network approach is shown to be a simple and accurate alternative for converting well logs to common reservoir properties such as porosity and permeability. Single multilayer perceptrons (MLP), consisting of an input layer, a hidden layer and an output layer, trained by a back-propagation algorithm (e.g. Levenberg-Marquardt; see Hagan et al., 1996), have been the conventional workhorse for most practical applications over the last decade. Networks trained by back-propagation have been shown to be universal approximators (e.g. White, 1992), implying that they will approximate any static function provided sufficiently representative input-output sample pairs of the function are given. However, the concomitant disadvantage of their ability to generalise beyond the set of examples on which they were trained is that they are likely to make errors.
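A single-hidden-layer MLP with back-propagation training can be sketched in a few lines. The toy below fits a smooth 1-D function by plain gradient descent; layer sizes, the learning rate and the target are illustrative choices, and the thesis itself uses MATLAB's Neural Network Toolbox (with Levenberg-Marquardt training) rather than hand-coded gradient descent:

```python
import numpy as np

# Toy 1-8-1 MLP: tanh hidden layer, linear output, plain gradient descent.
rng = np.random.default_rng(4)
x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
d = np.sin(np.pi * x)                       # target function to approximate

H = 8                                       # hidden units (illustrative)
W1 = 0.5 * rng.standard_normal((1, H)); b1 = np.zeros(H)
W2 = 0.5 * rng.standard_normal((H, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(10_000):
    a1 = np.tanh(x @ W1 + b1)               # forward pass: hidden layer
    y = a1 @ W2 + b2                        # forward pass: linear output
    e = y - d                               # output error
    # Back-propagate mean-squared-error gradients layer by layer.
    gW2 = a1.T @ e / len(x); gb2 = e.mean(axis=0)
    da1 = (e @ W2.T) * (1.0 - a1**2)        # tanh derivative
    gW1 = x.T @ da1 / len(x); gb1 = da1.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - d) ** 2))
```

Even this minimal network drives the fitting error well below the variance of the target, illustrating the universal-approximation property in miniature.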

If we accept that neural networks are unlikely to generalise perfectly to all possible test cases, we have good reason to explore ways of improving their performance. A single MLP, when repeatedly trained on the same patterns, will reach a different minimum of the objective function each time, and hence give a different set of neuron weights. A common approach therefore is to train many networks and then select the one that yields the best generalisation performance. However, since the solution is not unique for noisy data, as in most geophysical inversion problems, a single network may not be capable of fully representing the problem at hand. Selecting the single best network is likely to result in loss of information since, while one network reproduces the main patterns, the others may provide the details lost by the first. The aim should thus be to exploit, rather than lose, the information contained in a set of imperfect generalisers. This is the underlying motivation for the committee neural network approach, or committee machine, where a number of individually trained networks are combined, in one way or another, to improve accuracy and increase robustness. An important observation (Naftaly et al., 1997) is that a committee can reduce the variance of the prediction while keeping the bias constant, whereas Hashem (1997) proposed unconstrained optimal linear combination to eliminate the bias.
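The variance-reduction observation is easy to verify numerically: averaging K unbiased predictors with independent noise reduces the prediction variance by roughly a factor of K while leaving the bias unchanged. A sketch with purely synthetic noise:

```python
import numpy as np

rng = np.random.default_rng(3)
K = 9                                       # committee size (illustrative)
# K unbiased "predictors" of a zero target, each with independent
# unit-variance noise, evaluated at 10,000 test points.
preds = rng.standard_normal((K, 10_000))

var_single = float(preds[0].var())          # about 1
var_committee = float(preds.mean(axis=0).var())  # about 1/K
```

In practice network errors are partially correlated, so the reduction is smaller than 1/K, which is one motivation for deliberately promoting diversity among committee members.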

Helle et al. (2001) demonstrated the prediction of porosity and permeability using a single neural network. Using the committee machine approach, Bhatt and Helle (2001a) demonstrated improved porosity and permeability predictions from well logs by ensemble combination of neural networks rather than selecting the single best network by trial and error. Helle and Bhatt (2001) successfully applied the ensemble approach to predict partial fluid saturation. Bhatt and Helle (2001b) successfully applied a committee machine using a combination of a back-propagation neural network and a recurrent neural network for the identification of lithofacies.

    In this dissertation we have devised a technique using neural networks for predicting porosity, permeability and fluid saturation, and for identifying lithofacies, from log data. The technique exploits the prevailing unknown relationships between well logs and the reservoir properties, utilising the ability of a neural network to discover patterns in the data that are important for the required decision but may be imperceptible to the human eye or to standard statistical methods. The method improves on techniques commonly practised in industry because it does not require deep geological knowledge of the area and is much faster to use than the standard statistical methods. It is more robust and accurate than the standard multiple linear regression technique. It also makes the identification of lithofacies much simpler in comparison with manual identification, especially when the number of logs to be analysed increases. The idea of this dissertation is thus not to eliminate interpretation by an experienced petrophysicist, but to make the task simpler and faster.

    Thus in this study we have explored the capabilities of back-propagation neural networks on synthetic and real data, with a short introduction to neural networks and modular neural networks in Chapter 2. In all the applications synthetic data has been utilised for determining the optimal architecture of the network. In Chapter 3 we compare the prediction of porosity using a single neural network with that of the committee machine approach; the optimal architecture of the network was then applied to real data. We concluded that the optimal linear combination (OLC) approach (Hashem, 1997) is the best for porosity prediction. In Chapter 4 we discuss the different factors affecting permeability and the reasons for the differences between the predicted and the core permeability, and we compare the prediction of permeability using a single neural network with that of the committee machine approach. In Chapter 5 a committee machine has been devised to predict partial fluid saturation, which was further utilised for generating relative permeability logs. In Chapter 6 we devised a committee machine using a back-propagation and a recurrent back-propagation neural network for predicting lithofacies within the Ness formation of the Oseberg field. The chapter discusses the method of identifying three lithofacies using synthetic data when the contrast between the lithofacies is only 1.25% of the contrast between sand and shale. The BPANN leaves some overlap between the lithofacies, which is further reduced by the RBPANN by utilising the past and future predicted lithofacies together with the past logs; the misclassification in the predictions was reduced from 6.74% to 2.89% by the application of the RBPANN. The same technique is then applied to real data for identification of lithofacies within the Ness formation.
In Chapter 7 we applied the committee machine architecture for predicting porosity, permeability and fluid saturation on measurement while drilling data. The individual networks for measurement while drilling were trained on patterns generated from wireline networks in the absence of core data. The main aim behind this approach is to determine the reservoir properties while drilling. Chapter 8 is the final conclusion of the thesis and contains suggestions for future work.



    Chapter 2 Neural networks

    2.1 Introduction

    A neural network can be described as a massively parallel distributed processor, made up of simple processing units called neurons, which has a natural tendency for storing experiential knowledge and making it available for use. It resembles the human brain in the following respects: (i) knowledge is acquired by the network from its environment through a learning process; (ii) interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.

    The human brain is a very complex, non-linear and parallel computer system, which is capable of thinking, remembering and problem solving. It has the capability to organise its structural constituents, known as neurons, so as to perform certain computations (e.g. pattern recognition and perception) much faster than a computer. There have been many attempts to emulate brain functions with computer models and, although there have been some spectacular achievements coming from these efforts, all of the models developed to date pale in comparison with the complex functioning of the human brain.

    The fundamental cellular unit of the brain's nervous system is a neuron. It is a simple processing element that receives and combines signals from other neurons through input paths called dendrites. If the combined input signal is strong enough then the neuron fires, producing an output signal along the axon that connects to the dendrites of many other neurons. Figure 2.1 is a sketch of a biological neuron showing its various components. Each signal coming into a neuron along a dendrite passes through a synapse or a synaptic junction. This junction is an infinitesimal gap in the dendrite that is filled with neuro-transmitter fluid that either accelerates or retards the flow of electrical charges. The fundamental actions of the neuron are chemical in nature, and this fluid produces electrical signals that go to the nucleus or soma of the neuron. The adjustment of the impedance or conductance of the synaptic gap is a critically important process. These adjustments lead to memory and learning. As the synaptic strengths of the neurons are adjusted the brain learns and stores information.


    Figure 2.1: Schematic of a biological neuron

    2.2 Neuron Model

    A biological neuron is a fundamental unit of the brain's nervous system. Similarly, an artificial neuron is a fundamental unit of the operation of a neural network. The block diagram of Figure 2.2 shows the model of a neuron. Its basic elements are:

    i) A set of synapses, or connecting links, each of which is characterised by a weight of its own. A signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj.

    ii) An adder for summing the input signals, weighted by the respective synapses of the neuron.

    iii) An activation function for limiting the amplitude of the output of a neuron. It limits the permissible amplitude range of the output signal to some finite value.



    The neuronal model also includes an externally applied bias, bk, which has the effect of increasing or lowering the net input of the activation function, depending on whether it is positive or negative, respectively.

    Figure 2.2: Model of a neuron

    Mathematically the function of the neuron k can be expressed by

    yk = φ(uk + bk)    (2.1)

    where

    uk = Σ(j=1..m) wkj xj    (2.2)

    Here xj is the input signal from an m-dimensional input, wkj is the synaptic weight of neuron k for input j, uk is the linear combiner output due to the input signals, bk is the bias, φ(·) is the activation function, and yk is the output signal of the neuron. The relation between the linear combiner output uk and the activation potential vk is

    vk = uk + bk    (2.3)

    The activation function φ(v) defines the output of a neuron in terms of the induced local field v. The three most common activation functions are:

    i) Hardlimit function

    φ(v) = 1 if v ≥ 0, 0 if v < 0

    (from equation 4.17)
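The neuron model of equations (2.1)-(2.3) amounts to only a few lines of code. A minimal Python sketch (illustrative only; the dissertation's implementation uses the MATLAB neural network toolbox), using the hardlimit activation:

```python
def hardlimit(v):
    """Hardlimit activation: phi(v) = 1 if v >= 0, else 0."""
    return 1.0 if v >= 0 else 0.0

def neuron(x, w, b, phi=hardlimit):
    """One artificial neuron: u = sum_j w_j x_j (2.2), v = u + b (2.3),
    y = phi(v) (2.1)."""
    u = sum(wj * xj for wj, xj in zip(w, x))  # linear combiner output
    v = u + b                                 # activation potential
    return phi(v)                             # neuron output

# Example: weights (0.5, -1.0), bias -0.2, inputs (1.0, 0.4)
# u = 0.5 - 0.4 = 0.1, v = -0.1 < 0, so the neuron does not fire.
print(neuron([1.0, 0.4], [0.5, -1.0], -0.2))  # 0.0
```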

    Thus we see that for combining the permeability networks the constrained OLC approach (Hashem, 1997) is better than the unconstrained approach, but simple averaging (equation 4.11) is still superior to the OLC, as shown by the numerical experiment in Figure 4.8, where the blow-up of noise in the case of the constrained OLC (equation 4.18) is evident. Since the OLC approach does not work properly in the case of log10 k, and since the variance has become a serious problem, the idea of over-training to reduce bias at the expense of increasing variance is no longer applicable. The concern is now to reduce the variance of the individual neural networks in the CM while still keeping the bias at a sufficiently low level. Instead of over-training the neural networks we validate the network output against the validation set; when the variance reaches its minimum the training stops. In Figure 4.8a we have compared the validation method with the over-training method for the permeability CM, using both the OLC (equation 4.18) and the simple average (equation 4.11). The results clearly reveal that simple ensemble averaging, with the validation criterion used for training the individual neural networks, is the optimum approach for the permeability networks at hand.
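For reference, a constrained OLC of the kind discussed above can be sketched as follows: the combination weights minimise the committee's mean squared error subject to the weights summing to one, which leads to weights proportional to C⁻¹·1, where C is the second-moment (error covariance) matrix of the member errors. The Python fragment below is an illustrative reconstruction, not the dissertation's code, and the member residuals are invented:

```python
def solve(A, b):
    """Naive Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def constrained_olc_weights(errors):
    """Combination weights minimising the committee MSE subject to
    sum(w) = 1: w is proportional to C^-1 * 1, with C the second-moment
    matrix of the member errors."""
    n, m = len(errors), len(errors[0])
    C = [[sum(errors[i][k] * errors[j][k] for k in range(m)) / m
          for j in range(n)] for i in range(n)]
    raw = solve(C, [1.0] * n)        # C w proportional to the ones vector
    s = sum(raw)
    return [w / s for w in raw]      # normalise so the weights sum to 1

# Invented residuals (prediction - target) of three hypothetical members;
# the third member is the most accurate.
errors = [
    [0.2, -0.1, 0.15, 0.05],
    [-0.3, 0.25, -0.2, 0.1],
    [0.05, 0.02, -0.04, 0.03],
]
w = constrained_olc_weights(errors)
print(w)  # the most accurate member receives the largest weight
```

As the chapter notes, for log10 k this weighting can amplify noise, which is why simple averaging is preferred here.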


    Figure 4.8: Comparison of bias and variance (a) for alternative training and combination methods, (b) predictions based on the OLC-overtraining method

    4.6 Real data

    In the first study (Helle et al., 2001) we trained a single neural network having 4 input neurons (density, gamma ray, neutron porosity and sonic), 12 neurons in the hidden layer and one neuron (permeability) in the output layer. The training patterns have been taken from three fields in the North Sea. Most of the training patterns (72%) are from the water-bearing well 31/4-10 from the Brage field; the rest of the patterns are from the Oseberg field (25%) and from the Haltenbanken area (3%). The patterns from the Oseberg field are taken from the oil- and gas-bearing well 30/9-B-20 and the water-bearing well 30/9-11, so the network covers a complete range of reservoir fluids. It is very important that the network should never be trained on the unresolved dataset. The shale permeabilities were added from the Haltenbanken area from the study of Krooss et al. (1998). This low-permeability shale data was added to tune the network for basin-scale applications (Bhatt, 1998). By adding the six low-permeability shale points in the range 0.5-39 nD to the standard core analysis permeability in the range 34 D - 12 D we have covered most sediments within the prospective depths in the Viking Graben. Most of the training facts are conventional Klinkenberg corrected permeability measurements on core plugs. While the porosity network is based on samples from both the Tertiary and the Jurassic, all training facts for the permeability network are confined to cored sections from the upper Jurassic. The training patterns are dominated by wells outside the test field, i.e. the Oseberg field, and the majority of facts are from the Brage field, which is in the same area. The trained network has been tested on several wells from the Oseberg field. The results of this network for permeability predictions in the cored reservoir intervals of two wells are shown in Figures 4.9 and 4.10.

    Well 30/6-4 is completely unknown to the network. It is an oil-bearing well with a hole deviation of 0-1 degrees. Core data is available in the Rannoch and the Oseberg formations. The permeability prediction by the neural network matches the Klinkenberg corrected core permeability very well in most of the intervals, as shown in Figure 4.9b and c. The error distribution is shown in the form of a histogram of the difference between the logarithm of the core permeability and that of the neural network predicted permeability. The error distribution fits the Gaussian model, so the mean value and standard deviation shown are for the Gaussian model. In Figure 4.9b the mean error is very close to zero. The reason for the standard deviation of 0.28 is the fine layering in the bottom part of the Oseberg formation (2668-85 m). Due to the small-scale heterogeneity there is scatter in the core data, whereas the neural network prediction gives approximately a mean value of the permeability in this interval. There are two reasons for this mismatch: firstly, the spatial resolution of log data is poorer than that of core data and, secondly, the resolution of the network, which is trained on the whole range of permeability, is limited. The network does not have a high resolution because the transfer function normalises the whole dataset into the range -1 to +1. The sonic log has the best resolution, about 0.3-0.6 m, followed by the density log, but for the rest of the logs the resolution is about 0.5 m to about 1 m. Here also, in the sonic and density logs a marked increase in the amplitude of short-length variations coincides with intervals where the core data exhibit maximum scatter. So there is not much we can do to improve the log resolution, except that we should include the available high-resolution logs as inputs. The second problem is remedied to some extent by either reducing the permeability range or using a range-splitting permeability CM architecture as shown in Figure 4.6.


    Figure 4.9: (a) Comparison of permeability predictions with core data in well 30/6-4, (b) error distribution, (c) crossplot of core against neural network permeability

    [Figure 4.9 panels: (a) log10 permeability vs depth, 2640-2680 m RKB, Rannoch and Oseberg formations; (b) error histogram, mean = -0.01, σ = 0.28, N = 97; (c) crossplot, R = 0.9, slope = 0.8]

    Figure 4.10 shows the comparison of the Klinkenberg corrected core permeability with the permeability predicted by the neural network in well 30/9-B-20. In the training dataset we took 10 training points from this well, but the rest of the data is unknown to the network. The well has an inclination angle of 38-42 degrees. The top formation is the Tarbert (3110-3138 m), then the Ness (3138-3158 m), the Etive (3158-3162 m), the Rannoch (3162-3167 m) and finally the Oseberg formation (3167-3220 m). The Tarbert formation, which is fairly homogeneous, shows a good match between the two permeabilities; there is, however, an overprediction of permeability in the top part of the Tarbert. The scatter is greater in the heterogeneous Ness formation due to thin-bed heterogeneities. There is a good match in the Etive, the Rannoch and the top part of the Oseberg formation (3164-3195 m). There is an overprediction of permeability by the neural network in the bottom part of the Oseberg formation (3195-3220 m) due to fine layering, so the log gives a mean value of the permeability. In Figure 4.10b the mean error is 0.155 logarithmic permeability with a standard deviation of 0.475 logarithmic permeability. The high standard deviation is mainly due to the large scatter in the core data and the spatial resolution of the logging tools, as discussed above. Figure 4.10c shows the correlation between the neural network predicted permeability and the core permeability. The poor correlation coefficient is mainly due to the discrepancy between core and neural network predicted permeability towards the low-permeability end. This is because the training dataset included six low-permeability (nanodarcy) points from Krooss et al. (1998). Such low permeabilities can generally not be measured in the laboratory, but in real formations it is not unusual to find low-permeability cemented and carbonate streaks. Thus the network predicts very low permeabilities in some streaks in the Ness formation whereas the core data do not.

    Figure 4.11 shows core photographs of the Etive and the Rannoch formations. The photographs show that the Etive is a clean sand whereas the Rannoch is a silty sand. Figure 4.12 displays the small-scale heterogeneities, which are beyond the resolution of the logs.


    Figure 4.10: (a) Comparison of permeability predictions with core data in well 30/9-B-20, (b) error distribution, (c) crossplot of core against neural network permeability.

    [Figure 4.10 panels: (a) log10 permeability vs depth, 3100-3220 m RKB, Tarbert, Ness, Etive, Rannoch and Oseberg formations with the GOC marked; (b) error histogram, mean = -0.155, σ = 0.475, N = 361; (c) crossplot, R = 0.68, slope = 0.68]


    Figure 4.11: (a) The Etive formation, well 30/9-B-3, 2781-2782 m; (b) transition of the Etive into the Rannoch formation, well 30/9-B-50 H, 3238.5-3239.0 m


    Figure 4.12: Comparison of log and core derived permeability values with the core photograph.

    Motivated by the improvement in results obtained by using a CM instead of a single network for predicting porosity, we also applied the CM for predicting permeability. The architecture of the permeability CM for real data is the same as that used with the synthetic data (Figure 4.6).

    [Figure 4.12 panel: log10 k vs depth, 3265-3295 m RKB, annotated with core-plug sample numbers S144-S241]

    Provided that the precautions discussed above regarding the architecture of the network have been taken into account during training and combination of the networks, the permeability CM is more capable of handling the non-linearity and noise than MLR and a single network, as shown in Figures 4.5 and 4.13. As discussed in Bhatt and Helle (2001a), range splitting by the CM helps to resolve details in the combination of logs that are otherwise invisible. The latter is demonstrated in Figure 4.13a, where we show details in the CM predictions that are not captured by a single neural network. This example is taken from the water-bearing vertical well 30/9-1. The encircled portions show the improvement achieved by the CM architecture: the underprediction and poor resolution of permeabilities by the single neural network, which are due to the large dynamic range of the training dataset, are improved by the range splitting in the CM. As shown in Figure 4.13b there is also a reduction in bias and variance, along with the improvement in the resolution of the networks, when using the CM architecture. The same is illustrated in Figure 4.13c.

    The next example, shown in Figure 4.14, is taken from well 30/9-B-20. Compared with the previous predictions based on a single neural network (shown in Figure 4.10), more details of the core data are now reproduced. Due to the increased resolution of the network resulting from range splitting, the network is able to predict the permeabilities in the bottom part of the Oseberg formation (3195-3220 m), which has fine layering, and in the top part of the Tarbert formation; the encircled portions show the improvement. The overall error between predictions and core measurements has also been significantly reduced: as shown in Figure 4.14b, the mean error has been reduced from 0.155 to 0.04 and the standard deviation from 0.475 to 0.3. There is also a higher correlation between the core permeability and the CM permeability (Figure 4.14c shows the improved results). There still remains a discrepancy between core and CM predicted permeabilities in the Ness formation because of the higher heterogeneity in thin layers.

    Most of the results discussed so far on real data show a standard deviation of about 0.3 logarithmic permeability, which is quite low compared with standard industry practice (the multivariate linear regression technique), keeping in mind all the sources of error between core and log data. In Huang et al. (1997) the permeability predicted by a neural network also had an average error of less than 0.5 logarithmic permeability.


    Figure 4.13: (a) Comparison of permeability predicted by a single neural network and by a CM in well 30/9-1 (log10 permeability vs depth, 2680-2780 m RKB, Tarbert, Ness, Etive, Rannoch and Oseberg formations), (b) error distributions (single network: mean = -0.012, σ = 0.52; CM: mean = 0.002, σ = 0.43; N = 177), (c) crossplots of core against predicted permeability (single network: R = 0.75, slope = 0.81; CM: R = 0.83, slope = 0.75).


    Figure 4.14: (a) Comparison of permeability predicted by a CM in well 30/9-B-20 with core permeability (log10 permeability vs depth, 3100-3220 m RKB, GOC marked), (b) error distribution (mean = 0.04, σ = 0.3, N = 361), (c) crossplot of core against CM permeability (R = 0.73, slope = 0.7).


    4.7 Conclusions

    The neural network approach to permeability prediction has a clear advantage over conventional methods, such as empirical formulas based on linear regression models or the common semi-empirical formulas (e.g. the Wyllie and Rose model or the Kozeny-Carman equation), because knowledge of a mathematical model is not necessary.

    The benefit of modularity, obtained by decomposing the permeability range into a number of sub-ranges, is increased resolution in comparison with a single network trained on the whole range of permeability. The increase in resolution could also be achieved by reducing the dynamic range of the training dataset, but since in this study we wished to predict permeabilities on the basin scale, we kept a large dynamic range.

    Synthetic data based on a background model of Kozeny-Carman type, with porosity and clay content as the independent variables, helped in evaluating the optimal architecture of the network, the training procedure and the size of the training dataset. With the four inputs (sonic, density, gamma ray and neutron porosity), the optimal number of hidden units of the permeability neural network is confined to the range 8-12, where the variance and bias are at their minima. In general, the errors steadily decrease with the number of training facts. A practical lower limit has been set to 300, or twice the size of the training set required for the porosity network, due to the increased complexity of the background relationships with the log readings.

    Since we use a logarithmic rather than a linear permeability scale, the success of the OLC in the porosity CM is not repeated when it is applied to the permeability CM; in fact, noise amplification takes place. Simple ensemble averaging is shown to be the preferred method of combining the outputs. However, with a relatively small number of neural network components in the CM, the variance associated with the individual networks becomes a major problem. The usual success of over-training in reducing bias is offset by errors due to increasing variance. A different training strategy must therefore be applied, using the validation approach, which requires the training to stop when the level of minimum variance has been reached.

    Provided that these precautions are taken, the permeability CM is more capable of handling the non-linearity and noise than MLR and a single neural network. In application to real data, a minimum standard error of the difference between predictions and Klinkenberg corrected permeability data seems to be around 0.3 in logarithmic units (of mD), mainly due to limitations in the measurement techniques.


    Thus our permeability predictions are sufficiently accurate for most practical purposes, given the limitations due to the spatial resolution of the logging instruments, depth shifting between core and logs and the expanded range covered by the permeability values. Application to real-time data (MWD) is the obvious extension of this technique.


    Chapter 5 Fluid saturation prediction

    5.1 Introduction

    Apart from the porosity and permeability of the reservoir rock and the type of fluid, it is important to know the hydrocarbon saturation of the reservoir in order to estimate the total reserves and to determine whether the accumulation is commercial. The saturation of a formation is defined as the fraction of its pore volume occupied by the fluid considered. Direct sampling of the reservoir fluid is neither technically nor commercially efficient, so the preferred method to date is to use well logs for fluid saturation prediction. Moreover, the logs also provide a continuous record of the formation.

    Although the common saturation models, such as those of Archie (1942) and Poupon (1971), are based on sound scientific and technical reasoning, they are still non-universal and non-linear empirical relations that need to be fitted to real data. These are the main justifications for employing neural network techniques in predicting fluid saturation. The neural network approach is very pragmatic and non-linear, and may even be trained to display the expertise of a skilled petrophysicist. Helle et al. (2001) demonstrated that a network trained for porosity prediction provides excellent accuracy for all pore fluids, implying that, after training for different fluids and partial saturation, knowledge of the fluid properties is embedded in the network.

    The purpose here is to establish networks for fluid saturation using only the log readings, without relying on functions that explicitly depend on porosity and auxiliary parameters derived from laboratory measurements. Since the network has to learn from data provided through a careful petrophysical analysis, the idea is not to eliminate the petrophysical work behind the saturation logs, but to transfer into the neural network, for future application, the effort and expertise already embedded in the petrophysical database. An obvious application is prediction while drilling, when the data required for conventional petrophysical evaluation are not yet available.

    In this study we test the performance of alternative neural network configurations for saturation using model data and real data. We generate synthetic logs, with various levels of noise, for evaluating the optimal network architecture, by employing a model of mixed fluid saturation and the common formulae relating water saturation to well log response. From a large number of individually trained networks (~20) we select the best subset (~5-10) to be included in a committee neural network, or committee machine (CM), following the test-and-select approach suggested by Sharkey et al. (2000). We compare the performance of the individual experts of the CM with various methods of combining the ensemble to obtain the best accuracy. For completeness, we also compare the neural network results with those of the conventional multiple linear regression (MLR) technique, to demonstrate that the accuracy of the MLR approach is less than that of the simplest possible neural network architecture needed to represent the problem. Using a generalised committee neural network for fluid properties we apply the new technique to real data from the North Sea.

    5.2 Water saturation measurement on cores

    Apart from calculating water saturation (Sw) from logs, which is the common industry practice, Sw measurements are occasionally also made on cores. The following are some of the methods by which Sw can be estimated on cores:

    5.2.1 Retort method

    This is a technique for measuring the fluid saturations in a core sample by heating the sample and measuring the volumes of water and oil driven off. The sample is crushed and weighed before being placed in the retort. It is then heated, in stages or directly, to 650 °C, during which the fluids are vaporised, collected, condensed and separated. Plateaus in the rise of the cumulative water volume with temperature are sometimes analysed to indicate when free water, surface clay-bound water and interlayer clay-bound water have been driven off. The volumes of water and oil are measured directly, but corrections are needed to account for alterations in the water and oil because of the dissolved gas. The volume of gas is also needed for accurate results. This is measured on a separate, adjacent sample by injecting mercury under pressure and measuring the volume absorbed. Before injection, the sample is weighed and its bulk volume determined by mercury displacement. The total pore volume is then the sum of the volumes of gas, oil and water, and the saturation of each component is the ratio of its volume to the total pore volume.
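The bookkeeping described above reduces to simple ratios. A small illustrative sketch (the volumes are hypothetical):

```python
def retort_saturations(v_water, v_oil, v_gas):
    """Fluid saturations from retort-measured volumes (all in cm3):
    the total pore volume is the sum of the gas, oil and water volumes,
    and each saturation is that fluid's volume over the pore volume."""
    pore_volume = v_water + v_oil + v_gas
    return {fluid: v / pore_volume
            for fluid, v in (("Sw", v_water), ("So", v_oil), ("Sg", v_gas))}

# Hypothetical plug: 1.2 cm3 water, 0.6 cm3 oil, 0.2 cm3 gas (mercury injection)
s = retort_saturations(1.2, 0.6, 0.2)
print(s)  # Sw = 0.6, So = 0.3, Sg = 0.1
```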

    5.2.2 Dean Stark extraction

    This is a method for measuring the fluid saturations in a core sample by distillation extraction. The water in the sample is vaporised by boiling solvent, then condensed and collected in a calibrated trap. This gives the volume of water in the sample. The solvent is also condensed, then flows back over the sample and extracts the oil. Extraction continues for a minimum of two days, until the extracted solvent is clean or the sample shows no more fluorescence. The weight of the sample is measured before and after extraction. The clean and dried samples are measured for porosity. The water volume from the extraction gives the water saturation, and the oil saturation can be calculated indirectly from the weights before and after extraction.

    5.3 Log responses to pore fluids

    The type of pore fluid, gas, oil or brine, is clearly reflected in several well logs. Gas and water are significantly different in density and sonic velocity, while the differences are smaller between water and oil. Assuming a mixed pore-fill of water, oil and gas averaged over the scale of measurement, the density ρ_f of the composite pore fluid is given by

    ρ_f = S_w ρ_w + S_o ρ_o + S_g ρ_g    (5.2)

    where typical values for the densities of the constituents are ρ_w = 1.03, ρ_o = 0.75 and ρ_g = 0.25 g/cm3, and the partial saturations satisfy the equation

    S_w + S_o + S_g = 1    (5.3)

    The fluid saturation profile is the same as used in the synthetic data for porosity prediction. The bulk density of the reservoir rock is given by

    ρ = φ ρ_f + (1 - φ) ρ_m    (5.4)

    where φ is the porosity and ρ_m the density of the rock material, implying that the density log is sensitive to the pore-filling fluid as well as to the properties of the rock itself. While ρ_f may vary within the wide range 0.2-1.1 g/cm3 from gas to brine, the density of siliciclastic rock material varies only within the narrow range 2.64-2.80 g/cm3 (Brigaud et al., 1992). The variations in bulk density within a typical North Sea siliciclastic reservoir may thus mainly be a response to variations in fluid content and composition of the pore fluid. The transit time for the bulk can similarly be approximated by

    Δt = φ Δt_f + (1 - φ) Δt_m    (5.5)


    where Δt_m is the transit time of the rock material, which varies in the range 56-75 μs/ft (Brigaud et al., 1992). The sonic log is therefore a sensitive indicator of fluid content and fluid type.
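Equations (5.2), (5.4) and (5.5) translate directly into code. The sketch below is illustrative only; the fluid transit time Δt_f is not quoted in the text, so a value of 189 μs/ft (fresh water) is assumed, and Δt_m = 65 μs/ft is an assumed mid-range value from the 56-75 μs/ft interval above:

```python
# Fluid mixing relations (5.2), (5.4) and (5.5) as small helper functions.
def fluid_density(sw, so, sg, rho_w=1.03, rho_o=0.75, rho_g=0.25):
    return sw * rho_w + so * rho_o + sg * rho_g           # eq. (5.2), g/cm^3

def bulk_density(phi, rho_f, rho_m=2.65):
    return phi * rho_f + (1.0 - phi) * rho_m              # eq. (5.4), g/cm^3

def bulk_transit_time(phi, dt_f=189.0, dt_m=65.0):
    return phi * dt_f + (1.0 - phi) * dt_m                # eq. (5.5), us/ft

rho_f = fluid_density(sw=1.0, so=0.0, sg=0.0)             # brine-filled pores
rho = bulk_density(phi=0.25, rho_f=rho_f)
dt = bulk_transit_time(phi=0.25)
```

For a brine-filled sandstone of 25% porosity this gives a bulk density of about 2.245 g/cm3, illustrating how strongly the density log responds to the pore fluid.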

    Brine and hydrocarbons are, in general, highly different in resistivity, with low resistivity for brine (~1 ohmm) and high resistivity for hydrocarbon-bearing reservoirs (~100 ohmm or more). Resistivity, in fact, is the most important hydrocarbon indicator and, moreover, the resistivity of clean sandstone is directly related to the water saturation through Archie's equation (Archie, 1942)

    S_w^n = F R_w / R_t    (5.6)

    where the resistivity of the formation water is R_w = 0.14 ohmm, the formation factor is given by F = a/φ^m with a = 0.625, and the exponents are m = n = 2,

    using typical values for a North Sea sandstone reservoir. For sandstones containing clay, various modifications of Archie's equation have been proposed, which have the general form (Schlumberger, 1989)

    1/R_t = Σ_0 S_w^2 + Σ_1 S_w    (5.7)

    where Σ_0 is the predominant sand term, which depends on the amount of sand, its porosity, and the resistivity of the saturating water, and Σ_1 is the shale term, which depends on the amount and resistivity of the shale. For clean sandstone (5.7) reduces to Archie's equation (5.6). One of the favourite models for calculating the effective water saturation Sw in shaley formations has been provided by Poupon et al. (1971), who claim that their model is independent of the clay distribution. Based on a modified Archie's equation, the relationship between the true resistivity and the formation parameters has been established by the equation

    1/√R_t = ( C^(1 - C/2)/√R_C + √(φ_e^m/(a R_w)) ) S_w^(n/2)    (5.8)

    where R_C is the resistivity of the clay and C is the volume fraction of clay in the formation, as determined from clay-sensitive logs such as gamma ray, or from a combination of neutron and density. Here φ_e is the effective porosity of the formation, i.e. excluding the shale porosity. It is calculated from the density log, the


    grain density measurements of the matrix density, and the fluid density, using the standard density-to-porosity transform (equation 5.4). The density porosity is then combined with the neutron porosity in a weighted sum. Several logs other than resistivity thus form the input to the saturation models, plus supplementary laboratory data to determine the constants and exponents. Because of lack of data the common assumption in petrophysical evaluation is, however, that the factor a, the cementation exponent m and the saturation exponent n, as well as the resistivity of the saturating water Rw, are constant over the field, even though significant variations are seen in the laboratory data. Provided a sufficient number of core samples has been analysed, the actual values of m and n may be used rather than their mean values.
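To make equations (5.6) and (5.8) concrete, the following sketch inverts both for Sw, using the typical North Sea constants quoted above. The function names are mine, and the clay resistivity in the example is an arbitrary illustrative value:

```python
import math

def archie_sw(rt, phi, rw=0.14, a=0.625, m=2.0, n=2.0):
    """Water saturation from Archie's equation (5.6): Sw^n = F*Rw/Rt."""
    f = a / phi**m                                  # formation factor F = a/phi^m
    return (f * rw / rt) ** (1.0 / n)

def indonesia_sw(rt, phi_e, c, rc, rw=0.14, a=0.625, m=2.0, n=2.0):
    """Water saturation from the Poupon et al. (1971) model, equation (5.8)."""
    term = c ** (1.0 - c / 2.0) / math.sqrt(rc) + math.sqrt(phi_e**m / (a * rw))
    return (1.0 / (math.sqrt(rt) * term)) ** (2.0 / n)

# For zero clay content the shaley-sand model should reduce to Archie's equation.
sw_clean = indonesia_sw(rt=10.0, phi_e=0.25, c=0.0, rc=4.0)
sw_archie = archie_sw(rt=10.0, phi=0.25)
```

The clean-sand limit (C = 0) recovering the Archie value is a convenient consistency check on any implementation of (5.8).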

    All the models from which water saturation is calculated are empirical, either determined during laboratory investigations or from field experience. There are significant uncertainties in the estimation of the formation parameters. The relative change in Sw from Archie's equation arising from errors in all the measured variables can be given by:

    ΔS_w/S_w = (1/n)(Δa/a + ΔR_w/R_w - ΔR_t/R_t - m Δφ/φ) - ln(S_w) Δn/n - ln(φ) Δm    (5.9)

    Thus all the parameters Rw, Rt, a, φ, m and n contribute to the total error ΔSw.

    If m = 2, porosity errors are twice as significant as resistivity errors. Errors in n can lead to significant errors in Sw at small water saturations, while errors in m can be important for low-porosity media. Keeping all other parameters constant, for a formation of 31.6% porosity and 31.6% water saturation with m = n = 2, a relative error of 2.5% in the measurement of a, m, n, φ, Rw and Rt, taken one at a time, causes a relative error of 1.25%, 5.7%, 3%, 2.5%, 1.25% and 1.25% in the Sw values, respectively. The maximum errors in Sw are thus due to errors in m and n. A relative error of 2.5% in the measurement of n is moreover very optimistic; in reality it can be up to 20%, which can lead to large uncertainties in the Sw values.
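The error budget quoted above can be reproduced term by term from equation (5.9). This is a small numerical check; the grouping of the terms follows the form of (5.9) used here, with each parameter perturbed by 2.5% of its value in turn:

```python
import math

phi, sw_val, m, n = 0.316, 0.316, 2.0, 2.0
rel = 0.025                                  # 2.5% relative error, one at a time

terms = {
    "a":   rel / n,                          # (1/n) * (da/a)
    "m":   abs(math.log(phi)) * (rel * m),   # |ln(phi)| * dm, with dm = 2.5% of m
    "n":   abs(math.log(sw_val)) * rel,      # |ln(Sw)| * dn/n, with dn = 2.5% of n
    "phi": m * rel / n,                      # (m/n) * (dphi/phi)
    "Rw":  rel / n,
    "Rt":  rel / n,
}
```

Evaluating the dictionary reproduces the 1.25%, 5.7%, 3%, 2.5%, 1.25% and 1.25% figures quoted in the text.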

    The parameter m is also affected by a large number of factors, including grain texture, pore configuration and size, constrictions in the porous system, tortuosity, type of pore system (inter-granular, inter-crystalline, vuggy, fractured), compaction due to overburden pressure and the presence of clay minerals. The main effect of these parameters is to modify the formation resistivity factor F. Consequently, their combination can produce a range of values of F and m for a given porosity. In a case study in a reef-type limestone (Tiab and Donaldson, 1996), when the overburden pressure increases from 0 to 35 MPa the value of m increases from 1.99 to 2.23, corresponding to a relative error of 0.5% to 11.5% in the value of m.


    Furthermore, the parameters are generally obtained under ambient conditions instead of reservoir conditions. In the study made by Søndenå et al. (1990), the results from electrical resistivity measurements on cores from a North Sea reservoir showed that the saturation exponents obtained at reservoir conditions are lower than those obtained under ambient conditions, and that they also increased slightly when the effective stress was increased. It is thus crucial to have careful laboratory measurements of the m and n parameters. In the present study, however, we have not eliminated these errors from Sw, as the errors are already present in the Sw calculation from the CPI logs against which we have calibrated the neural network model.

    Being aware of the uncertainties in the empirical relations, Woodhouse (1998) suggested Sw measurement by the extraction of reservoir formation water from core plugs cored with oil-based mud. The study shows that, after small systematic corrections, the cores gave accurate in situ reservoir Sw measurements valid over a wide range of Sw values and throughout most of the transition zone (a reservoir interval extending from the fluid contact upwards, where the water saturation is higher than the irreducible saturation). During coring no significant amounts of reservoir formation water were mobilised and displaced from the cores, except in those taken from and below the OWC. Rapidly drilled cores with incomplete penetration, and pressure-retained cores, provided conclusive proof. He found that Sw measurements by this method are consistent with the Sw measured from logs or by the capillary pressure method.

    The neural network model is fully empirical, non-linear and may even be trained to display the expertise of a skilled petrophysicist. The idea here is thus to establish networks for fluid saturation prediction based on a petrophysical database.

    5.4 Synthetic data

    Before applying a new method to real data, the common practice in the development and testing of geophysical methods and algorithms is to use synthetic data in order to maintain full control. We use for simplicity the clean sandstone model (equation 5.6). We generate the three synthetic logs which have clear physical relationships to the fluid properties for input to the prediction, i.e. density, sonic and resistivity. The generation of these three synthetic logs is the same as discussed in section 3.1 for porosity prediction. In addition we include the neutron porosity log, which is an indicator of the abundance of hydrogen nuclei and a common indicator of whether hydrocarbons are in the gas or liquid phase.

    From the partial fluid saturation model of equation (5.3), and with φ from equation (5.4), we obtain the neutron porosity log from the relation


    φ_N = φ (S_w H_w + S_o H_o + S_g H_g) + (1 - φ) H_S    (5.10)

    where H_i is the hydrogen index of constituent i. For the different pore-fill components the hydrogen indices are given by H_w = 1.1 ρ_w, H_o = 1.1 ρ_o and H_g = 1.2 ρ_g, and for the rock material (sand) we use H_S = 0.001 (Schlumberger, 1989). The four synthetic input logs, density, sonic, resistivity and neutron porosity (ρ, Δt, R_t and φ_N), are shown in Figure 5.1 without noise (a) and with 10% noise added (b), respectively. Four different data sets, each consisting of 3000 samples, were created with independent random noise of 0, 2.5, 5 and 10% added to each log, including the saturations. Subsets consisting of 150 samples, or 5% of the total record, were then selected at regular depth intervals to use as training patterns for the neural networks.
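The synthetic neutron log, noise contamination and training-subset selection can be sketched as follows. The depth profiles and hydrogen-index values here are illustrative assumptions, not the thesis profiles (which follow the porosity-prediction model of chapter 3):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3000

# Illustrative porosity and saturation profiles along a 3000-sample record.
phi = 0.25 + 0.05 * np.sin(np.linspace(0.0, 20.0, n))
sw = np.linspace(0.2, 1.0, n)              # water saturation increasing with depth
sg = 0.5 * (1.0 - sw)                      # remainder split between gas and oil
so = 1.0 - sw - sg

# Hydrogen indices (assumed values, cf. equation 5.10).
hw, ho, hg, hs = 1.1 * 1.03, 1.1 * 0.75, 1.2 * 0.25, 0.001

phi_n = phi * (sw * hw + so * ho + sg * hg) + (1.0 - phi) * hs   # eq. (5.10)

# Add 10% relative random noise, then take every 20th sample as training
# patterns: 150 of 3000 samples, i.e. 5% of the record.
phi_n_noisy = phi_n * (1.0 + 0.10 * rng.standard_normal(n))
train = phi_n_noisy[::20]
```

Sampling at regular depth intervals, rather than randomly, mirrors the subset selection described above.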

    5.5 Optimal design of networks

    In the optimal design of the fluid saturation network, other than the two basic questions of 1) how much data and 2) how many neurons do we need, we are faced with the additional problem of determining the number of outputs; i.e.

    i) Is prediction of a single component, e.g. the water saturation Sw, sufficient?

    ii) Should we rather determine two outputs, e.g. Sw and Sg, and compute the third saturation So from equation (5.3)?

    iii) Would it be possible for a neural net to predict all three saturations simultaneously?

    iv) What is the optimal number of hidden neurons in cases i), ii) and iii), and what is the corresponding number of training patterns required?

    v) Should we use separate networks for each fluid component rather than multiple outputs?

    The answer to (i) may be that in the case of an oil field, water and oil are the main fluid components. We thus assume S_g = 0, and hence the oil saturation is given by S_o = 1 - S_w. A similar argument applies to a gas field away from a gas-oil transition zone. The problem arises, however, in the oil-water and gas-oil transition zones, where the components may be present in comparable proportions and the partial saturations of the three fluids must be known independently.


    In case (ii) we have better control, since prediction of two components provides the value of the third from equation (5.3). However, only two saturations are independent estimates, and hence there is no means of control by summing to unity.

    In case (iii) the three saturations are independently predicted and, moreover, equation (5.3) provides an independent control of the prediction quality, by the fact that the three output neurons should sum to unity within the estimated error. For the number of neurons in the hidden and output layers, the obvious relation exists that the number of outputs cannot exceed the number of neurons in the hidden layer; e.g. a single hidden neuron cannot simultaneously transfer non-trivial data to more than one output neuron. Thus, given the number of outputs, only the minimum number of hidden units is fixed, and the optimal number remains to be determined. The answer to (iv) is likely to be that the more outputs, the more hidden units are required to reach a performance comparable to that of the single output. More hidden neurons, on the other hand, imply a more complex network and hence more training patterns to achieve the goal. The optimal number of hidden units and the corresponding number of patterns needed remain to be determined by experiments. For this reason we exploit the model data, with added noise levels of 0, 2.5, 5 and 10%, to investigate the above questions in more detail. For training the networks we have selected subsets of 25-300 patterns, sampled at regular depth intervals, whereas for the error analysis all tests are made against the 3000 samples in the logs.

    With the four input logs and a single output of water saturation Sw, we find that with the minimum number of hidden units, which is one in the case of a single output, the network clearly fails to reproduce the model, as shown in Figure 5.2 and Figure 5.3. Using the same data (150 patterns) to fit a multiple linear regression model we find, on the other hand, that the model fit is much worse, even though the number of coefficients to fit is the same (four) for the two models. The assumption of linearity in MLR versus the embedded non-linearity in the neural network explains the difference. Moreover, by adding one more neuron to the hidden layer we gain significant improvement in favour of the neural network, in the noise-free data as well as in the case of 10% noise, as shown in Figure 5.3. The latter is also evident from Figures 5.4a and b, showing that the error drops significantly when the number of neurons changes from one to two. With a further increase in the number of neurons the bias still changes, but the standard deviation remains almost constant for the higher noise levels. By adding neurons beyond two to the hidden layer we, in general, still gain accuracy up to 4 hidden neurons, but thereafter the network becomes more sensitive to noise and the errors increase. With an optimal number of 4 hidden neurons the error of the Sw network is about 0.02, which is below the error level expected in practical situations. Similar results are obtained with two outputs, and with three outputs, as


    shown in Figure 5.5 and Figure 5.6. Here we find that the optimal number of hidden neurons is 8 and 10 in the case of 2 and 3 outputs, respectively.
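The MLR-versus-network comparison can be reproduced in miniature. The sketch below uses Python and plain gradient descent rather than the MATLAB neural network toolbox used in the thesis, and a one-input Archie-type curve (Sw versus Rt at fixed porosity) instead of the full four-log model, so it only illustrates the principle: a small MLP captures the curvature that a linear fit cannot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Archie-type curve: Sw = sqrt(F*Rw/Rt) with F = 10, Rw = 0.14 (phi fixed at 0.25).
sw = np.linspace(0.2, 1.0, 300)
rt = 1.4 / sw**2
x = (rt - rt.mean()) / rt.std()            # standardized single input

# Multiple linear regression baseline (exact least squares).
A = np.column_stack([x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, sw, rcond=None)
mlr_mse = np.mean((A @ coef - sw) ** 2)

# One-hidden-layer perceptron, 4 tanh units, full-batch gradient descent.
w1 = 0.5 * rng.standard_normal(4); b1 = np.zeros(4)
w2 = 0.5 * rng.standard_normal(4); b2 = 0.0
lr = 0.05
for _ in range(20000):
    h = np.tanh(np.outer(x, w1) + b1)      # hidden activations, shape (300, 4)
    pred = h @ w2 + b2
    err = pred - sw
    gw2 = h.T @ err / len(x); gb2 = err.mean()
    dh = np.outer(err, w2) * (1 - h**2)    # backpropagate through tanh
    gw1 = x @ dh / len(x); gb1 = dh.mean(0)
    w1 -= lr * gw1; b1 -= lr * gb1; w2 -= lr * gw2; b2 -= lr * gb2

nn_mse = np.mean((np.tanh(np.outer(x, w1) + b1) @ w2 + b2 - sw) ** 2)
```

Even on this toy problem the network's mean squared error falls well below the linear fit, mirroring the behaviour reported above for the four-input saturation networks.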

    For the model used in this study, the number of outputs thus seems to have minor effects on the accuracy, as shown in Figure 5.7 for water saturation (a) and gas saturation (b). For all fluid saturations the standard deviation for noisy data (10%) is almost constant at about 0.02 and, moreover, apparently independent of the number of output neurons. Once a sufficient number of well-represented training samples is available, the bias is negligibly small, as shown in Figure 5.8, and the standard deviation clearly becomes the leading term in the overall error for noisy data. With a single output the appropriate number of training facts is in excess of 100 samples; in general, however, the error still continues to decrease with an increasing number of facts. We have thus chosen 150 facts in the above experiments.

    A major assumption in network design is that a sufficient number of well-represented training samples is available. In many real situations, however, training samples are expensive or difficult to obtain. Since the problem complexity dictates the size of the network, we may end up with fewer useful facts than required for a network of a given size. The challenge in network design has thus been the small number of available data, and various methods have been proposed for generating virtual samples (Cho et al., 1997; Wong et al., 2000) to compensate for an insufficient number of training samples. For the problem at hand, the number of reliable estimates from fluid transition zones, with at least two fluids present and of known partial saturation, is normally the main limitation. Since the overall errors are the same for a simple network with one single output and four hidden neurons as for the more complex network with three outputs and 10 hidden neurons, the architecture that requires the minimum number of training patterns is preferred. An additional benefit of a simple network specialised for a particular fluid, besides the reduced training time, is the modularity that can be achieved when such a network constitutes a building block of a committee network, as will be demonstrated in the following section.


    Figure 5.1: Model output of synthetic logs for the saturation network: (a) noise free and (b) with 10% random noise added.

    Figure 5.2: (a) Comparison of Sw predicted by MLR and by neural networks with one and two neurons in the hidden layer, noise-free case; (b) differences from the model Sw.



    Figure 5.3: Comparison of Sw predicted by MLR and by neural networks with one and two neurons in the hidden layer, for 10% noise.

    Figure 5.4: (a) Bias and (b) standard deviation for a single output (Sw) versus the number of neurons in the hidden layer, for different noise levels.



    Figure 5.5: (a) Bias and (b) standard deviation for the network with two outputs (Sw and Sg) versus the number of neurons in the hidden layer, for different noise levels.

    Figure 5.6: (a) Bias and (b) standard deviation for the network with three outputs (Sw, Sg, So) versus the number of neurons in the hidden layer, for different noise levels.



    Figure 5.7: Errors in (a) Sw and (b) Sg versus the number of output neurons, with the optimal numbers of hidden neurons (4, 8 and 10 for 1, 2 and 3 outputs, respectively).

    Figure 5.8: (a) Bias and (b) standard deviation for one output (Sw) versus the number of training facts, for a network with four hidden neurons.
