
HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Chemical Technology

Fault Detection by Adaptive Process Modeling for Nuclear Power Plant

Jaakko Talonen

Master’s Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology.

Espoo, September 17, 2007

Supervisor: Professor Olli Simula

Instructors: M.Sc. (Tech.) Jukka Parviainen, D.Sc. (Tech) Miki Sirola

HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Chemical Technology
ABSTRACT OF MASTER’S THESIS

Author: Jaakko Talonen
Title of thesis: Fault Detection by Adaptive Process Modeling for Nuclear Power Plant
Date: September 17, 2007
Number of pages: 62 + 3
Chair: Computer and Information Science
Chair Code: T-115
Supervisor: Professor Olli Simula
Instructors: M.Sc. (Tech.) Jukka Parviainen, D.Sc. (Tech.) Miki Sirola

Abnormal events occur in nuclear power plants (NPP) as in any industrial process. Examples are a leakage in the pipe network, fouling of a heat exchanger, or a calibration error in a flow indicator. Such slowly developing events should be detected before something more serious happens.

In this work stored NPP process data was analyzed. The structure of an existing data mining process model was extended. The data was explored with a data management tool (DMT), which was programmed during this project. The DMT helps the user in time series data analysis. Relevant variables were selected and features were extracted for an adaptive model. Principal component analysis (PCA) was used as a data mining tool. Delays between process variables were detected with the cross-correlation function. The weighted recursive least squares (WRLS) method was used for adaptive modeling. The leakage detection method was based on the model estimation error.

The work was part of the NoTeS project and was carried out in the Laboratory of Computer and Information Science at Helsinki University of Technology. The project partner was Teollisuuden Voima Oy.

Keywords: data mining, PCA, variable selection, feature extraction, WRLS, fault detection


HELSINKI UNIVERSITY OF TECHNOLOGY (TEKNILLINEN KORKEAKOULU)
Department of Chemical Technology
ABSTRACT OF MASTER’S THESIS (DIPLOMITYÖN TIIVISTELMÄ)

Author: Jaakko Talonen
Title of thesis: Detection of fault situations in a nuclear power plant by adaptive modeling
Date: September 17, 2007
Number of pages: 62 + 3
Chair: Computer and Information Science
Chair Code: T-115
Supervisor: Professor Olli Simula
Instructors: M.Sc. (Tech.) Jukka Parviainen, D.Sc. (Tech.) Miki Sirola

Abnormal situations occur in nuclear power plants as in any industrial process. Examples are a leakage in the pipe network, fouling of a heat exchanger, or a measurement bias in a flow meter. Slowly developing events should be detected before something more serious happens.

This work studied process data stored at a nuclear power plant. The structure of an existing data mining process model was extended. The data was examined with a data management tool that was programmed during this project. The tool helps an expert in the analysis of time series data. Relevant variables were selected and features were extracted for an adaptive model. Principal component analysis was used as a data mining tool. Delays between process variables were identified with the cross-correlation function. The weighted recursive least squares method was used for adaptive modeling. The leakage detection method was based on the estimation error of the model.

The work was carried out as part of the NoTeS project in the Laboratory of Computer and Information Science at Helsinki University of Technology. The project partner was Teollisuuden Voima Oy.

Keywords: data mining, PCA, variable selection, feature extraction, weighted recursive least squares method, fault detection


Acknowledgements

This Master’s thesis was written in the Laboratory of Computer and Information Science at Helsinki University of Technology in 2006-2007.

I wish to thank my supervisor, Professor Olli Simula, for his guidance and patience. I would also like to thank both of my instructors, Jukka Parviainen and Miki Sirola, for sharing their experience with me. Thanks also to Golan and Tuomas for being excellent project partners.

The industrial partner in this project was Teollisuuden Voima Oy in Olkiluoto. I wish to thank Heidi Westerholm and Sami Asikainen for their co-operation and for answering my fuzzy questions concerning the nuclear process at Olkiluoto.

My gratitude goes to my friends, not forgetting the people in my past. They have helped me to understand how important it is to pick up the baton for myself.

I would also like to thank my family: my perfect parents, Markku and Elina, and my brothers, Juho and Mikko, for their supportive comments. This Master’s thesis shows that the youngest child can do it too.

Finally, kisses for Sanna, my biggest love, for keeping a wide smile on my face! Thanks for sending love messages all day long and for packing lunch for me. Goat cheese and raspberry salad is delicious!

Otaniemi, September 17, 2007

Jaakko Talonen


Contents

1 Introduction
    1.1 Research problem
    1.2 Objectives and scope
    1.3 Structure
2 Nuclear energy
    2.1 Nuclear power in Finland
    2.2 Operation of a boiling water reactor
        2.2.1 Reactor
        2.2.2 Main recirculation system
        2.2.3 Turbines, reheater and condenser
    2.3 Safety
3 Working with data
    3.1 Model construction
    3.2 Data structure
    3.3 Data sources
4 Data mining methods
    4.1 Data understanding
    4.2 Data preparation
        4.2.1 Filtering
        4.2.2 Difference
        4.2.3 Normalization
        4.2.4 Moving standard deviation
    4.3 Variable selection and feature extraction
        4.3.1 Principal component analysis
        4.3.2 Hotelling’s $T^2$ statistics
        4.3.3 Dynamically similar variables
        4.3.4 Identification of negative correlating variables
        4.3.5 K-means clustering
        4.3.6 Reduction of redundant variables
    4.4 Modeling of nonstationary process
        4.4.1 Delay and correlation
        4.4.2 Weighted recursive least squares
5 Experiments and results
    5.1 Data mining process model structure
        5.1.1 Phase 1: Preprocessing
        5.1.2 Phase 2: Variable selection
        5.1.3 Phase 3: Modeling
    5.2 Design based events
        5.2.1 Implementation of data management tool
        5.2.2 Database construction
        5.2.3 Data set selection
    5.3 Preprocessing
        5.3.1 Data set exploration
        5.3.2 Segmentation
    5.4 Variable selection and feature extraction
    5.5 Modeling
        5.5.1 Model 1
        5.5.2 Model 2
    5.6 Development of leakage detection method
        5.6.1 Simulated leakage
        5.6.2 Leakage index
        5.6.3 Leakage detection
6 Discussion
    6.1 Other ways to detect leakages
    6.2 Problems in data mining and modeling
    6.3 Future development ideas
7 Conclusions
A Illustrations
Bibliography


Abbreviations

BWR Boiling water reactor

CSS Computerized support system

DM Data mining

DMT Data management tool

EPR European pressurized water reactor

EWMA Exponentially weighted moving average

HPFW high-pressure turbine feed water piston position

IAEA International Atomic Energy Agency

LOCA Loss of coolant accident

MA Moving average

MSDV Moving standard deviation

NPP Nuclear power plant

PCA Principal component analysis

PSA Probabilistic safety analysis

RLS Recursive least squares

SOM Self-organizing map

STUK Radiation and Nuclear Safety Authority

TMI Three Mile Island

UCL Upper control limit

YVL Regulatory guides on nuclear safety

WRLS Weighted recursive least squares


Symbols

Preprocessing

$X$  process data in matrix form
$\mathrm{Conv}(h, f)(x)$  discrete convolution
$d(i)$  difference value
$v_i$  instance value
$w_i$  normalized value
$W_i$  globally scaled value
$s^2$  variance
$s_{MSDV}(i)$  moving standard deviation value
$C_H$  higher confidence limit
$C_L$  lower confidence limit

PCA

$S$  covariance matrix
$n$  number of measurements
$x_i$  $i$th measurement of sample vector $x$
$\bar{x}$  average of the measurements
$\lambda$  diagonal matrix with the corresponding eigenvalues on the diagonal
$\theta_i$  eigenvector
$\theta$  eigenvector matrix
$\Theta$  matrix containing the eigenvectors corresponding to the largest eigenvalues
$T^2$  Hotelling’s $T^2$ statistic
$H$  scores, the original data mapped into the new coordinate system defined by the principal components
$H_i$  score vector

WRLS

$e(k+1)$  estimation error
$y(k+1)$  dependent feature value
$\varphi(k+1)$  interpretative feature vector
$\Theta(k+1)$  model coefficient vector
$\lambda(k+1)$  forgetting factor
$\Omega$  constant, which is used to calculate $\lambda(k+1)$
$\sigma_v^2$  variance of estimation error in normal condition
$\lambda_{min}$  minimum value for forgetting factor
$\lambda_{max}$  maximum value for forgetting factor
$\gamma(k+1)$  updating vector
$P(k+1)$  confidence factor
$\alpha$  constant, which is used to calculate $P(1)$
$I$  unit matrix


Chapter 1

Introduction

Technology has developed extremely fast in the past decades, opening up new possibilities for the analysis of all types of data. Information is collected all around us, and it is no longer possible to analyze everything by conventional methods. Useful knowledge is hidden in stored data sets [1]. Industrial processes have to be developed further because of the increasing demand for production efficiency and safety. Better data mining (DM) methods help to reach these targets.

Most nuclear power plants (NPP) were built before multivariate data analysis methods became widespread. Engineers worldwide have improved processes by relying on process understanding and engineering skills. It is now possible to extract interesting information semiautomatically from process data without a comprehensive understanding of the system. Modern statistical methods have, however, already been used in probabilistic safety analysis (PSA), where the operational risks of an NPP are analyzed quantitatively [2].

Fault detection is an important research field, because repair shutdowns can be avoided by condition-based maintenance. The traditional approach to fault detection is to check variables against low and high limits [3]. Usually the measurements, or some simple features extracted from them, are merely monitored in the control room of the industrial process. When DM is used to explore process information, monitoring can be upgraded with new useful features found from the process. A computerized support system (CSS) can be constructed from the results of DM. The monitored information is then presented in a more informative form, even in linguistic form [4]. Usually this information is shown on the common monitor of the control room. For example, an alarm can be a notification that a process measurement has exceeded its lower or upper limit. A more informative form for leakages would be a numerical value, which could be determined from the model error or from a deviation in the measurements.
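The contrast between fixed limit checking and a numerical value derived from the model error can be illustrated with a minimal sketch. The function names, the limits and the data values below are hypothetical and chosen only for illustration; the mean absolute error is an assumed stand-in for a fault index, not the leakage index developed later in this thesis.

```python
from typing import Sequence


def limit_alarm(value: float, low: float, high: float) -> bool:
    """Traditional check: alarm when a measurement leaves [low, high]."""
    return value < low or value > high


def residual_index(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute model error, used here as a simple numerical fault index."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


# Hypothetical flow measurements (kg/s) drifting slowly away from the model.
measured = [100.0, 99.5, 99.0, 98.5, 98.0]
predicted = [100.0, 100.0, 100.0, 100.0, 100.0]

# No sample violates the fixed limits, so limit checking stays silent ...
alarms = [limit_alarm(v, low=95.0, high=105.0) for v in measured]
# ... while the residual index already quantifies the slow drift.
index = residual_index(measured, predicted)
```

In this toy case the drift stays inside the alarm band, yet the index is clearly nonzero, which is the kind of early numerical indication the text argues for.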

A human operator supervises, and also controls, those parts of the process that are not automated. In practice operators project the process measurements to a single point in a state space. They are more concerned about the current operational status than about the instantaneous values of specific variables [5]. Early detection of faults is critical in avoiding degradation of the system or harm to human health [3].

Fault detection means determining that a problem has occurred or that something is going wrong in the monitored system, whereas fault isolation pinpoints the exact cause and location. Fault identification is the determination of the magnitude of the fault [3]. For example, fouling of heat exchangers is a slowly developing fault that is hard to detect. One of the main difficulties in detection is the compensating effect of feedback control, which reduces the visible effect of small, slowly developing faults [6].

A fault does not have to be an abnormal state in the process; sensor failures are also important to detect. Sensor failure or degradation, such as a calibration error, can cause process disturbances, loss of control, loss of profit or catastrophic accidents [7]. In process control, 60% of the perceived malfunctions in a plant are consequences of a lack of credibility of the sensor data [8].

Improving the control and supervision methods has become more important. The operator should be notified only by alarms that are relevant in the context of the process state. For example, in a scram¹ situation 250 alarms appear directly after the fault, but only five of them are of interest to the operator [9]. This example motivates the search for new computed or modified features that describe the process status in a quantitative or qualitative way. With new fault detection methods, some old-fashioned redundant alarms can be removed from the system.

This Master’s thesis is part of the NoTeS project (Nonlinear temporal and spatial forecasting: modeling and uncertainty analysis). The project develops a generic tool set for spatiotemporal forecasting and forecast uncertainty analysis. It consists of five test cases, one of which is supporting operational decisions at an NPP [10]. This test case applies neuro-computing, in particular the Self-Organizing Map (SOM), to decision support in the control room of the NPP.

1.1 Research problem

NPP process data has to be analyzed. What happened in the stored data sets? Is it possible to detect faults by multivariate data analysis methods? Is it possible without extensive process knowledge? These were the problem statements of this thesis.

TVO provided data sets and the possibility to use the training simulator for one day. During the project, however, the objectives had to be changed because of problems in the data. The data sets last only a few minutes, so they were not well suited to research on slowly developing faults. Therefore, one existing DM process model structure was extended.

¹ The scram is an emergency shutdown.


1.2 Objectives and scope

It is important to take corrective action well before the onset of a dangerous situation, especially at an NPP. Catastrophic faults can be detected by observing time series or fixed alarm limits. For this reason the goal of this work was the early detection of slowly developing faults by producing new monitored features. To avoid serious consequences, incipient failures should be detected early enough [6]. Effective fault diagnosis and the detection of slowly developing faults play a key role in this work.

Operators have stored data sets of design based and abnormal events. This work extends the analysis of these design based events. A DM project is designed and a data management tool (DMT) is programmed. Statistically significant variables are detected and a data set report (tables and figures) is created. DM methods and time series modeling are combined to detect abnormal process events. The model and the fault detection method are created with the DMT. It helps the user to perform semiautomatic analysis of Olkiluoto NPP data. It has a simple interface for analyzing off-line data and it can be used for developing process models. It is implemented in Matlab² and intended for expert users.

1.3 Structure

Chapter 2 explains the operation of the boiling water reactor (BWR) and gives some background on energy production. Chapter 3 explains the reference model for the DM process. Chapter 4 explains the DM methods, such as principal component analysis (PCA) and weighted recursive least squares (WRLS).

Chapter 5 explains the structure of the data analysis and analyzes one stored data set. The variables of the data set are preprocessed. Then the variable selection, feature extraction and modeling phases are introduced. A detection method for a slowly developing leakage is created. The results are visualized and analyzed during the DM experiments. The analysis and the development of the fault detection method are performed off-line. Discussion and ideas for further development are in Chapter 6. Conclusions and final remarks are in Chapter 7.

² Matlab is short for "matrix laboratory". It is a numerical computing environment and programming language.

Chapter 2

Nuclear energy

Nuclear fission was discovered in 1939. The first chain reaction was achieved in 1942 at the University of Chicago as part of the Manhattan Project. About ten years later development focused mainly on technologies for civilian electricity generation. The emphasis in the USA was on the pressurized water reactor (PWR) and boiling water reactor (BWR) designs. Reactor development in the USSR focused on the same designs and also on graphite-moderated designs.

The role of nuclear power has varied over the last decades. Four areas are considered when selecting the type of energy production: economics, safety, waste management and risk management [11].

At the end of the 1960s the USA guaranteed a fixed price for nuclear energy, which made it clearly competitive with coal- and oil-fired alternatives. The US Atomic Energy Commission foresaw nuclear power as a very respectable choice for energy production. The number of reactors under construction globally increased yearly until 1979, when the Three Mile Island (TMI) accident happened.

According to the International Atomic Energy Agency (IAEA), the TMI accident was a significant turning point in nuclear power production. Seven years later the fourth reactor of the Chernobyl NPP exploded, and support for nuclear power fell across the world. Following these events the number of reactors under construction declined every year from 1980 to 1998 [12]. If something good is to be found in these accidents, it is that they prompted governments to invest in the development of nuclear production and to pay attention to safety regulations. TMI and the progress of digital technology brought computers into the control room in an attempt to make proper process monitoring possible [13].

Worldwide, the direction of energy production now seems to be the same as before the accidents. Many countries, such as Japan and China, remain active in developing nuclear power. Further reasons for nuclear energy are the increasing demand for energy, fossil fuel prices and global warming.


2.1 Nuclear power in Finland

In Finland four nuclear power plant units generate about a quarter of the annual electricity demand [14]. Two of the reactors are in Olkiluoto¹; in 2006 their annual energy production was 14 268 GWh and their average capacity factor was 95.8% [15]. Net energy imports were 20% in 2005 [15]. A fifth reactor, a European Pressurized water Reactor (EPR), is under construction. This year (2007) there has been discussion in the media about a sixth and even a seventh nuclear power plant.

Nuclear power plants are classified by the features of the reactor. The Olkiluoto units are boiling water reactors (BWR), one of the most common reactor types. Like any thermal power plant, a nuclear power plant consists of a steam supply and a generator. Steam produces mechanical work in the turbine, which drives the generator. The main difference between a conventional thermal power plant and a nuclear power plant is the heat source.

Nuclear energy is produced by two reactor units in Olkiluoto. Teollisuuden Voima Oy² has operated reactor 1 (OL1) since 1978 and reactor 2 (OL2) since 1980. Today (2007) each reactor has a net capacity of 860 MW after two upgrades. The operating reliability is about 95%, which is higher than the average capacity factor of any other nuclear power plant in the world. Both reactors generate electricity at a voltage of 20 kV. A transformer raises the voltage to 400 kV to minimize transmission losses before the electricity is fed to the national grid [15].
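Why a higher voltage reduces transmission losses can be shown with a small worked calculation. The sketch below assumes a simplified resistive line model (loss equal to current squared times resistance) and a hypothetical line resistance of 1 ohm; only the 860 MW, 20 kV and 400 kV figures come from the text.

```python
def line_loss_w(power_w: float, voltage_v: float, resistance_ohm: float) -> float:
    """Resistive loss I^2 * R when power P is transmitted at voltage V (simplified model)."""
    current_a = power_w / voltage_v
    return current_a ** 2 * resistance_ohm


POWER_W = 860e6   # net capacity of one unit, from the text
R_OHM = 1.0       # hypothetical line resistance, for illustration only

loss_20kv = line_loss_w(POWER_W, 20e3, R_OHM)
loss_400kv = line_loss_w(POWER_W, 400e3, R_OHM)

# The loss ratio depends only on the voltage ratio: (400 / 20)^2 = 400.
ratio = loss_20kv / loss_400kv
```

Under these assumptions, raising the voltage by a factor of 20 lowers the current by the same factor and the resistive loss by a factor of 400, independent of the assumed resistance.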

Nuclear power is used to meet the base load electricity demand in Finland. The annual outage is carried out at the beginning of summer, when the supply of hydroelectric power is at its greatest. During the annual outage the plant is inspected; the outage lasts 7-8 days. Every other year a reactor is refueled with new uranium fuel assemblies. About 25% of the fuel assemblies are then changed and the outage lasts two to three weeks [14].

2.2 Operation of a boiling water reactor

The main components of a BWR are the reactor and its internals, together with the primary process systems, turbine, generator, control systems and electrical systems. Energy is generated by fission reactions in the reactor core. Compared with a conventional thermal plant, the reactor core is equivalent to the furnace of the boiler, where the combustion of organic fuel releases energy [16].

The design of the BWR is shown in Figure 2.1. Steam is raised in the core, then separated and dried before it enters the turbine. The thermal output is controlled by moving a group of control rods up and down. The reactor power can also be varied within certain limits by controlling the recirculation flow. Continuous operation is possible from 25% of nominal power. Below 65% the power is controlled by the control rods. Normally the power is controlled by varying the speed of the main recirculation pumps. In the normal operating mode the generated electrical power is maintained at a preset value.

[Figure 2.1: Simplified process chart of the BWR type nuclear power plant. Labeled components: reactor, high-pressure turbine, reheater, low-pressure turbine, sea water condenser, generator, transformer, transfer network.]

¹ Olkiluoto is an island located on the west coast of Finland, in the municipality of Eurajoki.
² Teollisuuden Voima Oy (TVO) was founded in 1969 by industrial companies.

2.2.1 Reactor

The core, control rods, steam separators, steam driers, main recirculation pumps and the nozzles for steam and feedwater are situated in the reactor vessel. All major pipe nozzles are located above the top of the core. The core consists of hundreds of vertical uranium fuel assemblies. Each reactor has 500 assemblies, which are arranged in a square pattern [14]. The fuel assemblies are supported by the core grid. The chain reaction heats water into high-pressure steam [16].

The reactor vessel cover has been simplified: all external pipe connections have been eliminated by making the connections inside the reactor [14]. The reactor is designed for a pressure of 8.5 MPa. The operating pressure is about 7.0 MPa, at which water boils at a temperature of 286 °C [15]. Ordinary purified water is used for cooling and moderation.

2.2.2 Main recirculation system

Incoming feed water (about 1260 kg/s at a temperature of about 185 °C) [14] is mixed with the water coming from the steam separators. The main recirculation pumps force the water through the reactor core. In Olkiluoto there are four main steam lines with a diameter of 60 cm and six internal main circulation pumps. This type of pump reduces the risk of leakage, because fewer major pipe connections are needed in the reactor vessel [16].


2.2.3 Turbines, reheater and condenser

Live steam flows from the reactor to the high-pressure turbine, where it yields about 40% of its useful energy [16]. The moist steam has to be reheated after the high-pressure turbine, so a reheater is placed before the four low-pressure turbines. The steam is heated first with steam bled from the high-pressure turbine and then with steam from the main steam lines. The reheaters were replaced during the annual outage of 2006. The new reheaters, 24 meters long and weighing 220 tons, are located on both sides of the turbine [17].

The low-pressure steam is reheated for two reasons. First, the reheater separates water from the steam, because moist steam causes equipment problems that increase maintenance costs and downtime. Second, reheated steam is less radioactive.

The steam can be bypassed directly into the condenser. The bypass is used during plant startup and shutdown and in the event of turbine load loss. The live steam has a pressure of 67 bar and a temperature of 283 °C. The turbine cylinders are of double axial-flow design. The turbine is connected to the generator so that the generator is located outside the radiation shield of the turbine [14].

The sea water cooled condenser is divided into two shells, one for each pair of low-pressure turbines. The condenser is a large shell-and-tube heat exchanger. The condensate is pressurized, preheated and delivered to a storage tank. The condensate pump sets consist of electrically driven units [16].

2.3 Safety

Nuclear waste and nuclear materials are under strict national and international control. The operation of the plant is tightly governed by law and by the Radiation and Nuclear Safety Authority (STUK). STUK formulates nuclear safety requirements and regulates the use of nuclear power plants in Finland. The requirements are described in the Regulatory Guides on nuclear safety (YVL) [18].

One of the principal purposes of safety is to minimize the probability of a radioactive release. The probability must be very low. Safety is therefore based on multi-level defense: all functions that are critical to safety are provided with redundant systems and equipment. Many process measurements have four redundant instruments. Redundancy is also applied to increase the preparedness of the safety systems. Even operator errors or several equipment failures cannot alone cause a catastrophic accident [16]. The nuclear plant also has an independent emergency organization, which is tested during annual outages [14].

Experience since the TMI disaster has shown that the technical equipment in itself can be made very safe. Operator error has proved to be a dominant factor in causing system faults. Decisions are made under stress and therefore hastily, and an operator may neglect to initiate the required safety functions. In the TMI case wrong decisions were made because the operators were overwhelmed with information, most of which was irrelevant, misleading or incorrect. After the disaster some improvements were made in the control room: habitability, the visibility of instruments, ambiguous indications and the placement of alarm lists were changed [12].

Operator training is an important part of NPP safety. The Olkiluoto NPP has a simulator, which is an almost identical copy of the real plant. It is used both for training and for developing the process. Quality assurance, engineering, operational surveillance and emergency planning have also been improved.

Chapter 3

Working with data

This chapter describes the general structure of data analysis. The steps of a data mining (DM) project should be clear from start to end. In this work DM is based on the CRoss Industry Standard Process for Data Mining (CRISP-DM) reference model [19]. It is a hierarchical process model created in 1996 by a consortium of some of the major companies in the data mining industry, such as SPSS [20]. It provides an overview of the life cycle of a DM project.

3.1 Model construction

The CRISP-DM model shown in Figure 3.1 consists of six phases: business understanding, data understanding, data preparation, modeling, evaluation and deployment. In the business understanding phase the data miner becomes familiar with the problem domain and studies the data that is interesting and relevant to the problem. Data understanding starts with finding the data: what information is available, how it can be accessed, and how the available data can be combined. The data is explored and summarized with simple descriptive statistics, and its quality and reliability are inspected. The aim of data preparation is to make better models by modifying the data or extracting new features. After modeling and evaluation the problem is almost solved, but the created model is not the end of the project. The solution should be applied to the real process in the deployment phase, so that the DM process increases the knowledge of the system.

[Figure 3.1: CRISP-DM reference model and its modification for NPP data. Labels in the figure: DATA, GOAL, DM skills, Process knowledge, Phase 1, Phase 2, Phase 3, Data understanding, Process understanding, Data preparation, Feature selection, Modeling, Evaluation, Deployment, Implementation.]

The model was customized for the goal of this Master’s thesis. The modified model is shown on the right in Figure 3.1. It is divided into three phases: preprocessing, variable selection and modeling. The model is implemented as the data management tool (DMT). The outer circle represents the progress of DM, which is an iterative and interactive process that cannot be fully automated. The arrows symbolize the most important dependencies between the phases.

The blocks of the customized model differ from those of the CRISP-DM model. The business understanding phase was removed, because funding had already been granted for this project. Part of data understanding is split off into a process understanding phase, in which related variables are searched for. In the variable selection phase relevant variables are found and features are extracted for modeling. After the model has been evaluated, it is used to create a fault detection method. At the end of the DM project the results for the data set should be used for process development. In addition, the fault detection method should be implemented in practice at the plant.

3.2 Data structure

Different types of data are measured at the Olkiluoto plant. The automation system stores the process data in relational databases, from which the data is retrieved with SQL queries. The data sets for DM were delivered on compact discs. All data sets are time series of numerous signals.

Data is stored in sample vectors

\[
\mathbf{x}_j = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_i \\ \vdots \\ x_n \end{bmatrix}, \tag{3.1}
\]

where $x_i$ is a process measurement and $n$ is the data set length. These vectors are transformed


to a data matrix

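\[
\mathbf{X} = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} \\
x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}, \tag{3.2}
\]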

where $k$ is the number of signals in the data set. Process measurements are in the columns. Linguistic information such as variable names, explanations and process units is stored in other vectors.

3.3 Data sources

Data types at Olkiluoto NPP are:

• Simulator data: Data can be created and stored by the simulator, which is an almost exact copy of the real plant. Data is stored in XLS format at a frequency of 1 Hz. The data does not contain random noise.

• Process data: Data is stored by the process computer. All process measurements have four redundant devices, so the data stored in the system database is error-free. Data set length depends on the stored data frequency.

• High frequency data sets: These are abnormal and design based events stored with a frequency of 10 to 100 Hz. Data set length is a few minutes. Variables are selected by the operator.

Data frequency varies between the data sets. Data is stored in a separate database, shown in Figure 3.2. In this work only high frequency data sets were used. Statistical properties such as mean, standard deviation, maximum and minimum values are collected from each case and aggregated to the database. The database information is used in the preprocessing phase for normalization.

Figure 3.2: Database is made up of different data sources. [Diagram labels: Simulator data, Process data, Stored high frequency data sets, Database construction, Database.]

Chapter 4

Data mining methods

In this chapter the methods and algorithms of this work are described. Section 4.1 describes methods and ideas concerning data and process understanding. In section 4.2 data is prepared for data mining (DM). Variable selection and feature extraction by principal component analysis (PCA) are introduced in section 4.3. At the end of this chapter the structure and theory of adaptive modeling are described.

4.1 Data understanding

The data understanding phase gives a description of the data and its characteristics, and leads to interesting data subsets for further examination. It is divided into several tasks [19]:

• Collect initial data: Recorded data sets have to be converted into a form suitable for data analysis.

• Describe the data: The number of variables and observations is examined. Qualitative information about the recorded data sets is studied. Does the data satisfy the requirements, and what are the goals of the data mining process? If so, how can the hidden patterns be found, and how can they be used in practice once they are found [21]?

• Explore the data: Exploring data means visualization and aggregation of data. Properties are visualized using simple statistical analyses [19]. Using data quality reports and graphs, data subsets can be created depending on the further data analysis methods.

Visualization is possible once the data set is in the right form. Data is represented in the form of a table [21]. Table columns represent the variables, and rows represent the records. This matrix representation has become standard. In an industrial process the matrix contains a huge number of rows and columns. Examination of the matrix without spreadsheets, graphs and


plots is not possible. Data is aggregated to spreadsheets by calculating basic properties of the variables, such as mean, median, standard deviation, average transitions, minimum and maximum values. Useful visualizations are time series plots, scatter plots, boxplots and histograms. Variables with too few transitions are excluded from further analysis, because the selected methods do not work well with stable values. Variables with missing data or outlier observations should be deleted, or realistic values should be set for these data points [21].

Common visualization methods are time series plots of variables, histograms and scatter plots [22]. Time series plots are used to visualize variable changes in time. A scatter plot is a useful tool to represent the relationship between two metric variables [23]: each observation is visualized in a two-dimensional graph. Boxplots and histograms are used to represent the distribution of a variable. A boxplot shows the minimum, the maximum and the major portion of the distribution. The mean and median values of the variable are also shown in the graph. Outliers can be identified from a box-and-whisker diagram (boxplot) when data values are separated into groups. The lines extending from the box (the whiskers) represent the distance to the smallest and the largest observations that are less than one interquartile range from the box [23]. Figure 4.1 shows the boxplot for coolant flow rate with the length of the whiskers specified as 1.0 times the interquartile range. Outliers, i.e. points beyond the whiskers, are displayed using ’+’. The lower boxplot contains all measurements of coolant flow rate. In this data set the process state changed, which is why so many values are identified as outliers. The data should be segmented into two parts to ensure success in the modeling phase.

Figure 4.1: (top) Segmented coolant flow rate boxplot. (bottom) Original values of coolant flow rate illustrated by boxplot. Boxplot shows minimum, maximum and major portion of distribution. [Plot labels: horizontal axis "Values" from 6600 to 8000; rows "two states" and "segmented (one state)"; variable "Coolant flow rate".]

4.2 Data preparation

There are no clear rules for performing data preparation or preprocessing. Results obtained with different preprocessing choices should be compared. In the CRISP-DM model [19] the data preparation tasks are:

• Cleaning: Process noise is removed by filtering the data.

• Constructing: In this work some features, such as difference and moving standard deviation (MSDV), are derived from the measurements.

• Integrating: Some variables are grouped together. For example, four redundant temperature measurements can be integrated into one feature by using the mean value of these measurements.

• Formatting: In this work the multivariate methods require normalized or ratio scale data, so nominal¹, ordinal² and some interval³ data vectors are deleted from the data set [24].

4.2.1 Filtering

Filtering is an operation that reduces the effect of measurement noise. Convolution defines the input-output relation of a linear time-invariant filter [25]. Continuous linear convolution is defined as

\[
g(x) = \int_{U(x)} h(x - x')\, f(x')\, dx', \tag{4.1}
\]

where $U(x)$ is the neighborhood of point $x$, $h(x - x')$ is the filter and $f(x')$ is the measurement value at point $x'$. Discrete convolution is defined as

\[
\mathrm{Conv}(h, f)(x) = \sum_{k=0}^{N-1} h(x - k)\, f(k), \tag{4.2}
\]

where $h(x - k)$ is the response of the system to the delayed unit sample $x - k$. The moving average (MA) filter is a linear filter and can be described using convolution.

Other filtering methods for data vectors also exist. Median filtering [24] is useful for deleting deviating values from the data set. In the exponentially weighted moving average (EWMA) method [26] a weighting parameter determines how much older data points affect the mean value compared to more recent ones.
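The moving average and EWMA filters described above can be sketched in Python as follows. This is only an illustration and not the DMT implementation: the function names, the causal window and the padding of the first samples with the initial value are assumptions made for the example.

```python
def moving_average(signal, frame):
    """Causal moving average written as a discrete convolution.

    Each output is the mean of the current sample and the previous
    frame - 1 samples. Samples before the start of the signal are
    replaced by the first value (an assumption of this sketch).
    """
    padded = [signal[0]] * (frame - 1) + list(signal)
    kernel = [1.0 / frame] * frame  # equal filter weights h(k)
    return [
        sum(kernel[k] * padded[i + frame - 1 - k] for k in range(frame))
        for i in range(len(signal))
    ]


def ewma(signal, alpha):
    """Exponentially weighted moving average.

    alpha (hypothetical name) is the weight of the newest sample;
    a smaller alpha gives older data points more influence.
    """
    out = [signal[0]]
    for x in signal[1:]:
        out.append(alpha * x + (1.0 - alpha) * out[-1])
    return out
```

A larger `frame` or a smaller `alpha` smooths more strongly but also delays the filtered signal more.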

4.2.2 Difference

One of the derived attributes in this work is variable difference, which is used to measure

the rate of change

\[
d(i) = \frac{x(i) - x(i - N)}{N}, \tag{4.3}
\]

where $x(i)$ is a preprocessed measurement value and $N$ is the frame size. The difference is a high-pass filter that extracts changes in the signal.

Example difference values with $N = 3$ are shown in Table 4.1 in section 4.2.4. In practice the frame size parameter is selected depending on data set frequency and noise.
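Equation (4.3) can be checked against the signal column of Table 4.1 with a short Python sketch. The function name is hypothetical, and padding the start of the signal with its first value is an assumption; with that assumption the sketch reproduces the Difference column of the table.

```python
def difference(signal, frame):
    """Rate of change d(i) = (x(i) - x(i - N)) / N with frame size N.

    Values before the first sample are taken to be equal to the first
    sample (an assumption of this sketch).
    """
    padded = [signal[0]] * frame + list(signal)
    return [(padded[i + frame] - padded[i]) / frame for i in range(len(signal))]


signal = [0.10, 0.10, 0.11, 0.12, 0.15, 0.18, 0.11]
# Rounded to four decimals this matches the Difference column of Table 4.1.
print([round(d, 4) for d in difference(signal, 3)])
```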

¹ Nominal data is classification data, e.g. m/f.

² Ordinal data is ordered, but differences between values are not important.

³ Interval scale data has no true zero point, and therefore data values are not proportional, e.g. temperature [°C].


4.2.3 Normalization

Most multivariate methods do not work properly if the variables are not normalized or scaled. A scaling method often used in academic papers is normalization to zero mean and unit variance [27], [28], [29].

Comparison of the variables makes sense after scaling the data, even if the original units

are not the same. Proportional values are more important than original values, because

in multivariate data analysis correlations and changes of dynamics are analyzed. Scaling

does not change the meaning of data [19].

The zero mean and unit variance normalization method has several problems. First, some variables have no real changes in time, only noise. Normalization to unit variance increases the proportional effect of this noise and creates an unwanted effect in the model. Second, the result is not very reliable if only one data set is used for scaling, because the data set can be just an exception. Variables in the data set should be normalized both across range and in distribution [21].

Scaling is done by the linear scaling transform [21]. A range scaled value $w_i$ is

\[
w_i = \frac{v_i - \min(v_1 \ldots v_n)}{\max(v_1 \ldots v_n) - \min(v_1 \ldots v_n)}, \tag{4.4}
\]

where $v_i$ is an instance value.

The global minimum and maximum values should be available. Process measurements can then be scaled into the range zero to one. In this work all available data sets are explored, and the minimum and maximum values of each variable are stored in the database of the data management tool (DMT).

A globally scaled value $W_i$ is

\[
W_i = \frac{v_i - v_{\min,i}}{v_{\max,i} - v_{\min,i}}, \tag{4.5}
\]

where $v_{\max,i}$ is the maximum value and $v_{\min,i}$ is the minimum value of the variable in the database.

After range scaling the data is converted to zero mean, because this is required by the multivariate data analysis methods applied later. An example of zero mean and unit variance normalization and of range scale normalization for the same data vectors is shown in Figure 4.2.

The data set contains two data vectors. In reality variable 2 (v2) does not vary as much as Figure 4.2 (a) suggests. The variation is much smaller when scaling is done by the database minimum and maximum values, see Figure 4.2 (b).

Out-of-range values

Variable measurements in a new data set can include values higher or lower than the limit values in the database. A globally scaled out-of-range value $W_i$ is changed by

\[
W_i < 0 \Rightarrow W_i = 0, \tag{4.6}
\]


Figure 4.2: A comparison between (a) the zero mean and unit variance and (b) range scale normalization. [Panel titles: "Zero mean and unit variance" and "Range scale"; both panels show v1 and v2 over a horizontal axis from −200 to 600.]

or

\[
W_i > 1 \Rightarrow W_i = 1. \tag{4.7}
\]

From scaled data it is simple to recognize values that are higher or lower than those in the database. The user should be careful about trusting the analysis if the scaled data has many limit values (0 or 1). In practice this means that the process is not in a normal operating state or that the database does not yet contain enough process history. If the out-of-range values come from a design based event, the user should create a new database.

If there are not enough data sets to create the database, the minimum and maximum values can be allowed to be exceeded slightly by adding a gap to the database limits. The size of the expected out-of-range gap is directly proportional to the degree of confidence of out-of-range values. For example, with a confidence level of 98% the range becomes 0.01 to 0.99. The new database minimum and maximum values are derived as

\[
\min(W) := \min(W) - 0.01 \cdot |\max(W) - \min(W)|, \tag{4.8}
\]

and

\[
\max(W) := \max(W) + 0.01 \cdot |\max(W) - \min(W)|. \tag{4.9}
\]
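The global scaling of equation (4.5), the clipping of out-of-range values in (4.6) and (4.7), and the widening of the database limits in (4.8) and (4.9) can be combined into one small Python sketch. The function name and the `margin` parameter are hypothetical; `margin=0.01` corresponds to the 98% example, and the database limits are assumed to satisfy `v_max > v_min`.

```python
def global_scale(values, v_min, v_max, margin=0.0):
    """Scale values with database limits and clip out-of-range results.

    v_min and v_max are the database minimum and maximum of the variable.
    margin widens both limits by a fraction of the range, as in equations
    (4.8) and (4.9); margin=0.0 gives plain global scaling.
    """
    span = abs(v_max - v_min)
    low = v_min - margin * span
    high = v_max + margin * span
    scaled = [(v - low) / (high - low) for v in values]
    # Values below 0 are set to 0 and values above 1 are set to 1.
    return [min(1.0, max(0.0, w)) for w in scaled]


# 12 and -3 lie outside the database limits 0 and 10 and are clipped.
print(global_scale([0, 5, 10, 12, -3], 0, 10))
# With a 1 % gap the old limits map close to 0.01 and 0.99.
print(global_scale([0, 10], 0, 10, margin=0.01))
```

Counting how many scaled values are exactly 0 or 1 gives a simple indication of how far a new data set falls outside the process history in the database.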

Before scaling, the changing of nonlinear units should be considered. For example, changing grams to kilograms is not worthwhile. Changing the pressure unit [barg]⁴ to [bar] would affect the analysis slightly:

\[
x\,[\mathrm{barg}] = (1 + x)\,[\mathrm{bar}]. \tag{4.10}
\]

⁴ Gauge pressure (barg). For example, when Volkswagen Passat tires are said to be pressurized to 1.9 bar, this actually means bar gauge: the pressure in the tire is really 2.9 bar.


Table 4.1: Example values of derived features.

Time   Signal   MSDV     Difference
1      0.1000   0.0000    0.0000
2      0.1000   0.0000    0.0000
3      0.1100   0.0071    0.0033
4      0.1200   0.0071    0.0067
5      0.1500   0.0212    0.0167
6      0.1800   0.0212    0.0233
7      0.1100   0.0495   -0.0033

4.2.4 Moving standard deviation

The moving standard deviation (MSDV) is a common measure of statistical dispersion, measuring how widely values are spread in time. If the data points are far from the mean, the MSDV values are large. If all the data values are equal, the MSDV is a null vector. MSDV is derived from the sample variance [25]

\[
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \tag{4.11}
\]

where $x_i$ is a sample from an infinite population and $\bar{x}$ is the average value. The MSDV value is defined as

\[
s_{MSDV}(i) = \sqrt{\frac{1}{N - 1} \sum_{k=0}^{N-1} \left( x(i - k) - \bar{x}(i) \right)^2}, \tag{4.12}
\]

where $N$ is the frame size and $\bar{x}(i)$ is the MA defined in equation (4.2).

MSDV values for preprocessed samples are illustrated by an example in Table 4.1, where the frame size $N$ is two. In practice MSDV may serve as a reliability indicator, but it makes no predictions of the process direction. For example, the signal values at t = 3 and t = 7 are the same, but the MSDV values differ.
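The MSDV of equation (4.12) can be sketched in Python as below. The function name is hypothetical, and the start of the signal is padded with its first value, which is an assumption; with frame size two this reproduces the MSDV column of Table 4.1.

```python
import math


def moving_std(signal, frame):
    """Moving standard deviation over the last `frame` samples (frame >= 2).

    The window mean plays the role of the moving average in (4.12).
    Samples before the start are replaced by the first value
    (an assumption of this sketch).
    """
    padded = [signal[0]] * (frame - 1) + list(signal)
    result = []
    for i in range(len(signal)):
        window = padded[i:i + frame]
        mean = sum(window) / frame
        variance = sum((x - mean) ** 2 for x in window) / (frame - 1)
        result.append(math.sqrt(variance))
    return result


signal = [0.10, 0.10, 0.11, 0.12, 0.15, 0.18, 0.11]
# Rounded to four decimals this matches the MSDV column of Table 4.1.
print([round(s, 4) for s in moving_std(signal, 2)])
```

The values at t = 3 and t = 7 illustrate the remark above: the signal is 0.11 at both times, but the MSDV is about 0.0071 and 0.0495 respectively.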

In this work volatility is the sum of the MSDV vectors. This feature is used for rough guesses to recognize state changes in the system. Different states are segmented into sequences, because this reduces the complexity of the representation of the original data. More developed segmentation methods are introduced in [30]. A high peak in the volatility can mean a system state change, and in these situations operators should pay more attention to process monitoring. These peaks are not recognized automatically, and the user has to make the segmentation decision from the MSDV plot.

These results are also used as confidence limits in time series plots. The idea is the same as in Bollinger Bands [31], which are used in the stock markets. The bands provide a visual channel of upper and lower bounds. The indicator is composed of three basic curves. The middle curve is the MA. The higher curve $C_H$ is two standard deviations (MSDV) above the middle curve,

\[
C_H(x) = \mathrm{Conv}(f, g)(x) + 2 \cdot s_{MSDV}(x), \tag{4.13}
\]

and the lower curve $C_L$ is two standard deviations below the middle curve,

\[
C_L(x) = \mathrm{Conv}(f, g)(x) - 2 \cdot s_{MSDV}(x), \tag{4.14}
\]

where $\mathrm{Conv}(f, g)(x)$ is the convolution with a linear filter and $s_{MSDV}(x)$ is the value of the moving standard deviation.

4.3 Variable selection and feature extraction

In an industrial process there can be hundreds or even thousands of signals stored in the database. How should relevant variables be selected for further analysis? Manual selection is not possible because of the high dimensionality of the system, so variables are selected automatically when there is a large number of process signals [29]. Selection rules are designed according to the purpose. In this work statistically significant variables are selected for a data set report. It is assumed that these variables contain information relevant to the user. Suitable variables for modeling are also selected.

4.3.1 Principal component analysis

Principal component analysis (PCA) is a useful tool for finding relevant variables for the system and model. It is a linear transformation to a new, lower dimensional coordinate system that retains as much of the variation as possible. Let $\mathbf{x}$ be a data matrix representing $n$ observations of each of $m$ variables. The mean of this matrix is $\bar{\mathbf{x}} = E(\mathbf{x})$. The covariance matrix of $\mathbf{x}$ is defined as

\[
S = E\left( (\mathbf{x} - \bar{\mathbf{x}})(\mathbf{x} - \bar{\mathbf{x}})^T \right). \tag{4.15}
\]

The projection directions [29] are solved from the covariance matrix $S$ in

\[
S\theta = \lambda\theta \iff (S - \lambda I)\theta = 0, \tag{4.16}
\]

where $\lambda$ is a diagonal matrix with the corresponding eigenvalues on the diagonal and $\theta$ is the eigenvector matrix. The unit eigenvectors are the axis directions in the new subspace. The first component is the linear combination of the $m$ variables whose weights give the greatest sample variance. The principal components are orthogonal to each other, and each of them maximizes the variance at its component level.


Data is projected to the subspace by placing the first $N$ principal components in the matrix $\Theta$ as

\[
\Theta = (\theta_1 | \ldots | \theta_N). \tag{4.17}
\]

The less significant directions $(N + 1 \ldots m)$ are ignored. PCA is often used for visualizing high dimensional data in 2D or 3D. It is assumed that the data is mainly concentrated in only a few directions [29], [32]. Matrix $\Theta$ and the eigenvalues are used as criteria for the variable selection.

The data points can then be visualized as scores $H$. The score vector on component $i$ is defined by

\[
H_i = \Theta x_i, \tag{4.18}
\]

where $\Theta$ contains the first and second directions of the new low-dimensional space.
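The PCA steps of equations (4.15) to (4.18) can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the thesis implementation: observations are stored in the rows of `X`, the eigenvectors are the columns of `Theta`, and the scores are therefore computed as the centered data multiplied by `Theta`. The function name and return values are hypothetical.

```python
import numpy as np


def pca(X, n_components):
    """Principal component analysis via the covariance eigenproblem.

    X has one observation per row and one variable per column.
    Returns the loading matrix Theta (variables x components), all
    eigenvalues in descending order, the scores H and the fraction of
    variance explained by each component.
    """
    Xc = X - X.mean(axis=0)                    # remove the mean of each variable
    S = np.cov(Xc, rowvar=False)               # covariance matrix, eq. (4.15)
    eigvals, eigvecs = np.linalg.eigh(S)       # eigenproblem, eq. (4.16)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    Theta = eigvecs[:, :n_components]          # first N directions, eq. (4.17)
    H = Xc @ Theta                             # scores of every observation
    explained = eigvals / eigvals.sum()
    return Theta, eigvals, H, explained
```

Each row of `Theta` belongs to one variable, which is what the variable selection in the following sections operates on.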

4.3.2 Hotelling’s T² statistics

A measure of the variation within the PCA model is given by Hotelling’s $T^2$ statistic. $T^2$ is the sum of the normalized squared scores [33], defined as

\[
T^2 = (H - \bar{H})^T S^{-1} (H - \bar{H}), \tag{4.19}
\]

where $H$ is the score matrix, and $\bar{H}$ and $S$ are the common estimators for the mean vector and covariance matrix obtained from the scores [34]. The scores $H$ are the original data mapped into the new coordinate system defined by the principal components. They show the relationships among observations.

Outliers are detected with the help of Hotelling’s $T^2$, which defines the normal operating area corresponding to 95% confidence. The upper control limit (UCL) of the multivariate Hotelling’s $T^2$ statistic [5] can be defined as

\[
T^2_{UCL} = \frac{(n - 1)(n + 1)k}{n(n - k)} F_\alpha(k, n - k), \tag{4.20}
\]

where $F_\alpha(k, n - k)$ is the upper critical point of the $F$-distribution with $k$ and $n - k$ degrees of freedom [5]. In practice $k$ is the number of selected variables and $n$ is the number of measurements.
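A minimal NumPy sketch of equations (4.19) and (4.20) is given below. The function names are hypothetical, and the critical value of the F-distribution is passed in as a number rather than computed; it could be taken from a table or, for example, from `scipy.stats.f.ppf(1 - alpha, k, n - k)`.

```python
import numpy as np


def hotelling_t2(H):
    """Hotelling's T^2 value of every observation (row) of the score matrix H."""
    centered = H - H.mean(axis=0)
    S = np.atleast_2d(np.cov(H, rowvar=False))   # covariance of the scores
    S_inv = np.linalg.inv(S)
    # Quadratic form c_i^T S^-1 c_i evaluated for each row c_i.
    return np.einsum("ij,jk,ik->i", centered, S_inv, centered)


def t2_upper_control_limit(n, k, f_critical):
    """Upper control limit of eq. (4.20) for n measurements and k variables.

    f_critical is the upper critical point F_alpha(k, n - k), supplied
    by the caller (an assumption of this sketch).
    """
    return (n - 1) * (n + 1) * k / (n * (n - k)) * f_critical
```

Observations whose T² value exceeds the limit fall outside the normal operating area and are candidates for outliers.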

4.3.3 Dynamically similar variables

The interpretative variables of an adaptive model should be linearly correlated to ensure a robust model. The dynamically similar variables are detected using the results of equation (4.17). Variables with similar dynamic behavior have approximately the same row values in matrix $\Theta$.


Figure 4.3: Example of finding the dynamically similar variables: (a) Preprocessed time series, (b) a loading plot (principal component map) and (c) scaled eigenvalues as component explaining percentage for each eigenvector. [Panel titles: "Time series", "Principal component map" (axes: 1st Principal Component, 2nd Principal Component; points v1 to v5) and "Percent explained / component" (horizontal axis λ, components 1 to 3); legend v1 to v5.]

The distance between two vectors [35] $\mathbf{u}$ and $\mathbf{v}$ is defined as

\[
d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u}, \mathbf{v}\| = \sqrt{(u_1 - v_1)^2 + \ldots + (u_N - v_N)^2}. \tag{4.21}
\]

For similar variables the distance is small. The weighted Euclidean distance for each variable is defined as

\[
d_W(\mathbf{u}, \mathbf{v}) = \sum_{k=1}^{N} \lambda_k \cdot \|\mathbf{u}, \mathbf{v}\|, \tag{4.22}
\]

where $\mathbf{u}$ and $\mathbf{v}$ are projected variables and $N$ is the number of principal components in the new low-dimensional space. The weighting parameter for each component $k$ is the eigenvalue $\lambda_k$. The idea is to stress the most important dimensions in variable selection.

Equation (4.22) is used to derive a distance matrix

\[
D = \begin{bmatrix}
d_W(1, 1) & d_W(1, 2) & \cdots & d_W(1, n) \\
d_W(2, 1) & d_W(2, 2) & \cdots & d_W(2, n) \\
\vdots & \vdots & \ddots & \vdots \\
d_W(n, 1) & d_W(n, 2) & \cdots & d_W(n, n)
\end{bmatrix}, \tag{4.23}
\]

where $n$ is the number of variables. $D$ is a symmetric $n \times n$ matrix containing the pairwise Euclidean distances between the points in the new principal component space. The distance matrix $D$ is sorted, and the variable indices of the nearest distances are stored in the matrix $D_I$.
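The construction of the distance matrix $D$ and the index matrix $D_I$ can be sketched with NumPy as below. Equation (4.22) leaves some room for interpretation; this sketch assumes that the eigenvalue $\lambda_k$ weights the absolute difference of the two variables along component $k$. The function names are hypothetical, and indices are zero-based.

```python
import numpy as np


def weighted_distance_matrix(Theta, eigvals):
    """Pairwise eigenvalue-weighted distances between variables.

    Theta has one row per variable and one column per principal
    component; eigvals holds the eigenvalue of each kept component.
    Assumed reading of eq. (4.22): sum over k of lambda_k * |u_k - v_k|.
    """
    diff = np.abs(Theta[:, None, :] - Theta[None, :, :])
    return (diff * np.asarray(eigvals)).sum(axis=2)


def nearest_indices(D):
    """Sort each column of D; row 0 is the variable itself, row 1 its nearest."""
    return np.argsort(D, axis=0, kind="stable")


Theta = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
D = weighted_distance_matrix(Theta, [2.0, 1.0])
DI = nearest_indices(D)
print(DI[1])   # nearest neighbour of each variable
```

In this toy example the first two variables are each other's nearest neighbours, while the third variable is far from both.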

A simple example is given using a data set containing five time series. The preprocessed time series are shown in Figure 4.3 (a). The distance matrix is derived using the results in Figure 4.3 (b,c).

The content of matrix $D_I$ is shown in Table 4.2. The most dynamically similar variables are in order, so the nearest variable is in the second row. For example, if a linear model is created


Table 4.2: $D_I$ matrix: indices of nearest variables.

variable 1   variable 2   variable 3   variable 4   variable 5
3            3            2            5            4
2            4            4            2            2
4            5            5            3            3
5            1            1            1            1

for variable 4, the best choice would be variable 5. In this data set there are no good interpretative variables for variable 1. This method of detecting variable dynamics is useful when the dimension of the data matrix is huge. In this example, inspection of the time series alone would have been enough for further analysis.

4.3.4 Identification of negative correlating variables

As demonstrated, PCA is an effective algorithm for finding positive linear correlations for the model. Negative linear correlations can also be found if reflections of the data vectors are included in the PCA. These copies are created in the preprocessing phase.

4.3.5 K-means clustering

K-means is an unsupervised learning algorithm that partitions a given data set into a certain number $k$ of clusters [36]. The initial positions of the centroids are defined, one for each cluster. Different initial centroid locations lead to different results. The best choice is to place them as far away from each other as possible. Each object is assigned to the group that has the closest centroid. When all objects have been assigned, the positions of the $k$ centroids are recalculated. These steps are repeated until the centroids no longer move.

The optimal solution minimizes the cost function $J$ in

\[
J = \sum_{i=1}^{k} \sum_{x_j \in S_i} \| x_j - \mu_i \|^2, \tag{4.24}
\]

where there are $k$ clusters $S_i$, $i = 1, 2, \ldots, k$, and $\mu_i$ is the centroid of all the points $x_j \in S_i$.
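The K-means steps described above can be sketched with NumPy as follows. The function name is hypothetical. The initial centroids are chosen with a simple farthest-point rule, which is one way to place them far away from each other and is an assumption of this sketch; it also makes the result deterministic.

```python
import numpy as np


def kmeans(points, k, max_iter=100):
    """Hard K-means clustering; returns labels, centroids and the cost J."""
    points = np.asarray(points, dtype=float)

    def assign(centroids):
        # Index of the closest centroid for every point.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        return np.argmin(dists, axis=1)

    # Initial centroids: the first point, then repeatedly the point that
    # is farthest from all centroids chosen so far.
    chosen = [points[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(points - c, axis=1) for c in chosen], axis=0)
        chosen.append(points[np.argmax(nearest)])
    centroids = np.array(chosen)

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = np.array([
            points[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)
        ])
        if np.allclose(updated, centroids):    # centroids no longer move
            break
        centroids = updated

    labels = assign(centroids)
    cost = float(np.sum((points - centroids[labels]) ** 2))   # J of eq. (4.24)
    return labels, centroids, cost
```

For example, `kmeans([[0, 0], [0, 1], [10, 10], [10, 11]], 2)` separates the two lower points from the two upper points.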

4.3.6 Reduction of redundant variables

The intention is to reduce dimensionality and at the same time find statistically significant variables. In this work a hard clustering algorithm is used, which means that each vector belongs exclusively to a single cluster. Objects are classified so that the objects in a group are similar to each other. The mean values of the preprocessed variables on the new subspace axes are the row vector values in the matrix $\Theta$, see equation (4.17). The K-means algorithm is used to cluster this matrix into $k$ clusters. One row vector is defined for each variable. One object is selected from each cluster for further analysis. The selection rule is based on equation (4.22): the largest distance from the origin is derived for each object. The farthest object inside the cluster is in the last row of the matrix $D_I$.

4.4 Modeling of nonstationary process

A system is stationary if its statistical properties do not vary with time [3]. Complicated stationary models are useful for training operators and understanding the dynamics of the plant, but they are not accurate enough for fault detection. Industrial processes are nonstationary. For example, dependencies between variables can vary in different process states. External conditions such as seasonal variations also impose dynamic variation on the process. It is almost impossible to separate the effects of specific phenomena. Therefore it is really hard to create a stationary model for an industrial process. Adaptive modeling makes it possible to model and recognize abnormal events in dynamic processes.

4.4.1 Delay and correlation

Delays and linear correlation should be examined before the linear PCA. In this work they are examined after the selection of the interpretative features. Examining them before PCA would be too time-consuming because of the large number of process variables. The undesired effect of delay is not as large in PCA as it is in modeling.

Delays were detected by the cross-correlation function [25]. This method detects delays between two variables. The auto-correlation and cross-correlation functions for two signals $u$ and $y$ are defined as

\[
r_{uu}(\tau) = E[u(t)\,u(t + \tau)], \tag{4.25}
\]

and

\[
r_{uy}(\tau) = E[u(t)\,y(t + \tau)], \tag{4.26}
\]

where $\tau$ is the time lag. The time lag is zero if there is no linear correlation between signals $u$ and $y$. Next, a simple example illustrates delay detection. Let $u$ be the coolant flow rate measurements and $y$ the sum of the steam flow measurements.

The auto-correlations for $u$ and $y$ are shown in Figure 4.4 (a,b). The filtering frame size is 2.5 seconds, the difference frame size is 0.5 seconds and the sample time is 0.5 seconds. The delay between these two variables can be detected from the lag of the maximum value of the cross-correlation function in Figure 4.4 (c). The maximum is at a lag of seven samples, which with the 0.5 second sample time corresponds to a delay of 3.5 seconds. The maximum value of the cross-correlation function is quite high ($r_{uy}(7) = 0.5955$), so the detected delay is quite reliable.
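Delay detection with the cross-correlation function of equation (4.26) can be sketched with NumPy as below. The function names are hypothetical, and normalizing both signals to zero mean and unit variance before correlating is an assumption of this sketch, made so that the correlation values are comparable between signal pairs.

```python
import numpy as np


def cross_correlation(u, y, max_lag):
    """Estimate r_uy(tau) = E[u(t) y(t + tau)] for lags -max_lag..max_lag."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (u - u.mean()) / u.std()
    y = (y - y.mean()) / y.std()
    n = len(u)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(len(lags))
    for idx, tau in enumerate(lags):
        if tau >= 0:
            r[idx] = np.sum(u[:n - tau] * y[tau:]) / n
        else:
            r[idx] = np.sum(u[-tau:] * y[:n + tau]) / n
    return lags, r


def detect_delay(u, y, max_lag, sample_time):
    """Return the delay from u to y in seconds and the peak correlation."""
    lags, r = cross_correlation(u, y, max_lag)
    best = int(np.argmax(r))
    return lags[best] * sample_time, r[best]
```

With a sample time of 0.5 seconds, a peak at lag seven gives a delay of 3.5 seconds, as in the example above. A low peak value suggests that the detected delay should not be trusted.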


Figure 4.4: (a,b) Auto-correlation of u and y and (c) the result of the cross-correlation function between these variables. [Panel titles: "Auto-correlation: u", "Auto-correlation: y" and "Cross-correlation: from u to y"; horizontal axis τ from −20 to 20; marked maximum in panel (c) at X: 7, Y: 0.5955.]

Results of delay detection depend strongly on preprocessing. In the previous example the size of the filtering frame can be seen from the auto-correlation subplots: the auto-correlation value is not high when the absolute value of $\tau$ is more than five. The interpretative features in the model are delayed if the delays are well-defined. Usually delays between variables are hard to eliminate, because they can vary between process states or data sets. External conditions such as weather, season and time of day also have an effect. For example, in Finland sea water is used as coolant water for the condenser.

The complexity of delays is the most common reason why it is usually assumed that there are no delays between variables. In practice, the unwanted effect of delay can be decreased by increasing the filtering frame size parameter $N$ of the discrete convolution function in the preprocessing phase.

4.4.2 Weighted recursive least squares

The recursive least squares (RLS) method is selected for modeling, because a model of the system must be available on-line while the system is in operation. The model is based on observations up to the current time. The difference between the observed and the estimated value is used for fault detection. In this work these residuals are used to derive a leakage index for detecting slowly developing leakage.

In time-variant systems old measurements should be forgotten and new measurements should be weighted more. Therefore the weighted recursive least squares (WRLS) method is used. WRLS adapts to small process state changes, where correlations between model features remain almost the same. An abnormal event can be detected if the behavior of the process variables changes dramatically. WRLS with a time-varying forgetting factor converges to the system in time. The intuitive idea is that the forgetting factor is decreased as the error increases, so that a large error is forgotten faster [37]. A high residual between the estimate and the measured value can mean a fault, a normal state change or a poor model. In general, a fault is declared if the size of the residual exceeds a certain threshold value [6]. Process knowledge should be applied: in these situations a decision support system or an expert user should make the diagnosis [3].


A recursive algorithm assumes that there is a preliminary model based on $N$ samples. One more sample is measured, and it is used to improve the model. Most recursive models $\Theta(N + 1)$ can be represented as

\[
\Theta(N + 1) = \Theta(N) + \gamma(N) \left( y(N + 1) - \hat{y}(N + 1 | N) \right), \tag{4.27}
\]

where $\gamma(N)$ is the updating factor function and $y(N + 1) - \hat{y}(N + 1 | N)$ is the model estimation error.

In this work a new model is used for fault detection and decision. The data set is processed recursively. The estimation error $e(k + 1)$ at the current time $k + 1$ is

\[
e(k + 1) = y(k + 1) - \varphi^T(k + 1)\,\Theta(k + 1), \tag{4.28}
\]

where $y(k + 1)$ is the dependent feature value, $\varphi(k + 1)$ is the interpretative feature vector (eq. 4.30) and $\Theta(k + 1)$ is the adaptive coefficient vector (eq. 4.32).

The full formulation of the WRLS algorithm is quite complicated, and only the summarized equations are introduced in this work. More information is available on the Internet [38] and in Lennart Ljung’s System Identification [39].

The most correlated variables are selected for the WRLS model. Original measurements are not used, because it is not possible to set variables to zero mean when future values are unknown. In addition, difference vector values fit a Gaussian distribution better, which is expected in the WRLS method. Therefore difference values of the selected preprocessed variables are used as model features.

Delays are taken into account in the following equations. For example, if the delay is three seconds and the time step is 0.1 seconds, the interpretative feature value $u_i$ is selected 30 time steps before the dependent variable $y$. The on-line computation of the model has to be completed during one sampling interval [39]. This should not be a problem, because the sampling interval is not required to be small for the detection of slowly developing faults.

Process measurements are formed into the interpretative feature vector $\varphi(k + 1)$ at every time step as

\[
\varphi(k + 1) = \begin{bmatrix} y(k) \\ u_1(k + 1) \\ \vdots \\ u_N(k + 1) \end{bmatrix}, \tag{4.29}
\]

where $y$ is the dependent feature value and $u$ are the interpretative features. The interpretative feature vector can also be created without the dependent feature value $y$, see equation (4.30). The model does not work as well, but there is a reason why $y$ is not included in the vector $\varphi(k + 1)$: leakages are not detected if past information of the dependent feature is used to estimate the model.

\[
\varphi(k + 1) = \begin{bmatrix} u_1(k + 1) \\ \vdots \\ u_N(k + 1) \end{bmatrix} \tag{4.30}
\]

The updating vector $\gamma(k + 1)$ is defined as

\[
\gamma(k + 1) = \frac{P(k)\,\varphi(k + 1)}{\lambda(k + 1) + \varphi^T(k + 1)\,P(k)\,\varphi(k + 1)}, \tag{4.31}
\]

where $P(k)$ is the confidence matrix (4.35), $\lambda(k + 1)$ is the time-varying forgetting factor (4.36) and $\varphi(k + 1)$ is the interpretative feature vector. The updating vector is used to calculate the adaptive coefficient vector $\Theta(k + 1)$ for the linear model as

\[
\Theta(k + 1) = \Theta(k) + \gamma(k + 1) \left( y(k + 1) - \varphi^T(k + 1)\,\Theta(k) \right), \tag{4.32}
\]

where $\gamma(k + 1)$ is the updating vector, $y(k + 1)$ is the dependent feature value and $\varphi(k + 1)$ is the interpretative feature vector.

The confidence matrix $P(k + 1)$ is defined by Lennart Ljung [39, p.365] as

\[
P(k + 1) = \frac{1}{\lambda(k + 1)} \left( P(k) - \frac{P(k)\,\varphi(k + 1)\,\varphi^T(k + 1)\,P(k)}{\lambda(k + 1) + \varphi^T(k + 1)\,P(k)\,\varphi(k + 1)} \right), \tag{4.33}
\]

but in this work the updating vector is used. Therefore the confidence matrix $P(k + 1)$ can be represented in a simpler form as

\[
P(k + 1) = \frac{1}{\lambda(k + 1)} P(k) - \frac{P(k)\,\varphi(k + 1)\,\varphi^T(k + 1)\,\frac{1}{\lambda(k + 1)} P(k)}{\lambda(k + 1) + \varphi^T(k + 1)\,P(k)\,\varphi(k + 1)}, \tag{4.34}
\]

and thereby as

\[
P(k + 1) = \left( I - \gamma(k + 1)\,\varphi^T(k + 1) \right) \frac{P(k)}{\lambda(k + 1)}, \tag{4.35}
\]

where $\lambda(k + 1)$ is the time-varying forgetting factor, $I$ is the unit matrix, $\gamma(k + 1)$ is the updating vector and $\varphi(k + 1)$ is the interpretative feature vector.

The forgetting factor $\lambda$ is defined as

\[
\lambda(k + 1) = 1 - \frac{1}{\Omega} \left( 1 - \varphi^T(k + 1)\,\gamma(k) \right) e^2(k + 1), \tag{4.36}
\]

where $\Omega$ is a user-defined constant (4.37), $\varphi(k + 1)$ is the interpretative feature vector, $\gamma(k)$ is the updating vector and $e(k + 1)$ is the estimation error. In some applications the forgetting factor $\lambda$ is constant. A constant factor can be used for a system that changes gradually and in a stationary manner. The WRLS algorithm has better adaptivity if the forgetting factor $\lambda$ is dynamic.

In this work it changes in time, because the system can undergo abrupt and sudden changes. There are many ways to compute the forgetting factor. The constant $\Omega$ is defined in [37], [38] as

\[
\Omega = \frac{\sigma_v^2}{1 - \lambda_0}, \tag{4.37}
\]

where $\sigma_v^2$ is the variance of the estimation error in normal conditions and $\lambda_0$ is the user-defined initial value for the forgetting factor. The normal variance $\sigma_v^2$ is based on knowledge of the system or is measured by feeding white noise to the system.

Typical choices of the forgetting factor are in the range between 0.98 and 0.995 [39]. The forgetting factor is forced to stay inside the limit values $\lambda_{\min}$ and $\lambda_{\max}$ to ensure model robustness.

Initial values are:

• the diagonal values of the confidence matrix $P(1)$ are defined by a constant $\alpha$,

• the coefficient vector values $\Theta(1)_i$ are the inverse of the number of interpretative features in the model,

• the time-varying forgetting factor $\lambda(1)$ is $\lambda_{\min}$.
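The recursion of equations (4.31), (4.32), (4.35), (4.36) and (4.37) with the initial values listed above can be sketched with NumPy as follows. This is a sketch under stated assumptions rather than the thesis implementation. The class name and the default values of `alpha`, `lambda0`, `lambda_min` and `lambda_max` are hypothetical. The forgetting factor is computed from the error obtained with the previous coefficients $\Theta(k)$, because $\Theta(k + 1)$ is not yet available at that point, and the updating vector is initialized to zero.

```python
import numpy as np


class WRLS:
    """Weighted recursive least squares with a time-varying forgetting factor."""

    def __init__(self, n_features, sigma2_v, lambda0=0.99,
                 lambda_min=0.98, lambda_max=0.995, alpha=1000.0):
        self.theta = np.full(n_features, 1.0 / n_features)  # Theta(1)_i = 1/N
        self.P = alpha * np.eye(n_features)                  # P(1) = alpha * I
        self.gamma = np.zeros(n_features)                    # assumed initial gamma
        self.omega = sigma2_v / (1.0 - lambda0)              # eq. (4.37)
        self.lambda_min = lambda_min
        self.lambda_max = lambda_max
        self.lam = lambda_min                                # lambda(1) = lambda_min

    def update(self, phi, y):
        """Process one sample and return the estimation error (residual)."""
        phi = np.asarray(phi, dtype=float)
        # Error with the coefficients of the previous step (assumption).
        error = y - phi @ self.theta
        # Forgetting factor, eq. (4.36), forced inside its limits.
        lam = 1.0 - (1.0 - phi @ self.gamma) * error ** 2 / self.omega
        lam = float(np.clip(lam, self.lambda_min, self.lambda_max))
        # Updating vector, eq. (4.31).
        gamma = self.P @ phi / (lam + phi @ self.P @ phi)
        # Coefficient update, eq. (4.32).
        self.theta = self.theta + gamma * error
        # Confidence matrix update, eq. (4.35).
        self.P = (np.eye(len(phi)) - np.outer(gamma, phi)) @ self.P / lam
        self.gamma, self.lam = gamma, lam
        return error
```

The residual returned by `update` is the quantity a fault detection rule would monitor, for example by comparing it with a threshold.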

Chapter 5

Experiments and results

TVO provided 37 data sets for this work. Data sets have been stored during the years 2001 and 2005. All data sets were used for constructing the database of the data management tool (DMT). Only a few data sets contained enough variables for data mining (DM). In this work one data set was explored and analyzed. It was also used for developing the fault detection method. Not all analysis results can be described here, but the most important tables and figures are shown. DM is a very iterative process, so the structure of the DM project is introduced first. The focus is on the data analysis steps of each phase, which were described in Chapter 3. The development of the leakage index, the method for detecting faults, is introduced at the end of this chapter.

5.1 Data mining process model structure

DMT is divided into three main phases: preprocessing, variable selection and modeling.

5.1.1 Phase 1: Preprocessing

[Flow chart boxes: raw data; database; data type selection; preview (time series, statistical properties); preprocessing (parameters: frame sizes, filtering type, scaling type); outputs: normalized data, MSDV, difference, volatility.]

Figure 5.1: Flow chart of preprocessing phase.

Phase 1 of the DM project is shown in Figure 5.1. It is divided into five steps, which are listed below.


1. The operator has selected variables and stored a design based event.

2. The data set dimension can be reduced by leaving out some data types. Nominal and ordinal data types cannot be used in the multivariate methods. Vectors that have fewer unique values than a user-defined parameter are removed from the analysis.

3. Depending on the dimension of the data set, some or all variables are previewed in time series plots. Statistical properties are calculated for all variables.

4. Database information is used to normalize the data vectors. The initial parameters of the DMT are used for preprocessing. Still, the user should consider parameter values such as the sampling frequency and the frame sizes for filtering, moving standard deviation (MSDV) and difference. The types of preprocessing can also be changed. Is moving average (MA) better than median filtering? Does the data set include the same variables as the database? If not, scaling cannot be global range scale normalization. In that case normalization has to be done using only the values of the analyzed data set.

5. The preprocessing results are the volatility and, for all variables, the normalized and reflection values, MSDV and difference in matrix form. If the results are not good enough, the DMT user can change the parameters and preprocess the data set again.

Modeling with raw data is not possible, because noise in the data vectors can create an unstable model. Therefore the user has to understand what suitable preprocessing results look like. On the other hand, too smooth data disturbs the model, because relevant information can vanish in filtering.

The success of preprocessing can be examined graphically or numerically. The program calculates a smoothing index, which is the sum of the residuals between the filtered and the original data values. Filtering parameters could be estimated from database information. Each variable would then have its own suitable filtering type and frame size. In this Master’s thesis all filtering frame sizes were the same.
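The two filtering types and the smoothing index can be sketched as follows. This is only an illustration: it assumes a trailing window and a smoothing index built from absolute residuals, neither of which is fixed by the text, and the function names are hypothetical.

```python
from statistics import mean, median

def moving_filter(values, frame, kind="mean"):
    """Filter with a trailing window of `frame` samples (assumed window placement)."""
    agg = mean if kind == "mean" else median
    out = []
    for i in range(len(values)):
        window = values[max(0, i - frame + 1): i + 1]   # shorter window at the start
        out.append(agg(window))
    return out

def smoothing_index(original, filtered):
    """Sum of absolute residuals between filtered and original values (assumed form)."""
    return sum(abs(f - o) for f, o in zip(filtered, original))

signal = [1.0, 1.0, 5.0, 1.0, 1.0, 1.0]     # flat signal with one spike
ma = moving_filter(signal, frame=3, kind="mean")
med = moving_filter(signal, frame=3, kind="median")
index_ma = smoothing_index(signal, ma)      # the spike is smeared over three samples
index_med = smoothing_index(signal, med)    # the spike is removed completely
```

In this toy case the median filter removes the spike entirely, whereas the moving average spreads it over the window. With a 0.5 s sampling time, a 10 s filtering frame would correspond to `frame=20`.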

5.1.2 Phase 2: Variable selection

Phase 2 is divided into 11 steps and it is shown in Figure 5.2. This phase has two purposes. The first five steps are meant for analyzing the stored data set. In these steps figures and tables are created as a data set report. The remaining six steps prepare for the modeling phase by selecting the interpretative features for the model.

PCA is used in two different ways in this work. In the data set report PCA is used in its original meaning: the dimensionality of the data set is reduced while retaining as much of the variation as possible. In variable selection only the eigenvalues and the eigenvectors are examined.

[Flow chart boxes: normalized data; select PCA parameters; PCA; data set report (statistically significant variables, Hotelling’s T², component maps); segmentation of the data set; select PCA parameters; PCA; variable selection (selection of similar behaved variables, Hotelling’s T², component maps); reduce dimension; histograms, scatter plot; delay elimination; features; modeling.]

Figure 5.2: Flow chart of data set analysis and interpretative feature selection.

1. Normalized data is analyzed in the following steps.

2. The user has to select the parameters for PCA: the number of principal components and the number of clusters. Usually the data is projected onto the first two or three components for visualization in the new coordinate system.

3. Principal component analysis is performed for the data set.

4. Time series of the statistically significant variables are shown in the data set report. The dynamic behavior varies between these variables and they have large variation in time. The number of variables is the same as the number of clusters. Scores are used for state analysis. Hotelling’s T² helps the user to pinpoint the starting time of the fault. A loading plot, which in this work is called a component map, can be used to detect dynamic differences between the variables. If there are variables with the same dynamic behavior, the number of clusters should be decreased. Then PCA is performed and the data set report is analyzed again. The user should have process knowledge to isolate the fault and to assess its magnitude. It should be remembered that there is no evidence that the statistically significant variables are the reason for the fault or even related to it.

5. The user can manually segment the data set. Then variable statistics can be examined before and after the fault has occurred. In this work only the minimum, maximum, mean and median values are shown for each statistically significant variable.

6. The user selects the target variable for the model. This can be one of the variables listed among the most significant variables in the data set report. The objective of the DM project should also be considered. For example, in this work the goal is to develop a detection method for leakage. Therefore one of the steam flows is the target variable. The default number of principal components in DMT is three. The user has to select some parameters concerning the visualization and the percentage R% for reducing the dimensionality in each PCA iteration step.

7. Principal component analysis is performed for the selected data matrix.

8. After every iteration step the user gets several figures illustrating the progress of variable selection. Eigenvectors are analyzed and similarity between the variables is recognized. When a suitable number of possible interpretative variables for the target variable remains, features are extracted. A proper number of variables for the modeling phase is fewer than 10, because the result visualizations then fit well into one figure. Different combinations of interpretative features for the model can also be tested manually.

9. The data set dimensionality is reduced by R% and a new data matrix is constructed. Go to step 7.

10. Linear correlations between all the possible interpretative variables are calculated. The user can examine these from the scatter plot visualization. The user should remember that the variables were selected by exploring linearity. Non-linear correlations are possible. Linearization should be considered if such features are included in the model. If the correlation between the difference values of the interpretative features is not high enough, the preprocessing should be done again with a larger filtering frame size. Another possibility to increase correlations is to detect delays between the features.

11. Delays are detected by the cross-correlation method. The user has to detect the delays from the visualizations. The effect of delay is eliminated in the modeling phase.

This phase is the most important in DMT. The data set report is created and the interpretative variables for modeling are selected. Delays do not affect the PCA results much, because PCA is a static method. Therefore delay detection is performed after the dimension reduction. Delays between variables are hard and slow to detect. Luckily, delay elimination (step 11) is not always required. The target variable is modeled after this iterative interpretative variable selection phase.

5.1.3 Phase 3: Modeling

The modeling phase is the most challenging in a DM project. The data set has already been examined. Now the focus is on reaching the goal, which in this thesis is leakage detection. The last phase of the DM project is shown in Figure 5.3.

[Flow chart boxes: features; selection of interpretative features (target variable, interpretative features); select WRLS parameters; WRLS modeling; model; visualization, leakage manipulation; return to phase 1 or 2.]

Figure 5.3: Flow chart of modeling phase.

This phase is divided into four steps:

1. The DMT user has a set of possible interpretative features for modeling.

2. The user selects some of the features and these are placed in the interpretative matrix ϕ. If delays were detected between the features, they are taken into account here. WRLS parameters have to be selected. DMT has initial values for the WRLS parameters, but the user should optimize these using his or her own DM skills.

3. The weighted recursive least squares (WRLS) method is applied off-line to the data set. The model is based on observations up to the current time only, so the results would be the same in a real situation.

4. The modeling results are visualized in several figures. The user can also test the model with other data sets. In DMT a leakage can be created by manipulating the data. The model can be evaluated by a cost function. If the model does not fit well, the model parameters or the interpretative features should be changed by returning to step 2. Bad preprocessing parameters can also be a reason for unsatisfactory results; in that case the user should return to DM phase 1.

The model is evaluated by DMT. The cost function value is the sum of squared model errors. However, the user should also test the model with other data sets. In practice the same interpretative features and model parameters are used for the other data sets. Implementation in the real system can be considered after evaluating the model in different process states and situations.

5.2 Design based events

There have not been many faults in any Finnish NPP. In the last six years a reactor scram has happened only three times at Olkiluoto. The last time was on 4 September 2007, when nuclear reactor 2 was shut down due to a malfunction in the generator cooling system [40].

Minor problems have occurred at Olkiluoto in past years and these can be used in analysis. The available data sets contained different types of sudden faults, such as problems in the Scandinavian main grid and a pump interruption. Some stored design based events were also provided. The goal was to develop a method to detect slowly developing faults such as leakages. In this work a data set was manipulated to create a slowly developing leakage in a steam pipeline.


5.2.1 Implementation of data management tool

All data mining tasks were done with a data management tool (DMT), which was programmed in Matlab. Some methods were implemented completely: the preprocessing and WRLS program classes were programmed entirely by the author. The Matlab library PCA method princomp is used for variable selection. The K-means algorithm is a method in the SOM Toolbox [41]. The program structure of DMT and the user interface are shown in Appendix A.

5.2.2 Database construction

It is important to use all the available information in a DM project. All 37 data sets from both reactors were used to create the database, but simulator data was not. The variables in the simulator data sets are the same as in the real events, and it was assumed that the simulator data would not add any valuable information. Basic statistical properties were calculated and stored in the DMT database. The database minimum and maximum values of the variables are used for scaling in the preprocessing phase. Mean and median values give suggestive information about the normal operating states. These values could also be used for defining the operating point or points. If the current measurement values are far from these values, the system is not in the normal operating state.

5.2.3 Data set selection

In this Master’s thesis only one data set was analyzed and described. Pump number 5 in the primary circulation system of the Olkiluoto 2 reactor stopped on 2 April 2004 at 20:06:59. This is known from a linguistic comment included in the data set. The data set starts 120 seconds before the failure and continues 600 seconds after it. It contains 14400 samples and 68 signals.

5.3 Preprocessing

The raw data was previewed. Variables that had fewer than 100 different values in the data set were eliminated from the analysis. The original data set had 68 variables, and after elimination 44 variables were copied for further analysis. All these variables were previewed and statistical properties such as mean, minimum and maximum values were calculated.
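The elimination rule can be illustrated with a short sketch. The function name, the dictionary layout and the two example signals are hypothetical; only the threshold of 100 distinct values comes from the text.

```python
def drop_low_variety(columns, min_unique=100):
    """Keep only the signals that have at least `min_unique` distinct values."""
    return {name: values for name, values in columns.items()
            if len(set(values)) >= min_unique}

# Hypothetical signals: a two-state indicator and a continuously varying measurement.
data = {
    "valve_state": [0, 1] * 75,
    "flow": [i * 0.1 for i in range(150)],
}
kept = drop_low_variety(data)
```

Here only `flow` survives, because the two-state signal has far fewer than 100 distinct values.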

Database minimum and maximum values were used for global range scaling. For example, the coolant flow rate is shown in Figure 5.4 (a) and the scaled variable in Figure 5.4 (b). The coolant flow minimum value in the database is 2523 kg/s and the maximum value is 8181 kg/s. The measurements in this data set are inside these limit values. Therefore the preprocessed values are between zero and one.
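As a worked example of global range scaling with the database limits quoted above (the function name and the sample value of 7000 kg/s are illustrative, not taken from the data set):

```python
def range_scale(value, db_min, db_max):
    """Scale a measurement to [0, 1] using the database minimum and maximum."""
    return (value - db_min) / (db_max - db_min)

COOLANT_MIN, COOLANT_MAX = 2523.0, 8181.0     # database limits in kg/s

scaled = range_scale(7000.0, COOLANT_MIN, COOLANT_MAX)   # 4477 / 5658, about 0.79
```

Any measurement between the database limits maps into the interval from zero to one, and the limits themselves map exactly to 0 and 1.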

The data set frequency was 20 Hz. In this case it was decided to use a half-second sampling time to ensure enough time for model calculation. Of course everything is done off-line here, but if the developed model is implemented in a real process, the calculation has to be feasible in practice too. The frame size for filtering was 10 seconds.

[Plot panels: (a) Coolant flow rate [kg/s] versus Time [s]; (b) Preprocessed coolant flow rate versus Time [s].]

Figure 5.4: Process signal preprocessing: (a) original coolant flow rate, moving average, upper confidence limit (4.13) and lower confidence limit (4.13). (b) Preprocessed coolant flow rate. Grey line represents scaled values and black line represents filtered and scaled values.

MSDV values were calculated for all variables with a frame size of 20 seconds. These were used for the confidence limits and also to derive the process volatility, shown in Figure 5.5. The system state changes when one of the main circulation pumps stops working. The data set is analyzed in more detail and then it is segmented for the modeling phase.

[Plot: Volatility versus Time [s].]

Figure 5.5: Volatility of the data set: System state changes at Time = 0.

5.3.1 Data set exploration

PCA eigenvectors were plotted on a component map. K-means clustering (4.24) of the eigenvectors gives the significant variables to the user. One variable was selected from each cluster, so a suitable number of clusters had to be chosen. Clustering was done with four, eight and 12 clusters. With 12 clusters there were too many redundant variables, and with four clusters not all relevant variables were selected.


[Plot panels: (a) Eigenvectors for each variable (variables 36, 44, 32, 12, 7, 37, 15 and 4); (b) A loading plot, with the 2nd principal component on the vertical axis.]

Figure 5.6: Data set report of statistically significant variables: (a) Eigenvectors of each variable. (b) A loading plot revealing relationships among the variables. Axes are 1st and 2nd component and label codes represent variables.

[Eight time series panels of range scaled values versus time: 36 output of level controller 537 [rpm]; 44 output voltage P5 (CH649K866) [V]; 32 output of controller 535 [rpm]; 12 piston position of high pressure feed water (CH413V505) [%]; 7 steam flow 4 (CH311K304) [kg/s]; 37 400 kV voltage (CH621K831) [kV]; 15 idle power (CH421K821) [MVAr]; 4 steam flow 1.]

Figure 5.7: Data set report: Range scaled values of the statistically significant variables. Level controller output 537 and controller output 535 are system variables. The automation system reacts rapidly after a pump failure and the steam flows start recovering after the fault.


[Plot panels: (a) Scores, with the 2nd principal component on the vertical axis and time labels from −52.5 to 591 on the points; (b) T² statistics versus Time [s].]

Figure 5.8: Data set report: (a) Scores, the measured data in the low-dimensional space. Numbers in the figure are time labels. (b) Hotelling’s T² statistics. Horizontal line is upper control limit (4.20).

Eight statistically significant variables were selected. Eigenvectors are shown in Figure 5.6 (a) and a loading plot in Figure 5.6 (b). Time series are shown in Figure 5.7. Output voltage (variable 44) is statistically the most influential variable. The absolute value of its first principal component loading is large (the loading is −0.92). This can also be seen from the loading plot in Figure 5.6 (b) and from the second subplot in Figure 5.7. The output voltage [V] dropped from almost its maximum value to zero in a few seconds. This measurement belongs to the same pump that suddenly stopped.

Process variable statistics are calculated for all variables, but only the statistically significant variables are shown. Properties before the state change are shown in Table 5.1 and properties after the state change in Table 5.2. For example, the mean piston position of the high-pressure turbine feed water (HPFW) was 82% before the state change.

Process state changes can be examined with Hotelling’s T² statistics after the variable reduction. The state change in the process can be detected from the score values, Figure 5.8 (a). State 1 is on the left side and state 2 on the right side of the new coordinate system. Hotelling’s T² statistic is shown in Figure 5.8 (b). The state change can be detected as T² exceeding the upper control limit (UCL) (4.20). The maximum value of T² was reached at Time = 5.5.
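How scores and a T² series arise from PCA can be sketched with NumPy. The sketch assumes the common definition of Hotelling’s T² as the sum of squared scores divided by the corresponding eigenvalues; the upper control limit (4.20) is not reproduced here, and the function name and the synthetic data are illustrative only (DMT itself used the Matlab function princomp).

```python
import numpy as np

def pca_t2(X, n_components=3):
    """Return loadings, scores and Hotelling's T^2 for the rows of X."""
    Xc = X - X.mean(axis=0)                           # center each variable
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigval)[::-1][:n_components]   # largest eigenvalues first
    eigval, eigvec = eigval[order], eigvec[:, order]
    scores = Xc @ eigvec                              # data in the principal component space
    t2 = np.sum(scores**2 / eigval, axis=1)           # T^2 = sum_i t_i^2 / lambda_i
    return eigvec, scores, t2

# Synthetic data: 200 samples of 5 variables with one abnormal sample at index 150.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
X[150] += 10.0
loadings, scores, t2 = pca_t2(X)
peak = int(np.argmax(t2))                             # index of the largest T^2
```

In this synthetic case the largest T² value falls on the manipulated sample, which mirrors how the peak of the T² curve points to the time of the state change.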

The reader is reminded that these results were created automatically, without using any process knowledge or information about the stored event. These DM results combined with expert process knowledge can be used for further design based event analysis. Next, modeling and the development of a leakage index are introduced.


Table 5.1: Data set report: Variable statistics before state change.

 #  Variable                      Mean     Median   Min      Max
36  level controller output 537   98.70    101.19   85.95    102.98
44  output voltage P5             0.77     0.78     0.65     0.82
32  controller output 535         5219.96  5265.84  4912.11  5345.05
12  HPFW piston position          81.98    87.60    55.83    89.54
 7  steam flow 4                  314.03   320.78   275.06   327.77
37  400 kV voltage                62.39    63.48    49.94    69.70
15  idle power                    -0.47    -0.46    -1.09    -0.20
 4  steam flow 1                  1266.28  1289.61  1114.81  1303.59

Table 5.2: Data set report: Variable statistics after state change.

 #  Variable                      Mean     Median   Min      Max
36  level controller output 537   88.03    87.97    86.04    90.54
44  output voltage P5             0.72     0.72     0.68     0.76
32  controller output 535         5009.22  4997.83  4874.13  5150.82
12  HPFW piston position          58.08    57.83    56.78    60.33
 7  steam flow 4                  284.06   283.56   274.25   296.53
37  400 kV voltage                56.67    56.66    52.98    60.21
15  idle power                    -0.47    -0.46    -1.18    -0.22
 4  steam flow 1                  1166.10  1162.01  1138.11  1205.96

5.3.2 Segmentation

The data set was segmented because of the clear process state change. The first 25% of the time series was removed from the data set. The data selected for analysis comes after the high volatility in Figure 5.5 and after the UCL exceedance in Figure 5.8 (b).

WRLS modeling is not robust enough for large-scale state changes such as a pump failure. The new data set starts about 80 seconds after the pump failure and it can be used for reaching the goal of this work, a method to detect leakage. It was assumed that the process after the pump failure is stable enough for modeling. The length of the data set was 1061 observations after preprocessing and segmentation.

5.4 Variable selection and feature extraction

Two isolation valves separate each steam line from the boiling water reactor (BWR). Four outlets are located symmetrically and therefore the flow rates are similar [15]. The steam lines are located in the primary circulation system of the BWR and a leakage in it is defined as a loss of coolant accident (LOCA) [16].

Variable selection and feature extraction were preparation for the modeling. In this Master’s thesis leakage detection was designed for one variable. In this data set there were flow meters for the coolant flow and for the steam flows. There were not enough possible interpretative variables for the former. The statistically significant variable steam flow 4 was selected as the target variable. Another reason for this was that models for the other steam flows would have been easy to create.


[Time series panels. Variables (label code, tag, name, unit): 7 CH311K304 steam flow 4 [kg/s]; 6 CH311K303 steam flow 3 [kg/s]; 9 CH411K161 HP-inlet pressure [barg]; 10 CH413V501 HPFW piston position [%]; 12 CH413V505 HPFW piston position [%]; 29 CH531K954 APRM 4 [%]; 25 CH531K886 LPRM 22.2 [V]; 28 CH531K953 APRM 3 [%]; 26 CH531K951 APRM 1 [%]; 8 CH312K301 feed water flow [kg/s]. Features: the differences d-CH311K304, d-CH311K303, d-CH411K161, d-CH413V501, d-CH413V505, d-CH531K954, d-CH531K886, d-CH531K953, d-CH531K951 and d-CH312K301 of the same variables.]

Figure 5.9: (1st and 2nd rows) present selected variables. (3rd and 4th rows) present potential interpretative features.

Dynamically similar variables were searched for after the target variable selection. PCA was executed iteratively to obtain the best results. In every round half of the variables were selected for the next PCA. In the last rounds only one data vector at a time was removed from the data matrix. This proceeded until a suitable number of possible interpretative variables (10) remained. There were no negatively correlating variables in this subset, so the reflection values of the variables were not needed.

[Histogram panels. 1st row: the variables steam flow 4 (7, CH311K304), sum of steam flows (3, CH311K035) [kg/s], HP-inlet pressure (9, CH411K161) [barg] and HPFW piston position (10, CH413V501). 2nd row: the features d-steam flow 4 [dkg/s], d-sum of steam flows [dkg/s], d-HP-inlet pressure [dbarg] and d-HPFW piston position [d%].]

Figure 5.10: Histograms: (1st row) shows variables. (2nd row) shows features.

Features were extracted with a difference frame size of one second. Variable and feature time series are shown in Figure 5.9. Feature values vary around zero, even though the variables have higher values at the end of the data set. Features satisfy the normal distribution assumption better than measurements. This is illustrated by the histograms in Figure 5.10.
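A minimal sketch of the difference feature follows. It assumes that the feature is the change of the preprocessed value over the frame, x(k) − x(k − d), where d is the frame size expressed in samples; the function name and the example series are hypothetical.

```python
def difference(values, frame_s=1.0, sample_s=0.5):
    """Difference over a frame of `frame_s` seconds: x(k) - x(k - d)."""
    lag = round(frame_s / sample_s)          # one second at 0.5 s sampling gives d = 2
    return [values[i] - values[i - lag] for i in range(lag, len(values))]

level = [10, 11, 12, 13, 14, 15]             # steadily rising variable
shifted = [v + 100 for v in level]           # same dynamics at a higher level

d_level = difference(level)
d_shifted = difference(shifted)
```

Both series give the same feature values, which illustrates why the features vary around a common level even when the variables themselves sit at different values.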

[Plot panels: (a) Eigenvectors for each variable (variables 7, 6, 9, 10, 12, 29, 25, 28, 26 and 8); (b) A loading plot, with the 2nd principal component on the vertical axis.]

Figure 5.11: The dynamically similar variables: (a) Eigenvectors of each variable. (b) A loading plot. Axes are 1st and 2nd component and label codes represent variables.

Three loading vectors are visualized for each variable in Figure 5.11 (a). The variables corresponding to the label codes are explained in the titles of Figure 5.9. These results were used to select the best interpretative variables for the target variable. Lines between the target variable and the other variables illustrate the similarity. The variables dynamically similar to the target variable are steam flow 3 (CH311K303), high-pressure turbine HP-inlet pressure (CH411K161) and high-pressure turbine feed water (HPFW) piston position (CH413V501).

Scores and T² statistics are illustrated in Figure 5.12. Small changes in the system state can be recognized. These will create undesired error in the modeling results and thereby in the leakage index. It is assumed that process state changes in time can be recognized from the score position on the 1st principal component projection.

[Plot panels: (a) Scores, with the 2nd principal component on the vertical axis and time labels from 112.5 to 591 on the points; (b) T² statistics versus Time [s].]

Figure 5.12: The dynamically similar variables: (a) Scores, the measured data in the low-dimensional space. (b) Hotelling’s T² statistics.


Two different interpretative matrices were created for modeling. Both had the same target variable: CH311K304 (steam flow 4). All automatically selected variables except redundant features were included in the first model. Redundant variables had the same linguistic explanation. Redundant features could also be combined into one feature. It was assumed that the results would be almost the same.

Interpretative variables in the first model:

CH311K303: steam flow 3
CH411K161: high-pressure turbine, HP-inlet pressure
CH413V501: high-pressure turbine feed water (HPFW) piston position
CH531K954: APRM 4
CH531K886: LPRM 22.2
CH312K301: feed water flow

APRM and LPRM are neutron flux measurements in the reactor. Expert industrial engineers use these variables to derive the power at different points of the reactor.

The second model had fewer interpretative features. It was decided to use the computed feature sum of steam flows as one interpretative feature. The other two features were the nearest variables in the distance matrix.

Interpretative variables in the second model:

CH311K035: sum of steam flows
CH411K161: high-pressure turbine, HP-inlet pressure
CH413V501: high-pressure turbine feed water (HPFW) piston position

Delays between the model features were estimated. Mostly there were no delays between the target and the interpretative features. This means that the cross-correlation function got its highest values at τ = 0. It should be remembered that delays should be examined from other data sets too, and there was no possibility for that.

The delay identification by the cross-correlation function between steam flow 4 and the HPFW piston position in model 2 is shown in Figure 5.13. The delay cannot be estimated clearly, because the maximum value of the cross-correlation function is only about 0.2. In this case the delay was estimated to be from zero to seven time steps (3.5 seconds), but this information is not very reliable. The filtering frame size was 10 seconds and the sample time was 0.5 seconds. These preprocessing parameters affected the delay estimation negatively.
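Delay estimation from the cross-correlation function can be sketched as below. The normalization, the sign convention (positive τ means that y lags u) and the function names are assumptions made for this illustration, and the signals are synthetic.

```python
import random
from statistics import mean, pstdev

def cross_correlation(u, y, max_lag=20):
    """Normalized cross-correlation from u to y for tau = -max_lag..max_lag."""
    n = len(u)
    mu, my = mean(u), mean(y)
    su, sy = pstdev(u), pstdev(y)
    r = {}
    for tau in range(-max_lag, max_lag + 1):
        pairs = [(u[k], y[k + tau]) for k in range(n) if 0 <= k + tau < n]
        r[tau] = sum((a - mu) * (b - my) for a, b in pairs) / (n * su * sy)
    return r

def estimate_delay(u, y, max_lag=20):
    """Lag at which the cross-correlation reaches its maximum."""
    r = cross_correlation(u, y, max_lag)
    return max(r, key=r.get)

# Synthetic example: y repeats u three samples later.
rng = random.Random(1)
u = [rng.gauss(0.0, 1.0) for _ in range(300)]
y = [0.0] * 3 + u[:-3]
r = cross_correlation(u, y)
delay = estimate_delay(u, y)
```

For this clean synthetic pair the peak is sharp and lies at three samples, which would be 1.5 seconds at a 0.5 second sample time. With a flat peak of about 0.2, as reported above, the location of the maximum is much less trustworthy.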

[Plot panels: (a) Auto-correlation: u; (b) Auto-correlation: y; (c) Cross-correlation: from u to y. Horizontal axes: τ from −20 to 20.]

Figure 5.13: (a) Auto-correlation of steam flow 4, (b) HPFW piston position and (c) Cross-correlation between these variables.

Linear correlations between the target and the interpretative variables were verified by scatter plot. The paired comparisons for the variables are shown below and above the time series in Figure 5.14. The linear correlation is shown in the title of each subplot. For example, the subplot titles in the top row are the linear correlation values between each interpretative variable and the target feature. The time series of the computed feature sum of steam flows is second on the diagonal in Figure 5.14.

[Scatter plot matrix of filtered data with time series on the diagonal: 7 CH311K304 [kg/s], 3 CH311K035 [kg/s], 9 CH411K161 [barg], 10 CH413V501 [%]. Linear correlations r: 7 and 3: 0.95824; 7 and 9: 0.95331; 7 and 10: 0.95847; 3 and 9: 0.99085; 3 and 10: 0.998; 9 and 10: 0.99038.]

Figure 5.14: Model 2: Scatter plot for filtered data. Linear correlation values are shown in title of each subplot.

The importance of filtering can be seen by comparing the paired comparisons in Figures 5.14 and 5.15. The linear correlation in unprocessed data is not high enough for successful modeling.


[Scatter plot matrix of unfiltered data with time series on the diagonal: 7 CH311K304 [kg/s], 3 CH311K035 [kg/s], 9 CH411K161 [barg], 10 CH413V501 [%]. Linear correlations r: 7 and 3: 0.73298; 7 and 9: 0.60869; 7 and 10: 0.70732; 3 and 9: 0.81075; 3 and 10: 0.94438; 9 and 10: 0.81688.]

Figure 5.15: Model 2: Scatter plot without filtering. Linear correlation values are shown in title of each subplot.

5.5 Modeling

Modeling with WRLS was more challenging than expected, because the structure of the model is complicated and there are many user-settable parameters. Without suitable preprocessing and WRLS parameters the modeling results were not good.

In this work a simple cost function was used to analyze the success of modeling. Not much attention was paid to finding the best parameters to fit the model to the measured values. First, there is little sense in finding the best parameter values for one particular case. Second, there were not enough stored data sets containing the same variables to evaluate the model or to test it in different process states.

Modeling parameters were selected iteratively. For example, if the coefficients Θ of the interpretative features ϕ changed too much, the parameters that affect the forgetting factor λ were changed.

The parameters used in both models are shown in Table 5.3. The first three parameters in the table are used in the preprocessing phase. The moving standard deviation (MSDV) frame size affects the volatility. The limit values for λ are used to constrain the forgetting factor and to secure the model robustness. The minimum value for λ was mentioned by Ljung in [39]. The constant α and the normal variance σ_v² affect the forgetting factor λ and thereby the WRLS coefficient dynamics.


Table 5.3: Preprocessing and model parameters.

Parameter                   Value
Sampling [s]                0.5
Difference frame size [s]   1.0
Filtering frame size [s]    10.0
MSDV frame size [s]         20.0
Minimum value for λ         0.98
Maximum value for λ         0.999
α                           1000
Normal variance             0.000005

5.5.1 Model 1

A model with six interpretative features was created. The model fits the real measured values rather well, as shown in Figure 5.16.

[Plot: steam flow 4 (CH311K304) versus Time [s], with the curves real value and estimate.]

Figure 5.16: Model 1: Preprocessed real steam flow 4 values and the model estimation values.

The cost function value for the model was derived. If the model fits perfectly, the cost function value is zero. It is the sum of the squared estimation errors,

$$J = \sum_{i=1}^{n} e^2(i), \tag{5.1}$$

where the length of the data set n was 1061. The cost function value J for the first model was 0.0126.
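The cost function (5.1) is simple to compute. The sketch below uses a hypothetical function name and toy values; the last line only rescales the reported J of model 1 to a root mean square error per observation, about 0.0034 in range scaled units, as an aid for interpretation.

```python
def cost(real, estimate):
    """Sum of squared estimation errors, eq. (5.1)."""
    return sum((r - e) ** 2 for r, e in zip(real, estimate))

perfect = cost([0.86, 0.87, 0.88], [0.86, 0.87, 0.88])   # perfect fit gives zero
toy = cost([1.0, 2.0], [0.0, 0.0])                        # 1 + 4

rms_model_1 = (0.0126 / 1061) ** 0.5   # reported J and n of model 1
```

Because J grows with the number of observations, it is only comparable between models evaluated on the same data set, as is the case for the two models of this chapter.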

The adaptive coefficient vector (4.32) values Θ for model 1 are shown in Figure 5.17. In the beginning the change of the coefficients depends on the parameter α. The model was unstable with too high values of α and when the normal variance σ_v² was set too small. Many coefficient values go towards zero, because there are too many interpretative features in the model. Therefore model 2 has fewer variables than this model.

[Plot: Model coefficients Θ. Legend: d-steam flow 3, d-HP-inlet pressure, d-HPFW piston position, d-APRM 4, d-LPRM 22.2, d-feed water flow.]

Figure 5.17: Model 1: Coefficient values of six interpretative features.
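The roles of α, σ_v² and the limits of λ can be illustrated with a generic recursive least squares update. The sketch below is not the thesis implementation (equation (4.32) and WRLSlambda are not reproduced here). It assumes that α scales the initial covariance matrix, that λ is lowered when the squared prediction error is large relative to σ_v² times a hypothetical memory-length constant `n0`, and that λ is then clipped to the limits of Table 5.3. All function and parameter names are illustrative.

```python
def wrls_step(theta, P, phi, y, lam_min=0.98, lam_max=0.999,
              sigma_v2=0.000005, n0=100.0):
    """One recursive least squares update with a clipped forgetting factor.

    theta: coefficient list, P: covariance matrix (list of lists),
    phi: feature list, y: measured target value.
    """
    n = len(theta)
    # A priori prediction error.
    e = y - sum(t * f for t, f in zip(theta, phi))
    # P * phi and the scalar 1 + phi' * P * phi.
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    # Assumed rule: a large error relative to sigma_v2 lowers lambda
    # (faster forgetting); the result is clipped to [lam_min, lam_max].
    lam = 1.0 - (e * e) / (denom * sigma_v2 * n0)
    lam = min(lam_max, max(lam_min, lam))
    # Gain, coefficient update and covariance update.
    gain_den = lam + denom - 1.0
    K = [v / gain_den for v in P_phi]
    theta = [t + k * e for t, k in zip(theta, K)]
    P = [[(P[i][j] - K[i] * P_phi[j]) / lam for j in range(n)]
         for i in range(n)]
    return theta, P, lam, e


def scaled_identity(n, scale):
    """Initial covariance matrix, assumed here to be alpha times identity."""
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]


# Illustrative run on synthetic, noise-free data (not thesis data).
true_theta = [0.3, 0.2]
theta, P = [0.0, 0.0], scaled_identity(2, 1000.0)  # alpha = 1000
for k in range(200):
    phi = [1.0 + 0.1 * (k % 7), 0.5 + 0.05 * (k % 5)]
    y = sum(t * f for t, f in zip(true_theta, phi))
    theta, P, lam, e = wrls_step(theta, P, phi, y)
```

In this sketch a large α makes the first updates aggressive, and a small `sigma_v2` pushes λ to its lower limit more often. Both effects are consistent with the instability described above, under the stated assumptions.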

5.5.2 Model 2

The model with three interpretative features also behaves well, as shown in Figure 5.18. The cost function (5.1) value J for model 2 was 0.0084.

[Plot: real value and estimate; x-axis: Time [s].]

Figure 5.18: Model 2: Preprocessed real steam flow 4 values and the model estimation values.

The adaptive coefficient vector (4.32) values Θ for model 2 are shown in Figure 5.19. The coefficient values now change less than in model 1 and vary around their initial values. This is expected, because linearly correlating variables were selected in the variable selection phase. Small changes in the process cause the coefficients to drift.

[Plot: Model coefficients Θ. Legend: d-sum of steam flows, d-HP-inlet pressure, d-HPFW piston position.]

Figure 5.19: Model 2: Coefficient values of three interpretative features.

The sum of the squared model errors was smaller in model 2, and therefore the leakage detection method was developed for this model.

5.6 Development of leakage detection method

For fault detection, a good model estimates the target variable well in a normal situation and poorly during an abnormal event. Data sets should be available for both the normal situation and the abnormal event. No such data sets were available, so the data was manipulated and a slowly developing leakage was created in a steam line.

5.6.1 Simulated leakage

DMT includes a simple algorithm for creating a leakage in the steam lines. Many assumptions were made about the leakage, such as its effects on the other process variables. Only the target measurement values of steam flow 4 and the values of the sum of steam flows were manipulated. The leakage directly affects this derived variable. It should be admitted that this was not the best way to do science. It was expected that the change of correlation between the features after the leakage would be enough to detect the fault. The leakage is assumed to be located before the flow meter. If the leakage were located after the meter, the flow at the meter location would be higher than in a normal situation compared to the other steam flows.

Two types of leakage were simulated: one with a constant flow and one with an increasing flow. The latter, a slowly developing leakage, was used in this work. The fault occurs when half of the data has been used for modeling. In this case the leakage began at t = 335. At the end of the data set the leakage size was less than one percent of the target variable margin¹. In original units it was about 1.5 kg/s.

¹ Margin is the difference between the global maximum and minimum values of the variable in the database.
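The data manipulation described above can be sketched as follows. This is only an illustration under stated assumptions: a linear ramp, a leak located before the flow meter so that the measured flow decreases, and hypothetical names (`add_ramp_leak`, `leak_end`). The flow values are made up for the example and the DMT algorithm itself may differ.

```python
def add_ramp_leak(flow, start_index, leak_end):
    """Subtract a linearly increasing leak from a flow series.

    The leak is zero up to `start_index` and reaches `leak_end`
    (same units as `flow`) at the last sample.
    """
    n = len(flow)
    span = max(1, n - 1 - start_index)
    leak = [0.0 if i < start_index else leak_end * (i - start_index) / span
            for i in range(n)]
    leaky = [f - l for f, l in zip(flow, leak)]
    return leaky, leak


# Illustrative values only: a constant 300 kg/s flow and a leak that
# grows to 1.5 kg/s, starting about halfway through the series.
flow4 = [300.0] * 11
leaky4, leak = add_ramp_leak(flow4, start_index=5, leak_end=1.5)

# The same leak is subtracted from the derived sum of steam flows.
flow_sum = [1200.0] * 11
leaky_sum = [s - l for s, l in zip(flow_sum, leak)]
```

Only the target and the derived sum change in this sketch; all other variables keep their measured values, which mirrors the simplification admitted above.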


5.6.2 Leakage index

The leakage index was created to illustrate the system state simply. It is calculated from the cumulative estimation errors of the model. The residual values are accumulated and filtered with different frame sizes. These values should be near zero in a normal situation. A state change can cause bad model behavior and false alarms. The filtering frame sizes for the cumulative estimation errors were 25 and 50 seconds. These operations make the leakage signal clearer. A weak point is that the leakage signal is delayed, so the operator is not warned quickly.

The leakage index is the difference of these two filtered cumulative estimation error values. If the model estimate drifts to show wrong values, the leakage index rises only temporarily. If the dynamic behavior of the system is normal without state changes, leakage detection operates normally and the index value returns to zero. If the leakage is real, the fault is detected from the increasing leakage index value.
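The construction of the leakage index can be sketched in a few lines. The sketch makes several assumptions that are not stated in the text: the residual is the measured value minus the estimate, the filter is a trailing mean, and the index is the slow (50 s) filter output minus the fast (25 s) one, so that a decreasing cumulative error raises the index. The function names are hypothetical.

```python
def cumulative_sum(values):
    """Running sum of a series."""
    out, total = [], 0.0
    for v in values:
        total += v
        out.append(total)
    return out


def trailing_mean(x, window):
    """Trailing moving average; shorter frames are used at the start."""
    out = []
    for i in range(len(x)):
        frame = x[max(0, i - window + 1): i + 1]
        out.append(sum(frame) / len(frame))
    return out


def leakage_index(real, estimate, sampling_s=0.5, fast_s=25.0, slow_s=50.0):
    """Difference of two filtered cumulative estimation error series."""
    residual = [y - y_hat for y, y_hat in zip(real, estimate)]
    cum = cumulative_sum(residual)
    fast = trailing_mean(cum, round(fast_s / sampling_s))
    slow = trailing_mean(cum, round(slow_s / sampling_s))
    # Assumed sign convention: slow minus fast, so that a decreasing
    # cumulative error (measured flow below the estimate) raises the index.
    return [s - f for s, f in zip(slow, fast)]


# Illustrative data: perfect fit for 300 samples, then the measured value
# stays 0.001 below the estimate (as a leak before the meter would cause).
estimate = [0.87] * 600
real = [0.87] * 300 + [0.869] * 300
index = leakage_index(real, estimate)
```

In this example the index is zero while the model fits and settles at a positive level once the measured flow stays below the estimate. A steadily growing leak would make the index keep increasing instead of levelling off.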

5.6.3 Leakage detection

[Plot panels: (top) real value and estimate, x-axis: Time [s]; (left) Cumulative estimation errors; (right) Leakage index, with annotations at 500 and 540.]

Figure 5.20: Model 2 leakage detection results: (top) Preprocessed artificial steam flow 4 values and model estimation values. Real value in figure represents the flow meter output. (left) Cumulative estimation error and filtered cumulative estimation errors with frame size of 25 and 50 seconds. (right) Derived leakage index values.

Model 2 was tested with the manipulated data set and the results are illustrated in Figure 5.20. The cumulative estimation errors are shown without filtering (grey line), with a 25-second filtering frame (black line) and with a 50-second filtering frame (wide line). There is no leakage when the cumulative estimation error values are near zero, see Figure 5.20 (left). State changes harm the model slightly. A small disturbance in the leakage index is seen at t = 200, see Figure 5.20 (right). Operators should stay alert: when the cumulative estimation error values decrease, the leakage index increases.

The first indication of the fault can be seen at t = 500, when the leakage index started to increase. A leakage is possible at t = 540, and at t = 570 there is a strong possibility of leakage. The results, leakage detection and the development of detection methods are discussed further in the next chapter.

Chapter 6

Discussion

In this work an artificial leakage in the steam pipeline was detected. Many assumptions were made about the leakage and about the process dynamics after the fault occurred. It took about three minutes to detect the leakage.

The leakage was not very large in percentage terms. Flows in the main pipelines of a nuclear power plant (NPP) are very high. In steam lines they can be up to 400 kg/s, and the main coolant flow can be almost 9000 kg/s. After data mining (DM) and modeling, the feeling of success captivated the author's mind.

However, further research revealed the real situation. The artificial leakage used with model 2 was inspected. The leakage flow [kg/s] is shown in Figure 6.1 (a) and the total leakage volume [kg] in Figure 6.1 (b).

[Plot panels: (a) Leak size [kg/s] versus Time [s]; (b) Leakage volume [kg] versus Time [s], with a data marker at X: 500, Y: 107.1.]

Figure 6.1: Artificial leakage: (a) Leakage size and (b) leakage volume.

The leakage volume (mass) at the end was more than 300 kilograms, even though the average leakage flow was less than 1 kg/s. The main reason is that the flow in steam line 4 is about 300 kg/s, so the leakage is small in relative terms. It was said in the model 2 analysis that there is a small possibility of detecting the leakage at t = 500. This does not help much, because the leakage volume was by then already more than 100 kilograms.
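The relation between the leakage flow and the accumulated leakage mass is a time integral. The sketch below uses trapezoidal integration on a hypothetical linear ramp. The ramp slope and duration are illustrative assumptions, not values read from Figure 6.1, and the function name `accumulated_mass` is made up for the example.

```python
def accumulated_mass(leak_flow, sampling_s):
    """Trapezoidal integral of a leak flow series [kg/s], giving mass [kg]."""
    mass, out = 0.0, [0.0]
    for prev, cur in zip(leak_flow, leak_flow[1:]):
        mass += 0.5 * (prev + cur) * sampling_s
        out.append(mass)
    return out


# Hypothetical ramp: the leak grows by 0.01 kg/s every second for 200 s,
# sampled every 0.5 s (401 samples).
sampling_s = 0.5
leak_flow = [0.01 * sampling_s * i for i in range(401)]
mass = accumulated_mass(leak_flow, sampling_s)
```

In this example the leak averages only 1 kg/s, yet about 200 kg has escaped after 200 s. This is why a detection delay of a few minutes matters even for a leak that is small relative to the line flow.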


6.1 Other ways to detect leakages

The steam lines contain hot high-pressure water. If a leakage occurs, high-pressure steam would fill the area very quickly. This is detected, because there are moisture meters in some pipelines at Olkiluoto. If moisture is not detected, the steam condensation begins immediately. Water flows to a floor drain, and the level meters there will detect the leakage.

6.2 Problems in data mining and modeling

There are hundreds of sensors in the Olkiluoto NPP. The author was able to examine only the variables selected by the operator. The number of variables in the data sets was less than 100, so not all variables desired for leakage detection were available.

The weighted recursive least squares (WRLS) method is an adaptive linear modeling method. Variables for the model were selected based on correlation. Process knowledge was not used much. How can the assumption that these variables really correlate globally with the target variable be verified?

The WRLS method takes some time before the initial coefficients settle to the system. The stored data sets had thousands of observations, but they did not cover a very long time because of the high sampling frequency. This caused problems, because the goal was to develop a method for detecting slowly developing faults. A leakage can last for hours before it is recognized. Data with a much lower sampling frequency should be available.

6.3 Future development ideas

The data management tool (DMT) could be made much more user-friendly. It has only a very simple user interface, and the Matlab code has few comment lines. It is not possible to use all data mining tools in the DMT.

The model should be tested with other data sets. The cost function could also be developed further.

The leakage detection method is useless if it is not used on-line, because in off-line analysis the starting time of the fault can be detected using the data after the fault. The leakage detection method should be tested with more suitable data, or even with real leakage data, before it is implemented in practice. After testing, the leakage detection method could be developed further. For example, the leakage index should represent leakages to the user more clearly.

DM of all data sets could be improved. Currently only a few statistical properties of the variables are stored in the DMT database. For example, the delays between variables in each data set could be stored there. This information could be aggregated as a delay distribution for each pair of variables.
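In this work the delays between process variables were detected with the cross-correlation function. A delay that could be stored in the database is the lag that maximizes the cross-correlation of two series. The following is a minimal sketch with a hypothetical function name (`estimate_delay`) and made-up data; it uses the unnormalized cross-correlation of the mean-removed series over their overlapping part.

```python
def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) at which y best matches a delayed x.

    A positive result means that y lags behind x.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xc = [v - mx for v in x]
    yc = [v - my for v in y]
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Sum of products over the overlapping part of the two series.
        score = sum(xc[i] * yc[i + lag]
                    for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Illustrative data: y is x delayed by 3 samples.
x = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0, 0, 0]
delay = estimate_delay(x, y, max_lag=5)
```

Multiplying the lag by the sampling interval (0.5 s in Table 5.3) gives the delay in seconds. Collecting one such estimate per data set for a pair of variables would give the delay distribution suggested above.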


Other methods for variable selection, feature extraction and modeling should be considered, such as the self-organizing map (SOM) [42] or the Kalman filter [43] for modeling.

Leakage detection could also be tested in much smaller pipelines. In an NPP these are located in "not so risky" places and are not connected to the primary circulation system.

TVO also provided design based isolation valve data sets. These were not included in this work, because these events did not contain any information other than samples and variable codes. They can be analyzed with DMT in the future if some background information is provided.

Chapter 7

Conclusions

In this Master’s thesis, Olkiluoto nuclear power plant data was explored with the data management tool (DMT). An adaptive model and a leakage detection method were also developed. The most challenging part was modeling. One data set with a pump failure was analyzed and reported by DMT. Statistically significant variables were selected and illustrated with figures and tables. These data mining results, combined with expert process knowledge, can be used for further design based event analysis.

Two adaptive models were created for steam flow 4. Principal component analysis was used for variable selection, although this is not its original purpose. A distance matrix was defined to find dynamically similar variables. Difference values were derived from the selected variables and used as interpretative features in the adaptive model. The models were analyzed with a simple cost function. The better model was used to develop the leakage index, which is used for fault detection. It detects leakages in the steam lines of the primary circulation system of a boiling water reactor. The method is based on the estimation error of the model.

In this work an artificial leakage was created by manipulating the data set. The leakage was detected, but not early enough. The leakage detection method should be tested with other data sets and signals. The leakage detector should not be located in the primary circulation system of the nuclear power plant, because of the huge flow rates. Many future development ideas were found, and these require further study.

"Research is the process of going up alleys to see if they are blind." Marston Bates (1906-74)


Appendix A

Illustrations

The Matlab program structure is shown in Figure A.1. The structure is needed in DMT development, but the user is not required to understand what is behind the interface. The bootstrap file is DataManagementTool. The symbols are explained below:

- Matlab classes are illustrated as rectangles. The name of the "m-file" is written in the center. At the top of the rectangle there is a very short explanation of the class.
- Abnormal events are stored in TXT format and simulator data in XLS format. These are converted to Matlab form and stored in MAT-files. These are illustrated as tanks.
- DMT generates one or more figures in a class that has an ellipse over the rectangle.
- Lines are connections between the classes. A diamond retrieves information and an arrow transfers the data to the other class. A triangle means that data is stored to a file. A square means loading. Labels next to the lines are program variables, which are passed to another class.

Main classes of DMT:

- DataManagementTool: Bootstrap file, the user interface.
- LoadDataSet: Data sets can be loaded from a MAT-file. This is less time-consuming than loading directly from the original files.
- CreateTargetData: Data preprocessing.
- SelectFeatures and FeatureSelection together select statistically significant variables and interpretative features for modeling.
- ModelSelection: Creates the interpretative matrix.
- WRLSData: Modeling. WRLS is implemented in WRLSlambda.


[Diagram: DMT program structure. Data sources: nuclear power plant (OL1&2, PI), real fault data, 37 data sets stored as MAT-files, and a database of global properties for variables (min, max). Classes shown: DataManagementTool (user interface), LoadDataSet, ChangeParameters, DataInformation, NamesWhere, tvo_datastruct_create, CreateRealDataFaults, Load_RealDataFaults, CreateDataBase, CreateTargetData, SmoothData, ScaleData, StdData, MirrorData, DifferenceData, SelectAlmostAll, ClusterData, SelectFeatures, FeatureSelection, ModelSelection, WRLSData, WRLSlambda, PlotData, ScatterData, PlotFeatures, ScatterFeatures and PlotModel. Connection labels are program variables such as d.target, d.features, d.model and p.]

Figure A.1: Structure of the data management tool (DMT)


The user interface of DMT is very simple. The first version of it is shown below. In this version the user cannot perform all the DM methods that are possible by changing the Matlab code. For example, only a few parameters can be changed through the user interface.

**************************

Data management tool v1.0

**************************

0: Information

1: Create Target data

2: Plot data

3: Scatter data

4: Select variables (repeat)

5: Feature extraction and model selection

6: WRLS and visualization

80: Change parameters

90: Start again (load p and d)

99: Exit program

Bibliography

[1] J. Parviainen. Data Mining for Finding Surface Defects in Steel Strips. Master’s

thesis, Helsinki University of Technology, 2000.

[2] Anon. Probabilistic safety analysis in safety management of nuclear power plants.

Guide, May 2003. YVL 2.8.

[3] J.J. Gertler. Fault Detection and Diagnosis in Engineering Systems. Marcel Dekker,

1998.

[4] M. Sirola. Computerized decision support systems in failure and maintenance management of safety critical processes. PhD thesis, Technical Research Centre of Finland, 1999.

[5] Xue Z. Wang. Data Mining and Knowledge Discovery for Process Monitoring and

Control. Springer, 1999.

[6] M. Demetriou and M. Polycarpou. Incipient fault diagnosis of dynamical systems using online approximators. Automatic Control, IEEE Transactions on, 43(11):1612–1617, 1998.

[7] C.M. Ying and B. Joseph. Sensor Fault Detection Using Noise Analysis. Ind. Eng.

Chem. Res, 39(2):396–407, 2000.

[8] J.C.Y. Yang and D.W. Clarke. A self-validating thermocouple. Control Systems Technology, IEEE Transactions on, 5(2):239–253, 1997.

[9] F. Øwre, J. Kvalem, T. Karlsson, and C. Nihlwing. A new integrated BWR supervision and control system. Human Factors and Power Plants, 2002. Proceedings of the 2002 IEEE 7th Conference on, pages 4–41, 2002.

[10] R. Ritala, E. Alhoniemi, T. Kauranne, K. Konkarikoski, A. Lendasse, and M. Sirola.

Nonlinear temporal and spatial forecasting: modeling and uncertainty analysis. MASI

Technology Programme 2005 - 2009 Yearbook, page 10, 2007. (NoTeS) - MASIT20.

[11] M.S. Kazimi and N.E. Todreas. Nuclear power economic performance: Challenges

and opportunities. Annual Review of Energy and the Environment, 24(1):139–171,

1999.


[12] Anon. Three Mile Island accident. http://en.wikipedia.org/wiki/. retrieved at August

2007.

[13] J. Tylee. On-line failure detection in nuclear power plant instrumentation. Automatic

Control, IEEE Transactions on, 28(3):406–415, 1983.

[14] Anon. Teollisuuden voima Oy. Euraprint, www.tvo.fi, 2 2007. Taskutieto.

[15] Anon. Teollisuuden Voima Oy. http://www.tvo.fi. retrieved at August 2007.

[16] B. Pershagen. Light Water Reactor Safety. Pergamon, 1st edition, October 1989.

[17] E. Tommola. Mittavia modernisointitoita Olkiluoto 2-yksikon vuosihuollossa. eY-

timekas Teollisuuden Voiman verkkolehti, 2007. In Finnish.

[18] Anon. Radiation and Nuclear Safety Authority (STUK). http://www.stuk.fi/. retrieved at September 2007.

[19] P. Chapman, J. Clinton, R. Kerber, T. Khabaza, T. Reinartz, C. Shearer, and

R. Wirth. CRoss Industry Standard Process for Data Mining, 1999. Crisp-DM 1.0.

[20] Anon. Statistical Package for the Social Sciences (SPSS). http://www.spss.com/.

retrieved at August 2007.

[21] D. Pyle. Data Preparation. Morgan Kaufmann, 1999.

[22] J. Venna. Dimensionality reduction for visual exploration of similarity structures.

PhD thesis, Helsinki University of Technology, 2007.

[23] J.F. Hair, R.E. Anderson, R.L. Tatham, and W.C. Black. Multivariate Data Analysis.

Prentice Hall, 5th edition, 1998.

[24] H. Karttunen. Datan kasittely. CSC-Tieteellinen laskenta Oy, 1994. In Finnish.

[25] O. Aumala, H. Ihalainen, H. Jokinen, and J. Kortelainen. Mittaussignaalien kasittely.

Pressus Oy, 1995. In Finnish.

[26] H.D. Jin, Y.H. Lee, G. Lee, and C. Han. Robust Recursive Principal Component

Analysis Modeling for Adaptive Monitoring. Industrial & Engineering Chemistry

Research, 45(2):696–703, 2006.

[27] H. Imelainen. Developing model structure for process automation use. Master’s thesis,

TKK, 1997.

[28] P. Riihimaki. Development of Calibration by Multivariable Statistical Methods. Master’s thesis, Helsinki University of Technology, 2004.

[29] S. Laine. Using visualization, variable selection and feature extraction to learn from

industrial data. PhD thesis, TKK, 2003.


[30] E. Bingham, A. Gionis, N. Haiminen, H. Hiisila, H. Mannila, and E. Terzi. Segmentation and dimensionality reduction. Proceedings of the SIAM International Conference on Data Mining (SDM), 2006.

[31] Anon. Stockstoshop. http://www.stockstoshop.com/bollinger.htm. retrieved at August 2007.

[32] Y.H. Chu, S.J. Qin, and C. Han. Fault Detection and Operation Mode Identification

Based on Pattern Classification with Variable Selection. Industrial & Engineering

Chemistry Research, 43(7):1701–1710, 2004.

[33] C.K. Yoo, S.W. Choi, and I.B. Lee. Dynamic monitoring method for multiscale fault

detection and diagnosis in MSPC. Ind. Eng. Chem. Res, 41(17):4303–4317, 2002.

[34] L.I. Tong, C.H. Wang, and C.L. Huang. Monitoring defects in IC fabrication using

a Hotelling T2 control chart. Semiconductor Manufacturing, IEEE Transactions on,

18(1):140–147, 2005.

[35] H. Anton. Elementary Linear Algebra 5e. John Wiley & Sons Inc, 1987.

[36] S. Theodoridis. Pattern Recognition. Academic Press, 2003.

[37] T. Escobet and L. Trave-Massuyes. Parameter estimation methods for fault detection

and isolation.

[38] Anon. AS-74.3114 Computer Modeling P. http://www.control.hut.fi/Kurssit/AS-

74.3114/index.en.html. retrieved at September 2007.

[39] L. Ljung. System Identification Theory for the User. 1999.

[40] Anon. Reactor shut down in Olkiluoto nuclear power plant. 05.09.2007 12:58 STT.

[41] J. Vesanto et al. SOM Toolbox for Matlab 5. Helsinki University of Technology, 2000.

[42] J. Vesanto, J. Himberg, E. Alhoniemi, and J. Parhankangas. Self-Organizing Map

in Matlab: the SOM Toolbox. Proceedings of the Matlab DSP Conference, 99:16–17,

1999.

[43] G. Welch and G. Bishop. An Introduction to the Kalman Filter. ACM SIGGRAPH

2001 Course Notes, 2001.
