Master of Science Thesis in Vehicular Systems
Department of Electrical Engineering, Linköping University, 2016
Supervision of the air loop in the Columbus Module of the International Space Station
Jasper Germeys
LiTH-ISY-EX--16/5014--SE
Supervisor: Daniel Jung, ISY, Linköping University
Examiner: Erik Frisk, ISY, Linköping University
Department of Vehicular Systems
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping, Sweden
Copyright © 2016 Jasper Germeys
Abstract
Failure detection and isolation (FDI) is essential for reliable operation of complex autonomous systems and of other systems where continuous observation or maintenance is either very costly or otherwise not easily accessible.
A benefit of model based FDI is that, in contrast to design by data clustering, no fault data is needed to detect and isolate a fault. However, it is limited by the accuracy and complexity of the model used. As models grow more complex, or have multiple interconnections, problems with the traditional methods for FDI emerge.
The main objective of this thesis is to utilise the automated methodology presented in [Svärd, 2012] to create a model based FDI system for the Columbus air loop, a small but crucial part of the life support on board the European space laboratory Columbus.
The process of creating a model based FDI system, from the creation of the model equations and their validation to the design of residuals, test quantities and evaluation logic, is handled in this work, although the latter parts are treated only briefly, which leaves room for future work.
This work indicates that the methodology presented is capable of creating quite decent model based FDI systems even with poor sensor placement and limited information about the actual design.
Carl Svärd. Methods for Automated Design of Fault Detection and Isolation Systems with Automotive Applications. PhD thesis, Linköping University, Vehicular Systems, The Institute of Technology, 2012.
Acknowledgements
This thesis has taken some time to finish and has defined quite a good portion of my being; it marks not only the end of a great time but also a promise of future developments. This is a huge milestone for me, and these acknowledgements could never fully contain the depth of my gratitude.
First and foremost I must acknowledge Erik Frisk and the whole Vehicular Systems department at Linköping University, and Airbus DS, for allowing me to work on such an interesting system, even though it has given me plenty of grief at times. Daniel Jung for always trusting in me, but also Enrico Noack and Mikael Persson for the time during the preparation work. Not to forget my opponent Gustav Romeling.
Special thanks to Bengt Göransson, 千葉大奈, 李森 and 仇隽挺 for helping me get back on track at a time when motivation was in short supply; for this you will not be forgotten.
All my friends during my student years, especially Björn Holm, Dan Adolfsson, Fredrik Henriksen and Mikael Göransson, as well as the LeMoine and Lockwood families. But also Stefan Persson, who has managed to put up with me for such a long time.
Carl Forden, Jonas Frossmed and Fritiof Olsson for simple but dependable friendship. Anna for all those wonderful conversations that bring joy to everyday life. The Asp family for an everlasting family bond, as well as David, Matte and Peter in Nässjö.
Joakim Råberg, since even small people can do great deeds.
The supporting friends in the Mahjong community like David Clarke, Matthias Köhler, Senechal and Gemma Sakamoto among others, you know who you are.
The people at Gata/Park in Nässjö municipality for your help in getting out of the treacherous claws of unemployment, but also for your understanding when I chose to leave you for higher education; it took its time, but now it is done, and without you this would not have happened.
The Borg family for their unfathomable kindness and support. The Hjelm, Kvist, Nylander and Pieters families in Kärrgruvan, Norberg, as well as the Högström family, for all that you have meant to me.
Solbritt Sundvall for her solid commitment to trying to make a man out of me.
And of course my family, who always stand up for my cause. I love you all.
Linköping, December 22, 2016
Jasper RK Germeys
Contents
1 Introduction
  1.1 The Columbus air loop
  1.2 Columbus failure management system
  1.3 Background
  1.4 Model and data
  1.5 Problem formulation
  1.6 Related research
  1.7 Expected results
  1.8 Outline of the report
2 Model description
  2.1 The ducts
    2.1.1 Mass conservation
    2.1.2 Passive flow
  2.2 Fans
    2.2.1 Fan curve
    2.2.2 Fluid work
    2.2.3 Fan flow by power consumption
  2.3 Sensors
  2.4 Other equations
3 Model parametrisation and validation
  3.1 Data sets
  3.2 Model error measurement
  3.3 Validation of sensor redundancy
  3.4 Validation of the fan curves
  3.5 Model parametrisation and validation
  3.6 Validation of fan flow by power consumption
4 Design of fault detection system
  4.1 The model
  4.2 Generating the MSOs
  4.3 Selecting the residuals
    4.3.1 System modes
  4.4 Generating the residual code
  4.5 Validation of the residuals
5 Results
  5.1 Fault detection and isolation with injected faults
  5.2 Detection and isolation on real data
6 Conclusion
  6.1 Overall evaluation
  6.2 Future work
Bibliography
Abbreviations
AFS: air mass flow sensor
Airbus DS: Airbus Defence and Space
Airbus SE: Airbus Group SE
CFA: cabin fan
CHX: condensate heat exchanger
CHXFA: condensate heat exchanger fan
COL-CC: COLumbus Control Centre
COL-DMS: COLumbus Data Management System
CTCU: cabin temperature control unit
CUSUM: cumulative sum
CWSA: condensate water separator assembly
DAG: directed acyclic graph
EADS: European Aeronautic Defence and Space company NV
EU: electronic unit
FDI: failure detection and isolation
GLR: generalized likelihood ratio
HEPA: high-efficiency particulate arrestance
HS: humidity sensor
IMV: inter module ventilation
IRFA: inter module ventilation (IMV) return fan
ISFA: IMV supply fan
ISS: International Space Station
KH: total dynamic head
LCOS: liquid carry over sensor
MAE%: mean absolute error percentage
MSE: mean square error
MSO: minimal structurally overdetermined set
NaN: not-a-number
SCC: strongly connected components
TPS: air pressure sensor
Units
I   Current [A]
N   Revolutions per minute [min⁻¹]
Q   Volume flow [m³/s], most of the time in [m³/h]
R   Gas constant [J/(kg K)]
T   Temperature [°C]
W   Power [W]
φ   Relative humidity [-]
ρ   Density [kg/m³]
p   Pressure [kPa]
q   Mass flow [kg/s], exclusively used as [kg/h] in this document
1 Introduction
The European space laboratory Columbus was designed and built by European Aeronautic Defence and Space company NV (EADS), now Airbus Group SE (Airbus SE), through its affiliate Astrium, now Airbus Defence and Space (Airbus DS) [Airbus Defense and Space, 2013a], and is a part of the International Space Station (ISS). It is equipped with a range of experimental facilities and basic life support for up to three astronauts. Together with the microgravity environment, Columbus has enabled numerous extraordinary experiments that were previously not possible in many scientific fields such as physics, material science, biology, medicine, and human physiology [Airbus Defense and Space, 2013b,c].
1.1 The Columbus air loop
The main function of the Columbus air loop is to provide a forced air flow to avoid dead air pockets, which is safety critical for the crew. The forced air flow also enables fire detection and removes heat from air cooled equipment. As the Columbus Module has no O2 or CO2 control, the inter module ventilation has to provide fresh air to support life on board.
The air loop consists of four fans (three with check valves, one without), a high-efficiency particulate arrestance (HEPA) filter and a condensate heat exchanger. A simple overview of the system is shown in Figure 1.1. The system is designed to operate in 8 stable modes and an additional 43 interim modes intended for air loop reconfiguration.
Figure 1.1: A schematic overview of the Columbus air loop, [Source: Airbus DS, 2013]
1.2 Columbus failure management system
The failure management system on Columbus consists of multiple parts. The first is the on-board COLumbus Data Management System (COL-DMS), which collects and logs all measured signals. Being fully automatic, it can quickly initiate required responses, but due to sparse resources the diagnosis part of COL-DMS is limited to detecting time critical failures such as fire outbreaks and other environmental hazards.
The second part is the COLumbus Control Centre (COL-CC), which is the primary ground based unit. It receives data from COL-DMS in near real time through a downlink and performs data analysis, both manually and automatically. The failure management part of COL-CC has a similar purpose to the onboard unit and focuses primarily on detection of critical failures and short term responses. It is supported by an Engineering Support Centre, which performs offline analyses of selected data to find and direct long term corrective actions for COL-CC, and by an Assembly, Integration and Test facility for testing and development purposes.
An onboard monitoring experiment, the ERNO BOX, has been added as a third part of the failure detection arsenal. This system runs in parallel to COL-DMS, monitoring the same data but with no means for interaction. Instead, its responses are submitted to COL-CC for comparison with the decisions made by COL-DMS [Noack et al., 2012].
There have been some recent developments in the COL-CC infrastructure to improve the failure management efficiency [Sabath et al., 2012, 2014].
1.3 Background
Since its launch in February 2008, Columbus has encountered multiple failures and abnormalities. The failures have either been minor or been recovered before any serious damage to the system occurred. It has emerged that the original failure detection system has weak robustness and poor fault sensitivity [Noack et al., 2010].
Upon request from Astrium, European universities have been engaged in the development and improvement of the diagnosis system on Columbus. BTU Cottbus is using a data mining approach and Linköping University model based failure detection [Noack et al., 2011, 2012, Noack and Schmitt, 2012].
A benefit of model based failure detection and isolation (FDI) is that, in contrast to data clustering, no fault data is needed to detect and isolate a fault. However, it is limited by the accuracy and complexity of the model used. As models grow more complex, or have multiple interconnections, problems with the traditional methods for FDI emerge. In [Svärd, 2012] an automated methodology is presented for the design of an FDI system for complex systems operating in multiple modes.
This thesis focuses on the design of a model based FDI system for the Columbus air loop, a small but critical part of the Columbus life support.
Figure 1.2: Subsampled unprocessed data. Panels: AirFlow Sensors [m3/min] (AFS1, AFS2); Pressure Differences [Pa] (CHX, cfa1KH, cfa2KH, isfaKH, filtered CHX); Fan rotation [rpm] (cfa1N, cfa2N, isfaN, irfaN).
1.4 Model and data
A model of the Columbus air loop was proposed during the summer of 2012 by Jasper Germeys and Mikael Persson. This model, which is based on physical relations and on measured data from 2008 to 2010 provided by Astrium, has been used as the basis for the model in this thesis. A second batch of data, containing a more detailed selection of the more interesting parts of the initial set, was received later.
In this thesis only the second data set and an additional third data set, in which the fans' rotation speed is increased in steps, are used. This third data set has been used as training data for the parameterisation of the model. A selection of the signals in the second and third data sets is displayed in Figure 1.2 and Figure 3.1 respectively. The sensors contained in the data sets are listed in Tables 2.1 and 2.2. Additionally, Figure 1.1 and Figures 2.1-2.4 are taken from the accompanying documentation.
1.5 Problem formulation
The purpose of this thesis is to develop a diagnosis system that not only detects but also isolates the failures included in the model. The system suffers from noisy sensors and fault propagation, so simple means of fault isolation will not suffice if the risk of false alarms is to be minimised. The task includes detecting the fans' current work mode and determining the actual mass flow, as this central state has poor observability. This will pose a problem in unknown modes and when clogging or leaks occur, as those faults are highly linked.
The FDI system should:
• Be designed based on a model of the air loop.
• Be able to detect which mode the fans operate in.
• Detect and isolate single faults such as fan loss, leaks, wear, (partial) clogging, and sensor loss.
1.6 Related research
A model-based design of a diagnosis system can be divided into three phases: a modelling phase, a residual generation phase and finally an evaluation phase where the rules for fault detection and isolation are set. These phases intertwine and will be carried out iteratively.
In the modelling phase the full model equation set will be extended with the desired detectable faults and modes. Structural fault detectability and isolability will be analysed using the Dulmage-Mendelsohn decomposition [Frisk et al., 2012]. The parameters of the equations will then be estimated from fault-free sample data.
The residual generation phase consists of extracting the minimal structurally overdetermined sets (MSOs) used to generate the residuals [Krysander et al., 2008]. As it is unknown how many alternative sets the model allows, a greedy selection algorithm described in [Svärd et al., 2013] will be applied initially.
The evaluation phase consists of creating the rules for detection and isolation. There are many methods available for detection; constant thresholds are the simplest, but more robust methods such as cumulative sum (CUSUM), likelihood functions, adaptive thresholds [Sneider and Frank, 1996], or combinations thereof [Meinguet et al., 2012] have also proven effective. As the system requires robustness and will be operating in multiple modes, a relaxed generalized likelihood ratio (GLR), as proposed in [Svärd et al., 2014], was to be considered.
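To make the CUSUM idea mentioned above concrete, the following is a minimal sketch of a one-sided CUSUM test on the magnitude of a residual. It is a textbook form and not necessarily the detector designed in this thesis; the function name and the tuning parameters `drift` and `threshold` are hypothetical.

```python
def cusum_alarm(residuals, drift, threshold):
    """Return the index of the first CUSUM alarm, or None if no alarm is raised.

    drift and threshold are hypothetical tuning parameters: drift is subtracted
    from |r| at every sample so that noise alone does not accumulate, and an
    alarm is raised once the accumulated sum exceeds threshold.
    """
    g = 0.0
    for k, r in enumerate(residuals):
        # Accumulate only the part of |r| that exceeds the drift; never go below 0.
        g = max(0.0, g + abs(r) - drift)
        if g > threshold:
            return k
    return None


# Made-up residual: small noise-like values followed by a persistent deviation.
residual = [0.1] * 20 + [1.0] * 10
alarm_index = cusum_alarm(residual, drift=0.5, threshold=2.0)
print(alarm_index)
```

Compared with a constant threshold on the raw residual, the accumulated sum trades a short detection delay for fewer false alarms caused by isolated noisy samples.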
1.7 Expected results
The expected result of this work is an FDI system that detects and isolates faults in the air loop system. The system should be able to work in real time and detect faults within reasonable time. It is also expected that the system can measure wear on the fans and partial clogging.
It shall be evaluated whether the methodology presented in [Svärd, 2012] is a working methodology for this system, and analysed whether the solution can be improved by using the Kullback-Leibler divergence for fault isolation as suggested by [Eriksson et al., 2011].
Additionally, it shall be evaluated how well the FDI system works in relation to the uncertainties in the model and whether simpler methods could suffice.
1.8 Outline of the report
The report is outlined as follows. Chapter 1 introduces the problem and the system, the chosen methodology and related research, as well as the expected results and this outline of the rest of the report.
Chapter 2 lists the details of the model and the equation sets. The work of setting the model parameters is presented in Chapter 3, together with an estimation of the expected noise in the no-fault mode. Chapters 2 and 3 together form the modelling phase of the work.
Chapter 4 presents the full model, including modes and faults, and continues with the residual generation and the sets the methodology generates. These are then evaluated and the results are presented. How well the design works against collected data is presented in Chapter 5.
Chapter 6 contains the evaluation of the work, together with ideas for potential future work.
2 Model description
The Columbus air loop consists of a small set of flows with some sensors and fans. This chapter describes these components and the models used to represent them. A 3D model of the air loop, provided by Astrium, is shown in Figure 2.1. Even though the 3D model is trusted to be an accurate representation of the air loop, no detailed data has been gathered from this figure.
Figure 2.1: A 3D model of the Columbus air loop, [Source: Airbus DS, 2013]
2.1 The ducts
The ducts have been split, for modelling purposes, into regions based on the air flow they are expected to contain, as shown in Figure 2.2. A description of the different sections follows below. The actual dimensions of the ducts are not known, but it is fair to assume that they are all static.
Harmony module: The module supplying Columbus with air and receiving the returned air. This module is not considered a part of this system and is thus not modelled. We cannot assume any correlation between the two flows from this section due to its unknown internal structure.
supply: Transfers air from the Harmony module. This region has many unknowns, as the only sensors are those of the inter module ventilation (IMV) supply fan (ISFA). The input flow may have any temperature or humidity, even if it is assumed to be constant during normal operation. There is both a check valve and a manually controllable valve.
outlet: From the junction of the supply and fan{1,2} sections to the multiple outlets to the cabin. In this section the air is filtered through a HEPA filter, regulated in a controllable cabin temperature control unit (CTCU) with a condensate heat exchanger (CHX) and finally distributed to the cabin. Apart from the cabin, this is the only section where the properties of the air are expected to change. The pressure loss is expected to be quite high. There is a sensor measuring the pressure difference over the heat exchanger. It is unknown whether the water purging system lies within the span measured by this sensor.
fanJ: Small region where both cabin fans share the ducts. It is unknown whether the majority of this section is before or after the actual fans, but this should not be relevant for modelling purposes, as the section should only contain the air flow produced by the cabin fans.
fan{1,2}: Small region where only the cabin fan{1,2} (CFA) affects the flow, drawing air from the inlet region to the supply and outlet junction. Both fans have check valves and should be virtually identical. These sections are subsections of the fanJ section.
inlet: Small region that transfers air from the cabin to the fanJ and return sections. There should be an air filter here that can be clogged. There are also smoke detectors mounted in this section.
return: The section where air returns to the Harmony module. Just as for the supply section, only the sensors provided by the IMV return fan (IRFA) are known; however, this section should not affect the system as a whole that much. There is no check valve in this section but there is a controllable valve.
cabin: The actual cabin. Some of the sensors, like pressure and cabin temperature, should be mounted in this section. This section is not a duct and cannot be considered sealed. It is, however, fair to assume that there is some correlation between the flows in to and out of this section.
Figure 2.2: A schematic overview of the ducts and air flows. Labelled regions: Harmony module, supply, outlet, fanJ, inlet, return, cabin.
2.1.1 Mass conservation
Under the assumption that there are no leaks in the system, the local pressure will vary with the mass flow in to and out of each node, i.e. as a function of the in and out flow.
$$ p = f(Q_{in}, Q_{out}) \tag{2.1} $$
Additionally, it is assumed that the local pressures in the system are constant in the time frame considered, giving the following relations between the flows:
$$ Q_{outlet} = Q_{supply} + Q_{fanJ} \tag{2.2a} $$
$$ Q_{fanJ} = Q_{fan1} + Q_{fan2} \tag{2.2b} $$
$$ Q_{inlet} = Q_{return} + Q_{fanJ} \tag{2.2c} $$
The sections are described in Section 2.1.
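As an illustration of how the balances (2.2a) to (2.2c) can serve as consistency checks, the sketch below evaluates them as residuals that are zero when the flows balance. It assumes that a flow estimate is available for every section, which is not the case for the raw sensor set; the function name and the numbers are hypothetical.

```python
def flow_balance_residuals(q_supply, q_fan1, q_fan2, q_return, q_outlet, q_inlet):
    """Residuals of (2.2a) and (2.2c), with (2.2b) substituted for Q_fanJ.

    Both residuals are zero when the flow balances hold.
    """
    q_fanj = q_fan1 + q_fan2                    # (2.2b)
    r_outlet = q_outlet - (q_supply + q_fanj)   # (2.2a)
    r_inlet = q_inlet - (q_return + q_fanj)     # (2.2c)
    return r_outlet, r_inlet


# Made-up flows in m^3/h, chosen so that both balances close exactly.
print(flow_balance_residuals(q_supply=120.0, q_fan1=300.0, q_fan2=0.0,
                             q_return=120.0, q_outlet=420.0, q_inlet=420.0))
```

A non-zero residual then points at a violated assumption, for example a leak between the junction and the outlet.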
2.1.2 Passive flow
Sections without a fan will have a flow driven by the pressure difference. This can be modelled using the Bernoulli equation, but as there are too many unknowns in the system, such as duct size, number of bends and whether the flow is laminar or not, a simplified model is used instead, given by
$$ \Delta p_i = K_{f,i} (Q_i K_{c,i} - K_{d,i})^2 \tag{2.3} $$
which amounts to treating most terms in Bernoulli's equation as constant, negligible or flow dependent.
2.2 Fans
Each fan has a certain number of sensors, as listed in Table 2.1. All four fans are of the same type with an attached electronic unit (EU). Thus, one model structure should apply to them all. For all fans except the IMV return fan, a check valve is connected to the fan output to prevent reverse flow. A schematic overview of a fan is displayed in Figure 2.3.
Figure 2.3: A schematic overview of a fan. [Source: Airbus DS, 2013]
Table 2.1: Fan signals

Measurements
part                     signal              unit
CFA{1,2}_ / I{S,R}FA_    Delta_P_            kPa
                         Fan_Speed_          min⁻¹
                         Input_Current_      A
                         EU_Temp_            °C
                         Fan_Temp_           °C
                         Input_Voltage_      V

Switches/Analyses
part                     signal              state
CFA{1,2}_ / I{S,R}FA_    Pwr_Stat_           ON/OFF
                         Avail_Stat_         (UN)AVAIL
CFA_                     Redun_Stat_         (UN)AVAIL
I{S,R}SOV_               Vlv_Open_Stat_      X/OPEN
                         Vlv_Closed_Stat_    X/CLOSED
2.2.1 Fan curve
It is common practice to utilise fan curves when designing ventilation systems, to ensure that the fans work in their efficient load regions by calculating the different operation points. Figure 2.4 shows the fan curve for the current fans. This curve is normally generated in ideal (dry) conditions, and to scale it for other fan speeds (N) and air densities (ρ) the affinity equations
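$$ \frac{dp_1}{dp_2} = \frac{\rho_1}{\rho_2} \left( \frac{N_1}{N_2} \right)^2, \quad \frac{Q_1}{Q_2} = \frac{N_1}{N_2}, \quad \frac{W_{f1}}{W_{f2}} = \frac{\rho_1}{\rho_2} \left( \frac{N_1}{N_2} \right)^3 \tag{2.4} $$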
are utilised. A map has been made to calculate the flow (Q) as a function of the total dynamic head (KH) (dp), i.e., the fan's workload. However, there are two possible flows for a given head, and the fluid power (Wf) the fan can generate depends on where on the curve the fan is operating, as illustrated in Figure 2.4. The lower flow output, the so called stall region, is considered a fault, as it is not only less efficient but also induces more wear on the fan. There are multiple reasons for a fan to enter stall mode; the most common ones originate in the design stage, like over-dimensioning of the fans or competing parallel flows. It is assumed that this fan curve is valid for 8500 rpm and an air density of 1.2041 kg/m3 (dry air at 20 °C).
Figure 2.4: Characteristics of the fans, [Source: Airbus DS, 2013]
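The affinity equations (2.4) can be applied to a single point on the fan curve as sketched below. The reference speed and density are the ones stated above (8500 rpm, 1.2041 kg/m3); the operating point itself and the function name are made up for illustration, and the scaled power is the fluid power Wf.

```python
N_REF = 8500.0     # rpm, speed at which the fan curve is assumed valid
RHO_REF = 1.2041   # kg/m^3, dry air at 20 degC


def scale_operating_point(dp_ref, q_ref, wf_ref, n, rho):
    """Scale a fan curve point (dp, Q, Wf) from (N_REF, RHO_REF) to (n, rho)."""
    speed_ratio = n / N_REF
    density_ratio = rho / RHO_REF
    dp = dp_ref * density_ratio * speed_ratio ** 2   # pressure scales with rho * N^2
    q = q_ref * speed_ratio                          # volume flow scales with N
    wf = wf_ref * density_ratio * speed_ratio ** 3   # fluid power scales with rho * N^3
    return dp, q, wf


# Hypothetical reference point: 200 Pa, 400 m^3/h, 50 W; fan slowed to half speed.
print(scale_operating_point(200.0, 400.0, 50.0, n=4250.0, rho=RHO_REF))
```

At half speed and unchanged density the pressure drops to a quarter, the flow to a half and the fluid power to an eighth of the reference values.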
2.2.2 Fluid work
All attempts to utilise the fan curve to determine an operational point from the measured power consumption have been very unreliable. The system loss is not insignificant. Analysis indicates that the power in the fan curve is the measured power (Wm) and not the fluid power (Wf), which implies that the scaling law for W does not apply directly. This observation is strongly supported by the data as well.
$$ W_m = W_f + W_{loss} $$
Determining Wloss in the ideal conditions is not feasible, and there is no basis for assuming that Wloss would scale as nicely. Instead, a model given by
$$ W = K_p \Delta p Q + K_q Q^2 + K_d \tag{2.5} $$
has been estimated from the fan curve map by linear regression, where Kp would be the fan's efficiency in building pressure, Kq the flow work efficiency and Kd any constant losses. This approach is accurate enough to determine whether the fan is working in the desired mode or in the faulty stall mode. This is later validated in Section 3.4.
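Since (2.5) is linear in Kp, Kq and Kd, the regression can be sketched with ordinary least squares as below. This is a minimal sketch on synthetic, noise-free data, not the fan curve map used in the thesis; the function name and all numerical values are made up.

```python
import numpy as np


def fit_power_model(dp, q, w):
    """Least-squares estimate of (Kp, Kq, Kd) in W = Kp*dp*Q + Kq*Q^2 + Kd."""
    dp, q, w = (np.asarray(a, dtype=float) for a in (dp, q, w))
    # The model is linear in the parameters, so ordinary least squares applies.
    regressors = np.column_stack([dp * q, q ** 2, np.ones_like(q)])
    params, *_ = np.linalg.lstsq(regressors, w, rcond=None)
    return params


# Synthetic stand-in for the fan curve map (all numbers made up).
q_map = np.linspace(0.05, 0.15, 50)                        # volume flow [m^3/s]
dp_map = 300.0 - 8000.0 * q_map ** 2                       # pressure rise [Pa]
w_map = 1.5 * dp_map * q_map + 200.0 * q_map ** 2 + 10.0   # power [W]

kp, kq, kd = fit_power_model(dp_map, q_map, w_map)
print(kp, kq, kd)
```

With real data the fit would not be exact, and the remaining error is what the model error measures in Chapter 3 are meant to quantify.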
2.2.3 Fan flow by power consumption
An already parameterised model has been received from Astrium, which is a direct correlation between the fan power consumption and the flow that the fans produce. This model is obtained by linear regression on collected and, possibly, preprocessed data.
$$ W = K_f Q + K_d \tag{2.6} $$
The validation in Section 3.6 concludes that this model is accurate within certain intervals, which should be enough for residual generation.
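Solving (2.6) for Q gives a flow estimate directly from the electrical measurements, with the power taken as the product of input current and input voltage. The sketch below uses hypothetical parameter values; the actual Kf and Kd received from Astrium are not stated here.

```python
def flow_from_power(current_a, voltage_v, k_f, k_d):
    """Estimate the fan flow Q by solving W = Kf*Q + Kd for Q, with W = I*V."""
    power_w = current_a * voltage_v   # electrical power drawn by the fan
    return (power_w - k_d) / k_f      # (2.6) rearranged for Q


# Hypothetical values: 0.5 A at 120 V, Kf = 0.1 W per (m^3/h), Kd = 20 W.
print(flow_from_power(0.5, 120.0, k_f=0.1, k_d=20.0))
```

Because the model is only accurate within certain intervals, such an estimate should be treated as valid only where the linear fit was validated.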
2.3 Sensors
A complete listing of the available sensors is shown in Table 2.1 together withTable 2.2. Some of these sensors have been concluded to be of no use to thecurrent model and are consequently not included.
AFS : The air mass flow sensors, positioned in the junction between supply, fan{1,2} and outlet. Abbreviated as AFS{1,2} in most parts of this document. The sensors are validated in Section 3.3 but are often saturated and, due to local vortices, rarely provide redundancy. An improved model (2.7) has been made in combination with (2.2a), which is validated in Section 3.5.
$$ AFS_i = \sum_j K_{j,i} Q_j + K_0 \tag{2.7} $$
CHXFA : The pressure sensors over the condensate heat exchanger fan. They are used in combination with (2.3) to produce an estimate of Qoutlet, validated in Section 3.5, and directly in (3.2), which is also validated in Section 3.5.
CTCU : Multiple sensors and switches; only the cabin temperature signals are used in the current model, and these are validated in Section 3.3.
CWSA : Condensate water separator assembly. None of these signals are used in the current model.
HS : Air humidity sensor, measuring the relative water saturation of the air in percent; used to calculate the current air density (2.11) and validated in Section 3.3.
TPS : Air pressure sensor, measured in mmHg; used in the system and validated in Section 3.3. Abbreviated as TPS(1-4).
LCOS : Liquid carry over sensor. Used to detect if the CWSA fails, which is outside the scope of this thesis.
Cabin : Smoke detectors and a low airflow state; none of these are currently in use by this model.
Table 2.2: Non fan signals

Measurements
part           signal                   unit
AFS{1,2}_      Cab_Air_Massflow_        kg/h
CHXFA_         Delta_P{1,2}_            Pa
CTCU{1,2}_     Cabin_Temp{1-3}_         °C
               Avg_Cabin_Temp_          °C
               TCV_Posn_                %
CWSA{1,2}_     Input_Current_           A
               Delta_P_Air_             kPa
               Delta_P_Water_           kPa
HS{1,2}_       Air_Humidity_            %rH
TPS{1-4}_      Air_Press_               mmHg
LCOS{1,2}_     Level_                   V
Cabin_         SD{1,2}_Obscuration_     V
               SD{1,2}_Scatter_         V

Switches/Analyses
part           signal                   state
CTCU{1,2}_     Pwr_Stat_                ON/OFF
               Kick_Posn_Up_Stat_       (IN)ACTIVE
               Kick_Posn_Down_Stat_     (IN)ACTIVE
               Kick_TCV_Hold_Posn_      (IN)ACTIVE
               Dryout_Stat_             DISABLED
               Cntl_Loop_Stat_          (DIS)ABLED
CWSA{1,2}_     Pwr_Stat_                ON/OFF
Cabin_         Low_Airflow_Stat_        NOM/LOW
2.4 Other equations
There are some other equations used in the model that are of a more generic type, like the pressure difference
$$ \Delta p = p_a - p_b \tag{2.8} $$
the conversion from mass flow (q) to volume flow (Q)
$$ q = \rho Q \tag{2.9} $$
and the electrical power law
$$ W = I V \tag{2.10} $$
used to calculate the fans' power consumption, where W is the power in watts, I the supplied current and V the voltage.
To calculate the air density (ρ) as a function of pressure (p), temperature (T) and relative humidity (φ), the ideal gas law is used
$$ \rho = f(T, p, \phi) \tag{2.11} $$
$$ \rho = \frac{p_{air}}{R_{air} T} + \frac{p_{water}}{R_{water} T}, \quad p_{water} = p_{sat} \phi, \quad p_{air} = p - p_{water} $$
where p_x are the partial pressures and R_x the specific gas constants.
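The density relation (2.11) can be sketched as follows. The saturation pressure model is not stated in the text, so a Magnus-type approximation is assumed here; the function names, the gas constant values and the choice of pascal and a humidity fraction as input units are likewise assumptions of this sketch.

```python
import math

R_AIR = 287.058     # J/(kg K), specific gas constant of dry air
R_WATER = 461.495   # J/(kg K), specific gas constant of water vapour


def saturation_pressure_pa(temp_c):
    """Magnus-type approximation of p_sat over water (assumed, not from the thesis)."""
    return 610.94 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def air_density(temp_c, pressure_pa, rel_humidity):
    """Density of humid air [kg/m^3]; rel_humidity is a fraction between 0 and 1."""
    temp_k = temp_c + 273.15
    p_water = saturation_pressure_pa(temp_c) * rel_humidity   # partial pressure of vapour
    p_air = pressure_pa - p_water                             # partial pressure of dry air
    return p_air / (R_AIR * temp_k) + p_water / (R_WATER * temp_k)


# Dry air at 20 degC and one standard atmosphere.
print(air_density(20.0, 101325.0, 0.0))
```

For dry air at 20 °C and one standard atmosphere this evaluates to about 1.204 kg/m3, in line with the reference density assumed for the fan curve. With the sensors in this document, the pressure would first be converted from mmHg and the humidity from percent.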
3 Model parametrisation and validation
This chapter contains verification and parametrisation of the models listed in Chapter 2. It starts with an overview of the data sets used throughout the chapter and a description of how the model error is measured. This is followed by verification of the system redundancy and the fan curve assumptions. Finally, the parametric models are both estimated and verified.
3.1 Data sets
For parametrisation of the models, a data set where the fans are tested in different modes has been used. This training data does not represent normal operation but was recorded upon request to assist in modelling the system. For validation, some data sets with fairly normal operation have been selected, together with a data set which is a subsampled version of all available data not used as training data. A short description of each data set follows below, and an overview of the training data set is shown in Figure 3.1.
Training data: Data set from 20100[1-2]*, which is a test run to gather information on how the system changes between the different modes. Both IRFA and ISFA are enabled with open valves. The temperature, cabin pressure and humidity are either constant or within normal variation.
n20080319: Validation data set with a low sample rate on some signals. CFA2 provides the cabin flow and all fan modes and valves are kept constant.
n2008070: Validation data set going from normal operation to internal mode, where the supply fan is disabled and both cabin fans are run at high speed. The return fan is disabled during the whole set but its valve is opened in the internal mode. The supply valve is not closed. There is a slight peak in humidity when the valves close.
Figure 3.1: Raw sensor data from the training data set. Panels: Airflow Sensors [kg/h] (AFS1, AFS2); Pressure Differences [Pa] (CHX, cfa1KH, cfa2KH, isfaKH, filtered CHX); Fan rotation [rpm] (cfa1N, cfa2N, isfaN, irfaN).
n20091129: Validation data set in normal operation with a swap of which cabin fan is running, followed by the return fan being disabled and its valve closing. There are also noticeable drops in the AFS signals when the CWSA is working.
subsampled data: All available data, subsampled for quicker processing and used for validation. The last registered value is used when data is missing.
3.2 Model error measurement
To measure the model error, the model equations are evaluated on the validation data sets. Mean square error (MSE) and mean absolute error percentage (MAE%) have been used to determine whether an equation seems to hold, as
$$ \mathrm{MSE}: \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2 \qquad \mathrm{MAE\%}: \frac{100}{N} \sum_{i=1}^{N} \frac{|x_i - \hat{x}_i|}{x_i} \tag{3.1} $$
in addition to a manual behavioural check.
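The two error measures can be sketched as below, with x the measured signal and x_hat the model output. The percentage is taken as 100/N times the sum of the relative absolute errors, which is the usual definition of a percentage error and an assumption about the intended scaling; the measured values are assumed to be positive, and the function names are illustrative.

```python
import numpy as np


def mse(x, x_hat):
    """Mean square error between measured x and modelled x_hat."""
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return float(np.mean((x - x_hat) ** 2))


def mae_percent(x, x_hat):
    """Mean absolute error relative to the measured value, in percent."""
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return float(100.0 * np.mean(np.abs(x - x_hat) / x))


# Made-up example: three measured flows and the corresponding model outputs.
measured = [100.0, 200.0, 400.0]
modelled = [110.0, 190.0, 400.0]
print(mse(measured, modelled), mae_percent(measured, modelled))
```

MSE penalises large deviations more heavily and depends on the signal's unit, while MAE% is unit free, which is why the two are reported side by side.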
3.3 Validation of sensor redundancy
This section validates the simplest form of redundancy, where multiple sensors ought to monitor the same information. This is done to determine whether the sensors truly are redundant and, if not, how much their outputs differ. As there is no need for estimation in this section, the training data set is also used as validation data.
Air pressure sensors
The air pressure sensors are quantised in steps of 0.4 mmHg. After adjusting for the offset, the residual ends up within one, occasionally two, quantisation steps. Whether the offset is due to constant local pressure differences or to bad calibration cannot be determined. The only contradiction to this is found in the subsampled data set, as shown in Figure 3.2. The biggest offset is between TPS1 and TPS4, at 9 steps or 0.48 kPa (3.6 mmHg).
[Figure 3.2 plot. Panels: Raw signals in mmHg (TPS1, TPS2, TPS3, TPS4); Residuals after removing offset: TPS1-TPS2, TPS1-TPS3, TPS1-TPS4 (max: |6.092|).]
Figure 3.2: Validation of the air pressure sensors on the subsampled data set
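The redundancy check on two pressure sensors can be sketched as follows. The thesis does not state how the offset was adjusted, so this sketch assumes it is estimated as the mean difference between the two signals; the function names and sample values are hypothetical. The last line reproduces the unit conversion of the 9-step offset quoted above (1 mmHg is about 133.322 Pa).

```python
from statistics import mean

MMHG_TO_PA = 133.322  # approximate conversion factor
STEP_MMHG = 0.4       # sensor quantisation step stated in the text


def redundancy_residual(a, b):
    """Difference of two redundant signals with the mean offset removed."""
    diff = [x - y for x, y in zip(a, b)]
    offset = mean(diff)  # assumption: offset taken as the mean difference
    return [d - offset for d in diff], offset


def in_steps(values, step=STEP_MMHG):
    """Express values in units of quantisation steps."""
    return [v / step for v in values]


# Hypothetical readings in mmHg, for illustration only.
tps1 = [760.0, 760.4, 760.0, 759.6]
tps4 = [756.4, 756.8, 756.8, 756.0]
residual, offset = redundancy_residual(tps1, tps4)
print(offset, in_steps(residual))
print(9 * STEP_MMHG * MMHG_TO_PA / 1000.0)  # 9 steps expressed in kPa
```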
Air flow sensors
At the junction with the air mass flow sensors (AFS) there are multiple models that could be used to determine the actual flow in the region. As shown in Figure 3.1, the AFS sensors do not give a good indication of the true flow in the region. Table 3.1 lists the results together with the percentage of the data that is excluded due to a saturated sensor. The difference between the sensors is often bigger when a sensor is capped.
Table 3.1: Air mass flow sensors redundancy error
data         MSE      MAE%    Capped%
training     3847.4   11.42   33.5
n20080319    1088.7    8.22    0.0
n2008070      214.78   1.79   70.7
n20091129     182.27   2.48   21.1
subsampled   4407.9   11.29   32.3
Cabin fan pressure sensors
This redundancy can only be utilised when both fans are enabled at the same time. It is not known whether this is never recorded or whether it is discarded. Among the selected validation data sets, there is only redundancy in the n2008070 data set and the subsampled data set. There seems to be an offset between the sensors, and we can assume that part of it originates from the differences in duct design. However, it does not seem to be constant, which could be due to clogging in the system. In the n2008070 data set the predicted offset is as large as 10%. The MAE% is 5.34% in the subsampled data set and 3.79% in the n2008070 data set.
[Figure 3.3 plot. Panels: n2008070 (max: |0.07685|); subsampled, with sections marked 17-Feb-2008, 09-Jul-2008 and 03-Aug-2009 (max: |0.2943|).]
Figure 3.3: Residuals of the cabin fan pressure sensors
At the beginning of the 3 August 2009 section of the subsampled residual, shown in Figure 3.3, there is a clear disturbance. It is believed to originate from the newly started fan having problems leaving stall mode. During this interval both fans are set to run at high speed while the supply fan was abruptly disabled, indicating that something was wrong with the supply and that there was a desire to vent out the module. However, it is believed that this left one of the fans unable to ensure a positive flow.
Fan input voltage
The input voltage to the fans often varies, and the signal depends on the mode the fan is working in. This variation is fairly small, normally within one percent. Table 3.2 shows the MSE and MAE% of the different fans when compared to the supply fan.
Table 3.2: Fan voltage redundancy error
             ISFA vs IRFA     ISFA vs CFA1     ISFA vs CFA2
data         MSE     MAE%     MSE     MAE%     MSE     MAE%
training     0.964   0.79     0.191   0.34     0.859   0.74
n20080319*   2.189   1.20     -       -        0.953   0.79
n2008070*    -       -        0.044   0.17     -       -
n20091129*   1.141   0.86     0.134   0.30     0.978   0.80
subsampled   1.945   1.12     0.094   0.24     0.963   0.79
Humidity sensors
As shown in Figure 3.4, the humidity sensors display slightly different dynamics, indicating that they must have different positions in the system. The MSE and MAE% are displayed in Table 3.3.
[Figure 3.4 plot. Panels: Air Humidity in % (HS1, HS2); HS1-HS2 (max: |1.905|).]
Figure 3.4: Humidity sensors from the training data set
Table 3.3: Humidity sensors redundancy error
data         MSE       MAE%
training     1865.67   5.94
n20080319    1849.63   5.17
n2008070     1178.48   2.16
n20091129    1848.76   3.76
subsampled   1641.13   4.87
Temperature sensors
When comparing the temperature sensors, it is concluded that they stay within one quantisation step of each other. The variation of the average temperature signal indicates that it is calculated from more detailed data. CTCU2 is not in use at all most of the time. No offset or drifting has been detected.
[Figure 3.5 plot. Panels: Raw signals in °C (Temp1, Temp2, Temp3, Avg); Residuals: Temp1 - Temp2, Temp1 - Temp3, Avg - mean(Temp(1-3)).]
Figure 3.5: Validation of the CTCU1 temperature sensors
3.4 Validation of the fan curves
As no true flow is known in the system, it is assumed that the fan curve together with the affinity laws generates a flow that can be used instead. The equations displayed in Section 2.2.1 are part of the affinity laws and will not be validated. The estimations involving flows produced by this method yield better results than the other methods tested, which indicates that the method is functional.
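For reference, the textbook fan affinity laws for a fan of fixed size state that the flow scales with the rotation speed, and the pressure rise with the speed squared and the air density. The sketch below applies them to one point of a fan curve. It is not taken from Section 2.2.1: the function name, the argument names and the numbers are hypothetical.

```python
def scale_fan_point(q_ref, dp_ref, n_ref, n, rho_ref=1.0, rho=1.0):
    """Scale a reference fan-curve point (q_ref, dp_ref), known at speed n_ref
    and density rho_ref, to speed n and density rho with the affinity laws."""
    ratio = n / n_ref
    q = q_ref * ratio                            # flow scales with speed
    dp = dp_ref * ratio ** 2 * (rho / rho_ref)   # pressure with speed^2 and density
    return q, dp


# Hypothetical reference point: 400 (flow units) at 100 Pa and 2000 rpm.
print(scale_fan_point(400.0, 100.0, 2000.0, 1000.0))
```

Halving the speed halves the flow and quarters the pressure rise, which is why the fan mode (speed) has to be known before a fan curve can be used as a flow estimate.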
Detection of fan modes
To utilise the fan curves, the fan mode must be determined. Initially this was done manually on the training data set to generate estimations, but as the generalisation described in Section 2.2.2 produced an overall better result, all estimations were re-generated using it. A comparison of the manual and automated mode detection is shown in Figure 3.6.
[Figure 3.6 plot. Panels: Airflow [kg/h] (CFAX:High, ISFA:High; CFAX:Low, ISFA:High; CFAX:High, ISFA:Low; AFS1; AFS2 filtered; Estimation); ISFA mode, CFA1 mode and CFA2 mode (low/high, automatic and manual).]
Figure 3.6: Estimation of modes
Decoupling of air density
Early in the construction of the model it was realised that the air density dependency was often not properly defined among the equations and input signals. In addition, it was desired to decouple the density from the fan curves. The conclusion, however, was that the fan curve and mode detection could not be satisfactorily decoupled from the air density on a modelling basis. As the fan mode is required to determine the produced flow, air density has not been decoupled from the fan curve model.
3.5 Model parametrisation and validation
The model is parametrised using linear regression in the case of (2.7) and (3.2), and curve fitting (Levenberg-Marquardt) in the case of (2.3). The parametrisation is done using the whole training data set, except in the case of (2.7), where the samples with capped AFS signals were discarded.
AFS_i - Q:   \( \mathrm{AFS}_i = \sum_j K_{j,i} Q_j + K_0 \)   (2.7)
CHXdp - Q:   \( \Delta p_i = K_{f,i} (Q_i K_{c,i} - K_{d,i})^2 \)   (2.3)
kh - CHX:    \( kh = K_v \Delta p + K_0 \)   (3.2)
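As an illustration of the linear-regression step, the sketch below fits the two parameters of (3.2) with ordinary least squares and evaluates the quadratic relation (2.3) for given parameters. It is a minimal sketch under stated assumptions: the function names and the data are hypothetical, and the Levenberg-Marquardt fit of (2.3) is not reproduced here.

```python
def fit_line(dp, kh):
    """Ordinary least-squares fit of kh = K_v * dp + K_0, as in (3.2)."""
    n = len(dp)
    mean_x = sum(dp) / n
    mean_y = sum(kh) / n
    sxx = sum((x - mean_x) ** 2 for x in dp)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dp, kh))
    k_v = sxy / sxx
    k_0 = mean_y - k_v * mean_x
    return k_v, k_0


def chx_dp(q, k_f, k_c, k_d):
    """Evaluate (2.3): dp = K_f * (Q * K_c - K_d)^2 for one fan."""
    return k_f * (q * k_c - k_d) ** 2


# Hypothetical, noise-free samples lying on kh = 0.008 * dp + 0.1.
dp_samples = [50.0, 100.0, 150.0]
kh_samples = [0.5, 0.9, 1.3]
print(fit_line(dp_samples, kh_samples))
```

Samples that should be excluded from a fit, such as the capped AFS readings mentioned for (2.7), would be filtered out before calling the fitting function.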
AFS-Q equation
A complete overview of the parametrisation of (2.7) is shown in Table 3.4. Overall, the AFS2 sensor yields a very good result, while the AFS1 parametrisation appears to be clearly faulty. An inspection of the estimated parameters shows that AFS2 uses about 60% of ISFA and 30% of the cabin fans, while the AFS1 parametrisation uses 30% of them all. A re-parametrisation weighting ISFA higher for AFS1 might give better results, but as the AFS1 sensor is often saturated, this might never be possible.
[Figure 3.7 plot. Panels: AFS1-Q [m3/h] (AFS1, Estimate AFS1, abs(error); max: 160.7); AFS2 (AFS2, Estimate AFS2, abs(error); max: 117.3); CFA1 mode and CFA2 mode (low/high).]
Figure 3.7: Validation of the AFS-Q parametrisation (n2008070)
In the n2008070 data set, displayed in Figure 3.7, the AFS2 parametrisation becomes worse when the ISFA valve is opened. This could indicate either that the ISFA check valve is not operational or that there is a passive flow from the supply module. This behaviour can also be seen in the n20091129 data set when the ISFA valve is closed.
Table 3.4: AFS-Q parametrisation error
data         AFS1 MSE   AFS1 MAE%   AFS2 MSE   AFS2 MAE%
training     12689      27.14       385.4      4.09
n20080319     4001      18.27        34.2      1.15
n2008070     14738      28.62       806.9      5.38
n20091129    15400      29.91       419.1      5.04
subsampled   13915      28.43       576.2      5.09
CHX-Q equation
The parametrisation of (2.3) is clearly not perfect when viewing its match to the training data set, but it seems to hold quite well overall, even if it appears to suffer from an offset. It remains to be analysed whether this offset is drifting and could thereby be caused by clogging or a similar fault. Figure 3.8 shows the worst case of offset among the selected validation data sets. It is interesting how the offset changes sign when the working cabin fan is changed at about sample 1700. Additionally, the fan has problems leaving the faulty stall mode, which it finally manages once the return fan is disabled just before sample 2000.
Table 3.5: CHX-Q parametrisation error
data         MSE     MAE%    MSE filtered   MAE% filtered
training     246.0   11.03   151.7          8.90
n20080319    228.5   12.45    58.9          6.08
n2008070     207.0   10.80     9.5          2.20
n20091129    156.0    9.62    52.5          5.71
subsampled   218.1   16.23    69.0          6.47
[Figure 3.8 plot. Panels: CHX-Q [Pa] (CHX, Estimate, filtered CHX, filtered Est); abs(error) (max: 47.36) and abs(filtered error) (max: 13.72).]
Figure 3.8: Validation of the CHX-Q parametrisation (n20091129)
KH-CHX equation
The results of the parametrisation of (3.2) are shown in Table 3.6. It can be noted that the parametrisation seems to overshoot when the return valve is closed. This happens in the first part of the n2008070 data set, as seen in Figure 3.9, and in the latter part of the n20091129 data set. This could indicate a back flow through the return fan or that there is a forced exhaust flow even when the fan is disabled.
Table 3.6: KH-CHX parametrisation error
data         MSE     MAE%    MSE filtered   MAE% filtered
training     0.012   10.41   0.0089          8.53
n20080319    0.011   10.36   0.0022          4.00
n2008070     0.023   16.27   0.013          14.19
n20091129    0.014   11.81   0.0084         10.36
subsampled   0.018   13.85   0.010           9.49
[Figure 3.9 plot. Panels: KH-CHX [kPa] (Estimation, cfaXkh, filtered Est); abs(error) (max: 0.5292) and abs(filtered error) (max: 0.1961).]
Figure 3.9: Validation of KH-CHX parameterisation (n2008070)
3.6 Validation of fan flow by power consumption
The model (2.6) has been verified both against the air flow sensors (AFS against f(W)), as displayed in Table 3.7, and against the air flow generated from the fan curves (2.4) (W against f(Q)), as shown in Table 3.8, with varying results. The model seems to hold fairly well when the fans operate in certain regions, but as this is quite often not the case, its usability might be limited. Note that all fans are identical, while the fan model (2.6) has different parameters for each fan. This could be because each parametrisation fits that fan's default working region rather than the overall behaviour. When using the parameters for CFA2 on the CFA1 flow, an MAE% improvement of about 10% is achieved. A similar attempt for IRFA does not yield improved results; however, as the IRFA flow is unknown, this method cannot be determined to be incorrect.
Table 3.7: AFS estimation by power consumption error
data         AFS1 MSE   AFS1 MAE%   AFS2 MSE   AFS2 MAE%
training     57562      37.9        56518      37.2
n20080319     4288.7    16.2         1095.0     7.6
n2008070      3360.6    10.8         3377.9    11.6
n20091129     2782.3     9.6          569.6     4.9
subsampled    6730.6    15.6         3361.3     8.1
Table 3.8: Power consumption estimation by fan flow error
             ISFA            IRFA             CFA1             CFA2
data         MSE     MAE%    MSE      MAE%    MSE      MAE%    MSE      MAE%
training     940.2   20.9    3645.9   58.9    1092.2   28.0    1810.0   18.7
n20080319      3.3    1.2    2891.0   40.7    -        -         37.4    6.5
n2008070      14.0    2.6    -        -        432.6   19.7     523.1   17.7
n20091129     22.2    3.7       5.6    1.3     928.9   33.7      10.5    3.6
subsampled    14.0    2.6    1684.0   25.5    1161.7   32.4      61.2    6.9
4 Design of fault detection system
This chapter explains how the faults are implemented and how the residuals are selected and finally tested. The desired detectable faults are added to the model, which is then partitioned into minimal structurally overdetermined sets (MSOs) by using Dulmage-Mendelsohn decomposition and analysing the strongly connected components (SCC) [Frisk et al., 2012, Krysander et al., 2008]. Only when this is done is the greedy approach used to select the residuals.
4.1 The model
The complete model has 78 equations generating 55 states from 36 input signals. 32 faults are handled, and a description of how these are included in the model follows below. The whole system is displayed in Figure 4.1.
Sensor Faults
Sensor faults (F_s) have been added to the model as an additive parameter on the measured sensor signal (S_s) as

\[ S_t = S_s + F_s \]

As the original signal is known during simulation, multiplicative faults can also be simulated by letting F_s be a function of S_s. This fault is added to all input signals except the enabled and open/closed valve signals.
Duct clogging
A simulation of the complete model is required to determine how clogging really affects the system. This has not been done; instead, a heavily simplified fault that can be seen as clogging is simulated. The fault is added as a parametric fault to (2.3), acting as additional duct flow resistance:

\[ \Delta p_i = K_{f,i} (Q_i (K_{c,i} + F_{c,i}) - K_{d,i})^2 \]

[Figure 4.1 plot: "Full System", with the model equations (e1 to e78) plotted against the states, the fault signals and the input signals.]
Figure 4.1: Dulmage-Mendelsohn decomposition of the full model
Air leaks
As with the clogging, a complete simulation is required to truly know how this type of fault affects the system. In the simplified method used here, a negative fault (F_q) is added to the flows in (2.2) as

\[ Q_{out} = Q_{in} + F_q \]

simulating that air has taken a route other than the monitored connections.
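The three fault types of this section can be sketched as small functions applied to one sample at a time. This is an illustrative sketch only: the function and argument names are hypothetical, and splitting the sensor fault into an additive and a relative part is an assumption used here to show how a multiplicative fault follows from letting F_s depend on S_s.

```python
def inject_sensor_fault(s, f_add=0.0, f_rel=0.0):
    """Sensor fault: signal plus F_s, with F_s = f_add + f_rel * s.
    f_rel makes F_s a function of the signal, i.e. a multiplicative fault."""
    return s + f_add + f_rel * s


def clogged_dp(q, k_f, k_c, k_d, f_c=0.0):
    """Clogging as a parametric fault F_c added to K_c in (2.3)."""
    return k_f * (q * (k_c + f_c) - k_d) ** 2


def leaking_flow(q_in, f_q=0.0):
    """Air leak: outgoing flow equals incoming flow plus a negative F_q."""
    if f_q > 0.0:
        raise ValueError("a leak is modelled as a negative fault")
    return q_in + f_q


# Hypothetical values, for illustration only.
print(inject_sensor_fault(200.0, f_rel=0.5))      # 50 % multiplicative fault
print(clogged_dp(10.0, 2.0, 1.0, 4.0, f_c=0.5))   # increased flow resistance
print(leaking_flow(500.0, f_q=-20.0))             # 20 units lost to a leak
```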
4.2 Generating the MSOs
Finding the MSOs for a system of this size is a very heavy task, and it has therefore been split into simpler subtasks. In the chosen method the system is first sorted using Dulmage-Mendelsohn decomposition, as shown in Figure 4.1. An upper subset, A, is arbitrarily selected such that it is small enough to be searched completely while still containing overdetermined components. The rest of the system, B, is then detached. This detached subsystem is searched for all existing MSOs, and for each MSO found the dependent equations and states from A are added to form a new matrix C, which is again searched for all MSOs.
Consider the system shown in Figure 4.2. After selecting an A that has at least one overdetermined part (e5 and e6), the rest, B, is selected. Analysis of B yields two MSOs that form C1 and C2 as illustrated. Observe that for C2, e2 has been completely removed from the final search.
[Figure 4.2 plot. Incidence matrices over the equations e1 to e10: the full system split into A and B, and the derived subsystems C1 and C2.]
Figure 4.2: Illustration of the MSO generation process
This method should find most MSOs of the system but cannot guarantee a complete set, as that would require the exactly determined sets in B to also be considered when generating additional C sets.
These MSOs are then sorted according to which structural faults they could detect, in preparation for the residual selection. For this particular system the method has produced about 1.8 million (1,776,385) MSOs covering 63,604 different combinations of structural fault detectability.
4.3 Selecting the residuals
To select a sufficient subset of MSOs, a greedy search algorithm is applied [Svärd et al., 2013]. In each step, the greedy search adds to the solution set the residual that fulfils as many new fault isolation requirements as possible. To do this, each of the residuals is tested and the best candidate is included in the solution set. This process is repeated until no further improvements can be found.
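The greedy step can be sketched as a set-cover style loop. This is a simplified illustration and not the algorithm of [Svärd et al., 2013]: it assumes that a residual fulfils the isolation requirement "fault a from fault b" when it is structurally sensitive to a but not to b, and it leaves out the testing of the generated residuals. All names and the example signatures are hypothetical.

```python
def isolation_pairs(sensitive, all_faults):
    """Pairs (a, b) a residual can separate: it reacts to a but not to b."""
    return {(a, b) for a in sensitive for b in all_faults - sensitive}


def greedy_select(candidates, all_faults):
    """Repeatedly pick the residual that adds the most new isolation pairs."""
    covered, selected = set(), []
    remaining = dict(candidates)
    while remaining:
        best = max(
            remaining,
            key=lambda r: len(isolation_pairs(remaining[r], all_faults) - covered),
        )
        gain = isolation_pairs(remaining[best], all_faults) - covered
        if not gain:  # no candidate improves isolability any further
            break
        selected.append(best)
        covered |= gain
        del remaining[best]
    return selected, covered


# Hypothetical fault signatures of four candidate residuals.
faults = {"f1", "f2", "f3"}
candidates = {
    "r1": {"f1", "f2"},
    "r2": {"f2", "f3"},
    "r3": {"f1"},
    "r4": {"f1", "f2", "f3"},
}
selected, covered = greedy_select(candidates, faults)
print(selected, len(covered))
```

In this toy example the residual sensitive to every fault (r4) is never selected, since it cannot separate any pair of faults.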
4.3.1 System modes
As the approach above selected residuals purely based on their structural fault detectability, it only became clear when the residuals were tested that some of them were valid only in certain modes. This primarily depends on which cabin fan is currently in operation. To cope with this, all the selected residuals were tested and classified.
Residuals that appeared to be invalid in any mode were completely discarded, and the search operation was re-run from that position onward. This is also done on the fly when a new residual has been selected. When multiple MSOs have the same structural detectability, the next one is tested only if the previous one has been marked as invalid.

Once a general set with ideal isolability had been generated without regard to the different modes, the mode-dependent isolation matrices were analysed, and the set was expanded to improve the currently worst matrix using the same method. This was iterated until no improvements to any of the isolation matrices could be found. The final structural detectability matrix is shown in Figure 4.3.

[Figure 4.3 plot. Rows (residuals): r303, r16841, r30877, r60695, r37757, r13806, r32010, r44259, r34209, r54361, r22979, r58115, r24468, r33725, r44438, r20653, r32271, r30421, r50248, r9548, r38117. Columns (faults): Fanjclogg, Hepaclogg, Returngridclogg, afs1F, afs2F, cfa1iF, cfa1khF, cfa1nF, cfa2iF, cfa2khF, cfa2nF, chxfa1F, chxfa2F, hs1airhumF, hs2airhumF, irfaiF, irfakhF, irfanF, isfaiF, isfakhF, isfanF, postcfaloss, precfaloss, returnloss, temp1F, temp2F, temp3F, tempAF, tps1airpressF, tps2airpressF, tps3airpressF, tps4airpressF.]
Figure 4.3: Final structural detectability matrix
4.4 Generating the residual code
During the classification process the residuals have been generated and tested. Each residual is processed into executable code by a function that goes through the directed acyclic graph (DAG) and returns the equations required to calculate each state. A limitation of this function is that it cannot handle cases where an equation set has to be solved to generate the next needed state, so the output requires manual supervision before execution.
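The DAG traversal described above amounts to a topological sort of the states. The sketch below uses Python's standard `graphlib` module for this; it is not the function used in the thesis, and the equation names, state names and data layout are hypothetical. A cycle in the graph corresponds to the limitation mentioned above, where a set of equations would have to be solved simultaneously.

```python
from graphlib import CycleError, TopologicalSorter


def computation_order(equations, known):
    """Return (equation, state) pairs in an order in which they can be evaluated.

    equations: {state: (equation name, set of variables the equation needs)}
    known:     set of input signals that need no computation
    Raises CycleError if some states can only be found by solving an equation set.
    """
    graph = {}
    for state, (_, needs) in equations.items():
        unknown = {v for v in needs if v not in known}
        missing = unknown - equations.keys()
        if missing:
            raise ValueError(f"no equation computes {sorted(missing)}")
        graph[state] = unknown
    order = TopologicalSorter(graph).static_order()
    return [(equations[state][0], state) for state in order]


# Hypothetical residual: air density first, then a fan flow, then the residual.
known = {"cfa1n", "cfa1dp", "T", "cabp"}
equations = {
    "res": ("e3", {"cfa1q", "rho"}),
    "cfa1q": ("e2", {"cfa1n", "cfa1dp", "rho"}),
    "rho": ("e1", {"T", "cabp"}),
}
print(computation_order(equations, known))
```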
As many of the residuals calculate the same states using the same equation sets, another function was made to merge residuals by creating new states and equations when there is a conflict in the method used to calculate an otherwise shared state. This allows the generation of one function containing all residuals.
[Figure 4.4 plot: "Merged", with the equations and residuals of the merged residual set plotted against the states (including the duplicated states created by the merge), the fault signals and the input signals.]
Figure 4.4: Dulmage-Mendelsohn decomposition of the merged residuals
Even though the merging function is far from ideal and can still produce multiple identical steps, the produced output is remarkably faster than running the residuals separately. The Dulmage-Mendelsohn decomposition of the merged residuals is displayed in Figure 4.4 with horizontal lines marking the splits of the DAG, i.e., each block of equations can only be calculated by using the states from previous blocks.
4.5 Validation of the residuals
During the classification process, a heavily subsampled selection of both the training data and the whole data set was used, as described in Section 3.1. A fault profile that was repeatedly alternating and increasing in magnitude was injected. This was very effective for detecting which modes a residual required, but the results could be very hard to interpret.
Because of this, a simpler data set has been selected for validation and a single percentage fault has been injected. The fault appears at sample 100 as a step of 50% of the original signal, stays constant for a few samples and then decreases to zero. At sample 300 the same procedure is done in reverse. For easy reference, the fault profile is added to all the figures.
[Figure 4.5 plot. Two panels titled Residual 9 (34209) [cfa1]: (Fanjclogg, Returngridclogg, cfa1iF, cfa1khF, no fault) and (cfa1nF, cfa2iF, chxfa1F, hs2airhumF, no fault).]
Figure 4.5: Validation of residual 9
[Figure 4.6 plot. Two panels titled Residual 19 (50248) [cfa1 & cfa2]: (cfa1iF, cfa1khF, cfa1nF, no fault) and (cfa2iF, cfa2khF, cfa2nF, chxfa1F, no fault).]
Figure 4.6: Validation of residual 19
The data sets n20090803 and n20091127 have been used for the residuals that require CFA1 or CFA2 enabled, respectively. For residuals working with both or neither of the fans, parts of both data sets have been used, switching at sample 250.
The results vary considerably among the residuals. Some are noisy and only detect some of their indicated faults, like residual 9 displayed in Figure 4.5, while others are clean and make the injected faults easily detectable.
A few of them are not centred around zero, like residuals 7 and 8 displayed in Figure 4.7, but still react to the injected fault in a detectable way.
Note that residual 7 produces not-a-number (NaN) when a fault is injected on the TPS2 air pressure sensor, which could be considered as triggering. In the case of residual 19, however, a NaN should not be considered an alarm; it seems to require separate handling depending on which of the fans is currently running. This can be seen in Figure 4.6.
Generally, most of the residuals react to the faults that they are structurally able to detect, but they all need unique methods of detection, and in some cases a unique method per fault is required.
[Figure 4.7 plot. Panels: Residual 7 (32010) [cfa2] (precfaloss, returnloss, temp3F, tps2airpressF, no fault); Residual 8 (44259) [cfa1] (afs1F, afs2F, cfa1khF, cfa2iF, no fault).]
Figure 4.7: Validation of residual 7 and 8
5 Results
In this chapter the results of the design are presented, both for detection of injected faults and for analysis of the real data.
5.1 Fault detection and isolation with injected faults
Analysis of the structural fault detectability and isolability of the final system yields that 20 of 32 faults are fully isolable. The structural fault isolation matrix is shown in Figure 5.1a.
A manual detection on the residuals with injected single faults cannot with confidence detect any air humidity faults. Neither can it detect the cabin fan 1 current fault when the system is in a cabin fan 2 only mode, which would be expected, but this raises the question of why the fault is detected in the opposite case. The rest of the faults are all detectable, and about half of the faults can be isolated by manual analysis of the residuals. The isolation matrix obtained from a manual analysis of injected single faults is presented in Figure 5.1b. It shows a fair reduction of isolability, which indicates that the selected residuals might not be ideal and that further iterations of the residual picking process may be needed.
[Figure 5.1 plots: fault-by-fault isolation matrices over the 32 faults. Panels: (a) Structural (total theoretical, cabin fan 1 only, cabin fan 2 only); (b) Manually detected (total manual, cabin fan 1 only, cabin fan 2 only); (c) Automatically detected (joint points, theoretical only, injected only).]
Figure 5.1: Isolation matrices

Simple test quantities were produced for automated detection on the real data, as manual detection on the residuals is a time-consuming task. The test quantities use either the absolute value, a cumulative sum (CUSUM) or a windowed mean value, depending on which gives the best detectability on the injected fault data. The thresholds were then determined automatically using the test validation data so that no false alarms were triggered. For better isolation, the set was completed with a few not-a-number test quantities. The final isolation matrix is presented in Figure 5.1c. It indicates that the issue with detecting the air humidity faults was not a residual problem. This automated design cannot detect the cabin fan 2 current fault, and a few other pairs cannot be isolated from each other.
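The kinds of test quantities mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the one-sided CUSUM on the absolute residual, the drift term, the window length and the safety margin on the threshold are all assumptions, and the residual samples are made up.

```python
import math


def cusum(residual, drift):
    """One-sided CUSUM on |r|: accumulate what exceeds the drift, floor at zero."""
    g, out = 0.0, []
    for r in residual:
        g = max(0.0, g + abs(r) - drift)
        out.append(g)
    return out


def windowed_mean(residual, window):
    """Absolute value of the mean over the last `window` samples."""
    out = []
    for i in range(len(residual)):
        chunk = residual[max(0, i - window + 1): i + 1]
        out.append(abs(sum(chunk) / len(chunk)))
    return out


def nan_test(residual):
    """Not-a-number test quantity: triggers where the residual is NaN."""
    return [math.isnan(r) for r in residual]


def threshold_from_fault_free(test_quantity, margin=1.1):
    """Smallest threshold (times a margin) giving no alarm on fault-free data."""
    return margin * max(test_quantity)


def alarms(test_quantity, threshold):
    return [t > threshold for t in test_quantity]


# Hypothetical residual samples: fault-free data sets the threshold.
fault_free = [0.1, -0.2, 0.1, 0.0]
faulty = [0.1, 1.0, 1.0, 0.1]
threshold = threshold_from_fault_free(cusum(fault_free, drift=0.15))
print(alarms(cusum(faulty, drift=0.15), threshold))
```

Choosing the threshold from fault-free data guarantees no false alarms on that data only; whether it generalises depends on how representative the data is.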
5.2 Detection and isolation on real data
Analysing the test quantities described in Section 5.1 on the real data yield thatmany of the test quantities triggers and many of the faults are detected but it isnot often a single fault could be isolated with certainty at any given time. Almostall faults isolated are indicated in short time intervals which makes it hard todetermine if these are proper or false alarms. There are however a few cases ofprolonged repeated patterns presented below. Noted can be that less than 3% ofthe time no test quantities are triggering giving us a total of only about 11% ofno fault data including the cases where the data is not properly defined.
The figures throughout this chapter display the alarms as how many test quantities are triggering for the fault in question, expressed as a single fault isolation certainty. This means that if an alarm is lower than 1, there is at least one conflict in the isolation. The value is calculated as the count of alarming candidates out of the detected candidates for the specific test quantities that are triggering. No faults are alarming if only one test quantity is triggering, nor if there are contradictions among the triggering test quantities. This is a simple form of isolation by column matching. One shortcoming is that, in the case of multiple faults, one fault could prevent the alarm of another fault from going off, which is also the case for false triggering of test quantities. Faults not covered in the figures do not alarm during the displayed data set, not even briefly.
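The column matching described above can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the structure `signature`, the test names `T1` to `T3` and the function name `isolate_single_fault` are hypothetical, and the graded alarm level shown in the figures is not modelled, only the set of alarming candidates.

```python
def isolate_single_fault(signature, triggered):
    """Return the single-fault candidates that explain every triggered test.

    signature: dict mapping a test quantity name to the set of faults it is
               sensitive to (one row of a fault signature matrix).
    triggered: set of names of the test quantities that are currently alarming.
    """
    # A single triggering test quantity is not treated as enough evidence.
    if len(triggered) < 2:
        return set()
    # Column matching: keep the faults present in the row of every triggered test.
    candidates = set.intersection(*(set(signature[t]) for t in triggered))
    # An empty result means the triggered tests contradict each other under
    # the single-fault assumption, so no fault alarms.
    return candidates


# Hypothetical signature rows; the fault names follow the labels in the figures.
signature = {
    "T1": {"irfakhF", "irfanF"},
    "T2": {"irfakhF", "irfanF", "returnloss"},
    "T3": {"Hepaclogg"},
}

print(isolate_single_fault(signature, {"T1", "T2"}))  # the two return fan faults
print(isolate_single_fault(signature, {"T1"}))        # empty: only one test triggers
print(isolate_single_fault(signature, {"T1", "T3"}))  # empty: contradicting tests
```

The last call illustrates the shortcoming of this approach: a second fault, or a falsely triggering test quantity, empties the candidate set and silences the alarm for the first fault.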
Figure 5.2: First noted alarm in the outset data set. The vertical lines indicate time lapses with missing data in the data set. [Plot: alarm level between 0 and 1 for chxfa1F, irfakhF and irfanF; time axis 06:30 to 18:26.]
Return fan faults
Figure 5.3a displays the alarm signals for faults on the pressure and rotation speed of the return fan, on real data from 11 March 2008. The design cannot isolate the two faults from each other, but it is fairly certain that there is a fault on either of those two sensors. It should also be considered that this is not a constant indication, which could imply that all of these are false alarms. Occasionally some other faults alarm as well, but those alarms are generally brief in nature. Figure 5.2 displays the first occurrence of this pattern.
Clogging faults
Another pattern that repeats over quite large portions of the processed real data is alarms on multiple parametric faults and pressure sensors. Figure 5.1c confirms that these faults are hard to isolate from each other. A sample of this real data has been selected and is displayed in Figure 5.3b.
Figure 5.3: Alarms during selected parts of the outset data set. (a) 11 March 2008, 19:06 to 19:54: alarm level between 0 and 1 for Fanjclogg, Returngridclogg, afs1F, chxfa1F, irfakhF, irfanF, precfaloss and returnloss. (b) 21 March 2008, 11:25 to 12:04: alarm level between 0 and 1 for Fanjclogg, Hepaclogg, Returngridclogg, afs1F, cfa1khF, irfakhF, precfaloss and returnloss.
Outset data set
Both of the above-mentioned patterns are part of the outset data set, which is the longest set and contains more than half of the available real data. The name is arbitrarily selected, as this set was the first continuous data available for analysis. The whole set has been marked as return fan loss, but there is also documented unexpected clogging of the return grid. Additionally, this data has a fairly low and inconsistent sampling rate.
However, the detected patterns and the documented faults are not entirely coherent: only half of the data is detected to have return fan problems, whilst the clogging seems to continue further than what has been reported. An overview of the alarms during this whole set is shown in Figure 5.4, with markings showing where the unexpected clogging is documented.
Figure 5.4: Full overview of the alarms in the outset data set. The vertical lines indicate the start and end of the time period within which unexpected clogging of the return grid has also been reported. [Plot: alarm level between 0 and 1 for Fanjclogg, Hepaclogg, Returngridclogg, afs1F, afs2F, cfa1khF, chxfa1F, irfakhF, irfanF, isfakhF, isfanF, postcfaloss, precfaloss, returnloss and temp3F; date axis 02/13 to 04/14.]
HEPA filter clogging
Another data set, covering only the 3rd of August 2009, begins with a fairly clear indication of issues in the post cabin region, followed by quite certain alarms on the HEPA filter. This stops at about 2 am, with hardly any alarms until almost 7 hours later, when issues are yet again indicated in the post cabin fan region, and finally uncertain alarms appear on quite many of the components in the system. The alarms over this set can be examined in Figure 5.6.
The data set itself is flagged as unexpected clogging of the return grid, which could be one of multiple faults present in this data set. However, this fault is not clearly isolated and, unless this design is producing large quantities of false alarms, it is not the only fault.
Figure 5.5: Selection of residuals during the 3rd of August 2009. [Panels: Residual 3 (30877) [cfa2], Residual 4 (60695) [cfa2], Residual 9 (34209) [cfa1] and Residual 13 (24468) [cfa2], each plotted together with no-fault data; time axis 06:05 to 06:17.]
Quite a lot of the test quantities trigger in the section where no faults are being isolated. Figure 5.5 displays a selection of the residuals during a portion of this time frame and clearly shows that something is detected, and that this is a matter of evaluation rather than detection. The graphs also include the no-fault data used in the validation for easier comparison. This could indicate a passage with multiple faults. One possible explanation could be something faulty with the supply air, for example smoke, that then propagates and clogs most of the system. This has, however, not been confirmed.
Figure 5.6: Full overview of the alarms during the 3rd of August 2009. [Plot: alarm level between 0 and 1 for Fanjclogg, Hepaclogg, Returngridclogg, afs1F, afs2F, cfa1khF, cfa1nF, chxfa1F, chxfa2F, irfakhF, irfanF, isfakhF, isfanF, postcfaloss, precfaloss, returnloss, temp1F, temp3F, tempAF and tps2airpressF; time axis 00:00 to 19:12.]
Another return fan failure
One data set, covering the 18th of January 2010, is documented as another unexpected clogging of the return grid. However, this is not detected by this design. It does, however, briefly detect but not isolate faults of the parametric type before it quite certainly detects faults with the return fan.
Figure 5.7: Selection of residuals during the 18th of January 2010. [Panels: Residual 3 (30877) [cfa2], Residual 4 (60695) [cfa2], Residual 7 (32010) [cfa2] and Residual 17 (32271) [cfa1 & cfa2], each plotted together with no-fault data; time axis 08:13 to 08:40.]
When analysing the test quantities during the silent section of the alarms displayed in Figure 5.8, it is observed that many test quantities trigger at about 8:23. A selection of the residuals highlighting this event is displayed in Figure 5.7. This indicates that something is happening that the current evaluation cannot isolate, which means that the alarming faults that follow could be a result of the measures taken in response to this fault.
Unknown data set
One of the data sets has no documentation of what type of fault could be present, if any. The analysis indicates return fan issues, as the alarms displayed in Figure 5.9 show. However, few test quantities trigger in general, which could indicate that all of these are false alarms.
Figure 5.8: Alarms from part of 18th January 2010. [Plot: alarm level between 0 and 1 for Returngridclogg, cfa1khF, chxfa1F, chxfa2F, irfakhF, irfanF, precfaloss and returnloss; time axis 07:46 to 11:31.]
Figure 5.9: Data set with unknown fault. [Plot: alarm level between 0 and 1 for Fanjclogg, Hepaclogg, Returngridclogg, afs1F, afs2F, chxfa1F, irfakhF, irfanF, postcfaloss, precfaloss, returnloss and tempAF; time axis from 00:00 to 11:58 on the following day.]
No alarms
There were additionally a few sets where the design could not isolate any faults. They are all very similar and only trigger alarms very briefly, mostly on the airflow sensors and post cabin fan loss.
Two of the sets are marked as unexpected return grid clogging, one has documented return fan loss, while the last one is supposed to have loss of the supply fan. None of these faults are isolated, even though all sets are indicated to have only one fault at a time.
However, many of the test quantities are either constantly on or triggering on and off repeatedly, which could indicate that these sets are actually suffering from multiple faults, which is outside the scope of the current design.
6 Conclusion
In this chapter an overall evaluation of the work is presented, followed by reflections and suggestions for possible future work.
6.1 Overall evaluation
The expected result was an FDI system that detects and isolates faults in the air loop system, that could work in real time, and that detects the faults within reasonable time.
On injected data the system is able to detect and isolate most of the modelled faults. However, the models of leaks and cloggings are all parametric faults that, in practice, could be either leakage, clogging or wear.
The system can detect some of the indicated faults in the real data, and even though it cannot isolate the faults perfectly, it can indicate the nature of the fault, which in practice is sufficient to know which physical components require service. The proposed solution shows promising results, but additional work could improve fault detection and isolation performance by, for example, considering multiple simultaneous faults during the evaluation process.
The system is based on a model of the air loop and is, with fairly high certainty, able to detect which mode the fans are operating in, even though that information is currently not propagated to the fault detection and isolation level.
The methodology presented in [Svärd, 2012] was partially used and is considered to work for this system as a whole. Further development following this methodology is believed to improve the current results where applicable.
The results on real data indicate that the methods used, even though they yield results, are not really sufficient for the design of an FDI solution for this particular system. This implies that there is great potential for further improvement if more advanced methods for fault detection and isolation are explored.
6.2 Future work
This section presents suggested future work divided by task in a top-down fashion, as it is suggested that future improvements of this design be made in that order. The reason is that the underlying model itself is considered to be of the desired quality and without need for immediate improvement.
Fault detection and isolation
The algorithms for detection and isolation should be extended so that multiple simultaneous faults are handled, as there are indications that such faults exist in some, if not most, sets of real data. It is also recommended that more advanced test quantities be generated from the current selection of residuals.
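One common way to handle multiple simultaneous faults in consistency-based diagnosis is to treat the fault set of each triggering test quantity as a conflict and to compute the minimal sets of faults that intersect every conflict (minimal hitting sets). The brute-force sketch below only illustrates that idea and is not part of this work; the function name `minimal_diagnoses`, the `max_size` limit and the example conflicts are hypothetical.

```python
from itertools import combinations


def minimal_diagnoses(conflicts, max_size=3):
    """Brute-force minimal hitting sets of the given conflicts.

    conflicts: list of sets; each set holds the faults that could explain one
               triggered test quantity.
    max_size:  largest number of simultaneous faults to consider.
    Returns a list of fault sets, smallest first; the list is empty when
    there are no conflicts or no diagnosis within max_size exists.
    """
    faults = sorted(set().union(*conflicts))
    diagnoses = []
    for size in range(1, max_size + 1):
        for combo in combinations(faults, size):
            candidate = set(combo)
            # Skip supersets of a smaller diagnosis that was already found.
            if any(found <= candidate for found in diagnoses):
                continue
            # A diagnosis must share at least one fault with every conflict.
            if all(candidate & conflict for conflict in conflicts):
                diagnoses.append(candidate)
    return diagnoses


# Two triggered tests that no single fault can explain (hypothetical example).
conflicts = [{"irfakhF", "irfanF"}, {"Hepaclogg"}]
print(minimal_diagnoses(conflicts))
```

For this example no single fault explains both tests, so the result pairs the HEPA filter clogging with either of the two return fan faults. The exhaustive search grows quickly with the number of faults, so a real design would need a more efficient hitting set algorithm.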
One possible method to improve the detection and isolation could be to analyse the Kullback-Leibler divergence, as was originally suggested for this work; this should be considered as one of the earlier paths for design improvement [Eriksson et al., 2011].
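As a rough illustration of the quantity involved, the sketch below estimates the Kullback-Leibler divergence between a window of residual samples and no-fault reference samples. It assumes that both samples are approximately Gaussian, which is an assumption made here for the sake of a closed-form expression and not a claim from this work or from the cited paper; the function name `gaussian_kl` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl(sample_p, sample_q):
    """KL divergence D(P || Q) between Gaussians fitted to two samples.

    Uses the closed form for univariate normal distributions:
    D = 0.5 * (ln(var_q / var_p) + (var_p + (mu_p - mu_q)**2) / var_q - 1)
    """
    mu_p, mu_q = fmean(sample_p), fmean(sample_q)
    var_p = pvariance(sample_p, mu_p)
    var_q = pvariance(sample_q, mu_q)
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


# Hypothetical residual samples: same spread, but the faulty window is shifted.
no_fault = [-1.0, 1.0, -1.0, 1.0]
faulty = [1.0, 3.0, 1.0, 3.0]

print(gaussian_kl(faulty, no_fault))    # large value: distributions differ
print(gaussian_kl(no_fault, no_fault))  # zero: identical distributions
```

A test quantity could then compare this divergence against a threshold instead of thresholding the raw residual, at the cost of having to choose a window length and of requiring a non-zero sample variance in both windows.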
It could also be of interest to make the set of test quantities more dynamic, as this model has different operation modes. The current solution is fairly basic, and the method described in [Eriksson et al., 2012] could be considered for integration.
Residual generation and selection
For a model with a fair number of equations and, more importantly, plenty of interconnections between them, the MSO count quickly increases to sizes that make residual generation and selection a very time-consuming task. While this particular model is still within reasonable size for modern computers, it might not be after only a few more additions.
Due to this, alternative, preferably automatic, methods of determining which MSOs to consider for further analysis are needed. One way might be to analyse the distinguishability of the residuals [Eriksson et al., 2013]. However, as the task of finding the MSOs is close to the upper limit of feasibility, this analysis must be brought closer to the model equations. My suggestion would be to analyse the signal (and fault) to noise ratio of the equations themselves and create weights to be used in the MSO search. This would need to be a directional analysis, similar to how causality is handled in [Svärd and Nyberg, 2010], and is particularly interesting for the quantified or otherwise non-linear equations. It is therefore believed that this model could be an interesting set for this kind of analysis, even if an (almost) complete MSO set has already been extracted.
This should not only help speed up the search for MSOs but should also help in selecting how to structure the MSOs when designing residuals.
Modelling
In the current model the fan modes are estimated on the fly within the model by a simple test quantity, which is needed to make the fan curve equations valid. However, this test quantity is not propagated to the fault detection and isolation layer of the system. Stall mode is considered a fault and should be implemented, and the detection of it ought to be improved. Multiple faults must be handled before this is propagated, as stall mode occurs fairly often in the live data.
Additionally, in some sets of live data the model has a tough time deciding which mode the fans are in, which in turn gives a very inconsistent output flow for the fans. Analysis indicates that this might not actually be a fault in the model, as it not only fits the data fairly well but is also consistent with the fact that parallel fans can easily end up racing each other. In the real data available the sample rate is too low for a definite conclusion. However, it might therefore be interesting to smooth out the calculated air flows when problems in determining the fan mode occur.
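One simple way such smoothing could be realised is an exponential moving average that is applied only while the fan mode is uncertain and bypassed otherwise. The sketch below is an illustration under that assumption; the function name `smooth_when_uncertain`, the flag `mode_uncertain` and the smoothing factor `alpha` are hypothetical and not taken from the model in this work.

```python
def smooth_when_uncertain(flows, mode_uncertain, alpha=0.2):
    """Exponentially smooth flow samples while the fan mode is uncertain.

    flows:          calculated air flow samples.
    mode_uncertain: booleans of the same length; True where the fan mode
                    could not be determined reliably.
    alpha:          smoothing factor in (0, 1]; smaller values smooth more.
    """
    smoothed = []
    previous = None
    for flow, uncertain in zip(flows, mode_uncertain):
        if uncertain and previous is not None:
            # Blend the new sample with the previous output value.
            value = alpha * flow + (1.0 - alpha) * previous
        else:
            # Pass the sample through unchanged when the mode is known.
            value = flow
        smoothed.append(value)
        previous = value
    return smoothed


# Hypothetical flow samples with one jump while the fan mode is uncertain.
flows = [100.0, 100.0, 200.0, 100.0]
uncertain = [False, False, True, False]
print(smooth_when_uncertain(flows, uncertain, alpha=0.5))
```

Bypassing the filter while the mode is known keeps the residuals responsive to real faults, whereas smoothing everywhere would delay detection.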
Other improvements of the model could be modelling of the dynamic properties and the parametric faults. There has also been some experimentation with making a Simulink model of the system, with fairly decent results, but such an implementation could rapidly make the system infeasible in real time. However, it is not suggested to focus on the underlying models unless no further improvements can be found in the actual detection and isolation layer of the FDI system.
Bibliography
Airbus Defense and Space. Airbus defense and space homepage, 2013a. URL https://airbusdefenceandspace.com/. Online; accessed 01/10/2016. Cited on page 1.
Airbus Defense and Space. Columbus celebrates fifth anniversary in space, 2013b. URL http://www.space-airbusds.com/en/news2/columbus-celebrates-fifth-anniversary-in-space.html. Online; accessed 01/03/2013. Cited on page 1.
Airbus Defense and Space. Utilisation and operation of the ISS, 2013c. URL http://www.space-airbusds.com/en/programmes/utilisation-and-operation-of-the-iss.html. Online; accessed 01/10/2016. Cited on page 1.
Daniel Eriksson, Mattias Krysander, and Erik Frisk. Quantitative stochastic fault diagnosability analysis. In 50th IEEE Conference on Decision and Control, Orlando, Florida, USA, 2011. Cited on pages 5 and 46.
Daniel Eriksson, Erik Frisk, and Mattias Krysander. A sequential test selection algorithm for fault isolation. In 10th European Workshop on Advanced Control and Diagnosis, Copenhagen, Denmark, 2012. Cited on page 46.
Daniel Eriksson, Erik Frisk, and Mattias Krysander. A method for quantitative fault diagnosability analysis of stochastic linear descriptor models. Automatica, 49(6):1591–1600, June 2013. Cited on page 46.
Erik Frisk, Anibal Bregon, Jan Åslund, Mattias Krysander, Belarmino Pulido, and Gautam Biswas. Diagnosability analysis considering causal interpretations for differential constraints. IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 42(5):1216–1229, September 2012. Cited on pages 4 and 27.
M. Krysander, J. Åslund, and M. Nyberg. An efficient algorithm for finding minimal overconstrained subsystems for model-based diagnosis. Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on, 38(1):197–206, 2008. Cited on pages 5 and 27.
F. Meinguet, A. Sandulescu, X. Kestelyn, and E. Semail. A method for fault detection and isolation based on the processing of multiple diagnostic indices: Application to inverter faults in ac drives. Vehicular Technology, IEEE Transactions on, PP(99):1, 2012. ISSN 0018-9545. doi: 10.1109/TVT.2012.2234157. Cited on page 5.
E. Noack, W. Belau, R. Wohlgemuth, R. Müller, S. Palumberi, P. Parodi, and F. Burzagli. Efficiency of the columbus failure management system. In AIAA 40th International Conference on Environmental Systems, 2010. Cited on page 3.
E. Noack, T. Noack, V. Patel, I. Schmitt, M. Richters, J. Stamminger, and S. Sievi. Failure management for cost-effective and efficient spacecraft operation. In Adaptive Hardware and Systems (AHS), 2011 NASA/ESA Conference on, pages 137–144. IEEE, 2011. Cited on page 3.
E. Noack, A. Luedtke, I. Schmitt, T. Noack, E. Schaumlöffel, E. Hauke, J. Stamminger, and E. Frisk. The columbus module as a technology demonstrator for innovative failure management. Deutscher Luft- und Raumfahrtkongress, Berlin, Germany, 2012. Cited on pages 2 and 3.
T. Noack and I. Schmitt. A cyclic process model for monitoring mobile cyber-physical systems. 2012. Cited on page 3.
Dieter Sabath, Gerd Söllner, and Dirk Schulze-Varnholt. Development and implementation of a new columbus operations setup. 2012. Cited on page 2.
Dieter Sabath, Thomas Kuch, Gerd Soellner, and Thomas Müller. The future of columbus operations. In SpaceOps 2014 Conference, page 1618, 2014. Cited on page 2.
H. Sneider and P.M. Frank. Observer-based supervision and fault detection in robots using nonlinear and fuzzy logic residual evaluation. Control Systems Technology, IEEE Transactions on, 4(3):274–282, 1996. Cited on page 5.
Source: Airbus DS. Source: Airbus ds, 2013. URL https://airbusdefenceandspace.com/. Cited on pages 2, 7, 10, and 11.
Carl Svärd. Methods for Automated Design of Fault Detection and Isolation Systems with Automotive Applications. PhD thesis, Linköping University, Vehicular Systems, The Institute of Technology, 2012. Cited on pages 3, 5, and 45.
Carl Svärd and Mattias Nyberg. Residual generators for fault diagnosis using computation sequences with mixed causality applied to automotive systems. IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 40(6):1310–1328, 2010. Cited on page 46.
Carl Svärd, Mattias Nyberg, and Erik Frisk. Realizability constrained selection of residual generators for fault diagnosis with an automotive engine application. IEEE Transactions on Systems, Man and Cybernetics: Systems, 43(6):1354–1369, 2013. Cited on pages 5 and 29.
Carl Svärd, Mattias Nyberg, Erik Frisk, and Mattias Krysander. Data-driven and adaptive statistical residual evaluation for fault detection with an automotive application. Mechanical systems and signal processing, 45(1):170–192, 2014. Cited on page 5.