Dissertation 2014

125
COMPACT RELIABILITY AND MAINTENANCE MODELING OF COMPLEX REPAIRABLE SYSTEMS A Thesis Presented to The Academic Faculty by Ren´ e Valenzuela In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Aerospace Engineering Georgia Institute of Technology May 2014 Copyright c 2014 by Ren´ e Valenzuela

description

RELIABILITY THESIS

Transcript of Dissertation 2014

COMPACTRELIABILITYANDMAINTENANCEMODELINGOFCOMPLEXREPAIRABLESYSTEMSAThesisPresentedtoTheAcademicFacultybyReneValenzuelaInPartialFulllmentoftheRequirementsfortheDegreeDoctorofPhilosophyintheSchoolofAerospaceEngineeringGeorgiaInstituteofTechnologyMay2014Copyright c 2014byReneValenzuelaCOMPACTRELIABILITYANDMAINTENANCEMODELINGOFCOMPLEXREPAIRABLESYSTEMSApprovedby:ProfessorVitaliVolovoi,CommitteeChairSchoolofAerospaceEngineeringGeorgiaInstituteofTechnologyProfessorEricFeronSchooolofAerospaceEngineeringGeorgiaInstituteofTechnologyProfessorVitaliVolovoi,AdvisorSchoolofAerospaceEngineeringGeorgiaInstituteofTechnologyDr. PierreDersinRAMCenterofExcellenceAlstomTISProfessorJosephSalehSchoolofAerospaceEngineeringGeorgiaInstituteofTechnologyDr. FredVilleneuveAdvancedDesignMethodsSiemensEnergyProfessorDavidGoldsmanSchoolofIndustrialandSystemsEngineeringGeorgiaInstituteofTechnologyDateApproved: March20th2013ToCarola,Soa,andmyparents,youweremybestcheerleadersiiiACKNOWLEDGEMENTSIwishtothankmycommitteememberswhoweremorethangenerouswiththeirexpertiseandprecioustime. Aspecial thankstomyadvisorDr. Vitali Volovoi, forhiscountlesshoursof reecting, reading, encouraging, andmostof all patiencethroughouttheentireprocess. ThankyouDr. EricFeron, Dr. JosephSaleh, Dr. DavidGoldsman, Dr. PierreDersin,andDr. FredVilleneuveforagreeingtoserveonmycommittee.ivTABLEOFCONTENTSDEDICATION. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivLISTOFFIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiI INTRODUCTION. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Motivation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.1.1 OntheCharacterizationofComplexSystems . . . . . . . . . . . . 41.1.2 Reliabilitymodeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.1.3 MaintenanceModeling . . . . . . . . . . . . . . . . . . . . . . . . . 61.2 ProblemStatement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.3 Organizationofthethesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 7II BACKGROUND. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.1 CharacterizationoftheFailureandMaintenanceProcess . . . . . . . . . . 92.1.1 FailureProcess . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.1.2 MaintenanceProcess . . . . . . . . . . . . . . . . . . . . . . . . . . 112.1.3 Dependence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.2 ModelingParadigms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.3 PetriNets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162.3.1 ExecutionSemantics . . . . . . . . . . . . . . . . . . . . . . . . . . 172.4 StochasticPetriNets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182.4.1 ExecutionSemantics . . . . . . . . . . . . . . . . . . . . . . . . . . 182.5 ReliabilityModelingwithStochasticPetrinets . . . . . . . . . . . . . . . . 192.6 StochasticPetriNetModelEvaluation . . . . . . . . . . . . . . . . . . . . 202.7 OverviewofSimulationApproach . . . . . . . . . . . . . . . . . . . . . . . 21III COMPONENT-BASED MODELING OF MAINTENANCE PROCESSESWITHECONOMICDEPENDENCE. . . . . . . . . . . . . . . . . . . . . 233.1 Motivation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233.2 System-levelvscomponent-levelmodels . . . . . . . . . . . . . . . . . . . . 233.3 IntroductiontoComponentModeling. . . . . . . . . . . . . . . . . . . . . 25v3.4 ComputationalMethodsforSystemAnalysis. . . . . . . . . . . . . . . . . 283.4.1 Renewalequations . . . . . . . . . . . . . . . . . . . . . . . . . . . 293.4.2 Simulation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31IVPARAMETRICREPRESENTATIONOFEXOGENOUSEFFECTS 334.1 Decoupledmodeling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334.1.1 ComponentswithidenticalshapeWeibullfailuretimes . . . . . . . 344.1.2 Example1: Threeidenticalcomponents . . . . . . . . . . . . . . . 354.2 Proposedapproach: WinningRaceratio . . . . . . . . . . . . . . . . . . . 384.2.1 OddsofwinningforWeibullfunctions . . . . . . . . . . . . . . . 414.2.2 AsymptoticConsiderations. . . . . . . . . . . . . . . . . . . . . . . 424.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444.3.1 Testingtheprocedure: Example2. . . . . . . . . . . . . . . . . . . 444.3.2 Example3: Weibulldistributionswithdierentshapes . . . . . . . 444.3.3 Example4: Lognormaldistributions . . . . . . . . . . . . . . . . . 46V VERIFICATION OF THE COMPONENT BASED MODELING METHOD-OLOGY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515.1 MidSizeModel: 2-ComponentTypeModel . . . . . . . . . . . . . . . . . 515.2 Simulationresults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54VI EVALUATIONOFMAINTENANCECOSTFORAGASTURBINEENGINE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566.2 ProblemStatement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576.3 SystemDescription . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596.4 SolutionApproach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636.5 PerfectRepairModeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646.5.1 ComponentTemplate. . . . . . . . . . . . . . . . . . . . . . . . . . 646.5.2 Vericationofthetemplate . . . . . . . . . . . . . . . . . . . . . . 686.5.3 Simulationofthetemplate: PerfectRepairExperiment . . . . . . . 696.6 MinimalRepairModeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 756.6.1 ComponentTemplate. . . . . . . . . . . . . . . . . . . . . . . . . . 756.6.2 Vericationofthetemplate . . . . . . . . . . . . . . . . . . . . . . 76vi6.6.3 Simulationofthetemplate: MinimalRepairExperiment . . . . . . 77VII CONCLUSIONSANDFUTUREWORK. . . . . . . . . . . . . . . . . . 81APPENDIXA PARAMETRICSTATISTICALDISTRIBUTIONSINRELIABILITYMODELING . . . . . . . . . . . . . . . . . . . . . . . . . . 83APPENDIXB BATCHFILECONFIGURATION . . . . . . . . . . . 91APPENDIXC GASTURBINEMODELSET-UP . . . . . . . . . . . 95APPENDIXD EXPERIMENTRESULTS . . . . . . . . . . . . . . . . 99REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109viiLISTOFTABLES1 Summaryofparametersofmid-sizemodel . . . . . . . . . . . . . . . . . . . 522 Componenttypemaintenancepolicydata . . . . . . . . . . . . . . . . . . . 593 ForcedOutageDistributionbycomponentandfailuremodetype . . . . . . 604 ScrapWeibullprobabilitymodelbycomponenttype . . . . . . . . . . . . . 615 OpportunisticMaintenanceInducedinspectionbyfailuremodeandcompo-nenttype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636 InducedFailures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637 ResultsofUncoupledExperiment,percomponent. 1E7replications . . . . . 698 Results of UncoupledExperiment (Minimal repair), per component. 1E7replications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 779 Totalnumberoffailuresofperfectrepairandminimalrepairpolicywithoutopportunisticactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7810 Total numberof maintenanceactionsof perfectrepairandminimal repairpolicies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8011 BatchCongurationFileexample . . . . . . . . . . . . . . . . . . . . . . . . 9312 key-valuepairsforbatchcongurationle . . . . . . . . . . . . . . . . . . . 9313 Transitionsthatarenotmodiedbutintrinsictothemodellogic . . . . . . 9514 Transitionsthataremodiedduringbatchsimulation . . . . . . . . . . . . 9615 Transitionsthatarenotmodiedbutintrinsictothemodellogic . . . . . . 9716 Transitionsthataremodiedduringbatchsimulation . . . . . . . . . . . . 98viiiLISTOFFIGURES1 ReliabilityModelingParadigms . . . . . . . . . . . . . . . . . . . . . . . . . 62 ModelingParadigms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Two-statePetriNet. Marking1. . . . . . . . . . . . . . . . . . . . . . . . . 174 Two-statePetriNet. Marking2. . . . . . . . . . . . . . . . . . . . . . . . . 175 SingleItemModel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 Age-basedreplacementmaintenanceschematics . . . . . . . . . . . . . . . . 247 State-space representation of a simple system with identical components: A)Global(system)B)local(component) . . . . . . . . . . . . . . . . . . . . . 258 SPNcoupledmodelwiththreecomponents: A)systemisinoperatingstateB)systemisinastatewhenoneofthecomponentsfailedandtherestareundergoingopportunisticreplacements . . . . . . . . . . . . . . . . . . . . . 329 SPNModelforasinglecomponentwithopportunisticmaintenance. . . . . 3410 Expectedfailures(top),opportunities(middle),andscheduledreplacements(bottom)forthreeidentical componentsfollowingWeibull with=3and =1for time T =4as functions of the number of scheduledreplace-ments. The coupled model is compared to the sum of three component mod-elswithopportunitiesapproximatedbyexponentialdistributions. Thescaleis found either by balancing total opportunities and failures (balanced), orbymatchingthescaleofthedistributionwiththetotalnumberofexpectedfailuresduringthefullreplacementinterval(SM.) . . . . . . . . . . . . . 3711 Absolutevaluesofrelativeerrorsduetoapproximationofopportunitiesus-ingexponential distribution. Logarithmicscaleis usedfor theerrors duetolargevariability. Failures(top),opportunities(center),andscheduledre-placement (bottom) are evaluated for both balanced and scale-matched (SM)approximations.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3912 Twocasesforpart1losingarace. A-Part1doesnotentertherace,B-Part1enterstheracebutlosesittosomeotherpart(Part2) . . . . . . . . 4013 Shapesensitivityof 12for oneof thefunctions with1=1andvarying2of theotherfunction. Thescale=5forbothfunctions. Asymptoticapproximation(Eq.19)isshownbythesolidline. . . . . . . . . . . . . . . 4114 ScalesensitivityforraceratiowhentherstcomponentfollowsWeibulldistribution with shape = 2 and scale = 5 for dierent shape values. Thereplacementintervalisunity. . . . . . . . . . . . . . . . . . . . . . . . . . . 4215 Relative errors for Case 2 as a function of the number of replacement intervalspertimeT= 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45ix16 Sensitivityof theerrorsfors=T/5=1forCase2asafunctionof shapeparameterforopportunity. Dashedlinesrepresenttheshapeselectedinaccordancewiththematchedratio. . . . . . . . . . . . . . . . . . . . . . . 4517 Example3: failuresfollowWeibull distributions(all threecomponentsaredierent). Comparison of relative errors (in logarithmic scale) obtained usingcomponent models with opportunities represented using Weibull distributionwithmatchedandexponentialdistributionwiththematchedscale. TotaltimeT= 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4718 Probabilitydensityfunction(PDF)of opportunityfortherstcomponent,andits approximations usingmatchingthe rst twomoments, maximumlikelihood estimate, and the current procedure that matches weighted . FullPDF(left),thelefttail(right)0 t 1.l . . . . . . . . . . . . . . . . . . . 4819 Probability density function (PDF) of opportunity for the second component,andits approximations usingmatchingthe rst twomoments, maximumlikelihood estimate, and the current procedure that matches weighted . FullPDF(left),thelefttail(right)0 t 1.l . . . . . . . . . . . . . . . . . . . 4820 Probabilitydensityfunction(PDF)of opportunityfortherstcomponent,andits approximations usingmatchingthe rst twomoments, maximumlikelihood estimate, and the current procedure that matches weighted . FullPDF(left),thelefttail(right)0 t 1.l . . . . . . . . . . . . . . . . . . . 4921 Example4: failuresfollowlog-normal distributions. Comparisonofrelativeerrors (in logarithmic scale) obtained using component models with opportu-nities represented using Weibull distribution with matched and exponentialdistributionwiththematchedscale. TotaltimeT= 5. . . . . . . . . . . . 5022 Midsizecoupledmodel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5323 Midsizeuncoupledmodel . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5424 Mid-sizemodelsummaryofResults . . . . . . . . . . . . . . . . . . . . . . 5525 SolutionAlgorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5826 Scheduledinspections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6027 ScrapWeibullprobabilitydensityfunctions . . . . . . . . . . . . . . . . . . 6228 Perfectrepairtemplate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6529 Screenshotscrap-clockgroupSPN@ . . . . . . . . . . . . . . . . . . . . . . 6930 RateParametersofopportunitiesprobabilitymodel . . . . . . . . . . . . . 7231 Sampleofnumberofeventscollectedfromexperiments . . . . . . . . . . . . 7332 Hazardratefunction,failuremodes. . . . . . . . . . . . . . . . . . . . . . . 7433 NumberofInspectionEvents . . . . . . . . . . . . . . . . . . . . . . . . . . 7534 ScreenshotofminimalrepairtemplateinSPN@ . . . . . . . . . . . . . . . . 76x35 ScaleParametersofopportunitiesprobabilitymodel,minimalrepair . . . . 7836 Sampleofnumberofeventscollectedfromexperiments . . . . . . . . . . . . 7937 Exponentialfunction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8638 Weibulldistribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8839 Componenttype1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9940 Componenttype2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9941 Componenttype3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10042 Componenttype5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10043 Componenttype7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10044 Componenttype8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10145 Componenttype9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10146 Componenttype10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10147 Componenttype11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10248 Componenttype12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10249 Componenttype13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10250 Componenttype14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10351 Componenttype1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10452 Componenttype2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10453 Componenttype3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10554 Componenttype5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10555 Componenttype7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10556 Componenttype8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10657 Componenttype9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10658 Componenttype10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10659 Componenttype11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10760 Componenttype12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10761 Componenttype13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10762 Componenttype14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108xiCHAPTERIINTRODUCTIONThisthesisintroducesaframeworkformodelingmaintenanceprocessesof complexengi-neeringsystems. Theframeworkexplicitlyaccountsforcompetingriskfailuremodesandopportunisticmaintenance. Agasturbineexampleisusedtoillustratethemethodologypresented. Thedevelopedframeworkisintendedforawiderangeofapplicationsincludingpowerplants,aircraftandtrainsystems.Theresearchllsagapinthecurrentmodelingcapabilitiesby: (1)extendingthesizeof theengineeringsystems that canbeanalyzedfromthepractical perspectiveand(2)explicitlyincludingdependenceeectsduetocompetingrisk.Themethodologyisbasedonthefollowingtwoideas:1. Acomplexsystemcanbedecomposedintoaset of smaller, less complexsystems(components). Themaintenancepolicyofthesystemmustincludethesimultaneousconsiderationofthemaintenanceofallofthesecomponents.2. The timingof the state changes of the rest of the systemthat have aneect onagivencomponent canbeaccuratelycapturedbyasimpleparametricprobabilitydistribution.Basedonthis, itispostulatedthatthemodel ofacomplexsystemcanbecreatedastheunionof component-basedmodelsineachof whichthetimingof theeectof therestofthesystemonthecomponentiscapturedbyasimpleparametricdistribution.The proposed component-based modeling approach is introduced and validated for bothalowandmediumcomplexitymodelbycomparisonwithabenchmarkmodelofthesamesystemandthenitisappliedtoamaintainabilitystudyofahighcomplexitysystem(gasturbineengine).11.1 MotivationAlthough reliability has a broad meaning in our daily life, in technical terms it has been givenaprecisedenition. Reliabilityisacharacteristicofanitemexpressedbytheprobabilitythat the item performs its intended function under specied conditions for a specied periodof time. Whentheitemunder considerationcanbedecomposedinseveral subsystems,thenreliabilityspeciestheprobabilitythatnooperationalinterruptionswilloccurduringastatedtimeinterval, i.e. thesystemcanperformitsintendedfunctionalityevenifsomeofitssubsystemsmightfail. Thegrowingcomplexityofequipmentandsystems,aswellastherapidlyincreasingcostincurredbylossofoperationasaconsequenceoffailures,havebroughttotheforefronttheaspectsof reliability, maintainability, availability, andsafety.The expectation today is that complex equipment and systems are not only free from defectsandsystematicfailuresattimet = 0(whentheyareputintooperation),butalsoperformtherequiredfunctionfailurefreeforastatedtimeintervalandhaveafail-safebehaviorincaseofcriticalorcatastrophicfailures[5].Reliabilityengineeringisthedisciplineofensuringthataproductwill reachacertainpredenedlevel of reliabilitywhenoperatedinaspecicmanner. Ingeneral, threestepsare necessarytoaccomplishthis objective: (1) accountingfor reliabilityconsiderationsduring the design phase (this will become the products inherent reliability), (2) minimizingproductionprocessvariationtoassurethattheprocessdoesnotappreciablydegradetheinherent reliability, and (3), once the product is deployed, realizing appropriate maintenanceactions (according to the products predened maintenance policy) to alleviate performancedegradationandprolongproductlife[87]. Inthecaseof afactorywhichmanufacturesaproduct, maintenanceis anactivityconductedinparallel withproduction. It canhaveagreat impact onboththecapacityof theproductionandthequalityof theproductsproduced. Ithasbeenestimatedthatinatypical factorybetween15%to40%(average28%)of total productioncostisattributedtomaintenanceactivities[58]. Moreover, itisgenerallytruethatforcomplexrepairablesystemswithlonglifecyclestheoperatingandmaintenancecostisasignicantportionofitslifecyclecost.Inthepast, forsomemanufacturingindustries(liketheenergygenerationindustry),2thersttwoavenuesof improvementwerepursuedbymanufacturerswhilethethirdwaspursuedbyoperators(whofollowedguidelinesprovidedbythemanufacturers). Thelastdecades, however, haveseenashiftinthepredominantbusinessmodel underwhichthesemanufacturingindustriesoperate. Insteadofabusinessmodelorientedtowardthesellingof products, the newparadigmis orientedto the selling of services. Inthe powergenerationindustryfor instance, this shift translates intomanufacturers sellingapowerplantthatwill beabletooperateacertainnumbersof hoursayear, producingacertainamountofenergyforaperiodof20or25years. Toachievethis, themanufacturerisnotonly commissioned the task of providing a power plant, but it is also awarded a maintenancecontract for the whole life cycle of the plant. In the aerospace industry, an airline customerpays an engine service provider in proportion to the number of aircraft ying hours, which isaected by engine uptime [34]. The eect of supply chain contracting on product reliabilityhave beenstudiedinthe operations management literature. Inparticular, [67] analyzeproductreliabilityintheautomotiveindustry.Oneoftheeectsthismarketshifthashadistoincreasetheneedforaccuratemodelsthatcancaptureboththefailureprocessof acomplexsystem(likethegasturbineof apowerplant)andtheeectofthedierentmaintenancepolicies(withtheircorrespondingmaintenanceactions)ontheremaininguseful lifeof it. Wecall thismodelsmaintenancemodels andtheiravailabilitywouldallowthemanufacturerstooptimizetheirdesignandmaintenancepolicyatthesametime.Themaintenancemodelingproblemistheproblemofndingamathematicalrepresen-tationofthewaysatechnical systemunderconsiderationcanfail inconjunctionwithitsassociatedmaintenancepolicy. Theusefulnessoftheresultingmodelwillbebasedon: (1)itsabilitytocapturealltherelevantphenomenaunderconsideration(failuresandmainte-nanceactions), (2)itsabilitytoproduceaccurateenoughresultsinatimelymanner, and(3)theeasinesswithwhichtherepresentationcanbegeneratedandveried.Whendevelopingsuchmodels thereareseveral issues that havetobetackled. Oneofthemhastodowiththemodelingoftheinherentdependenciesthatarisebetweenthedierent subsystems of a complex systems. In the following sections the characterization of3acomplexsystemusedinthisworkisintroduced,followedupbyanoverviewofreliabilityandmaintenanceprocessmodeling.1.1.1 OntheCharacterizationofComplexSystemsFor theremainder of this work, acomplexsystemwill bedenedas systemwhichhasastructureof multipleunits(possiblyof several dierenttypes)whichworktogethertoperformaparticularfunction. Examplesofcomplexsystemsarerailwaynetworks, motorvehicles,andgasturbines.Wedeneaunit(component)asanentityinasystemthatisnotfurthersubdivided.Thisdoesnotimplythatanelementcannotbemadeof parts; rather, itmeansthat, inagivenreliability/maintenance study, it is regardedas aself-containedunit andis notanalyzedintermsofthefunctioningofitsconstituents.Inthesimplestcase, aunitcanonlyhavetwostates: operatingorfailed. Ingeneral,onecanalsoincludeanumber(usuallynite)of deteriorationstates. If uponfailurethesystemcanbebroughtbacktoanoperatingstatethesystemiscalledrepairable.The structure of a complex system is based on the relationship between the states of thecomponentsandthestatesofthesystem. Thethreemostbasicstructuresthatarewidelyfoundinengineeringsystems aretheseries, parallel andk-out-of-nsystems. Inaseriessystem, failureofanycomponentresultsinfailureof thesystem. Inaparallel systemallcomponentsof thesystemmustfail forthesystemtofail. Ak-out-of-nsystemfunctionswhenatleastkofitscomponentsfunction.1.1.2 ReliabilitymodelingReliabilitymodels focus oncapturingthe failure process of asystem. Historically, thereliabilityandsafetymodelingofengineeringsystemshasbeenapproachedfromoppositedirections inat least twodistinct dimensions. Ontheonehand, thereis white-boxvs.black-boxdichotomy[6] wherethedistinctionisbasedonwhetherthefailureprocessofanentityis modeledwithor without the explicit recognitionof individual constituents(components) that comprise the entity. Here component refers to an elementary buildingblock of a white-box (system) model, which can correspond to a lower-level entity if models4are constructedhierarchically, or tothe lowest level of the hierarchy, as determinedbypractical considerations(e.g., individual modules, suchasline-replaceableunitsorLRU).Ontheotherhand, thereisnon-repairablevs. repairabledichotomy[76] withtheformerapproach dealing with a single failure event of an entity, and the latter addressing repeatedfailure events (which assumes the possibility of partial or full recovery from failures). Whileanon-repairable entityis characterizedbyits lifetime distribution(e.g., the cumulativedistributionfunctionof itstimetofailure)randomvariable, arepairableentity behaviorisdescribedbyastochasticpointprocess, andsomustbecharacterizeddierently, e.g.,usingtherateofoccurrenceoffailures(ROCOF),ortheexpectednumberoffailuresforagiven time period. Any permutation of those choices translates into appropriate set of tools:for example,selecting the black-box direction can lead to accelerated testing techniques fornon-repairable entities, and to modeling repairable entities by means of stochastic processes.Furthermore, the white-boxapproachentails selecting either the repairable or non-repairableoptionbothatthesystem(output)andcomponent(input)levels.Booleanalgebramethods(e.g., faulttreesandreliabilityblockdiagrams)assumethatbothsystemsandthecomponentscomprisingthosesystemsarenon-repairable. Thissym-metry between the inputs and outputs characterization (and associated simplicity and clar-ity)isperhapsoneofthereasonsforthepopularityofthosetools. Incontrast,theuseofsuperimposedprocesses[76] impliesthatbothinputsandoutputsarerepairableentities.Inthiscontext, thestate-spacemodelsthatarethesubjectof thisworkcanbeclassiedas selectingrepairableoutputs andnon-repairableinputs withinthewhite-box(system)approach. Thisstate-spacerepresentationisattractiveinthecontextof modelingmain-tenanceprocesses, astheimpactof individual changestothesystem(e.g., changeof themaintenance interval,orintroducing a more reliable module)canbe captureddirectly(SeeFig.1)5Figure1: ReliabilityModelingParadigms1.1.3 MaintenanceModelingMaintenanceisanyactionthat hasasanobjectivetorestoreormaintaintheabilityofthesystemtoperformitsdesignfunction. Thehighlevel ofsophisticationofcurrentsys-tems necessitates appropriate levels of sophistication for the maintenance processes as well.Maintenancenot onlyimproves thecost-eciencyof operatingasystembut it canalsosignicantlyreducetheprobabilityofcatastrophicfailureofthesystem. Themaintenancemanagers must plan the maintenance actions, so that a balance is achieved between the ex-pectedbenetsandcorrespondingexpectedpotentialconsequences[89]. Numerousmodelsofmaintenanceprocesshavebeendevelopedoverthepast30years; seethereviewpapersbyChoandParlar[11], Dekker[20, 22], Scarf[69], Nicolai andDekker[64], andDasandSarmah[17].Whendealingwithcomplexsystems,multi-unitsmaintenancepoliciesmustbeconsid-ered(asopposedtosingle-unitmaintenancepolicies). Theimportanceofmulti-unitmain-tenancehasbeenwell recognized, andextensiveliteraturesurveysonthesubject[11, 22]make it clear that, as the number of components in the system increases, the level of detailscapturedbythemodelsdecreasesinordertomakemodelingoverallcomplexitytractable.Atthesametime, thereiscompellingevidencethatinbothnatural andengineering6domains, complex systems are unlikely to be fully coupled, as modular architecture providesclear advantages in developing desirable systems properties. The evolutionary advantage ofso-callednearlydecomposablesystemshasbeendemonstratedforbiological systems[72],while similar processes were identiedinthe historyof steamengine development [28].These concepts are also explicitly employed in the design principles of computer systems [15](including structured design [74]). It is therefore logical to take full advantage of the modularstructureofthesystemsinmodelingsystemfailures.Whileinsomesituationsafullydecoupledmodelingof eachunitispossible, couplingmechanismscansignicantlyimpacttheresults. Nevertheless, single-unitmodelsarestillwidely used in practice due to the overwhelming complexity of alternative methods of mod-eling,leadingtosub-optimalselectionsofmaintenancepolicies.1.2 ProblemStatementAsnotedearlier, themaintenancemodelingproblemconsistsof developingamathemat-ical model thatcapturesboththefailureprocessandthemaintenancepolicyduringthesystemslifecycle. Thefailureprocessisessentiallystochasticandthemaintenancepolicycanhavedeterministicandstochasticcomponents. Forcomplexsystems, thestateofthesystem depends on the state of its components and the state of the components themselvescanchangeduetodierentfailuremodesorthestatechangesof othercomponents. Themaintenancepolicymaystipulatedierentmaintenanceactionsdependingonthestatesofmultiplecomponents. Forexamplefailureof somecomponentsmaytriggermaintenanceactionsonothercomponents.Thegoal ofthisthesisistodeneamodelingframeworkthatallowsthemaintenancemodelinginastraightforward,fast,andeasilyveriablemanner.1.3 OrganizationofthethesisInChapter2wedescribethecurrentapproachestoreliabilityandavailabilitymodeling.InChapter3weintroducetheStochasticPetriNetwithagingtokensmodelingframeworkwhich is the main tool on which our proposed framework is based. In Chapter 4 we introducethecomponentbasedmodelingframeworkandapplyittotwoproblems. InChapter5we7showhowthemethodologyisappliedtothemodelingofthemaintenanceprocessofagasturbineengine. InChapter6wepresentourconclusionsandoutlinedirectionsforpossiblefuturework.8CHAPTERIIBACKGROUNDInthischapterbackgroundinformationonthetopicsandresultsknownintheliteratureandwhichareusedthroughoutthisworkareconciselypresented. Thechapterstartswithacharacterizationof the failure andmaintenance process of complexsystem. Then, adiscussiononthemodelingparadigmsusedforevaluationofreliabilitymodelsisgivenandat the end, the stochastic Petri net framework that will be used as the basis for the modelingis introducedas well. Complementaryinformationpresentingasummaryof probabilitydescriptionsusedtocapturestochasticevents[36,49,56]isgivenintheAppendix.2.1 CharacterizationoftheFailureandMaintenanceProcessBeforeattemptingtodeneamodelingframeworkitisessentialtounderstandthecharac-teristicsoreectsthataecttheproblemathand. Inthissectiondierentcharacteristicsassociatedtothefailureprocessandthemaintenanceprocessarediscussed. Dependenceeects that contributetothecomplexityof thesystembeingmodeledandthemodeledinteractions between the dierent components include the consequences of failure of a com-ponentaswellascoupledmaintenancepolicies.2.1.1 FailureProcessThe failure process of amulti-component systemis the process that describes howthesystemgoesfromanoperatingstatetoafailedstate(orinthecaseofmulti-statesystemstoadegradedstateandthentoafailedstate). Theprocessisoftentheresultof forcesandstressesgeneratedeitherduringtheintendedoperationof thesystemsthemselvesorfromexternal sources. Commonfailuremechanismsandcausesincludeweardegradation,corrosion, fracture, shockloads, fatigue, etc. Thefailureprocesscanbecharacterizedbythestructureof thesystemandbythefailuremodelsof itscomponents. Randomshockmodelinghasbeenextensivelystudiedforcasesinwhichdevicesareexposedtoexternal9shockenvironments, suchassuddenandunexpectedusageloadsandaccidental droppingontohardsurfaces. Intheliterature,therearefourcategoriesofrandomshockmodels;(i)extremeshockmodel: failureoccurswhenthemagnitudeof anyshockexceedsaspeciedthreshold; (ii)cumulativeshockmodel: failureoccurswhenthecumulativedamagefromshocksexceedsacriticalvalue;(iii)runshockmodel: failureoccurswhenthereisarunofkshocksexceedingacritical magnitude; and(iv)-shockmodel: failureoccurswhenthetimelagbetweentwosuccessiveshocksisshorterthanathreshold[50,61].Whenstudyingthefailureprocessofacomplexsystemadistinctioncanbemadebe-tweenafailure-basedreliabilityapproachandadegradation-basedreliabilityapproach. Inafailure-basedreliabilityapproach, therandomvariableof interestisthefailuretimeofthe component. In general, parametric statistical models are used to capture the stochasticbehavior of thelifetime. Incontrast, indegradation-basedmodels, therandomvariableofinterestistheremaininguseful life(RUL). TheRULisassumedtodependonanotherrandomvariablethat canbemonitored. Duringoperationof thesystemtheobservablerandom variables are used to obtain estimates of the RUL (see, for example, Lu and Meeker[52],Singpurwalla[73],andKharoufehandCox[40]).The structure of a system has a signicant eect on the system availability and reliability.In particular, a system can have a series, parallel, redundant structure or some combinationof theabove. Inaseries structure, thefailureof anycomponent leads tofailureof thesystem; in a parallel structure all components must fail for the system to fail. For redundantstructures, theredundancycanbehot, warmor cold, dependingonhowtheredundantelementisbeingused. Inhotredundancy, theredundantelementisoperatinginparalleltothemainelement. Inwarmredundancytheredundantelementisinstand-bywhilethemainelementoperates. Onlywhenthemainelementfailstheredundantelementstartsoperating. Incoldredundancy,theredundantelementisswitchedoandisonlyputintooperationifandwhenthemainelementfails. Incomplexsystemswithseveralhierarchicallevels redundancy can be implemented in any of the hierarchical levels. Finding the specicoptimalcongurationofaspecicsystemisaddressedbythereliabilityallocationproblem(Chern [10], Coit [13], Kuo [44], [45]). The structure functionis used to map the state of the10components to the state of the system. One of the basic characteristic that is required for asystem in this sense,is that it is coherent. A system is coherent if all of its components arerelevant anditsstructurefunctionismonotone. Inthisworkwefocusoncomplexsystemsthatarecoherent.At the lowest level of the hierarchy a unit can have dierent failure modes. Consideringthemodes separatelymight beof importanceas either theconsequences of thefailuresmight bedierent or themaintenanceactions that eachfailuremodetriggers might bedierent. Ingeneral,thefailureofanysinglecomponentcanbeconsideredinacompetingrisks frameworkwhere everyfailure mode is competingagainst the others tomake thecomponentfailinthatmode.Theproposedmodelingframeworkisabletodeal withall thesecharacteristicsof thefailureprocessesassociatedwithacomplexsystems. Theexamplesshownhowever, onlydealwithafailure-basedreliabilityapproach.2.1.2 MaintenanceProcessTheclassicationof maintenanceprocesses describedinthis sectionfollows [53]. Main-tenanceincludes all technical andadministrativeactions intendedtomaintainasysteminorrestoreittostateinwhichitcanperformatleastpartof itsintendedaction[20].Maintenancecanbeclassiedaccordingtoitstypeanditsdegree.Maintenancetypecanbeclassiedintotwomaincategories: correctivemaintenance(CM)andpreventivemaintenance(PM). Correctivemaintenanceareall maintenanceac-tionsperformedafterasystemhasfailedwiththeobjectiveof restoringitsfunctionality.Preventivemaintenancereferstoplannedmaintenanceactionsperformedwhilethesystemisoperational withtheobjectivemaintainingthesystemoveradesiredoperational time-horizonbypreventingordelayingfailures. Condition-basedmaintenanceisatypeofPMwhichtriggers themaintenanceactionwhenthemeasurement of aconditionor stateofthesystemreachesathresholdthatreectssomedegradationand/orlossofperformance(butnotyetfailure). OpportunisticmaintenanceincludesbothCMandPMandhasasanobjectivetomakeuseof theeconomicdependenciesthatarepresentinthesystembeing11studied.According to its degree maintenance can be classied in ve types [85]: a) perfectrepairor perfect maintenanceare actions which restore a system operating condition to as good asnew (generally, replacement of a failed system by a new one is a perfect repair); b) minimalrepairaremaintenanceactionsthatrestorethesystemtothesamefailureratethatithadwhenitfailed; andc) imperfect repair aremaintenanceactionsthat makeasystemnotasgoodasnewbutyounger(better)thanitwasbeforetherepair; d)worserepair areactionswhichunintentionallymakethesystemfailurerate(oractualage)increasebutthesystemdoesnotbreakdown; ande)worst repair areactionswhichunintentionallymakethesystemfailorbreakdown.Maintenancepoliciesareasetof rulesthatdescribethetypesof maintenanceactionsthat are consideredinresponse to what types of events [53]. For instance, inanagereplacementmaintenancepolicyaunitisreplacedatfailure(CM)ortimeT(PM),whereT isconstant[85]. Preventivemaintenancepoliciesareperhapsoneof themoststudiedmaintenancepoliciesintheliterature[17,84].2.1.3 DependenceInacomplexsystem, thereexistdierenttypesof interactionsbetweentheunits. Theseinteractionsarisefromthedesignof thesystemanditsintendedfunction. Threemajortypescanbeidentied,thersttwoarestochasticdependenceandstructuraldependence[75]andthethirdiseconomicdependence[11].Stochastic dependence occurs if the condition of components inuences the lifetime dis-tributionof other components. This kindof dependencedenes arelationshipbetweencomponentsuponfailureof acomponent. Forexample, itmaybethecasethatthefail-ureof onecomponentinducesthefailureof othercomponentsorcausesashocktoothercomponents.Structural dependenceapplies if components structurallyformapart, sothat main-tenance of afailedcomponent implies maintenance of workingcomponents, or at leastdismantlingthem. Thisrestrictsthemaintenancemanagerinhisdecisiononthegrouping12ofmaintenanceactivities.Inamulti-unit repairable system, economic dependence betweencomponents of thesystemissaidtooccurifthecostofperformingmaintenanceonthegroupofcomponentsis dierent than the cost of performing the same type of maintenance individually [11]. [64]renes theconcept byintroducingtheconcepts of positiveeconomicdependence(if thecostof jointmaintenanceislowerthanthecostof individual maintenance)andnegativeeconomicdependence(ifthecostofjointmaintenanceishigherthanthecostofindividualmaintenance).Positiveeconomicdependencycanarise, forinstance, fromeconomiesof scale(set-upcosts). Thiseconomiesofscaleresultwhenthemaintenancecostpercomponentdecreaseswith the number of maintained components. Another form of positive economic dependencyisdowntimeopportunity(componentfailureistreatedasanopportunityforPMof non-failedcomponents)likeinthecaseofmostcontinuousoperatingsystems. Forthistypeofsystem, the cost of system unavailability may be much higher than component maintenancecosts[85].Negative economic dependencycanarise, for instance, frommanpower restrictions.Grouping maintenance actions results in a peak in manpower needs. Manpower restrictionsmayevenbeviolatedandadditional laborneedstobehired. Anotherformof negativeeconomicdependencearisesfromsafetyrequirements, forinstance, useofequipmentmayhamperuseofotherequipmentandcauseunsafeoperation[64]Thereareseveralgeneralsourcesofdependentstatetransitions[11,47]: Common-causefailures, eitherduetocommonenvironment(e.g., coldtemperaturesduring the Challengerlaunchthatimpactedboth O-rings),ora common defect(e.g.,ifcomponentswereobtainedfromthesamebatchmadebythesamemanufacturer).Theycanbeconsideredanexampleof stochasticdependence. Ineither caseit ispossibletoexplicitlymodelthecommon-causefailurebyprovidingappropriatestatetransitionsthatimpactseveralcomponentssimultaneously,whileconsideringthere-mainingcausesoffailuresforthecomponentstobeindependent.13 Shared-load conguration (another example of stochastic dependence), where the fail-ure of acomponent cancause redistributionof the loadfor the components thatremain in operation, and therefore change their failure distributions. From the white-boxmodelingperspective,thisimpliesthatseveralcomponentstatesmustbedistin-guished, each having distinct failure distribution associated with the transition to thefailed state (in this context care needs be taken to account for aging processes in eachstate[80]). Inaddition, theconsequencesof thefailureof themodeledcomponentcanbedierentfromthesystemsperspective: redundantcongurationscanbecon-sideredaspecialcaseofsharedload,ifafailureofanothercomponentdoesnotaltertheloadforthecomponent,andthereforeitsfailuredistribution. Opportunisticmaintenance: ifoneofthecomponentsfailsandneedsareplacementorarepair, othercomponentsof thesystemcanbeservicedatthesametime(anopportunityis createdfor conductingmaintenanceactions onthoseparts). For avarietyofreasons(e.g.,thesystemhasbeentakenolineorpartiallydisassembled,techniciansareavailable, etc.), servicingseveral componentssimultaneouslyismoreecient economicallythanservicingeachcomponent inisolation(hencethenameeconomiccoupling). Induced(cascading)failures, whenthefailureof acomponentcausesotherfailures(forexample,aliberatedbladeinagasturbinecanbreakneighboringvanes). Logistics coupling usually refer to the logistics constraints where repair of a componentisdependentonwhathappentoothercomponentsthatmightbecompetingforthesame resources (spare parts, labor, or facilities). However, there is also the possibilityofapositivecoupling,wherefailureofanothersimilarsystemprovidesanadditionalrepairresource(e.g.,cannibalizationofparts).2.2 ModelingParadigmsBeforedescribingthemodelingparadigmonwhichthisworkisbased, adiscussionofthedierent paradigms usually encountered in the literature is provided. Fig. 2 shows a general14classicationof modelingparadigms. Apossibleclassicationis providedby[88]. Theclassicationispurposelydoneataveryhigh-level whichdoesnotshowsomeapproachescombiningmorethanoneparadigm.Figure2: ModelingParadigmsAmodel built usingthesystemdynamics paradigminvolves theiterativeevaluationof asystemof ordinarydierential equations. Insuchmodels, the state of the systemvariescontinuouslyintime. Incontrast,inevent-drivenmodeling,thestateofthesystemonlychangeswhenanevent(fromasetof possibleevents)occurs. Althoughitisalwayspossibletocreateamodel ofagivenproblembasedonanyoftheparadigms, sometypesofproblemsaremoreeasilycapturedusingaspecicparadigm. Systemdynamicsisusefulwherethenumberofindividualentitiesislargeandalltheconceptsintherealsystemcanbe thought of as continuous quantities interconnected in loops of information feedback. Onthe other hand,since event driven modeling has as a main focus the occurrence of an eventand describes the evolution the system as a sequence of events, it is appropriate where eachevent is important enough to register individually. In the agent-based approach the modelssimulate the simultaneous operations and interactions of multiple agents (who are presumedtobeactingontheirperceivedself-interest)withthegoal ofrecreatingand/orpredictingtheappearanceofcomplexphenomena. Withrespecttothediscrete-eventparadigmtherearetwomaindierences: agent-basedmodelscanhavecontinuousstatesandtheagentscanusemoresophisticateddecisionrulesincludingforinstance, adaptationofthoserules(i.e. learning).The modeling of maintenance processes involves in general a large number of items withstochasticeventsassociatedtoeachofthem. Theiroperationallivesareoftenmuchlargerthanthetimeneededtoinspect, repairorreplacethem. Althougheachitemhasassoci-atedfailurebehaviorandmaintenancerulesandhencecouldbeconsideredanagent,they15areinpassiveinthesensethateventsoccurtothem. Basedonthis, themostappropri-ateparadigmtocapturetheirbehaviorisanevent-drivenmodelingparadigm. Dierentframeworksexistforthemodelingofevent-drivenprocesses. Oneofthem,StochasticPetrinets, possessesenoughexibilitytocapturethedynamicsofthemaintenanceprocessandisintroducednext.2.3 Petri NetsPetri NetsareamodelingframeworkintroducedbyCarl AdamPetri inthesecondhalfof the 20thcentury[66]. It oers a graphical notationfor event-drivenprocesses thatincludechoice,iteration,andconcurrentexecution. Petrinetshaveanexactmathematicaldenitionoftheirexecutionsemantics. Sincetheirinception, Petri netsandinparticulartheir Stochastic Petri Net generalization have been used extensively to model discrete event-basedprocesses. Althoughseveral books[18, 31, 35, 37, 54, 70] extensivelydescribetheformalisminthefollowingsectionsabriefreviewofitisgiven.APetri net consists of places, transitions, andarcs. Arcs canconnect aplacetoatransitionorviceversa, butnevertwoplacesortwotransitions(mathematically, aPetrinet is abipartite directedgraphwithone set of nodes beingthe places andthe othersetof nodesbeingthetransitions). Graphically, placesarerepresentedbyemptycircles,transitionsarerepresentedbylledrectanglesandarcsarerepresentedbyarrows. Fig. 3showsasimplePetrinetwithallofitselementsappropriatelylabeled.Theplacesfromwhichanarcrunstoatransitionarecalledtheinput placesof thetransition; theplacestowhicharcsrunfromatransitionarecalledtheoutputplacesofthetransition. InFig. 3, Place1isaninputplaceofTransition2andanoutputplaceofTransition1.Graphically, places in a Petri net may contain a discrete number of marks called tokens.Anydistributionoftokensovertheplaceswillrepresentacongurationofthenetcalledamarking. InFig.3, thereisonetokeninsidePlace1, themarkingofthePetrinetisthen1tokeninPlace1and0tokensinPlace2.16Figure3: Two-statePetriNet. Marking12.3.1 ExecutionSemanticsGivenamarking,theexecutionsemanticsofthePetrinetareasfollows: Whenthere are sucient tokens ineachof the inputs places of atransition, thistransitionbecomesenabledanditcanre. Whenatransitionresitconsumestherequiredinputtokens,andcreatestokensinitsoutputplaces. Aringisatomic,i.e.,asinglenon-interruptiblestep.InFig. 3, Transition2isenabledbecausethereisonetokeninitsonlyinput place.Immediately after becoming enabled, Transition 2 res consuming the token in Place 1 andcreating a token in Place 2. The marking on the net after the ring of Transition 2 is showninFig.4.Figure4: Two-statePetriNet. Marking2Unlessanexecutionpolicyisdened, theexecutionof Petri netsisnondeterministic:whenmultipletransitionsareenabledatthesametime, anyoneof themmayre. Since17ringisnondeterministic, andmultipletokensmaybepresentanywhereinthenet(eveninthe same place), Petri nets are well suitedfor modelingthe concurrent behavior ofdistributedsystems.2.4 StochasticPetri NetsStochastic Petri nets are one of the many extensions to the Petri Net framework. In Stochas-ticPetrinets,theoriginalframeworkisextendedasfollows, First, ameasure(concept)of timeisintroduced. Thetraditional methodof doingthisisbyallowingaperiodoftimetoelapsebetweenenablingofthetransitionandringofthetransition. Thisperiodoftimewill becalledthetransitionringtime.Withthisextension,theframeworkisusuallynamedTimedPetriNets. Second, a probability distribution is associated to the ring time of all the transitions.Molloy [60] introduced this extension by considering exponentially distributed transi-tion ring times and presenting an isomorphism between the behavior of the resultingframeworkandMarkovprocesses. Theexecutionsemanticsaremodiedasfollows:When a transition becomes enabled, its associated probability distribution is sampledonce. The sampled time is the next ring time of the transition. With this extension,theframeworkisusuallynamedStochasticPetriNets. Itmustbenotedthatwithinthesameframeworktransitionswithadeterministicringtimeandwithimmediatering time can be captured. Conceptually they are characterized by an appropriatelyshiftedDiracdistribution. ThelatterextensionisduetoMarsand[55].2.4.1 ExecutionSemanticsGivenaninitial markingandaninitial time(whichisusuallyequal to0), theexecutionsemanticsofthenetareasfollows: Thesetofallenabledtransitionsiscalculated. Foreachenabledtransition,thetransitionsringtimeiscalculatedaccordingtoitsassociatedringrule(probabilistic,deterministic,orimmediate).18 Theminimumvalueofthesetofringtimesiscalculated. Letscallthistimemin. Theclockis advancedmin(i.e. thenewtimeis t =t + min). Thetransitionassociatedwithminisred. TheringprocessisthesameastheoneforregularPetriNets.Thisprocessrepeatsitselfuntileither: Thesetofenabledtransitionsbecomesempty,or Theclockreachesorsurpassesthe(predened)naltimeWiththesemanticsdenedlikethis, itcanbeproventhatthemodelingpoweroftheframeworkisequivalenttothemodelingpoweroftheSemi-Markovprocess[35].2.5 ReliabilityModelingwithStochasticPetri netsOne advantage of the Stochastic Petri Net paradigmwhenusedtomodel maintenanceprocessesisitsabilitytoelegantlydeal withPMmaintenanceactions. ThemodelingofPM actions, in particular the quantifying of the eect of performing PM actions at dierentintervalsisoneofthemostdicultmaintenanceactivitiestomodelwhenusingstochasticpointprocessasthemodelingframework[12,42,65].Inthis section, asimpleexampleof aPetri Net model is constructed. Consider anitemwitha single failure mode. The time to failure is modeledusing anexponentialdistributionwithfailure rate. Oncethe item hasfailed,itisrepaired. The repairprocessismodeledusinganexponential distributionwithrepairrate. Fig. 5ashowsaMarkovmodel representing the failure-repair process of the item and Fig. 5b shows the same processusingtheStochasticPetriNetframework.In the SPN model the marking of the net capture the state of the item and the transitionscapturethefailureandrepairprocesses. Lookingatbothmodels, avalidquestioniswhyintroduce the SPN framework if, in this case, it looks almost exactly the same as the Markovmodel?Toanswerthatquestion,considerthefollowingvariationofthepreviousproblem:thefailureprocessisbettermodeledbyaWeibull distributioninsteadof anexponentialdistribution. TomodelthisnewproblemwithMarkovweeither:19(a)MarkovModel (b)StochasticPetriNetModelFigure5: SingleItemModel have touse anaverage failure rate that approximates the failure rate of the newdistributionintheperiodofinterestor, wehavetoincreasethestate-spacewithratesbetweenthenewtransitionscapturingthebehavioroftheWeibulldistributionBothalternativesinvolveanapproximationof therealityof theproblem. Alterna-tively, tomodel thenewversionoftheproblemwithSPNswesimplyneedtomodifytheprobability distribution associated to the failure transition from an exponential distributiontoaWeibull distribution. Clearly, forthenewproblem, theSPNframeworkallowsforasimplerrepresentationofthefailurerepairprocessoftheitem.Summarizing,theSPNframeworkhasagreatermodelingpowerthantheMarkovpro-cess framework. This greater modeling power is the main reason motivating the introductionof the SPN framework for the modeling of reliability problems. It must be noted as well thatthis greater modeling power is not free;it comes at the expense of the number of analyticaltoolsthatcanbeusedtoevaluatethemodel.2.6 StochasticPetri NetModel EvaluationSo far the discussion has been centered on the framework specication and reliability modelcreation. Inthissection, wefocusonhowtouseanavailablemodel fortheevaluationofperformanceindicatorsofinterest.Themainpointthatmustberealizedisthat, unlikethecaseofhomogeneousMarkov20Chainsmodels, itisverydiculttoobtainperformanceparametersanalytically. Hence,toevaluatethemodel wemustresorttosimulationof themodel. Furthermore, sincethemodelcontainsprobabilisticelementstheappropriatesimulationframeworkistheMonteCarloSimulationframework. MonteCarlomethodsareaclassofalgorithmsthatobtainnumerical resultsbyperformingrepeatedrandomsampling. [68, 78] presentthegeneralmethod,[90]presentsapplicationsofthemethodinthereliabilityevaluationsetting.2.7 OverviewofSimulationApproachIngeneral, simulationisthecalculationof somemetricof interestbasedonamodel rep-resentingareal systemandnotthereal systemitself. WhenthesimulationusesrandomnumbersitiscalledaMonteCarlosimulation.Therearetwotypesofsimulationmodelanalysis:staticModels the behavior of the model of a system at a given instant of time. It requiresknowledgeofthefull-stateofthesystematthatinstantoftime.dynamicModelsthetimeevolutionofthemodelofasystemInthecontextof MonteCarloSimulationtheresearchquestionisstatedinsuchawaythatitsansweristheexpectedvalueofarandomvariableX. Inthisframework,aMonteCarlosimulationwill calculatenrealizationsof Xandthenestimateitsexpectedvaluewiththesamplemean,Xn=1nn

i=0Xi(1)The number nof realizations that must be calculatedis basedonthe desirederroroftheapproximation. Thecentral limittheorem(CLT)statesthatforasequenceofi.i.drandom variables with nite mean and nite variance 2, the random variablesn(Sn)convergeindistributiontoanormalN(0, 2)asngoestoinnity.UsingtheCLTitcanbeshownthat,P_Xncn Xn +cn_ 1 (2)where cis the -quantile. Eq. 2relates the error of the approximation(widthof theinterval)withthenumberofreplications.21Theprincipal strengthofsimulationisitsexibility. Therearefewrestrictionsonthebehavior that canbesimulated, soasystemcanberepresentedat anarbitrarylevel ofdetail. Attheabstractendof thisspectrumistheuseof simulationtoevaluateMarkovmodels or Stochastic Petri Nets. At the concrete extreme, running a benchmark experimentisinsomesenseusingthesystemasadetailedsimulationmodelofitself.The principal weakness of simulation modeling is its relative expense. Simulation modelsgenerallyareexpensive: todene,becausethisinvolveswritinganddebuggingacomplexcomputerprogram. to parameterize, because a highly detailed model implies a large number of parameters. to evaluate, because running a simulation requires substantial computational re-sources,especiallyifnarrowcondenceintervalsaredesired. Itmustbenoted,how-ever, that there exist techniques that allowthe computationof estimators withareduced variance (with respect to the variance of the typical estimator). This reducesthenumber of replications that needtobeperformedtoobtainadesiredlevel ofcondence. Twosuchtechniquesareimportancesamplingandsplitting[78].22CHAPTERIIICOMPONENT-BASEDMODELINGOFMAINTENANCEPROCESSESWITHECONOMICDEPENDENCEAcomponent-basedmodelingapproachtomaintenanceprocessthatfeatureeconomicde-pendenceisproposed. Theapproach,itsadvantagesandrequiredinputsarecharacterized.3.1 MotivationThe most signicant challenge in building practical system-level dependability modeling toolshas been dealing with largeness and complexity in the systems that are modeled [14]. Designspaceexplosion, i.e. theexponential increaseof thenumberof designcongurationsthatthemodelerintendstoevaluateasthenumberof parameterorparametervaluesusedtovarythemodelcongurationincreases[29].Furthermore, when using SPNs to capture the behavior of systems of practical interest,veryrapidlythegraphicalmodelitselfoftenlookscomplicatedanddiculttounderstand.Usuallyitcomprisesofnumerousparallelorcrossingarcswhichgiveanideaofcomplexityevenif,inreality,thebehaviorisrathersimple[71].Toovercomethedesignspaceexplosion, amodelingbasedoncomponent-level modelsisproposed. Theapproacheasesmodelcreationbyallowingthemodelertoonlyexplicitlycapturedesigncongurationsthatareexplicitlyrelatedtothecomponentbeingmodeled.Theinteractionsbetweendierentcomponentsarecapturedthroughaprobabilitymodelwhoseparametersarefounditeratively.3.2 System-level vscomponent-level modelsToillustratetheseoptions, letusconsiderasystemconsistingof nidentical componentsthat exhibit anincreasinghazardrate(asspeciedbelow) andaresubject toage-basedreplacement[3]: all partsarereplacedatfailureorafteraspeciedinterval s. Thecom-ponentshaveasinglefailuremodeandafailedcomponentcausesatotal renewal of the23Figure6: Age-basedreplacementmaintenanceschematicssystem: all components(notonlythefailedone)arereplacedwithbrandnewones; fur-thermore, weassumethatthereplacementsoccurinstantaneously. Notethatfailurescanshifttheoriginalscheduleofreplacements(seeFigure6). Heretheoutputsofinterestaretheexpectednumbersofscheduledreplacementsandfailures. Foragivencomponent, anopportunisticmaintenanceactionisperformedifthecomponenthasbeenreplacedduetothefailureofanothercomponent. Thenumberofopportunisticmaintenanceactionsforagivencomponentissimplythesumofthefailuresforallothercomponents.Asystemthree-statediagramisdepictedinFigure7A: eitherscheduledmaintenancetakes place andall ncomponents are replaced, or one of the components fails (andisreplaced), whiletherest(i.e.,n 1components)arereplacedasaresultofopportunisticmaintenance. Outoffourtransitionsdepicted, onlyone, UF, israndom: transitionsFUandSUare instantaneous, while UShas axeddelayor holdingtime. For asinglecomponent system(n=1) noopportunisticmaintenanceis possible, andthetransitionUFoccurswhenthecomponentfails. ThetimingoftransitionUFisfullydenedbythecumulativedistributionfunctionF(t)forthefailureof thecomponent: onceanewpartisput inservicet1, thefailuretimeisindependent identicallydistributedinaccordancewithF(t t1),asthereplacementcomponenthasthesamepropertiesastheoriginalone.Theresultingmodel correspondstoasemi-Markovprocesswherethetimedistributionofatransitiontoanewstatedependsonthecurrentstatetogetherwiththetimethesystemspent inthe current state (so-calledholdingor sojourntime), but not onthe previoushistory. The failure intensity, or hazard rate, is evaluated as h(t) = f(t t1)/(1F(t t1)),wheref(t)istheprobabilitydensityfunctioncorrespondingtoF(t). Anincreasinghazard24rateimpliesh(t3) > h(t2), t3> t2> t1.Figure7: State-spacerepresentationof asimplesystemwithidentical components: A)Global(system)B)local(component)As thenumber nof components inthesystemincreases, thetransitiondistributionUFneedstobeadjustedtorepresentfailureof anyof ncomponentsinsteadof one, butthestructureof themodel (Figure7A) remains unchanged. However, this is rather anunusual situation, asreal-lifesystemsarelikelytoconsistofdistincttypesofcomponentswithnonsymmetric interrelationships, whichwill cause the size of the systemmodel togrowdramatically, as the systemstates must be explicitlyrepresented(while the stateof eachcomponentcanbeinferredfromthesystemstate). Wewill refertosuchmodelsas global(Markovchains alsofall under this category), as opposedtolocalmodelsthatdescribeexplicitlythestatesof components, sothatthestateof thesystemcanbeinferred. StochasticPetri Nets (SPN) represent anexampleof local modelingthat canprovidemorecompactmodelswithlargenumberofcomponents[24, 35], andaversionofSPNwillbeutilizedinthenextsectiontocreatesuchsystem-levelmodels. However,evenforlocalmodels,adescriptionofthefullycoupledbehaviorwithhundreds(orthousands)ofcomponentsischallenging,ifnotimpossible.3.3 IntroductiontoComponentModelingTothisend, constructingdecoupled(component)modelswithaccuraterepresentationofthe coupling among components provides an intriguing alternative (cf. [51]). In the contextof state-space modelingthis implies specifyinganyappropriate additional states of thecomponent,whichisrelativelyeasytodoifthecouplingmechanismsarewellunderstood.25However, compactrepresentationof thetransitionsbetweenthecomponentstatesduetocouplingeectscanbesignicantlymorechallenging, andaddressingthischallengeisthemainfocusofthischapter. Itisreasonabletoassumethatthisrepresentationcandependonthetypeof coupling, soabrief overviewof thepossibletypesof couplingisdescribednext.First,wenotethatinthecontextofstate-spacereliabilitymodeling,couplingiscloselyrelatedtodependency. Itcanbearguedthatoneofthepurposesofwhite-boxmodelingisexplicitmodelingofdependencymechanisms,sothattheassumptionofindependencecanbe reasonably applied to individual state transitions, even if, from the black-box perspective,thecorrespondingsystem-leveltransitionsappeartobestatisticallydependent.From the modeling perspective, these sources can divided into two scenarios: in the rst(relativelysimple)scenariotheoccurrenceof anadditional conditionistime-independent(asiteitherhappensordoesnot). Eectively,therelevanteventoccurspriortothemod-eledtimeperiod. Forexample,componentscanbelongtoahigher-risksubpopulation,andasinglevalueofthecorrespondingprobabilityissucient(combinedwiththedenitionsofthelifetimedistributionsforbothhigher-risksubpopulationandfortherestofthecom-ponentpopulation). Incontrast, thetimingof anothereventisimportantinthesecondscenarioso, strictlyspeaking, thefull descriptionof thetimingof thateventneedstobespecied. Itcanbeobservedthatthemajorityof thelistedabovecouplingtypesfollowthissecondscenario, whereeectivelythereisaracebetweentheinternal andexternaleventsinthecomponentmodel. Wewillrefertothosescenariosascompetingrisksasthistermiscommonlyused[4],butinthecontextofmaintenancemodelingthecompetitionisnotlimitedtorisk-relatedtransitions(i.e. sometypeoffailure), butinsteadreferstoanypossibletransitionsforagivenstate(e.g., timingoftherstavailablespotamongseveralrepairqueues).Figure7Bdepictsasinglecomponentviewofthedescribedprocess. Here,parametersof asinglecomponent model areusedfor thefailuretransitionUF, whileanewstatecorrespondingtothe opportunityis introducedalongwiththe correspondingtransitionUOassociated with the failure of the other n1 components. Parametric distributions are26preferredinstate-spacemodelingfromthecompactnessperspective, assumingthattheiraccuracy is assured. Signicant further simplication can be achieved if constant transitionrates are used: rst, each transition is fully characterized by a single parameter, its constantrate; andsecond, if all statetransitionshaveconstantrates, theholdingtimesarenotaectingthechancesof transitiontothenextstate, andtheresultingprocessisMarkov(i.e.,thechancesoftransitioningtoanewstatearefullydeterminedbythecurrentstate).The analysis for Markov processes is signicantly easier than for semi-Markov processes. Atransitionwithconstantratefollowsanexponentialdistribution,whosecumulativeformisgivenbyFe(t) = 1 et.Inthecaseoffailuretransitions,isthereciprocalofthemeantimetofailure. Fixingthechoiceof thetypeof distributions for transitions for Markovmodels alsofacilitateshierarchicalmodelconstructionandaggregationofstatesandtransitions[7,77].Steady-state results oftendependonlyonthe meanparameters of the distributionsassociatedwiththestatetransitions, justifyingtheuseofexponential distributioneveniftheunderlyingdistributions aredierent (see, for example, theextensions of thePalm-Khinchintheorem,especiallyinthecontextoflogistics[9]).Anadditional argument for usingexponential distributionis basedonwhat canbecharacterizedasthecentral limittheoremforrepairablesystems: inacomplexrepairablesystemwithmultiplecomponents, failuresformahomogeneousPoissonprocess[3] atthesystemlevel. Thelatterargument,however,assumesthatthereisnocoordinationamongcomponent failures. In practice, for many systems with clear aging or degradation patterns(e.g., gasturbineengines), majorinspectionsandoverhaulsimposeanoverall structure,andwithineachmaintenancecyclethefailurerateisgenerallyincreasing. Opportunisticmaintenancehasbeenextensivelystudied[46],[16]and[25],andinallofthosestudiestheopportunitieswereassumedtofollowexponentialdistribution[19],[21]. Opportunitiescanbecausedbyextraneouseventsthatcanbeconsideredrandom,inwhichcaseexponentialdistributionisquiteappropriate. However,theopportunitiescanbealsocausedbyfailureof components, whichprovidesanimpetusforinvestigatingtheperformanceof exponen-tial distributions for suchcases, andestablishingthe requirements for adistributionto27adequatelyrepresentthecombinedeectofmultiplecomponents. Specically,iftheexpo-nential distribution is inadequate, we seek to nd an alternative compact representation of agiven combined distribution. To this end,the natural step is approximate the targeted dis-tribution with two-parametric distributions, and ensure that both rst two moments (meanandstandarddeviation, respectively)ofthetargeteddistributionarematched. Similarly,the targeted distribution can be sampled and the maximum-likelihood estimate (MLE) canbeusedtoobtainthemostappropriatedistributionparameters[57].The hypothesis exploredis that analternative selectionof parameters canbe moreadvantageousinthespeciccontextof modelingforcoupledmaintenancescenariosthatinvolve competingrisktype of coupling. The goal is not onlytopredict the expectednumber of maintenanceevents for multiplecomponents, but alsotorepresent theeectof multiple components for component simulationinacompact fashion. The resultingrepresentationcanbeusedasmodelingblocksforlargermodels(forexampletoevaluatethedurationsofoutagesandotherrelevantsystem-leveleects).Weexplorewhether inthecaseof non-identical distributions andwhenthenumberof distributionsisnite(andevenquitesmall)aWeibull distributioncanprovideagoodapproximation for the described combined eect. The justication for the developed methodstemsfromtheasymptoticconsiderationswithrespecttothesmall parameters/ 1,wheresisthereplacementinterval andisscaleof thefailuredistribution. Asymptoticconsiderationshavebeensuccessfullyused[8, 30, 32] tomakeapproximationsforreliablesystems where the rst terms of the Taylor series have beenused. Incontrast tothatpreviouswork,thepresentapproachreliesonthesecondorderapproximation.3.4 Computational MethodsforSystemAnalysisThere are no closed-form solution existing for nite time horizons, but two numerical optionsexist: numerical integrationof thecorrespondingrenewal integral equations, ordiscrete-eventsimulation. Bothoptionsarebrieydiscussednext.283.4.1 RenewalequationsFor a single component with no scheduled maintenance (when the part is replaced only uponfailure)thecorrespondingrenewal processiswell studied. Forourpurposesthefollowingformofthecorrespondingintegralequationisuseful:m(t) = f(t) +_t0m()f(t )d (3)Here the renewal density m(t) =dM(t)dt, M(t) is the expected number of renewals or renewalfunction,andf(t)isprobabilitydensityfunction[3,62](hereweassumethatderivativeofrenewalfunctionexists). Ecientmethodsforthenumericalsolutionofrenewalequationsexistusing, forexample, nitedierences[86, 23]. Theformof Eq. 3providesanaturalinterpretationthatisamenabletogeneralization: renewal attimetcanoccureitherduetotherstfailureatthattimewiththeprobabilitydensityf(t), orduetotherepeatedfailure, wherethepreviousrenewal tookplaceattime withthecorrespondingrenewaldensitym(),andthechancesofthefailuref(t )fortherenewaltime.Let us now introduce scheduled replacements at interval s, so the renewal can be causedeither byafailure or byscheduledreplacement (if nofailures occurredduringintervals). Therefore, wecanseparaterenewal densitym(t)intothetwodistinctparts: m(t)=u(t) + w(t), whereu(t)andw(t)representrenewal densityduetofailuresandscheduledreplacements, respectively. Fortherst cycle0k 1transitionissetonlyforcolork=4. Considernow,thatthePM+1 transitionres. Then, thetokenismovedfromtheClock totheCheckplaceandits color is increasedbyonemakingthetokencolor equal totwo. Sincethe>k 1hasasingleringrulewhichisonlysetforcolork=4itisnotenabledandthetokenstaysintheplaceuntil theReturntransitionresreturningthetokentotheClockplacebutnowwithcolork=1. If thisprocessrepeatsthreemoretimesthenthetokenwill reach the Checkplace with color k = 4. At that point the > k 1 transition is enabledandimmediatelyredmovingthetokentotheInspectiondueplacewhichenablesthePMorOpptransition in the component-state group triggering an inspection of the component.ThetokenontheschedulernetworkthenmovestotheClock placethroughtheReset -7transitionwhichchangesitscolorbacktozeroandresetsthescheduler.67ConsidernowthecasewherethetokenisintheClock placewithcolork=1. Thismodels the situation where one base inspection has already occurred. Next, an opportunityoccurs due tofailure of component whichis under aT hours inspectionperiod. Thisopportunitydoesnottriggeranopportunityofthecomponentbeinganalyzedbutitdoeschange the schedule of all the subsequent inspections. The scheduler networkcapturesthisbyincreasingthecolorofthetokenandresettingtheclockonthetransitionsduetopreventivemaintenanceoropportunities.Theperfectrepairtemplatecollectsresultsin8sensorsassociatedwiththefollowingplaces: IF Counter, FM1 Counter, FM2 Counter, FM3 Counter, Inspection Due, Inspection,RepairCounterandReplaceCounter.Themaintenancepolicyimplementedreplacesorrepairsthecomponenteverytimeaninspectiononthecomponentisperformed. Thecomponentisreplacedifithasreacheditslife or if it needs to be scrapped and otherwise it is repaired. From the token point of view,areplacementisidenticaltoarepair. Thetokenreturnstotheoperatingplacewithazerolife(likeanewcomponent).6.5.2 VericationofthetemplateInitialvericationofthemodelisachievedbyastep-by-stepsimulationrun. Thepurposeis tocheckthat the behavior of the model is as expected. This is important becauseSPN@doesnotdirectlyimplementprioritiesandcareneedstobetakentoensurethatthedesiredtransitionwill rerstwhentwoormoretransitionsarescheduledtoreatthesame time. Instead, relative timing is used throughout the model to assure proper behavior.Forinstance,considerthebehaviorofthescrapgroup(seeFig.29). WhenthetokenarrivestotheReplaceCounterplaceandthetokeninthescrapgroupisintheOkplacetheexpectedbehaviorisfortheResettransitiontoreandthentheAux. 6inthatorder. IftheorderinvertedthenafterAux. 6transitionrestheResettransitionwill be disabled and the Life and Scrap transitions will not be disabled-enabled as theyshould. To ensure this, Aux. 6 is set to re after 2 and Reset is set to re after . Thevericationstepconrmsthatthisandothersimilarsituationsintherestofthemodelare68Figure29: [email protected] Simulationofthetemplate: PerfectRepairExperiment6.5.3.1 UncoupledSimulationIntheuncoupledsimulationeverythingbuttheopportunitiescausedbyfailuresof othercomponentsareenabled. Theresultsfromtheuncoupledsimulationprovidethestartingpointfortheestimationoftheparametersoftheopportunitiesprobabilitymodel.TheresultforoneoftheexperimentsranintheuncoupledcongurationareshowninTable 7. Since the induced failures are not enabled the column corresponding to the inducedfailureshasonlyzeros.Table7: ResultsofUncoupledExperiment,percomponent. 1E7replicationsFM1 FM2 FM3 Repair Replace Insp. Ind. Fail.C1 0.06016 0.0 0.0 5.90278 2.09801 7.94064 0.0C2 0.01495 0.30760 0.0 6.48543 1.55598 7.71886 0.0C3 0.0 0.0 0.0 1.97599 6.02401 8.0 0.0C5 0.0 0.0 0.0 5.97251 2.027489 8.0 0.0C7 0.008658 0.0064593 0.0049837 0.007006 3.99346 3.9804 0.0C8 0.02509 0.024729 0.002738 0.55261 3.44812 3.94818 0.0C9 0.005818 0.0 0.0 1.38843 2.61157 3.99419 0.0C10 0.02964 0.0 0.0 1.0E-7 2.0001 1.9704 0.0C11 0.0 0.0 0.0 0.12804 3.87196 4.0 0.0C12 0.004588 0.006652 0.0 4.90276 3.09802 7.98954 0.0C13 1.3234 0.0 0.0 4.0E-7 1.55433 0.2309 0.0C14 0.0 0.0 0.0 0.9997 1.0003 2.0 0.069The uncoupled simulation shows that the perfect repair policy is eective in maintainingthenumberofunscheduledoutagesofthesystemforallthecomponenttypesexceptcom-ponenttype13whichhasalargelikelihoodoffailingatleastoncethroughoutthesystemlife.Also,there is an imbalance in the number of events. Consider for instance,componentstype 1, 2and3. This component are under the same inspectionschedule however thenumberofinspectioneventsforeachofthemdiersslightly(7.94064, 7.71886and8.0).This is duetothedecouplingof eachof thecomponent simulationfromtherest of thesysteminconjunctionwiththedisablingofthetransitionsrepresentingthiscoupling.6.5.3.2 Uncoupled+InducedFailuresSimulationTheresultsfromtheuncoupledsimulationareusedtoestimateamodel fortheinducedfailureprobabilitydistribution. Theinducedfailuresareassumedtofollowanexponentialdistribution. Fromthesystemarchitecturetheinducedfailurebehaveasaseriessystem.Hence,therateparameteroftheinducedfailureprobabilitydistributioniscalculatedas,9,if= N77,FM3 +N88,FM312,if= N77,FM313,if= N77,FM3 +N88,FM314,if= N77,FM3 +N88,FM3where,7,FM3=N7,FM3TLC9,FM3=N8,FM3TLCandTLCisthesystemslifecycle.Inthisexperiments, thetemplateisonlyinstantiatedwiththedataforC9, C12, C13andC14.706.5.3.3 CoupledIterationsExperimentsThe results fromthe uncoupledincludinginducedfailures are usedtoestimate the pa-rametersoftheopportunisticmaintenanceprobabilitydistributions. Thesameprocedureappliedtothecalculationof theparametersof theinducedfailuredistributionareused.Theequationsforthelevel-0opportunitiesaregivenby,1,L0= (N11)1,FM1 +N22,FM12,L0= N11,FM1 + (N21)2,FM1i,L0= N11,FM1 +N22,FM1fori 3, 5, 7, 8, 9, 10, 11, 12, 13, 14.Theequationsforthelevel-1opportunitiesaregivenby,2,L1= (N21)2,FM2j,L1= N22,FM2forj 1, 3, 5, 7, 8, 9, 10, 11, 12, 13, 14.Theequationsforthelevel-2opportunitiesaregivenby,7,L2= (N71) (7,FM2 +7,FM3) +N8 (8,FM2 +8,FM3) +N12 (12,FM1 +12,FM2)8,L2= N7 (7,FM2 +7,FM3) + (N81) (8,FM2 +8,FM3) +N12 (12,FM1 +12,FM2)12,L2= N7 (7,FM2 +7,FM3) +N8 (8,FM2 +8,FM3) + (N121) (12,FM1 +12,FM2)k,L2= N7 (7,FM2 +7,FM3) +N8 (8,FM2 +8,FM3) +N12 (12,FM1 +12,FM2)fork 1, 2, 3, 5, 9, 10, 11, 13, 14.Oneaspectof thecalculationof opportunitiesisthatwhentheparameterfortheop-portunisticdistributionmodelarecalculatedforcomponenttypesthatthemselvesprovideopportunitiesthenwhenacomponentofthattypefailsitdoesnotprovideanopportunityforitself (andhencewesubtractonefromthetotal numberof itemsbeingconsideredasprovidingopportunities).Three iterations of the coupled experiment are run, each using the results of the previous71iterationtoupdatetheestimateoftheinducedfailuresandtheopportunities. Theresultsoftheexperimentsareshownnext.(a)Level-0 (b)Level-1(c)Level-2Figure30: RateParametersofopportunitiesprobabilitymodelFig. 30 shows the estimation of the scale parameter for all of the opportunity probabilitymodels. First, it is noted that the estimation has reached an approximately stationary valueafterthethirditeration. Thisprovidesthebasisforastoppingcriteria.Second, theestimatedscaleparametersbasedontheresultsfromtheuncoupledsim-ulationover predicts the stationaryvalues by5%to25%. This is consistent withthemaintenancepolicywhichreplacesthecomponentswhenanopportunityforreplacementpresents itself. In the uncoupled simulation failures of other components do not provide op-portunitiesforreplacement(theyarenotenabled)andhencethecomponentsarereplacedlessoftenwhichinturnsresultsinhavingmorefailuresthroughoutthelife-cycle. Sincethenumberoffailuresisusedforestimationofthescaleparameteroftheopportunities,a72largernumberof predictedfailureswill resultinalargerestimationof thescaleparame-ter. Inthesamevein,ifcomparingtheestimatedvalueswhenusingtheinformationfromtherstiterationandwhenusingtheinformationfromtheseconditerationanalternatingtrendisobserved. Thisisonceagainconsistentwiththemaintenancepolicy. Duetotheoverpredictionof thenumberof opportunitieswhenusingthedatafromtheuncoupledsimulation,theresultingnumberoffailuresintherstiterationofthecoupledsimulationisunderpredictedwhichinturnresultsinanunderpredictionof thescaleparameteroftheopportunities.(a)C7,NumberofFailures (b)C7,NumberofMaint. Actions(c)C8,NumberofFailures (d)C8,NumberofMaint. ActionsFigure31: SampleofnumberofeventscollectedfromexperimentsFig.31showsasampleofthenumberofeventscollectedfromtheexperimentforcom-ponenttypes7and8. SimilarguresfortheremainingofthecomponentscanbeseeninSec.D.1. Fig.31aand31cshowtheoveralleectoftheopportunisticmaintenancepolicy73onthenumberof failuresduetothedierentfailuremodes. Theopportunisticmainte-nance policy clearly reduces the overall number of failures. At the same time, it is observedthatforcomponenttype7thenumberoffailuresinmode2andforcomponenttype8thenumberoffailuresinmode3isnotaectedbytheopportunisticmaintenancepolicy. AnexplanationofthistrendcanbegleanedfromFig.32: thehazardratefunctioninthetimeinterval uptotheirinspectionisforbothmodeseitherconstant(forC7-FM2), oralmostconstant(forC8-FM3).(a)C7 (b)C8Figure32: Hazardratefunction,failuremodesFig.31aand31cshowtheoveralleectoftheopportunisticmaintenancepolicyonthenumberandtypeof maintenanceactionsperformedonthecomponenttypes. Itisclearthatthereductionintheoverall numberof failurescomesattheexpenseof anincreasednumberof maintenanceaction. Whetherthisisacceptableornotwill dependonall thecostsassociatedwithbothscheduledandunscheduledmaintenanceactions.FinallyinFig. 33thenumberof inspectionspercomponenttypeastheyevolvefromuncoupled to coupled iterations is shown. The plot is color coded by inspection level. If themodel constructed was a system level model with all the couplings explicitly stated then thenumber of inspections of dierent levels would clearly have a denite value. It is interestingtonotethateventhoughthereisnoexplicitruleenforcingthatthenumberofinspectionsforthedierentcomponenttypescorrespondingtothesameinspectionlevel mustbethesame(intheproposedcomponent-wisemodel)itturnsoutthatapproximatelythesamevalueisfoundfromthecomponent-levelmodels.74Figure33: NumberofInspectionEvents6.6 Minimal RepairModeling6.6.1 ComponentTemplateTheminimal-repaircomponenttemplateisshowninFig. 34. Thekeyideatomodel theminimal-repairmaintenanceactionistomaketheschedulerclock(thetokeninthesched-ulerbranch)independentofthecomponent-statebranchforthecaseinwhichascheduledmaintenancearrivesandarepairactionisdecidedbythescrap-clockgroup.To achieve this the model is modied so that the token from the component-state branchonly leaves the operating state when the component needs to be replaced. Since the numberof maintenanceeventsisstill recordedthenumberof repairscanbecalculatedasfollows(inapost-processingstageafterthesimulation). Letcmaint, cfailandcrplbethenumberof timesthetokenof schedulerbrancharrivesatthemaintenanceplace, thetokenof thecomponentbrancharrivestothefailureplaceandthenumberof timesthetokenof thecomponent-state branch arrives to the replace place respectively. Recall rst that (from themaintenancepolicy)aninspectionalwaysresultsineitherareplacementorarepairand75Figure34: ScreenshotofminimalrepairtemplateinSPN@thatafailurealwaysresultsinareplacement. Then,wehavenrpr +nrpl,OM,PM= cmaint(22)cfail +nrpl,OM,PM= crpl(23)wherenrpris thenumber of repairs andnrpl,OM,PMis thenumber of replacements duetoeitheropportunisticmaintenanceorpreventivemaintenance. Fromthis, thenumberofrepairsisgivenbynrpr= cmaint +cfailcrpl(24)6.6.2 VericationofthetemplateThevericationprocessof thetemplateismadebyseveral step-by-stepsimulationswithmodicationof someparameterstoensurethatthelogicof themodel followsthespeci-cations. Themaindierenceinthelogicbehaviorof themodel whencomparedtotheperfect repair template is the inclusion of an inhibitor arc going from the Ok place in thescrap-cloggrouptotheSched. Repl. transition. Itisthisinhibitorarcthatindirectlyenablesthebehaviorexpectedfortheminimal repairmaintenancepolicy. Theabsenceofthisinhibitorarcreducestheminimal-repairtemplatetotheperfect-repairtemplate. Asafurthervericationstepof theconsistencybetweenthetwotemplatesanexperimentwas76runfortheminimal-repairtemplatemodiedsothatitdoesnotincludetheinhibitorarc.The results were the same as the ones obtainedbyanequivalent experiment usingtheprefect-repairtemplateasabasis.6.6.3 Simulationofthetemplate: MinimalRepairExperiment6.6.3.1 UncoupledSimulationThe result for one of the experiments raninthe uncoupledcongurationare showninTable 8. Since the induced failures are not enabled the column corresponding to the inducedfailureshasonlyzeros.Table8: ResultsofUncoupledExperiment(Minimalrepair),percomponent. 1E7replica-tionsFM1 FM2 FM3 Repair Replace Insp. Ind. Fail.C1 0.20147 0 0 5.8555 2.1537 7.8077 0C2 0.05350 0.40026 0 6.43654 1.61105 7.59383 0C3 0 0 0 1.97553 6.02447 8.0 0C5 0 0 0 5.97250 2.02749 8.0 0C7 0.00869 0.006426 0.005005 0.006963 3.99313 3.97997 0C8 0.03458 0.027310 0.002797 0.5526347 3.44801 3.9359 0C9 0.01242 0 0 1.3880027 2.61204 3.98762 0C10 0.02965 0 0 2E-7 2.000061 1.97040 0C11 0 0 0 0.12802 3.87198 4.0 0C12 0.01846 0.006539 0 4.89967 3.10064 7.97532 0C13 1.30614 0 0 4E-7 1.53449 0.22834 0C14 0 0 0 0.99971 1.000284 2.0 0Table9showsthetotalnumberoffailurespercomponenttypeforeachofthemainte-nance policies when they do not include opportunistic maintenance (uncoupled simulation).The general trend is that the total number of failures throughout the life cycle of the systemislargerwhenaminimal repairpolicyisusedinsteadof aperfectrepairpolicy(withoutincludingopportunisticmaintenance).Thescaleparameterfor theopportunityprobabilitymodel iscalculatedinthesamewayasoutlinedintheperfectrepairsection. TheirevolutionasiterationsarecarriedoutisshowninFig.35.Thesameeectnotedforthesimulationofthesystemunderaperfectrepairpolicyis77Table 9: Total number of failures of perfect repair andminimal repair policywithoutopportunisticactionsPerfectRepair MinimalRepair Rel. %Di.C1 0.0601617 0.2014748 234%C2 0.3225539 0.4537627 40.7%C3 0 0 C5 0 0 C7 0.0201015 0.0201251 0.12%C8 0.0525571 0.0646931 23%C9 0.0058183 0.0124197 113%C10 0.0296444 0.0296553 0.0368%C11 0 0 C12 0.0112405 0.0249995 122.4056C13 1.3234265 1.3061459 -1.3057C14 0 0 (a)Level-0 (b)Level-1(c)Level-2Figure35: ScaleParametersofopportunitiesprobabilitymodel,minimalrepair78presentinthesimulationofthesystemunderaminimal repairpolicy. Namely, usingthedatafromtheuncoupledmodel simulationresultstoestimatethescaleparameterof theopportunitiesresultsinanoverpredictionofitsvalue.Fig.36showsasampleofthenumberofeventscollectedfromtheexperimentforcom-ponenttypes7and8. SimilarguresfortheremainingofthecomponentscanbeseeninSec.D.2. Fig.36aand36cshowtheoveralleectoftheopportunisticmaintenancepolicyon the number of failures due to the dierent failure modes. The opportunistic maintenancepolicyclearlyreducestheoverallnumberoffailuresatthecostofincreasingthenumberofmaintenanceactions.(a)C7,NumberofFailures (b)C7,NumberofMaint. Actions(c)C8,NumberofFailures (d)C8,NumberofMaint. ActionsFigure36: SampleofnumberofeventscollectedfromexperimentsTable10comparesthetotal numberofmaintenanceactionsfortheperfectrepairandthe minimal repair policy. The minimal repair policy consistently increases the total numberofmaintenanceactionsforeachcomponenttype. Focusingontypeofmaintenanceaction79one canobserve thatthe minimalrepair policyincreases ina largerpercentage the numberof(minimal)repairs.Table 10: Total number of maintenance actions of perfect repair and minimal repair policiesRepair ReplacePR MR Rel. %Di. PR MR Rel. %Di.C1 9.4831 12.5635 32.5 2.9484 3.0108 2.1C2 10.8478 13.9249 28.4 1.5775 1.6551 4.9C3 6.3178 9.2713 46.7 6.1228 6.3107 3.1C5 9.5181 12.6202 32.6 2.9216 2.9601 1.3C7 3.6535 5.4469 49.1 4.9412 5.5165 11.6C8 4.1703 6.0928 46.1 4.4286 4.8815 10.2C9 5.9054 8.1526 38.1 3.0002 3.1704 5.7C10 0.0017 0.0026 52.9 4.9543 6.3191 27.5C11 4.1753 6.1259 46.7 4.416 4.8321 9.4C12 9.2951 12.4167 33.6 3.3 3.3606 1.8C13 0.0001 0.0001 0.0 3.9479 4.8457 22.7C14 3.9339 5.2919 34.5 1.3073 1.3498 3.380CHAPTERVIICONCLUSIONSANDFUTUREWORKAn analytical method for the compact characterization of competing-risk coupling of a com-ponent with the rest of the system has been developed and tested on small scale state-spacemodels. Whilethemodelsconsideredinthethesiswerespecictomodelingopportunisticmaintenancecombinedwiththeage-basedreplacementpolicies, thedescribedphenomenaispertinenttoabroadrangeof couplingscenariosthatinvolveseveral competingtransi-tionsfromthesamestate. It iswell known(e.g., Ref. [26]) that theuseof exponentialdistributionsinsuchsituationscansometimesleadtosignicanterrors(iftheunderlyingdistributionisnon-exponential, includingdeterministicdelays). However, itwasnotclearastowhich(compact)propertiesof adistribution(inadditiontothemean)impactthesystem-level results. Whilethestandarddeviationseemstobeanatural candidate, thepresent work argues that the use of so-called winning ratio , provides superior accuracy.Specically, it was shownthat for Weibull distributions matchingthe winningratioleads tomore precise results thanthe use of adistributionthat matches the tworstmoments of the targeted distributions or an approximation obtained using MLE. For Weibulldistributions, thedevelopedprocedureisalsojustiedonasymptoticconsiderationsthatdemonstratethat,astheinspectionintervalgetssignicantlysmallerthanthescaleofthedistribution,thewinningratiotendstoasimpleratiodeterminedbytheshapeparametersof Weibull distributions. Instead of providing a good global match for the whole range of thedistribution, the resulting approximation targets the left tail of the distribution, which is themostrelevantforrealisticmaintenancescenarios. Itwasfurthershownthatacombinationof lognormal distributions can be well approximated by a Weibull distribution with matchedwinningratioaswell. Itishopedthatthedevelopedprocedurewillfacilitatesthecompactrepresentationofmaintenancepoliciesforcomplexsystemsbyenablingtheapplicationof81component-wiserepresentationof maintenanceprocesses that accuratelyrepresent inter-componentcouplings.Onepossibleavenueofexplorationforfutureworkistostudydierentiterationproce-duresusedtocalculatetheparametersoftheapproximatingdistribution. Researchques-tionsonthisinstancewouldincludeif all theiteratingproceduresconvergetothesamevalueorthespeedofconvergence. Anotherpossibleextensionofthisworkistothemod-elingofdependentfailuremodes. Theproposedmethodologyisbasedontheideathattheexogenous eect ontheitemunder considerationcanbemodeledusingcompetingrisksideas. Since competing risk analysis has been used in reliability modeling to capture depen-dentfailuremodesonepossibleresearchquestionwouldbetostudythesuitabilityoftheapproachforthereductionoftheitemmodelcomplexity. Apositiveanswerwouldleadtoalargereldofapplicationofthemethodologyandtothepossibilityofdealingwithmorecomplexproblems.82APPENDIXAPARAMETRICSTATISTICALDISTRIBUTIONSINRELIABILITYMODELINGSinceinthisworkwefocusonreliabilitymodelingwewill onlyprovideadescriptionofprobabilitydistributionsofnonnegativerandomvariables. Theserandomvariablesaretheonesencounteredwhenstudyingwaitingtimes. Whilefornonnegativerandomvariableswithstandarddeviationsthataresmall comparedtotheirmeansthenormal distributioncanoeranappropriateapproximation, fortheremainderofthecasesthisisnotso. Fornonnegative random variables and in particular when dealing with reliability modeling, theexponentialdistributionisundoubtedlythemostpervasivedistributioninuse[56].Oneof thecentral conceptsusedtodescribephenomenawhichareof astochasticinnatureistheconceptofarandomvariable.Denition1(RandomVariable). Considerarandomexperimentwithasamplespace C.AfunctionX,whichassignstoeachelementc ConeandonlyonenumberX(c) = x,iscalledarandomvariable. ThespaceorrangeofXisthesetofrealnumbers D = {x : x =X(c), c C}.If Disacountablesetthentheassociatedrandomvariableiscalledadiscreterandomvariable; otherwise, if Dis aninterval of real numbers it is calledacontinuous randomvariable.Variousalternativefunctionscanbeusedtofullyrepresentarandomvariable. Thesefunctions include the distribution function, the reliability function, probability density func-tion, hazardfunction, cumulativehazardfunction, meanresidual lifefunction, andtotaltime-on-test transform. Theserepresentations applytocontinuous anddiscreterandomvariablesandwhenoneof themisgiventherestcanbeuniquelyobtained. All of themareuseful becauseforagivendistributiononeof themmayhaveaverysimpleformandalsobecausecertainaspectsofthedistributionmaybemorereadilyseenbylookingata83specicfunctions(e.g. themodeofadistributioniseasilyappreciatedinthegraphofitsprobabilitydensityfunctionbuthardertondinthegraphofitscumulativedistributionfunction).InthefollowingdenitionswewillletXbearandomvariable.Denition2 ((Cumulative) DistributionFunction). The functionF : (, )Rdenedby,F(x) = P (X x) = P ({c C: X(c) x}) (25)iscalledthe(cumulative)distributionfunctionofXDistributionfunctionsaresometimescalledcumulativedistributionfunctionstoem-phasize the fact that F(x) accumulates the probabilities less thanor equal tox. Thedistributionfunctionisnon-decreasingandright-continuous. Moreover, itcanbeshownthatlimzF(z) = 0andlimzF(z) = 1.Denition3(ReliabilityFunction). ThefunctionR : (, ) Rdenedby,R(x) = P (X x) (26)iscalledthereliabilityfunctionofX.Reliabilityfunctionsaresometimescalledsurvival functions[56] orsurvivorfunc-tions[49]. Moreover, sometimestheyaredenotedbyF(x)[3, 56] orS(x)[49]. Clearly,R(x) = 1 F(x). WhentherandomvariableTrepresentsthelifetimeofanitem,thenthevalueof thereliabilityfunctionattisinterpretedastheprobabilitythattheitemisstillfunctioningattimet.Denition4(ProbabilityDensity). Iffisanonnegativefunctionforwhich,F(x) =_xf(z) dz, forallrealx (27)thenfiscalledaprobabilitydensityofX(orofF).84Denition5(Hazardrate). If Fisanabsolutelycontinuousdistributionfunctionwithdensityf,thenthefunctionhdenedon(, )byh(x) =___f(x)R(x)ifR(x) > 0 ifR(x) = 0(28)iscalledthehazardrateofX,(orofF).Veryoftenthehazardrateiscalledthehazardfunctionorthefailurerate. Itismaybethe most popular representation when dealing with random variables modeling lifetime duetoitsinterpretationastheamountof riskassociatedwithanitemattimet. Itcanbeshown[49]that,forsmallt,thehazardratefunctionis,h(t)t = P(t T t + t | T t)which in words says that the probability that an item will fail within the next t time unitswhenitisknownthatithasalreadysurviveduptotimetisproportional tothefailurerate.A.0.4 ExponentialDistributionIt is the most important of one parameter familyof life distributions. Theyare quitesimpletodescribeandexceptionallyamenabletostatistical analysis. Several ofthemostcommonlyusedfamiliesoflifedistributionsaretwo-orthreeparameterextensionsoftheexponential distribution. The exponential distribution,with its constant hazard rate formsabaselineforevaluatingotherfamilies.The reliability function, probability density function and hazard rate function are givenby,R(t) = et(29)f(t) = et(30)h(t) = (31)whereisascaleparameteroftencalledthefailurerate. Fig.37showsthedensity,distri-butionandhazardfunctionforsomevaluesofthefailurerate.85(a)Densityfunction (b)Distributionfunction(c)HazardfunctionFigure37: Exponentialfunction86Theexponential distributionistheonlycontinuousdistributionforwhichthehazardrate is a constant. It greatly simplies calculations with it because of the following properties[5], MemorylessProperty Knowingthattheitemisfunctioningatthepresenttime, itsbehaviorinthefuturewill notdependonhowlongithasalreadybeenoperating. Constanthazardrateatsystemlevel If a system without redundancy consists of n el-ements and the failure-free times of these elements are independent and exponentiallydistributedthen,thesystemfailurerateisalsoconstantandequaltothesumofthefailureratesofitselements.A.0.5 WeibullDistributionAgeneralizationoftheexponentialdistributionwithahazardratethatcanbedecreasing,constantorincreasing. ItwasintroducedbyW.Weibullin1951,relatedtoinvestigationsonfatigueinmetals.The reliability function, probability density function and hazard rate function are givenby,R(t) = e(t)(32)f(t) = t1e(t)(33)h(t) = t1(34)where and are the scale andshape parameter respectively. Since noparametriza-tionoftheWeibull distributionhasbeenstandardizedonemustalwayscheckthespecicparametrizationbeingusedbyanygivenauthor. Fig. 38showsthedensity, distributionandhazardfunctionforsomevaluesoftheshapeparameter.In the context of reliability, Weibull distributions are often used due to their exibility ofrepresenting hazard rate functions that can be either increasing or decreasing with time. Theformercorrespondtotheshapeparameter>1(e.g., failuresindeterioratingsystems),whilethelattercorrespondtotheshapeparameterk-1 2100 None 12345696C.2 Minimal RepairC.2.1 ModelLogicTables 16 and 17 illustrate the setting of each of the transitions in the minimal repair modeltemplateandthedescriptionofwhateachsensoriscount