Determining a defensible preventive maintenance plan
Presented by Jim Kennedy, CPEng, CFAM, CAMA
Interlogis Consulting Page 1
1 August 2017 Interlogis Consulting Page 2
Course Agenda
Day 1
• Introduction
• Session 1 – Maintenance and its management
• Session 2 – Risk and Reliability
• Session 3 – RCM and Condition Monitoring task period
Course Purpose and Outcomes
• Purpose
• To prepare for determining an objective condition monitoring program using
optimising and verifying algorithms in a defensible manner.
• Outcomes
• Define the defensible budget concept
• Understand and list the maintenance management objectives in Asset
Intensive industries
• Define maintenance and identify the different types
• Describe the failure characteristics associated with preventive maintenance
plans
• Explain the role and development of Preventive Maintenance Plans
• Explain the basic process of FME(C)A/RCM/LORA/TA
• Determine condition monitoring task periods
• Verify condition monitoring task period
Session 1 – Maintenance and its Management
Setting the scene
The defensible (maintenance) budget
Assures agreed and verifiable objectives of:
• Safety and environmental risks managed
• Required performance achieved at known level of assurance
• All done at a desired balance between the performance, the cost and the
residual risk
Defensible is defined as comprising solutions that are:
• Fact and risk based
• Fully traceable to system/asset output requirements
• Demonstrably good practice (international and national standards)
• Compliant with statutory and regulatory imperatives
• Implemented by competent (certified) staff
• Supported by verified technology (information and decision systems)
• Transparently and verifiably costed
• Deliverable in the agreed time frame
What is Maintenance?
All activities necessary to retain an item in or return it to a serviceable condition.
Blanchard 1974
Nowlan and Heap 1978
IEC International Electrotechnical Vocabulary*
Maintenance objectives - aerospace industry
• Preserve inherent levels of safety and reliability designed into equipment
• Restore safety and reliability to their inherent level when deterioration has
occurred
• Obtain the information to improve all processes associated with the
system lifecycle
• Do the above at minimum cost of ownership
Adapted from
Nowlan and Heap page xvi
December 1978
Activity 1.3
Maintenance Terminology
Maintenance Objectives
• Preventive Maintenance: Condition Monitor, Hard Time Activity, Functional Test
• Corrective Maintenance: Repairs (Stds), Renewals (Cost)
• Unplanned / Planned
Technical Maintenance Plans
• Preventive maintenance policies developed using FMECA/RCM to
achieve inherent asset reliability.
• TMPs were introduced in the Royal Australian Air Force in the 1970s, the rail
industry in the 1980s, the power industry in the 1990s and electrical distribution in the 2000s.
• Policies cover:
• Which assets are to be maintained?
• What maintenance is to be done?
• When the maintenance is to be done?
• How the maintenance is to be done?
• Where the maintenance is to be done?
• Who is authorised to do the maintenance?
Maintenance Requirements Analysis
Maintenance requirements analysis process
• Failure Mode Effects and Criticality Analysis – FMECA
• Reliability Centered Maintenance – RCM
• Level of Repair Analysis – LORA
• Task Analysis – TA
[Process diagram. Inputs: Business, Functions, Assets. Stages: FMECA, RCM, LORA, TA, with intermediate results "Failure modes and parts" and "Failure modes, parts, risk and causes". Output: TMP (Service Schedule, Preventive, Corrective, Repairs).]
Session 2 – Risk and Reliability
Doing it by the numbers
Risk management process
Communicate and Consult
Monitor and Review
Establish the Context
Identify Risks
Analyse Risks
Evaluate Risks
Treat Risks
AS HB89 Risk assessment methods
Event Tree Analysis
Fault Tree Analysis
Cause and Consequence
Bow Tie Diagram
Failure Modes and Effects Analysis
(FMEA)
Fault Mode, Effects and Criticality
Analysis (FMECA)
Reliability Block Diagram
Human Reliability Analysis
Consequence/Likelihood Matrix
Cost Benefit Analysis
Multi Criteria Decision
FMECA – a risk process
RISK
Functional Performance
Business Requirement
Asset Solutions
System Functions
Equipment Functions
Failure Modes
Failure Effects
Failure Probability
H
L
Consequence
Identify Mitigation
Preventive Actions
Corrective Actions
Operator Actions
Maintainer Actions
Redesign Actions
Functional Block Diagram
Reliability Block Diagram
Reliability and the “Bathtub” Curve
Reliability Definition
The probability that a defined item shall operate to a defined standard for a
defined period of time in a defined operating environment
[Figure: bathtub curve of the hazard function H(f) against age, with wear-in, useful life and wear-out phases.]
Hazard function in reliability
The hazard function defines:
• Conditional probability of failure in a particular time interval which can be
dependent on previous intervals
• i.e. the expected future (T+Δt) rate of failure of an item of equipment given
that it has survived to a particular age (T)
• T may be measured in time or events.
Weibull Family of Curves
• Probability Density Function: raw failure data set
• Reliability Function: likelihood of surviving to time ‘t’
• Hazard Function: likelihood of failure having survived to time ‘t’
MTBF
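The three curves named above can be sketched for a two-parameter Weibull distribution. This is a minimal illustration, not the presenter's material: the parameter names `beta` (shape) and `eta` (scale) and the example values are assumptions. A shape below 1 gives a falling hazard (wear-in), a shape of 1 gives a constant hazard (useful life) and a shape above 1 gives a rising hazard (wear-out).

```python
import math

def weibull_pdf(t, beta, eta):
    """Probability density f(t) for shape beta and scale eta (t > 0)."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def weibull_reliability(t, beta, eta):
    """R(t): likelihood of surviving to time t."""
    return math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    """h(t) = f(t) / R(t): likelihood of failure having survived to time t."""
    return (beta / eta) * (t / eta) ** (beta - 1)

def weibull_mtbf(beta, eta):
    """Mean of the distribution, eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)

eta = 60.0  # hypothetical scale parameter, in months
for beta in (0.5, 1.0, 3.0):  # hypothetical shape values
    hazards = [weibull_hazard(t, beta, eta) for t in (10.0, 30.0, 90.0)]
    print(beta, [round(h, 4) for h in hazards], round(weibull_mtbf(beta, eta), 1))
```

With a shape of 1 the hazard is the same at every age, which is the random failure pattern discussed on the following slides.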
Relating characteristic to task type
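[Figure: failure patterns (pdf against time) with the share of items following each pattern. % 1968: 4, 2, 5, 7, 14, 68. Label: 88%.]
Source: Nowlan and Heap, AD-A066579; NASA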
RCM and technology impact
Failure pattern (in the order shown)    % 1968    % 1982    % 2001
1                                       4         3         2
2                                       2         17        10
3                                       5         3         17
4                                       7         6         9
5                                       14        42        56
6                                       68        29        6
Total of last three patterns            89%       77%       71%
Source: Nowlan and Heap, AD-A066579; Reliability Centered Maintenance, NASA
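The percentages on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes the 1982 column reads 3, 17, 3, 6, 42, 29 (taking the separated "1" and "7" as 17); under that reading each column sums to 100 and the last three patterns total the 89%, 77% and 71% shown. The dictionary name `surveys` is illustrative only.

```python
# Percentage of items following each failure pattern, by survey year.
# The 1982 column is an assumed reading of the slide, as noted in the lead-in.
surveys = {
    1968: [4, 2, 5, 7, 14, 68],
    1982: [3, 17, 3, 6, 42, 29],
    2001: [2, 10, 17, 9, 56, 6],
}

last_three_share = {}
for year, shares in surveys.items():
    # Every column should account for all items.
    assert sum(shares) == 100, f"{year} does not sum to 100"
    last_three_share[year] = sum(shares[-3:])

print(last_three_share)
```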
Probability density function (Constant failure rate)
Number of failures of survivors in a discrete time period
[Figure: pdf against time]
$f(t) = \lambda e^{-\lambda t}$
i.e. if there are 100 items at the start and the failure rate is 10% per period, the survivors at each period are:
• 0 = 100
• 1 = 90 (100 - 100/10 = 90)
• 2 = 81 (90 - 90/10 = 81)
• 3 = 73 (81 - 81/10 = 72.9, rounded to 73)
• 4 = 66 (73 - 73/10 = 65.7, rounded to 66)
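The survivor count in this example can be reproduced with a short sketch. The function names are illustrative, and the continuous version assumes the constant failure rate λ is 0.1 per period; it is shown only for comparison, since the slide's worked figures are the discrete ones.

```python
import math

def survivors_discrete(start, rate, periods):
    """Survivors after each period when a fixed share of survivors fails per period."""
    counts = [float(start)]
    for _ in range(periods):
        counts.append(counts[-1] * (1.0 - rate))
    return counts

def survivors_continuous(start, lam, t):
    """Survivors at time t under an exponential model, start * exp(-lam * t)."""
    return start * math.exp(-lam * t)

discrete = survivors_discrete(100, 0.10, 4)
print([round(x) for x in discrete])
print(round(survivors_continuous(100, 0.10, 4), 1))
```

The discrete and continuous figures differ slightly because the discrete version removes 10% of survivors once per period rather than continuously.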
Random failure pattern - Condition monitor
Each item achieves maximum life but at the cost of many condition monitoring tests.
[Figure: pdf against time, and resistance to failure (100% to 0%) against time showing the degrading asset condition and the conditional defect point.]
Random failure pattern
Number of survivors at t = MTBF
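For a random (constant failure rate) pattern, the share of items still surviving at t = MTBF follows from the exponential reliability function R(t) = exp(-t / MTBF). The sketch below assumes that model; the MTBF value of 60 is a placeholder and does not affect the result.

```python
import math

def surviving_fraction(t, mtbf):
    """Fraction of items surviving to time t under a constant failure rate."""
    return math.exp(-t / mtbf)

fraction_at_mtbf = surviving_fraction(60.0, 60.0)
print(f"{fraction_at_mtbf:.1%}")  # about 36.8%
```

In other words, under this assumption roughly 63% of items have failed by the time the population reaches its MTBF.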
RCM – 7 Questions and 4 Answers
1. Which assets are important to the business?
2. What are its functions?
3. How does it fail to perform that function?
4. What causes it to fail?
5. What happens when it fails?
6. How can that failure be managed?
7. What can be done if the failure cannot be managed?
RCM - The four risk based solutions
• Examine condition to detect potential failures (Condition Monitor)
• Restore or discard before a maximum age (Hard Time)
• Check to find failures that are not evident (Failure Finding)
• Apply default tasks of “run to failure” or “redesign”
Knowledge outcomes of MRA
The analysis achieves a detailed listing of:
• what functions and equipment comprise the rail system solution
• what equipment related failure modes adversely impact function
• what are the risks associated with those failure modes
• what preventive tasks (controls) will reduce failure risk
• what corrective tasks are available to recover from failures
• what quality elements are necessary to assure task effectiveness
• what hazards are inherent in the tasks
• what controls are necessary to manage those hazards
Session 3 – CM Determination and verification
What is my inspection period?
Seven Step Analysis Process
1. Break down the asset into systems and items of equipment.
2. Prioritise the assets for analysis according to risk exposure from failure.
3. Collect system information and define each failure problem to be
addressed.
4. Establish possible preventive maintenance strategies for dealing with
each failure cause based on its consequence.
5. Evaluate the validity of each particular preventive maintenance policy
(task and frequency).
6. Determine what to do if there are no applicable and effective
maintenance policies.
7. Package the valid preventive maintenance policies into cost effective
schedules.
Preventive maintenance task options
• Condition monitoring is intended to identify the
potential for item failure with sufficient warning to allow
maintenance prior to failure.
• The MIL-STD-2173 algorithm determines the optimum
number of examinations across the CF interval.
• CSIRO Maths Division - validation paper (Mar 2001)
• Additional consideration should be given to critical
operating equipment that is not readily removed from
service for maintenance (requirement for warning time)
Task selection sequence:
• Is a condition monitoring task applicable and effective?
• Is a scheduled restoration task applicable and effective?
• Is a scheduled discard task applicable and effective?
• Is a failure finding task applicable and effective?
• Redesign or run to failure
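The task-selection questions above can be read as a first-match sequence that ends in the default of redesign or run to failure. The sketch below is one possible reading, assuming the questions are asked in the order condition monitoring, scheduled restoration, scheduled discard, failure finding; the names `TASK_ORDER` and `select_task` are hypothetical.

```python
# Candidate tasks in the assumed order of preference.
TASK_ORDER = (
    "condition monitoring",
    "scheduled restoration",
    "scheduled discard",
    "failure finding",
)
DEFAULT_ACTION = "redesign or run to failure"

def select_task(applicable_and_effective):
    """Return the first task judged applicable and effective, else the default.

    applicable_and_effective maps a task name to True or False; tasks that
    are missing from the mapping are treated as not applicable.
    """
    for task in TASK_ORDER:
        if applicable_and_effective.get(task, False):
            return task
    return DEFAULT_ACTION

print(select_task({"scheduled discard": True, "failure finding": True}))
print(select_task({}))
```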
Matching condition to task period
[Figure: resistance to failure (100% to 0%) against time, showing the degrading asset condition from the conditional defect point (standards decision) to the functional failure point. The interval between the two points is the warning period.]
Task Period < Warning period and Task Effectiveness of 0.95
20 Items: 19 Conditional Failures, 1 Functional Failure
You cannot stop the failure – but! You can change the consequence
CM Task Optimisation formula – MIL-STD-2173AS
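$$n = \frac{\ln\left[\dfrac{-\dfrac{MTBF}{T}\, C_i}{(C_{npm} - C_{pf})\,\ln(1-\theta)}\right]}{\ln(1-\theta)}$$
where n is the number of examinations across the warning period T, θ is the task effectiveness, C_i is the cost of one examination, C_pf is the cost of a conditional failure and C_npm is the cost of a functional failure.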
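The optimisation formula can be evaluated directly. The sketch below is a reading of the formula on this slide, not an authoritative implementation of the standard: the function name and argument names are illustrative, and the symbol for task effectiveness is assumed to be the probability that a single examination detects the conditional failure. The inputs are the variables listed on the Sample CM Assessments slide.

```python
import math

def optimum_exam_count(mtbf, warning_period, cost_exam,
                       cost_functional, cost_conditional, effectiveness):
    """Number of examinations n across the warning period that minimises cost."""
    if not 0.0 < effectiveness < 1.0:
        raise ValueError("effectiveness must be strictly between 0 and 1")
    log_miss = math.log(1.0 - effectiveness)  # negative
    ratio = (-(mtbf / warning_period) * cost_exam
             / ((cost_functional - cost_conditional) * log_miss))
    if ratio <= 0.0:
        raise ValueError("functional failure must cost more than conditional failure")
    return math.log(ratio) / log_miss

warning_months = 12.0
n = optimum_exam_count(mtbf=60.0, warning_period=warning_months, cost_exam=40.0,
                       cost_functional=50_000.0, cost_conditional=1_500.0,
                       effectiveness=0.95)
period_months = warning_months / n
print(round(n, 2), round(period_months, 2))
```

With these inputs the result agrees with the optimum number of exams (2.20) and optimum period (5.46 months) shown in the sample assessment. A result at or below zero would indicate that examinations cost more than they save.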
Impact of task effectiveness
How many times do I need to do a 50% effective task to achieve a 95% outcome
Repeat n    0.5^n      Cumulative (1 - 0.5^n)
1           0.50       0.50
2           0.25       0.75
3           0.125      0.875
4           0.0625     0.9375
5           0.03125    0.96875
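The question on this slide can be answered by accumulating the chance of detection over repeated tasks. The sketch assumes each repeat is independent and equally effective, which is the simplification the figures on the slide imply; the function names are illustrative.

```python
def cumulative_effectiveness(effectiveness, repeats):
    """Chance that at least one of several independent repeats succeeds."""
    return 1.0 - (1.0 - effectiveness) ** repeats

def repeats_needed(effectiveness, target):
    """Smallest number of repeats whose cumulative effectiveness meets the target."""
    if not 0.0 < effectiveness < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("effectiveness and target must be strictly between 0 and 1")
    repeats = 1
    while cumulative_effectiveness(effectiveness, repeats) < target:
        repeats += 1
    return repeats

needed = repeats_needed(0.50, 0.95)
print(needed, cumulative_effectiveness(0.50, needed))
```

A 50% effective task therefore has to be done five times to pass 95%, whereas a 95% effective task reaches the same outcome in one.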
Examples of human error/violation rates - Reason
Task Error Scenario 1 (rate: 1 in 2)
• Unfamiliar activity
• Performed at speed
• No idea of consequence
Task Error Scenario 2 (rate: 1 in 50)
• Routine
• Highly practiced
• Rapid delivery
• Low skill
Task Error Scenario 3 (rate: 1 in 2500)
• Very familiar
• Well designed
• Highly practiced routine
• Trained and motivated deliverer
• Time available to correct errors
Violation Type 1 (rate: 1 in 3)
• Compliance unimportant
• Easy to violate
• Low likelihood of detection
Violation Type 2 (rate: 1 in 30)
• Compliance important (legal)
• Low chance of detection
Violation Type 3 (rate: 1 in 120)
• Culturally unacceptable
• Low likelihood of detection
• Low likelihood of penalty
Sample CM Assessments
[Chart: cost ($0 to $140,000) against examination period (1 to 15 months), with series for cost of examination/year, cost of conditional failures/year, cost of functional failures/year and total cost.]
Variables
• Cost of examination: $40
• MTBF (months): 60
• Task effectiveness: 95%
• Cost of failure (conditional): $1,500
• Cost of failure (functional): $50,000
• Population: 100
• Warning period (months): 12
Results
• Optimum number of exams: 2.20
• Optimum period (months): 5.46
• Days: 166
• Cost of examinations/yr: $8,797
• Cost of conditional failures/yr: $29,959
• Cost of functional failures/yr: $1,507
• Total cost/year: $40,262
• Number of functional failures/yr: 0.03
• Number of conditional failures/yr: 19.97
• Failures/yr: 20.00
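The cost curves behind this sample assessment can be approximated with a simple model. This is a sketch under stated assumptions, not the spreadsheet behind the slide: failures arrive at population × 12 / MTBF per year, each examination in the warning period independently detects a defect with the stated task effectiveness, and the function and key names are illustrative. The model reproduces the examination cost and the failure counts above, but it gives a functional failure cost of roughly $1,380 rather than the $1,507 shown, so the original calculation evidently differs in some detail.

```python
def annual_costs(period_months, population=100, mtbf_months=60.0,
                 warning_months=12.0, effectiveness=0.95,
                 cost_exam=40.0, cost_conditional=1500.0,
                 cost_functional=50000.0):
    """Annual cost components for a given examination period (assumed model)."""
    failures_per_year = population * 12.0 / mtbf_months
    exams_in_warning = warning_months / period_months
    # Share of defects missed by every examination in the warning period.
    missed_share = (1.0 - effectiveness) ** exams_in_warning
    functional = failures_per_year * missed_share
    conditional = failures_per_year - functional
    return {
        "examinations": population * (12.0 / period_months) * cost_exam,
        "conditional": conditional * cost_conditional,
        "functional": functional * cost_functional,
        "conditional_failures": conditional,
        "functional_failures": functional,
    }

def total_cost(period_months):
    """Sum of the three annual cost components."""
    costs = annual_costs(period_months)
    return costs["examinations"] + costs["conditional"] + costs["functional"]

# Scan whole-month periods, as on the chart's horizontal axis.
best_period = min(range(1, 16), key=total_cost)
print(best_period, round(total_cost(best_period)))
```

Scanning whole months puts the minimum at 5 months, which brackets the 5.46 month optimum reported above.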
Activity 1.1
Planned maintenance opportunities
[Figure: resistance to failure (to 0%) against time, showing the degrading asset condition, the conditional defect point and the warning period, where 1.6 x task period = warning period.]
• 20 x 20 items
• 380 conditional failures (95%)
• 20 items
• 12 conditional failures (3%)
• 8 items
Caveats and assumptions
The warning period (T) is consistent
• If highly variable, constant auto monitoring and shutdown
Success probability is consistent over T
• Conservative assumption as success improves closer to failure
Warning period (T) is less than MTBF/5
• If T becomes larger then process approaches wear out
Success probability is less than 1
• Examples of extremes (70% to 99%)
Candidate arrivals across T are random
• Fewer candidates as failure approached
Condition Monitoring methods matrix
Source – NASA RCM Guide
Selecting condition monitoring process
Pole mounted substations – what is the most cost-effective approach?
• Hands on examination
• Use of thermography
Condition Monitoring – Verifying your task estimate
Figure 1
• CF Verification Graph (CF interval vs exam success)
• CM Optimisation Graph (annual exam cost vs task interval)
Activity 1.2
Session 4 – Failure Finding
What is my failure finding period?
Failure finding task frequency
[Model diagram: primary item and protective item leading to an adverse event (LOC) with possible outcomes.]
[Cost profile chart: maintenance cost, cost of failures and total cost ($) against task frequency, with the optimum task frequency marked.]
Functional check sample outcome – Method 1
• MTBFF (months): 120
• Q: 0.0327839
• LOCE probability: 0.000537441 (= 1 in 1,861 per year)
• Test (months): 4
• MTBFF (months): 120
• UAo: 0.01639344
• Additional protection: 0
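The figures in this sample outcome can be reproduced numerically. The slide does not state its formulas, so the ones below are inferred because they match the values shown: Q as the chance of a failure within one test interval under a constant failure rate, UAo as the mean unavailability of the protective item between tests, and the LOCE probability as their product. Since both MTBFF values are 120, which one feeds which term is also an assumption, as are the function names.

```python
import math

def failure_probability(interval, mtbf):
    """Chance of at least one failure within the interval (constant failure rate)."""
    return 1.0 - math.exp(-interval / mtbf)

def mean_unavailability(test_interval, mtbf):
    """Inferred form: half the test interval over MTBF plus half the test interval."""
    half = test_interval / 2.0
    return half / (mtbf + half)

test_months = 4.0
mtbf_months = 120.0

q = failure_probability(test_months, mtbf_months)
ua = mean_unavailability(test_months, mtbf_months)
loce_probability = q * ua

print(round(q, 7), round(ua, 8), round(loce_probability, 9))
print("1 in", round(1.0 / loce_probability))
```

Shortening the test interval reduces both terms, which is why the failure finding period has such a strong effect on the probability of the multiple failure.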
Failure Finding model – Method 2
[Timeline diagram: primary and protective systems across check periods X+1, X+2 and X+3, each with a check task. The protective item fails and is fixed (fail, fix, fail, fix), with the time between failures, the task effectiveness and a multiple failure marked.]
FET Failure finding model – Method 2
[Fault-event tree: primary item and protective item combine (and) into LOCE (Energy), which leads to one of (or) the possible outcomes.]
Possible Outcomes
• Low Impact - high probability
• Mid Impact - mid probability
• High Impact - low probability
Sample outcome – Method 2
Fault-Event Tree
• Primary MTBF (mths): 120
• Protective MTBF (mths): 120
• Number of protective elements/primary: 1
• Cost of functional test: $200
• Cost of primary system repair: $2,000
• Cost of protective element repair: $1,000
• Probabilistic cost of a functional failure: $200,000
• Population of primary elements: 50
Thank you.