Evaluating record history of medical devices using association discovery and clustering techniques

Antonio Miguel Cruz
School of Medicine and Health Sciences, Universidad del Rosario, Calle 63D # 24-31, 7 de Agosto, Bogotá D.C., Colombia
Tel.: +57 3474570x215; fax: +57 3474570x286. E-mail address: [email protected]

Expert Systems with Applications (2013), http://dx.doi.org/10.1016/j.eswa.2013.03.034
© 2013 Elsevier Ltd. All rights reserved.

Keywords: Data mining; Biomedical engineering; Clinical engineering; Outsourced services; Maintenance and engineering; Hospital; Operations research; Clustering analysis

Abstract

In this research, association discovery and clustering techniques were used to improve the efficiency of a hospital's services and of the maintenance tasks in a clinical engineering department. The indicator in this study is service requests. The association discovery techniques revealed problems in users' training (errors in operating procedures), intrinsic failures in medical devices, and badly scheduled maintenance policies. Clustering techniques uncovered the main causes of failures. With the evidence obtained, corrective actions were taken, and the service request average dropped dramatically from 6.4 to 0.4 during the analyzed period.

1. Introduction

Clinical engineering departments (CEDs) maintain a large amount of information about their health-care environment's medical devices. Maintenance histories, breakdowns, safety protocols, risks to patients/staff associated with device/system misapplication, and service requests and failures are all examples of this cataloged data (Capuano & Koritko, 1996). This research focuses on the analysis of failures and of the main causes by which service requests for corrective maintenance are generated by users.

Techniques to analyze and subsequently reduce the probability of instances of failure are all part of a more general subject called risk analysis or assessment (Cohen, 1995). The most common examples are Failure Modes and Effects Analysis (FMEA), Failure Discounting, and Fault Tree Analysis (FTA) (Mosquera, 1995). Once a failure has been analyzed and corrective actions for that specific failure mode have been implemented, the probability of its recurrence is diminished. For subsequent success/failure data, the value of the failure for which corrective actions have already been implemented should be subtracted from the total number of failures. The first option for defining how this failure value is characterized is to use engineering judgment (e.g., a panel of specialists agrees that the probability of failure has been reduced by 50% or 90%, and therefore that failure is given a value of 0.5 or 0.9). The main disadvantages of this approach are its arbitrariness and the potential difficulty of reaching an agreement. Statistical selection is a second and more attractive option: it is less arbitrary and more repeatable. Good examples of statistical methodologies used in failure discounting are Lloyd and Lipow's model (Lloyd, 1986; Lloyd & Lipow, 1962) and the Standard and Modified Gompertz models (Jiang, Kececioglu, & Vassiliou, 1994).

Maintenance tasks have to be requested before the failure is diagnosed. This "service request" is the process by which customers (users) respond to unscheduled events generated by a particular system. It is important to note that all requests are received at a single point of contact. When a request comes in, the technician carries out the work order and determines whether the particular service request was a real failure. This is an important step because "false failure alarms" are time-consuming and cause backlogs in maintenance tasks, which can affect the entire health-care service. For example, the Emergency Care Research Institute (ECRI) reported that roughly 68% of service requests were provoked by misapplication of medical devices in the late 1980s (ECRI, 1989). More recently, Miguel and co-workers found a similar pattern: "...8.1% of service requests were false-repair requests (no problem found in medical devices), resulting from user mishandling..." (Miguel, Rodriguez, Sanchez, & Vergara, 2002, p. 418). These two examples suggest that it is important not only to determine the main causes of failures and how to solve them, but also to determine the principal causes of service requests, outside of failure, to save time in maintenance tasks.

There is a diversity of methods to discriminate and describe the main cause(s) of a particular service request. The Pareto Principle, for example, is one of the most popular techniques. It is also known as the "80–20 rule," the "law of the vital few" and the "principle of factor sparsity." Another set of techniques frequently employed are the so-called machine learning techniques (Ian & Eibe, 2000; Raza, Jayantha, Hassan, & Lee, 2010). The main purpose of employing such techniques is to automatically extract complex phenomena


(Raza et al., 2010). In other words, these techniques help to find patterns in data to obtain new insights, and they are used heavily in data mining (Ian & Eibe, 2000). The main difference between traditional methods (i.e., Pareto) and machine learning methods is that statistical methods require researchers to impose structure on models and to construct those models by estimating the coefficients of the variables to fit the observations. Machine learning techniques, by contrast, allow users to learn the structure of the model straight from the data (Raza et al., 2010). We performed a rapid literature review of both research and literature review papers using the keywords maintenance AND data AND mining AND preventive/scheduled AND corrective/unscheduled, between 2009 and 2013 (see the search strings in footnotes 1 and 2), in both the Scopus database and the journal "Expert Systems with Applications"; this yielded few papers related to the use of data mining techniques to solve maintenance problems in a healthcare environment (see Section 2 for a brief literature review).
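The Pareto ("80–20") screening contrasted above with machine learning methods can be sketched in a few lines. The snippet below is illustrative only: the device classes, counts, and the helper name `pareto_vital_few` are invented, not part of the study's tooling; it simply ranks items by service-request count and keeps the "vital few" that account for a given share of requests.

```python
from collections import Counter

def pareto_vital_few(request_log, threshold=0.8):
    """Return the items that together account for `threshold` of all
    service requests (the 80-20 'vital few')."""
    counts = Counter(request_log).most_common()  # sorted by count, descending
    total = sum(c for _, c in counts)
    vital, cumulative = [], 0
    for item, c in counts:
        vital.append(item)
        cumulative += c
        if cumulative / total >= threshold:
            break
    return vital

# Hypothetical service-request log (device classes are invented):
log = (["infusion pump"] * 40 + ["ventilator"] * 25 +
       ["monitor"] * 20 + ["defibrillator"] * 10 + ["scale"] * 5)
vital_few = pareto_vital_few(log)  # → ['infusion pump', 'ventilator', 'monitor']
```

This is exactly the kind of imposed-structure analysis the text describes: the analyst fixes the 80% cutoff in advance, whereas the machine learning techniques used later learn structure from the data itself.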

Therefore, this paper focuses on the application of data mining techniques, more specifically on the use of association discovery, link analysis and clustering techniques, to improve efficiency in the performance of service and maintenance tasks in a CED. The motivation is the lack of existing published work on the analysis of service requests and failures (Dea, Williams, & Hoaglund, 1994; Liao, Chu, & Hsiao, 2012; Miguel & Rios, 2012). Explicitly, the three objectives of this work are:

1. To offer a method to the clinical engineering community to decide which hospital services and pieces of medical equipment have made the greatest contribution to the total number of service requests.

2. To use an association discovery method to obtain insights (rules) into the possible causes of service requests.

3. To use the clustering technique to group medical equipment in order to corroborate the obtained rules and to take corrective action.

Link analysis is a descriptive procedure for exploring data to identify relationships among values. It can be divided into two main groups, association discovery and sequence discovery. Association discovery finds rules about items that appear together in an event such as a purchase transaction, while sequence discovery is very similar, in that it yields associations related over time (Ian & Eibe, 2000). Clustering techniques serve to divide items or objects into clusters (i.e., conceptually meaningful groups), so that similar items are in the same cluster, whereas dissimilar items are in different clusters (Kamsu-Foguem, Rigal, & Mauget, 2013).
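As a concrete illustration of association discovery, the sketch below performs a single Apriori-style pass over hypothetical work-order records, emitting rules of the form A → B together with their confidence. The attribute tags (`user_error`, `no_fault_found`, etc.) are invented for illustration and are not the study's actual coding scheme; real association miners also iterate to larger itemsets, which this pair-only sketch omits.

```python
from itertools import combinations

def pair_rules(transactions, min_support=0.3, min_confidence=0.7):
    """One Apriori-style pass: count item pairs with enough support,
    then emit rules A -> B whose confidence clears the threshold."""
    n = len(transactions)
    item_count, pair_count = {}, {}
    for t in transactions:
        items = set(t)
        for i in items:
            item_count[i] = item_count.get(i, 0) + 1
        for pair in combinations(sorted(items), 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1
    rules = []
    for (a, b), c in pair_count.items():
        if c / n < min_support:          # prune infrequent pairs
            continue
        for x, y in ((a, b), (b, a)):    # try the rule in both directions
            conf = c / item_count[x]     # confidence = support(x,y)/support(x)
            if conf >= min_confidence:
                rules.append((x, y, round(conf, 2)))
    return rules

# Hypothetical work orders tagged with observed attributes:
orders = [
    {"user_error", "no_fault_found"},
    {"user_error", "no_fault_found"},
    {"user_error", "no_fault_found"},
    {"real_failure", "overdue_pm"},
    {"real_failure"},
]
rules = pair_rules(orders)
```

On this toy data the pass surfaces the rule `user_error → no_fault_found` with confidence 1.0, the kind of "false failure alarm" pattern the paper looks for.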

Again, although most commonly used for market basket analysis in the retail sector, association discovery and clustering techniques have useful applications in other industries, including fraud detection in e-commerce and insurance (Brin, Motwani, & Silverstein, 1997), commercial airplane manufacturing (Zaluski, Létourneau, Bird, & Yang, 2011), and the rail industry (two cross Corporation, 1999). The application of these techniques is still a relatively unexplored area in maintenance, and more specifically in the maintenance of medical equipment (Garg & Deshmukh, 2006; Miguel & Rios, 2012; Raza et al., 2010; Simoes, Gomes, & Yasin, 2011).

1 Search string in Scopus: TITLE-ABS-KEY(maintenance AND data AND mining AND preventive AND corrective) AND DOCTYPE(ar OR re) AND PUBYEAR > 2009 AND (LIMIT-TO(PUBYEAR, 2013) OR LIMIT-TO(PUBYEAR, 2012) OR LIMIT-TO(PUBYEAR, 2011)) AND (LIMIT-TO(SUBJAREA, "ENGI")).

2 Search string in "Expert Systems with Applications": ALL(maintenance and preventive and corrective) AND LIMIT-TO(pubyr, "2013,2012,2011,2010,2009,2008") Review: [Journal/Book].


2. Data mining in maintenance: a brief literature review

As we pointed out, we performed a rapid literature review searching for papers that applied data mining techniques in maintenance, in both the healthcare environment and other industries. The papers found can be broken into four groups: (1) papers that tackled the problems of forecasting the reliability of equipment, optimizing maintenance frequency, and controlling maintenance management activities; (2) papers that tackled failure mode analysis; (3) papers that employ diverse data mining techniques to make prognoses on equipment; and (4) research papers measuring the performance of medical equipment maintenance outsourcing. We now describe the papers in groups 1–4 in more detail.

The first group of papers focuses on forecasting the reliability of equipment. For example, Chatterjee and Bandopadhyay (2012) used a neural network-based model for forecasting reliability in a load–haul–dump machine operated in a coal mine in Alaska, USA. Also, Hu, Si, and Yang (2010), by means of evidential reasoning (ER) algorithms, were able to predict or forecast the reliability levels in turbocharger engine systems. These authors examined the feasibility and validity of the ER algorithm by means of numerical examples, showing that their proposal outperforms several existing methods in terms of solution speed and prediction accuracy. In this same group of papers, Chung, Lau, Ho, and Ip (2009) used a genetic algorithm approach in multi-factory production networks to predict and keep the system's reliability at a defined acceptable level, and to minimize the makespan of the jobs. In addition, this algorithm simultaneously scheduled perfect and imperfect maintenance during the process of distribution scheduling. Finally, Juang, Lin, and Kao (2008) proposed a genetic algorithm-based optimization model to improve the design, efficiency and reliability of a set of repairable series–parallel systems. Using this approach, the authors determined the most economical policy for the components' mean time between failures (MTBF) and mean time to repair (MTTR).

Regarding the use of data mining techniques to optimize preventive maintenance frequency and costs, Huei-Yeh, Kao, and Chang (2011) investigated the effects of scheduled maintenance costs on optimal scheduled maintenance policies for a leased product using a Weibull lifetime distribution. The authors successfully derived an optimal scheduled maintenance policy, and maintenance degrees, so that the expected total maintenance costs were minimized. Wang and Lin (2011) used an Improved Particle Swarm Optimization (IPSO) algorithm to minimize the scheduled preventive maintenance cost for a series–parallel system in a manufacturing industry. Kumar and Maiti (2012) used the fuzzy analytical network process (FANP) to find the right maintenance policy selection for both corrective and preventive maintenance. This method was applied to a unit of a chemical plant and was able to find a suitable maintenance policy for 13 pieces of equipment in the unit. Huang and Chen (2012) used data mining techniques such as K-Means, Two-Steps, and C5.0 to categorize bridges into several different clusters and depict the decision-making tree of clustering and rules of bridge deterioration. This study allowed bridge maintenance staff to gain a clear idea of the cluster of bridges they were responsible for and therefore to apply the right frequency of preventive maintenance. Finally, as another example of data mining use in maintenance, Zhou and Wang (2012) introduced an innovative data mining approach based on a new decision tree induction method, called the co-location-based decision tree (CL-DT), to enhance decision-making in pavement maintenance and rehabilitation strategies. The authors used pavement database information covering four counties to verify the proposed approach. The exper-

Table 1. Summary of papers analyzed. (a) Forecasting reliability; optimization of maintenance frequency; control of maintenance management. (b) Failure mode analysis. (c) Prognosis. (d) Evaluation of the performance of suppliers. (See Appendix A for more details.) Each entry lists paper: industry; data mining technique used.

(a) Forecasting reliability (total: 4); optimization of maintenance frequency (total: 6); control of maintenance management (total: 4)

Reliability forecasting:
- Chatterjee and Bandopadhyay (2012): coal mining; neural networks.
- Hu et al. (2010): turbocharger engine systems; evidential reasoning (ER) algorithm.
- Chung et al. (2009): manufacturer; genetic algorithms.
- Juang et al. (2008): unknown; genetic algorithms.

Optimization of maintenance frequency:
- Huei-Yeh et al. (2011): leasing; unknown.
- Wang and Lin (2011): manufacturer; particle swarm optimization (PSO) algorithm.
- Kumar and Maiti (2012): chemical; fuzzy analytic network process (FANP).
- Huang and Chen (2012): construction; clustering.
- Zhou and Wang (2012): construction; decision tree.
- Ferneda, Do Prado, D'Arrochella Teixeira, and Campos (2011): software; regression analysis.

Control of maintenance and maintenance management:
- Charongrattanasakul and Pongpullponsak (2011): unknown (numerical simulation); genetic algorithms.
- Safari and Sadjadi (2011): manufacturer; genetic algorithms and simulated annealing.
- Lu and Sy (2009): unknown (numerical simulation); fuzzy logic.
- Maquee et al. (2012): unknown (numerical simulation); clustering and association discovery.

(b) Failure mode analysis (total: 5); defects identification (total: 1)

Failure mode analysis:
- Castellanos et al. (2011): gas and oil pipelines; expert systems.
- Azadeh et al. (2010): pumps; fuzzy rule-based inference system.
- Mortada et al. (2011): machinery; logical analysis of data.
- Zaluski et al. (2011): aviation; regression analysis.
- Gürbüz et al. (2011): aviation; regression analysis.

Defects identification:
- Gebus et al. (2009): electronic manufacturing; linguistic equations and fuzzy algorithms.

(c) Prognosis (total: 3)

Predictive maintenance and fault detection:
- Ferreiro et al. (2012): aviation; Bayesian network model.
- Rabatel et al. (2011): railway; expert system.
- Si et al. (2011): unknown (numerical simulation); dynamic evidential reasoning algorithm.

(d) Evaluation of the performance of suppliers (total: 5)

Maintenance outsourcing (medical devices):
- Miguel et al. (2002): healthcare system; multiple linear regression model.
- Cruz et al. (2010a): healthcare system (dialysis units); normalized multivariate regression model.
- Miguel et al. (2007): healthcare system; normalized multivariate regression model.
- Cruz and Denis (2005): healthcare system; fuzzy logic system.
- Cruz et al. (2010b): healthcare system; clustering techniques.

imental results in this research demonstrated that the proposed CL-DT algorithm can make better decisions, with higher accuracy, than existing decision tree methods. In another application of data mining techniques in the maintenance field, Ferneda, Do Prado, D'Arrochella Teixeira, and Campos (2011) used regression models to provide an estimate of the time required to accomplish a maintenance task in the software industry.

Regarding the problem of controlling maintenance management activities, Charongrattanasakul and Pongpullponsak (2011) developed an integrated model between Statistical Process Control and Planned Maintenance of the Exponentially Weighted Moving Average t (EWMA) control chart, using both a mathematical model and a genetic algorithm approach. This approach increased the ability to find defective products, minimizing the hourly cost of maintenance. As another example of the use of data mining to control and manage maintenance tasks, Safari and Sadjadi (2011) used a hybrid algorithm based on a genetic algorithm and simulated annealing, under the assumption of condition-based maintenance,


to minimize the expected timespan to perform the maintenance tasks in a flowshop configuration in the manufacturing industry. Lu and Sy (2009) used a fuzzy logic approach for maintenance decision-making. The authors used historical production data to train and tune the fuzzy models. The model was shown to be suitable for production control decisions satisfying quick maintenance response times. In another study, Maquee, Shojaie, and Mosaddar (2012) used data mining techniques, including clustering (i.e., k-means) and association discovery (i.e., the Apriori algorithm), to identify both the clusters and the rules (conditions) which had caused maintenance efficiency problems in an urban transportation bus network (see Table 1a for more details).
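The clustering half of that k-means/Apriori combination can be illustrated with a plain k-means pass. Everything below is invented for illustration: each device is reduced to a hypothetical (failures per year, downtime hours) pair and grouped into two clusters; this is a generic textbook k-means, not the implementation used in any of the cited studies.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its members."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the nearest centroid (squared Euclidean distance)
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*members)) if members
            else centroids[i]  # keep an empty cluster's centroid in place
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical (failures/year, downtime hours) per medical device:
devices = [(1, 2), (2, 3), (1, 4), (9, 20), (10, 22), (11, 19)]
centroids, clusters = kmeans(devices, k=2)
```

On this toy data the pass separates the low-failure devices from the problematic ones; in the study's setting, each resulting cluster would then be inspected to corroborate the association rules.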

In the second group one can find papers that tackled the problems of failure mode analysis. For example, Castellanos, Albiter, Hernández, and Barrera (2011) developed a Failure Analysis Expert System (FAES). The authors reported that the solution properly identified the failure mechanisms for onshore pipelines transporting oil and gas products. Azadeh, Ebrahimipour, and

Fig. 1. Dependence of data sample versus the standard error.

3 The data sample can be calculated using n = (s² · N)/(V² · N + s²) [24], where n is the final data sample size, N is the size of the entire population, s² is the variance of the data sample, which can be calculated in terms of the occurrence probability p as s² = p(1 − p), and V² is the variance of the entire population (the squared standard error).


Bavar (2010) provided an accurate and timely mechanism to diagnose pump failures by knowledge acquisition through a fuzzy rule-based inference system. The solution provided by Azadeh and his co-workers showed several advantages, including reduction of human error, reduction of repair time, reduction of unnecessary expenditure on upgrades, and reduction of maintenance costs. Gebus, Juuso, and Leiviskä (2009) reported how linguistic equations (LE) were used to analyze data and successfully detect and trace a defect in a small area of a printed circuit board in electronic manufacturing. In the field of the rotating machinery industry, Mortada, Yacout, and Lakis (2011) tested the applicability and performance of a supervised learning data mining technique called logical analysis of data (LAD) for the automatic detection of faults in rolling element bearings. The results showed good classification accuracy with both time and frequency features. The authors demonstrated that this approach, implemented as software in operations maintenance scenarios, was a useful tool for maintenance experts, since it revealed insights that lead to a diagnosis in interpretable terms, which facilitates finding the reasons behind component failure (Mortada et al., 2011). Finally, Zaluski et al. (2011) and Gürbüz, Özbakir, and Yapici (2011) used data mining techniques in the Canadian Air Force and in airline companies in Turkey, respectively, to find the main attributes that affect the warning levels in a fleet of aircraft. In this research, regression analysis and anomaly detection analysis were used to reduce the data set (see Table 1b for more details).

The third group of papers employs diverse data mining techniques to carry out prognoses on equipment. For example, in the aircraft industry, Ferreiro, Arnaiz, Sierra, and Irigoie (2012) presented an approach based on a Bayesian network model as a useful technique for prognosis, replacing corrective and preventive maintenance practice with a predictive maintenance one to minimize the cost of maintenance support and to increase aircraft/fleet operability. Rabatel, Bringay, and Poncelet (2011) developed a new algorithm to automatically detect anomalies in order to predict potential failures in advance in railway maintenance tasks. Finally, Si, Hu, Yang, and Zhang (2011) used a dynamic evidential reasoning algorithm for fault prediction. Si and his co-workers, by means of two numerical examples, illustrated that the proposed approach had great potential applications in fault prediction and prognosis (see Table 1c for more details).

As we noticed earlier, in the fourth group one can find empirical and longitudinal proposals that evaluate the performance of the maintenance outsourcing of medical devices in a hospital environment. In a remarkable literature review on maintenance outsourcing of medical devices, Miguel and Rios (2012) found a cluster of papers using data mining techniques that evaluated the performance of the maintenance outsourcing of medical devices in a hospital environment. For example, Miguel et al. (2002) proposed an empirical-longitudinal study based on a multiple linear regression model to evaluate the outsourcing performance of the maintenance tasks as a function of concrete features and capabilities. Then, Miguel, Barr, and Pozo Puñales (2007) proposed a new empirical-longitudinal study for measuring the performance of the maintenance service providers, using the turnaround time (in hours) of medical devices as a function of response and service time, parts and components response time, obsolescence, and medical device priority. In this paper a normalized multivariate regression model was employed. Likewise, Cruz, Aguilera-Huertas, and Días-Mora (2010a) extended the same study by comparing the in-house maintenance service performance against the performance of the maintenance service providers in two dialysis units. In this research the in-house maintenance service turned out to be more efficient. Cruz and Denis (2005) used a fuzzy logic model to evaluate the performance of maintenance service outsourcing through three performance indicators: medical device availability, repairs made correctly the first time by the service provider, and the service cost/acquisition cost ratio. Finally, Cruz, Perilla, and Pabon (2010b) took up the study carried out by Miguel et al. (2002) and proposed an empirical-longitudinal study based on data mining clustering techniques to measure the impact of concrete features on three performance indicators: turnaround, response, and service time. The authors of these studies achieved the classification of the performance of the firms as a function of their capabilities (see Table 1d for more details).

3. Methods

3.1. Study design, data and study sample

In our research we used a retrospective cross-sectional study design. The data sample for this study comprised the corrective maintenance transactions (service requests) acquired from a hospital inventory of 416 medical devices located in nine pilot areas. If the entire data set of medical equipment and maintenance transactions were analyzed, more insights related to service requests and equipment failures could be found. However, it is common practice to reduce the data set to a representative data sample using either a probabilistic or a non-probabilistic method. Probabilistically, Fig. 1 shows the dependence of n versus the standard error, where s² = 0.09 (p = 0.9). With a confidence interval of 99.91% (V² = 9 × 10⁻³), 100 pieces of medical equipment, with their 87.79 work orders and respective service requests, were determined to be a sufficient representation to conduct this study (see footnote 3).
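Footnote 3's sample-size formula is garbled in the scanned text; the sketch below assumes the standard finite-population form n = n₀/(1 + n₀/N) with n₀ = s²/V² and s² = p(1 − p), which matches the symbols the footnote uses and is algebraically equal to n = s²N/(V²N + s²). Treat both this reading of the formula and the exact value of V² as assumptions recovered from the scan, not as the paper's verified computation.

```python
def sample_size(p, v2, N):
    """Finite-population sample size (assumed form of footnote 3):
    s^2 = p*(1-p), n0 = s^2/V^2, n = n0 / (1 + n0/N)."""
    s2 = p * (1 - p)        # sample variance from occurrence probability
    n0 = s2 / v2            # infinite-population sample size
    return n0 / (1 + n0 / N)  # finite-population correction

# Inputs as quoted in Section 3.1 (V^2 is read from a garbled footnote):
n = sample_size(p=0.9, v2=9e-3, N=416)
```

The finite-population correction always shrinks n below n₀ = s²/V², which is the qualitative behavior Fig. 1 plots against the standard error.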

3.2. Data collection procedure and bias control

The maintenance service requests in our sample were characterized according to the maintenance service provider (i.e., either in-house or outsourced), the basic data of the equipment being serviced, and performance (i.e., medical device turnaround, in hours).

Fig. 2. The monitoring procedure of the maintenance transactions.


In doing so, we first conducted a study characterizing the inventory of the nine pilot areas (Diagnostic Images, Surgical Unit, Intensive Care Unit, Neonatal Care, Emergency, Intermediate Care Unit, Immunology Laboratory, Clinical Laboratory, and Microbiology) to identify equipment features such as the obsolescence level, the equipment acquisition cost, the equipment maintenance cost, the acquisition date, the equipment age, and the maintenance service provider in charge of the equipment, either internal or external. Finally, as mentioned above, we collected primary data on hospital equipment maintenance incidents and the performance of maintenance service organizations (i.e., turnaround time) over a six-month period by means of a monitoring procedure.

The monitoring procedure of the maintenance service requests was conducted as follows: every time an equipment failure occurred, users (medical clinicians and nurses) made a service request to the medical engineering department. This generated either an external or internal service call, and the maintenance incident recording the date and time of the call was then printed.

Please cite this article in press as: Cruz, A. M. Evaluating record history of medical devices using association discovery and clustering techniques. Expert Systems with Applications (2013), http://dx.doi.org/10.1016/j.eswa.2013.03.034

When the external/internal service provider arrived at the hospital's medical service area, the date and time of arrival was recorded. After completing the maintenance task, both the user and the service provider performed an acceptance test of the device; if it passed, the user accepted the maintenance task conducted and the maintenance incident was closed, recording the date and time. For each maintenance transaction, we recorded whether the operator or user made any error during equipment operations, whether the service request resulted from a real failure, whether a maintenance transaction was scheduled during the same week as the failure or service request, whether the scheduled maintenance started on time (if applicable), the user who made the maintenance service request, and whether the failure was owing to a failure of system supply (i.e., electricity, gas, vapor, etc.). The process ended when all the information related to the maintenance incident was entered into the computerized equipment management system.
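The fields recorded for each maintenance transaction can be represented as a simple record type. The field names below are illustrative assumptions for this sketch, not the schema of the hospital's actual computerized maintenance management system:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class MaintenanceTransaction:
    """One service-request record, mirroring the fields described in the text.
    Names are illustrative, not the real CMMS schema."""
    device_id: str
    requested_at: datetime                     # date/time the service call was printed
    provider_arrived_at: Optional[datetime]    # external/internal provider arrival
    closed_at: Optional[datetime]              # acceptance test passed, incident closed
    operator_error: bool                       # ErrOP: failure due to user mishandling
    real_failure: bool                         # Real: diagnosed as an actual failure
    scheduled_same_week: bool                  # Sched: maintenance scheduled that week
    schedule_started_on_time: Optional[bool]   # StartSched: None when none scheduled (DNM)
    work_team: str                             # Turn: "M" (morning) or "AF" (afternoon)
    supply_system_failure: bool                # FailSys: electricity, gas, vapor, water...

    def turnaround_hours(self) -> Optional[float]:
        """Turnaround time in hours, the performance indicator used in the study."""
        if self.closed_at is None:
            return None
        return (self.closed_at - self.requested_at).total_seconds() / 3600.0
```

Each closed incident then yields one such record, and the turnaround indicator falls out of the recorded request/close timestamps.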

In order to avoid bias, all users were trained at the beginning of the study, and each new user (such as a new clinician hired by the hospital during the time of data collection) was trained as part of the induction period for the job. Additionally, each maintenance service provider was monitored through a monthly phone survey in order to identify any changes in their characteristics related to the independent variables in the study. To avoid any bias related to testing, maintenance service providers knew they were participating in a study, but were unaware that it was specifically their performance that was being measured. During the data collection phase of the study, no maintenance service provider showed any changes in the firm characteristics related to the independent variables measured in the original service provider survey, and no device was withdrawn from the equipment inventory. Finally, the hospitals contracted no new maintenance service providers, nor did they acquire any new devices during the model building and data collection period. As a result, there was no need to deal with attrition or history bias.

3.3. Operational definitions

In this study we included the following measures for every maintenance incident:

3.3.1. Dependent or outcome variable
The dependent variable in our study is the number of failures or service requests, i.e., the number of corrective maintenance transactions.

3.3.2. Independent variables

3.3.2.1. Operator or User error (coded as ErrOP). The possible ErrOP values are TRUE (T) and FALSE (F). A TRUE (T) value means that the service requested was caused by a failure due to operator or user mishandling, whereas a FALSE (F) value indicates that this was not the case. The purpose of this variable is simply to act as a user error indicator.

3.3.2.2. Real failure outcome (coded as Real). Real failure can take the possible values TRUE (T) and FALSE (F). A TRUE (T) value indicates that the service request was diagnosed as a real failure, whereas a FALSE (F) value is the opposite. This variable indicates whether a failure is real, irrespective of the cause.

3.3.2.3. Scheduled maintenance during the week of failure or service request (coded as Sched). The Sched indicator has the possible values YES (Y) and NO (N). A YES (Y) value means that the service request falls within a week when a maintenance task was scheduled; a NO (N) value means that there was no such maintenance task scheduled for that week. The purpose of this variable is to determine whether the maintenance task frequency should be adjusted. For example, if many service requests are made in a short period of time, and they are real failures not caused by operator misuse, and there is no maintenance scheduled in the same period of time, a decrease in the time between scheduled maintenance is suggested.

3.3.2.4. Scheduled maintenance started in time (coded as StartSched). Possible values for StartSched are, again, YES (Y) and NO (N), with the addition of 'Does not Matter' (DNM). A YES (Y) value means that the maintenance task started in time, in other words, during the week or day that the service request was made; a NO (N) value is the opposite. A 'Does not Matter' (DNM) value means that no maintenance was scheduled in that week. This variable is used as a measure of scheduled maintenance frequency efficiency and for control purposes. For example, suppose that there are service requests made in a particular period of time (i.e., on Friday), and they are real failures not caused by operator misuse; additionally, there are maintenance tasks scheduled for these weeks, and at the moment of failure all these tasks have not started yet. If this pattern is repeated, it means that the scheduled maintenance frequency is a possible area of concern, and management should review task execution times with their technicians.

3.3.2.5. Work teams who made the service request (coded as Turn). Turn has the possible values Morning (M) and Afternoon (A). A Morning (M) value means that the service request was made between 12:00 AM and 12:00 PM, whereas an Afternoon (A) value means that the service request was made in the period between 12:01 PM and 11:59 PM. Specifically, this is recorded to aid managers in determining whether the scheduled users or operators need further training sessions, or whether there are activities consistent with sabotage.

3.3.2.6. Failure of system supply (coded as FailSys). The FailSys indicator has the possible values TRUE (T) and FALSE (F). TRUE (T) means that the service request was diagnosed as a real failure due to a failure of system supply, such as water, steam, or electricity; a FALSE (F) value is the opposite. The purpose of this variable is to aid in determining the origin of failure. Sometimes a medical device is well within its useful lifetime and was provided by a reputable original equipment manufacturer, but the equipment fails. Frequently, managers overlook the optimal environmental working requirements of medical equipment, and thus the quality of the support system is often at the root of the problem. Repeated patterns of system supply failure indicate that a review of the collective environments where these devices are failing should take place.

3.4. Analyses

In performing our analysis, a straightforward statistical look at the gross attributes of the data is in order. Insights about the data can begin to take shape, and thus provide hypotheses to be tested and subsequent corrective action taken based on those results. Therefore, the following procedural description has its first three steps dedicated to a simple, yet pointed look at those gross measures and database segmentation (Ian & Eibe, 2000). This procedure allowed us: (1) to examine summarizing indicators in comparison with the total number of medical devices by equipment type; and (2) to examine the total number of medical device service requests by original equipment manufacturer. Next we performed an examination of the distribution of service requests over time. This analysis allowed us to obtain the "insights" that emerged from data segmentation. Finally, by using the Apriori association algorithm we analyzed the root causes of the increase in maintenance requests.
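The database-segmentation step (counting service requests per equipment type or per manufacturer) amounts to simple group-by aggregation. A minimal sketch with made-up records, not the study's data:

```python
from collections import Counter

# Hypothetical service-request log: one (equipment_type, manufacturer) pair per request.
requests = [
    ("Sterilization", "AA"), ("Sterilization", "AA"), ("Sterilization", "BB"),
    ("Life support", "CC"), ("Imaging", "DD"), ("Imaging", "DD"),
]

# Counts of service requests grouped by equipment type and by manufacturer.
by_type = Counter(eq_type for eq_type, _ in requests)
by_oem = Counter(oem for _, oem in requests)
print(by_type.most_common())  # equipment types ranked by request count
print(by_oem.most_common())   # manufacturers ranked by request count
```

Ranking the resulting counts is exactly what surfaces the dominant equipment types and manufacturers examined in the Results section.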

All of this data was maintained by means of SMACOR™, which is a Computerized Maintenance Management System. In addition, all the computational processing was completed using WEKA version 3.4.7 and the Statistics Toolbox from The MathWorks Inc. (Matlab., 2003) on a 2.3 GHz, 512 MB RAM Pentium IV PC. The mean time to build the models runs between 30 and 50 s.

3.4.1. Justification of algorithms and analysis selected for this study
In this research we used both association discovery and clustering analysis to analyze our data. Association discovery is used to find rules formed by means of the occurrence of transactions involved in the data sample selected. Associations are written in the form of rules following the format A ⇒ B, where A is called the antecedent or left-hand side (LHS), and B is called the consequent or right-hand side (RHS). For example, in the association rule "whenever a user requests a service then the equipment has failed", the antecedent is "request a service" and the consequent


is "the equipment has failed". To calculate the proportion of transactions that contain a particular item, they are simply counted. For each rule encountered, an expected confidence, support, and lift are then computed. The support, or prevalence, refers to the occurrence or frequency of a particular association. For example, if we say that 15 transactions out of 1000 consist of "real failure is TRUE and Operator or User Error is FALSE", then the support of this association would be 1.5%. A very low level of support (i.e., one occurrence in a million transactions) may indicate that the association in question is not very important. It is also important to look at the relative frequency of the occurrence of the items and their combinations. Given the occurrence of item A, how often does item B occur? In other words, what is the conditional predictability of B, given A? Following the previous example: "When a user or operator requests a service (with a real failure), how often did they commit a mistake that caused the failure (variable Operator or User Error coded as ErrOP = TRUE)?" An additional term for conditional predictability is confidence. It measures how much a particular item is dependent on another, and is calculated as the ratio [frequency of A and B]/[frequency of A]. To visualize these concepts with further examples, see Appendix B. To build the rules in the association discovery process, the Apriori algorithm was selected (Ian & Eibe, 2000). It has been demonstrated to be valuable in several real-world applications (Kamsu-Foguem et al., 2013) and is one of the most widely used algorithms owing to its simplicity and robustness (Ian & Eibe, 2000). The algorithm finds rules by sorting the data while counting associative occurrences. Efficiency in performing this rule-creation task is one of the key differentiators between algorithms. This is important because of the combinatorial explosion that results in an enormous number of rules. With these totals, the calculations of confidence and support for these occurrences can then be made. Apriori is based on the statistical theories of correlation and variation analysis. Association rules are built following "covering algorithms". As the name implies, at each stage they identify the rule that "covers" some of the occurrences. Covering algorithms operate by adding tests to the rule under construction, always striving to create a rule with maximum accuracy. In other words, covering algorithms will describe choices as an attribute-value pair to maximize the probability of the desired classification. To find a rule, the covering algorithm executes a rule induction procedure once for every possible combination of attributes, with every possible combination of values.
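The three measures above can be computed directly from transaction counts. A minimal sketch on toy transactions coded like the study's variables (not the real records):

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, lhs, rhs):
    """Conditional predictability P(RHS | LHS) = support(LHS and RHS) / support(LHS)."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)

def lift(transactions, lhs, rhs):
    """Confidence divided by the expected confidence P(RHS)."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)

# Six toy transactions over the Real / ErrOp items.
T = [frozenset(t) for t in (
    {"Real=T", "ErrOp=F"}, {"Real=T", "ErrOp=F"}, {"Real=T", "ErrOp=F"},
    {"Real=F", "ErrOp=T"}, {"Real=F", "ErrOp=T"}, {"Real=T", "ErrOp=T"},
)]
print(support(T, {"Real=T", "ErrOp=F"}))       # -> 0.5  (3 of 6 transactions)
print(confidence(T, {"Real=T"}, {"ErrOp=F"}))  # -> ~0.75 (3 of the 4 with Real=T)
print(lift(T, {"Real=T"}, {"ErrOp=F"}))        # -> ~1.5  (0.75 / 0.5)
```

Apriori's contribution is not these formulas but the efficient enumeration of frequent item sets before the formulas are applied; the sketch above only shows how each candidate rule is scored.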

For the clustering portion of the work, three techniques were taken into consideration: (1) the k-means algorithm; (2) the incremental clustering method; and (3) the statistical clustering method (Ian & Eibe, 2000). The first technique forms clusters in a numeric domain, and is very simple and reasonably effective. However, when attributes are not numeric the k-means algorithm does not work well, because the distance between attributes is not easy to obtain (Two Crows Corporation, 1999). The incremental clustering technique was developed in the late 1980s. It is embodied in a pair of systems, often grouped together as the COBWEB/CLASSIT system, for nominal attributes. Statistical clustering is based on a mixture model of different probability distributions, one for each cluster. It uses the expectation–maximization (EM) algorithm, which assigns instances to classes probabilistically, not deterministically. The expectation–maximization treatment of the data allows for data with numeric and non-numeric attributes. It is also the only clustering method that generates an explicit knowledge structure that describes the clustering in a way that can be readily visualized and reasoned about (Ian & Eibe, 2000). It is for this reason that the EM algorithm was selected to find clustered differences in our data.

Table 2
Summary of main indicators of data sample in the hospital under study.

Equipment types       Code  Number of equip.  Number of service requests  Acquisition cost penetration (%)
Imaging               F     23                54                          37.20
Medical electronics   C     73                117                         15.20
Electromechanic       A     122               179                         10.10
Life support          D     23                70                          8.60
Sterilization         B     20                128                         6.10
Laboratory            G     17                33                          5.60
Optic fiber devices   K     17                9                           4.30
Dentistry             E     41                58                          4.30
Optics                I     28                27                          3.70
Electro-optics        L     5                 2                           2.70
Vacuum devices        J     9                 12                          1.10
Measurement devices   H     38                30                          1.10
Total                       416               719                         NA
Average                     34.67             59.92                       8.33
Max                         122.00            179.00                      37.20
Min                         5.00              2.00                        1.10
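As a rough illustration of the EM idea on nominal attributes (here binary-coded, as the study's T/F variables naturally are), the following is a didactic, self-contained EM for a mixture of multivariate Bernoulli distributions. This is a sketch of the general technique, not WEKA's EM implementation used in the study:

```python
import math
import random

def em_bernoulli(data, k, iters=60, seed=1):
    """Didactic EM for a mixture of multivariate Bernoulli distributions.
    data: list of 0/1 feature vectors (binary-coded nominal attributes).
    Returns (mixing weights, per-cluster Bernoulli parameters, responsibilities)."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    w = [1.0 / k] * k                                           # mixing weights
    theta = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    resp = []
    for _ in range(iters):
        # E-step: probabilistic (not deterministic) cluster assignment.
        resp = []
        for x in data:
            logp = []
            for j in range(k):
                lp = math.log(w[j])
                for f in range(d):
                    p = min(max(theta[j][f], 1e-9), 1.0 - 1e-9)  # clip for log safety
                    lp += math.log(p) if x[f] else math.log(1.0 - p)
                logp.append(lp)
            m = max(logp)                                       # log-sum-exp trick
            e = [math.exp(v - m) for v in logp]
            s = sum(e)
            resp.append([v / s for v in e])
        # M-step: re-estimate weights and Bernoulli parameters from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            w[j] = nj / n
            for f in range(d):
                theta[j][f] = sum(r[j] * x[f] for r, x in zip(resp, data)) / max(nj, 1e-9)
    return w, theta, resp
```

The per-instance responsibilities play the role of the "belonging probability" reported for each cluster in Table 5: every record is assigned to clusters with a probability, not a hard label.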

4. Results

Table 2 shows a summary of characteristics of the entire data sample. Notice that:

a. Medical devices have an average acquisition cost penetration of 8.33%, with a maximum of 37.20% for imaging device types and a minimum of 1.10% for vacuum and measurement device types. (The acquisition cost penetration represents the percentage of the acquisition cost of a category, i.e., device type, in comparison with the total acquisition cost of all categories) (Cohen, 1995).

b. The service cost to acquisition cost ratio (SC/AC) exhibits an average of 5.15% for the entire inventory, with a maximum value of 9.90% for laboratory devices and a minimum of 2.10% for measurement devices, respectively.

c. The turnaround time (TAT) indicator, defined according to Cohen (1995), has an average of 2.40 h for all devices, with a maximum of 6.4 h for sterilization devices and a minimum of 0.53 h for vacuum devices.

d. The entire inventory, on average, has not reached its useful life. Column 8 of Table 2 shows the average usage time versus useful life indicator (AVU = ET/UL). It is lower than unity in all cases, except for the electro-mechanical device type, with a value of 1.20 units. (Usage time is the period of time, in years, that the asset has been in use, from when it was procured and installed to the current date of analysis. Likewise, UL is defined as the period of time during which an asset or property is expected to be usable for the purpose it was acquired [12].)

Table 2 (continued)

Equipment types       Average SC/AC ratio (%)  Average TAT (hours)  Average AVU = ET/UL  Average priority level
Imaging               3.20                     1.80                 0.20                 70
Medical electronics   4.30                     2.39                 0.40                 56
Electromechanic       2.30                     1.42                 1.20                 43
Life support          5.20                     2.10                 0.30                 108
Sterilization         4.60                     6.40                 0.18                 98
Laboratory            9.90                     0.79                 0.35                 67
Optic fiber devices   7.20                     5.10                 0.12                 32
Dentistry             3.70                     4.10                 0.70                 45
Optics                6.30                     1.07                 0.11                 24
Electro-optics        5.90                     0.88                 0.54                 21
Vacuum devices        7.10                     0.53                 0.80                 85
Measurement devices   2.10                     2.17                 0.57                 32
Total                 NA                       NA                   NA                   NA
Average               5.15                     2.40                 0.46                 56.75
Max                   9.90                     6.40                 1.20                 108.00
Min                   2.10                     0.53                 0.11                 21.00

Fig. 3. Distribution of the total number of service requests in comparison with the total number of medical devices by equipment type.

Fig. 4. Service request ratio indicator values by equipment type.

Fig. 5. Average turnaround time (in hours) indicator values by equipment type.

Fig. 6. Average priority level (in units) indicator values by equipment type.


e. The priority level calculation was based on the Capuano–Koritko system proposed in (Capuano & Koritko, 1996). The entire inventory has an average of 56.75 units, with a maximum value of 108 units for life support devices and a minimum of 21 units for electro-optics medical devices.


Fig. 7. Distribution of the total number of medical device service requests by the original equipment manufacturer.

Fig. 8. Distribution of total number of service requests by models (original equipment manufacturer: "AA").


f. During the period from 2002 to 2006, a total of 719 service requests were recorded regarding 416 unique pieces of medical equipment.

Figs. 3–6 show the results obtained from the database segmentation procedure. In these figures one can examine general indicators in comparison with the total number of medical devices by equipment type. Viewing the simple bar charts, it becomes readily apparent that:

1. Sterilization devices (equipment type "B", see Table 2) have the highest ratio of service requests to number of actual devices in that group (6.4) and the second highest number of total requests for the study period (128). Life support devices (equipment type "D", see Table 2) have the second highest ratio (3.04) and the fourth highest number of requests. Note that electromechanical devices (equipment type "A", see Table 2) have the most service requests (179), but their ratio (1.47) is among the lowest in the group. Medical electronics devices (equipment type "C", see Table 2) exhibit the same characteristics, with 117 service requests and a ratio of 1.6 (see Figs. 3 and 4).

2. Sterilization devices (equipment type "B", see Table 2) have the highest TAT indicator value (6.4 h) and the second highest priority level number (98 units) on average (Figs. 5 and 6).
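The service-request ratios quoted above follow directly from the Table 2 counts (requests divided by the number of devices of that type); a quick check:

```python
# (devices, service requests) per equipment-type code, taken from Table 2.
table2 = {"B": (20, 128), "D": (23, 70), "A": (122, 179), "C": (73, 117)}

# Service-request ratio = number of service requests / number of devices.
ratios = {code: round(req / dev, 2) for code, (dev, req) in table2.items()}
print(ratios)  # {'B': 6.4, 'D': 3.04, 'A': 1.47, 'C': 1.6}
```

These reproduce the 6.4, 3.04, 1.47, and 1.6 figures cited in the two observations above.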

Fig. 7 shows the distribution by manufacturer for sterilization device types. Note how original equipment manufacturers "AA" and "BB" have the highest proportion of service requests, with 50.78% (65) and 18.75% (24), respectively. Fig. 8 shows the distribution of the total number of service requests by models of the "AA" original equipment manufacturer.4

Given the high proportion of service requests coming from "AA" original equipment manufacturer models, they were included for further analysis. This decision was based on the following reasoning:

1. The numbers of pieces of equipment of each model in inventory (see Fig. 8) are 1, 1, and 1 for the models coded as Models 1, 2, and 4; and 2 and 3 for the models coded as Models 3 and 5, respectively. Two devices (one piece of equipment from Model 1 and one from Model 2) have 57 service requests in 4 years (36 in 2003!).

4 The authors reserve the right to keep secret the original names of the original equipment manufacturers and models in this study. To obtain the real and original names, contact the main author.


2. The "BB" original equipment manufacturer was discarded because its 24 real failures were distributed relatively evenly between 17 medical devices and thus produced no discernible pattern.

Continuing the process with the original equipment manufacturer "AA", the aim is to determine whether a particular period of time has a service-request frequency that follows a defined pattern. The authors selected years as the unit measure of time, though one could choose hours, days, months, etc. Fig. 9 shows the service request patterns from 2002 to 2006.5 It exhibited a low number of service requests initially in 2002, then a peak in 2003 with 36 (18 service requests for each model), and a subsequent drop between the years 2004 and 2006.6

From these results some interesting "insights" emerge. In summary:

1. Sterilization (B), life support (D), imaging (F), and laboratory (G) equipment types have the highest values of the service request ratio, ranging from 6.4 to 1.94 units (see Fig. 4 and Table 2).

5 From 2002 to the first half of 2004, all service requests were compiled to conduct this study.

6 From the second half of 2004 (July) to the end of 2004, corrective actions were taken and implemented.


Fig. 9. Pattern of service requests by year (Models 1 and 2).


2. Sterilization (B) has the highest service request indicator (128 service requests occurred in only 20 medical devices, see Fig. 3). A mere 4.08% of the medical device inventory provoked 17.8% of the failures in the analyzed period.

3. Models 1 and 2 from original equipment manufacturer "AA" produce 50.78% of the total number of service requests for the sterilization equipment type (B); incidentally, both are steam sterilizers and are located in the same cost center (Sterilization Central Room).

4. The highest numbers of service requests occurred in the years 2003 and 2004. Both models (Models 1 and 2, see Fig. 9) had 18 requests each in 2003, and 10 and 3, respectively, in 2004. This accounts for 85.96% (49 requests/57 total requests, see Fig. 9 for the 2002–2005 period) of the total number of service requests in the analyzed period for equipment from the original equipment manufacturer "AA".

5. It is important to notice that the sterilization equipment has not reached its useful life (see Column 8, row 5 of Table 2). The usage time versus useful life indicator (AVU = ET/UL) is 0.18 units. A study of the sterilization equipment from original equipment manufacturer "AA" showed that it was a stable population, meaning that there were no equipment replacements during the 5 years under study. Additionally, an examination of usage time data shows that it was uniform for the entire analyzed period.

Table 5 shows the total number of clusters and a characterization of each. The resulting data can be interpreted as follows:

Cluster 1 (coded as C-1, see dashed line, row 4 in Table 5) exhibits 20 service requests, 7 corresponding to Model 1 GE2609 AR-2 (coded as M1) and 13 corresponding to Model 2 GA2609 EM-2 (coded as M2). This cluster represents 36% of the total service requests, with a belonging probability (representing the probability of taking an element from the cluster) of 0.367. Of those 20 requests, 8 (40%) correspond to system supply problems; the remaining 12 (60%) correspond to other causes (variable V2, coded as FailSys). Operator error and misuse, with no real failure reading present, occurred 19 (95%) times. The remaining 1 (5%) corresponds to a real failure (variables V3 and V4, coded as Real and ErrOP, respectively).

In the same 20 requests, 17 (85%) were carried out by the morning work team or turn (variable V5, coded as Turn or Work Team) and just 3 (15%) were requested in the afternoon. Cluster 1 also shows that 19 (95%) had no maintenance scheduled on the date that service was requested and just 1 (5%) did (variable V6, coded as Sched). Finally, the scheduled maintenance task was not started in time once (5%) (variable V7, coded as StartSched).

Again from Table 5, the columns with a perimeter of dashed lines and variables V5, V6, and V7 (Work Team or Turn, Sched, and StartSched, respectively) present the following information: of the 57 service requests over all the clusters (C-0 through C-3), 49 (87.5%) were requested by the morning work team, 52 (92%) had no maintenance scheduled at the date and time when requested, and 3 (5.3%) of the maintenance tasks were not carried out on time.

5. Discussion of the impact of our results and the practicalimplications of this study

During this research, association discovery and clustering techniques were utilized to improve the efficiency of a hospital service and the maintenance tasks in a clinical engineering department. The indicator under study was the service request. The clustering and database segmentation techniques revealed a large increase in service requests in 2003. This therefore leads to the question: what are the main causes of the increase in service requests from 2002 to 2003? This root question gives rise to three additional enquiries into specific attributes:

1. Do the service requests increase due to malfunction of medical devices, a lack of user training, or a combination of both?

2. Is the increase due to a flawed scheduled maintenance policy? Could low or high maintenance frequencies, or non-compliance with the scheduled maintenance, be responsible?

3. Does the increase in service requests occur due to a combination of enquiries 1 and 2?

The association discovery techniques allowed us to reveal that: (1) whenever a real failure (Real = TRUE) occurs, the likelihood that it was not due to an operator error (ErrOp = FALSE) and that there was no scheduled maintenance (Sched = NO) is 2.33 times (lift) higher than the usual probability of 43.00% (expected confidence). This combination occurred in 35.00% (support) of all cases (57) (see Rule 1 in Table 4). It can be inferred, then, that one reason for a real failure to occur is a low frequency of scheduled maintenance. (2) If a real failure (Real = TRUE) occurs, the likelihood that it was not due to an operator error (ErrOp = FALSE) has a probability of 87.00% (confidence), which is 1.87 times (lift) higher than the usual probability of 47.00% (expected confidence). This combination occurred in 51.00% (support) of all cases (57) (see Rule 6 in Table 4). In other words, when a real failure occurs, it is usually due to equipment malfunction and not operator misuse. (3) Where a no-real-failure reading (Real = FALSE) occurs, it is also likely that an operator error (ErrOp = TRUE) occurred, with a probability of 100.00% (confidence), which is 2.00 times (lift) higher than the usual probability of 50.0% (expected confidence). This combination occurred in 44.00% (support) of all cases (57) (see Rule 4 in Table 4). When no real failures occur (i.e., a false service request), a lack of operator training is culpable.
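The way lift is read in the three findings above is simply confidence divided by expected confidence (the baseline probability of the consequent); checking the figures for finding (1):

```python
# Figures as reported for finding (1): confidence 1.00 (Rule 1),
# expected confidence (baseline probability of the consequent) 0.43.
confidence = 1.00
expected_confidence = 0.43
lift = confidence / expected_confidence
print(round(lift, 2))  # -> 2.33, the lift reported for the rule
```

A lift above 1 means the antecedent makes the consequent more likely than its baseline rate, which is why lift, rather than raw confidence, drives the interpretation of Rules 1, 4, and 6.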

An apparent lack of operator training and real failures of sterilization devices are the main causes of the increase in service requests from 2002 to 2003. The subsequent clustering revealed that the main causes of the real random failures were the water and steam supply systems. Although the failures drop dramatically for Models 1 and 2 between 2003 and 2004 (from 36 down to 13) and again between 2004 and 2005 (from 13 down to 3 total) (see Fig. 9), there is insufficient statistical strength to show that the corrective actions resulted in the lower service requests in 2005 and 2006. Several scenarios may collectively or independently be responsible for that period's decrease. The usage pattern and the stability of the sterilizer equipment population (in terms of


Table 3
Selected variables for the association discovery process.

Variable code   Description and type                                      Values
ErrOp           Operator or User Error (ordinal)                          TRUE (operator error) (T); FALSE (no operator error) (F)
FailSSys        Failure of system supply (steam and/or water) (ordinal)   TRUE (failure of system supply) (T); FALSE (failure other than system supply) (F)
Real            Real failure (ordinal)                                    TRUE (a failure occurred) (T); FALSE (no failure occurred) (F)
Turn            Work team (ordinal)                                       M (morning); AF (afternoon)
Sched           Scheduled maintenance at week of failure (ordinal)        YES (Y); NO (N)
StartSched      Scheduled maintenance started in time (ordinal)           YES (Y); NO (N); DNM (does not matter: no scheduled maintenance when the failure occurred)

Table 5
Cluster statistics.

                                               V1 Models   V2 FailSSys   V3 Real   V4 ErrOp   V5 Turn   V6 Sched   V7 StartSched
Cluster  Cluster features                      M1   M2     T    F        F    T    T    F     AF   M    Y    N     Y   DNM   N
C-0      6 elements (11% of total);            0    6      5    1        5    1    5    1     1    5    1    5     0   5     1
         belonging probability 0.1098
C-1      20 elements (36% of total);           7    13     8    12       19   1    19   1     3    17   1    19    0   19    1
         belonging probability 0.367
C-2      23 elements (41% of total);           23   0      22   1        1    22   1    22    1    22   1    22    0   22    1
         belonging probability 0.3986
C-3      7 elements (13% of total);            2    5      7    0        1    6    1    6     2    5    1    6     1   6     0
         belonging probability 0.1246

where:
V1 identifies the equipment models under study: Model 1 = GE2609 AR-2 and Model 2 = GA2609 EM-2.
V2 is the FailSSys variable (see Table 3).
V3 is the Real variable (see Table 3).
V4 is the ErrOp variable (see Table 3).
V5 is the Work Team or Turn variable (see Table 3).
V6 is the Sched variable (see Table 3).
V7 is the StartSched variable (see Table 3).

Table 4
Best rules found.

Rule  Antecedent (LHS)                               Consequent (RHS)                               Confidence  Lift ratio  Support
1     ErrOp = False                                  Real = True, Sched = NO                        1.00        2.33        0.35
2     Real = True, Sched = NO                        ErrOp = False                                  0.83        2.33        0.35
3     ErrOp = True                                   Real = False                                   0.86        2.00        0.44
4     Real = False                                   ErrOp = True                                   1.00        2.00        0.44
5     ErrOp = False                                  Real = True                                    1.00        1.87        0.51
6     Real = True                                    ErrOp = False                                  0.87        1.87        0.51
7     ErrOp = False, Sched = NO, StartSched = DNM    Real = True                                    1.00        1.87        0.35
8     Real = True                                    ErrOp = False, Sched = NO, StartSched = DNM    0.67        1.87        0.35
9     ErrOp = False, StartSched = DNM                Real = True                                    1.00        1.87        0.35
10    Real = True                                    ErrOp = False, StartSched = DNM                0.67        1.87        0.35
11    ErrOp = False, Sched = NO                      Real = True                                    1.00        1.87        0.37
12    Real = True                                    ErrOp = False, Sched = NO                      0.67        1.87        0.37


replacement/overhaul) are just two candidates. As a hypothetical example, suppose that the equipment was new in 2000 and few service requests were made until the year 2003. Then, in the first half of 2004, all the equipment was replaced/overhauled. Within that view it would be normal to see 2 more years without much problem.


However, a study of the sterilization equipment population from the original equipment manufacturer "AA" showed that it was a stable population, and there were no equipment replacements during the 5 years under study. Additionally, an examination of usage time data over the 5 years showed a uniform pattern over the inventory. Usage time fluctuation and equipment replacement/overhaul did not cause a reduction in requests subsequent to the corrective actions.

Fig. 10. Time between Service Requests (TBSR) for the analyzed period. (a) Model 1: GA2609 EM-2 and (b) Model 2: GE2609 AR-2.

5.1. Practical implications from the study

With the evidence obtained from the analyzed data history, the following corrective actions were taken in the second half of 2004:

1. In most cases the real random failures were caused by the water and steam supply system (75.00%), for which no maintenance was scheduled (see Rules 1, 7 and 9 in Table 3 and V2 in Table 5). This indicates that the scheduled maintenance frequency should be increased from 1 to 3 times per year. The combination of Rules 1, 7 and 9 in Table 3 and variable V3 in Table 5 helps to discriminate the number of real failures and supports the decision to increase the maintenance frequency. Statistical analysis of the 30 real failures (the remaining 26 service requests were not real failures) shows that:


• Data of the mean time between failures (MTBF) of real failures follows a normal distribution with parameters N(μ, σ) = N(5.35, 2.32) (in months), with confidence intervals 4.48 ≤ μ ≤ 6.22 and 1.85 ≤ σ ≤ 3.12 for μ and σ, respectively. Fig. 10(a) shows the histogram and Fig. 10(b) shows the probability density function.

• As 4.48 months is the lower limit of the confidence interval for μ, a 4-month interval in the maintenance schedule frequency was selected.

2. Most cases of no real failure were reported in the morning (87.50%) (see Rule 4 in Table 4 and V5 in Table 5). However, a uniform usage pattern was found for the sterilization equipment, so a study was conducted to characterize the operators' skill level in the operational procedures for sterilization equipment. The study consisted of applying a 20-question test to both work teams (morning and afternoon). The questions focused on the basic principles and operating procedures of the specific sterilization equipment in the inventory (20% on basic principles and 80% on operational procedures). Compilation of the test results showed a lack of operator training in the morning work team. Therefore, a comprehensive training program was scheduled for the operators who work from 8:00 AM to 12:00 PM.

Fig. 11. Availability trend for the analyzed period.

3. Although only 5.3% of the scheduled maintenance tasks on the supply system were not carried out on time (see V7, StartSched, in Table 5; N = 3, with an average delay of 2 weeks), better monitoring of maintenance compliance was also implemented.
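Item 1 above selects the maintenance interval from the lower confidence limit of the MTBF. A minimal sketch of that style of calculation follows, using made-up MTBF values and a normal-approximation interval for the mean (the exact interval procedure used in the study is not reproduced here):

```python
import math
import statistics

# Hypothetical MTBF observations in months; illustrative only, not the
# study's 30 real-failure observations.
mtbf = [4.1, 6.8, 5.2, 3.9, 7.5, 5.0, 2.8, 6.1, 5.9, 4.4]

n = len(mtbf)
mean = statistics.mean(mtbf)
sd = statistics.stdev(mtbf)  # sample standard deviation

# Normal-approximation 95% confidence interval for the mean. For samples
# around n = 30 this is close to the exact t-interval.
z = statistics.NormalDist().inv_cdf(0.975)  # ~1.96
half_width = z * sd / math.sqrt(n)
lower, upper = mean - half_width, mean + half_width

# A conservative maintenance interval: round the lower limit down, mirroring
# the paper's choice of 4 months from a 4.48-month lower limit.
interval_months = math.floor(lower)
```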

After these corrective actions were taken, improved performance of Models 1 and 2 from original equipment manufacturer ‘‘AA’’ in sterilization equipment type (B) was obtained.

To help determine the effectiveness of these actions, two more indicators were extracted from the data. The first was the time between service requests (TBSR) and the second was the availability of the equipment. Fig. 10(a) shows trends in the TBSR for the Sterilizer GA2609 EM-2, coded as Model 1. TBSR at the beginning (2002) had a value of about 10.07 months, and during 2003 the value decreased to 1.2 months. At the end of the analyzed period the indicator increased to 9.43 months, with average values of 3.6 and 8.65 months for 2004 and 2005, respectively. Similar TBSR trends are evident for the Sterilizer GE2609 AR-2, coded as Model 2 (see Fig. 10b).

Fig. 11 shows the availability trends for both sterilizers, Model 1 (GA2609 EM-2) and Model 2 (GE2609 AR-2). The figure shows a maximum value at the beginning and at the end of the analyzed period. Availability increased just after the corrective actions were taken (from 37% in the first half of 2004 to 70% in the second half).
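Both indicators are straightforward to derive from a service-request log. A minimal sketch with hypothetical request dates and uptime/downtime figures (the 30.44 days-per-month conversion is an assumption, as is the uptime-based availability definition):

```python
from datetime import date

# Hypothetical service-request dates for one sterilizer (illustrative only).
requests = [date(2004, 1, 10), date(2004, 4, 28), date(2004, 12, 3)]

# Time between service requests (TBSR): gaps between consecutive requests,
# converted to months using an average month length of 30.44 days.
gaps_days = [(b - a).days for a, b in zip(requests, requests[1:])]
tbsr_months = [d / 30.44 for d in gaps_days]

# Availability over a period: uptime divided by total scheduled time.
def availability(uptime_hours, downtime_hours):
    return uptime_hours / (uptime_hours + downtime_hours)

a = availability(uptime_hours=4200, downtime_hours=1800)  # 0.70
```

Rising TBSR and availability after a corrective action, as in Figs. 10 and 11, are the signals that the action worked.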

Traditional methods used in quality control for finding patterns (such as Pareto charts) differ substantially from data mining techniques. With Pareto charts, analysts have to select one variable for study, and no conclusion can be reached about links and/or relationships between multiple variables. Clustering gives the analyst the ability to analyze several variables at the same time and to reach conclusions about their relationships. As mentioned before, clustering allows the researcher to find distinguishing characteristics in the data that are not readily noticeable under standard statistical analyses. With this method it is not known at the start what the clusters will be, or by which attributes the data will be clustered; the clustering software does this sorting for the analyst.
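As an illustration of the kind of multi-variable sorting described above, here is a minimal k-means sketch on two made-up numeric features per service request (e.g. hour of day and downtime); the actual clustering software and variables used in the study may differ:

```python
import random

# Hypothetical feature pairs per service request (hour of day, downtime in
# hours); illustrative only, not the study's data.
points = [(8, 1.0), (9, 0.5), (10, 0.8), (15, 4.0), (16, 3.5), (14, 4.2)]

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: alternate assignment and center-update steps."""
    random.seed(seed)
    centers = random.sample(points, k)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean).
        clusters = [[] for _ in range(k)]
        for p in points:
            best = min(range(k),
                       key=lambda i: sum((a - b) ** 2
                                         for a, b in zip(p, centers[i])))
            clusters[best].append(p)
        # Move each center to the mean of its assigned points.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = tuple(sum(coord) / len(members)
                                   for coord in zip(*members))
    return centers, clusters

centers, clusters = kmeans(points, k=2)
```

On this toy data the algorithm separates the morning/low-downtime requests from the afternoon/high-downtime ones without being told which attributes matter, which is the property exploited in the study.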

This study has demonstrated that ‘‘intelligent techniques’’, such as association discovery and clustering techniques, can be used to improve the performance of clinical engineering departments. Given the importance of CED performance to any health care environment's quality of care and budgetary bottom line, the author believes that this contribution and the adoption of techniques similar to those contained herein are important.


6. Conclusions

Association rules revealed that the most general causes of service requests were poorly scheduled maintenance policies, lack of user training, and intrinsic failures of medical devices. Clustering revealed that the main causes of real failures were the water and steam supply systems, and uncovered a pattern of ‘‘not real’’ service requests (user errors) in the morning. Careful attention was paid to these findings in both the structure of the user operational training and the new scheduled maintenance policies. This examination has illustrated a means by which to analyze the quality and effectiveness of current hospital services. It has demonstrated a process for the identification of areas and methods of improvement, and a model against which to analyze those methods' effectiveness. This study demonstrates how ‘‘intelligent techniques’’, such as association discovery and clustering techniques, can be used to improve the performance of clinical engineering departments (CEDs). Given the importance of CED performance to any healthcare environment's quality of care and budgetary bottom line, the author believes that this contribution and the adoption of techniques similar to those contained herein are important.

Acknowledgments

I wish to thank Luis Ariel Diago and the anonymous reviewers for their help and cooperation in preparing this paper. Finally, special thanks to Gregory L. Haugan and Oliver Jarvis for their assistance and collaboration in the translation and review of this paper.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.eswa.2013.03.034.

References

Azadeh, A., Ebrahimipour, V., & Bavar, P. (2010). A fuzzy inference system for pump failure diagnosis to improve maintenance process: The case of a petrochemical industry. Expert Systems with Applications, 31(1), 627–639.

Brin, S., Motwani, R., & Silverstein, C. (1997). Beyond market baskets: Generalizing association rules to correlations. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '97). Tucson.

Capuano, M., & Koritko, S. (1996). Risk oriented maintenance system. Biomedical Instrumentation and Technology, 30(1), 25–35.

Castellanos, V., Albiter, A., Hernández, P., & Barrera, G. (2011). Failure analysis expert system for onshore pipelines. Part-II: End-user interface and algorithm. Expert Systems with Applications, 38(9), 11091–11104.

Charongrattanasakul, P., & Pongpullponsak, A. (2011). Minimizing the cost of integrated systems approach to process control and maintenance model by EWMA control chart using genetic algorithm. Expert Systems with Applications, 38(5), 5178–5186.

Chatterjee, S., & Bandopadhyay, S. (2012). Reliability estimation using a genetic algorithm-based artificial neural network: An application to a load-haul-dump machine. Expert Systems with Applications, 39(12), 10943–10995.

Chung, S., Lau, H., Ho, G., & Ip, W. (2009). Optimization of system reliability in multi-factory production networks by maintenance approach. Expert Systems with Applications, 36(6), 10188–10196.

Cohen, T. (1995). Benchmark indicators for medical equipment repair and maintenance. Biomedical Instrumentation and Technology, 29(4), 308–320.

Two Crows Corporation (1999). Introduction to Data Mining and Knowledge Discovery. Retrieved January 20, 2013, from http://www.stat.ucla.edu/~hqxu/stat19/intro-dm.pdf.

Cruz, A. M., Aguilera-Huertas, W. A., & Días-Mora, D. A. (2010a). A comparative study of maintenance services using the data-mining technique. Revista de Salud Publica (Bogota), 11(4), 653–661.

Cruz, A. M., & Denis, E. R. (2005). A fuzzy inference system to evaluate contract service provider performance. Biomedical Instrumentation and Technology, 39(4), 320–325.

Cruz, A. M., Perilla, S. P., & Pabon, N. N. (2010b). Clustering techniques: Measuring the performance of contract service providers. IEEE Engineering in Medicine and Biology Magazine, 29(2), 116–129.

Dea, O., Williams, S., & Hoaglund, L. (1994). Clinical engineering management: An annotated bibliography 1989–1993. Biomedical Instrumentation and Technology, 4(2), 101–111.

ECRI (1989). Types of services: Their advantages and disadvantages. Health Technology, 3(4), 9–20.

Ferneda, E., Do Prado, H., D'Arrochella Teixeira, E., & Campos, F. (2011). Using data mining techniques for time estimation in software maintenance. International Journal of Reasoning-based Intelligent Systems, 3(2), 80–87.

Ferreiro, S., Arnaiz, A., Sierra, B., & Irigoien, I. (2012). Application of Bayesian networks in prognostics for a new integrated vehicle health management concept. Expert Systems with Applications, 39(7), 6402–6418.

Garg, A., & Deshmukh, S. (2006). Maintenance management: Literature review and directions. Journal of Quality in Maintenance Engineering, 12(3), 205–238.

Gebus, S., Juuso, E., & Leiviskä, K. (2009). Knowledge-based linguistic equations for defect detection through functional testing of printed circuit boards. Expert Systems with Applications, 36(1), 292–302.

Gürbüz, F., Özbakir, L., & Yapici, H. (2011). Data mining and preprocessing application on component reports of an airline company in Turkey. Expert Systems with Applications, 38(6), 6618–6626.

Hu, C., Si, X.-S., & Yang, J.-B. (2010). System reliability prediction model based on evidential reasoning algorithm with nonlinear optimization. Expert Systems with Applications, 37(3), 2550–2562.

Huang, R., & Chen, P. (2012). Analysis of influential factors and association rules for bridge deck deterioration with utilization of national bridge inventory. Journal of Marine Science and Technology, 20(3), 336–344.

Huei-Yeh, R., Kao, K., & Chang, W. (2011). Preventive-maintenance policy for leased products under various maintenance costs. Expert Systems with Applications, 38(4), 3558–3562.

Witten, I. H., & Frank, E. (2000). Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers.

Jiang, S., Kececioglu, D., & Vassiliou, P. (1994). Modified Gompertz. In Proceedings of the Annual Reliability and Maintainability Symposium.

Juang, Y., Lin, S., & Kao, H. (2008). A knowledge management system for series-parallel availability optimization and design. Expert Systems with Applications, 34(1), 181–193.

Kamsu-Foguem, B., Rigal, F., & Mauget, F. (2013). Mining association rules for the quality improvement of the production process. Expert Systems with Applications, 40(4), 1034–1045.

Kumar, G., & Maiti, J. (2012). Modeling risk based maintenance using fuzzy analytic network process. Expert Systems with Applications, 39(11), 9946–9954.

Lloyd, D. (1986). Forecasting reliability growth. Quality and Reliability Engineering International, 19–23.

Lloyd, D., & Lipow, M. (1962). Reliability growth models. In Reliability: Management, Methods, and Mathematics. Englewood Cliffs, NJ: Prentice-Hall (Space Technology Series).

Lu, K., & Sy, C. (2009). A real-time decision-making of maintenance using fuzzy agent, Part 2. Expert Systems with Applications, 36(2), 2691–2698.

Maquee, A., Shojaie, A., & Mosaddar, D. (2012). Clustering and association rules in analyzing the efficiency of maintenance system of an urban bus network. International Journal of Systems Assurance Engineering and Management, 3(3), 175–183.

Matlab (2003). Fuzzy Logic Toolbox User's Guide, Version 2, for use with MATLAB 5.3.

Miguel, C., Barr, C., & Pozo Puñales, E. (2007). Improving corrective maintenance efficiency in clinical engineering departments. IEEE Engineering in Medicine and Biology, 26(3), 60–65.

Miguel, C., & Rios, A. (2012). Medical device maintenance outsourcing: Have operation management research and management theories forgotten the medical engineering community? A mapping review. European Journal of Operational Research, 22(1).

Miguel, C., Rodriguez, D., Sanchez, V., & Vergara, I. (2002). Measured effects of user and clinical engineering training using a queuing model. Biomedical Instrumentation and Technology, 29(3), 405–421.

Mortada, M., Yacout, S., & Lakis, A. (2011). Diagnosis of rotor bearings using logical analysis of data. Journal of Quality in Maintenance Engineering, 17(4), 371–397.

Mosquera, C. (1995). Availability and reliability of industrial systems. Barquisimeto.

Rabatel, J., Bringay, S., & Poncelet, P. (2011). Anomaly detection in monitoring sensor data for preventive maintenance. Expert Systems with Applications, 38(6), 7003–7015.

Raza, J., Jayantha, P., Hassan, A., & Lee, J. (2010). A comparative study of maintenance data classification based on neural network, logistic regression and support vector machines. Journal of Quality in Maintenance Engineering, 16(3), 303–318.

Safari, E., & Sadjadi, S. (2011). A hybrid method for flowshops scheduling with condition-based maintenance constraint and machines breakdown. Expert Systems with Applications, 38(3), 2020–2029.

Si, X. S., Hu, C. H., Yang, J. B., & Zhang, Q. (2011). On the dynamic evidential reasoning algorithm for fault prediction. Expert Systems with Applications, 38(5), 5061–5080.

Simoes, J., Gomes, C., & Yasin, M. (2011). A literature review of maintenance performance measurement: A conceptual framework and directions for future research. Journal of Quality in Maintenance Engineering, 17(2), 116–137.

Wang, C., & Lin, T. (2011). Improved particle swarm optimization to minimize periodic preventive maintenance cost for series-parallel systems. Expert Systems with Applications, 38(7), 8963–8969.

Zaluski, M., Létourneau, S., Bird, J., & Yang, C. (2011). Developing data mining-based prognostic models for CF-18 aircraft. Journal of Engineering for Gas Turbines and Power, 133(10).

Zhou, G., & Wang, L. (2012). Co-location decision tree for enhancing decision-making of pavement maintenance and rehabilitation. Transportation Research Part C: Emerging Technologies, 21(1), 287–305.