
CLASSIFICATION, DETECTION AND PREDICTION OF ADVERSE AND

ANOMALOUS EVENTS IN MEDICAL ROBOTS

by

Feng Cao

Submitted in partial fulfillment of the requirements

For the degree of Master of Science

Thesis Advisor: Dr. Soumya Ray

Department of Electrical Engineering and Computer Science

CASE WESTERN RESERVE UNIVERSITY

August, 2012


CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis of

Feng Cao

candidate for the Master of Science degree*.

(signed) Soumya Ray

(chair of the committee)

Andy Podgurski

M. Cenk Cavusoglu

Gultekin Ozsoyoglu

(date) May 2, 2012

*We also certify that written approval has been obtained for any proprietary material

contained therein.


TABLE OF CONTENTS


LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

1 Introduction

1.1 Robotic surgery systems

2 Background and Related Work

2.1 Current validation practices and limitations
2.2 Safety of RoS systems
2.3 Robot fault detection and isolation
2.4 Monitoring and modeling safety-critical software systems
2.5 Reviews of statistical learning models

2.5.1 Dynamic Bayesian networks
2.5.2 Linear Gaussian model
2.5.3 Regression tree model

3 Phased Validation Framework

3.1 The SABiR robot
3.2 Robot simulation and environment
3.3 Software architecture and data collection

3.3.1 State variables

3.4 Modeling the Hardware/Software state with DBN

4 Empirical Evaluation

4.1 Sampling normal trajectories
4.2 Trajectories of A&A events
4.3 Modeling normal trajectories

4.3.1 Results

4.4 Classification, detection, and prediction of A&A events


4.4.1 A&A event classification
4.4.2 A&A event detection
4.4.3 A&A event prediction

5 Conclusion and Future Work

APPENDICES

Appendix A: State variables and DBN structures
Appendix B: Workspace, environment, and parameters
Appendix C: DBN models for normal trajectories

LIST OF REFERENCES


LIST OF TABLES


3.1 State variables in DBNs.

3.2 Set of predictable variables.

4.1 Average test r² for normal state trajectories.

4.2 Comparison of models on classifying and detecting different A&A events.

4.3 Comparison of models on predicting the “sweep” event.

Appendix Table

A.1 All state variables in DBNs.

A.2 Relevant variable sets for DBN structures. See Table A.3 for DBN structures.

A.3 DBN structures.

C.1 r² values of different models on predictable hardware state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction. HWOnly: models using only hardware variables.

C.2 r² values of different models on predictable software state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction.


LIST OF FIGURES


1.1 A “phased” validation framework for robotic surgery systems.

2.1 A schematic dynamic Bayesian network, where St is the state at time t, and X, Y, and Z are state variables. Note that a conditional probability distribution (CPD) is associated with each node (not shown in the graph).

3.1 SABiR: The robotic system for image-guided needle-based interventions on small animals. Left: CAD model of the system. Right: A prototype system built in our lab.

3.2 A normal trajectory (dashed line) in the simulation environment consisting of two “tissue” blocks (the red and blue cubes).

3.3 Software architecture.

3.4 Schematic of the dynamic Bayesian network models we use, showing linear Gaussian and regression tree CPDs.

4.1 Trajectories of different scenarios (2D view). Top left: normal trajectory. Top right: encoder failure trajectory. Bottom left: sweep trajectory. Bottom right: out-of-workspace trajectory. The latter three correspond to A&A events.

4.2 Regression tree trained for the actual needle tip position on the Z axis. Each node shows the R², mean, variance, and training data size of the associated linear Gaussian model. The green nodes are leaves. The others are the root and internal nodes, which are labeled with the variable tested at the split. Attached to each edge is the outcome of the test on the corresponding branch.

4.3 Regression tree trained for the total force on the needle on the Z axis. The content has the same meaning as in Figure 4.2.

4.4 ROC graph for classification of the “sweep” A&A event.


4.5 ROC graph for classification of the “OOW” A&A event.

4.6 ROC graph for detection of the “encoder failure” A&A event.

4.7 ROC graph for detection of the “sweep” A&A event.

4.8 ROC graph for detection of the “OOW” A&A event.

4.9 ROC graph for prediction of the “sweep” A&A event, looking 10 steps ahead.

4.10 ROC graph for prediction of the “sweep” A&A event, looking 100 steps ahead.

4.11 ROC graph for prediction of the “sweep” A&A event, looking 1000 steps ahead.

Appendix Figure

B.1 Workspace of the SABiR simulator.

B.2 Normal trajectory.


ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Soumya Ray, for his great patience and careful guidance over the past two years. He is a knowledgeable person with great passion for everything he works on. It has been a pleasure to meet with him weekly for discussions on research and on life. He always granted me flexibility in scheduling my time and gave me insightful advice on the difficulties I encountered. It has been my privilege to work with Soumya during my graduate studies.

I am thankful to the group of people with whom I worked on this project: Dr. Andy Podgurski, Dr. M. Cenk Cavusoglu, Zhuofu Bai, and Mark Renfrew. This work would have been impossible without their teamwork. Thanks also go to Ye Fang, who helped revise my thesis during her busy finals period.

I would also like to give special thanks to my sister, who has always supported every decision I made and unreservedly shared her thoughts with me about life, career, family, and everything else. I thank my parents for supporting my studies and respecting my decisions. Without them I would not have had the chance to enjoy these two wonderful years at Case!


LIST OF ABBREVIATIONS

A&A Events: Adverse and Anomalous System Events

API: Application Programming Interface

CPD: Conditional Probability Distribution

DBN: Dynamic Bayesian Network

FDA: Food and Drug Administration

FPR: False Positive Rate

GUI: Graphical User Interface

OOW: Out Of Workspace

ROC Graph: Receiver-Operating Characteristic Graph

RoS Systems: Robotic Surgery Systems

SABiR: Small Animal Biopsy Robot

TPR: True Positive Rate


Classification, Detection and Prediction of Adverse and Anomalous Events in Medical Robots

Abstract

by

FENG CAO

In this project we propose a framework to model the behavior and evaluate the reliability and safety of robotic surgery systems. A software simulator and associated user interface tools are used to generate simulated hardware/software data of a robotic system performing interventions on small animals. The main contributions of this work are the use of dynamic Bayesian networks (DBNs) to model both the software and hardware dynamics of the robotic surgery system, and the detection of adverse and anomalous (A&A) events in the system. We show empirically that the model can accurately capture aspects of the software/hardware dynamics. Furthermore, we show that the models are able to accurately classify, detect, and predict certain kinds of A&A events. Finally, a comparison between different models demonstrates the usefulness of modeling both hardware and software state, as opposed to using hardware state alone.


Chapter 1

Introduction

With the increasing number of minimally invasive surgical (MIS) procedures, attention is being directed to the development of robotic surgery (RoS) systems [1]. The nature of the application makes these systems safety-critical. Further, these systems are also complex, in terms of both hardware and software. In this work, we propose a phased validation framework to measure the reliability of RoS systems. As part of this framework, we develop models to detect and predict adverse and anomalous system events (A&A events) [2].

1.1 Robotic surgery systems

Robotic surgery systems are cyber-physical systems that use robots to aid in surgical procedures, specifically MIS procedures. Over the years, the number of MIS procedures being performed has increased. Such procedures require access to places and scales that are not easily reached with manual instruments and conventional techniques. These needs may be met by mechanical instruments that are easier to control and more precise. Moreover, with the help of RoS systems, the surgeon can conduct the operation from a console instead of directly at the bedside, which makes remote surgery possible. In addition to the precision and miniaturization they bring to surgery, RoS systems can also improve patient health by making smaller incisions with less blood loss and less pain. This results in shorter patient recovery times, shorter hospital stays, and fewer side effects [3].

RoS systems operate in safety-critical environments. Generally, they have three components: the surgeon interface (console), the bedside robotic system, and the imaging processor.


The surgeon interface is where the surgeon operates. The bedside robotic system is the component that interacts directly with the patient. The imaging processor provides video from the bedside robotic system to the surgeon interface. Each of these components may malfunction, due to either hardware failure or software bugs. Furthermore, the surgeon might also misoperate the system, because these systems often have steep learning curves [4]. Indeed, accidents have already occurred, as evidenced by a number of adverse event reports filed by manufacturers with the Food and Drug Administration (FDA). One such report [5] contains the following description of an accident involving the da Vinci™ S Surgical System [6]:

Event Description: It was reported that during a da Vinci’s beating heart double vessel coronary artery bypass graft procedure at the hospital, there was an unexplained movement on the system arm which had the endowrist stabilizer instrument attached to it. The unexpected movement caused the feet at the distal end of the endowrist stabilizer instrument to tip downward resulting in damage to the myocardium of the patient’s left ventricle. ...

Manufacturer Narrative: The investigation conducted by an isu field service engineer found the system to [have] successfully completed all verification tests and to function as designed. No system errors related to the cause of this event were found. Additional investigations conducted by clinical and software engineering were inconclusive as a root cause could not be determined based on the event descriptions reported by several eye witnesses and the review of the system’s event logs. ...

The possibility of such adverse and anomalous events motivates a framework for empirically assessing and monitoring the reliability and safety of RoS systems. Our framework, as illustrated in Figure 1.1, takes into account how surgeons use a RoS system, how the system interacts with patients, how operating conditions affect the system, and how observable system behaviors are related to the system’s hardware and software dynamics. The framework


[Figure: a cycle of validation phases — simulation-based testing, tests with tissue phantoms, clinical trials, and postmarket surveillance — with a feedback loop of adverse and anomalous event reporting, confirmation using collected data, fault diagnosis and localization, further revalidation testing, and simulation improvement.]

Figure 1.1 A “phased” validation framework for robotic surgery systems.

should be able to reveal any serious underlying safety problems by analyzing previously detected anomalous but seemingly harmless system behaviors. Furthermore, the framework will allow us to characterize a reliable region of operation that we call the “usage envelope”. In this work we introduce a limited prototype of the framework, built and evaluated on simulations of a robot for small-animal biopsies that is concurrently being developed by us.

A major goal of this work is to effectively classify, detect, predict, and respond to adverse and anomalous system events (A&A events) that threaten patient safety. The following steps are essential in achieving our goal:

• Designing a robust simulation platform to reproduce system behavior;

• Developing efficient means of collecting pertinent hardware and software execution data, as well as user feedback, from the simulation platform and software controller;

• Building statistical machine learning models that effectively relate the collected data to the occurrence of A&A events;

• Using the models in the real system and developing clinically effective response strategies.


In this work we focus on the third step, based on the tools developed for the first two steps. These tools include a simulator of the robotic system, and a supervisory software controller and GUI for data collection. We learn dynamic Bayesian network (DBN) models of the system’s behavior from the collected data.¹ In the evaluation, we test the ability of these models to classify, detect, and predict A&A events. Specifically, we compare models that learn using only hardware data with models that learn using both hardware and software data. Our experimental results show that both models perform well on our tasks, with one performing slightly better than the other on some tasks. The results are encouraging and indicate that our approach provides a solid foundation for completing the proposed framework.

¹My work focuses on building DBN models of the system’s behavior; Mark Renfrew developed the simulator, and Zhuofu Bai implemented the software controller and GUI.


Chapter 2

Background and Related Work

In this chapter, we describe current RoS system practice and related work on the safety of RoS systems. We also review the statistical learning models that are used to model the behavior of RoS systems.

2.1 Current validation practices and limitations

RoS systems are subject to federal regulation by the Food and Drug Administration (FDA). The FDA has established classifications for different generic types of devices and grouped them into three regulatory classes based on the level of control necessary to assure the safety and effectiveness of the device [7]. We believe RoS systems should be considered Class III devices, which are “usually those that support or sustain human life, are of substantial importance in preventing impairment of human health, or which present a potential, unreasonable risk of illness or injury”. However, since 2000, the FDA has classified RoS systems as Class II devices, which are subject to less stringent regulations than Class III devices. This seems problematic, since the requirements for Class II devices are inadequate to assure the safety and effectiveness of RoS systems. On the other hand, the FDA has required RoS systems to be evaluated in clinical trials [8], which is more typical for Class III devices.

The FDA currently does not have specific regulations or even guidelines for the design and validation of RoS systems. The regulations RoS systems fall under are the guidance documents provided by the FDA for manufacturers on proper development and validation practice for medical devices [9] and associated software [10]. Also, the Good Manufacturing


Practice (GMP) requirements of the Quality System Regulation (contained in Title 21, Part 820 of the Code of Federal Regulations) establish the general quality requirements for medical device manufacturers [11]. These documents call for the application of standard procedures for the development and validation of safety-critical systems.

Although the FDA provides general guidelines for the design and validation of medical devices, the guidelines are not sufficient to ensure the reliability and safety of a cyber-physical medical system as complex as an RoS system. Essentially, there are two issues in current validation practice for the safety and reliability of RoS systems: the large state space, and the long-term reliability of software systems. The combined hardware/software state space of RoS systems is so large that we can only explore a small portion of it during validation. A much larger portion of the state space is visited when the system is deployed in the field and used for a long period. Moreover, the state space can change over the life of the system because of changes in usage patterns. Consequently, it is likely that latent defects will trigger system failures after deployment, as previously unexplored regions of the state space are encountered.

The very large state space of RoS systems poses significant challenges for even the most promising validation techniques, such as model checking [12], model-based testing [13, 14, 15], and directed random testing [16]. Model checking and model-based testing rely on abstract models of program behavior. They abstract away a significant part of the system’s actual complexity in order to make analysis tractable. This, however, makes these techniques vulnerable to model errors and incompleteness. Directed random testing explores a program’s control structure by combining random testing with symbolic program evaluation and automatic constraint solving. While these methods are useful when applied to models or system components of limited size, it is not clear whether they can be made to scale to large systems. Also, because they do not consider operational usage, none of them provides an objective assessment of a system’s operational (field) reliability and safety.

In our work, we use statistical learning models, namely dynamic Bayesian networks (DBNs), to compactly represent the dynamics of the hardware/software state space. These models are trained on a sample of the whole state space. Because the system is engineered to exhibit regular behavior, we expect that observing a sample will allow us to capture the significant elements of the system dynamics. In this way, we believe these models can be accurate without exhaustively covering the state space.

There is also a large body of research on modeling the long-term reliability of software and software-controlled systems, which does consider operational system usage [17]. Most of this work uses reliability growth models, which assume that the reliability of a software system tends to grow over its lifetime due to debugging. With reliability growth models, an estimate of system reliability is updated after system maintenance, based on a model of how debugging affects reliability. However, since the immediate effects of attempted debugging on reliability are extremely variable, reliability growth models are not appropriate for estimating the current reliability of critical systems such as RoS systems, although they may be useful for long-term planning. On the other hand, it is also possible for our system to update its models based on newly observed system behaviors and dynamics, though we have not considered this in our work.

2.2 Safety of RoS systems

Most studies on the safety of RoS systems focus primarily on the design of intrinsically safe systems [18, 19, 20, 21, 22, 23, 24]. In other words, the system is designed to fail in a safe manner and come to a controlled halt, so that it can be removed and the procedure completed manually. This is typically achieved by using actuators with limited power and speed, current limiters, redundant sensors, safety monitors at the sensor, servo, and supervisory levels, and watchdogs monitoring activity of the control system. Other studies lay out approaches based on identifying potential hazards and mitigating them throughout the development lifecycle [15, 25, 26, 27]. All of these design practices are complementary to our approach, which uses statistical learning models to analyze system behavior.


2.3 Robot fault detection and isolation

Apart from the work on safety of RoS systems, there is also substantial work on online detection and isolation of faults in general robotics and other hybrid systems [28, 29, 30, 31]. The most commonly used approaches employ a probabilistic sequence model to represent the system state and perform inference in real time to detect or predict when A&A events happen. Our framework aligns with these approaches in that we also use a probabilistic sequence model, the dynamic Bayesian network, to capture system behaviors and dynamics. However, prior work mostly focuses on modeling hardware state transitions and devotes attention to efficient inference algorithms to account for the online setting. In contrast, our work not only incorporates the dynamics of both hardware state and software state, but also considers both online detection and offline classification of A&A events.

2.4 Monitoring and modeling safety-critical software systems

Related work also exists in software engineering. Prior work has analyzed safety-critical systems, such as spacecraft [32], and recommended the use of runtime monitoring to detect faults. There is also a large body of work that uses probabilistic models for software testing and fault localization (e.g., [33, 34]). Unlike our work, however, this work is typically not in the context of robotic systems or medical robots.

2.5 Reviews of statistical learning models

Several statistical learning algorithms are used in our approach. Dynamic Bayesian networks (DBNs) are used to represent the dynamic relationships among hardware state variables, software state variables, and parameters. The conditional probability distribution of each node in the DBN is represented with either a regression tree or a linear Gaussian model.


Figure 2.1 A schematic dynamic Bayesian network, where St is the state at time t, and X, Y, and Z are state variables. Note that a conditional probability distribution (CPD) is associated with each node (not shown in the graph).

2.5.1 Dynamic Bayesian networks

Dynamic Bayesian networks (DBNs) [35] are first-order Markov models¹ that represent, in a factored manner, the probability of the next state given the current one, i.e. Pr(S_{t+1} | S_t), where S_t denotes the state at time t and is described by a vector of state variables. We further consider structures that have no “symmetric” edges, i.e., no edges from X_{t+1} to Y_{t+1}, where X and Y are state variables. In this case the value of a state variable X_{t+1} in state S_{t+1} depends only on the variables in S_t from which there is an edge to X_{t+1}. Figure 2.1 shows a schematic

¹It is possible for them to be higher order; we only consider first-order models in this work.


DBN structure. The factored state transition distribution according to the DBN is defined by:

Pr(S_{t+1} | S_t) = Pr(V_{t+1} | V_t) = ∏_{i=1}^{n} Pr(V^i_{t+1} | V_t) = ∏_{i=1}^{n} Pr(V^i_{t+1} | V_t^{par(i)})    (2.1)

where V_t = {V^i_t} denotes all of the variables at time t, and V_t^{par(i)} denotes all of the variables at time t that have an edge to V^i_{t+1}.

Each node (or variable) V^i in the DBN is associated with a conditional probability distribution (CPD), denoted Pr(V^i_{t+1} | V_t^{par(i)}), which is the probability distribution of V^i given all the nodes in the previous time step that have an edge to it. From Equation 2.1, the probability of the current state given the previous state equals the product of the probability of each variable in the current state given the relevant variables in the previous state. In other words, the probability of the current state given the previous state is the product of the CPDs of the variables in the current state.
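As a concrete illustration, the factored product in Equation 2.1 can be sketched in a few lines of code. The two-variable DBN slice, the parent sets, and the Gaussian CPD parameters below are all invented for illustration; they are not the structures or parameters used in this thesis.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def transition_probability(prev_state, next_state, parents, cpds):
    """Pr(S_{t+1} | S_t) as the product over variables of Pr(V^i_{t+1} | V_t^{par(i)}).

    prev_state, next_state: dicts mapping variable name -> value at t and t+1.
    parents: dict mapping variable name -> list of parent names in the previous slice.
    cpds: dict mapping variable name -> function(parent values) -> (mean, var),
          i.e. each CPD is summarized by the mean and variance it predicts.
    """
    prob = 1.0
    for var_name, value in next_state.items():
        parent_values = [prev_state[p] for p in parents[var_name]]
        mean, var = cpds[var_name](parent_values)
        prob *= gaussian_pdf(value, mean, var)  # one factor per variable (Eq. 2.1)
    return prob

# Hypothetical slice: X_{t+1} depends on X_t; Y_{t+1} depends on X_t and Y_t.
parents = {"X": ["X"], "Y": ["X", "Y"]}
cpds = {
    "X": lambda pv: (0.9 * pv[0], 0.1),               # X_{t+1} ~ N(0.9 X_t, 0.1)
    "Y": lambda pv: (0.5 * pv[0] + 0.5 * pv[1], 0.2), # Y_{t+1} ~ N(0.5 X_t + 0.5 Y_t, 0.2)
}
p = transition_probability({"X": 1.0, "Y": 0.0}, {"X": 0.9, "Y": 0.5}, parents, cpds)
```

The density of a whole trajectory is then the product of such one-step factors, which is how the learned DBN scores normal versus anomalous behavior.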

In a problem that we model with a DBN, the CPDs of the state variables are normally unknown a priori. We estimate the CPDs from observed trajectories of hardware/software states. There are many ways to represent a CPD. If all the relevant variables are discrete, one could store the CPD explicitly as a table. However, this is not always possible, since the size of the table grows exponentially as the number of variables increases. Also, when the variables are continuous, we cannot store all the probabilities in a table. Alternatively, one can use a parametric or nonparametric model to approximate the CPD. In the following two subsections, we briefly review the two models we use to represent CPDs in our approach.

2.5.2 Linear Gaussian model

A linear Gaussian model is an approach to modeling the relationship between a dependent

variable Y and a set of independent variables X. It assumes that the conditional distribution of

Y given X is normal, with a mean that depends linearly on X:

Y | X; w ~ N(w^T X, \sigma^2)    [2.2]

The learning problem for the linear Gaussian model is formalized as follows: given a set of training data, i.e., (x, y) pairs, find the parameters w that maximize the likelihood of the training data under the model:

w^* = \arg\max_w \prod_i Pr(y_i | x_i; w)    [2.3]

Note that the variance \sigma^2 is implicitly contained in the equation. However, the choice of w does not depend on \sigma^2; the same w^* is obtained even if \sigma^2 is unknown.
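As a concrete illustration (on synthetic data, not robot data), the maximum-likelihood w of a linear Gaussian model can be recovered by ordinary least squares, with no knowledge of \sigma^2:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=200)  # Gaussian noise, sigma unknown

# Maximizing Eq. 2.3 over w is exactly least squares; sigma^2 drops out.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_hat)  # close to w_true
```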

2.5.3 Regression tree model

A regression tree [36] is a type of decision tree that predicts the value of a target variable based on several input variables, handling the case where the target variable is continuous. A regression tree partitions the input data into disjoint groups, each of which corresponds to a leaf node. Each internal node of a regression tree is associated with a test on the value of an input variable, and each branch corresponds to an outcome of the test. Usually, a test compares the value of an input variable with a constant. A new data point starts at the root node and descends to a leaf according to the test outcomes. A regression model, e.g., a linear Gaussian model, is assigned to each leaf node.
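A minimal sketch of prediction with such a tree follows; the dictionary-based node structure is our own illustration, not the thesis implementation. Each internal node tests one input variable against a constant, and each leaf holds the mean of a linear Gaussian model:

```python
def predict(node, x):
    """Route x from the root to a leaf, then apply the leaf's linear model."""
    while "leaf" not in node:
        node = node["left"] if x[node["var"]] < node["thresh"] else node["right"]
    w, b = node["leaf"]                      # leaf mean: w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

tree = {
    "var": 0, "thresh": 0.5,
    "left":  {"leaf": ([1.0, 0.0], 0.0)},    # y ~ x[0]
    "right": {"leaf": ([0.0, 2.0], 1.0)},    # y ~ 2*x[1] + 1
}
print(predict(tree, [0.2, 3.0]))  # 0.2
print(predict(tree, [0.9, 3.0]))  # 7.0
```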

The learning problem for a regression tree can be formalized as follows: given a set of training data, i.e., (x, y) pairs, find the tree structure, including the test at each internal node and the regression model at each leaf node. Standard methods for regression tree training and pruning are introduced in [36]. The tree is constructed top down, recursively. First, a test is picked for the root node according to a suitable splitting criterion. Branches are then made for each possible outcome of the test, splitting the training data into subsets, one per outcome. The process is repeated recursively for each branch until no further splitting is needed. In problems where most variables are continuous, r^2 is used as the splitting and stopping criterion. First, a linear Gaussian model is built for each node (both internal and leaf nodes) using the associated data. Then the r^2 of the model is calculated to measure the goodness of the model. r^2 is defined as follows:

r^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}    [2.4]

where y_i is the true value of the ith data point, \hat{y}_i is the model's estimate of y_i, and \bar{y} is the mean of the y_i's. At each node, the goodness of each test is calculated in the following way:

r^2_{split} = \sum_i \frac{N_i}{N} r^2_i    [2.5]

where N is the number of data points at the current node, N_i is the number of data points that go to the ith branch after splitting, and r^2_i is the r^2 of the model for the ith branch. The split with the largest improvement in r^2 is used. If no such split exists, the current node becomes a leaf, and all the data remaining at this node are used to train a linear regression model. There are also ways to prune the generated tree to keep it from overfitting. In this work, we also use a modified version, introduced in the next chapter, that takes advantage of the available simulator.
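Equations 2.4 and 2.5 can be sketched as follows; the piecewise-linear synthetic data and the candidate split point are illustrative:

```python
import numpy as np

def r2(y, y_hat):
    """Coefficient of determination (Eq. 2.4)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def split_r2(branches):
    """Weighted-average r^2 over the branches of a candidate split (Eq. 2.5)."""
    n = sum(len(y) for y, _ in branches)
    return sum(len(y) / n * r2(y, y_hat) for y, y_hat in branches)

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 100)
y = np.where(x < 0, 2 * x, -2 * x) + rng.normal(scale=0.05, size=100)

# One linear fit over all the data vs. a linear fit per branch of "x < 0".
whole = r2(y, np.polyval(np.polyfit(x, y, 1), x))
after = split_r2([(y[m], np.polyval(np.polyfit(x[m], y[m], 1), x[m]))
                  for m in (x < 0, x >= 0)])
print(whole < after)  # True: splitting on x < 0 improves the fit
```

A real tree learner would evaluate this gain for every candidate (variable, value) pair and keep the best.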



Chapter 3

Phased Validation Framework

The big picture behind this work is the framework shown in Figure 1.1. This is a phased validation strategy in which the RoS system is tested in several ways: first through accurate simulations, then in hardware with tissue "phantoms" that simulate real tissue, then in clinical trials, and finally when it is marketed. At each point, data is collected and analyzed to determine possible A&A events. Reports of A&A events are also collected from users. If confirmed, diagnosis and fault localization techniques can be used to identify and correct the malfunctioning hardware/software. The modified system is then revalidated using the same phased validation pipeline.

In this work, we focus on the simulation-based testing phase. Specifically, statistical learning models are built for the behavior of the simulator, and are then used to classify, detect, and predict A&A events. In this chapter, we describe a simulation of our robot, the supervisory software we have developed, and the statistical methods we use to analyze the collected data. 1

3.1 The SABiR robot

In this work, we use the Small Animal Biopsy Robot (SABiR) [37]. Figure 3.1 shows an image of the robot. It is a five-degree-of-freedom parallel robotic manipulator designed to take biopsies or deliver therapeutic drugs at targets in live small animal subjects, with an accuracy better than 250 µm. It employs a parallel design to achieve low inertia. The robot has high position resolution and can realize dexterous alignment of the needle before insertion. The design is lightweight and has high motion bandwidth, so that biological motion (e.g., breathing, heartbeat) at the target can be canceled while the needle is inserted inside tissue.

1 Work on the simulation environment and supervisory software system was done by Mark Renfrew and Zhuofu Bai, respectively.

Figure 3.1 SABiR: The robotic system for image-guided needle-based interventions on small animals. Left: CAD model of the system. Right: A prototype system built in our lab.

The robot consists of a needle mechanism held by two 5-bar linkage mechanisms, referred to as the front and rear stages. The front stage has two degrees of freedom (up/down, left/right) and the rear stage has three (up/down, left/right, rotate forward/backward). The stages are driven by five tendon-mechanism motors, and the joint angles are measured by encoders. The robot's state is characterized by its five joint angles, and there is a one-to-one correspondence between any position and orientation that the needle tip can reach and a set of joint angles.

3.2 Robot simulation and environment

In Section 1.1, we described several subproblems that need to be solved in order to achieve our goal. The first step is to design a robust simulation platform to reproduce system behavior. We use a simulation of the robot rather than the physical system because building our models requires a significant amount of data, and it is difficult in terms of time and cost to collect this from the actual system. However, to ensure that the models we build are still applicable to the real system, we make the simulation as accurate as possible, as described below.

Figure 3.2 A normal trajectory (dashed line) in the simulation environment consisting of two "tissue" blocks (the red and blue cubes)

In prior work, models for the kinematics and inverse kinematics of SABiR were developed [38]. Here we use them to create a simulation of the robot, implemented in Simulink [39], in which each of the robot's motors is represented as a third-order transfer function. The simulator is designed to be a modular component of the system, in the sense that it can be seamlessly swapped with the controller of the actual robot.



The environment of the simulated robot consists of two gel blocks with different characteristics placed in the workspace, one contained within the other (Figure 3.2). A needle force model, which assumes a stiff, non-deformable needle, provides the resistive force caused by the combined cutting, frictional, and stiffness forces produced when the needle is inside the gel block. The cutting force is caused by the needle tip piercing the gel block and resists the needle's motion during insertion. The frictional force is produced by friction between the needle body and the walls of the channel in the gel block, and resists the needle during both insertion and extraction. The stiffness force is caused by the gel block's tendency to resist sideways motion of the needle, i.e., any motion not in the direction the needle is pointing. In this way, realistic and distinguishable forces are produced by any possible motion of the needle. The needle model is described in detail in [40].

We use a simple low-level controller to control the simulation. The robot's initial location after calibration is called its "home position". The controller can then be given a new position and orientation to move the needle to. If the new position can be reached, a "reference trajectory" is computed using linear interpolation. Velocities along this trajectory are held nearly constant, with fast acceleration and deceleration at the start and end (subject to a desired maximum acceleration). The simulator then uses a PD controller to follow this trajectory and guide the needle to the end position.
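The reference-trajectory computation described above can be sketched in one dimension as a trapezoidal speed profile: accelerate at the maximum rate, cruise at a nearly constant velocity, then decelerate. The limits v_max and a_max below are illustrative, not the robot's actual parameters:

```python
import numpy as np

def reference_trajectory(start, end, v_max=1.0, a_max=4.0, dt=0.01):
    """1-D reference trajectory: linear interpolation with a trapezoidal
    speed profile (accelerate, cruise, decelerate)."""
    dist = abs(end - start)
    sign = 1.0 if end >= start else -1.0
    t_ramp = v_max / a_max                      # time to reach cruise speed
    d_ramp = 0.5 * a_max * t_ramp ** 2          # distance covered while ramping
    if 2 * d_ramp > dist:                       # too short to cruise: triangular
        t_ramp = (dist / a_max) ** 0.5
        v_max = a_max * t_ramp
        d_ramp = dist / 2
    t_cruise = (dist - 2 * d_ramp) / v_max
    ts = np.arange(0, 2 * t_ramp + t_cruise + dt, dt)
    pos = []
    for t in ts:
        if t < t_ramp:                          # accelerating
            d = 0.5 * a_max * t ** 2
        elif t < t_ramp + t_cruise:             # cruising
            d = d_ramp + v_max * (t - t_ramp)
        else:                                   # decelerating
            td = min(t - t_ramp - t_cruise, t_ramp)
            d = d_ramp + v_max * t_cruise + v_max * td - 0.5 * a_max * td ** 2
        pos.append(start + sign * min(d, dist))
    return ts, np.array(pos)

ts, pos = reference_trajectory(0.0, 2.0)
```

The resulting profile starts and ends at rest and never exceeds the acceleration limit.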

The single insertion task for the robot is to insert the needle tip to a specific target position in the inner block, at a specific target orientation. To complete an insertion task, the needle starts from a predefined home position. The needle is moved towards a ready position, which is outside but near the surface of the outer block; at the same time, it is rotated to the specified insertion orientation. The motors then work together again to insert the needle into the blocks until it reaches the target point. The needle is then extracted from the blocks back to the ready position. Finally, the needle is guided to the home position, returning the robot to its initial state. The four steps described here divide the whole task into four consecutive macro actions, i.e.,

move-needle, insert-needle, extract-needle, and move-home. The dashed line in Figure 3.2 shows a normal trajectory of the single insertion task. Note that the trajectories of the last two actions overlap with those of the first two here. 2

Figure 3.3 Software architecture

Three sets of parameters specify a single insertion task: the ready position, the ready orientation (same as the target orientation), and the insert distance (the distance between the ready position and the target position). Note that position and orientation are each specified with 3 values, one per spatial dimension.
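The parameterization and the four macro actions can be sketched as follows; the class and function names are our own illustration, not the system's API:

```python
from dataclasses import dataclass

# Hypothetical encoding of the three parameter sets that specify a
# single insertion task.
@dataclass
class InsertionTask:
    ready_position: tuple      # (x, y, z) just outside the outer block
    ready_orientation: tuple   # (rx, ry, rz), same as the target orientation
    insert_distance: float     # distance from ready position to target

def decompose(task):
    """Expand a task into the four consecutive macro actions."""
    return [
        ("move-needle", task.ready_position, task.ready_orientation),
        ("insert-needle", task.insert_distance),
        ("extract-needle", task.insert_distance),
        ("move-home", None),
    ]

actions = decompose(InsertionTask((1.0, 2.0, 0.5), (0.0, 0.3, 0.0), 0.8))
print([a[0] for a in actions])
# ['move-needle', 'insert-needle', 'extract-needle', 'move-home']
```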

3.3 Software architecture and data collection

A supervisory software system is built on top of the low-level controller. The user interacts with the robot through this software. The system has three components: a GUI, a task delegator, and a robot proxy. Figure 3.3 shows the information flow between them when the robot performs a high-level insert/extract needle operation.

The graphical user interface (GUI) lets the user easily pass commands to the robot through the task delegator. For this work, we only need to collect the data, both from software and hardware. The data flow is as follows: the commands (i.e., the parameters specifying the target location and orientation) are passed by calling the application

2 See Appendix B for a more detailed description of the environment and parameters.



Type       Name                  Num  Description
Parameter  Reference Trajectory  18   Reference position, starting position, end position
           Ready Position        6    Position and orientation where insertion begins
           Insert Distance       1    Insert distance into tissue from ready position
Software   Action                1    High-level action, e.g. "Insert Needle"
           Insert Depth          1    Depth of the needle inside tissue
           Needle Force          3    Force on the needle
           Needle Torque         3    Torque on the needle
           Motor Position Error  5    Error between actual position and reference position
           Motor Speed Error     5    Error between actual speed and reference speed
Hardware   Needle Position       6    Position and orientation of needle tip
           Motor Position        5    Positions of 5 motors
           Motor Torque          5    Torques of 5 motors

Table 3.1 State variables in DBNs.

programming interfaces (APIs) provided by the task delegator. The task delegator first checks the validity of the input parameters for the specified operation; for example, it ensures that target locations are within the robot's workspace. It then decomposes a complex task into a set of basic needle motions that can be accomplished by calls to the API for the robot (in this case, the simulator). The delegator is equipped with different schemas to decompose different high-level tasks. It then invokes the robot API to handle these decomposed tasks. If an error occurs, it is responsible for stopping the current action and returning an error signal. When the robot proxy gets an API call, it communicates with the real robot (again, in this case, the simulator), issuing low-level operations and collecting low-level sensor data from the robot (simulator).
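The delegator's check-then-dispatch flow can be sketched as below; the API names, parameter dictionary, and stub proxy are hypothetical stand-ins for the actual interfaces:

```python
class WorkspaceError(Exception):
    """Raised when invalid parameters invoke the error handler."""

def delegate_insertion(robot_api, params, workspace_contains):
    """Validate parameters, then issue the four macro-action API calls,
    stopping and signaling an error if any call fails."""
    if not workspace_contains(params["ready_position"]):
        raise WorkspaceError("target outside the robot's workspace")
    for api_call in ("move_to_ready", "insert_needle",
                     "extract_needle", "move_home"):
        if not getattr(robot_api, api_call)(params):
            raise RuntimeError(api_call + " failed")

class StubRobot:
    """Stand-in for the robot proxy; records the API calls it receives."""
    def __init__(self):
        self.calls = []
    def __getattr__(self, name):
        def call(params):
            self.calls.append(name)
            return True
        return call

bot = StubRobot()
delegate_insertion(bot, {"ready_position": (0.0, 0.0, 0.0)}, lambda p: True)
print(bot.calls)  # the four macro actions, in order
```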



Variable Name    Num
Needle Position  6
Motor Position   5
Motor Torque     5
Insert Depth     1
Needle Force     3
Needle Torque    3

Table 3.2 Set of predictable variables.

3.3.1 State variables

There are three types of variables that are part of the system state: parameters, hardware variables, and software variables. Variables such as the reference trajectory to be followed by the robot are "parameters". These variables are inputs to the system and do not change over time. While we use these parameters to help predict future state trajectories, they are not themselves predicted since they are constant. Other variables, such as the x position of the needle tip, are "hardware variables". The values of these variables are obtained by sensors on the robot or by direct hardware measurements of various sorts. The third type of variable is "software variables". These include flags set by subroutines in the software, denoting which high-level motion is being executed. Software variables also include variables such as "force on the needle", which cannot be directly sensed in hardware but can be estimated or derived in software indirectly from other variables. Such estimates can in turn be used to estimate yet other variables such as "depth of needle in tissue". The set of variables describing the state is shown in Table 3.1. For each kind of variable, "Num" refers to the number of variables of that kind; e.g., there are 3 needle forces, one in each direction. 3

Variables can also be categorized in terms of whether we build models for them or not,

i.e. “predictable” and “observable”. Predictable variables are variables whose values are only

3 For a full description of all variables and DBN structures, please refer to Appendix A.




Figure 3.4 Schematic of the dynamic Bayesian network models we use, showing linear Gaussian and regression tree CPDs.

available in real time. They include the hardware variables and some derived software variables, such as Needle Force. Observable variables are variables whose values are available before execution of a task; examples include the reference trajectory and the parameters. Table 3.2 lists the predictable variables. The remaining variables are observable.

3.4 Modeling the Hardware/Software state with DBN

We use the collected data to build models representing the time evolution of the joint software/hardware state space of the system. We use dynamic Bayesian networks (DBNs) to represent the probability of the next state given the current one, i.e., Pr(S_{t+1} | S_t), where each state S_t is described by a vector of variables.

The parameters of these probability distributions are learned in the way described below.

We represent the conditional probability distributions (CPDs) in the DBN in two ways.

Certain state variables, such as the x position of the needle tip, vary at a near constant rate from

t to t+1, because (in this case) the robot controllers are designed to maintain a constant velocity

as far as possible. For such variables, we use linear Gaussian models. For other variables, we

employ regression tree models for their CPDs.

We learn DBNs to model "normal" state transitions, and then use them to estimate the likelihoods of the actual state transitions. State transitions with low likelihood are classified as "abnormal" and are thus considered to relate to A&A events. We do not build models of abnormal state transitions in this work, because A&A events happen rarely and unexpectedly, making it impossible to collect data for all potential A&A events. Moreover, different A&A events may not share the same transitions, since they arise from different causes. Given this, it is improper either to build a separate model for each type of A&A state transition or to build one model for all of them. As a result, our models cannot classify different types of A&A events. On the other hand, models of "normal" state transitions are more robust to the "rare" states of the system, because enough data can be obtained from the simulator to ensure that the trained models are accurate.
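The likelihood test can be sketched as follows, assuming each learned CPD exposes a predicted mean and a variance (a hypothetical interface; in our approach the CPDs are the linear Gaussian and regression tree models described above):

```python
import math

def transition_loglik(s_t, s_t1, models):
    """Log-likelihood of one state transition: the sum of per-variable
    Gaussian log-densities under the learned CPDs."""
    ll = 0.0
    for i, (predict_mean, var) in enumerate(models):
        mu = predict_mean(s_t)
        ll += -0.5 * (math.log(2 * math.pi * var) + (s_t1[i] - mu) ** 2 / var)
    return ll

# One variable whose normal dynamics are s_{t+1} = s_t + 0.1 (illustrative).
models = [(lambda s: s[0] + 0.1, 0.001)]
normal_ll = transition_loglik([1.0], [1.1], models)
abnormal_ll = transition_loglik([1.0], [1.6], models)
print(normal_ll > abnormal_ll)  # True: the jump to 1.6 is far less likely
```

Transitions whose log-likelihood falls below a chosen threshold would be flagged as abnormal.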

We generate sequences of normal trajectories from our simulation and estimate the CPDs for these variables from them. CPD parameters are estimated using maximum likelihood; for linear Gaussian models, this is equivalent to linear regression and yields simple closed-form solutions. For regression tree models, we use a standard greedy top-down recursive decomposition approach. At each point, each variable and its associated values are evaluated to estimate the benefit of splitting on that (variable, value) pair. All splits are binary. The benefit is computed using a weighted-average r^2 metric which compares the r^2 goodness-of-fit value before and after splitting (the weight is the fraction of points that go to each partition after a candidate split). The (variable, value) pair with the maximum gain is selected as the internal node. We make one modification to this standard tree construction procedure. In a



normal case, the number of data points decreases as we go deeper into the tree because of the recursive partitioning. However, since we have a simulator, we use it as an oracle to generate data points as needed. These points are generated using a form of rejection sampling: a random trajectory is sampled, and a point on it is evaluated to see whether it satisfies the checks at the internal nodes; if so, it is kept, otherwise it is discarded. This procedure ensures that we have enough points at each node in the tree to make decisions about the choice of splits. Further, to prevent overfitting, we prune the produced regression tree for each variable using a fixed validation set, with a standard greedy post-pruning approach [41].
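The oracle-based rejection sampling can be sketched as follows; the uniform stand-in simulator and the two node checks are illustrative:

```python
import random

def sample_node_data(simulate, node_checks, n, max_tries=100000):
    """Rejection sampling: draw points from the simulator (the oracle) and
    keep those that satisfy the internal-node tests along the path."""
    kept = []
    for _ in range(max_tries):
        point = simulate()
        if all(check(point) for check in node_checks):
            kept.append(point)
            if len(kept) == n:
                break
    return kept

random.seed(0)
# Stand-in simulator: uniform 2-D points; the path to this node requires
# x < 0.5 and y > 0.2.
pts = sample_node_data(lambda: (random.random(), random.random()),
                       [lambda p: p[0] < 0.5, lambda p: p[1] > 0.2], 50)
print(len(pts))  # 50
```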



Chapter 4

Empirical Evaluation

In this chapter, we evaluate how well our models represent the robot's dynamics while conducting a single insertion task, and how accurately they can classify, detect, and predict certain kinds of A&A events. All experiments are performed with our simulator, which is an accurate simulation of the robot. We first describe the experimental methodology, and then present our results.

4.1 Sampling normal trajectories

Ideally, in order to build a model that captures the normal behavior of the system, all possible trajectories should be used as training data. However, this is impossible since there are infinitely many normal trajectories. In practice, we sample uniformly from the set of all normal trajectories and use the samples as representatives of the whole set. We do this by sampling parameters from the feasible space. The feasible parameter space is defined such that the ready position and orientation are close to the surface of the outer block, and the insert distance places the target position somewhere in the inner block.

However, not all trajectories specified by sets of parameters are reachable in the robot's workspace. One could imagine asking the robot to reach a target position from an odd orientation, say, from the other side of the blocks. This is impossible, and given such parameters, the simulator will generate abnormal trajectories. Fortunately, our simulator does provide a way, though incomplete, to rule out most of these infeasible parameters.



Figure 4.1 Trajectories of different scenarios (2D view). Top left: normal trajectory. Top right: encoder failure trajectory. Bottom left: sweep trajectory. Bottom right: out-of-workspace trajectory. The latter three correspond to A&A events.

4.2 Trajectories of A&A events

An adverse event is any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product, which does not necessarily have a causal relationship with the treatment. In our case, A&A events can be triggered by either hardware problems or software bugs. Our goal here is to use our model to detect adverse and anomalous events. We consider three kinds of A&A events (shown in Figure 4.1):



• Encoder Failure: This event happens at some point within the trajectory, when the element that reports a motor's position fails, so the system can no longer track that motor's position. As a result, the controllers can no longer effectively guide the needle tip to follow the reference trajectory.

• Sweep: This event happens when, prior to needle insertion, the needle tip strays and grazes the tissue surface. This can occur when part of the reference trajectory of the move-needle or move-home action passes very close to the surface; as the actual trajectory follows the reference but deviates slightly from it, it may touch the surface (skin) and injure the subject.

• Out-of-workspace (OOW): This event is triggered by software bugs. The reference trajectories in this case are normal, but the actual trajectories fail to follow them. The reason, as we diagnosed and fixed, is that the expected trajectory lies outside the feasible workspace of the robot: in the buggy code, parameters were not validated before being passed to the simulator/robot system.

We generate several trajectories for each such event from the simulator. Since actual A&A events are rare, we restrict the ratio of trajectories for each A&A event to normal trajectories in our dataset to 1.25%.

4.3 Modeling normal trajectories

We build DBN models from normal trajectories, so that they can be used to classify, detect, and predict A&A events by observing that the corresponding trajectories have low likelihood under the DBNs. We consider and evaluate three DBNs: a model using all the variables in Table 3.1 (HS), a model using only the "parameter" and "hardware" variables (HWOnly), and a model using all variables but making 10-step predictions (i.e., modeling Pr(S_{t+10} | S_t)) (HS10).

Models are trained using the algorithms described in Section 3.4. We sample 400 normal trajectories from the simulator, and from them 5,000 (s_t, s_{t+1}) pairs are sampled as initial training data for building a regression tree to model the CPD for each predictable variable. More pairs are sampled as needed to split an internal node. Some resulting models are shown in Figure 4.2 and Figure 4.3. Linear Gaussian models are also trained with 5,000 pairs of data points randomly sampled from the 400 normal trajectories.

Figure 4.2 Regression tree trained for actual needle tip position on the Z axis. In each node are the R^2, mean, variance, and training data size of the associated linear Gaussian model. The green nodes are leaves. The others are the root and internal nodes, which have the variable name of the splitting test specified. Attached to each edge is the outcome of the test on the corresponding branch.

Figure 4.3 Regression tree trained for total force on the needle on the Z axis. The content has the same meaning as in Figure 4.2.

By comparing the r² of each model, we observe that for some variables, such as the needle tip positions, linear Gaussian models can perfectly capture the dynamics, although regression trees are almost as good. We use linear Gaussian models for those variables. However, some other variables, such as the needle forces, have different dynamics in different actions. For example, during the move-needle and move-home actions the forces are always 0, while during the insert-needle and extract-needle actions they are not. A simple linear Gaussian model cannot capture this relationship because it is no longer linearly representable. A regression tree, on the other hand, can model this behavior exactly by creating a branch for each of these cases. Figure 4.3 shows the resulting model for the needle force on the Z axis.
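To illustrate why a per-action split helps, the following sketch (synthetic data and hypothetical names, not the thesis code) fits one linear model to pooled piecewise dynamics and compares it against fitting one model per action, as a regression tree's top split on the action variable would:

```python
import numpy as np

# Hypothetical piecewise dynamics like the needle force in Figure 4.3:
# during a "move" action the force is identically 0; during "insert" it
# grows linearly with depth. A single linear Gaussian model must pool
# both regimes; a tree-style split on the action fits each one separately.
rng = np.random.default_rng(0)
depth = rng.uniform(0.0, 5.0, size=400)
action = rng.integers(0, 2, size=400)          # 0 = move, 1 = insert
force = np.where(action == 1, 0.3 * depth, 0.0) + rng.normal(0, 0.01, 400)

def fit_linear(x, y):
    """Least-squares line y = a*x + b; returns the residual variance."""
    a, b = np.polyfit(x, y, 1)
    return np.var(y - (a * x + b))

pooled_var = fit_linear(depth, force)          # one model for all actions
split_var = np.mean([fit_linear(depth[action == c], force[action == c])
                     for c in (0, 1)])         # one model per action branch

print(pooled_var > split_var)                  # the per-action split fits far better
```

The pooled residual variance stays large because no single line fits both regimes, while each per-action branch is nearly noise-free, mirroring the tree in Figure 4.3.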

4.3.1 Results

We first evaluate how well our DBNs can model normal trajectories. We randomly sample 20,000 (s_t, s_{t+1}) pairs from the 400 normal trajectories (distinct from the training set). Using the s_t values, we then produce the predicted state s_{t+1} and compute an r² metric that represents the


Model    Hardware  Software  All
HS       0.784     0.415     0.670
HS10     0.773     0.397     0.648
HWOnly   0.725     N/A       N/A

Table 4.1 Average test r² for normal state trajectories.

accuracy of these predictions in the following way:

r_v^2 = 1 - \frac{\sum_i (v_i(t+1) - v_i^{pred}(t+1))^2}{\sum_i (v_i(t+1) - \bar{v}(t+1))^2}    [4.1]

r^2 = \frac{1}{|V|} \sum_{v \in V} r_v^2    [4.2]

where V is the set of predictable variables in the DBN, v_i(t+1) is the (t+1) value of variable v in the i-th pair, v_i^{pred}(t+1) is the predicted value from the model, and \bar{v}(t+1) is the mean of the (t+1) values of v in the training set. r² is the average of all the r_v^2 values.
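The metric in Equations 4.1 and 4.2 can be sketched as follows; the arrays and values below are illustrative, not from the experiments:

```python
import numpy as np

# Sketch of the r^2 metric in Equations 4.1-4.2. v_true holds the observed
# t+1 values of one variable, v_pred the DBN's predictions, and v_mean the
# training-set mean of that variable.
def r2_variable(v_true, v_pred, v_mean):
    """Equation 4.1: 1 - SSE(prediction) / SSE(training mean)."""
    sse_pred = np.sum((v_true - v_pred) ** 2)
    sse_mean = np.sum((v_true - v_mean) ** 2)
    return 1.0 - sse_pred / sse_mean

def r2_average(per_variable_scores):
    """Equation 4.2: average r^2 over all predictable variables."""
    return float(np.mean(per_variable_scores))

# A perfect predictor scores 1; always predicting the mean scores 0.
v = np.array([1.0, 2.0, 3.0, 4.0])
print(r2_variable(v, v, v.mean()))                          # 1.0
print(r2_variable(v, np.full_like(v, v.mean()), v.mean()))  # 0.0
```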

These results are shown in Table 4.1¹. From these results, we observe that the DBNs are quite good at modeling the time evolution of the hardware variables. The software variables are more difficult to predict, though there is a lot of variability. Some variables, such as the depth, can be predicted very well (r² = 0.99) because they vary at a nearly constant rate, as the robot controllers are designed to maintain a constant velocity as far as possible. Others, such as the force on the needle in the y direction, are hard to predict (r² = 0.4) since their dynamics are less linear. However, as we show below, this still often allows us to effectively classify, detect, and predict A&A events. Finally, we observe that when using the HWOnly model, the accuracy of prediction on the hardware variables is lower than when using the HS model. This suggests that the software variables add value to the DBN, and using them results in more accurate predictions.

¹See Appendix C for r² values of different models for all predictable variables.


4.4 Classification, detection, and prediction of A&A events

Now we evaluate the effectiveness of our model on classifying, detecting, and predicting

the three types of A&A events we introduced.

4.4.1 A&A event classification

[Figure 4.4 plot: ROC curves (true positive rate vs. false positive rate, FPR axis 0–0.05) for the HS, HS10, and HWOnly models.]

Figure 4.4 ROC graph for classification of the "sweep" A&A event.

We first consider the question of classifying trajectories according to whether they contain

an A&A event. This may be useful if execution data is logged on a remote system and later

fetched for analysis.

We consider a test set (distinct from the training set) of 400 normal trajectories and 5 trajectories for each of the three A&A events described above. For each point s_t on each trajectory, we use the DBN models to compute the negative log likelihood (NLL) of s_{t+1} (or s_{t+10} for HS10). We record the maximum negative log likelihood over the trajectory, and use this score to rank


[Figure 4.5 plot: ROC curves (true positive rate vs. false positive rate, FPR axis 0–0.05) for the HS, HS10, and HWOnly models.]

Figure 4.5 ROC graph for classification of the "OOW" A&A event.

trajectories. The negative log likelihood² for a state s_{t+1} given its previous state s_t is computed as follows:

NLL(s_{t+1} \mid s_t) = \sum_{i \in Pred} \frac{(v_i^{pred}(t+1) - v_i(t+1))^2}{Var(i)}    [4.3]

where Pred is the set of indices of predictable variables in the DBN, v_i^{pred}(t+1) is the predicted value for variable i at time t+1, v_i(t+1) is the observed value for variable i at time t+1, and Var(i) is the variance of the CPD (in our case either the regression tree model or the linear Gaussian model) over the training data. Having defined the NLL for a state, we then define the "score" of a trajectory as follows:

score_C = \max_{t=1}^{T-1} NLL(s_{t+1} \mid s_t)    [4.4]

where T is the length of the trajectory; a higher value of score_C indicates a higher probability that the trajectory contains an A&A event. Since we already know the ground truth for each trajectory, i.e., whether it contains an A&A event or not, we can then create a receiver-operating

²The NLL also has a \sum \frac{1}{2} \log(2\pi\sigma^2) term. Since we use the NLL to rank predictions and this is a constant offset, we omit it.


characteristic (ROC) graph from this ranked list of predictions. ROC graphs show how the

true positive rate (TPR) varies with the false positive rate (FPR) as a threshold is moved over a

confidence measure (i.e. the score in our case).
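A minimal sketch of Equations 4.3 and 4.4 follows; the variable names, shapes, and variances are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

# Sketch of the classification score: per-step squared prediction errors are
# normalized by each variable's CPD variance and summed (Equation 4.3), and
# the trajectory score is the maximum per-step NLL (Equation 4.4).
def nll(pred, actual, var):
    """Equation 4.3: per-step NLL (constant log term dropped)."""
    return np.sum((pred - actual) ** 2 / var)

def score_trajectory(preds, actuals, var):
    """Equation 4.4: maximum per-step NLL over the trajectory."""
    return max(nll(p, a, var) for p, a in zip(preds, actuals))

var = np.array([0.1, 0.2])                    # CPD variances (assumed)
normal = [(np.zeros(2), np.zeros(2))] * 5     # predictions match observations
# One step deviates strongly from the prediction -> large maximum NLL.
anomalous = normal[:-1] + [(np.zeros(2), np.array([1.0, 1.0]))]

s_norm = score_trajectory(*zip(*normal), var)
s_anom = score_trajectory(*zip(*anomalous), var)
print(s_anom > s_norm)                        # anomalous trajectory ranks higher
```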

Results for classifying the "encoder failure" A&A event are perfect for all three models, in the sense that the areas under the ROC curves are all 1. Results for the other two A&A events are shown in Figures 4.4 and 4.5, respectively. We can see that a trajectory with an encoder failure or software bug (here generating an "OOW" trajectory) is quite easy to classify for all our models. This is probably because these kinds of events have a very strong signature among the hardware variables, which can be captured by models incorporating them. On the other hand, the sweep event is harder to classify if software variables are not used. This is because one of the variables in the HS model is the force on the needle tip, which behaves very abnormally in such cases and causes a large deviation from the expected behavior. Even though HWOnly is relatively less accurate, all of the DBNs are quite accurate in absolute terms and reach 100% TPR within 3% FPR. Also, note that for the "OOW" event, the HWOnly model performs perfectly while the others do not. This is because the models for some software variables are not good enough, so there can be large deviations between predicted and actual values. As we take the maximum of the NLLs across the whole trajectory, it is possible to get a large score_C for normal trajectories, although the chance is very small.
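The ROC graphs in this section are built from such ranked scores; a minimal sketch of that construction, with made-up scores and labels:

```python
import numpy as np

# Sketch of building ROC points: sort trajectories by score (highest first),
# sweep a threshold down the ranking, and record (FPR, TPR) pairs.
def roc_points(scores, labels):
    """Return (fpr, tpr) arrays as the threshold sweeps the ranked scores."""
    order = np.argsort(-np.asarray(scores))   # highest score first
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels)                    # positives flagged so far
    fp = np.cumsum(1 - labels)                # negatives flagged so far
    return fp / max(fp[-1], 1), tp / max(tp[-1], 1)

# Anomalous trajectories score higher, so the curve hugs the top-left corner.
scores = [9.0, 7.5, 3.0, 2.0, 1.0]
labels = [1, 1, 0, 0, 0]                      # 1 = contains an A&A event
fpr, tpr = roc_points(scores, labels)
print(tpr[1], fpr[1])                         # all positives found at FPR 0
```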

4.4.2 A&A event detection

Next we consider how quickly our models can detect A&A events after the event has occurred. Of course, we would like not just to detect these events but to predict them, and we will discuss prediction in the next section. However, some A&A events, such as our encoder failure and software bugs, may be unpredictable in that the trajectory appears completely normal until the point when the event happens. Therefore it is still valuable to ask, given that an A&A event has happened, how quickly a model such as ours can detect it. To measure this we again use a test set of 400 normal trajectories and 5 trajectories for each of the three A&A events. As before, we use the DBNs to check every (s_t, s_{t+1}) pair, but in this case we do not aggregate


[Figure 4.6 plot: ROC curves (true positive rate vs. false positive rate, both axes 0–1) for the HS, HS10, and HWOnly models.]

Figure 4.6 ROC graph for detection of the "encoder failure" A&A event.

[Figure 4.7 plot: ROC curves (true positive rate vs. false positive rate, both axes 0–1) for the HS, HS10, and HWOnly models.]

Figure 4.7 ROC graph for detection of the "sweep" A&A event.


[Figure 4.8 plot: ROC curves (true positive rate vs. false positive rate, both axes 0–1) for the HS, HS10, and HWOnly models.]

Figure 4.8 ROC graph for detection of the "OOW" A&A event.

the prediction over trajectories. Rather, we associate each point with a smoothed NLL score, where the smoothing is done over a window of the 50 (chosen arbitrarily) previous time steps.

The smoothing helps to reduce error in intermediate short regions where the DBN’s estimate

is poor. The “score” of a state (point) in a trajectory is defined as follows:

score_D(i) = \frac{1}{\min\{i, 50\}} \sum_{t=\max\{i-50,\, 0\}}^{i-1} NLL(s_{t+1} \mid s_t)    [4.5]

where NLL is as defined in Equation 4.3; again, a higher score indicates a higher probability of an A&A event at this time point. In this case, the ground truth is set as follows: every point from the occurrence of an A&A event until the end of the event receives the label "positive", while every other point is labeled "negative". We then use the smoothed NLL score to construct an ROC graph, as shown in Figure 4.6, Figure 4.7, and Figure 4.8.
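The sliding-window average in Equation 4.5 can be sketched as follows, with synthetic per-step NLL values standing in for Equation 4.3:

```python
import numpy as np

# Sketch of the smoothed detection score (Equation 4.5): the per-step NLL is
# averaged over a window of up to 50 previous steps, damping short regions
# where the DBN's estimate happens to be poor.
def score_detection(step_nlls, i, window=50):
    """Average NLL over the window [max(i - window, 0), i)."""
    lo = max(i - window, 0)
    return float(np.mean(step_nlls[lo:i]))

step_nlls = np.concatenate([np.full(100, 0.1),   # normal regime
                            np.full(100, 5.0)])  # after an A&A event
print(score_detection(step_nlls, 100))  # low: the window precedes the event
print(score_detection(step_nlls, 200))  # high: the window covers the event
```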

From these results we observe that while all the models are good at detecting encoder failures, the HS10 models have a small advantage. For the other two A&A events as well, HS10


does slightly better than the other models. This is probably due to the built-in "lookahead" in these models.

4.4.3 A&A event prediction

[Figure 4.9 plot: ROC curves (true positive rate vs. false positive rate, FPR axis 0–0.6) for the HS, HS10, and HWOnly models.]

Figure 4.9 ROC graph for prediction of the "sweep" A&A event with a 10-step lookahead.

Finally, we consider the task of predicting A&A events. Imagine that the model is being used in an online setting where at every step it can make a determination as to whether an A&A event is likely to occur in the next k steps. To do this, at every point t, we obtain k points s_{t+1}, ..., s_{t+k} from our DBNs conditioned on s_t. These k points are the means of the associated Gaussian distributions, so this is the most likely trajectory conditioned on s_t. From these points we pick the needle tip positions and measure the average NLL score (Equation 4.6) compared to the reference trajectory provided as input. (We use only the needle tip position because that is the only information in the reference trajectory.) Thus we are evaluating, given the current state, how likely it is that the most likely needle tip trajectory k steps later will be significantly different from the reference. Each s_t is then associated with this average NLL score. The


[Figure 4.10 plot: ROC curves (true positive rate vs. false positive rate, FPR axis 0–0.6) for the HS, HS10, and HWOnly models.]

Figure 4.10 ROC graph for prediction of the "sweep" A&A event with a 100-step lookahead.

[Figure 4.11 plot: ROC curves (true positive rate vs. false positive rate, FPR axis 0–0.6) for the HS, HS10, and HWOnly models.]

Figure 4.11 ROC graph for prediction of the "sweep" A&A event with a 1000-step lookahead.


NLL score for the predicted needle tip position of state s_{t+1}, based on that of the previous state, compared to the reference needle tip position of state s_{t+1}, is defined as follows:

NLL(p_{t+1} \mid s_t) = \frac{(v_{actX}^{pred}(t+1) - v_{refX}(t+1))^2}{Var(actX) + Var(refX - actX)} + \frac{(v_{actY}^{pred}(t+1) - v_{refY}(t+1))^2}{Var(actY) + Var(refY - actY)} + \frac{(v_{actZ}^{pred}(t+1) - v_{refZ}(t+1))^2}{Var(actZ) + Var(refZ - actZ)}    [4.6]

where v_{actX}^{pred}(t+1) is the predicted value of the actual needle tip position on the x axis, v_{refX}(t+1) is the reference needle tip position on the x axis, which is provided as a parameter a priori, Var(actX) is the variance of the CPD (in our case either the regression tree model or the linear Gaussian model) of variable actX over the training data, and Var(refX - actX) is the variance of the difference between refX and actX over the training data. The same logic applies to the Y and Z notations in the equation. Having defined NLL(p_{t+1} | s_t), we can now define a score for a state i in the prediction task as follows:

score_P(i) = \frac{1}{k} \sum_{t=i}^{i+k-1} NLL(p_{t+1} \mid s_t)    [4.7]

where k is a user-specified parameter indicating how far to look ahead. Again, the higher the score, the more likely that an A&A event occurs at the corresponding time point. The ground truth for each s_t is set as follows: if an A&A event happens within k steps, it is labeled "positive"; otherwise it is labeled "negative". We then, as before, construct an ROC graph from these predictions.
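A sketch of the k-step rollout scoring in Equations 4.6 and 4.7; the dynamics function, variances, and reference trajectory here are toy assumptions, not the thesis models:

```python
import numpy as np

# Sketch of the prediction score: roll the model's most likely trajectory
# forward k steps from s_t, compare predicted needle tip positions against
# the reference, and average the normalized squared errors.
def nll_tip(pred_xyz, ref_xyz, var_act, var_diff):
    """Equation 4.6: per-axis squared error over the combined variances."""
    return float(np.sum((pred_xyz - ref_xyz) ** 2 / (var_act + var_diff)))

def score_prediction(step, reference, s0, k):
    """Equation 4.7: average NLL of the most-likely k-step rollout."""
    var_act = np.array([0.1, 0.1, 0.1])      # CPD variances (assumed)
    var_diff = np.array([0.05, 0.05, 0.05])  # Var(ref - actual) (assumed)
    total, s = 0.0, s0
    for t in range(k):
        s = step(s)                          # mean of the predicted Gaussian
        total += nll_tip(s, reference[t], var_act, var_diff)
    return total / k

drift = lambda s: s + np.array([0.0, 0.0, 0.1])   # toy tip dynamics
ref = np.array([[0.0, 0.0, 0.1 * (t + 1)] for t in range(10)])
print(score_prediction(drift, ref, np.zeros(3), 10))  # ~0 when on-reference
```

A rollout that tracks the reference scores near zero; one that drifts away within k steps accumulates a large average NLL, which is what flags an impending sweep.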

Experimental results are shown in Figure 4.9, Figure 4.10, and Figure 4.11, with k = 10 (0.01 s), k = 100 (0.1 s), and k = 1000 (1 s) for the sweep event, respectively. From the results, we observe that HS and HWOnly both do very well, while HS10 lags behind. The reason the HWOnly models perform as well as the HS models is that only (a subset of) the hardware variables are evaluated to calculate the NLL scores (because the reference trajectory contains only those variables). As a result, the advantage of predicting the software variables is limited in this setting. On the other hand, the HS10 models are not as good at prediction. This is probably because predicted values for software variables are used to further predict hardware variables


A&A event        Classification      Detection
Encoder failure  All perfect         HS10 > HS > HWOnly
Sweep            HS > HS10 > HWOnly  HS10 > HS ≈ HWOnly
OOW              All perfect         HS10 > HS ≈ HWOnly

Table 4.2 Comparison of models on classifying and detecting different A&A events.

Lookahead (ms)  Prediction
10              HS ≈ HWOnly > HS10
100             HS ≈ HWOnly > HS10
1000            HS ≈ HWOnly > HS10

Table 4.3 Comparison of models on predicting the "sweep" event.

when looking ahead, and the HS10 models for those software variables are not all good enough.

Table 4.2 and Table 4.3 summarize the empirical results. To conclude, all of the DBNs we test perform well at the different tasks in our experiments. They are all good at classifying trajectories when there is a strong signature among the hardware variables, while HWOnly is not as good at classification when only some software variables behave abnormally during the A&A events. For the detection and prediction tasks, HS10 is best at detecting A&A events after the fact, and HS and HWOnly are best at predicting A&A events before they happen. However, we also see that even where it is not the best, HS is a close second, indicating that this model is a good compromise across all three tasks that we consider. These results also indicate that DBNs of the type we use can be successful in identifying A&A events, and that modeling elements of the software state can be helpful for classification and detection of these events.


Chapter 5

Conclusion and Future Work

In this work, we have proposed a framework to improve the safety and reliability of medical robotic surgery systems. We have described a prototype that addresses some of the subproblems and presented encouraging preliminary results that indicate the feasibility of building on this foundation. Specifically, DBNs are designed and used to model the behavior and dynamics of the simulator of a RoS system, i.e., SABiR. We show empirically that the DBNs are able to effectively relate collected data to the occurrence of A&A events.

Despite the good performance of the DBN models, much remains to be done for the whole framework to be realized. The simulation and controller can be improved by adding path planning and a simulation of image guidance, by improving the software architecture to enable more complex high-level actions to be performed and to collect more data about the software execution, and by extending the statistical analysis to compute "usage envelopes" that represent regions of the state space where the robot can operate reliably, without many A&A events. It is also important to consider alternative A&A events in different categories. Last but not least, it is important to integrate with the real hardware, evaluate how well our model can detect A&A events there, and use the results to refine the software/simulation framework.


Appendix A: State variables and DBN structures

Type       Name                  Components                           Num
Parameter  Reference Trajectory  Reference position (X, Y, Z)         3
                                 Reference orientation (X, Y, Z)      3
                                 Begin position (X, Y, Z)             3
                                 Begin orientation (X, Y, Z)          3
                                 End position (X, Y, Z)               3
                                 End orientation (X, Y, Z)            3
           Ready Point           Ready position (X, Y, Z)             3
                                 Ready orientation (X, Y, Z)          3
           Insert Distance       Insert distance                      1
Software   Action                Action name: move-needle,
                                 insert-needle, extract-needle,
                                 move-home                            1
           Insert Depth          Depth of needle inside tissue        1
           Needle Force          Force on the needle (X, Y, Z)        3
           Needle Torque         Torque on the needle (X, Y, Z)       3
           Motor Position Error  Errors between actual and reference
                                 motor joint angles                   5
           Motor Speed Error     Errors between actual and reference
                                 motor speeds                         5
Hardware   Needle Position       Needle tip position (X, Y, Z)        3
                                 Needle tip orientation (X, Y, Z)     3
           Motor Position        Motors' joint angles                 5
           Motor Torque          Motors' torques                      5
                                 Total:                               59

Table A.1 All state variables in the DBNs.


Set        Variable Name         Num
RVSet All  Needle Position       6
           Motor Position        5
           Motor Torque          5
           Insert Depth          1
           Needle Force          3
           Needle Torque         3
           Reference Trajectory  18
           Ready Point           6
           Insert Distance       1
           Action                1
           Motor Position Error  5
           Motor Speed Error     5
           Total:                59
RVSet 1    Needle Position       6
           Motor Position        5
           Motor Torque          5
           Insert Depth          1
           Needle Force          3
           Needle Torque         3
           Reference Trajectory  18
           Ready Point           6
           Insert Distance       1
           Action                1
           Total:                49

Table A.2 Relevant variable sets for DBN structures. See Table A.3 for DBN structures.


Type         Variable Name         Num  Relevant Variable Set
Predictable  Needle Position       6    RVSet 1
             Motor Position        5    RVSet All
             Motor Torque          5    RVSet All
             Insert Depth          1    RVSet 1
             Needle Force          3    RVSet 1
             Needle Torque         3    RVSet 1
             Total:                23   -
Observable   Reference Trajectory  18   -
             Ready Point           6    -
             Insert Distance       1    -
             Action                1    -
             Motor Position Error  5    -
             Motor Speed Error     5    -
             Total:                36   -

Table A.3 DBN structures.


Appendix B: Workspace, environment, and parameters

This appendix contains information about the workspace of SABiR and the simulator. As shown in Figure B.1, there is a home position and two nested gels.

B.1 Parameters

The coordinates of the initial needle tip position (home position) are¹:

(-10.7505, -206.2838, 330.8692)

The initial needle tip orientation is:

(-0.0327, 0.0020, 0.9995)

The coordinates of the corners of the outside gel are:

(0 -240 360), (0 -180 360)

(0 -240 340), (0 -180 340)

(-20 -240 360), (-20 -180 360)

(-20 -240 340), (-20 -180 340)

The coordinates of the corners of the inner gel are:

(-7 -220 350), (-7 -200 350)

(-7 -220 345), (-7 -200 245)

(-12 -220 350), (-12 -200 350)

(-12 -220 345), (-12 -200 245)

¹All coordinates here are given in millimeters.


Figure B.1 Workspace of the SABiR simulator.

Figure B.2 Normal trajectory.


Appendix C: DBN models for normal trajectories

Variables HS HS10 HWOnly

Needle Tip Position X: 1 1 1

Needle Tip Position Y: 1 1 1

Needle Tip Position Z: 1 1 1

Needle Tip Orientation X: 1 1 1

Needle Tip Orientation Y: 1 1 1

Needle Tip Orientation Z: 1 1 1

Motor Joint Angle 1: 1 1 1

Motor Joint Angle 2: 1 1 1

Motor Joint Angle 3: 1 1 1

Motor Joint Angle 4: 1 1 1

Motor Joint Angle 5: 1 1 1

Motor Torque 1: 0.363 0.305 0.161

Motor Torque 2: 0.359 0.297 0.111

Motor Torque 3: 0.289 0.194 0.054

Motor Torque 4: 0.377 0.322 0.162

Motor Torque 5: 0.374 0.258 0.107

Average 0.784 0.773 0.725

Table C.1 r² values of different models on predictable hardware state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction. HWOnly: models using only hardware variables.


Variables HS HS10

Insert Depth: 1 1

Needle Tip Force X: -0.202 0.227

Needle Tip Force Y: 0.402 0.452

Needle Tip Force Z: 0.792 0.810

Needle Tip Torque X: 0.075 0.173

Needle Tip Torque Y: -0.029 0.344

Needle Tip Torque Z: 0.866 -0.231

Average 0.415 0.397

Table C.2 r² values of different models on predictable software state variables. HS: models using hardware/software variables. HS10: models using hardware/software variables with 10-step prediction.

