Post on 30-Oct-2015
DEVELOPMENT AND IMPLEMENTATION OF RISK
STRATIFICATION TOOLS
Practical tools to identify patients with complex needs
O+berri, Working Paper
Contents
Foreword by Geraint Lewis
1. Aim
2. Background
3. Context
4. International experiences
5. Spanish National Health Service experiences
Bibliography
Appendix I. Gauging the precision of predictive models
Authors
Roberto Nuño Solinís, Director of the Basque Institute for Healthcare Innovation (O+berri).
Juan Carlos Contel Segura, Chronicity Prevention and Care Programme, Catalonian Health Department.
Juan F. Orueta Mendia, Researcher, Basque Institute for Healthcare Innovation (O+berri) and Osakidetza.
Arturo García Álvarez, Researcher, O+berri and Kronikgune.
Geraint Lewis, Chief Data Officer of the NHS in England.
Other contributors
Bernardo Valdivieso Martínez, Director of Planning, LA FE University Hospital. Valencia LA FE Health Dept. Valencia Health Agency.
Eduardo Zafra Galán, Director of Chronic Patient Care Services and Medium to Long Stay Hospitals. General Healthcare Directorate. Valencian Health Agency. Valencian Regional Government.
Francisco Ródenas Rigla, Professor and Research Fellow, PoliWelfare Research Center. Valencia University.
Foreword by Geraint Lewis
For many years, researchers and commentators across the developed world have been warning about
a cluster of cost pressures that our health care systems will face in the 21st Century. Populations are
ageing; chronic diseases are becoming more prevalent; and new, expensive technologies continue to
be introduced. However, the 2008 financial crisis dramatically compounded this situation.
Now, in its aftermath, radical efficiency gains are required to ensure that universal health care remains sustainable for future generations.
The use of health care resources is highly skewed across a population, with a relatively small number
of people accounting for a disproportionate amount of cost. In the Basque Country, for example, 1
per cent of individuals account for 21 per cent of health care expenditure. Consequently, thousands
of Euros could potentially be spent on improving the health status of each high-risk individual while
still yielding considerable net savings for the health system as a whole from averted costly episodes of
care. Unfortunately, while such a preventive policy is highly attractive to policymakers, it is beset by a
number of challenges.
First is how to identify which patients will truly experience a costly, adverse health outcome during a future time period (e.g., the individuals who will have multiple unplanned hospital
admissions in the next 12 months). Studies have shown that doctors and nurses are generally unable
to make accurate predictions of this type, and also that simple, rules-based criteria tend to perform no
better. Instead, more sophisticated statistical methods may be required. These models, which are called
predictive risk models or risk stratification tools, are used extensively in insurance-based health care
systems; however, until recently, they have not been used widely within public health care systems.
A second challenge is to identify the subgroup of high-risk individuals who are most likely to respond to a preventive intervention. Tools for selecting these high-risk, high-opportunity
individuals are known as intervenability models or impactibility models and they offer the potential to
improve the efficiency of preventive care by ensuring it is only offered to patients who are most likely to benefit. Again, there has been relatively little interest in their use within public health care systems.
A key consideration here, however, is to ensure that the use of such models does not inadvertently
worsen health care inequalities (e.g., by excluding patients who have difficult social circumstances).
The third, and probably the toughest, challenge is to design interventions that successfully and cost-effectively mitigate the risk of the costly adverse health outcome (e.g.,
an intervention that reduces the risk of readmission to hospital). Here, the literature contains many
examples of preventive interventions that were either ineffective, cost-ineffective, or which would not
be generalizable to a public health care system.
The final challenge is to evaluate the impact of a preventive health care programme in a robust
manner, and to ensure that the lessons learnt are used to refine the preventive programme accordingly.
Unfortunately, many preventive programmes have been evaluated using unsatisfactory methods, such
as pre/post analyses, which do not allow us to determine the true efficacy of the programme.
This, then, is a story of formidable challenges coupled with exciting opportunities. The sociodemographic
trends are clear, as is the potential to save money while improving the care of the most vulnerable members
of society; however, the methodological challenges of doing so should not be underestimated.
The encouraging news is that there are examples from across Spain of local health care and social
care teams that are rising to these challenges. Therefore, as well as providing an introduction to
predictive modelling in the Spanish context, a key purpose of this report is to raise awareness of these
local projects and to help ensure that, as the evidence begins to accrue, proven interventions are
diffused across the country as rapidly as possible.
1. Aim
This paper aims to guide the leaders of healthcare organisations interested in implementing
predictive risk stratification models.
We outline the basic concepts of predictive modelling, describe some of the models that have been
developed internationally, and document some current experiences in the Spanish National Health
Service.
This paper focuses on the use of predictive modelling to identify subpopulations of patients with complex, chronic needs and how to intervene to mitigate the risk of adverse events through more efficient use of health care and social care resources.
It is important to underline that the models described here are not designed to be used for other
purposes, such as for insurance, recruitment or finance.
2. Background
Over the past few decades, increases in life expectancy, combined with changes in lifestyle factors, have
led to a marked increase in the prevalence of certain chronic diseases and especially multimorbidity.
Demographic shift. The populations of many developed countries are ageing, and Spain is no
exception. Within fifty years, a third of the Spanish population will be aged over 65 and more than
12% will be over eighty.
Epidemiological shift. Ageing and changes in lifestyle are leading to an increase in the prevalence of chronic conditions and
multimorbidity.
Economic impact on the healthcare system. Chronicity and multi-morbidity have a considerable
impact on healthcare system costs (see Figure 1). Moreover, the cost of care increases dramatically
for patients with multiple co-morbidities (see Figure 2) such that these patients account for a large
proportion of healthcare budgets.
Figure 1. Relationship between the annual healthcare cost per patient and the number of chronic diseases
Figure 2. Percentage of total healthcare spending by number of chronic diseases
Yet today's healthcare systems were primarily intended to address acute episodes of illness and are not
designed to provide for the needs of patients with complex, chronic conditions. Currently, healthcare
is often inadequately coordinated for such patients, which impacts negatively on the quality of and
experience of care as well as increasing overall costs.
As a result of this disjuncture between patient needs and care provision, there is growing interest
among governments and healthcare organizations in many countries to redesign how health care and
social services are provided to people with chronic diseases.
To address the phenomenon of chronicity, healthcare services will need to develop new clinical,
organizational and managerial solutions.
In many countries, specific programmes have been put in place to monitor and treat particular chronic
diseases, such as diabetes or heart failure. However, programmes for patients with multi-morbidity
are often under-developed. The ability to identify, understand and work with these subpopulations is
important, not just for administrators, but also for clinical staff.
There is growing interest among policymakers in the Spanish National Health Service in how best
to approach the issue of chronicity and multi-morbidity. For example, the Spanish Ministry of Health, Social
Services and Equality has recently published a paper entitled "Strategy for addressing
chronicity in the Spanish National Health Service". This strategy calls for a working group to be
established that will review the planning, monitoring and evaluation of approaches to chronicity. The
Ministry has identified population stratification as one of the priority areas of work.
Likewise, some Spanish autonomous regions have already developed plans and strategies regarding
chronicity, and have published documents that refer explicitly to population stratification (see Box 1).
BOX 1 Examples of chronic disease strategies in some Spanish autonomous regions
Basque Country: In July 2010, the Basque Government's Department of Health published a "Strategy to meet the challenge of chronicity in the Basque Country", which set out a number of policies and projects to redesign the model for delivering healthcare for patients with long term conditions. In order to be both effective and efficient, proactive interventions must be offered to those patients whose care needs match the profile for which they were designed. Therefore the strategy calls for a system of population stratification using risk adjustment tools.
Catalonia: The Catalan Health Department has published a Chronicity Prevention and Response Plan (CPRP) to promote policies and projects aimed at improving care for patients with chronic conditions. The publication of this document coincided with the development of the 2011-2015 Healthcare Plan, a new model of healthcare that includes important strategic and operational proposals in the area of chronic disease management. Taken together, these documents and policies should transform the model of healthcare and social care for people with chronic diseases and complex social needs.
Andalusia: The Quality Plans produced over the past ten years by the Health Department of the Andalusian Government have brought about a cultural and strategic change that has helped to re-orientate the Andalusian public healthcare service towards the needs of citizens. One such initiative is the Andalusian Plan for Integrated Care for Patients with Chronic Diseases (APICP). The APICP was designed to address the uncertainties of clinical decision-making in patients with multiple co-morbidities by applying the multi-dimensional focus of chronic disease management models to the scenarios detailed in the Quality Plans. It also considers the efficiency of different preventive measures and the viability of existing administrative programmes. The foremost priority identified by the APICP was to strengthen the role of primary care in the Andalusian public healthcare service.
Valencia: The Valencia Health Department, in line with the Valencia Healthcare Agency, is developing a strategy of innovation in the care of patients with chronic conditions. In particular, it is implementing a new proactive care model that involves closer monitoring of patients at home. The goal of the strategy is to ensure longer periods of stability, reduce the frequency of decompensation, keep symptoms manageable and improve patients' quality of life. Ultimately the strategy is designed to reduce healthcare costs by reducing the consumption of resources during periods of deterioration.
Following the publication of the strategy, the Valencia Healthcare Agency has piloted and implemented a Chronic Disease Management Programme based on current policies and best available practices and evidence.
The programme includes the following components:
* Identification and stratification of the target population.
* Designing a care and treatment process that coordinates all care resources involved, based on the philosophy of case management.
* Documentation of guidelines and protocols and the design of specific educational programmes for the care of patients with chronic diseases.
* Promotion of the use of new technologies.
* Alignment of resources and incentives from the Department towards the strategic goals.
* Continuous assessment and improvement of quality across the Programme.
Similarly, the Valcronic programme is a plan for implementing new technologies in the care for chronic patients. Launched as a pilot scheme in the healthcare departments of Elche General Hospital and Sagunto, its implementation began in early 2012, focusing on four chronic diseases (hypertension, diabetes mellitus, heart failure and COPD). After an evaluation of the results, the programme is set to be extended to the rest of the Valencia autonomous region.
One way to evaluate chronic care models in the Spanish context is to use the IEMAC tool (Nuño et al, 2012). The tool provides a roadmap for healthcare systems to deliver better care to patients
with chronic illnesses, and the accompanying report contains a number of references to population
stratification, such as:
* Systems have been developed and implemented for population stratification which provide useful information for clinical decision-making and administration (intervention 1.3.2); and
* A predictive classification based on a forecast of care needs is available in the patient's medical records (intervention 6.1.1).
Finally, it is impossible to ignore the severe economic and financial pressures facing the Spanish
health care system. Set against this background, the preservation and sustainability of a universal
and equitable public healthcare system requires, more than ever, strategies for offering cost-effective preventive care to specific population subgroups.
3. Context
Healthcare organizations are becoming increasingly aware that they need accurate tools for identifying
chronic patients at greater risk of costly, adverse outcomes.
Risk stratification can be used to detect subpopulations with different risk levels and particular needs
profiles and as such represents a paradigm shift for health care services. Population stratification based
on risk prediction is a dynamic and rapidly evolving field.
Figure 3. A population stratification pyramid
The prediction of risk has many applications in everyday life, for example in the calculation of insurance
policy premia. Furthermore, while public healthcare systems have only become interested in these
tools relatively recently, prospective risk adjustment models have been in use for years in
insurance-based health systems. Indeed, their original purpose in health care was to provide adjustment
mechanisms for financing healthcare providers equitably.
García Goñi (2004) distinguishes between several types of risk adjustment model:
* Demographic models: Variables such as gender and age are entered into a regression model
to predict healthcare spending. The main advantage of this type of model is its simplicity; however, its
predictive accuracy is relatively low and it typically explains less than 5% of cost variability.
* Models based on previous costs: When previous costs are considered in addition to demographic
variables, up to 10% of cost variability can be explained. The main disadvantage of using previous
costs is that the higher the costs in any given year, the higher the predicted costs for the following
year. As a result, this model may lead to perverse incentives, since profligate organizations receive
greater rewards than those that practise improved efficiency and cost containment.
* Models based on diagnoses: These models combine demographic variables with distinct
categories of clinical diagnoses. Patients are classified into different levels of co-morbidity that are
expected to cost roughly the same amount (i.e., iso-consumption of resources). Examples of diagnostic
classification models are Adjusted Clinical Groups (ACGs) and Diagnostic Cost Groups (DCGs), described
in greater detail in section 4. Such models offer certain advantages (they tend to be more accurate than
demographic and cost models); however, they tend to be more difficult to apply and can sometimes
generate inappropriate incentives (e.g. incentivising clinicians to ascribe more severe diagnoses).
* Models based on prescriptions of pharmaceuticals: Information about prescribed medicines
can be used to predict the health costs experienced by different groups of people. Typically, the
predictive power of pharmaceutical models tends to be similar to the results achieved by diagnosis-
based models.
* Models based on health surveys: In health care systems that use Health Risk Assessments, the
variables recorded may be entered into a regression model to predict costs. The predictive power of
such models is generally no greater than that of diagnosis-based models. Moreover, there are problems
presented by the use of surveys, including the high costs of implementing the survey, selection bias,
and response bias.
* Models that use socioeconomic variables:
The relationship between poor health and poverty, poor education and social isolation is well known;
however, these variables are not widely used in patient stratification prediction models. There is often
no reliable information available at the individual level. Therefore, when they are used, social factors
are usually included in the form of small area indicators derived from the census. While the inclusion
of socioeconomic factors may be warranted to ensure equal care for less-favoured social groups, in
practice, their additional contribution to the explanatory power of predictive models is usually only
marginal (O+berri, unpublished data).
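The phrase "explains X% of cost variability" used in the list above refers to the coefficient of determination (R²) of a regression model. The following Python sketch shows how that figure could be computed for a demographic-only model. The ages and costs are invented purely for illustration, and `fit_line` and `r_squared` are hypothetical helper names rather than part of any tool cited in this paper.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(actual, predicted):
    """Share of the variance in `actual` that is explained by `predicted`."""
    y_bar = mean(actual)
    ss_total = sum((y - y_bar) ** 2 for y in actual)
    ss_resid = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_resid / ss_total


# Invented example data: age in years and annual cost in euros.
ages = [34, 47, 52, 61, 68, 73, 79, 85]
costs = [250, 2100, 300, 450, 5200, 700, 900, 3100]

intercept, slope = fit_line(ages, costs)
predicted = [intercept + slope * age for age in ages]
print(f"Age alone explains {r_squared(costs, predicted):.0%} of cost variability")
```

Adding further predictors (previous costs, diagnoses, prescriptions) would require a multivariable regression, but the explained-variability figure is read in the same way.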
We suggest that policymakers should consider six characteristics of any risk prediction model:
First, their purpose. Historically, predictive models have been used as adjustment mechanisms for
financing and contracting services, and for ensuring equitable distribution of resources among different
regions or providers. However, another notable use is the identification of people with
certain risk profiles, also known as case finding, which is the focus of this paper.
The event being predicted. All predictive models describe the risk of a specific, adverse event, such
as urgent or unplanned admission to hospital, institutionalization, re-admission to hospital, death or
high healthcare costs.
The source of data used. All predictive models rely on data, which must describe both the event to be predicted (the dependent or outcome variable) and a range of risk factors from an earlier time
period that might have predicted this event (known as the independent, explanatory or predictor
variables). The availability of information is therefore a key issue for predictive models. All data sources
have their limitations, be they administrative data or specially collected data. The three main types of
data included in predictive models are routine data from administrative databases, survey data, and
data from electronic health records (EHRs). See Box 2.
Clearly, the availability of data determines which explanatory variables are entered in the model. The choice of variables must not be based solely on statistical criteria but should consider other factors,
such as transparency, the ease with which results are interpreted, resistance to manipulation of data or
flexibility to adapt to structural changes within healthcare systems.
The period during which the risk is predicted. For example, many predictive models predict
which patients in a population will be admitted to hospital in the next 12 months.
The type of statistical techniques used. Although neural network and other complex predictive models might seem more appropriate, authors agree that for simplicity, ease of interpretation and the
quality of the results, linear or logistic regression is preferable.
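To make these characteristics concrete, the sketch below shows how a fitted logistic regression model could turn a patient's predictor variables into a 12-month admission risk and then rank a population for case finding. The coefficients, variable names and patient records are all invented for illustration; a real model would estimate its coefficients from historical data.

```python
import math

# Hypothetical coefficients (log-odds); not taken from any published model.
INTERCEPT = -4.0
COEFFICIENTS = {
    "age_over_75": 0.8,
    "prior_admissions": 0.6,    # unplanned admissions in the previous 12 months
    "chronic_conditions": 0.4,  # number of diagnosed chronic diseases
}


def admission_risk(patient):
    """Predicted probability of an unplanned admission in the next 12 months."""
    log_odds = INTERCEPT + sum(
        weight * patient[name] for name, weight in COEFFICIENTS.items()
    )
    return 1 / (1 + math.exp(-log_odds))


def top_stratum(patients, fraction):
    """Return the highest-risk share of the population, highest risk first."""
    ranked = sorted(patients, key=admission_risk, reverse=True)
    size = max(1, round(len(ranked) * fraction))
    return ranked[:size]


# Invented patient records for illustration only.
population = [
    {"id": "A", "age_over_75": 0, "prior_admissions": 0, "chronic_conditions": 1},
    {"id": "B", "age_over_75": 1, "prior_admissions": 3, "chronic_conditions": 4},
    {"id": "C", "age_over_75": 1, "prior_admissions": 0, "chronic_conditions": 2},
    {"id": "D", "age_over_75": 0, "prior_admissions": 1, "chronic_conditions": 0},
]

for patient in top_stratum(population, fraction=0.25):
    print(patient["id"], round(admission_risk(patient), 2))
```

Because each coefficient maps directly to one predictor, a clinician or manager can see why a given patient was ranked highly, which is the ease of interpretation that favours regression over more complex techniques.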
BOX 2 Data for predictive modelling in Spain
At present, there is a wealth of information on Spanish National Health Service use, which is held both in administrative databases and in primary care electronic medical record (EMR) systems. In some areas, these data are already being exploited to guide population-based health care approaches. However, in many autonomous regions in Spain, the data are held in separate silos (primary care, hospital, social care), which can act as a barrier. In primary care, for example, vast quantities of coded data have been collected over the years on patient diseases (medical and nursing diagnoses) and pharmacy. Hospitals also have the basic minimum data set (hospital care) (MDS). This information is stored separately and it is rare for it to be used in a co-ordinated fashion, even though it would be feasible to devise ad-hoc systems to do this or to use groupings designed elsewhere (such as ACGs, DxCGs and CRGs). There are also other sources of information independent of healthcare organizations, such as records of recipients of social care services or those related to the socio-economic status of areas of residence, which can be used as complementary data to determine the healthcare requirements of a given population.
Identifying individuals with certain risk profiles: Why? For what purpose?
The basic philosophy behind case finding is to predict costly, adverse events in order to improve and
protect an individual's health status and make net savings. This preventive logic involves replacing the
traditional reactive care model with a proactive model, where upstream investments are offered with
the aim of averting downstream costs.
A one-size-fits-all approach to proactive care is inadequate because the future costs of patients vary
so widely. Consequently, there is a need for care programmes that are tailored to the characteristics
and expected needs of each subpopulation. The optimal interventions to offer to different strata of risk have
yet to be confirmed conclusively; however, most Care Management Programmes (CMP) draw upon the
disciplines of disease management, geriatric intervention in fragile and vulnerable patients, and case
management.
The purpose of identifying which chronic patients are at greatest risk is not about identifying people
who are currently in a certain state (such as patients who are currently experiencing costly treatment or
repeated hospitalizations). Rather, it is to identify in advance which people will meet a certain profile in
the future. By focusing on anticipated future scenarios, predictive models facilitate the development of
proactive strategies aimed at avoiding the health impact and cost of future adverse events. Predictive
models therefore act almost as a radar for identifying high-risk subpopulations. Once identified, these
individuals can be offered specifically-tailored preventive care models. For example, in the United States,
the Kaiser Permanente health care system considers unplanned hospital admissions as a failure of the
healthcare system and offers a range of interventions designed to prevent them from happening.
+ Alternative Case Finding Methods
Common questions that arise in discussions about predictive models include:
* Are there no easier and simpler alternatives?
* Is such predictive work not the job of clinical staff, based on their knowledge and experience?
There are indeed alternative approaches, as summarized below; however, there are disadvantages
associated with each method.
Criteria-based identification. These simple tools are popular in clinical environments due to their
intuitive nature. However, methods that select high-risk patients solely on criteria (e.g., aged 65+ and 2
or more hospital admissions in the last year) are inefficient at predicting risk of hospitalisation because
of selection bias and a phenomenon called regression toward the mean (Roland et al, 2005). Their
performance is estimated to be half that of predictive models (Cousins 2002).
Identification based on clinical knowledge. Although clinical staff are able to identify
current complex patients accurately using their knowledge, skills and expertise, their ability to predict
which patients will be at a high risk of admission in the future appears to be no better than chance
(Allaudeen et al 2011).
Combination of predictive models and clinical knowledge. Recent studies (Freund, 2011)
suggest that a combination of predictive modelling and clinical expertise can deliver the most accurate
predictions, particularly when the output from the models is filtered by primary care doctors. Although
this approach is very attractive (and is to a great extent the inspiration for this document) it has yet to
be fully evaluated.
+ The concept of impactibility
Impactibility models (Lewis, 2010) seek to refine the output of predictive models by identifying
sub-groups of high-risk patients that will benefit most from preventive programmes and proactive
intervention. These models prioritize patients according to characteristics that are indicators of how
well they will likely respond to an intervention. Patients that are unlikely to benefit are excluded, and
patients who are highly likely to benefit are prioritized. For example an impactibility model might
prioritize patients with an Ambulatory Care Sensitive Condition (ACSC), such as COPD, heart failure
or asthma. By definition, patients with an ACSC should not require hospitalisation if their primary care
is optimised. Another example of an impactibility model prioritises patients with more gaps in their
care (each gap representing an unaddressed need, based on evidence-based guidelines). Gaps may
relate to overdue screening (e.g., diabetic foot check), missing vaccinations (e.g. pneumococcus), and
inappropriate medication.
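The two impactibility examples above (ACSCs and gaps in care) can be sketched as a filter-and-sort step applied to the output of a predictive model. The patient records, the risk threshold and the field names below are hypothetical, and combining both criteria in a single function is a choice made here for brevity rather than a description of any published model. The ACSC set contains only the three conditions named in the text.

```python
# Ambulatory Care Sensitive Conditions named in the text (not a complete list).
ACSC = {"COPD", "heart failure", "asthma"}


def prioritise(patients, risk_threshold):
    """Keep high-risk patients with an ACSC, ordered by number of care gaps."""
    candidates = [
        p for p in patients
        if p["risk"] >= risk_threshold and ACSC & set(p["conditions"])
    ]
    # More unaddressed care gaps first; ties are broken by higher predicted risk.
    return sorted(candidates, key=lambda p: (-len(p["care_gaps"]), -p["risk"]))


# Invented records: "risk" stands for the output of a predictive model.
patients = [
    {"id": "P1", "risk": 0.62, "conditions": ["COPD", "diabetes"],
     "care_gaps": ["pneumococcal vaccination"]},
    {"id": "P2", "risk": 0.71, "conditions": ["heart failure", "diabetes"],
     "care_gaps": ["diabetic foot check", "medication review"]},
    {"id": "P3", "risk": 0.80, "conditions": ["dementia"], "care_gaps": []},
    {"id": "P4", "risk": 0.15, "conditions": ["asthma"],
     "care_gaps": ["pneumococcal vaccination"]},
]

for patient in prioritise(patients, risk_threshold=0.5):
    print(patient["id"], len(patient["care_gaps"]))
```

Note that a filter of this kind drops patient P3 despite the highest predicted risk, which is exactly the sort of exclusion whose equity implications need to be reviewed before such a rule is used in practice.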
Another approach to impactibility modeling that is sometimes advocated is to exclude the very highest
risk patients on the assumption that they are somehow beyond help. However, the evidence is in
fact in favour of intervention in these cases (Krause 2005; Peikes et al, 2009). Even in advanced
stages of illness, there is a whole range of care and support options that have been shown to be cost-
effective in improving patients' quality of care while reducing net expenditure.
All impactibility models are designed to improve the efficiency of preventive resource allocation.
Models based on ACSC and gaps in care should improve equity because the prevalence of both is
higher in more socioeconomically deprived areas. However, certain other types of impactibility model
present serious issues of ethics and equity (Lewis, 2010). For example, excluding patients with criteria
suggestive of a poor response risks barring people with mental illnesses or addictions, or who cannot
speak the local language, etc.
Selecting patients for Care Management Programmes
Care Management Programmes (CMPs) are complex, multi-component healthcare interventions.
When considering the efficacy of such programmes, it is essential to be aware of the context in which
they were evaluated. For example, some of the earliest published studies of CMPs were conducted
in countries lacking universal public health insurance. Given the very different circumstances in these
countries compared to the Spanish national healthcare system, the study findings should be treated
with caution. For example, studies conducted overseas sometimes propose inclusion criteria that
would be unsuitable in the Spanish context, making their findings difficult to extrapolate to our public healthcare
system. Fortunately, however, there are also examples of CMPs that have been evaluated in more
similar organizations, such as the NHS in Northern Ireland (http://www.northerntrust.hscni.net/about/1277.htm).
In a 2012 study, Freund (2012) examined the practical application of predictive models by German
primary care doctors and developed a conceptual framework of care sensitivity. This concept, which
is closely related to impactibility, reflects the probability that a patient will respond successfully to
the CMP being offered. Freund's framework consists of three principal components: first, that the
patient's needs can be met by the CMP; second, that the patient is willing to participate; and third,
that the patient is able to participate correctly. The last two items are determined by doctors based
on prior knowledge of the patient. We would argue, however, that a patient with high care needs
whose doctor thinks they are unlikely to follow a particular plan of treatment should be offered a different
intervention. Indeed, in the Spanish National Health Service, where the standard of care offered to
patients with chronic conditions is high by international standards, the most pressing need is for CMPs
to be developed for such patients.
In fact, there is a dilemma when it comes to identifying cases: Is it a question of finding the patients that best suit a programme or finding a programme that best suits the patients? In a universal, public healthcare system, it is probably the latter that we should be aiming for in the medium term; however, this will require a concerted research effort and continuous feedback
from the CMPs and the results they achieve.
Finally, even the concept of a programme, as defined in the care management literature, could lead
to misunderstanding. The purpose of identifying high risk patients is to adapt the care they are offered
based on their anticipated future needs. This alteration does not necessarily require the involvement of
separate care providers, nor the creation of new roles. Whether or not new providers become involved,
a common feature of most successful programmes is to provide staff with additional training in advanced
care skills.
Some of the best known programmes for high-risk patients that involve predictive identification are:
* ACOVE (Assessing Care of Vulnerable Elders)
* GRACE (Geriatric Resources for Assessment and Care of Elders)
* Virtual Wards
* Guided Care (see Box 3)
* Care Management Plus
BOX 3 Guided Care
The Guided Care intervention (Boult, 2008) is a CMP that was developed at Johns Hopkins University in the United States. Based on the principles of Wagner's Chronic Care Model (Wagner et al, 2001), Guided Care uses a predictive model to select which individuals should be offered proactive care. The model generates a risk evaluation for each patient based on their
* diagnostic categories (asthma, heart failure...),
* number of urgent hospital admissions and
* demographic variables (age, gender...).
The tool identifies patients at high risk of emergency hospital admission and sets out a proactive intervention plan for them using a case management model. There are several potential actions that may be indicated, including evaluation, care planning, monitoring, coaching, promoting self-care, education and support for caregivers, coordinating transitions in care, and improving access to community services. Full information on the results of the programme is available from: http://www.guidedcare.org/publications.asp. This programme has been implemented in several regions of the USA by Kaiser Permanente and other organizations (Bodenheimer, 2009).
A Spanish report on predictive factors and associated interventions (Garcia-Perez et al, 2009)
identified some additional programmes, including the Care Transitions Programme
(http://www.innovativecaremodels.com/care_models/12/leaders) by Eric Coleman and Transitional Care
(http://www.innovativecaremodels.com/care_models/21/leaders) by Mary Naylor.
Evaluating Care Management Programmes
Unfortunately, the evidence for the success of CMPs is currently limited (Purdy 2010), with one of the
principal reasons being that CMPs are often evaluated using suboptimal research methods. Evaluations that
lack a valid control group should be avoided. Examples include:
Pre/post studies: Studies that compare an individuals experiences before and after an intervention are
inadequate because they fail to account for the phenomenon of regression to the mean. For example,
a pre/post study might find statistically significant reductions in hospitalisation rates for patients after
they began a CMP. Without a control group, however, it is impossible to determine whether this is due
to the intervention or whether it would have occurred anyway due to regression to the mean.
Area-based studies: Studies that compare regions that introduced a CMP with other regions that
did not introduce a CMP are unreliable because of a phenomenon called the ecological fallacy. This is
an error of logic that may occur when inferences about individuals are drawn solely from an analysis
of aggregated group data. For example, unless the CMPs were allocated randomly, it is likely that the
regions that introduced a CMP differed systematically from the regions that did not introduce the CMP.
These other differences may be responsible either wholly or partly for any differences seen.
Self-reported studies: Sometimes, clinicians working in a CMP may be asked to record instances
where they believe they prevented an adverse outcome (e.g., when they prevented a hospital admission
from occurring). Unfortunately, such self-reported data is subject to a number of cognitive biases and
is therefore unreliable without a control group.
More reliable evaluation methods include randomised controlled trials, propensity-matched cohort
studies, and regression discontinuity analyses.
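The regression-to-the-mean problem described for pre/post studies can be illustrated with a small simulation. The sketch below is not taken from any of the cited evaluations: the population size, the admission-rate distribution and the enrolment rule (two or more admissions in year 1) are all hypothetical assumptions. No intervention is simulated, yet the "enrolled" group still shows fewer admissions in year 2.

```python
import random

random.seed(42)

N_PATIENTS = 20_000  # hypothetical population size


def yearly_admissions(rate):
    """Crude count model: 12 monthly chances of admission, each with probability rate/12."""
    p = min(1.0, rate / 12)
    return sum(1 for _ in range(12) if random.random() < p)


# Hypothetical population: each patient has a stable underlying admission rate
# (mean 0.3 admissions per year). Nobody receives any intervention.
rates = [random.expovariate(1 / 0.3) for _ in range(N_PATIENTS)]
year1 = [yearly_admissions(r) for r in rates]
year2 = [yearly_admissions(r) for r in rates]

# "Enrol" patients the way a pre/post study would: high users in year 1.
enrolled = [i for i in range(N_PATIENTS) if year1[i] >= 2]

mean_before = sum(year1[i] for i in enrolled) / len(enrolled)
mean_after = sum(year2[i] for i in enrolled) / len(enrolled)

print(f"Enrolled patients: {len(enrolled)}")
print(f"Mean admissions, year 1: {mean_before:.2f}")
print(f"Mean admissions, year 2: {mean_after:.2f}")
```

Because the fall occurs without any intervention, a pre/post comparison alone cannot attribute it to a CMP; a control group selected in the same way would show the same drop.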
4. International experiences
4.1 US experiences
Some of the best known predictive tools are Adjusted Clinical Groups (ACG), Diagnostic Cost Groups (DCG) and Clinical Risk Groups (CRG). See Box 4. All three systems were designed in the United States and are robust from a statistical perspective and versatile in their
applications. Their usefulness has been demonstrated in public and private healthcare organizations
over the course of many years. They explain a considerable proportion of the variability in healthcare use
within a population and provide an estimate of each individual's consumption of healthcare resources in
the following year. The most recent versions of the systems combine information from diagnoses,
prescriptions, previous costs and the use of certain medical and surgical procedures. In many countries,
these models have now replaced simpler, less accurate models that were based solely on demographic
data or prior resource use.
BOX 4 Patient classification system
Adjusted Clinical Groups (ACG)
This system, which was developed by researchers at Johns Hopkins University, uses information from diagnoses, prescriptions, relevant procedures and healthcare costs. ACGs are 94 mutually exclusive categories, into which each person is classified based on their age, gender and their combinations of diagnoses recorded over a twelve-month period. The diagnoses are also classified in three other ways:
* 264 Expanded Diagnosis Clusters (EDCs), based on the clinical features of the patient's health conditions;
* Hospital-dominant diagnosis (HOSDOM), based on diseases signalling a high risk of hospitalization in the following year;
* A medical frailty marker, which is a dichotomous variable.
Within the ACG system, drugs are classified into 69 categories, called Rx-MGs, based on the diseases that can be identified from them. A patient is considered to have received a high number of different drugs when the annual total is more than 12.
Not all of these groups are included as variables in the ACG predictive models (ACG-PM). For example, 180 diagnostic categories are included (34 categories for ACGs, 101 EDCs, four categories for HOSDOM, plus the frailty marker) and 65 for prescriptions (64 Rx-MGs, high number of drugs). The two generic versions of the system (for people above and below 65) make predictions based on the case-mix calibration performed by the authors.
Diagnostic Cost Groups (Verisk Health's DxCG DCG_Methodology)
This system was devised by researchers working at Boston University and contains a suite of different predictive models. The choice of model to use depends on the explanatory variables (age, gender, diagnoses, prescriptions, costs), populations (commercial, Medicaid or Medicare) and the response variables to be predicted (total cost, hospital admissions, pharmaceutical costs).
The classification of diagnoses is based on their clinical homogeneity. The International Classification of Disease (ICD-9-CM) codes are categorized into 1,013 Dx-groups, which in turn may be condensed to 394 Condition Categories (CC), then into 117 Related Condition Categories (RCC), and finally into 31 Aggregated Condition Categories (ACC). Although all of these groups can be used to describe patient population morbidity, the CCs are the ones used in predictive models.
Likewise, prescribed drugs are classified into 203 Rx Groups for use in the predictive models, which in turn can be condensed to 18 Aggregated Rx Groups.
Dx groups and prescriptions are processed hierarchically to avoid the overuse of patient-level codes in the predictive models, and to account for variations in providers' coding patterns. As a result, only those categories that reflect the most severe manifestation of each health problem are included.
Clinical Risk Groups (CRG)
This system, which was developed by the 3M Company, uses diagnoses and procedures to classify patients. The classification process begins by condensing the ICD-9-CM diagnosis codes into 537 EDCs (Episode Disease Categories) and the procedures into 640 EPCs (Episode Procedure Categories). Particular combinations of EDCs and EPCs, and their chronological sequence, may lead to some new categories being generated and others eliminated. The next step is to select the most relevant diseases. The combination of these diagnoses with other diagnoses determines a level of severity. Finally, each patient is classified into a single CRG, based on their combination of diagnostic groups. The 1,076 CRGs are therefore mutually exclusive categories. They can be regrouped to achieve the desired level of granularity.
Although extremely useful, these instruments do have their limitations. For example, their overall
predictive power (R²) is generally below 30%, and they do not include many of the factors that
influence an individual's health status, such as social variables. More importantly, there may also be
doubts regarding the validity and applicability of such instruments in the Spanish context. Although
various studies have demonstrated the ability of diagnosis-based case-mix systems to explain the use
of healthcare resources retrospectively in Spain and other countries, there are few references to their
use as predictive tools in a national healthcare system.
In the Basque Country, a research project (Orueta et al, 2012) has shown that these U.S. case-mix
systems can be used successfully to predict consumption of healthcare resources in a publicly funded
healthcare system with universal health insurance. In particular, classification systems that use diagnoses
and prescriptions together tend to perform particularly well. In other words, these systems can be used
in the Spanish context to identify people who are liable to consume considerable healthcare resources
in the future. Whether these patients' risk can be reduced through proactive actions remains to be
seen.
4.2 Predictive Modelling in the United Kingdom
In the United Kingdom, predictive models are used both for resource allocation (PBRA) and for case
finding. Many case finding models have been developed, including RISC, HUM, SPOKE and PEONY;
however, two of the most frequently cited models are the Patients at Risk of Re-hospitalisation (PARR)
model and the Combined Predictive Model (CPM). Funded by the Department of Health and
developed by The King's Fund and partners, these models aimed to identify patients at high
risk of unplanned admission to hospital in the next 12 months. These models were also the
precursors to the PRISM and SPARRA predictive models used in Wales and Scotland, respectively.
The PARR model bases its predictions on a patient's prior two years of inpatient data, and thus it can
only predict the risk of hospitalisation in patients who have already had a recent admission. Originally,
PARR consisted of two sub-models: PARR1 (which predicted ACSC hospitalisations) and PARR2 (which
predicted any type of hospitalisation). The advantage of the PARR1 model was its ability to predict
avoidable admissions; however, because of its low sensitivity, a decision was made to renew only
the PARR2 algorithm, which resulted in an improved PARR++ model being launched in 2007. PARR
was designed to be easy to use and came with its own software, which could be downloaded from
the Department of Health website.
In contrast, the CPM used not only inpatient data as its source of explanatory variables, but also data
from outpatients, Accident & Emergency, and the primary care EMR. As a result, the CPM generates a
risk score for every person in a registered population, including people who have not been previously
hospitalised. The CPM has marginally better predictive power than PARR, and organisations that run
the CPM will have compiled the data necessary to conduct an analysis of gaps in care. However,
the CPM is challenging to implement because of the data requirements (especially the extraction of
primary care EMR data) and the fact that no associated software was made available (so organisations
had to write or purchase their own software to run the model).
By 2010, the use of these two models was becoming well established across the English National
Health Service. However, in 2011 the Department of Health decided not to fund the renewal of the
models. As a result, each region of the country has sought its own solution. While certain areas have
turned to private consultants to implement case-mix stratification systems such as ACG or DxCG,
others, such as the county of Devon, have opted to develop their own local predictive models based
on CPM or PARR. This heterogeneity in stratification systems across England contrasts with Wales and
Scotland, whose Departments of Health opted to develop and maintain a centralised predictive model
for the entire country.
In Wales, Health Dialog UK designed the Predictive Risk Stratification Model (PRISM). Like the
CPM, PRISM uses data from primary and secondary care to identify the risk of future hospitalisation of
an individual. It incorporates variables such as the deprivation index (a measure of the socioeconomic
level of the patient's census area) and presents the population's risk scores via an online tool.
In Scotland, the Scottish Patients at Risk of Readmission and Admission (SPARRA) model was developed
by the Scottish Information Services Division (ISD). It is similar to the PARR model, but as well as
inpatient data, it also uses prescription data and information collected by mental health services as a
source of explanatory variables. A specialist mental health version of SPARRA, called SPARRA-MH, has
also been developed, which predicts the risk of admission to a psychiatric hospital. The ISD, which is
continually developing the SPARRA model, has recently launched an online tool for accessing
SPARRA results, which is similar to the PRISM online tool.
The predictions generated by PARR, CPM, PRISM and SPARRA have led to different case management
interventions being developed across the UK. For example, the Virtual Wards scheme (Lewis et al,
2011), originally implemented in Croydon, offers multidisciplinary case management at home to patients
with the highest predicted risk scores. Virtual Ward schemes have been established across the UK and
internationally.
We cannot end this description of the experiences in the United Kingdom without referring to The
Nuffield Trust. This organisation has been the main guardian and promoter of predictive models in the
British Isles, with research and publication work that includes articles, reference guides and the design
of three new models:
* Person-based Resource Allocation (PBRA): This is a tool that helps predict hospital expenditure per patient over the next year. It has been used by the English NHS to distribute healthcare resources among different areas of the country.
* A model that predicts admission to a nursing home in the next 12 months or the start of another form of intensive social care (Bardsley et al 2011).
* PARR-30: This model predicts the risk of re-hospitalisation of a patient in the 30 days
following discharge from hospital (Billings et al 2012).
5. Experiences in the Spanish Health System
There is already a considerable body of publications, evidence and good practice on the use of predictive models to identify patients with chronic illness who are at highest risk in the Spanish NHS. For example, the work conducted by the Agència d'Informació, Avaluació i Qualitat en Salut at the Department of Health in Catalonia includes an appendix with a very detailed list of
models and studies relating to risk stratification (AIAQS, 2010). Likewise, the work of the SSIBE in Baix
Empordà is noteworthy, as are the experiences in Valencia, including the use of the CARS instrument and the study at the Hospital
de La Fe, and the projects developed in the Basque Country as part of its Chronic Care Strategy.
Below, we describe a series of selected studies.
The stratification study in Baix Llobregat
A stratification study was conducted by the Hospital Viladecans in Baix Llobregat county with seven
referring PCTs in the towns of Castelldefels, Gavà, Viladecans and Begues. The aim of the project was
to study the probability and risk factors for unplanned admission and re-admission in the reference
population (López, 2011). This work was inspired by work in the UK on the PARR model (Billings, 2006)
and the SPARRA model (SPARRA, 2006).
A longitudinal retrospective study was conducted using data from the primary care (PC) clinical history
and the basic minimum data set on hospital discharge (MDSHD) from the hospitals of Viladecans and
Bellvitge. The study period was 01/01/06-31/12/08. Patients were included if they were treated in the
PC centres in the following municipalities of Baix Llobregat Litoral: Castelldefels, Gavà, Viladecans,
Begues and Sant Climent de Llobregat.
The dependent variables were unplanned admission and re-admission in one of the hospitals over a
12-month and 6-month period, respectively, during 2008. The records analysed included:
* socio-demographic factors (age, gender, place of residence),
* morbidity (selected PC diagnoses in large ICD-10 groups), and
* consumption of healthcare resources in the previous two years (visits, dispensed medications,
total length of stay).
The admission and re-admission logistic regression models were adjusted to the morbidity and use-of-
services variables, and were stratified by gender.
The final sample included 174,400 individuals. There were 3,494 (2.0%) admissions, and 440 (0.3%)
individuals were re-admitted within 180 days. Most of the morbidity and service use variables occurred
with higher frequency in people who were admitted and re-admitted compared to the total population.
The factors associated with unplanned admission in 2008 were: being male; being aged 45-64 or
65 and over, compared to younger individuals; and having been diagnosed with insulin-dependent diabetes (ID), non-ID diabetes, ischaemic heart disease, emphysema or chronic obstructive pulmonary disease
(COPD). The factors with the highest predictive power were having two or more unplanned admissions
in 2007 (OR=35.33; CI95%=24.2-51.3) and having nine or more days' accumulated length of stay in
2007 (OR=16.97; CI95%=12.07-23.87). The area under the Receiver Operating Characteristic (ROC)
curve of the unplanned admission model (i.e., the c-statistic) was 0.83.
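To make the odds ratios (OR) and 95% confidence intervals quoted above easier to interpret, the sketch below computes a crude OR from a 2x2 table with a log-scale (Woolf) confidence interval. This is a simpler technique than the adjusted logistic regression used in the study, and the counts are hypothetical, chosen only for illustration.

```python
import math


def odds_ratio_ci(exposed_events, exposed_non_events,
                  unexposed_events, unexposed_non_events, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    This is an unadjusted 2x2-table estimate, not the adjusted OR that a
    logistic regression model would produce.
    """
    a, b = exposed_events, exposed_non_events
    c, d = unexposed_events, unexposed_non_events
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: admitted / not admitted, with and without a risk factor.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 190)
print(f"OR={or_value:.2f}; CI95%={ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 indicates that the association is statistically significant at the 5% level.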
The factors associated with re-admission within 180 days of the index admission were: being male; being aged
45-64 or 65 and over; a diagnosis of Type I diabetes, heart disease, emphysema or COPD; taking four or more medicines; having a total length of stay of nine or more days; and having been admitted
twice or more in 2007. The factors with the highest predictive power were being aged 65 or older
(OR=19.1; CI95%=12.7-28.9) and having two or more admissions in a year (OR=22.0; CI95%=15.5-31.4). The
area under the ROC curve of this predictive model was 0.93.
The admission and re-admission predictive models identified risk factors that are already recorded in
administrative data, and the performance of the models was similar or superior to previously published
models. In general, the factors with the highest predictive power were the patient's age and the use
of hospital services in the previous year. The inclusion of other individual risk factors, such as social
support and comorbidity indexes, might improve the accuracy of the models still further. Despite some
limitations of the study, the models developed could be used to prioritise individuals with a
high risk of admission or re-admission for enhanced primary or preventive care.
Stratification study in Hospital de La Fe in Valencia
To implement a case management programme for complex patients at the Hospital de la Fe
in Valencia, a predictive model was required that would:
1. identify patients at high risk of decompensation in the short to medium term (i.e., the patients
who would consume most unplanned health resources), and
2. be easy to build and implement, and very parsimonious (i.e., use only a
very small selection of the available variables).
A variety of techniques, but mostly multiple logistic regression, were used to determine risk of
unplanned hospitalisation of patients with chronic diseases. The dependent variable was having 10 or
more unplanned hospital bed-days in the next 12 months not due to accidents, assaults, or births, etc.
The independent variables were: patient characteristics (gender and age); consumption (length of stay
[unplanned], number of emergency visits, and number of outpatient visits); and clinical disease indicators
(based on the ICD-9-CM and CCS, the chronic condition index (CCI), and the Charlson and Elixhauser
indexes).
The final model was highly predictive (area under the ROC curve = 0.87). It identifies patients who will account for 64% of the
unplanned length of stay in the next twelve months and can explain 36% of unplanned consumption
with 5% of the sample.
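One way to read a figure such as "36% of unplanned consumption with 5% of the sample" is as the share of all unplanned bed-days that falls on the patients with the highest predicted risk. The sketch below shows how such a share could be computed. The function name, the rounding rule and the data are hypothetical, and this is not necessarily how the La Fe study derived its figures.

```python
def share_captured(risk_scores, bed_days, top_fraction):
    """Share of all bed-days incurred by the top_fraction of patients ranked by risk score."""
    # Assumption: select at least one patient, rounding to the nearest whole patient.
    n_selected = max(1, round(len(risk_scores) * top_fraction))
    ranked = sorted(zip(risk_scores, bed_days), key=lambda pair: pair[0], reverse=True)
    selected_days = sum(days for _, days in ranked[:n_selected])
    total_days = sum(bed_days)
    return selected_days / total_days


# Hypothetical predicted risks and observed unplanned bed-days for four patients.
example_scores = [0.9, 0.1, 0.5, 0.3]
example_bed_days = [10, 0, 6, 4]
print(share_captured(example_scores, example_bed_days, 0.25))
```

Plotting this share for a range of values of `top_fraction` gives a concentration curve, which shows how quickly a model accumulates resource use as more patients are selected.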
A new care programme was developed for patients with complex chronic diseases. An evaluation
of the programme showed that patients who received the intervention remained the most stable,
with higher quality of life, a high level of satisfaction and an 80% reduction in the consumption of
unplanned hospital resources (CI95% 0.19-0.22).
Application of the CARS model in the Region of Valencia
To apply the CARS model, a retrospective cohort study was conducted in health departments
6, 10 and 11 in the region of Valencia. The sample consisted of 500 patients aged 65 and over who
were treated in the Valencian health system in 2008 and 2009. The data came from SIA-Abucasis and
MBDS, and was validated with professionals working in the clinics.
The CARS is a very simple tool, which comprises only three items, namely diagnosis, polypharmacy,
and hospitalisation or emergency visits in the last 6 months. As a result, it is very easy to use. The tool
can be completed in one of three ways. First, medical and nursing staff can complete the tool through
interviews with patients, either face-to-face or by telephone; second, clinical staff can complete the tool
after reading the medical history; finally, the tool can be completed automatically by linking the different
clinical and administrative databases related to primary and hospital care. This third option is very
attractive because it does not represent an additional workload for professionals and can be included
in home prevention and treatment programmes designed for this type of patient at risk.
CARS offers a 12-month forecast. The CARS score and age are related positively and linearly (CARS
r= 0.09; p
The stratification study in Osakidetza (Orueta et al, 2012)
Population stratification began in the Basque Country in 2009 as a research project. Its aim was
to establish the capacity of statistical models to predict care expenditure and identify patients with
significant care needs based on demographic, socio-economic and clinical information and their
previous use of healthcare resources. The models were all based on information held in administrative
and clinical databases.
The findings of the research project showed that linking the available information sources (primary care
medical history, Hospital MBDS and other computerised specialised care records) helped to overcome
problems of data quality and made the implementation of population stratification more feasible.
Stratification of all the patients assigned to Osakidetza has been performed since 2010, which entails
classifying more than two million citizens every year. The ACG-PM predictive model is used for this
process, which offers a prospective estimation of the healthcare resource consumption of each person
and their health problems. This information is being used to select target populations who could
benefit from case management programmes, disease management interventions, and other preventive
activities. Each patient's risk score is included in their primary care medical history, and alerts are
generated for health workers that help them to identify patients who could benefit from specific
programmes or actions.
The adoption of population health approaches such as this will be increasingly important for healthcare
organisations as they adjust to the new epidemiological context where chronic health problems prevail.
Population stratification is not an end in itself but rather an instrument to support a wider strategy of
change. The experience in the Basque Country shows that, although challenging from a technical and
organisational perspective, it is feasible to stratify the population of an entire region and to integrate
the results into clinical practice.
Finally, a note about the implementation process: a qualitative study conducted in the Basque Country demonstrated the importance of involving professionals and patients from the outset. This study, which involved general practitioners and nursing staff at 23 PC centres, showed that
the contribution of clinicians is highly relevant and that critical success factors emerge when implementing
these tools (see Box 5).
Box 5. Conclusions on population stratification, Osakidetza focus groups, 2011.
The regional and population focus is the basis and essence of Primary Care, although it is acknowledged
that its development is currently very limited. Population stratification could help change
the delivery of primary healthcare, although more communication and information would be
required about the project set up by the Basque Health Service (including the concepts, aims pursued,
functions of the tool, actions to take with population sub-groups, etc.).
For population stratification to be most useful and practical, other initiatives should be developed
in parallel, such as better integration of health care and social care services, education and training,
the creation of new job descriptions, or the re-organisation of clinicians' working patterns and time
spent on case management tasks.
Professionals identified a series of characteristics that the tool should possess in order for it to be
most useful in their clinical practices, namely: independence of use, patient identification by name
and surname, ergonomic design, and usable information provided at both the individual and group
level.
Bibliography
Agència d'Informació, Avaluació i Qualitat en Salut (AIAQS). Desarrollo de un modelo predictivo de
ingresos y reingresos hospitalarios no programados en Catalunya IN03/2010.
Barcelona: AIAQS, 2010
Allaudeen N, Schnipper JL, Orav EJ, Wachter RM, Vidyarthi AR. Inability of providers to predict unplanned
readmissions. J Gen Intern Med. 2011;26:771-776.
Billings J, Dixon J, Mijanovich T, Wennberg D. Case finding for patients at risk of readmission to hospital:
development of algorithm to identify high risk patients. BMJ. 2006; 333(7563):327.
Bodenheimer TS, Berry-Millett R. Care Management of Patients with Complex
Health Care Needs. In: The Synthesis Project, Issue 19. Robert Wood Johnson Foundation,
2009.
Boult C, Karm L, Groves C. Improving Chronic Care: The Guided Care Model. The Permanente
Journal 2008; 12 (1): 50-54
California Healthcare Foundation. Complex Care Management
Toolkit. California Quality Control, 2012
Cousins MS, Shickle LM, Bander JA. An introduction to predictive modeling for disease management
risk stratification. Disease Management 2002; 5: 157-167
Curry N, Billings J, Darin R et al. Predictive risk Project. Literature Review. Available at:
http://www.networks.nhs.uk/uploads/2005_Jun/Predictive_%20risk_%20proj_%20review_REVISED_FINAL.doc.
Freund T, Mahler C, Erler A, Gensichen J, Ose D, Szecsenyi J, Peters-Klimm F. Identification of patients
likely to benefit from care management programmes. Am J Manag Care.
2011 ;17(5):345-52.
Freund T, Wensing M, Geissler S, Peters-Klimm F, Mahler C, Boyd CM, Szecsenyi J. Primary care
physicians' experiences with case finding for practice-based care management. Am J Manag Care.
2012 Apr 1;18(4):e155-61.
García Goñi M. El ajuste de riesgos en el mercado sanitario (2004) in:
http://www.fgcasal.org/aes/docs/AjustedeRiesgos.pdf
García Pérez L, Linertová R, Lorenzo Riera A, Vázquez Díaz JR, Duque González B, López Hijazo A, Barreto
Cruz S, Lorenzo Prozzo N, Guiote Partido I, Sarría Santamera A. Factores predictivos e intervenciones
efectivas para la reducción del riesgo de reingreso hospitalario en pacientes de edad avanzada. Plan
de Calidad para el Sistema Nacional de Salud del Ministerio de Sanidad y Política Social. Servicio de
Evaluación del Servicio Canario de la Salud; 2009. Informes de Evaluación de Tecnologías Sanitarias:
SESCS Nº 2007/20
Inoriza JM, Coderch J, Carreras M, et al. La medida de la morbilidad atendida en una organización
sanitaria integrada. Gac Sanit. 2009;23:29-37.
King's Fund. Choosing a predictive risk model: a guide for commissioners in England. London: King's
Fund, 2011
King's Fund. Combined predictive model. Final report. London: National Health Service (NHS); 2006.
Krause DS. Economic effectiveness of disease management programmes: a meta-analysis. Dis Manag.
2005 Apr;8(2):114-34.
Lewis G, Bardsley M, Vaithianathan R, Steventon A, Georghiou T, Billings J, Dixon J. Do virtual wards
reduce rates of unplanned hospital admissions, and at what cost? A research protocol using propensity
matched controls. Int J Integr Care. 2011; 11:e079.
Lewis G, Curry N, Bardsley M. Choosing a predictive risk model: a guide for commissioners in England.
London: The Nuffield Trust, 2011
Lewis G. Impactibility models: identifying the subgroup of high-risk patients most amenable to
hospital-avoidance programmes. Milbank Q. 2010; 88(2):240-55.
López-Aguilà S, Contel JC, Farré J, Campuzano JL, Rajmil L. Predictive model for emergency hospital
admission and 6-month readmission. Am J Manag Care. 2011; 17(9):e348-57.
Nuño-Solinís R, Fernández-Cano P, Mira-Solves JJ, Toro-Polanco N, Carlos Contel J, Guilabert Mora
M, Solas O. Desarrollo de IEMAC, un Instrumento para la Evaluación de Modelos de Atención ante la
Cronicidad. Gac Sanit. 2012 Jul 23. [Epub ahead of print]
Orueta JF, Mateos Del Pino M, Barrio Beraza I, Nuño Solinís R, Cuadrado Zubizarreta M, Sola Sarabia
C. [Stratification of the population in the Basque Country: results in the first year of implementation.]
Aten Primaria. 2012 Mar 8. [Epub ahead of print]
Patients at Risk of Rehospitalisation (PARR) case finding tool. Available at:
http://www.kingsfund.org.uk/health_topics/patients_at_risk/index.html. Accessed 05/04/06
Peikes D, Chen A, Schore J, Brown R. Effects of care coordination on hospitalization, quality of care,
and health care expenditures among Medicare beneficiaries: 15 randomized trials. JAMA. 2009 Feb
11;301(6):603-18.
Roland M, Dusheiko M, Gravelle H, Parker S. Follow up of people aged 65 and over with a history of
emergency admissions: analysis of routine admission data. BMJ. 2005; 330(7486): 289-292.
SPARRA: Scottish patients at risk of readmission and admission. Edinburgh (United Kingdom): National
Services Scotland. National Health Services (NHS); 2006
Wagner E. H, Austin B. T, Davis C, Hindmarsh M, Schaefer J, Bonomi A. Improving Chronic Illness Care:
Translating Evidence Into Action. Health Aff November 2001 20:664-78
Appendix I. Gauging the precision of predictive models
From a quantitative perspective, the most interesting characteristic of a predictive model is its level of
precision or prediction capacity. There are several measurements that can be used to estimate the level
of precision of a model.
When the dependent variable is continuous (e.g., health expenditure), measurements
such as the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE) or the Mean Absolute
Percentage Error (MAPE) can be used, but it is more common to use the R-squared (R²). This is a
statistical coefficient that measures the goodness of fit of a model. It ranges in value between 0 and
1, with 1 indicating a perfect fit. The R-squared can be interpreted as the proportion of the variation in the
data that can be explained by the model.
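The four measurements named above can be written out in a few lines. The sketch below uses the standard textbook definitions, with R² computed as 1 minus the ratio of residual to total sum of squares. The expenditure figures are hypothetical and serve only as an example.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean Absolute Percentage Error (observed values must be non-zero)."""
    return 100 * sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """1 - SS_res / SS_tot. Note that this can fall below 0 when a model
    is evaluated on data it was not fitted to."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical annual expenditure per patient (observed vs. predicted).
observed = [1200.0, 300.0, 4500.0, 800.0, 150.0]
predicted = [1000.0, 500.0, 3900.0, 900.0, 400.0]

print(f"RMSE={rmse(observed, predicted):.1f}  MAE={mae(observed, predicted):.1f}")
print(f"MAPE={mape(observed, predicted):.1f}%  R2={r_squared(observed, predicted):.3f}")
```

RMSE penalises large individual errors more heavily than MAE, which matters for expenditure data where a few patients account for very high costs.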
If the dependent variable of the model is dichotomous (e.g., unplanned re-hospitalisation or not), or
when studying the cost-effectiveness of including patients selected by the model in an intervention
programme, the measurements to consider are:
* Positive predictive value (PPV): This is the probability that an individual selected by the model
will be a true positive.
* Sensitivity: This is the proportion of high-risk patients detected by the model.
* Specificity: This is the proportion of low-risk patients that the model does not select.
* Negative predictive value (NPV): This is the probability that an individual not selected by the
model will be a true negative.
                         Observed high risk     Observed low risk
Predicted high risk      True positive (TP)     False positive (FP)
Predicted low risk       False negative (FN)    True negative (TN)

PPV = TP/(TP+FP)
NPV = TN/(TN+FN)
Sensitivity = TP/(TP+FN)
Specificity = TN/(TN+FP)
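The four formulas above translate directly into code. The following is a minimal sketch; the function name and the counts in the example are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """PPV, NPV, sensitivity and specificity from the four cells of the table."""
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical example: 1,000 patients, 50 flagged as high risk by the model.
metrics = classification_metrics(tp=30, fp=20, fn=10, tn=940)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example the model flags 50 patients, of whom 30 are truly high risk, while 10 high-risk patients are missed.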
It should be borne in mind that these four metrics can be altered by the user of the model simply
by choosing a different threshold for the definition of high-risk. For example, if a high threshold is
chosen (i.e., the restrictions are strict and only patients with a very high score are selected), then the
specificity and the PPV increase, at the expense of reducing sensitivity and NPV, and vice versa. The
way of representing this balance between sensitivity and specificity is using the ROC curve (Receiver
Operating Characteristic). The area under the curve (AUC), or c-statistic, is another
measurement that can be used to compare two models. An AUC value of 0.5 indicates that the
predictive capacity of the model is no better than chance, whereas values above 0.8 are considered
indicative of a very good fit.
Figure 4
Example of a ROC Curve
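The threshold trade-off and the AUC can both be illustrated with a small sketch. Here the AUC is computed through its rank interpretation: the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, with ties counted as one half. The risk scores and outcomes are hypothetical.

```python
def sensitivity_specificity(scores, outcomes, threshold):
    """Classify score >= threshold as high risk and compare with observed outcomes."""
    tp = sum(1 for s, o in zip(scores, outcomes) if s >= threshold and o == 1)
    fn = sum(1 for s, o in zip(scores, outcomes) if s < threshold and o == 1)
    tn = sum(1 for s, o in zip(scores, outcomes) if s < threshold and o == 0)
    fp = sum(1 for s, o in zip(scores, outcomes) if s >= threshold and o == 0)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, o in zip(scores, outcomes) if o == 1]
    negatives = [s for s, o in zip(scores, outcomes) if o == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores and observed outcomes (1 = unplanned admission).
scores = [0.95, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

for threshold in (0.50, 0.75):
    sens, spec = sensitivity_specificity(scores, outcomes, threshold)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
print(f"AUC={auc(scores, outcomes):.3f}")
```

Raising the threshold from 0.50 to 0.75 increases specificity and lowers sensitivity, whereas the AUC summarises performance across all thresholds in a single number.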
Finally, there are various techniques for ensuring that the estimates of the precision of a predictive
model reflect how well the model will perform when applied to the general population (i.e., that the
model is not overfitted to the data on which it was built). The most common method is a simple
split-sample division, where the data are divided at random into two sub-samples, one of which
is used for calibrating the model and the other for assessing it. The ratio used to divide the samples
generally ranges from 50-50 to 70-30. An alternative is to use a bootstrapping method, while adjusting
for the phenomenon of optimism (which measures the difference between the error obtained on the
initial data set and the error that would be obtained on an infinite group of data).
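Both validation approaches can be sketched with a deliberately simple model. The example below fits a one-variable least-squares line to synthetic data, evaluates it with a 70-30 split, and then estimates optimism by bootstrapping. In each resample the model is refitted, and its R² on the resample is compared with its R² on the original data. The data, variable names and the number of resamples (200) are hypothetical assumptions, and a real risk model would have many more predictors.

```python
import random

random.seed(1)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(model, xs, ys):
    """R-squared of a fitted (intercept, slope) model on the given data."""
    intercept, slope = model
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data (hypothetical): prior-year visits and next-year cost.
visits = [random.randint(0, 10) for _ in range(200)]
costs = [200 + 150 * v + random.gauss(0, 400) for v in visits]

# 1) Split-sample validation with a random 70-30 split.
indices = list(range(len(visits)))
random.shuffle(indices)
cut = len(indices) * 7 // 10
train, test = indices[:cut], indices[cut:]
model = fit_line([visits[i] for i in train], [costs[i] for i in train])
train_r2 = r_squared(model, [visits[i] for i in train], [costs[i] for i in train])
test_r2 = r_squared(model, [visits[i] for i in test], [costs[i] for i in test])

# 2) Bootstrap estimate of optimism, subtracted from the apparent R-squared.
full_model = fit_line(visits, costs)
apparent_r2 = r_squared(full_model, visits, costs)
differences = []
for _ in range(200):
    sample = random.choices(range(len(visits)), k=len(visits))
    boot_x = [visits[i] for i in sample]
    boot_y = [costs[i] for i in sample]
    boot_model = fit_line(boot_x, boot_y)
    differences.append(r_squared(boot_model, boot_x, boot_y)
                       - r_squared(boot_model, visits, costs))
optimism = sum(differences) / len(differences)
corrected_r2 = apparent_r2 - optimism

print(f"Split-sample: train R2={train_r2:.3f}, test R2={test_r2:.3f}")
print(f"Bootstrap: apparent R2={apparent_r2:.3f}, optimism={optimism:.4f}, "
      f"corrected R2={corrected_r2:.3f}")
```

With only one predictor the optimism is small; it typically grows as more variables are added relative to the number of patients, which is when these checks matter most.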