NBDA Workshop VI – The Ever-Promising But Elusive Surrogate Endpoint: What Will It Take?

The National Biomarkers Development Alliance (NBDA)*

WORKSHOP VI

THE EVER-PROMISING BUT ELUSIVE SURROGATE ENDPOINT: WHAT WILL IT TAKE?

A CONVERSATIONAL, RESULTS-ORIENTED WORKSHOP**

December 1-2, 2014 Washington, D.C.

* Funding for the NBDA Workshops Provided by the Piper Foundation and Arizona State University

WORKSHOP BACKGROUND

Over the course of the five previous NBDA workshops, the many problems surrounding the discovery and development of biomarkers and surrogate endpoints (surrogate biomarkers) in clinical trials have been the subject of significant analysis directed toward finding solutions. The difficulty of identifying a surrogate endpoint is repeatedly cited as one of the most complex problems that clinical medicine will face over the next decade, particularly in an era where molecular subtyping of disease will create ever smaller populations of patients for study. However, finding solutions to these problems and identifying valid surrogate endpoints (laboratory measurements or physical signs that can substitute for a clinically meaningful endpoint, i.e., a direct measurement of how a patient feels, functions, or survives) will prove to be key to realizing the potential of precision medicine through the development of targeted interventions for cancer and an array of chronic diseases.

The current workshop considered all aspects of surrogate endpoints – their history, successes, and failures – and reexamined and rethought the concept of the surrogate endpoint in view of advances in the vast array of “omics” possibilities and advanced technologies that may lead to a new era of precision medicine. The goal was to identify the current barriers to developing and validating surrogate endpoints and explore solutions. The NBDA plans to convene a future workshop that will finalize a series of “case studies” focused on the identification of the evidence (and ultimately the evidentiary standards) required to submit specific surrogate endpoints and associated contexts of use plans to the FDA for review and action.

Biomarker: A characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.

Clinical endpoint: A characteristic or variable that directly reflects how a patient feels, functions, or survives, and that is usually related to a desired effect, i.e., efficacy. Clinical endpoints are preferred for use in efficacy trials and are usually acceptable as evidence for regulatory purposes.

Surrogate endpoint: A biomarker that is intended to substitute for a clinical endpoint. A surrogate endpoint is expected to predict clinical benefit – or harm or lack of benefit – based on epidemiologic, therapeutic, pathophysiologic, or other scientific evidence.

WHY RETHINK SURROGATE ENDPOINTS?1

One of the important lessons that the NBDA has learned through its workshops is that the failure of both biomarkers and surrogate markers to achieve regulatory approval does not begin when they reach the Food and Drug Administration (FDA), but rather in biomarker discovery and/or early development. The reasons for these failures are complex, but they can be traced to a number of key steps in the discovery of a clinically relevant biomarker, including the following:

• Is a proposed biomarker discovery effort tied to a good clinical question based on plausible biology?

• Given a good clinical question, are experiments designed correctly to answer that question?

• Are experiments conducted using fit-for-purpose biospecimens in sufficient numbers?

• Are the data of high quality (with metadata) and collected and managed in a transparent manner?

• Are good technology standards applied and followed throughout biomarker discovery? Are the data generated from an experiment of high quality and are they managed well?

• Are the data analyzed using the appropriate analytics?

Even when the right questions are asked, researchers too often do not devote enough resources to answering them in a rigorous and reproducible manner. As a result, too many studies purporting to have identified a biomarker or surrogate endpoint are based on poor data and are not reproducible. This means that many putative surrogate endpoints fail to predict clinical benefit and therefore cannot substitute for a true clinical endpoint.

In clinical drug development, surrogate endpoints are used to assess whether a drug has clinically significant efficacy, since the surrogate theoretically appears more quickly than the actual clinical endpoint. In the same manner, surrogate endpoints may be the only way to develop preventive therapies where outcomes can be decades in the future or only a small percentage of the population will develop the condition under study. Surrogate endpoints are much more difficult to achieve for rapidly progressing fatal diseases.

Understanding the Regulatory Challenges of Surrogate Endpoints (Dr. Janet Woodcock)

Regulatory View of Surrogate Endpoints. FDA Guidance details that surrogate endpoints may be used to support accelerated approval if the surrogate is deemed “reasonably likely” to predict a clinical endpoint of interest. Drugs approved under accelerated approval using surrogate endpoints must undergo subsequent clinical trials to demonstrate clinical efficacy. Moreover, this route to approval is used only in serious and life-threatening illnesses that lack an acceptable therapy. Examples of surrogates used for accelerated approval include tumor shrinkage and progression-free survival for anti-cancer agents, and viral copy number for anti-HIV therapies. There are also “validated” surrogate endpoints that are acceptable for full approval, including:

• blood pressure for drugs designed to prevent stroke or blood pressure-associated renal disease;

• bone mineral density for estrogenic compounds used to prevent osteoporosis;

• hemoglobin A1C for glycemic control; and

• forced expiratory volume (FEV1) for obstructive lung diseases – among others.

1 This section is based on remarks from Anna Barker, Ph.D., President and Director of the NBDA, Director of the Transformative Healthcare Knowledge Networks, Co-Director of Complex Adaptive Systems and Professor in the School of Life Sciences at Arizona State University; and Janet Woodcock, M.D., Director of the Center for Drug Evaluation and Research at the U.S. Food and Drug Administration.

The most widely used surrogate endpoint for full approval is blood drug level as a surrogate for clinical efficacy and toxicity in the evaluation of virtually all generic drugs. According to FDA data, approximately 45% of new molecular entity drug approvals use surrogate endpoints. The recent approvals of drugs to treat hepatitis C, for example, were based on the ability of these drugs to reduce viral blood levels. One issue confronting those who want to develop a surrogate endpoint is that there is no standardized process for validating surrogate endpoints for regulatory approval. In some cases, acceptance is based on long-time clinical use combined with adequate data from trials, while in other instances, such as the reduction of blood virus levels for anti-HIV therapies, acceptance was driven by a health crisis. In considering whether to approve the use of a biomarker as a surrogate endpoint, the FDA insists that there first must be biological plausibility. Such evidence can be epidemiologic, as is the case in linking blood pressure and LDL cholesterol levels to increased risk of stroke and heart attacks respectively, but the surrogate must be consistent with pathophysiology and ideally located on the causal pathway for the disease. In addition, changes in the surrogate must reflect changes in prognosis. Acceptance of a biomarker as a surrogate endpoint also requires statistical evidence that the marker correlates with clinical outcome, with the understanding that correlation does not equal causation. In the past, support for the use of a biomarker as a surrogate endpoint came from successes in clinical trials, where an effect on the surrogate predicted outcomes for other drugs of the same pharmacologic class or in several pharmacologic classes. Hypertension gained acceptance as a surrogate endpoint in this manner, although it took many years to achieve this status. 
The FDA may also consider other benefit/risk factors, such as: serious or life-threatening illness with no alternative therapy; the existence of a large safety database; instances where a drug will only be used for the short-term; or when it is difficult to study the true clinical endpoint.

Highlights from the History of Surrogate Endpoints. To understand the challenges of developing surrogate endpoints and appreciate why it is important to rethink how they are developed, it is useful to consider the history of surrogate endpoint use over the past five decades. In the 1970s and 1980s, the FDA accepted blood pressure measurements and cholesterol levels as surrogate endpoints based on epidemiologic evidence, and researchers and regulators alike were optimistic that surrogate endpoints would gain increasing use in clinical trials. However, problems with the use of one surrogate endpoint – the suppression of ventricular premature beats (VPBs) – arose in the late 1980s as a result of the sobering results of the Cardiac Arrhythmia Suppression Trial (CAST). This trial was designed to test the hypothesis that cardiac anti-arrhythmics, which were generally being used off-label to prevent premature death following a heart attack (based on their ability to reduce VPBs), were in fact reducing mortality. While the three drugs tested in this trial were effective at reducing VPBs, two of the three were associated with excess mortality and the trial was discontinued – a shocking finding!

Figure 1: Surrogate Endpoints – Learning from the Past – Planning for the Future

Skepticism about the use of surrogate endpoints prevailed following this experience, which led to the publication of rigorous statistical criteria for assessing the correlation of a candidate surrogate with clinical outcomes. These criteria, known as the Prentice criteria, have never been met by any surrogate endpoint to date. Nonetheless, pressure to speed development of anti-retroviral therapies to address the AIDS crisis in the 1990s led to the then-controversial use of RNA copy number as a surrogate endpoint for accelerated approval. While RNA copy number is now used widely as an early drug development tool, as a surrogate endpoint in clinical trials under the accelerated approval protocol, and for monitoring antiviral therapy, there is still a lack of complete correlation between RNA copy number and clinical outcomes at the level of the individual patient. However, there would be no approved anti-retroviral therapies without the use of this surrogate endpoint. Given this success, drug developers and regulators felt more comfortable with the use of surrogate endpoints in clinical trials, but controversy arose in the 2000s around the use of glycemic control as an efficacy endpoint in the clinical trials for rosiglitazone, a drug that binds to specific receptors in adipose cells, making them more responsive to insulin. A meta-analysis of trial data suggested that there was an increased incidence of myocardial infarction in treated patients compared with patients receiving comparable drugs or placebo. As a result, the FDA came under fire for using hemoglobin A1c as a surrogate endpoint rather than requiring cardiovascular outcome studies for approval. The FDA’s reasoning in not requiring this endpoint was that no drug approved for treating Type 2 diabetes had ever been shown to affect cardiovascular outcomes.
Rosiglitazone’s cardiovascular safety was addressed in a subsequent randomized clinical trial and the drug appears to have a risk profile comparable to similar drugs. Another controversy over the use of a surrogate endpoint occurred when the FDA approved ezetimibe (Zetia), a non-statin drug for lowering LDL cholesterol (and presumably preventing coronary artery disease). LDL cholesterol is one of the most widely used surrogate endpoints in clinical trials. In fact, it was viewed as a proxy for preventing coronary artery disease, yet when a small, randomized trial was run using a different surrogate (intima-media thickness of carotid arteries) to compare the effects of the statin simvastatin plus ezetimibe versus simvastatin alone, there was no difference between the two trial arms. The lack of a difference led to criticism that Zetia was little more than an expensive placebo. This resulted in a general criticism of LDL cholesterol as a surrogate endpoint, though there was little comment on the validity of intima-media thickness as a surrogate endpoint. A larger trial (IMPROVE-IT) involving 18,000 patients with known coronary artery disease, randomized to simvastatin or simvastatin plus ezetimibe, and followed for seven years, produced a statistically significant reduction in major cardiovascular incidents – primarily myocardial infarction – for the two-drug regimen. Once again, LDL cholesterol was hailed as an effective surrogate endpoint and there are now additional drugs under development to lower LDL cholesterol levels. What seems to have been lost in the media coverage of this trial’s results is that 32.7 percent of patients in the combination arm still had a major cardiovascular event over the course of the seven-year trial, suggesting that
the lowering of LDL cholesterol is not likely to be the ultimate prevention measure for major cardiovascular events. One of the most recent surrogate endpoints to receive FDA approval for use in clinical trials is pathological complete response (pCR) as a predictor of a favorable benefit-risk ratio for accelerated approval of drugs for the treatment of breast cancer. FDA’s statistical analysis showed that there was a positive association of pCR with event-free survival and overall survival; with individual patients who attain a pCR having a more favorable long-term outcome irrespective of the treatment received. The analysis also showed that the magnitude of improvement in pCR did not predict event-free survival and overall survival. Therefore, FDA guidance holds that pCR is a surrogate endpoint that is reasonably likely to predict a clinical benefit, but confirmatory clinical trials must demonstrate clinical benefit in order to attain regular approval. Recently, FDA granted accelerated approval for a supplemental indication for pertuzumab (Perjeta) based on one neoadjuvant trial with pCR as a surrogate endpoint, and an additional adjuvant trial with disease-free survival as an endpoint. FEV1 is another surrogate endpoint that was used recently to gain approval for ivacaftor (Kalydeco), an oral drug designed to improve function of the cystic fibrosis transmembrane conductance regulator (CFTR) protein in cystic fibrosis patients with specific genetic variants of the gene coding for this protein. This surrogate was used as the clinical endpoint instead of chloride concentration in sweat (a well-established surrogate) for cystic fibrosis drugs. Earlier proof-of-concept and efficacy trials had shown that ivacaftor did reduce sweat chloride concentrations towards normal levels but that there was a lack of correlation between individual sweat chloride results and individual FEV1 results. 
While the results for the new surrogate endpoint did not correspond fully with those for the accepted surrogate, which is also used today as a diagnostic biomarker for cystic fibrosis, as a group the patients felt better and appeared to be functioning better, with lower rates of hospital admission and infection. This finding has since been replicated in other settings, so the lack of complete correlation between the two surrogate endpoints suggests that many factors affect FEV1 improvement beyond the ability of a drug to reduce sweat chloride levels. It also highlights the fact that there are significant error bars associated with the use of surrogate endpoints.

As the history of surrogate endpoints in clinical trials shows, there are some fundamental problems with the current conceptual framework for surrogate endpoints. These include:

• There is no “gold standard” clinical outcome measure. Indeed, the concept of the “ultimate” clinical outcome (i.e., one measurable parameter that captures everything) is flawed. In fact, clinical outcome measures often do not reflect what truly matters to patients.

• It is unlikely that a surrogate endpoint for efficacy can identify safety problems unless survival is the endpoint being measured.

• Survival data show that the desirability of longer survival depends on the quality of life that comes with it, which for many individuals can only be estimated.

• The generalizability of any single outcome measure, such as mortality, can be limited by trial parameters, including who was enrolled in the trial.

• There is confusion between the desirability of prolonged observation for safety and long-term outcomes and the use of surrogate endpoints. As a result, surrogate biomarkers are often blamed for problems in clinical trials when the real problem comes from not asking the right clinical question and not designing experiments to test that question with validity and reproducibility.

• The use of the Prentice statistical criteria is unreasonable and a more practical approach for judging the utility of surrogate endpoints is needed.
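For readers unfamiliar with the Prentice criteria mentioned above, the following is a sketch of how they are commonly summarized in the statistical literature; it is not taken from the workshop itself. The symbols are assumptions introduced here for illustration: $Z$ is the treatment assignment, $S$ the candidate surrogate, $T$ the true clinical endpoint, and $f(\cdot)$ a probability distribution.

```latex
\begin{align*}
&\text{(1)}\quad f(S \mid Z) \neq f(S)
  && \text{treatment affects the surrogate} \\
&\text{(2)}\quad f(T \mid Z) \neq f(T)
  && \text{treatment affects the true endpoint} \\
&\text{(3)}\quad f(T \mid S) \neq f(T)
  && \text{the surrogate is associated with the true endpoint} \\
&\text{(4)}\quad f(T \mid S, Z) = f(T \mid S)
  && \text{the surrogate captures the full treatment effect}
\end{align*}
```

Condition (4) is the demanding one: it requires that, once the surrogate is known, treatment assignment carries no further information about the clinical outcome. That stringency is consistent with the workshop's view that a more practical standard is needed.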

Future of Surrogate Endpoints. A practical approach to the development of new surrogate endpoints involves accepting that a surrogate may increase uncertainty, and that there are situations where the level of uncertainty is reasonable, given the alternatives. The field must also accept that standardization and evidence generation are necessary, regardless of the desires of Congress and the pressures that advocates bring to bear to increase the speed with which clinical trials are conducted and drugs are approved. Clinical trial experience with more surrogates may help to understand how they perform in real clinical settings. However, this is difficult and expensive if ultimately the surrogate results in failure of the trial. Addressing nearly all of these issues must be the focus for both the affected communities and the FDA. More clinical input is needed during the discovery process, yet the academic community has little experience in seeking and responding to such advice. Mechanistic plausibility, generated by academia, is essential, but the evidence to support the use of even a mechanistically plausible surrogate endpoint is highly dependent on the clinical context and the unmet medical need. Finally, there will always be tradeoffs, so it will be important in the months and years ahead to focus on what is lost if a surrogate is not used.

In summary, surrogate endpoints are a special category of biomarkers that are increasingly used in clinical studies. The development of additional surrogates would enhance drug development, particularly for slowly progressive diseases and prevention indications. The use of surrogates is fraught with difficulty, a situation that is likely to continue, but increasing scientific knowledge promises to provide more evidence for biological and clinical relevance.
It will be important for the affected communities and the regulators to begin to discuss the uncertainty involved in drug development, specifically as related to surrogate endpoints, and perhaps consider “benefits and risks” vs. the established “safe and effective”.

MAJOR BARRIERS THAT IMPACT SURROGATE ENDPOINT DISCOVERY AND DEVELOPMENT

An important activity for this workshop was the identification of the major barriers that stand in the way of developing robust surrogate endpoints (capable of predicting a specific clinical outcome). This was accomplished through small group discussions and a report-out from each small group to the workshop at large. This exercise generated the following list of key barriers:

• Lack of standards, including regulatory standards for data, evidence, analytical methods, and fit-for-purpose biospecimens.

• No clear vision for what is actually being measured as the surrogate endpoint and what clinical endpoint it is replacing.

• Clinical benefit is a moving target that means different things to physicians and patients and from one patient to another. This uncertainty makes it difficult to define the context of use and the level of standardization across analytical platforms and technologies.

• There is a need to have a clear pathway for identifying the benefit-risk ratio and uncertainty for investigational studies of surrogate endpoints and the potential return on investment for such studies. In other words, is the potential payoff from developing and using a surrogate endpoint worth the risk of taking that surrogate endpoint into a clinical trial?

• The field is overly influenced by the U.S. approach to regulatory approval in terms of managing uncertainty.

• There is not enough sharing of information, data, and experiences in developing surrogate endpoints, particularly across diseases and among the different stakeholders in the process. There is a need for incentives and mechanisms to encourage such collaboration, sharing, and the aggregation of data generated across studies and diseases.

• The fundamental ambiguity as to whether a surrogate is specific to a given treatment – or if it can be employed for multiple treatments for the same disease – requires that the surrogate be defined for a specific context of use.

• The lack of a clear reimbursement process makes it challenging to secure funding for surrogate endpoint development, leading to scarce resources across the field for identifying and qualifying surrogate endpoints.

• There is a need to broaden the definition of a clinical endpoint to be able to include quality of life measures. Developing such methods will require patient input and thought about how to include these non-laboratory outcomes.

• There are numerous methodological and operational challenges in surrogate endpoint development.

• There is little understanding or acceptance of the uncertainty associated with a given surrogate endpoint.

• A lack of understanding of the biology of the disease can lead to the development of therapies that treat the surrogate instead of the true clinical endpoint, or that focus on correlates of disease rather than the biology of the disease.

• Surrogate endpoints need to evolve in the face of the dynamically changing understanding of disease.

• There is a need for more structured approaches to the identification of off-target responses that are not generally captured with today’s surrogate endpoints.

• Regulatory guidance is unclear and perspectives vary within and across regulatory agencies.

• Industry views surrogate endpoints with a certain amount of cynicism given the regulatory uncertainty.

• There are no incentives or mechanisms for developing algorithms that would aid in the development of surrogate endpoints – nor for the analysis of data generated from surrogate endpoints in clinical studies.

FIGURE 2: MAJOR BARRIERS TO THE DISCOVERY AND DEVELOPMENT OF SURROGATE BIOMARKERS

REFLECTIONS ON THE FUTURE OF SURROGATE ENDPOINTS2 (DR. GOODSAID)

2 This section is based on the presentation of Federico Goodsaid, Ph.D., Vice President for Strategic Regulatory Intelligence at Vertex Pharmaceuticals.

Regardless of the intended use of a biomarker, as a diagnostic indicator or a surrogate endpoint, contexts of use can be defined as incremental metrics for qualifying that biomarker. The
statement of use for a specific biomarker should include the identity of the biomarker, the aspect of the biomarker that is measured, the form in which it is used for biological interpretation, the characteristics of the subjects that will be studied, and its purpose in drug development. Thinking ahead, attention also should be paid to developing a comprehensive description of how the biomarker will be used once it is qualified, and consideration given to understanding the risk of using that biomarker versus not using it. It is important to remember that there are different levels of risk and benefit, as well as evidentiary standards, that depend on the chosen incremental contexts of use.

For example, consider the incremental development of the biomarker Kidney Injury Molecule-1 (KIM-1) as a new indicator of kidney toxicity associated with drug therapy. In the non-clinical setting:

(1) In the initial context of use, KIM-1 would be included with traditional clinical chemistry markers and histopathology in toxicology studies to detect acute drug-induced nephrotoxicity in rats. The potential benefit for this context of use would be the detection of nephrotoxicity in the rat with accessible biomarkers that have sensitivity and specificity similar to that of histopathology. The risk would be minimal, given that decision-making would depend on the weight of evidence of the histopathology (and currently accepted biomarkers) and would not be made exclusively on the basis of data from KIM-1 measurements. The associated evidentiary standard would be bibliographic or retrospective analysis of rat nephrotoxicity studies across multiple drugs and sponsors.

(2) In the second sequential non-clinical context of use, KIM-1 would provide the confirmatory evidence needed for the initial histopathology results in rats, with the benefit being more accurate decision-making and the risk being that an incorrect result from testing with this biomarker would lead to inaccurate additional safety data in rats. The evidentiary standard would be prospective studies comparing outcomes in rats when decisions are made with the additional data from KIM-1 analysis.

(3) The third non-clinical context of use would be to replace histopathology or other measurements with KIM-1, with the benefit of reducing the time, number of animals, and cost for non-clinical safety assessment; the risk being that an incorrect result from testing with the newly qualified biomarkers would lead to inaccurate safety assessment in rats; and the evidentiary standard being prospective studies comparing outcomes in rats when decisions are made with proposed biomarkers instead of with histopathology.

The same incremental context of use approach can also be taken for use of KIM-1 in the clinical setting. The initial context of use would be to develop preliminary evidence anticipating the results from traditional BUN and creatinine testing for nephrotoxicity; the next incremental context of use would be to replace BUN and creatinine testing with KIM-1 measurements (it is at this point that KIM-1 would be considered a surrogate endpoint); and the final context of use
would be to develop a companion diagnostic for monitoring human safety post-approval of drugs with the potential for nephrotoxicity in humans. Each incremental context of use would have its own benefit-risk statement and evidentiary standard. For example, the risk would change across the three contexts: first, that an incorrect result from testing with KIM-1 could lead to premature decisions on dose adjustments, which would be mitigated by data eventually available for BUN and creatinine; next, that an incorrect result from testing with the newly qualified biomarkers would lead to inaccurate safety assessment in humans; and finally, that an incorrect result from testing with the newly qualified biomarkers would lead to an inaccurate selection of patients to receive therapy. Evidentiary standards would progress from requiring prospective studies comparing sensitivity for KIM-1 with that for BUN and creatinine for one or two known nephrotoxic drugs, to three to four nephrotoxic drugs, and finally to Phase 3 studies that would include testing a randomized sample from all patients receiving the drug as well as for those on standard of care in order to determine whether the test is sensitive and specific enough to prevent sensitive patients from receiving the drug.

The conventional reasoning for why surrogate endpoints are a great idea is that they will serve as development shortcuts that will save time and ultimately reduce the cost of clinical trials, something sponsors and advocates alike desire. A better reason, and one that should drive future development of surrogate endpoints, is that they accurately reflect the molecular mechanism of disease and of better therapies, and that they provide incremental value related to the level of evidence available to support their context of use. In that respect, it will be critical to ask if a surrogate really explains the changes that are observed as a result of a given therapy.
When used in this manner, surrogates have been and will continue to be critical components for accelerated approval programs at regulatory agencies. To determine if a surrogate is acceptable, it is necessary to answer three questions:

• How far is the surrogate’s context of use from the molecular mechanism of a disease and of a disease treatment?

• In what stage of a disease is the surrogate’s context of use defined?

• In what fraction of the total patient population is the surrogate’s context of use defined?

For example, when considering the use of cholesterol levels as a surrogate endpoint for assessing a drug’s efficacy at preventing atherosclerosis, the answer to the first question would be that cholesterol levels are associated with a subset of the molecular mechanisms involved in atherosclerosis. The answer to the second question is that cholesterol as a surrogate endpoint is used primarily in the early stages of disease, and the answer to the third question is that cholesterol testing is used in the general population. There are gaps, however, in the definition of the context of use for cholesterol levels as a surrogate endpoint. While cholesterol levels are associated with a subset of the molecular mechanisms of atherosclerosis, multiple other molecular mechanisms are involved in this disease, and the context of use may not be defined for later stages of the disease.

As far as a path forward for the development and acceptance of surrogate endpoints is concerned, it will be critical to identify incremental sets of context of use and to propose corresponding sets of evidentiary standards that consider the benefits and risks associated with each context of use. This proposal should be put forward in a white paper that would engage all stakeholders in discussion at a future workshop, which NBDA could sponsor, and that would lead to soliciting guidance from regulatory agencies based on the white paper and workshop.

THE SURROGATE ENDPOINT – PAST, PRESENT, AND FUTURE³ (DR. ROBERT TEMPLE)

A surrogate endpoint, as described in the preamble to the Accelerated Approval Rule issued by FDA in 1992, is “a laboratory measurement or physical sign that is used in therapeutic trials as a substitute for a clinically meaningful endpoint that is a direct measure of how a patient feels, functions, survives and is expected to predict the effect of the therapy.” That simple definition encompasses a wide range of “markers,” but none of them has any value unless it does in fact predict clinical benefit.

Despite their widespread and regular use in clinical medicine, surrogate endpoints have been controversial for decades because of doubts about whether they are really on the causal pathway. For example, in the 1960s, the so-called New York School of physicians raised doubt about the validity of blood pressure lowering as a surrogate marker for heart attack and stroke. Studies in the late 1960s put that idea to rest, but the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT) conducted in the late 1990s showed that not all blood pressure lowering medications are the same. Some, for example, delay renal dysfunction in diabetics while others treat congestive heart failure.
Similarly, in the 1990s there was substantial debate about LDL cholesterol as a surrogate endpoint for coronary artery disease that was largely put to rest by definitive studies with statins. Today, the data clearly support statin use for preventing coronary artery disease, but debate continues about the effectiveness of ezetimibe and other drugs. One interesting finding from these studies is that the mechanism of action for lipid-lowering drugs is not as clear-cut as was once thought. While these drugs do lower LDL cholesterol levels, they do not result in smaller arterial deposits and less constriction of the coronary arteries. Given their rapid beneficial impact, it is more likely that these drugs prevent further deposits from damaging the arterial wall and reduce the ability of platelets to stick to those deposits.

There are two major problems with surrogate endpoints. First, the surrogate may not be related to the outcome. A high white blood cell count or fever is indicative of an infection, but drugs that would lower white blood cell count or reduce fever would have no impact on the infection itself.

3 This section is based on the presentation by Robert Temple, M.D., Deputy Center Director for Clinical Science at the FDA.

In this case, those biomarkers are consequences of infection and are not part of the mechanistic chain.

A bigger problem, though, is that drugs often have unintended effects that are unrelated to the surrogate. For example, ventricular premature beats (VPBs) predict increased mortality after myocardial infarction, and it was presumed that drugs that help the heart beat stronger would prevent heart failure. In fact, intermediate endpoints in clinical trials of anti-arrhythmics showed that they had ample beneficial effects on exercise capacity and other functional measures. However, outcome studies found that these drugs, which did reduce VPBs, doubled mortality, a finding that is still something of a mystery. The end result is that no drug designed to prevent heart failure will ever be approved without a complete outcome study, which is perhaps why there are no anti-arrhythmics approved for clinical use or even in development.

Hematocrit is another example of a surrogate endpoint that does not quite tell the entire story with regard to a drug’s adverse effects. In this case, erythropoietin was approved on the basis of its ability to maintain hematocrit at an acceptable level in patients on dialysis, but efforts to use a higher dose of the drug to attain a “normal” hematocrit have been uniformly negative, with the elevated dose not only failing to improve cardiovascular outcomes, as hoped, but actually increasing the incidence of heart failure, myocardial infarction, and death, likely an off-target effect.

The FDA can in fact approve a drug based on a surrogate endpoint. However, the FDA weighs benefits against risk, and unless there is substantial evidence to support the assertion that an effect on the surrogate corresponds to a clinical benefit, approval would represent a high risk. The accelerated approval rule allows for accelerated approval based on a surrogate endpoint that is considered reasonably likely to predict clinical benefit based on epidemiologic, pharmacologic, or other evidence.
The law does not state, but does imply, that full approval would have to be based on the near certainty that a drug does produce the desired clinical benefit with an acceptable benefit-risk profile. Full approval based on a surrogate endpoint is unusual, but current practice approves all anti-hypertensive drugs, anti-diabetic drugs, and cholesterol lowering drugs based on surrogate endpoints. Surrogates such as tumor response rate, time to progression, and progression-free survival have been used to approve anticancer agents, and recently the FDA issued guidance declaring that it will accept pathologic complete response (pCR) in the neoadjuvant setting under accelerated approval, with demonstration of progression-free survival and overall survival required for full approval. Increases in bone mineral density have been used as an initial basis for approval of osteoporosis drugs, with fracture studies to follow for full approval; testosterone suppression has been used for approval of certain prostate cancer drugs; and decreased viral load has been used to approve HIV treatments under the accelerated route, with long-term clinical benefit required for full approval.

There are a number of factors that are persuasive in terms of employing surrogate endpoints. Epidemiologic evidence can provide a picture of how direct or obvious the relationship is between a surrogate endpoint and drug efficacy, as is the case with blood pressure and
cholesterol lowering drugs, as well as with drugs that normalize electrolyte levels. Data from prior outcome studies showing a strong connection between a surrogate and a clinical effect can be persuasive, as has been the case with blood pressure, hemoglobin A1c, and LDL cholesterol. Some surrogate endpoints are more clinically measurable than others – renal function, by definition, is a surrogate. Anatomic decrease in renal function is a real endpoint and the FDA treats it that way. On the other hand, there are structured clinical endpoints such as FEV1, exercise tests, and sleep lab results that are sometimes called surrogates but are really attempts to structure a test of function even though they seek to measure function. History shows that even one clear failure of a drug’s effects to correlate with the surrogate endpoint casts doubt on the validity of the surrogate.

It is important to remember that there are biomarkers that are not surrogates, but serve as important prognostic and predictive biomarkers. For example, the surface and genetic markers that have transformed oncology, virology, and the treatment of cystic fibrosis are predictive enrichment markers, not surrogate endpoints. Similarly, single nucleotide polymorphisms and the MammaPrint assay are important prognostic biomarkers but are not surrogate endpoints. In addition, a marker that has not “graduated” to surrogacy can still inform dose selection, either for clinical trials or in practice, as is the case of a coagulation measure for an anticoagulant.

Are there good reasons to be optimistic that surrogate endpoints have a future in clinical trials across diseases? Overall the answer is – yes! In some cases, the “real” endpoint is rare or greatly delayed, as is the case for cardiovascular outcomes, adjuvant chemotherapy, or almost any prevention or risk reduction agent.
In addition, support for surrogate endpoints may be warranted when there is a persuasive relationship between a surrogate endpoint and an outcome based on results with other drugs, particularly if those drugs belong to multiple chemical classes; when there is high biologic plausibility; when there is a clear understanding of disease pathogenesis based on epidemiologic data; and when solid data from relevant animal models exist. Another strong argument for surrogates exists when an outcome study for a drug targeted to an urgent unmet need would have to be large and require prolonged observation, as was the case with HIV therapies; or when major safety concerns are not an issue (e.g., when a long-available drug is repurposed for another indication).

There are also occasions when surrogate endpoints are employed for the wrong reasons. One example is using a surrogate to reduce sample size based on the expectation that the effect on the surrogate is far larger than the likely effect on short-term outcome. In these cases, the surrogate endpoint’s efficacy is always less certain and based on an expectation of a benefit, making it difficult to weigh benefit against risk. An example of this situation involved the use of increased blood pressure upon standing as the basis of approval for a drug to treat orthostatic hypotension, a troubling disease with no effective therapy. In this case, the drug ameliorated the decrease in blood pressure, which is in fact the problem, but it did not produce a clinical improvement for reasons that are still unclear.

For rare diseases, there is support for using a surrogate endpoint to speed trials. However, a better solution, similar to the model used by the National Cancer Institute for trials of pediatric anticancer agents, is to conduct trials across multiple centers and measure a valid clinical outcome. In this situation, the use of a surrogate endpoint to secure accelerated approval combined with a longer and larger outcome trial would be the ideal.

One potential new approach to the use of surrogate biomarkers being explored by the FDA involves a mechanism of action test in small numbers of patients being studied initially in valid outcome studies, particularly those that are studying a more common variant of a disease. The initial study in this case is intended to show clinical benefit but it may also play a role in validating the surrogate endpoint. The validated endpoint could then be used in smaller trials of other variants of disease with few patients, such as with rare cystic fibrosis genotypes. Another new approach involves diseases in which there is a genetic mechanism with a single clear effect that is very understandable biologically and where restoration of function would be a credible endpoint, at least for accelerated approval.

WHERE HAVE ALL THE BIOMARKERS GONE?⁴ (PANEL DISCUSSION)

Some biomarkers have been developed successfully to the point of approval as a surrogate endpoint, while many others have failed. In addition, there are many that are a work in progress and that could represent new thinking and a new direction for the field.

One success story – the approval of pCR as a valid surrogate endpoint – was one output of the Phase 2 I-SPY 2 trial, which was designed to screen a series of novel agents in combination with standard chemotherapy in the breast cancer neoadjuvant setting. In this standing trial, patients are randomized to receive a novel agent given in combination with standard chemotherapy or standard chemotherapy alone, with pCR as the endpoint.
This trial demonstrated the ability of powerful, modern statistical methods to enable a novel trial design that accelerated the process of identifying drugs that are effective for specific breast cancer subtypes while also reducing the cost, time, and number of patients needed to get drugs through Phase 2 trials and gain accelerated approvals. Milestones/advances for this first-of-its-kind clinical trial include:

• Demonstrated that the pCR endpoint works better by breast cancer subtype.

• FDA issued accelerated approval guidance for the use of pCR as a surrogate endpoint.

• Developed an infrastructure of information technology systems that can support adaptive learning and new methods for distributing credit for participating in clinical trials.

4 This section is based on short presentations by Laura Van’t Veer, Ph.D., Professor in the Helen Diller Family Comprehensive Cancer Center and the Angele and Shu Kai Chan Endowed Chair in Cancer Research at the University of California, San Francisco; Howard Scher, M.D., the D. Wayne Calloway Chair in Urologic Oncology and Chief of the Genitourinary Oncology Service at Memorial Sloan Kettering Cancer Center; Joseph Bonventre, M.D., Ph.D., Chief of the Renal Division and the Biomedical Engineering Division at Harvard Medical School and the Samuel A. Levine Professor of Medicine at the Brigham and Women’s Hospital; and Anna Barker.

• Demonstrated the integration of adaptive randomization and real-time data collection to drive ongoing randomization.

• Proved the value of the standing trial concept that comprises multiple arms, a single backbone, and a Master Investigational New Drug application, and that has led to two drugs graduating with biomarker signatures to Phase 3, five additional drugs on study, and several more in the pipeline to participate in the I-SPY 2 trial.

• Initiated the establishment of the international I-SPY 3 consortium to use an adaptive trial design for a multi-arm Phase 3 registration trial.

Figure 3: Panel Discussion – Biomarker Case Studies, Where Have All the Biomarkers Gone?

An ongoing qualification effort involves circulating tumor cell (CTC) enumeration as a surrogate for survival in prostate cancer. The currently accepted (but controversial) biomarker for diagnosing prostate cancer is prostate specific antigen (PSA), which is not a surrogate for survival. To address this shortcoming, researchers have turned to CTCs and are evaluating the validity of reducing CTC levels as a surrogate in a series of Phase 1 through Phase 3 trials using a drug that is known to successfully treat prostate cancer. These trials focused on a proposed context of use for CTCs that employed the FDA-approved CellSearch assay as a surrogate endpoint for survival in castration-resistant prostate cancer.

Increased androgen biosynthesis and androgen receptor overexpression are oncogenic drivers of castration-resistant prostate cancer growth that can be targeted efficiently. Enumeration results using CellSearch were consistent in post-chemotherapy treated castration-resistant patients enrolled on protocols of similar design in the United States and the European Union. Based on Phase 2 results, two Phase 3 registration trials of similar design were conducted, both of which included a CTC biomarker question. The biomarker panel that was best associated with survival was a combination of CTC enumeration and an assay that quantifies plasma membrane damage by measuring lactate dehydrogenase (LDH) release, both at the 12-week time point. Data from this trial also showed that the treatment being tested had a significant effect on the proposed surrogate endpoint: after 12 weeks, the prednisone-alone group had a higher rate of high-risk patients as classified by the combined CTC/LDH biomarker, while the prednisone plus abiraterone acetate group had a higher rate of low-risk patients. The CTC/LDH marker also correlated with a significant survival benefit and was able to discriminate between low-risk and high-risk patients. Finally, the effect of treatment on survival was captured by the CTC/LDH marker. Based on these findings, FDA has proposed a stepwise process and the developers are waiting for final guidance to begin a qualification trial.

A third potential surrogate endpoint that is well down the qualification pathway is KIM-1 for kidney injury. Proximal tubular injury is a key component of most acute and chronic kidney diseases, and a biomarker of proximal tubule injury can indicate injury or toxicity if it increases or efficacy if the marker stops increasing or even decreases.
Rat toxicity studies have shown that KIM-1 was more sensitive and more specific than BUN, serum creatinine, and N-acetyl-beta-D-glucosaminidase (NAG), three biomarkers used in diagnosing kidney disease, and in 2008 the FDA and the European Medicines Agency announced that the two agencies would allow drug companies to submit the results of KIM-1 assays and six other new tests to evaluate kidney toxicity. Data from subsequent studies in human clinical trials of an antisense agent designed to lower LDL-cholesterol showed that KIM-1 measurements allowed for the early detection of proximal tubule injury, differentiation between glomerular and tubular damage, and assessment of reversibility and regeneration when such damage occurs. Additional clinical trials of other treatments that either produce kidney injury or are designed to heal the kidney also show a strong correlation between KIM-1 levels and kidney injury or regeneration.

Currently, several issues are holding back the use of KIM-1 as a surrogate marker for kidney injury, including what is known as the Gold Standard Problem as it relates to “perfect” sensitivity and specificity, and the lack of “normal” values in a “normal population.” Still yet to be determined are the thresholds for KIM-1 levels that mark functionally important injury. The field also suffers from undue conservatism in not believing that preclinical studies inform the use and interpretation of biomarkers in humans and from the reluctance of diagnostic companies to aggressively evaluate biomarkers.

While cardiovascular disease has proven to be a fruitful area for the successful development of surrogate endpoints – LDL cholesterol, blood pressure, and imaging being just three examples –
another story has played out that serves as a cautionary tale. This story involves the use of ventricular arrhythmia (VA) as a surrogate endpoint for survival after a cardiac event in the CAST trial. VA had been described as a cause of death after anesthesia by 1940 and in the late 1970s results on ventricular premature contractions and sudden cardiac death were published. By 1980, a number of antiarrhythmic drugs had been discovered, such as lidocaine, quinidine, and digoxin. Advances in cardiovascular medicine were beginning to show real results in terms of all-cause mortality, and the community undertook a search for more sensitive biomarkers to predict cardiovascular outcomes. A number of studies were published supporting the use of VA as a surrogate endpoint for high risk of sudden cardiac death following a heart attack, and from the late 1970s to the early 1990s, VA was considered a surrogate endpoint for the risk of sudden cardiovascular death.

These findings drove the development of a new generation of antiarrhythmic drugs, and in 1980, CAST began recruiting patients to address the question of whether antiarrhythmic drugs were effective at reducing the risk of sudden cardiac death following a heart attack. CAST evaluated the effect of three antiarrhythmic drugs on survival in patients with greater than 10 premature ventricular beats per hour following a heart attack using a surrogate endpoint of reduction in ventricular ectopic contraction. The clinical endpoint was death or cardiac arrest with resuscitation resulting from VA. The surprising outcome from this trial was that two of the treatment arms were stopped early because of a marked increase in premature sudden death compared to placebo, and while the third arm was continued in CAST II, that trial was stopped early as well when it became apparent that the treatment arm in a larger number of patients was also associated with an increase in premature sudden death.
These results showed clearly that other mechanistic factors besides VA contributed to a risk for sudden cardiovascular death and that while VA is a good biomarker, it is not a good surrogate endpoint. In other words, VA does indicate an increased risk of death after a myocardial infarction, but it fails to correlate sufficiently with clinical outcome.

One aside to this work was that at the same time as the CAST trial was just starting to recruit patients, another antiarrhythmic drug was tested in a double blind trial and it, too, increased mortality. The drug was abandoned, but the results of the trial were not published until 1993. The authors concluded that their study could have provided an early warning, but the idea that antiarrhythmics increased mortality was not widely accepted at the time and so they attributed the increased mortality to the effects of chance. So while this is a cautionary tale about the science being good enough, it is also an illustration of the need for the clinical and research communities to share information even if prior publications do not agree with their findings.

An open discussion among the workshop participants raised two important questions:

• If it takes a decade for a surrogate to be accepted, how will we develop tools and platforms that keep up with technological change? The answer to that question will depend on continuing to generate evidence and on maintaining consistency in processes.

• How should we approach the development of safety surrogates given that negative predictive value could be extremely helpful in drug development efforts? The answer will require developing baseline values for potential safety surrogates in “normal” populations, and it will also require careful consideration about how to predict or account for idiosyncratic effects.
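
The role of negative predictive value in the second question can be illustrated with a small calculation. The sketch below, written in Python, assumes a hypothetical 2x2 comparison of a candidate safety marker against a reference standard for injury. The counts, class name, and function names are illustrative assumptions and are not data or methods from the workshop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical counts comparing a candidate safety marker
    with a reference standard for injury."""
    true_pos: int   # marker positive, injury confirmed
    false_pos: int  # marker positive, no injury
    true_neg: int   # marker negative, no injury
    false_neg: int  # marker negative, injury confirmed


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of injured subjects that the marker flags."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of uninjured subjects that the marker clears."""
    return c.true_neg / (c.true_neg + c.false_pos)


def negative_predictive_value(c: ConfusionCounts) -> float:
    """Probability that a marker-negative subject is truly uninjured."""
    return c.true_neg / (c.true_neg + c.false_neg)


# Illustrative numbers only (assumed, not taken from any study).
counts = ConfusionCounts(true_pos=18, false_pos=12, true_neg=168, false_neg=2)

if __name__ == "__main__":
    print(f"sensitivity: {sensitivity(counts):.3f}")
    print(f"specificity: {specificity(counts):.3f}")
    print(f"negative predictive value: {negative_predictive_value(counts):.3f}")
```

Because negative predictive value depends on how common injury is in the tested population, the same marker can perform differently in a "normal" population than in a high-risk one, which is one reason baseline values in "normal" populations matter.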

CONVERGING ON THE BARRIERS – RETHINKING THE POSSIBLE FOR SURROGATE ENDPOINTS

At this point in the workshop, the small groups reconvened and held discussions to prioritize the barriers that they had identified and enumerated earlier. The groups then reported on the most important barrier and a general idea of what a solution to that barrier might be. This exercise generated the following small list of prioritized barriers and solutions, which were captured by the graphical recorder as shown in Figure 2.

• The lack of standards is a barrier for understanding disease biology, for demonstrating the clinical utility of treatments or diagnostics, and for combining data across different studies and trials. Subsumed within this barrier is a second barrier related to the fact that there is no authority with the oversight to actually set the standards.

o A possible solution would be to first understand who the possible authorities might be and to use global crowdsourcing activities to help to bring those entities together and perhaps delegate one expert to set the standards that would be adopted by the community at large. It was recommended that the NBDA could organize this activity.

• Fragmentation is a barrier that arises from the fact that each stakeholder has its own set of incentives and that, without the ability to bring stakeholders together, the drivers for developing surrogate endpoints are fragmented. Fragmentation is also a result of the rapidly changing chronicling of the natural history of disease, along with the lack of an organized or structured way to capture the information.

o A possible solution is to develop models and algorithms that would not just passively collect data but bring together collaborators that represent different types of expertise and provide them with a virtual data set, or cloud, of data that was actively collected over time and across patient groups. This cloud would generate evidence that experts could convene around and ultimately drive collaboration.

• The failure to clearly define the clinical endpoint and the resulting one-size-fits-all approach that is too often taken in surrogate endpoint development efforts is a major barrier to progress.

o A possible solution is to bring together stakeholders in a collaborative model that can determine unmet needs for specific contexts of use and formulate the right questions to address those needs.

• The inability to share information is a major barrier. The value of shared data has been demonstrated by the adoption of pCR as a surrogate endpoint based on an FDA analysis of data from thousands of patients that was collected in numerous clinical trials in the neoadjuvant setting.

o A possible solution would require securing funding to develop the standards for data collection and analysis and then using that data to define a return on investment. Successful examples of data sharing could be used to build the business case for this kind of data aggregation and sharing, based on the assumption that there is actionable data that would be useful for many surrogate endpoint development efforts.

• Another barrier is the uncertainty of the clinical endpoint.

o A possible solution could resemble the model of retrospective/prospective analysis that has been used successfully for the identification of biomarkers. A major barrier here is the inability to share data and access proprietary data sets. The overarching assumption is that good data exists and it should be used instead of spending 10 years conducting a prospective clinical trial from scratch.

• Reimbursement is a barrier that exists because of a lack of payers’ confidence that surrogate endpoints will improve outcomes.

o A possible solution is to look at qualification of a particular surrogate for a given context of use based on industry/expert alignment and to pursue what is known as a special protocol assessment for a Phase 3 program. This would involve getting a drug developer to approach both payers, such as the Centers for Medicare and Medicaid Services (CMS), and the FDA to reach agreement on a Phase 3 study design that would lead to both approval and reimbursement. Securing such an agreement would most certainly give greater momentum to surrogate endpoint development efforts.

SYSTEMS APPROACHES TO RETHINKING SURROGATE ENDPOINTS – ANGIOGENESIS AS AN EXAMPLE⁵ (DR. WILLIAM LI)

Instead of searching for disease-specific or molecule-specific biomarkers that might be surrogate endpoint candidates, a different way of thinking about surrogate endpoints is to take a broader, systems-based approach that looks at features common across many disease states. Angiogenesis, for example, lies at the heart of a wide range of disorders, including cancer, age-related macular degeneration (AMD), coronary artery disease, chronic wounds, diabetes, obesity, and Alzheimer’s disease, and there are a number of angiogenesis-based treatments available for several of these diseases. One perspective of a systems approach would be to try to understand what characterizes angiogenesis in a healthy system, what the perturbation away from a healthy state is in disease and over the course of disease, and how treatments can influence the perturbed system and enable it to return to a normal, healthy state. Taking that perspective, it may be

5 This section is based on the presentation by William Li, M.D., President, Medical Director, and Co-Founder of the Angiogenesis Foundation.

possible to combine the extensive knowledge base on the biology of angiogenesis with the growing amount of clinical data available to find associations, correlations, biomarkers, and surrogate endpoints.

Figure 4: Angiogenesis – a Systems-Based Approach to Developing Surrogate Biomarkers

Angiogenesis is an example of a homeostatic system, one that undergoes constant dynamic adjustment around a set point or normal range. The negative and positive feedbacks that influence this system and that make up the homeostatic mechanism might provide clues that will enable a systems approach to identify surrogate endpoints. In angiogenesis, there is a normal set point for the amount and timing of the vasculature present in a given tissue. For example, the set point in the uterus changes over the course of the menstrual cycle, and there are mechanisms that spur angiogenesis and other mechanisms that prune the vasculature. Following surgery or injury, angiogenesis increases but only until the set point for the injured tissue is reached. The mechanisms that determine these set points are not well understood, but they clearly provide clues to both health maintenance and disease. Today, some 70 diseases have been identified as angiogenesis-dependent. Each of these has other causes and pathologies, but changes in the regulation of the angiogenic system are a common denominator in all of them. The role that angiogenesis plays is best understood in three disorders – ocular disease, cancer, and chronic, non-healing wounds. In each case, a combination
of knowledge plus clinical experience is pointing the way to possible surrogate endpoints. For example, there are over 5000 citations reporting increased microvessel density in cancer, but the data have not been aggregated because of a lack of standards for data collection and analysis. There are also numerous studies showing that people who do not have cancer have lower circulating levels of angiogenic growth factors, such as vascular endothelial growth factor (VEGF), but again, because of a lack of standards and collaboration, efforts to derive something usable from these reports have not proven fruitful.

Conversely, there are instances where progress toward finding surrogate endpoints is occurring. In AMD, for example, a noninvasive imaging technique called optical coherence tomography can present an image of the displacement of retinal tissue caused by the leaky microvessels that grow in the macula in response to angiogenesis that exceeds the normal set point. More importantly, optical coherence tomography shows the macula normalizing after antiangiogenic therapy, and this technique is now used as a surrogate to follow the progress of therapy.

Antiangiogenic agents have now become important therapies for many types of cancer – anti-VEGF therapy plus chemotherapy is now a gold standard regimen for treating colorectal cancer. Some of the correlates that occur in patients who respond positively to antiangiogenic therapy include reduction in tumor microvessel density, decrease in tumor edema, and changes in tumor metabolism and gene expression. None of these correlates has risen to the evidentiary level needed to be considered a surrogate endpoint, but they represent possibilities for further exploration. One discouraging aspect of antiangiogenic therapy for cancer is that with continued therapy, tumor gene expression changes and the tumor starts producing not only higher levels of the original angiogenic factors but others as well. Recent work has found that circulating levels of the angiogenic factor VEGF-C appear to rise in patients treated with anti-VEGF therapies just prior to disease progression. Additional work is needed to determine if this is a predictive biomarker for tumor escape from antiangiogenic therapy or if it could be a surrogate endpoint for therapies designed to prevent escape.

In wound healing, the problem is not an excess of angiogenesis but rather too little. Serum VEGF levels, pulse volume recording, and a measure known as the ankle-brachial index have been shown to track healing, but insufficient work has been done to consider them as actual surrogates. A new technique called wound fluorescence microangiography uses a non-toxic dye to visualize tissue perfusion in real time. Imaging a wound using this method shows angiogenesis increasing and then eventually normalizing when a wound heals, and researchers are optimistic that this method could serve as a surrogate endpoint for clinical trials and clinical use.

Given the ubiquitous nature of angiogenesis, the obvious question is whether markers of angiogenesis have enough specificity to be useful as surrogate endpoints. Answering that question will require generating a human vascular map that describes normal levels of angiogenesis in the various tissues in the body. Certainly, using measures of angiogenesis as
surrogate endpoints is a more complicated process because of its ubiquity, but there are efforts underway to compile as much tissue and clinical data as possible in an accessible database that could serve as a registry for microvascular markers across different therapeutic areas. These markers will then need to be validated against clinically meaningful endpoints and outcomes using standardized technologies and methods. Additional work will also be needed to understand heterogeneity across individuals and how that heterogeneity might change, both in health and disease, and with different treatments.

REVISITING AND PERHAPS REINVENTING THE SURROGATE ENDPOINT – ARE WE LOOKING FOR BIOMARKERS IN ALL THE WRONG PLACES (DR. DONALD BERRY)

As was mentioned earlier, the Prentice criteria have three essential components: a treatment must predict the clinical endpoint, the surrogate must predict the clinical endpoint for the given treatment, and the surrogate must capture the entire effect of treatment on the clinical endpoint. As also noted earlier, no surrogate endpoint has ever satisfied all three of these criteria, though some have met two out of three, which can be sufficient. Most potential surrogates fail the third criterion, but it may be possible to model the excess effect noted in the treatment arm.

Recently, results of the ALTTO trial, which was an adjuvant trial, were surprisingly negative, even though results from the I-SPY 2 trial in the neoadjuvant setting were positive using pCR as the surrogate endpoint. Based on this presentation, some experts commented that this represented a failure of the I-SPY 2 process, but in fact, the results were exactly what would be expected given the size of the ALTTO trial and the results of the neo-ALTTO/I-SPY 2 trial. Additional data actually showed that the treatment effect was indeed positive. These results show the importance of modeling the relationship between endpoints, not just for cancer trials but for trials in all therapeutic areas, and of modeling the disease as a stochastic model with parameters that vary and that can include treatment, in order to understand treatment effect and the course of disease. It is important to abandon the notion that there is a point in time, or one measurement, which stands completely separate from all other measurements.

The LUNG-MAP trial is an example of the failure to use modeling. This trial has a number of innovative aspects, including counting data from Phase 2 patients together with the Phase 3 results, but the trial sponsors failed to include longitudinal modeling. The sponsors set a criterion that the trial would continue to Phase 3 if there was a 50 percent improvement in median progression-free survival of around 1.5 months in the Phase 2 results, but with a Phase 3 endpoint of overall survival of 50 percent. Expecting a result of that magnitude, however, was unreasonable given the size of the trial and the trial design, something that would have been apparent had the sponsors modeled the trial and the disease. For the I-SPY 2 trial, the endpoint is pCR, but the trial also includes modeling of the relationship between pCR and magnetic resonance imaging (MRI). All of the conclusions regarding efficacy
come from pCR, but the results are informed by MRI data, which do have some predictive power and do provide some useful information. The I-SPY 3 trial builds on the success of I-SPY 2 and the resulting FDA guidance and will attempt to address both pCR and event-free survival in a single trial design. The goal is to get accelerated approval if the experimental treatment demonstrates superiority on pCR and then to continue the trial to get full approval if there is superiority on event-free survival at follow-up of a minimum of three years. The trial will look at the relationship between pCR and event-free survival and will model any differences between treatments.

An important deficiency in the Prentice criteria is that they are abstract: they fail to consider the patient, the specifics of a given disease, the prevalence of disease, the disease biology, and the response of other diseases to a given treatment. For example, the performance of a surrogate for hairy-cell leukemia would be very different from that for breast cancer simply because of the prevalence of these two diseases, but the results from a trial and the use of a surrogate in one disease should inform how the data are viewed when used with another disease. This is one of the strengths of a basket trial, which examines the therapeutic effect of a targeted drug developed simultaneously across diseases – organ-specific cancers, for example – that each express the same molecular target. The sample sizes in a basket trial are small, and the analysis borrows information across the different baskets but does not pool their patients. In essence, the basket trial formalizes what has been called the Gleevec phenomenon, where the drug was first approved based on its efficacy in treating chronic myeloid leukemia and subsequently approved for other cancers expressing the same molecular target.

There is a special class of surrogate endpoints that offer the possibility of low-hanging fruit. For these surrogates, the clinical measure takes place at multiple times as part of the routine care of patients in certain diseases. For example, in diabetes, measures of hemoglobin A1c at one year are considered a suitable surrogate, but the measure that predicts eventual response to therapy is hemoglobin A1c response at almost any point in therapy – there is no other measure that is more predictive. Similarly, the best predictor of overall survival in a cancer trial at one year is survival at six months, and there are many other similar measures. An example of how this kind of surrogate can be used is found in the Eli Lilly AWARD5 trial for drugs for Type 2 diabetes. This trial features a seamless Phase 2/3 trial design with adaptive randomization, and the primary endpoint was clinical utility index at 12 months. Longitudinal modeling was critical to developing the dose-response model, refining the trial design, and speeding approval.

PREDICTIONS ARE HARD – WHAT CHANGES, ACTIONS, AND DISCOVERIES COULD BE TRANSFORMATIVE IN DISCOVERING AND DEVELOPING ROBUST SURROGATE ENDPOINTS?⁶

6 This section is based on short presentations by Peter Kuhn, Ph.D., Professor of Biological Sciences and a member of the Southern California Physics Oncology Center at the University of Southern California; Elizabeth Mansfield, Ph.D., Deputy Director of the Office for Personalized Medicine at the FDA; and Kenneth Buetow, Ph.D., Director of Bioinformatics and Data Management for the NBDA and Director of Computation and Information at the Complex Adaptive Systems, Arizona State University.

Surrogate endpoints are needed in oncology in order to have a quantitative description of the clinical scenario as treatment progresses on a patient-specific basis, not just a snapshot at the point of death in the future. The clinical endpoint of “you have cancer” is not a good clinical endpoint for cell-free DNA diagnostics, which will identify a large number of patients with cancer-associated mutations. As a result, the surrogate endpoint has to be applied to the individual patient, not just at the population level. The goal should be that when patients feel better, they can be told why they feel better using the information gained from a surrogate endpoint.

There are two particularly vexing questions that the field of surrogate biomarkers faces. The first is, “what if the future is determined by a future event?”, and this is related to the fact that two identical patients may follow different paths along the evolutionary development of their disease because of non-treatment related events. The second question is, “what should a surrogate endpoint describe?”, and it gets to the question of whether the important clinical endpoint is quality of life, length of life, or an expected positive response to future treatment. A surrogate endpoint for overall survival might achieve a good result in low entropy diseases such as prostate cancer, but only incremental improvements in high entropy diseases such as lung cancer where predictability is difficult.

Another important issue facing surrogate endpoint development efforts is which questions surrogate endpoints and science can actually answer for the patient. These would include: Is it cancer? How bad is it? How far has it spread? Is my treatment working? Am I cured? Patients want quantified answers to these questions, not just a conversation. Surrogate endpoints also have to account for the spatial and temporal evolution of disease within an individual patient, and this will depend on the specific disease.

In thinking about the science of surrogate endpoints and what the future might bring, the single cell will be the fundamental unit of analysis because it is the fundamental unit of disease. It may be necessary to look at a large number of cells, but the revolution occurring in single-cell science holds great promise for surrogate endpoint development, for it may identify when disease is changing and when therapy will have to change. Again, for cancer at least, surrogate endpoints will have to be measured in a longitudinal manner because only longitudinal measurements will reflect the changing nature of disease. Getting to a place where longitudinal measurements are possible will mean moving away from the current approach of looking for one single surrogate endpoint that captures all clinical information. The idea that there is a single biological entity that describes a multifactorial state, and in particular can describe some of the endpoints that are most important to patients, is
actually likely zero. Therefore, it should not be surprising that so few surrogate endpoints have been identified and accepted for clinical use. As an analogy, looking at one spot of paint in a pointillist’s painting provides almost no useful information for telling what the entire piece of art is revealing. In order to understand the “picture,” it is necessary to look at each point in its true context.

Surrogate endpoints attempt to capture complex traits, and complex traits require capturing clinical information that arises from multiple sources, not from one single gene or one single protein. This is a data-intensive activity that requires the collection and processing of large amounts of data to make predictions. While this is challenging, industries that are much larger than biomedicine do this on a day-to-day basis and develop real-time predictors that are validated constantly. The trick is to translate the activity of data analysis and prediction into the biomedical space and develop predictive algorithms. A logical place to start rethinking the notion of surrogate endpoints is to correlate the phenotypes that need to be modeled and begin measuring the properties of the biological networks that produce those phenotypes. These are computable entities that can then be compared to measurements taken during various states as they change, and used to develop predictors that can be tested.

The challenge will be to answer the question of what the surrogate endpoint is trying to measure – is the patient feeling better? Is the patient alive? – in a way that captures the true complexity of disease. For example, looking at disease only from the perspective of the genome does not capture the complexity of the disease. If that were true, all patients with triple negative breast cancer would have exactly the same disease, and that is not true. Complex diseases must be broadly analyzed using all of the available technologies. Such a comprehensive approach will enable measuring events in a specific patient and predicting both current and future states.

Furthermore, it is imperative that the classification of disease and the search for informative biomarkers and surrogate endpoints start with the real phenotype of the diseased cell, not just with one, two, or even a dozen genetic markers on the surface of cells and in the signaling networks that are activated. Obviously, genome and transcriptome are important, but they are not the phenotype of disease (they might be considered incomplete surrogates). As a result, much of today’s research is looking for surrogates of surrogates, which may explain why the search for truly predictive surrogate endpoints has more often failed than succeeded. It is important to consider what the surrogate purports to measure and to determine if that measure provides enough information about the disease (and the state of the cells) to serve as a reliable surrogate. Today, most surrogate candidates are too blunt in terms of what they are trying to measure to provide anything but broad categorizations. The advance of technologies, particularly circulating tumor cells and new approaches to proteomics (which provide state-specific
information with enough depth to provide a real picture of disease as it evolves and changes) holds great promise for advancing the field of surrogate marker discovery and development. The one place where there is a clear case for surrogate biomarkers is leukemia, where the measurement of minimal residual disease appears to be a good surrogate for absence of disease. The problem is that the tests used to measure minimal residual disease are completely unstable and inconsistent across laboratories. The technology must be improved and more uniformly applied, and the field must agree about the type and number of measurements that need to be made. During the open discussion that followed these short presentations, the workshop participants raised a number of questions regarding the goal to rethink surrogate endpoint discovery and development:

• What is the chain of custody for a sample and how does that influence experimental design?

• What can we do to bring a portfolio of markers into the clinical setting and how can they be tested?

• Which stakeholders are best positioned to enable the future of surrogate endpoints?

• Should we be thinking about discovering multiple surrogates and developing a composite model of disease?

• How can we advance the science of surrogate endpoints while meeting regulatory rigor, particularly with regard to determining if surrogates are predictive? Is it possible to build exploration into trials and to generate information for future discovery research?

• How do we standardize clinical information and how specimens are collected and validated?

THE FUTURE OF SURROGATE ENDPOINTS – WHAT CHANGES WOULD MOST BENEFIT EACH SECTOR AND COMMUNITY?⁷ (ROUNDTABLE DISCUSSION)

7 This section is based on short presentations by Carl Barrett, Ph.D., Vice President of Translational Science in Oncology at AstraZeneca; Jimmy Lin, M.D., Ph.D., Founder and President of the Rare Genomics Institute; and Lynn Matrisian, Ph.D., MBA, Vice President of Scientific and Medical Affairs at the Pancreatic Cancer Action Network.

Starting with brief comments from representatives of different sectors of the biomedical research and development community and then proceeding to engage in a roundtable discussion, the workshop considered additional aspects of thinking about the future of surrogate endpoints.

To understand the private sector’s view, it is necessary to understand how the pharmaceutical industry goes about its drug development efforts. In Phase 1 and Phase 2, the goal is to gather as much information as possible to decide as quickly as possible whether a drug candidate is likely to have desired properties that will benefit patients beyond standard of care. To make these decisions, key factors such as pharmacodynamics, patient response, and potential to serve as surrogate endpoints are critical considerations. At this point, possible surrogate endpoints are called response biomarkers, which could include such measures as circulating tumor cells, measures of FDG-PET imaging, and changes in a laboratory measurement (e.g., circulating levels of prostate specific antigen). These response markers are built into clinical trials as secondary objectives, and the data are included in the clinical reports and often built into the go/no-go decision process. At this stage, it is likely that a number of different markers will be considered, but small patient numbers mean that none are statistically significant. Conversely, Phase 3 trials are designed to provide maximum information, both on the drug in question and on possible biomarkers and surrogate endpoints. However, given funding constraints, questions about possible surrogate endpoints are often cut from trials.

There are exceptions, however. Early work on Gleevec for treatment of chronic myelogenous leukemia (CML) found that the level of BCR-ABL transcripts in blood was strongly associated with long-term survival. Further study in clinical trials showed that BCR-ABL transcript levels correlated very well with five-year survival, and in an arbitrary decision, the company developing Gleevec set a three-log reduction in blood BCR-ABL transcript level as the surrogate endpoint for clinical trials. This strategy enabled the company to successfully develop Gleevec, although it was necessary for the company to set standards for BCR-ABL transcript measurement and quantification. This same surrogate endpoint was also used to develop a second-generation drug.

While there are many aspects of the CML example that are unique, there is a certain parallelism between this story and that of circulating tumor DNA in circulating tumor cells. Indeed, circulating tumor DNA in circulating tumor cells is being built into many trials today for drugs designed to treat lung cancer.
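As a minimal sketch of the arithmetic behind a log-reduction threshold like the three-log criterion described above for Gleevec: a three-log reduction means the transcript level has fallen to one-thousandth of its baseline value. The function names, the use of a per-patient baseline, and the example numbers below are illustrative assumptions, not the assay or the standardization procedure the company used.

```python
import math


def log_reduction(baseline_level: float, current_level: float) -> float:
    """Return the log10 fold-reduction of a transcript level relative to baseline.

    Both levels are assumed to be positive and expressed in the same
    (arbitrary) units, e.g. transcript copies normalized to a control gene.
    """
    if baseline_level <= 0 or current_level <= 0:
        raise ValueError("transcript levels must be positive")
    return math.log10(baseline_level / current_level)


def meets_surrogate_threshold(
    baseline_level: float, current_level: float, required_logs: float = 3.0
) -> bool:
    """True if the reduction from baseline is at least `required_logs` logs."""
    return log_reduction(baseline_level, current_level) >= required_logs


# Hypothetical patients: (baseline, follow-up) levels in arbitrary units.
patients = {"A": (100.0, 0.05), "B": (100.0, 0.5)}
for name, (baseline, follow_up) in patients.items():
    logs = log_reduction(baseline, follow_up)
    print(name, round(logs, 2), meets_surrogate_threshold(baseline, follow_up))
```

In this hypothetical example, patient A shows a 2000-fold drop (about 3.3 logs) and would meet the threshold, while patient B shows a 200-fold drop (about 2.3 logs) and would not. The calculation depends on every laboratory measuring and reporting the transcript level the same way, which is why measurement standards had to be set.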
Studies have shown that there is a clear drop in the level of circulating tumor DNA in blood that precedes by several months disease recurrence (which is visible using standard diagnostic imaging methods). In the future, it may be possible to sequence the DNA that becomes detectable during recurrence and find evidence of new molecular changes that are occurring earlier in the tumor. Such an early marker could enable the identification of drugs that could be used to more effectively treat and perhaps prevent recurrent disease.

From a commercial standpoint, surrogate endpoints are important in other contexts, such as patient identification. When they impact standard of care, they can be persuasive to insurers, but this requires that they be analytically validated and have received (or will receive) regulatory approval for multiple uses.

From the rare disease perspective, surrogate endpoints are a necessity rather than a luxury given the short supply of patients, the lack of extensive clinical data, and the fact that clinical outcomes may not be measurable for years after treatment. With rare diseases, often the exact genetic mechanism that triggers the rare disease is known, which means that the proteins affected are also known. Obviously, this should make it simple to identify a surrogate endpoint. However, a single biomarker may not capture the complexity of all clinical outcomes. Interestingly, even
when a potential surrogate is known and the biology understood, there may be prohibitive technical challenges. A case in point is the use of blood phenylalanine level as a surrogate marker for phenylketonuria, where it proved to be technically difficult to accurately measure blood phenylalanine level. Eventually, those technical issues were solved, and studies have shown that phenylalanine level in blood is a good surrogate endpoint for this genetic disorder.

Surrogate endpoints must ultimately provide information that benefits patients. From the perspective of a pancreatic cancer patient, the surrogate endpoint would be valuable if it can provide a measure of outlook for survival following surgery. For those 15 percent of pancreatic cancer patients who are candidates for surgery, a change in CA19-9 levels of more than 20 percent is an independent predictor of survival. However, oncologists do not use this information to assess therapy, but rather rely on an imaging “gold standard” that they assess over the course of three to four months. Unfortunately, the gold standard is inadequate because it is difficult to analyze the progression of pancreatic tumors based on imaging due to their unusual structural characteristics. Until recently, delaying three to four months post-surgery prior to initiating chemotherapy did not matter, as none of the available chemotherapy regimens significantly impacted survival. However, there are now two effective first-line therapies, a second-line therapy, and other agents that are currently in clinical trials. It is now possible to determine if a patient is likely to benefit from therapy – and to enroll them in a clinical trial that could increase their survival. Therefore, it is now possible (and reasonable) to conduct the clinical trials necessary to show that CA19-9 is indeed a surrogate marker for disease prognosis. While CA19-9 may not be the perfect surrogate endpoint, the risk of not using the marker, or of using it and getting the wrong answer, would be dwarfed by the potential benefit of starting therapy sooner. The question for regulators will be how much uncertainty is reasonable in a disease where time is of the essence. There is also the question of whether CA19-9 can be a surrogate endpoint for clinical trials, a question that can be answered as part of the clinical trials process.

WORK GROUP REPORTS

The workshop participants split into two action groups that developed plans for moving surrogate endpoint development forward under one of two scenarios. The first action group conducted its deliberations starting from the assumption that the current regulatory definition and guidance documents for the qualification and validation of surrogate endpoints would not change going forward. Given that assumption, using the NBDA’s Big Six Strategic Elements⁸ to guide biomarker discovery and development, and in view of the state of biomarker science today, this group was tasked with defining the type and level of evidence needed for regulatory approval within the context of a clinical trial. The second action group focused on identifying and exploring new ideas and approaches to the discovery and development of surrogate endpoints, starting from the assumption that a new approach is needed to redefine the surrogate endpoint.

8 The Big Six Strategic Elements for biomarker discovery and development are: the right clinical question, context of use, robust experimental design, appropriate numbers of high-quality biospecimens, robust technology standards, and high-quality data and appropriate analytics.

Figure 5: Work Groups Reports – Improve or Redefine Surrogate Endpoint Development

Action Group 1

This group proposed looking for early surrogate endpoints measured during a course of treatment, which would be associated with clinical benefit and correlate with an established surrogate endpoint that would be used in the clinical trial. Possible candidates would be imaging of tumor volume change using MRI in the neoadjuvant setting, PET imaging in advanced disease, or a series of measurements such as CTC counts. The proposed approach would be to initially demonstrate that the early surrogate endpoint correlates with the existing surrogate and then to show that the early surrogate is actually better than the accepted “gold standard” surrogate. This approach would enable the design of clinical trials that could provide an incremental demonstration of the evidentiary standards needed for approval of the earlier surrogate endpoint. CTCs are an attractive candidate for this approach because there is a validated, FDA-approved assay available and because CTCs have already been shown to be prognostic indicators for a number of different tumors, since they are associated with an adverse outcome when measured at
baseline. The Action Group decided to focus on a common disease – prostate cancer –where considerable data already exists and there are no effective surrogates (PSA is not approvable as a surrogate endpoint). Imaging was also discussed, but disease I but since measurable disease is relatively infrequent, and the since the most common sign of spread, metastasis to bone, cannot be accurately assessed, it is not a good candidate. Trials that be have been conducted and that have included CTCs as a measurement have shown that that it correlates with both positive and negative trial results – drugs that showed an efficacy produced a much larger drop in CTC levels than did drugs that failed trials. Beyond the clinical trials needed to determine if CTCs can serve as gold standard early surrogates for clinical response with cytotoxic drugs, significant research will be required to assess the potential usefulness of CTCs would be useful surrogates for testing biologics, angiogenesis inhibitors, and bone-targeting agents. If CTCs can be shown to be effective for prostate cancer, the approach could be extended to other cancers. For all of these potential applications, the strength of association between the surrogate post-treatment and the clinical outcome must be determined. Action Group 1 explored two other applications where early surrogate endpoints are needed: assessing Alzheimer’s disease progression and measuring kidney toxicity. Alzheimer’s disease is characterized by a progression from mild cognitive impairment to full-blown dementia and death. Imaging is the current gold standard surrogate marker and several techniques are currently used to measure volumetric and hemodynamic changes in the brain. 
The development of mild cognitive impairment, which can be assessed using a number of validated instruments such as ADAS-cog, is a precursor to Alzheimer’s, but the challenge is to distinguish between those individuals whose mild impairment will progress to Alzheimer’s and the larger number whose impairment will not. Changes in several biochemical markers – tau, phospho-tau, and amyloid-beta (A-beta) – have been observed in cerebrospinal fluid and more recently in blood, and a panel of these three markers could be a potential early surrogate endpoint. The Action Group noted that while several drugs intended to modify the course of the disease have failed to demonstrate efficacy in late-stage disease, there are suggestions that they may work better in individuals with mild cognitive impairment. An easily measured panel of biochemical markers that proved to be a reliable early surrogate marker would greatly enable trials to test this hypothesis.

The context of use for a proposed trial would be to use a panel of these three markers to identify patients with mild cognitive impairment who will go on to develop Alzheimer’s disease. This trial would provide data to establish the benefit-risk ratio associated with this panel. The surrogate could identify patients who could potentially benefit from an effective disease-modifying therapy, a clear benefit, while the risk would fall on those identified patients who would ultimately be exposed to the potential toxicity of the therapy without receiving benefit. Prospective studies would be needed to generate the evidence required for this context of use, though there are data from longitudinal studies that could be used to help inform the prospective trial. A second context of use that the Action Group identified would be to use the panel of markers as a companion diagnostic to identify the patient subgroup most likely to respond to drug therapy and to benefit from a disease-modifying agent.

The driver for developing an early surrogate for kidney toxicity is that the major risk factor for chronic kidney disease is acute kidney injury and that 30 percent of acute kidney injury results from drug toxicity. The imperative, then, is to avoid kidney toxicity, but a confounding issue is that the level of kidney toxicity needed to produce chronic kidney disease is unknown. Unfortunately, the two measures now used to assess kidney toxicity – BUN and creatinine – are not reliable except as a means of confirming that major kidney damage has already occurred. A context of use for the biomarker KIM-1 would be to provide an early indication of kidney toxicity associated with approved drugs that are known to cause renal toxicity. The benefits of such a context of use are clear, and the primary risk is the occurrence of false positives. Interestingly, false positives would likely be an indicator that a patient had renal cell cancer or existing chronic kidney disease (which would be informative). The required clinical study would determine whether KIM-1 levels rise before BUN and creatinine levels do (recognizing that BUN and creatinine levels can go up for reasons not associated with kidney toxicity).

A second context of use for KIM-1 would be as a signal to stop a drug in clinical use. There would be a great benefit in knowing early if a drug was causing renal toxicity. A major premise today is that the kidney has enormous reserves and that a kidney can continue to function at half capacity before there is an effect on BUN and creatinine. This is not an acceptable situation, so the proposed marker would not have to be perfect, just better than what is available today.
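The study question raised above, whether KIM-1 rises before BUN and creatinine do, can be framed as a lead-time comparison for each patient. The following is only a minimal sketch of that idea in Python, not a design proposed by the Action Group. The function names, the sampling days, the marker values, and both thresholds are invented for illustration and carry no clinical meaning.

```python
from typing import Optional, Sequence


def first_crossing(times: Sequence[float], values: Sequence[float],
                   threshold: float) -> Optional[float]:
    """Return the first sampling time at which a marker reaches its threshold."""
    for t, v in zip(times, values):
        if v >= threshold:
            return t
    return None  # the marker never reached the threshold


def lead_time(times: Sequence[float],
              candidate: Sequence[float], candidate_threshold: float,
              reference: Sequence[float], reference_threshold: float) -> Optional[float]:
    """Time by which the candidate marker flags toxicity ahead of the reference.

    A positive value means the candidate crossed its threshold first.
    Returns None if either marker never crossed its threshold.
    """
    t_candidate = first_crossing(times, candidate, candidate_threshold)
    t_reference = first_crossing(times, reference, reference_threshold)
    if t_candidate is None or t_reference is None:
        return None
    return t_reference - t_candidate


# Hypothetical single-patient series (arbitrary units, invented values).
days = [0, 1, 2, 3, 4, 5]
kim1 = [1.0, 1.2, 2.5, 3.1, 3.4, 3.6]
creatinine = [0.9, 0.9, 1.0, 1.1, 1.6, 2.0]

# Thresholds of 2.0 and 1.5 are assumptions chosen only to make the example work.
lead = lead_time(days, kim1, 2.0, creatinine, 1.5)
```

In a real study, the distribution of this lead time across many patients, together with the rate of false positives noted above, would be the quantity of interest rather than any single value.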
One possible trial design would be to collect data and give the information to half of the physicians taking care of the patients and not to the other half. Another approach, one suggested by the FDA, would be to recognize toxicity and then randomize, continuing half the patients on the drug and stopping therapy in the other half, but this approach is viewed by the industry as unethical. Other contexts of use include: drug efficacy trials for drugs designed to either prevent or treat kidney toxicity; monitoring environmental toxicity, which is a large problem outside of the United States; and use as a companion diagnostic to monitor renal toxicity.

Action Group 2

Looking at the issue of surrogate endpoints from the perspective of creating a completely new approach, the group recommended that the first step would be to establish a set of principles based on assessing the effects on patients over the entire course of a disease. This would require examining all of the information about all patients involved in a trial. It would be necessary to create models of disease that recreate the natural history of the disease and its dynamic nature. Such models would be statistical and would account for the stochastic nature of disease and associated uncertainty. Another important principle would be to turn to other fields for help in creating such models, since there are many fields that routinely build and use such statistical, multi-parameter models.

The next step would be to develop demonstration projects using these in silico models to integrate data from multiple markers, recognizing that trying to characterize a complex disease with single markers is not likely to be successful. These in silico models would put individual markers into a larger context, with particular weights that come from other data streams, in an adaptive composite index system. The group discussed using biochemical relapse of prostate cancer as one demonstration project – assessing patients as they advance through their disease. There are a number of treatment options available, but not a good understanding of how to use them either as synchronized cocktails or as time-evolved cocktails. However, there are many analytes that can now be measured via CTCs, bone marrow aspirates, and imaging that could provide the data needed to inform the in silico models and provide treatment options. The group noted that there should be multiple demonstrations in other types of cancer to generate data that could be used to develop a durable, broadly applicable platform.

A similar approach of starting with the natural history and creating multi-parameter models that make use of as much data as possible, including patient-reported outcomes where available, should be taken for other complex diseases, such as Alzheimer’s, type II diabetes, and chronic obstructive pulmonary disease. The group noted that in the quest to profile the progress of disease over time, there is a need to understand what is normal, which will require a major effort beyond the abilities of any one group or organization to conduct. A coordinating body such as the NBDA would be needed to function as a “United Nations of biomarkers” and pull data together from multiple sources. A visionary commitment of funding would also be required over time, as a great deal of data is needed and ultimately such a major effort will have failures.
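The idea of placing individual markers into a weighted composite index can be illustrated with a small sketch. This is not the group's model, which was described only in general terms. It assumes the simplest possible form: each marker is standardized against a reference (normal) distribution and the results are combined as a weighted average. The marker names, reference values, and weights are hypothetical, and the weights are fixed here, whereas the adaptive system the group envisioned would update them from other data streams.

```python
from typing import Dict, Tuple


def composite_index(values: Dict[str, float],
                    reference: Dict[str, Tuple[float, float]],
                    weights: Dict[str, float]) -> float:
    """Weighted average of standardized marker values.

    values:    one patient's measurement for each marker
    reference: (mean, standard deviation) of each marker in a reference population
    weights:   relative importance of each marker (assumed fixed in this sketch)
    """
    total_weight = sum(weights.values())
    score = 0.0
    for name, weight in weights.items():
        mean, sd = reference[name]
        z = (values[name] - mean) / sd  # distance from "normal" in SD units
        score += weight * z
    return score / total_weight


# Hypothetical markers and numbers, invented purely for illustration.
reference = {"ctc_count": (5.0, 2.0), "psa": (4.0, 1.0), "imaging_score": (10.0, 5.0)}
weights = {"ctc_count": 0.5, "psa": 0.3, "imaging_score": 0.2}
patient = {"ctc_count": 9.0, "psa": 5.0, "imaging_score": 10.0}

index = composite_index(patient, reference, weights)
```

The reference distribution in this sketch stands in for the understanding of "the normal" that the group identified as a prerequisite; without it, the standardization step has nothing to anchor to.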
