Originally published as JCO Early Release 10.1200/JCO.2006.09.2775 on December 11, 2006. Copyright © 2007 American Society of Clinical Oncology and College of American Pathologists. All rights reserved.
American Society of Clinical Oncology/College of American Pathologists Guideline Recommendations for Human Epidermal Growth Factor Receptor 2 Testing in Breast Cancer
From the American Society of Clinical Oncology, Alexandria, VA; and the College of American Pathologists, Northfield, IL. Address reprint requests to American Society of Clinical Oncology, Cancer Policy and Clinical Affairs, 1900 Duke Street, Suite 200, Alexandria, VA 22314; e-mail: guidelines@asco.org
Purpose: To develop a guideline to improve the accuracy of human epidermal growth factor receptor 2 (HER2) testing in invasive breast cancer and its utility as a predictive marker. Methods: The American Society of Clinical Oncology and the College of American Pathologists convened an expert panel, which conducted a systematic review of the literature and developed recommendations for optimal HER2 testing performance. The guideline was reviewed by selected experts and approved by the board of directors for both organizations. Results: Approximately 20% of current HER2 testing may be inaccurate. When carefully validated testing is performed, available data do not clearly demonstrate the superiority of either immunohistochemistry (IHC) or in situ hybridization (ISH) as a predictor of benefit from anti-HER2 therapy. Recommendations: The panel recommends that HER2 status should be determined for all invasive breast cancer. A testing algorithm that relies on accurate, reproducible assay performance, including newly available types of brightfield ISH, is proposed. Elements to reliably reduce assay variation (for example, specimen handling, assay exclusion, and reporting criteria) are specified. An algorithm defining positive, equivocal, and negative values for both HER2 protein expression and gene amplification is recommended: a positive HER2 result is IHC staining of 3+ (uniform, intense membrane staining of > 30% of invasive tumor cells), a fluorescent in situ hybridization (FISH) result of more than six HER2 gene copies per nucleus or a FISH ratio (HER2 gene signals to chromosome 17 signals) of more than 2.2; a negative result is an IHC staining of 0 or 1+, a FISH result of less than 4.0 HER2 gene copies per nucleus, or FISH ratio of less than 1.8. Equivocal results require additional action for final determination. It is recommended that to perform HER2 testing, laboratories show 95% concordance with another validated test for positive and negative assay values. 
The panel strongly recommends validation of laboratory assay or modifications, use of standardized operating procedures, and compliance with new testing criteria to be monitored with the use of stringent laboratory accreditation standards, proficiency testing, and competency assessment. The panel recommends that HER2 testing be done in a CAP-accredited laboratory or in a laboratory that meets the accreditation and proficiency testing requirements set out by this document.
The human epidermal growth factor receptor 2 gene ERBB2 (commonly referred to as HER2) is amplified in approximately 18% to 20% of breast cancers.1 ERBB2 is the official name provided by the HUGO Gene Nomenclature Committee for the v-erb-b2 erythroblastic leukemia viral oncogene homolog 2 gene that encodes a member of the epidermal growth factor receptor family of receptor tyrosine kinases. Several aliases have been used in the literature (for example, NEU, NGL, HER2, TKR1, HER-2, c-erb B2, HER-2/neu), and the panel opted to adopt the commonly used term HER2 throughout this article.2 Amplification is the primary mechanism of HER2 overexpression, and abnormally high levels of a 185-kd glycoprotein with tyrosine kinase activity are found in these tumors.3 HER2 overexpression is associated with clinical outcomes in patients with breast cancer.4-6 There are several possible uses of HER2 status. HER2 positivity is associated with worse prognosis (higher rate of recurrence and mortality) in patients with newly diagnosed breast cancer who do not receive any adjuvant systemic therapy. Thus, HER2 status might be incorporated into a clinical decision, along with other prognostic factors, regarding whether to give any adjuvant systemic therapy. HER2 status is also predictive for several systemic therapies.6 In this regard, HER2 positivity appears to be associated with relative, but not absolute, resistance to endocrine therapies in general.7 Although controversial, preclinical and clinical studies have suggested that this effect may be specific to selective estrogen receptor modulator therapy, such as tamoxifen, and perhaps not to estrogen depletion therapies, such as with aromatase inhibitors.8 HER2 status also appears to be predictive for either resistance or sensitivity to different types of chemotherapeutic agents. 
HER2 may be associated with relative, but not absolute, lower benefit from nonanthracycline, nontaxane-containing chemotherapy regimens.9 In contrast, retrospectively obtained results from prospectively conducted randomized clinical trials appear more definitive in suggesting that HER2 positivity is associated with response to anthracycline therapy, although this effect may be secondary to coamplification of HER2 with topoisomerase II, which is the direct target of these agents.10-13 Preliminary data also suggest that HER2 may predict for response and benefit from paclitaxel in either the metastatic or adjuvant settings.14,15 Perhaps most importantly, several studies have now shown that agents that target HER2 are remarkably effective in both the metastatic and adjuvant settings. Trastuzumab (Herceptin; Genentech, South San Francisco, CA), a humanized monoclonal antibody, improves response rates, time to progression, and even survival when used alone or added to chemotherapy in metastatic breast cancer.16 Trastuzumab is also active as a single agent17,18 and was approved in 1998 by the US Food and Drug Administration for the treatment of metastatic disease. Importantly, five international, prospective randomized clinical trials have demonstrated that adjuvant trastuzumab reduces the risk of recurrence and mortality by one half and one third, respectively, in patients with early-stage breast cancer.19-23 Furthermore, recently reported results suggest that a small-molecule dual HER1/HER2 tyrosine kinase inhibitor, lapatinib (Tykerb; GlaxoSmithKline, Philadelphia, PA), improves clinical outcome in patients with advanced disease when added to capecitabine.24 Taken together, these results imply that HER2 is a useful marker for therapeutic decision making for patients with breast cancer, and they emphasize the importance of evaluating the assay accurately. 
Gene amplification was initially detected by Southern hybridization in frozen tumor specimens, and was subsequently found to correlate with overexpression at the mRNA and protein levels.25 The early trials of trastuzumab in metastatic breast cancer enrolled patients after central testing using an immunohistochemistry (IHC) assay with the anti-HER2 antibodies 4D5 (the parent antibody of trastuzumab) and CB11 on formalin-fixed, paraffin-embedded tissue, and this clinical trials assay identified staining patterns for HER2 as negative (0 and 1+) or positive (2+ and 3+). In these studies, only patients with 2+ or 3+ staining were eligible. Retrospective analyses have suggested that only patients with IHC 3+ staining and/or HER2 gene amplification by fluorescent in situ hybridization (FISH) benefited16 (also see Appendix C). Concordance data subsequently showed that only 24% of the IHC 2+ tumors had gene amplification when tested by FISH.26 Preliminary findings from the only randomized trial of trastuzumab in patients with 0 or 1+, nonamplified HER2 status have been reported.27 In this study, there was no statistically significant benefit from the addition of trastuzumab to paclitaxel in women with HER2-negative breast cancer, but the study was underpowered and limited by the lack of central testing for HER2.
Early studies suggested that as many as 30% of breast cancers have HER2 overexpression.1,25 However, it is likely that HER2 positivity was overestimated in these studies, and that its true frequency is lower in a general unselected population, since those data came largely from high-risk early-stage breast cancer cohorts and from patients with metastatic disease. The frequency of HER2-positive breast cancer appears to be lower when considering all patients with a new diagnosis of invasive breast cancer. Yaziji et al28 recently reported their experience with large volume testing and observed that 18% of samples tested (n = 2,913) showed gene amplification by FISH (defined as a HER2:CEP17 ratio The results of five randomized trials of adjuvant trastuzumab versus no trastuzumab have been reported since 2005, and various strategies to determine or confirm HER2 overexpression were used (Table 1). Adjuvant trastuzumab given during and/or after chemotherapy to women with early-stage breast cancer and evidence of HER2 overexpression results in a significant improvement in disease-free survival19,20 and overall survival.21-23 HER2 overexpression is now accepted as a strong predictive marker for clinical benefit from trastuzumab in both the metastatic39 and adjuvant settings. Indeed, the American Society of Clinical Oncology (ASCO) Tumor Marker Guidelines Panel has recommended routine testing of HER2 on newly diagnosed and metastatic breast cancer since 2001.39
In summary, HER2 testing should be routinely performed in patients with a new diagnosis of invasive breast cancer. However, the best method to assess HER2 status, in regards both to the type of assay used and the optimal method to perform each assay, remains controversial. For most of the prospective randomized adjuvant trials of trastuzumab, testing algorithms for HER2 were somewhat arbitrarily developed, consisting of either IHC testing with reflex FISH if IHC 2+ or reliance on ISH testing alone to detect gene amplification ratios of 2.0 or higher.19-22,30,31,33,40 Those with evidence of amplification by FISH or overexpression by IHC (3+) were considered suitable candidates for participation in these trials. For the most part, these algorithms have been adopted into clinical practice. The assays that are used to obtain the required data to populate these algorithms have not been standardized. Several assays have been used for HER2 determination in tissue (Table 2). US Food and Drug Administration regulations also allow pathology laboratories to develop and implement so called "home brew assays" using US Food and Drug Administration-approved analyte specific reagents.41 While some assays have been carefully validated, others, especially "home brew assays" have not. Prospective substudies from two of the adjuvant randomized trials of trastuzumab versus nil have demonstrated that approximately 20% of HER2 assays performed in the field (at the primary treatment site's pathology department) were incorrect when the same specimen was re-evaluated in a high volume, central laboratory.30,40 Such a disorganized practice and high rate of inaccuracy, for such an important test that dictates a critically effective yet potentially life-threatening and expensive treatment, is not acceptable.
Trastuzumab therapy is not without its drawbacks. Although treatment duration in the metastatic setting varies widely, currently adjuvant trastuzumab is recommended for 12 months. The drug cost of 52 weeks of trastuzumab in the community setting in the United States is approximately $100,000 based on average sales price (www.accc-cancer.org). In addition, there is a requirement for 9 to 12 months of intravenous therapy after completion of adjuvant chemotherapy. Importantly, trastuzumab is associated with a small risk of serious cardiac toxicity.42 In the prospective randomized adjuvant trials, careful serial cardiac monitoring has demonstrated that at median follow-up times of 3 years or fewer, approximately 5% to 15% of patients develop cardiac dysfunction, and approximately 1% to 4% develop significant cardiac events (including symptomatic congestive heart failure) while taking trastuzumab.43-45 Taken together, the significant benefits coupled with the high cost and potential cardiotoxicity of trastuzumab demand accurate HER2 testing. If response to therapy were to be considered a gold standard, then the ideal test for HER2 would approach 100% sensitivity (ie, identify as HER2-positive all patients who will benefit from a specific therapy, the true-positives) and 100% specificity (ie, identify as HER2-negative all patients who would not benefit from a specific therapy, the true-negatives). However, two points must not be forgotten. First, a precise definition of accuracy is how close the measured values are to a supposed true value, and it incorporates both variability (ie, precision) and bias (ie, a systematic difference between average measured value and true value). Implicit in this discussion is that a suitable gold standard has been established for purposes of determining true status for each specimen. Second, accurate determination of HER2 status must not be viewed exclusively in terms of benefit from anti-HER2 therapy, like trastuzumab. 
Patients with breast cancers that overexpress HER2 differ greatly in their response to trastuzumab. Available clinical data indicate the near certainty that there are patients who truly overexpress HER2 but have upstream or downstream anomalies that render the interaction with trastuzumab ineffective, and it would not be appropriate to consider these patients as having HER2-negative disease. Rather the challenge remains to define the additional defects that place this HER2 positivity in the appropriate therapeutic context. Despite attempts within the international pathology community to improve the status of HER2 testing in routine practice,46-50 testing inaccuracy remains a major issue with both IHC and FISH.30,31,40 Various factors can explain the large variability observed in clinical practice and in clinical trials. These factors are summarized in Table 3. The use of laboratory assays as the sole determinant for therapy selection poses a significant challenge to pathologists performing and interpreting the results and to oncologists who must rely on them for clinical decisions. Therefore, ASCO and the College of American Pathologists (CAP) established a clinical practice guideline expert panel charged with developing recommendations regarding HER2 testing in breast cancer. Information regarding the scope of the problem can be found in Appendices C (Evidence of HER2 Status and Trastuzumab Benefit) and D (Evidence on HER2 Testing Variation).
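The sensitivity, specificity, bias, and precision terms defined earlier in this introduction can be made concrete with a short calculation. The following is a minimal sketch in Python, not part of the guideline; the function names and all numeric inputs are hypothetical and chosen only for illustration, and a gold standard is assumed to exist for the purpose of the example.

```python
from statistics import mean, stdev


def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from counts judged against a gold standard.

    Sensitivity is the share of truly positive cases the test calls positive;
    specificity is the share of truly negative cases the test calls negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def bias_and_precision(measurements: list[float], true_value: float) -> tuple[float, float]:
    """Return (bias, standard deviation) for repeated measurements of one specimen.

    Bias is the systematic difference between the average measured value and the
    supposed true value; the standard deviation summarizes variability (precision).
    """
    bias = mean(measurements) - true_value
    spread = stdev(measurements)
    return bias, spread


# Hypothetical counts and readings, for illustration only.
sens, spec = sensitivity_specificity(tp=18, fp=2, tn=76, fn=4)
bias, spread = bias_and_precision([2.1, 2.3, 2.2], true_value=2.0)
```

An assay can be precise (small spread) and still inaccurate if its bias is large, which is why the text treats accuracy as incorporating both components.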
Guideline Questions This guideline addresses two principal questions regarding HER2 testing. Table 4 summarizes the recommendations.
Practice Guidelines Practice guidelines are systematically developed statements to assist practitioners and patients in making decisions about appropriate health care for specific clinical circumstances. Attributes of good guidelines include validity, reliability, reproducibility, clinical applicability, clinical flexibility, clarity, multidisciplinary process, review of evidence, and documentation. Guidelines may be useful in producing better care and decreasing cost. Specifically, utilization of clinical guidelines may provide the following:
In formulating recommendations for HER2 testing in breast cancer, ASCO and CAP considered these tenets of guideline development, emphasizing review of data from appropriately conducted and analyzed clinical trials. However, it is important to note that guidelines cannot always account for individual variation among patients. Guidelines are not intended to supplant physician judgment with respect to particular patients or special clinical situations and cannot be considered inclusive of all proper methods of care or exclusive of other treatments reasonably directed at obtaining the same result. Accordingly, ASCO considers adherence to these guidelines to be voluntary, with the ultimate determination regarding their application to be made by the physician in light of each patient's individual circumstances. In addition, these guidelines describe the use of procedures and therapies in clinical practice; they cannot be assumed to apply to the use of these interventions performed in the context of clinical trials, given that clinical studies are designed to evaluate or validate innovative approaches in a disease for which improved staging and treatment is needed. In that guideline development involves a review and synthesis of the latest literature, a practice guideline also serves to identify important questions and settings for further research.
Panel Composition The ASCO Health Services Committee (HSC) and the CAP Council on Scientific Affairs (CSA) jointly convened an expert panel consisting of experts in clinical medicine and research relevant to HER2 testing, including medical oncology, pathology, epidemiology, statistics, and health services research. Academic and community practitioners and a patient representative were also part of the panel. Representatives from the US Food and Drug Administration, the Centers for Medicare and Medicaid Services, the National Cancer Institute, and the National Academy of Clinical Biochemistry served as ex-officio members. The opinions of panel members associated with official government agencies represent their individual views and not necessarily those of the agency with which they are affiliated. The panel members are listed in Appendix A Table A1. Representatives of commercial laboratories and assay/drug manufacturers (Appendix B Table A2) were invited as guests to attend the open portion of the panel meeting held at ASCO headquarters in March 2006.
Literature Review and Analysis Study design was not limited to randomized controlled trials, but was expanded to include any study type, including cohort designs, case series, evaluation studies, comparative studies, and prospective studies. Also included were testing guidelines and proficiency strategies of various United States and international organizations. Letters, commentaries, and editorials were reviewed for any new information. Case reports were excluded. Articles were selected for inclusion in the systematic review of the evidence if they met the following criteria: (1) the study compared, prospectively or retrospectively, the negative predictive value (NPV) or positive predictive value (PPV) of FISH or IHC; the study described technical comparisons across various assay platforms; the study examined potential testing algorithms for HER2 testing; or the study examined the correlation of HER2 status in primary versus metastatic tumors from the same patients; and (2) the study population consisted of patients with a diagnosis of invasive breast cancer; and (3) the primary outcomes included the PPV and NPV of FISH and IHC to determine HER2 status, alone and in combination; concordance across platforms; accuracy in determining HER2 status and benefit from anti-HER2 therapy, sensitivity, and specificity of specific tests. Consideration was given to studies that directly compared results across assay platforms. The panel reviewed the results of randomized controlled trials in breast cancer testing anti-HER2 therapies like trastuzumab and lapatinib. The panel also reviewed unblinded trials comparing various testing methods, describing test characteristics, and defining strategies for quality assurance of testing in the literature. Individuals representing regulatory agencies (CMS and US Food and Drug Administration) also provided information about the regulatory framework. 
Individuals involved with quality assurance in the United States (CAP), Great Britain, and Canada (Province of Ontario) also provided information about programs to measure and improve HER2 testing. Survey data from the maker of trastuzumab (Genentech) was also evaluated as well as testimony provided by testing manufacturers (Ventana, Dako, Abbott) and large clinical laboratories (Clarient, Mayo Medical Labs, Phenopath, Quest, and US Labs) to define the current status of training and testing for HER2. This information was used to help the panel define the best algorithm for testing, specify testing requirements and exclusions, and the necessary quality assurance monitoring that will make the testing less variable and more accurate. ASCO/CAP expert panel literature review and analysis. An initial abstract screen was performed by ASCO staff. The ASCO/CAP panel reviewed all remaining potentially relevant abstracts identified in the original literature searches to select studies pertinent to its deliberations. Two panel members independently reviewed each abstract for its relevance to the clinical questions, and disagreements were resolved by third-party review. Full-text articles were then reviewed for all selected abstracts. Evidence tables were developed based on selected studies that met the criteria for inclusion.
Consensus Development Based on Evidence
Guideline and Conflict of Interest
Revision Dates
Summary of Outcomes Assessed
Literature Search Preliminary searches identified 1,802 MEDLINE abstracts. The initial abstract screen performed by ASCO staff eliminated 1,010 abstracts that failed to meet any of the inclusion criteria. The ASCO panel conducted dual independent review of all remaining 792 potentially relevant abstracts identified in the original systematic review. The panel eliminated 667 abstracts at this stage of the review; the remaining 125 articles were reviewed in full for the interventions and outcomes described herein. A meta-analysis was not performed because the studies were judged to be too heterogeneous for meaningful quantitative synthesis.
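The abstract counts reported above are internally consistent, as a quick arithmetic check shows. This is a small illustrative sketch; the variable names are ours, and the numbers are taken directly from the preceding paragraph.

```python
# Counts reported in the Literature Search paragraph.
identified = 1802          # MEDLINE abstracts from preliminary searches
excluded_by_staff = 1010   # eliminated in the initial ASCO staff screen
excluded_by_panel = 667    # eliminated after dual independent panel review

# Abstracts passed to the panel, then articles reviewed in full.
reviewed_by_panel = identified - excluded_by_staff
reviewed_in_full = reviewed_by_panel - excluded_by_panel

print(reviewed_by_panel, reviewed_in_full)
```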
Previous Guidelines and Consensus Statements Testing algorithms described in existing guidelines assume a high level of correlation between IHC and FISH assays. An example from the United Kingdom in 2004 recommends a testing algorithm that uses IHC as the primary test, with a score of 0 or 1+ interpreted as HER2-negative, a score of 3+ interpreted as HER2-positive, and a score of 2+ interpreted as equivocal (or inconclusive) and automatically sent for FISH testing.48 The United Kingdom panel emphasizes several requirements for a laboratory to be approved for HER2 testing, such as a minimum annual case load (250 cases of IHC and 100 cases of FISH) below which laboratories should consider using a reference laboratory, use of standardized and validated assays, and adherence to ongoing quality assurance programs. Other recommendations include the use of tissue-based controls, limiting the reading to the invasive component of the tumor, and strict adherence to kit assay protocol and scoring methodology.48 CAP issued a similar set of recommendations in the United States after a Strategic Science Symposium sponsored by CAP, Rosemont, IL, May 4-5, 2002.47 It emphasized the need for individual laboratories to document their own concordance experience of FISH v IHC (90% for IHC 0 and 3+ and 95% for IHC 1+) before limiting reflex FISH testing only to IHC 2+ results, and also offered recommendations on the use of a standardized report format and defined terminology. Note that these concordance requirements were set based on a palliative role of trastuzumab. Efforts to improve HER2 testing accuracy have been reported by several groups.52
What Is the Optimal Testing Algorithm for the Assessment of HER2 Status? Summary and recommendations. The literature review and resultant panel discussion elucidated three categories of HER2 testing results leading to different clinical decisions for patients with breast cancer. The test, regardless of method used, can be found to be positive, equivocal, or negative. Each of these test results triggers defined patient management algorithms as shown in Figures 1 (IHC) and 2 (FISH).
In all cases, it is assumed that the test being used is accurate and reproducible based on good laboratory practices as defined later in this article. In order to classify a HER2 test as either positive or negative, the laboratory must have performed concordance testing with a validated FISH assay and confirmed that only 5% or less of samples classified as either positive or negative disagree with that validated assay on an ongoing basis. If the laboratory cannot satisfy this criterion, it should not perform HER2 testing and should send specimens to a reference laboratory. Equivocal cases are not expected to be 95% concordant; rather, they should be subjected to a confirmatory test. Concordance testing should be confirmed annually. A minority view expressed within the panel was that IHC is not a sufficiently accurate assay to determine HER2 status and that FISH should be preferentially used. It is important to note that concordance of assays does not assure accuracy (ie, how close the measured values are to a supposed true value; Appendix F). Evaluating accuracy of a test requires comparison to a gold standard. There is no gold standard at present; no assay currently available is perfectly accurate to identify all patients expected to benefit or not from anti-HER2 therapy. The following definitions have been accepted for analysis of HER2. It is critical that these analyses be conducted on the invasive component of the breast cancer, because HER2 is, for unclear reasons, frequently increased (overexpressed and/or overamplified) in in situ breast cancer, and the clinical implications of this finding are uncertain.53 Positive HER2 test. 
Based on a literature review of clinical trials, international studies and protocols, expert consensus, and US Food and Drug Administration Panel findings, a positive HER2 test is defined as either IHC result of 3+ cell surface protein expression (defined as uniform intense membrane staining of > 30% of invasive tumor cells) or FISH result of amplified HER2 gene copy number (average of > six gene copies/nucleus for test systems without internal control probe) or HER2/CEP 17 ratio of more than 2.2, where CEP 17 is a centromeric probe for chromosome 17 on which the HER2 gene resides. The 30% criterion for a positive IHC result is further discussed in Appendix G. The original FISH test results were defined as either positive or negative, but an intermediate range (hereafter referred to as the equivocal range) has since been described and the clinical significance of this observation remains unclear.34-36 This strategy classifies patients as having HER2-positive disease based on positive results with either test. It is recognized that current data are insufficient to define whether these patients represent true- or false-positives. Although the large prospective randomized clinical trials of trastuzumab were not prospectively designed to answer these questions, we anticipate and recommend that such analyses will be forthcoming as correlative studies. Equivocal HER2 test. Much of the confusion about HER2 testing has resulted from the need to define trastuzumab treatment (yes or no) based on test results that represent a continuous rather than a categoric variable. Furthermore, there is significant variation in the intermediate (equivocal) ranges for both the IHC and FISH assays. The equivocal range for IHC consists of samples scored 2+, and this may include up to 15% of samples.29 An equivocal result (2+) is complete membrane staining that is either nonuniform or weak in intensity but with obvious circumferential distribution in at least 10% of cells. 
Very rarely, in the experience of panel members, invasive tumors can show intense, complete membrane staining of 30% or fewer tumor cells. These are also considered to be equivocal in this guideline. Some but not all of these samples may have HER2 gene amplification and require additional testing to define the true HER2 status (Figs 1 and 2).54,55 The equivocal range for FISH assays is defined as HER2/CEP 17 ratios from 1.8 to 2.2 or average gene copy numbers between 4.0 and 6.0 for those systems without an internal control probe.34-36 Note, however, that patients with a HER2/CEP17 FISH ratio between 2.0 and 2.2 were formerly considered HER2-positive and were eligible for treatment in the adjuvant trastuzumab trials. Therefore, available efficacy data do not support excluding them from therapy with trastuzumab. This group is much smaller, probably fewer than 3% of samples.56 Polysomy 17 is observed in approximately 8% of all specimens, mostly among cases with four to six HER2 gene copies (equivocal range).57,58 There is no accepted definition of what constitutes polysomy and authors have used different criteria to define it. If polysomy 17 is defined as three or more copies of CEP17, most are not associated with protein or mRNA overexpression57 and the same has been observed in tumors with a HER2 gene copy number between 4 and 6.35 Discordant results (IHC 3+/FISH-negative or IHC < 3+/FISH-positive) have also been described, and were observed in approximately 4% among 1,503 patients screened centrally (LabCorp, Burlington, NC) with both methods for eligibility for a clinical trial with trastuzumab.59 However, clinical outcome data for these two groups are not yet available. We anticipate and recommend that such analyses will be forthcoming as correlative studies of the large prospective randomized clinical trials of trastuzumab. 
It is also clear from the panel discussion and literature review that patients with equivocal HER2 test results constitute a poorly studied subgroup with uncertain association of test scores to benefit from HER2-directed therapy.60 The panel suggested that further studies of this patient group would be promoted by defining these test results as equivocal or borderline. The panel elected to use the term equivocal to avoid confusion with borderline positive and borderline negative terminology, which is sometimes used in the interpretation of FISH assays. Equivocal results of a single test require additional action, which should be specified in the initial report. Equivocal IHC samples must be confirmed by FISH analysis of the sample. Equivocal FISH samples are confirmed by counting additional cells or repeating the FISH test. If FISH remains equivocal after additional cells are counted or the assay is repeated, confirmatory IHC is recommended so that HER2 protein expression is known for the sample with true equivocal gene amplification status. Negative HER2 test. A negative HER2 test is defined as either an IHC result of 0 or 1+ for cellular membrane protein expression (no staining or weak, incomplete membrane staining in any proportion of tumor cells), or a FISH result showing HER2/CEP17 ratio of less than 1.8 or an average of fewer than four copies of HER2 gene per nucleus for systems without an internal control probe. The upper limit of 5% false-negatives should be considered high in view of the curative potential of trastuzumab treatment in the adjuvant setting, and laboratories should aim at bringing this percentage of false-negative tests as close to 0% as possible. HER2 assay exclusions. Each assay type has diagnostic pitfalls to be avoided. The panel agreed that there were situations where one assay type was preferred because of assay or sample considerations. 
Exclusion criteria to perform or interpret an IHC or FISH assay for HER2 are presented in Tables 5 and 6, respectively. The pathologist who reviews the histologic findings on the sample in question should determine the optimal assay type.
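The positive, equivocal, and negative definitions given in this section can be summarized as simple threshold rules. The following Python sketch is illustrative only and is not part of the guideline: the function and class names are hypothetical, the IHC score is assumed to have already been assigned by a pathologist using the staining criteria described above, and the assay exclusion criteria in Tables 5 and 6 are not modeled.

```python
from enum import Enum


class HER2Status(Enum):
    POSITIVE = "positive"
    EQUIVOCAL = "equivocal"
    NEGATIVE = "negative"


def classify_ihc(score: int) -> HER2Status:
    """Map a pathologist-assigned IHC score (0, 1, 2, or 3) to a category."""
    if score not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2, or 3")
    if score == 3:
        return HER2Status.POSITIVE
    if score == 2:
        return HER2Status.EQUIVOCAL
    return HER2Status.NEGATIVE


def classify_fish_ratio(ratio: float) -> HER2Status:
    """Classify a HER2/CEP17 ratio: > 2.2 positive, < 1.8 negative, else equivocal."""
    if ratio > 2.2:
        return HER2Status.POSITIVE
    if ratio < 1.8:
        return HER2Status.NEGATIVE
    return HER2Status.EQUIVOCAL


def classify_fish_copies(mean_copies: float) -> HER2Status:
    """Classify average HER2 copies per nucleus (systems without a control probe)."""
    if mean_copies > 6.0:
        return HER2Status.POSITIVE
    if mean_copies < 4.0:
        return HER2Status.NEGATIVE
    return HER2Status.EQUIVOCAL


def next_action(assay: str, status: HER2Status, already_repeated: bool = False) -> str:
    """Return the follow-up step described in the text for a single test result."""
    if assay not in ("IHC", "FISH"):
        raise ValueError("assay must be 'IHC' or 'FISH'")
    if status is not HER2Status.EQUIVOCAL:
        return "report result"
    if assay == "IHC":
        return "confirm by FISH"
    if already_repeated:
        return "perform confirmatory IHC"
    return "count additional cells or repeat FISH"
```

Note that the boundary values themselves (a ratio of exactly 1.8 or 2.2, or exactly 4.0 or 6.0 copies) fall in the equivocal range under this reading, because the positive and negative definitions use "more than" and "less than."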
Review of Relevant Literature The panel reviewed data from existing and completed clinical trials, published reports, and panel presentations by representatives of other national groups where stringent internal and external quality assessment measures have been implemented.
What Strategies Can Help Ensure Optimal Performance, Interpretation, and Reporting of Established Assays?
Testing validation requirements. This section describes technical validation requirements of an assay. It is important that any new test be compared with a reference test for which there has been clinical validation, which means that the reference test predicts clinical outcome. The validation procedure for any new test offered by a laboratory involves several steps. The laboratory must select and acquire appropriate equipment, ensure that personnel are trained in the use of the equipment, and develop a standard operating procedure for the test to be offered. Personnel must then be trained on the standard operating procedure with a standardized training plan. The new procedure must then be tested on a group of clinical cases representative of those on which the test will be offered. This testing must be done in parallel with a validated clinical test for the same analyte (for example, HER2). If the new test (for example, HER2 by IHC) is to be compared with a previously validated complementary test (for example, HER2 by FISH), the samples are tested by both methods and the results compared. Alternatively, for laboratories that have not previously validated either test, the test can be validated by having it run in parallel by another laboratory in which a validated assay is already offered. The number of tests required for a reliable validation is not well defined, but ranges from 25 to 100 cases and depends on the variety of results possible and the amount of variation in results encountered in the test. A new test should show at least 95% concordance with the validated assay to which it is compared. Individuals interpreting the assay must also have their interpretations compared with each other, and this concordance should also be at least 95%.
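The 95% concordance requirement can be illustrated with a minimal Python sketch. It assumes that each case has been reduced to a single category label from the new assay and from the validated comparator, and that concordance is the simple fraction of paired cases with matching labels; the function names and this definition are assumptions for illustration, and Appendix F remains the reference for the statistical considerations.

```python
from typing import Sequence


def concordance(new_results: Sequence[str], reference_results: Sequence[str]) -> float:
    """Fraction of paired cases where the new assay agrees with the validated assay."""
    if len(new_results) != len(reference_results):
        raise ValueError("both assays must be run on the same set of cases")
    if not new_results:
        raise ValueError("at least one paired case is required")
    agreements = sum(new == ref for new, ref in zip(new_results, reference_results))
    return agreements / len(new_results)


def meets_validation_target(new_results: Sequence[str],
                            reference_results: Sequence[str],
                            target: float = 0.95) -> bool:
    """True when concordance reaches the target (95% by default, as in the text)."""
    return concordance(new_results, reference_results) >= target


if __name__ == "__main__":
    # Hypothetical validation set of 40 cases with one discordant result.
    reference = ["positive"] * 10 + ["negative"] * 30
    new_assay = ["positive"] * 9 + ["negative"] * 31
    print(f"concordance = {concordance(new_assay, reference):.3f}")
    print("meets 95% target:", meets_validation_target(new_assay, reference))
```

The same function could be applied to two observers reading the same slides, which corresponds to the inter-observer comparison described above.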
If a laboratory chooses to use fixatives other than buffered formalin, it is obligated to validate that fixative's performance against the results of testing the same samples also fixed in buffered formalin and tested with the identical HER2 assay, and concordance in this situation must also be 95%. Appendix F discusses statistical considerations for determining appropriate numbers of cases to include in test sets and for setting reasonable performance goals.
Ongoing competency assessment. As part of the laboratory's internal quality assurance program, the competency of laboratory professionals and pathologists interpreting assays must be continuously addressed. The laboratory director has responsibility to ensure the competency of those performing the test, using established laboratory procedures available for review at the time of inspection. The review of competency for pathologists should include periodic or continuous peer comparisons among reviewing pathologists for the laboratory's HER2 specimens. If variation in interpretations is encountered, remediation must be done and documented. The panel agreed that acceptable performance standards for such tests were as follows:
Review of Relevant Literature
Previous consensus conferences described the laboratory requirements for HER2 testing.47,61 Internal quality assurance requirements mandated by CLIA 88 (legislation passed by the United States Congress in 1988) serve as the basis for these recommendations. Reporting recommendations are also defined broadly in CLIA 88 requirements and have been specified in the CAP consensus conference47 and in expert opinion.65 Literature substantiating the testing exclusion criteria and interpretation criteria was reviewed and used to establish the criteria.47,66
What Is the Regulatory Framework that Permits Enhanced Testing Scrutiny?
What Are the Optimal External Quality Assurance Methods to Ensure Ongoing Accuracy in HER2 Testing?
External quality assurance (laboratory accreditation). Beginning in 2007, the CAP Laboratory Accreditation Program will require that every CAP-accredited laboratory performing HER2 testing participate in a guideline-concordant proficiency testing program for that testing. In the future, the panel recommends that all accrediting agencies require guideline-concordant proficiency testing and laboratory accreditation requirements for HER2 testing. The Laboratory Accreditation Program will monitor performance in the required proficiency testing. Performance below 90% will be considered unsatisfactory and will require internal or external response consistent with accreditation program requirements. Responses must include identification of the cause of the poor performance, actions taken to correct the problem, and evidence that the problem has been corrected. The checklist of requirements for laboratories is presented in Table 11. International external quality assessment initiatives are described in Appendix I.
Proficiency testing requirements. All laboratories reporting HER2 results must participate in a guideline-concordant proficiency testing (PT) program specific for each assay method used (ie, separate programs for IHC, FISH, brightfield ISH, image analysis). To be concordant with this guideline, PT programs must distribute specimens at least twice per year, including a sufficient number of challenges (cases) to ensure adequate assessment of laboratory performance. For programs with 10 or more challenges per event, satisfactory performance requires correct identification of at least 90% of the graded challenges in each testing event. Laboratories with fewer than 90% correct responses on graded challenges in a given PT event are at risk for the next event. Laboratories that have unsatisfactory performance will be required to respond according to accreditation program requirements, up to and including suspension of HER2 testing for the applicable method until performance issues are corrected.
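The 90% rule for a single PT event can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name is hypothetical, ungraded challenges are assumed to have been removed before counting, and events with fewer than 10 graded challenges are rejected because the text states the rule only for programs with 10 or more challenges per event.

```python
from fractions import Fraction


def pt_event_satisfactory(graded_correct: int, graded_total: int,
                          threshold: Fraction = Fraction(9, 10)) -> bool:
    """Return True when at least 90% of graded challenges were identified correctly."""
    if graded_total < 10:
        # Assumption: smaller events fall outside the rule quoted in the text.
        raise ValueError("the 90% rule is stated for events with 10 or more challenges")
    if not 0 <= graded_correct <= graded_total:
        raise ValueError("graded_correct must lie between 0 and graded_total")
    # Exact rational arithmetic avoids floating-point surprises at the 90% boundary.
    return Fraction(graded_correct, graded_total) >= threshold


if __name__ == "__main__":
    for correct, total in [(9, 10), (8, 10), (18, 20), (17, 20)]:
        status = "satisfactory" if pt_event_satisfactory(correct, total) else "unsatisfactory"
        print(f"{correct}/{total}: {status}")
```

With 10 challenges, a single miss still meets the threshold and two misses do not, which shows how coarse the pass mark becomes when an event contains few challenges.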
Statistical Considerations for Proficiency Testing
Standard measures of performance for diagnostic tests having binary outcomes include sensitivity, specificity, and overall accuracy. Overall accuracy combines sensitivity and specificity into a single measure of the percentage of cases (positive and negative) for which the assay result is concordant with the true status (concordance rate). See Appendix F for a more detailed discussion of the statistical considerations involved in testing.
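For readers who prefer a worked example, the three measures named above can be computed from the four cells of a two-by-two table. The sketch below uses the standard textbook definitions; the class name, field names, and the example counts are hypothetical and are not taken from the guideline or from Appendix F.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """Counts of assay results cross-classified against true HER2 status."""
    true_pos: int   # assay positive, truly positive
    false_neg: int  # assay negative, truly positive
    true_neg: int   # assay negative, truly negative
    false_pos: int  # assay positive, truly negative


def sensitivity(c: BinaryCounts) -> float:
    """Proportion of truly positive cases that the assay calls positive."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: BinaryCounts) -> float:
    """Proportion of truly negative cases that the assay calls negative."""
    return c.true_neg / (c.true_neg + c.false_pos)


def overall_accuracy(c: BinaryCounts) -> float:
    """Proportion of all cases where the assay agrees with true status."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


if __name__ == "__main__":
    # Hypothetical counts for 100 cases, 20 of which are truly positive.
    counts = BinaryCounts(true_pos=18, false_neg=2, true_neg=76, false_pos=4)
    print(f"sensitivity      = {sensitivity(counts):.2f}")
    print(f"specificity      = {specificity(counts):.2f}")
    print(f"overall accuracy = {overall_accuracy(counts):.2f}")
```

Because overall accuracy weights sensitivity and specificity by the share of positive and negative cases, the same assay can show a different accuracy in a test set with a different mix of cases.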
How Can These Efforts Be Implemented and the Effects Measured?
Educational requirements and communication strategies. For this guideline to be effectively implemented by laboratories anywhere in the world, there will need to be effective and widespread educational efforts directed at pathologists, oncologists, patients, and advocacy groups. CAP will offer online and live educational sessions about clinical necessity, testing requirements, test interpretation guidelines, and methods by which acceptable performance will be measured through laboratory accreditation and proficiency testing, and organizations in other parts of the world could play a similar role. ASCO will create education materials for oncologists and patients about how laboratory quality can be evaluated through review of reports and laboratory quality assurance activities. Pathologists must actively monitor the quality of their test procedures, and oncologists, on behalf of their patients, must seek assurance that laboratories providing test results are appropriately accredited. These actions should improve the consistency of testing for HER2, although quantifying this improvement will be difficult. One of the important outcomes of accurate HER2 testing is to ensure that every breast cancer patient who might benefit from anti-HER2 therapy is accurately and promptly identified, while those who would not benefit are spared a costly and potentially harmful treatment.
Review of Previous Educational Efforts of the College of American Pathologists
In 2003, CAP cosponsored (along with the National Cancer Institute [Bethesda, MD], NIST, and US Food and Drug Administration) a session to discuss the development of a standardized HER2 reference material for use in immunohistochemical HER2 assays. There was general consensus at the meeting that the availability of reference material would help to standardize HER2 testing. Recommendations from that session are summarized in Table 13. 61 NIST has since been funded by the National Cancer Institute to develop this reference material, consisting of cell lines with specific, differing levels of HER2 protein expression. Cell lines have been identified and standardized production is in process. CAP has also held numerous educational sessions on HER2 testing at each of its national meetings since 2002. There have also been sessions given at industry-sponsored workshops at various national pathology meetings, but the message provided by these sessions has not been uniform.
Modification of the regulatory environment. This guideline will be made available for review by organizations involved in laboratory accreditation and proficiency testing services in the United States. ASCO and CAP will jointly work to facilitate the dissemination of these guidelines. Efforts will be directed at enhancing the education of laboratories by requesting publication of guideline information in Morbidity and Mortality Weekly Report, published by the Centers for Disease Control and Prevention. CAP will engage in significant live and online educational activities to help pathologists understand the significance of these changes in accreditation practice, beginning at the CAP annual meeting in September 2006. ASCO and CAP will provide educational opportunities (print, online, and society meetings) to educate health care professionals, patients, third party payers, and regulatory agencies. CAP will urge its members and participants in accreditation and proficiency testing programs to provide information in their reports specifying participation in laboratory accreditation. ASCO and CAP will work to coordinate these recommendations with those of other organizations, such as the National Comprehensive Cancer Network,71 the National Cancer Advisory Board, and patient advocacy organizations. We are confident that these measures will improve performance of laboratories using these and future predictive testing methods. CAP will actively review results of proficiency testing and laboratory accreditation activities and periodically publish performance results. The organization will also work to include quality monitoring activities of HER2 testing in its programs designed for ongoing quality assessment, similar to CAP's Q-tracks and Q-probes.72
Limitations of the Literature