Introduction
Please refer to the “History/Background and/or General Information” section for general information on testing of RNA and DNA as it applies to oncology.
This evidence review focuses on genetic testing used to guide oncologic treatment and whether the evidence behind this testing is adequate to draw conclusions about improved health outcomes for the Medicare population. In general, health outcomes of interest include patient mortality, morbidity, quality of life, and function.
For use in the Medicare population, tests themselves must demonstrate analytic validity, clinical validity, and clinical utility. Tests should enhance clinical decision-making, directly informing clinical management and improving patient outcomes.
In the context of oncology, genetic testing endeavors to improve patient outcomes through both prognostic and predictive means. For instance, oncologic genetic testing can optimize treatment choice (predictive), avoiding ineffective treatments and reducing adverse events. Ultimately, patient-centered outcomes must be the underlying justification for oncologic testing.
Internal Technology Assessment
PubMed and Google Scholar were searched for peer-reviewed, evidence-based databases and/or knowledge bases which provide information regarding analytic and clinical validity and clinical utility for genetic testing. As identified in the Institute of Medicine’s seminal work Clinical Practice Guidelines: Directions for a New Program,9 there are eight recommended attributes for clinical practice guidelines. These include validity, reliability/reproducibility, clinical applicability, clinical flexibility, clarity, development via a multidisciplinary process, scheduled review, and documentation. These attributes were referenced in the selection of databases/knowledge bases for genetic testing.
In order to be included, the database/knowledge base was required to be evidence-based, widely available, and created and/or facilitated by an organization with a focus on either oncology or genetics. Each database/knowledge base was also required to include a scoring metric which could be utilized to determine clinical actionability for specific genetic tests. Additionally, the databases/knowledge bases and their scoring metrics were required to demonstrate the attributes listed in the Institute of Medicine’s Clinical Practice Guidelines. All countries of origin were included as long as the database/knowledge base met the criteria, with only sources in English considered. Based on the above criteria, three databases/knowledge bases were identified that ideally met the needs of this LCD.
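The inclusion criteria above can be read as a simple checklist. The following is a minimal sketch of that checklist in Python; the record type, field names, and example entries are hypothetical and are used only for illustration, not as a description of how the review was actually conducted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBase:
    """Hypothetical record describing a candidate database/knowledge base."""
    name: str
    evidence_based: bool
    widely_available: bool
    org_focus: str                    # e.g. "oncology", "genetics", "other"
    has_scoring_metric: bool          # metric usable to judge clinical actionability
    meets_guideline_attributes: bool  # demonstrates the eight listed attributes
    language: str                     # language of the source, e.g. "English"


def meets_inclusion_criteria(kb: KnowledgeBase) -> bool:
    """Return True only if every criterion stated in the text is satisfied."""
    return (
        kb.evidence_based
        and kb.widely_available
        and kb.org_focus in {"oncology", "genetics"}
        and kb.has_scoring_metric
        and kb.meets_guideline_attributes
        and kb.language == "English"
    )


# Illustrative, made-up candidates (not real resources).
candidate_ok = KnowledgeBase("Example KB A", True, True, "oncology", True, True, "English")
candidate_no_metric = KnowledgeBase("Example KB B", True, True, "genetics", False, True, "English")
```

Country of origin is deliberately absent from the record, since the text states that all countries of origin were eligible.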
Databases and Knowledge Bases
National Comprehensive Cancer Network (NCCN)
The NCCN is a nonprofit alliance of U.S. National Cancer Institute-designated comprehensive cancer centers. NCCN strives to improve the effectiveness and quality of care for patients with cancer and has published clinical practice guidelines applying to more than 97% of cancers affecting individuals in the U.S.10 According to the organization, their guidelines are “intended to assist all individuals who impact decision-making in cancer care including physicians, nurses, pharmacists, payers, patients and their families, and many others.”10 (p.1)
In addition to a Guidelines Steering Committee (which provides oversight and planning), and a Guidelines Panel Chair and Vice Chair (who provide oversight of content development activities), each NCCN guideline has an individual Guidelines Panel including multidisciplinary representation from all of the core medical specialties relevant to the guideline, a primary care physician, and a patient advocate. NCCN notes that “any Panel Member with a meaningful conflict of interest is excluded from participating in Panel presentations, reviews, discussions, and voting relevant to the area of the conflict of interest.”10(p.2)
The development and update of NCCN guidelines is an ongoing process which includes “critical evaluation of evidence, integrated with the clinical expertise and consensus of a multidisciplinary panel of cancer specialists, clinical experts and researchers in those situations where high-level evidence does not exist.”10 (p.5) Recommendations for treatment are based on the level of clinical evidence available as well as consensus among the Guidelines Panel regarding the efficacy and safety of the intervention. Active NCCN guidelines are reviewed and updated at least annually.
NCCN evidence and consensus categories are as follows: Category 1 (high level of evidence with uniform Panel consensus that the intervention is appropriate); Category 2A (lower level of evidence with uniform Panel consensus that the intervention is appropriate); Category 2B (lower level of evidence with at least 50% [but less than 85%] panel consensus); and Category 3 (any level of evidence, but major Panel disagreement regarding whether the intervention is appropriate).10 As discussed by Birkeland and McClure, the majority of recommendations in the NCCN guidelines fall into Category 2A “because high-level evidence is not available for most decisions across the continuum of care.” 11(p.608)
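As an illustration only, the category definitions above can be sketched as a small decision function. Several points are assumptions not stated in the text: "uniform" consensus is read as at least 85% Panel support, "major disagreement" is read as less than 50% support, and the function name and parameters are hypothetical. High-level evidence with 50% to 85% support is not covered by the definitions quoted here, so the sketch returns None for that case rather than guessing.

```python
from typing import Optional

UNIFORM_CONSENSUS = 0.85   # assumption: "uniform" read as >= 85% Panel support
MAJORITY_CONSENSUS = 0.50  # lower bound quoted for Category 2B


def nccn_category(high_level_evidence: bool, panel_support: float) -> Optional[str]:
    """Map evidence level and fraction of Panel support to a category label."""
    if not 0.0 <= panel_support <= 1.0:
        raise ValueError("panel_support must be a fraction between 0 and 1")
    if panel_support >= UNIFORM_CONSENSUS:
        return "1" if high_level_evidence else "2A"
    if panel_support >= MAJORITY_CONSENSUS:
        # The text defines 2B for lower-level evidence only.
        return None if high_level_evidence else "2B"
    # Assumption: support below 50% is treated as major Panel disagreement.
    return "3"
```

For example, `nccn_category(False, 0.9)` yields "2A", the category into which most NCCN recommendations fall.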
Due to rapid development of biomarker and companion diagnostic testing in the field of oncology, the NCCN Biomarkers Compendium was established “to facilitate identification of biomarker tests recommended for use by NCCN guideline panels.” 11(p.609) As discussed by Birkeland and McClure11, the Biomarkers Compendium “focuses on the clinical usefulness of biomarker testing rather than specific tests or test kits,”11(p.611) and therefore includes all tests measuring genes or gene products, regardless of their functional category (predictive, prognostic, diagnostic, screening, monitoring, surveillance). NCCN assigns a category of evidence and consensus to individual alterations (as opposed to the entire gene). Furthermore, NCCN states guideline recommendations (including those relative to the Biomarkers Compendium) are “intended to apply to the vast majority of patients in a particular clinical situation”10(p.6) and are therefore not exhaustive or expected to apply to all patients or all situations.
NCCN is a widely available resource that is frequently utilized by oncologists and other clinicians. Poonacha and Go state that clinical practice guidelines published by NCCN are “the most comprehensive and widely used standard in clinical practice in the world.”12(p.187) In their study, the authors investigated the level of scientific evidence behind NCCN guidelines for the ten most common types of cancer in the U.S. (breast, prostate, lung [both small-cell and non-small cell subtypes], colorectal, melanoma, non-Hodgkin’s lymphoma, kidney, pancreas, urinary bladder, and uterus). Of the ten clinical practice guidelines reviewed, Poonacha and Go12 identified that on average, guidelines contained over 100 intervention recommendations; the NCCN lung cancer guideline included the most recommendations (238) while the kidney cancer guideline had the fewest (45).
Of the ten guidelines reviewed, most intervention recommendations (83%) were from Category 2A, and only 6% were from Category 1.12 Categories of evidence were found to be highly variable based on diagnosis; the authors identified that the guidelines for kidney and breast cancers included the highest proportion of recommendations with Category 1 evidence (20% and 19%, respectively), the remaining six cancer types had between 1% and 6% of recommendations with Category 1 evidence, and neither urinary bladder nor uterine cancer had any recommendations with Category 1 evidence in their respective NCCN guidelines. Poonacha and Go12 also noted that of the ten guidelines reviewed, Category 3 evidence (major panel disagreement regarding whether the intervention is appropriate) was rare.
National Institutes of Health-Funded Clinical Genome Resource (ClinGen)
The NIH-funded ClinGen was designed as an open-access resource to support clinical decision making by aggregating, curating, and defining the clinical relevance and actionability of gene-disease relationships.13 As an open-access resource, ClinGen is publicly available to all clinicians and patients. Their database “provides a structure to enable research and clinical communities to make clear, streamlined, and consistent determinations of clinical actionability based on transparent criteria to guide analysis and reporting of genomic variation.”14 ClinGen is also included in the FDA’s Recognition of Public Human Genetic Variant Databases.15
ClinGen’s consortium of experts includes a Steering Committee (responsible for establishing standards and overseeing all ClinGen processes), a Clinical Domain Working Groups Oversight Committee (responsible for overseeing the development and approval of variant curation), and a Sequence Variant Interpretation (SVI) workgroup (composed of industry experts responsible for providing guidance relevant to variant assessment activities, including education tasks).13 ClinGen’s Variant Curation Expert Panels (VCEPs) are composed of “individuals with scientific expertise regarding gene function, clinical expertise regarding disease manifestations, and biocurators who are trained in evaluating evidence sources that support a variant assertion.”13(p.2) VCEPs follow a standard operating procedure (SOP) during the process of gene curation and assessment; this SOP is publicly available via their website. Among other things, the SOP details the organization’s transparency and public accessibility (all variant assertions and summary evidence are publicly available), as well as conflict of interest disclosures (all conflicts are publicly declared).
ClinGen has taken great measures to ensure staff involved in variant curation and evaluation are adequately trained.13 ClinGen expects their VCEPs to demonstrate the diversity of expertise in the field of genetics (including the major areas of clinical, diagnostic laboratory, and research). While VCEPs include disease/gene experts, they also include biocurators, who are not required to be experts (and are primarily responsible for assembling evidence for expert review). Regardless of their level of expertise, each VCEP member is required to demonstrate competence through completion of extensive training and an evaluation of their proficiency. All individuals are also required to obtain HIPAA and human subjects training (based on their level of access to human subjects’ data). Finally, the SVI workgroup provides organization-wide guidance regarding the evaluation and curation of human variant data.
ClinGen requires that variant curation and preliminary evaluation be conducted by at least two reviewers.13 The requirements for variant evaluation are described in the ClinGen Variant Curation Expert Panel Protocol, publicly available via ClinGen’s website. Part of this process includes evaluating supporting data against rules and criteria developed by the VCEP, and ranking them as either standalone, very strong, strong, moderate, or supporting. These ranks are then used to determine a classification assertion (pathogenic [P], likely pathogenic [LP], benign [B], likely benign [LB], or uncertain significance [VUS]). Final evaluation and decisions about variant assertions are made by consensus of the relevant VCEP. Consensus can be indicated by either unanimous agreement by all members of the VCEP or a majority vote. In order to be published as an approved assertion, variant classifications must have at least a majority vote. If a majority vote cannot be obtained, the variant may be considered an unclassified variant (unclassified variants are reevaluated every two years to determine whether additional evidence has become available to support a classification) or may be classified as a lower-ranking class (for instance, a variant may be considered VUS if a majority vote cannot be obtained for an LP or LB classification). In order to receive final approval and publication, all variant interpretations are reviewed by the full VCEP membership (which includes non-biocurator, clinical, and disease experts). Furthermore, all evidence curated by the ClinGen team is readily accessible via their website.
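The consensus rules described above can be sketched as a short function. This is a hedged illustration, not ClinGen's actual procedure: the function name, the status strings, and the reading of "majority" as strictly more than half of the votes are assumptions. The text also says a variant without a majority "may" be left unclassified or "may" be assigned a lower-ranking class; the sketch picks one deterministic rule (LP or LB falls back to VUS, anything else is left unclassified) purely for the sake of example.

```python
from typing import Optional, Tuple

CLASSES = {"P", "LP", "B", "LB", "VUS"}


def vcep_vote_outcome(proposed: str, votes_for: int,
                      total_votes: int) -> Tuple[str, Optional[str]]:
    """Return (status, published_classification) for one VCEP vote."""
    if proposed not in CLASSES:
        raise ValueError(f"unknown classification: {proposed}")
    if total_votes <= 0 or not 0 <= votes_for <= total_votes:
        raise ValueError("vote counts are inconsistent")
    if votes_for == total_votes:
        return ("approved: unanimous", proposed)
    if 2 * votes_for > total_votes:  # assumption: majority = more than half
        return ("approved: majority", proposed)
    if proposed in {"LP", "LB"}:
        # Assumption: always take the VUS fallback the text gives as an example.
        return ("downgraded: no majority", "VUS")
    # Unclassified variants are reevaluated every two years per the text.
    return ("unclassified: reevaluate in two years", None)
```

For instance, a proposed LB classification supported by 4 of 8 members does not reach a majority and, under the rule assumed here, would be recorded as VUS.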
The framework established by ClinGen attempts to define and evaluate the clinical validity of gene-disease relationships by evaluating the evidence supporting or contradicting them.16 This standardized framework was developed because there is substantial variability in the level of evidence supporting claims of gene-disease relationships. As noted by Strande et al, “This framework aims to provide a systematic, transparent method to evaluate a gene-disease relationship in an efficient and consistent manner suitable for a diverse set of users.” 16(p.905)
ClinGen’s database validates gene-disease relationships by evaluating both quantity and quality of evidence.16 Gene-disease relationships are then identified under one of the following levels, listed from strongest to weakest, with each stronger level building upon the one below it: Definitive (requires that the relationship has been repeatedly demonstrated in research and clinical diagnostic settings, as well as upheld over time), Strong (requires that the relationship has been featured in two or more independent studies with multiple unrelated probands with pathogenic variants, as well as several types of supporting experimental data), Moderate (requires that the relationship has been featured in at least one independent study with several unrelated probands with pathogenic variants, as well as having some supporting experimental data), Limited (requires that the relationship has been featured in at least one independent study with fewer than three unrelated probands with pathogenic variants, or multiple unrelated probands without pathogenicity), and No Known Disease Relationship (where no pathogenic variants have been identified to date, therefore no evidence supports a causal role).
There are additionally two levels of evidence reserved for when conflicting evidence has been reported – Disputed (which suggests that disputing evidence has been discovered but does not necessarily outweigh existing evidence in support of the gene-disease association) and Refuted (which suggests that disputing evidence has been discovered, and significantly outweighs existing evidence in support of the gene-disease association). The refuted status is applied at the discretion of clinical experts, after analysis of all available evidence. Experimental evidence is scored based on a separate framework.
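Because the supportive classifications form an ordered scale, they lend themselves to an ordered enumeration. The sketch below is illustrative only: the numeric values, the names, and the helper functions are assumptions introduced here, and ClinGen does not publish the levels in this form. Disputed and Refuted are kept separate because they describe conflicting evidence rather than a position on the supportive scale.

```python
from enum import IntEnum


class GeneDiseaseValidity(IntEnum):
    """Supportive classifications, ordered weakest to strongest (values assumed)."""
    NO_KNOWN_DISEASE_RELATIONSHIP = 0
    LIMITED = 1
    MODERATE = 2
    STRONG = 3
    DEFINITIVE = 4


# Statuses reserved for conflicting evidence; not part of the ordered scale.
CONFLICT_STATUSES = frozenset({"Disputed", "Refuted"})


def meets_threshold(level: GeneDiseaseValidity,
                    minimum: GeneDiseaseValidity) -> bool:
    """True if a classification is at least as strong as a chosen minimum."""
    return level >= minimum


def classification_gap(a: GeneDiseaseValidity, b: GeneDiseaseValidity) -> int:
    """Number of categories separating two classifications, e.g. two reviewers'."""
    return abs(int(a) - int(b))
```

With this ordering, a Moderate and a Limited classification of the same gene-disease pair differ by a single category.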
The evidence supporting clinical actionability for genetic disorders varies significantly. Therefore, ClinGen developed and implemented a standardized, evidence-based method to determine actionability of genomic testing. Hunter et al explain that the assessment of clinical actionability is part of the effort to create a central resource of information for the clinical relevance of genomic variation.14 As discussed by Strande et al, the ultimate goal of the ClinGen database is to “enhance the incorporation of genomic information into clinical care.”16(p.905) To that end, ClinGen has also created a semi-quantitative scoring metric to assess actionability for clinical decision making. As discussed by Berg et al, clinical actionability “is a continuum, not a binary state.”17(p.467-468) The ClinGen semi-quantitative scoring metric is used to score interventions, not genes; ClinGen assigns a level of evidence to individual alterations (rather than the entire gene). The scoring metric assesses four categories: disease severity, likelihood of disease, effectiveness of the intervention, and nature of the intervention. The scoring metric also assesses the level of available evidence for two of these categories: likelihood of disease and intervention effectiveness.
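To make the structure of the scoring metric concrete, the following is a minimal sketch of a record holding the four category scores and the two evidence ratings. The 0 to 3 range per category, the letter tiers A through E for evidence, the summary format, and all names are assumptions made for illustration; this review does not state the numeric scales.

```python
from dataclasses import dataclass

DOMAIN_RANGE = range(0, 4)                   # assumption: each category scored 0-3
EVIDENCE_TIERS = ("A", "B", "C", "D", "E")   # assumption: letter tiers for evidence


@dataclass(frozen=True)
class ActionabilityScore:
    """Hypothetical record for one scored intervention."""
    severity: int
    likelihood: int
    effectiveness: int
    nature_of_intervention: int
    likelihood_evidence: str      # evidence rating for likelihood of disease
    effectiveness_evidence: str   # evidence rating for intervention effectiveness

    def __post_init__(self) -> None:
        for name in ("severity", "likelihood", "effectiveness",
                     "nature_of_intervention"):
            if getattr(self, name) not in DOMAIN_RANGE:
                raise ValueError(f"{name} must be between 0 and 3")
        for name in ("likelihood_evidence", "effectiveness_evidence"):
            if getattr(self, name) not in EVIDENCE_TIERS:
                raise ValueError(f"{name} must be one of {EVIDENCE_TIERS}")

    def total(self) -> int:
        """Sum of the four category scores."""
        return (self.severity + self.likelihood
                + self.effectiveness + self.nature_of_intervention)

    def summary(self) -> str:
        """Compact string: category scores with evidence letters attached."""
        return (f"{self.severity}-{self.likelihood}{self.likelihood_evidence}-"
                f"{self.effectiveness}{self.effectiveness_evidence}-"
                f"{self.nature_of_intervention} (total {self.total()})")


example = ActionabilityScore(2, 3, 3, 2, "C", "B")  # made-up scores
```

Evidence letters are attached only to likelihood and effectiveness, mirroring the statement that level of evidence is assessed for those two categories.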
Using the ClinGen framework, Strande et al evaluated a number of gene-disease pairs and examined reproducibility of the scoring metric by having two independent clinical domain experts evaluate each gene-disease relationship.16 Clinical domain experts agreed with the preliminary classifications for 87.1% of ClinGen’s gene-disease relationship curations with published evidence. Discrepancies between expert and curator classification were discussed and explained; additionally, it was noted that when the expert and curator classifications differed, they did so by only a single category (moderate versus limited). The authors concluded that ClinGen’s evidence-based method for evaluating gene-disease associations “will provide a strong foundation for genomic medicine.”16(p.902)
As concluded by Hunter et al, “The ClinGen framework for actionability assessment will assist research and clinical communities in making clear, efficient, and consistent determinations of actionability based on transparent criteria to guide analysis and reporting of findings from clinical genome-scale sequencing.”14(p.10)
Memorial Sloan Kettering Cancer Center Oncology Knowledge Base (OncoKB)
OncoKB was established as a comprehensive precision oncology tool to deliver evidence-based information about tumor mutations and alterations and distill NCCN guidelines, expert recommendations, and scientific literature, in order to support treatment decisions.8 OncoKB provides a resource which is available to all clinicians and patients. The database is publicly available through their website, organized by gene, alteration, tumor type, and clinical implication, and is searchable by any of the above. OncoKB has received FDA recognition for a portion of the database and is also included in the FDA’s Recognition of Public Human Genetic Variant Databases.15
OncoKB’s staff is made up of highly qualified scientists, physicians, and engineers, each meeting specific qualification criteria including educational background, professional training, and skills.18 Individuals with Lead Scientist, Clinical Genomics Annotation Committee (CGAC), or Scientific Content Management Team (SCMT) roles are required to be physicians or Ph.D.-level scientists who are considered experts in their field and disease specialty. These individuals’ responsibilities include “coordinating and monitoring training and proficiency of curators in procuring the appropriate data, assessing the data in the context of variant interpretation, and entering the data with sufficient detail into the OncoKB curation platform.”18(p.5) Curators, who are responsible for assessing and curating gene alterations, their biological effects, and associated treatment implications, can be pre-doctoral graduate students, postdoctoral fellows, or clinical fellows. Curators receive extensive in-person training in variant classification, including mapping variants to FDA levels. All OncoKB staff are also evaluated for potential conflicts of interest, with financial conflicts being publicly disclosed on the OncoKB website. Any CGAC member with a conflict of interest relevant to a specific Level of Evidence assignment is not permitted to work on the assignment.
CGAC reviews and approves all OncoKB/FDA level associations prior to internal review.18(p.3-5) Additionally, data curated by OncoKB staff does not become publicly available until it has undergone an internal, independent review by a different OncoKB staff member. Specific protocols exist to manage conflicting data or conflicting assertions regarding alterations, including an independent review of curated data, as well as evaluation and discussion of decisions until a consensus has been reached.18(p.20-21) In instances where a consensus is reached, the alteration is accepted into the knowledge base with a notation that there was majority but not uniform consensus; in instances where consensus cannot be reached, the alteration is not assigned a level of evidence within the knowledge base.
As discussed by Chakravarty et al8, OncoKB contains a classification system for clinical utility and potentially actionable alterations. “Potentially actionable alterations in a specific cancer type are assigned to one of four levels that are based on the strength of evidence that the mutation is a predictive biomarker of drug sensitivity to FDA-approved or investigational agents for a specific indication.”8(p.2) OncoKB delineates separate levels of evidence for therapeutic, diagnostic, and prognostic use cases. OncoKB assigns a level of evidence to individual alterations (as opposed to the entire gene).
The OncoKB therapeutic levels of evidence are as follows: Level 1 gene alterations have been recognized by the FDA as “predictive of response to an FDA-approved drug in a particular disease context.”8(p.2) Level 2 gene alterations are considered “standard care” predictive biomarkers. They are not FDA-recognized but are recommended by professional guidelines (including NCCN) and predict response to FDA-approved therapy in a particular disease context. Level 3A and 3B are considered investigational; 3A requires compelling clinical evidence to support the biomarker as predictive of response to a drug in a particular disease context and only applies to investigational biomarkers for which there has been clinical activity (such as a clinical or preclinical trial). 3B could be either a standard care or investigational biomarker predictive of response to an FDA-approved or investigational drug in another indication. Level 4 is considered hypothetical and requires compelling biological evidence to support the biomarker as predictive of response to a drug.
Additionally, there are two therapeutic levels of evidence for treatment resistance; Level R1 is for standard care biomarkers predictive of resistance to an FDA-approved drug in a particular disease context, while Level R2 requires compelling clinical evidence to support the biomarker as being predictive of resistance to a drug.
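The therapeutic levels described above can be held in a simple lookup table. The sketch below paraphrases the descriptions given in this review; the table layout, the function names, and the "sensitivity" versus "resistance" labels are illustrative assumptions and do not reflect OncoKB's own data model or API.

```python
# Level -> (direction, short description paraphrased from the text above).
THERAPEUTIC_LEVELS = {
    "1": ("sensitivity", "FDA-recognized biomarker for an FDA-approved drug"),
    "2": ("sensitivity", "standard care biomarker recommended by professional guidelines"),
    "3A": ("sensitivity", "investigational: compelling clinical evidence in this disease context"),
    "3B": ("sensitivity", "investigational: biomarker predictive of response in another indication"),
    "4": ("sensitivity", "hypothetical: compelling biological evidence"),
    "R1": ("resistance", "standard care biomarker for resistance to an FDA-approved drug"),
    "R2": ("resistance", "compelling clinical evidence of resistance to a drug"),
}


def _lookup(level: str):
    """Normalize a level label such as ' 3a ' and fetch its table entry."""
    key = level.strip().upper()
    if key not in THERAPEUTIC_LEVELS:
        raise ValueError(f"unknown therapeutic level: {level!r}")
    return THERAPEUTIC_LEVELS[key]


def describe_level(level: str) -> str:
    """Return the short description for a therapeutic level."""
    return _lookup(level)[1]


def predicts_resistance(level: str) -> bool:
    """True for the resistance levels (R1, R2), False for the others."""
    return _lookup(level)[0] == "resistance"
```

A table of this kind keeps the sensitivity levels (1 through 4) and the resistance levels (R1, R2) distinct, which matters because they carry opposite treatment implications.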
OncoKB also offers scoring of evidence for both diagnostic and prognostic use cases. For diagnostic indications, Level Dx1 biomarkers have been recognized by the FDA or professional guidelines as a requirement for diagnosis in a particular disease context. Level Dx2 biomarkers have been recognized by the FDA or professional guidelines as supportive of diagnosis in a particular disease context. Biomarkers in Level Dx3 may assist disease diagnosis based upon clinical evidence. Similarly, for prognostic indications, Level Px1 biomarkers have been recognized by the FDA or professional guidelines as prognostic for a particular disease context based on at least one well-powered study. Level Px2 biomarkers have been recognized by the FDA or professional guidelines as prognostic for a particular disease context based on at least one small study. Biomarkers in Level Px3 are considered prognostic for a particular disease context based on clinical evidence from well-powered studies.
As a portion of the OncoKB database has been recognized by the FDA, the therapeutic levels of evidence indicated above can be mapped to one of three FDA Levels of Evidence within the database.19 FDA Level 1 requires companion diagnostic (CDx) tests, which “are supported by analytical validity of the test for each specific biomarker and a clinical study establishing either the link between the result of that test and patient outcomes or clinical concordance to a previously approved CDx.”19 Level 1 is the highest level of recognition by the FDA; however, OncoKB does not include any companion diagnostic claims, and therefore no genes or variants are currently considered Level 1. FDA Level 2 is designated for mutations with evidence of clinical significance, which allows providers to utilize information about their patients’ health alongside clinical evidence presented in professional guidelines. “Such claims are supported by a demonstration of analytical validity (either on the mutation itself or via a representative approach, when appropriate) and clinical validity (typically based on publicly available clinical evidence, such as professional guidelines and/or peer-reviewed publications).”19 FDA Level 3 is reserved for mutations with potential clinical significance, but not identified as a higher level. “Such claims are supported by analytical validation, principally through a representative approach, when appropriate, and clinical or mechanistic rationale for inclusion in the panel” (to include peer-reviewed publications or in vitro pre-clinical models).19 OncoKB has a validation protocol in place to assess the consistency of variant classification to FDA levels of evidence; consistency in mapping OncoKB levels of evidence to FDA levels ranges from 85.7% to 100%.18(p.20)