Variant classification discrepancies – where clinical laboratories and clinicians/genetic counselors disagree about the clinical meaningfulness of a variant – can impede the accurate diagnosis and subsequent treatment of patients.

As genetic testing is now a regular component of clinical practice, ensuring accuracy and consistency in variant classifications is crucial. In 2015, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) recognized this need and released a set of guidelines for variant interpretation and classification.[1] Although these guidelines have since become the gold standard, discrepancies in variant classifications are still an issue today.

Discrepancies can occur between clinicians/genetic counselors and clinical laboratories due to differing perspectives. In a 2019 survey of genetic counselors, 67% reported that they had reached a different classification than a clinical laboratory for up to 25% of the variants they assessed.[2] Similarly, a 2021 survey of genetic counselors, medical geneticists, and other clinicians found that classifications differed between these professionals and clinical laboratories 13.8% of the time.[3] These figures suggest that a substantial number of patients may be under- or overdiagnosed, which poses a challenge for clinicians using genetic data to inform treatment.

Classifications have also been found to differ between clinical laboratories themselves. As of January 2022, 9% of the variants in ClinVar had conflicting classifications, 13% of which could be clinically impactful.[4] Similarly, a 2020 study reported that laboratories disagreed 46% of the time, with 37% of those discrepancies being clinically impactful.[5]


Variant classification discrepancies are encountered by most clinical genetics professionals but are correctable through collaboration and evidence-sharing between all groups involved in variant interpretation.[5-7]

The same study that reported 46% disagreement found that this number decreased to only 16% when laboratories collaborated and reevaluated their use of the ACMG/AMP guidelines.[5] Many groups – most notably the Clinical Genome Resource (ClinGen) – are now aiming to harness this power of collaboration by coordinating communication between clinicians, genetic counselors, and testing laboratories on the details of applying ACMG/AMP to a specific disease.[8] This process results in a valuable consensus statement; however, it is time-consuming and can be completed for only a few diseases at a time. A more general process to recognize and resolve discrepancies is therefore needed.

The following is a guide to variant classification discrepancies – from recognition to resolution.


Forms of Classification Discrepancies

One of the most important questions to ask when assessing a classification discrepancy is whether the discrepancy has the potential to impact clinical care. This hinges on the clinical value of each type of classification – Pathogenic, Likely pathogenic, Variant of Uncertain Significance (VUS), Likely benign, or Benign.

Variants that have been classified as Pathogenic or Likely pathogenic are sufficient to determine a patient’s diagnosis and subsequent treatment, as the evidence suggests that there is a >90% chance that these variants are disease-causing.[1] On the other hand, variants that are classified as a VUS or as Benign/Likely benign should not impact clinical care, as they lack sufficient evidence of pathogenicity. Thus, for a discrepancy to impact clinical care, the variant must have been classified as Pathogenic/Likely pathogenic (which would be sufficient for diagnosis) by some while classified as a VUS/Benign/Likely benign (which would not be sufficient for diagnosis) by others. To summarize:

Minor classification differences, which are unlikely to impact clinical care:

  1. Pathogenic vs. Likely pathogenic
  2. Benign vs. Likely benign
  3. VUS vs. Benign/Likely benign

Major classification differences, which would impact clinical care:

  1. Pathogenic/Likely pathogenic vs. VUS
  2. Pathogenic/Likely pathogenic vs. Benign/Likely benign
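The minor/major distinction above can be expressed as a small check on whether two classifications fall on opposite sides of the "sufficient for diagnosis" boundary. The following is a minimal sketch, not part of any published tool; the function name `discrepancy_level` and the exact label strings are assumptions made for illustration.

```python
# Hypothetical helper: the tier labels follow this article, not a standard library.
ACTIONABLE = {"Pathogenic", "Likely pathogenic"}
NON_ACTIONABLE = {"VUS", "Likely benign", "Benign"}


def discrepancy_level(a: str, b: str) -> str:
    """Return 'none', 'minor', or 'major' for two classifications of one variant."""
    for label in (a, b):
        if label not in ACTIONABLE | NON_ACTIONABLE:
            raise ValueError(f"Unknown classification: {label!r}")
    if a == b:
        return "none"
    # A difference is major only when it crosses the diagnostic boundary,
    # i.e. one side is Pathogenic/Likely pathogenic and the other is not.
    if (a in ACTIONABLE) != (b in ACTIONABLE):
        return "major"
    return "minor"


print(discrepancy_level("Pathogenic", "Likely pathogenic"))  # minor
print(discrepancy_level("Likely pathogenic", "VUS"))         # major
```

A check like this is useful mainly for triage: it lets a group sort a long list of disagreements so that the ones with potential clinical impact are reviewed first.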

Factors That Cause Classification Discrepancies

Classification discrepancies can arise as a result of:[3, 5-7]

  1. Differences in classification methods/modifications of classification methods
  2. Differences in application of a classification method
  3. Differences in evidence
  4. Differences in interpreter opinions
  5. Human error

Differences in Classification Methods

The ACMG/AMP guidelines are considered the gold standard for clinical practice, but groups may use variable, modified versions of these guidelines or even other guidelines entirely.

A commonly used modification of the ACMG/AMP guidelines is Sherloc, a set of criteria that introduces further refinements and quantification to certain aspects of the guidelines.[9] In addition, ClinGen has released a number of modified guidelines specific to certain disease areas/genes, as well as recommendations for specific evidence categories within the guidelines.[8] These modifications, if not adopted by all groups, can easily lead to classification discrepancies through changes in when evidence is applied, how much weight that evidence carries, or whether the evidence is assessed qualitatively or quantitatively.

On the other hand, some groups may use entirely different interpretation methods, such as the Human Gene Mutation Database (HGMD), which uses its own proprietary method.[10] Using different interpretation methods can result in classifications that are incompatible in both terminology and structure. For example, HGMD uses the term “disease-causing mutation” where the closest ACMG/AMP term would be “pathogenic”; however, these classifications require differing amounts and types of evidence, meaning they can’t be accurately compared.

To resolve classification discrepancies, all groups must start with the same interpretation method and the same data sources and versions, and must further agree on any modifications that will be applied, especially disease- or gene-specific modifications.
*Since most groups use the ACMG/AMP guidelines, we will specifically consider their application in the following sections.


Differences in Application of Classification Methods

Structurally, the ACMG/AMP guidelines consist of 16 pathogenic evidence categories and 12 benign evidence categories with five different evidence strengths – stand-alone, very strong, strong, moderate, and supporting. These categories and their assigned evidence strengths are assessed according to a classification schema that weighs the burden of evidence to suggest a variant is pathogenic or benign. However, it is not always clear when an individual category should be applied or at what evidence strength. As a result,

There are two major challenges with application of the ACMG/AMP guidelines: ambiguity in when to apply a category and mutability of the evidence strength.

These challenges resulted from the need for the ACMG/AMP guidelines to be widely applicable; flexibility was necessary to account for variability between diseases and the genes that cause them. Each disease and its associated gene(s) require careful consideration of factors such as the inheritance pattern of the disease (including whether it is monogenic, oligogenic, or polygenic), penetrance of the disease and age at onset, types of variants causative for the disease, and the disease mechanism. These considerations are not always sufficiently addressed within the guidelines, making their application especially challenging. In addition, such considerations can impact certain categories within ACMG/AMP more than others.

Functional and population-based evidence, and how/when to apply it, is the most frequent source of persistent classification discrepancies.[7]

In the ACMG/AMP guidelines, functional data is recorded under the PS3 (denoting a damaging effect on the gene/protein) and BS3 (denoting the lack of any damaging effect on the gene/protein) categories and is derived from empirical studies in the scientific literature. Depending on how well the disease mechanism is understood, as well as how complex it is, these categories can be particularly difficult to apply. Some groups may disagree on what functional data should be considered for application of PS3/BS3 as well as on what model systems or specific assays (especially pertaining to in vitro vs. in vivo studies) to accept. Further, some functional data may be accepted but downgraded to a lower evidence strength due to uncertainty about its relationship with disease causation.

Additional categories that assess the disease mechanism, such as PM1 (denoting a “hotspot” or area of frequent/consequential mutation within the gene) or PP2 (denoting a missense variant in a gene with a low rate of benign missense and where missense is a common mechanism of disease) can suffer from the same ambiguities.

Similarly, population data can be difficult to apply due to a lack of concrete thresholds for assessing population frequencies or case/cohort studies. For example, the definition for the BS1 category is “Allele frequency is greater than expected for disorder”; however, the interpretation of “greater than expected” is subjective and often differs between groups. This is especially true for rare diseases, where information about disease prevalence and incidence is lacking.

In addition, the PS4 category assesses whether the prevalence of the variant in affected individuals is significantly increased compared to the prevalence in controls. The guidelines define “significantly increased” as an odds ratio (OR) of >5.0 in a case-control study in the medical literature; however, the category may also be applied when “multiple” unrelated cases are found with the same phenotype, albeit with a moderate, rather than strong, evidence strength. Notably, this category does not allow consideration of singular cases, even if that is the only clinical evidence available, which is often true for rare diseases. As a result, some groups may set different thresholds for “multiple unrelated cases” and/or allow inclusion of singular cases as supporting evidence.
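To make the case-control threshold concrete, the sketch below computes an odds ratio and an approximate (Wald) 95% confidence interval from a 2x2 table of carrier counts. The function names and the example counts are invented for illustration. The check that the interval excludes 1.0 reflects the wording of the 2015 guidelines; the choice of a Wald interval is an assumption here, and a group may reasonably prefer an exact method for small counts.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Return (OR, lower, upper) with a Wald 95% confidence interval."""
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        # Zero cells need a continuity correction or an exact method,
        # which this simple sketch does not attempt.
        raise ValueError("All four cells must be positive for this estimate")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper


def meets_ps4_threshold(estimate, lower, threshold=5.0):
    """True when the OR exceeds the threshold and the interval excludes 1.0."""
    return estimate > threshold and lower > 1.0


# Invented counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio(20, 980, 5, 1995)
print(f"OR={or_value:.2f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
print(meets_ps4_threshold(or_value, ci_low))
```

Writing the threshold down as an explicit parameter is the point of the exercise: two groups that disagree on PS4 can compare the counts and the cutoff they each used rather than only the final category.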

These categories are common points of contention, but the challenges do not end there.

Nearly every category in the ACMG/AMP guidelines is subject to disagreements on its application. However, these disagreements all arise from a singular issue – subjectivity – that can be addressed through structured collaboration.


Differences in Evidence

Many sources of evidence are required for application of ACMG/AMP – including the medical literature/clinical data, population frequency databases, and computational prediction algorithms – and there are a variety of methods used to acquire it.

Differences in the source or acquisition method of any form of evidence can easily result in classification discrepancies.

One of the largest sources of such discrepancies stems from inconsistent access to the medical literature. Aggregating this evidence manually is exceedingly difficult due to both the scale of the literature and the need for search engines to recognize the myriad ways in which a disease, gene, or variant may be referred to. This results in a time-consuming process that is also error-prone, and overlooking even a single study could change the classification of a variant.

In addition, some groups may have access to clinical data, either from their own sequencing efforts or from additional databases, that others do not. This can similarly change the classification, especially if they found multiple patients with the variant and/or had evidence of segregation with the disease.

Furthermore, groups may use different population frequency databases, which can affect the application of population-based categories even when all groups are using the same thresholds. The computational prediction algorithms used may also differ, which can result in discrepant predictions and, in turn, discrepant application of the corresponding category. Nevertheless, these issues are easily mitigated through collaboration.

Sharing of evidence between groups facilitates resolution of 33% of classification discrepancies.[7]

Evidence sharing ensures that all groups have a consistent foundation for their classifications, allowing them to focus on resolving any remaining disagreements on application.


Differences in Interpreter Opinions

ACMG/AMP is intended to assess the likelihood that a variant is disease-causing based on the amount and types of evidence to suggest a causative role. For rare diseases especially, there may be very little evidence available, resulting in a large number of variants being classified as a VUS. On the other hand, evidence for some diseases (such as those that are polygenic) can be more difficult to interpret appropriately within the guidelines, resulting in overclassification of variants. As a result, some may override the final classification produced through ACMG/AMP.

In clinical practice, this often occurs when a clinical laboratory classifies a variant as a VUS, but a genetic counselor/clinician overrides this decision.[2-3]  A VUS may be considered sufficient for diagnosis, despite its classification, if the patient’s clinical presentation is specific to the associated disease and there is no evidence to suggest the variant is benign. As such,

Discrepancies may arise between clinical laboratories and clinicians/genetic counselors, not because the classification was incorrect, but because clinical evidence that was not considered in the interpretation process was deemed sufficient to override the classification.

In addition, discrepancies may persist as a result of differences in opinion on how to apply the ACMG/AMP guidelines, particularly for variants that represent challenging edge cases where guidance on how to apply the guidelines is especially lacking.


Human Error

Application of ACMG/AMP is often largely, if not entirely, a manual process, which introduces the potential for human error.[5-6]

These errors can involve typos/nomenclature errors, accidental misapplication of a category, incorrect calculation of the classification, or interpreting the variant for the wrong disease. However,

Human error can be mitigated by collaboration as well as the introduction of automation.[6]

When groups share and compare their classifications, human errors can be quickly recognized and corrected. In addition, automating certain aspects of the interpretation process, such as calculating the classification from the applied categories, can assist in preventing errors from occurring.[6]
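As an illustration of the kind of step that can be automated, the sketch below derives a classification from a set of applied categories using the combining rules of the 2015 ACMG/AMP guidelines. It is a simplified sketch under stated assumptions: the function name `classify` and the input format (a mapping from category code to applied strength) are hypothetical, the direction of evidence is inferred from the first letter of the code, and evidence is treated as contradictory only when a complete pathogenic-side rule and a complete benign-side rule are both satisfied. It does not cover Sherloc, ClinGen specifications, or other modified schemes.

```python
from collections import Counter

STRENGTHS = {"stand-alone", "very strong", "strong", "moderate", "supporting"}


def classify(applied):
    """applied maps a category code (e.g. 'PS3') to the strength it was applied at."""
    unknown = set(applied.values()) - STRENGTHS
    if unknown:
        raise ValueError(f"Unknown evidence strength(s): {sorted(unknown)}")
    # Codes starting with 'P' are pathogenic evidence, 'B' benign evidence.
    path = Counter(s for code, s in applied.items() if code.startswith("P"))
    ben = Counter(s for code, s in applied.items() if code.startswith("B"))
    pvs, ps = path["very strong"], path["strong"]
    pm, pp = path["moderate"], path["supporting"]
    ba, bs, bp = ben["stand-alone"], ben["strong"], ben["supporting"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps == 1 and (pm >= 3 or (pm == 2 and pp >= 2) or (pm == 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm == 2 and pp >= 2)
        or (pm == 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Simplifying assumption: contradictory evidence falls back to VUS.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "VUS"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "VUS"


print(classify({"PVS1": "very strong", "PS3": "strong"}))  # Pathogenic
print(classify({"PM2": "moderate", "PP3": "supporting"}))  # VUS
```

Automating only this final tally leaves the scientific judgment (which categories apply, and at what strength) with the interpreter, while removing one place where a manual miscount could change the result.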


How To Prevent Classification Discrepancies

There are a number of steps that can be taken to reduce the chance that discrepancies will occur:

  1. Thoroughly research the disease and its causative gene(s)
  2. Consult with disease experts
  3. Check for disease-specific consensus statements for variant interpretation
  4. Institute quality assurance procedures
  5. Share classifications and corresponding evidence with the wider community

Researching the disease and its causative gene(s) as well as consulting with disease experts ensures that you are acutely aware of any idiosyncrasies that must be considered in the interpretation process, such as penetrance of the disease, the inheritance pattern, and the disease mechanism. In addition, disease-specific consensus statements for variant interpretation, when available, can provide a more concrete guide; ClinGen is the primary source for these statements.[8]  These steps can increase consistency through mutual understanding of the disease and what must be considered when interpreting variants.

Quality assurance procedures focused on assessing variants with conflicting or unclear evidence can also assist in uncovering human error as well as variants that may need manual adjustments. This can increase consistency by allowing for quick, prioritized corrections.

Finally, sharing classifications with the wider community of clinical genetics professionals can facilitate collaboration, subsequently increasing consistency. ClinVar is the most commonly used repository for variants found during clinical sequencing, along with their classifications.[4] However, for collaboration within ClinVar to be effective, the evidence used to produce classifications must be submitted, which is not the case for all variants in the database. It is also important to note that ClinVar is not intended to catalog variants found only in the literature; classifications of these variants are shared in other repositories such as locus-specific databases (LSDBs),[11] VarSome,[12] and Mastermind.[13]


How To Efficiently Address Classification Discrepancies

Classification discrepancies are commonly encountered in clinical genetics and highlight uncertainty about the clinical meaningfulness of a variant. However, with full transparency, these discrepancies can be addressed, and many resolved.

To address classification discrepancies, all groups must:

  1. Use the same classification method
  2. Provide transparent documentation of methodology
  3. Share internal evidence

These requirements ensure that all groups are starting with the same structure and the materials necessary to pinpoint where and why classification discrepancies have occurred. If these requirements are not met, discrepancies may not be resolved as a result of incompatible methodology or insufficient information to perform a true comparison.

The following is a simple and practical procedure to assess classification discrepancies through structured collaboration.

Four simple steps for assessing classification discrepancies:

  1. Scope the problem – determine how many discrepancies there are and how impactful they are to clinical practice
  2. Determine the causes – assess whether discrepancies are being caused by differing methodology and/or differing evidence sources
  3. Discuss the issues – discuss discrepancies and their causes
  4. Take action – decide on a single methodology, combine evidence sources, and adjust classifications

This process ensures that all groups focus on the most impactful discrepancies and guides the discussion in a way that encourages efficient action. The overall goal is for all groups to reach consensus on the most appropriate methodology, to combine evidence, and ultimately, to resolve discrepancies quickly and easily.
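For the "determine the causes" step, one practical starting point is a side-by-side comparison of the ACMG/AMP categories each group applied to the same variant. The sketch below assumes, hypothetically, that each group can export its applied categories as a mapping from category code to evidence strength; the function name `compare_criteria` and the example data are invented for illustration.

```python
def compare_criteria(group_a, group_b):
    """Summarize where two groups' applied ACMG/AMP categories differ.

    Each argument maps a category code (e.g. 'PS3') to the strength applied.
    """
    codes_a, codes_b = set(group_a), set(group_b)
    return {
        # Categories applied by only one group often point to differing evidence.
        "only_in_a": sorted(codes_a - codes_b),
        "only_in_b": sorted(codes_b - codes_a),
        # Shared categories at different strengths often point to differing methodology.
        "strength_differs": {
            code: (group_a[code], group_b[code])
            for code in sorted(codes_a & codes_b)
            if group_a[code] != group_b[code]
        },
    }


# Invented example: one variant as assessed by two groups.
laboratory = {"PS3": "strong", "PM2": "moderate", "PP3": "supporting"}
clinic = {"PS3": "supporting", "PM2": "moderate", "PS4": "moderate"}

report = compare_criteria(laboratory, clinic)
print(report["only_in_a"])         # ['PP3']
print(report["only_in_b"])         # ['PS4']
print(report["strength_differs"])  # {'PS3': ('strong', 'supporting')}
```

The output gives the discussion in step 3 a concrete agenda: each listed category is either a piece of evidence one group has not seen or a point of methodology on which the groups need to agree.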


Variant Classifications in Mastermind

All classified variants in Mastermind Variant Landscapes (for use by pharma) or in the Mastermind Genomic Search Engine (for use by clinical laboratories/clinical professionals) have been interpreted according to the ACMG/AMP guidelines. All evidence used to produce classifications is manually annotated by our variant scientists and is visible to all users. Importantly, these classifications are considered preliminary; they are intended to be reviewed by the user prior to clinical use and are subject to change as more information is published.

For more detail regarding our application of ACMG/AMP, read our documentation.

As discussed in previous sections, classification discrepancies are commonly encountered and are caused by a multitude of factors. However, we take multiple steps to prevent classification discrepancies from occurring, including research of genes/diseases, consultation of disease-specific consensus statements, review of and comparison with classifications in other databases, automation of components that don’t require scientific review, and use of thorough quality assurance procedures. In addition, we provide detailed documentation of our methods in the event that discrepancies do occur.

What should you do if your classification differs from Genomenon’s?
If your classification, or a classification in another database such as ClinVar, differs from ours, the process discussed above can be used to assess the discrepancy. We always provide access to the evidence used to produce a classification, which along with our documentation, can be used to pinpoint areas where additional evidence was introduced and/or disagreement occurred.

If additional information is needed, we encourage you to email our team here.


Summary

Variant classification discrepancies are encountered by most clinical genetics professionals and can be the result of differences in classification methods, application of classification methods, evidence, or interpreter opinions, as well as human error. However, the majority of these discrepancies are eminently correctable when all groups involved in interpretation of variants for a particular gene/disease collaborate and share internal evidence.

While discrepancies are a source of concern due to their potential to impact clinical care, taking the necessary steps to reduce the chance that discrepancies occur, and following a structured process for assessing them when they do, can allay these concerns and help ensure patients receive the most appropriate care. In recognition of this, Genomenon has taken a number of preventative actions, and all variant classifications in Mastermind are supported by clear evidence that can be independently reviewed by all users.

As collaboration increases across the clinical genetics community and the provision of evidence for every known variant becomes more complete and consistent, the number of classification discrepancies is expected to decrease. However, the issue may never be fully resolved due to inherent subjectivity in the process of variant interpretation. As a result, a process for assessing and resolving classification discrepancies is necessary.

For more information, contact us HERE to talk to an expert from the team.


References
[1] Richards, Sue et al. “Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.” Genetics in medicine vol. 17,5 (2015): 405-24. doi:10.1038/gim.2015.30
[2] Wain, Karen E et al. “Variant interpretation is a component of clinical practice among genetic counselors in multiple specialties.” Genetics in medicine vol. 22,4 (2020): 785-792. doi:10.1038/s41436-019-0705-9
[3] Berrios, Courtney et al. “Challenges in genetic testing: clinician variant interpretation processes and the impact on clinical care.” Genetics in medicine vol. 23,12 (2021): 2289-2299. doi:10.1038/s41436-021-01267-x
[4] ClinVar
[5] Amendola, Laura M et al. “Variant Classification Concordance using the ACMG-AMP Variant Interpretation Guidelines across Nine Genomic Implementation Research Studies.” American journal of human genetics vol. 107,5 (2020): 932-941. doi:10.1016/j.ajhg.2020.09.011
[6] Amendola, Laura M et al. “Performance of ACMG-AMP Variant-Interpretation Guidelines among Nine Laboratories in the Clinical Sequencing Exploratory Research Consortium.” American journal of human genetics vol. 98,6 (2016): 1067-1076. doi:10.1016/j.ajhg.2016.03.024
[7] Harrison, Steven M et al. “Clinical laboratories collaborate to resolve differences in variant interpretations submitted to ClinVar.” Genetics in medicine vol. 19,10 (2017): 1096-1104. doi:10.1038/gim.2017.14
[8] ClinGen
[9] Nykamp, Keith et al. “Sherloc: a comprehensive refinement of the ACMG-AMP variant classification criteria.” Genetics in medicine vol. 19,10 (2017): 1105-1117. doi:10.1038/gim.2017.37
[10] HGMD
[11] LOVD
[12] VarSome
[13] Mastermind