At the 2019 American Society of Human Genetics (ASHG) Annual Meeting, Dr. Daniel Bellissimo of the University of Pittsburgh Medical Center Clinical Genomics Laboratory presented at the Genomenon CoLab Session:

Mastermind: Advanced Literature Searching to Aid Variant Classification

About the Speaker:

Dan Bellissimo received his BS in biochemistry at the University of Wisconsin-Madison and his PhD in biochemistry from Duke. In 1999, he completed an ABMGG fellowship in clinical molecular genetics at the University of Wisconsin. He was at the Blood Center of Wisconsin for 23 years, where he led product development and molecular diagnostic testing, and he is currently at the University of Pittsburgh as a director of the clinical genomics laboratory and associate professor of obstetrics, gynecology, and reproductive sciences. Dan has over 20 years of experience as a clinical lab director responsible for development, review, and reporting of clinical genetic test results.

A Transcript of the Talk:

Here’s a little bit of an outline of the activities that are going on in my laboratory. You can see that, as in many laboratories now, we’re focused on next-generation sequencing assays. We’re offering whole exome, clinical exome, and hereditary cancer panel testing, and those all go through a common pipeline where there’s alignment, variant calling, and filtering.

Then the next step is where all the work happens. As people who do this work know, it is all in the variant annotation and the ACMG variant classification, and we use lots of different tools to try to dig into that information. We use databases such as gnomAD, ClinVar, and HGMD, as well as predictive tools and other databases on protein structure. But we also have to do an in-depth literature search, which is a big part of gathering the information in order to do the ACMG classifications.

So what are some of the challenges in variant interpretation? Again, there’s a large number of variants that are identified in exome analysis. We prioritize these variants for evaluation and document the variant annotations, so we can keep track of why we made the decisions we did at the time we did. We also need skilled analysts to do this variant interpretation. In our laboratory it’s three PhDs doing this work for us. The literature says, and it’s pretty much what we’re finding, that it can take somewhere between six and eight and a half hours to analyze all the exome variants. So it becomes a major bottleneck in the workflow for exome analysis. Literature search is a crucial part of this variant interpretation, specifically finding the information we need that is relevant to the ACMG guidelines. So we use Mastermind in our laboratory for this advanced literature searching. It’s actually integrated right into our workflow: in the variant tables and our Fabric Genomics workflow, we can see the Mastermind entries.

So what type of evidence are we looking for in the literature? We’re looking for these common questions that help us classify variants:

Is the variant frequency consistent with the frequency of the disorder?
Is the prevalence of the variant in affected individuals increased compared to controls, where we’re looking for case control studies?
What are the phenotypes associated with the variants?
Does the variant co-segregate with disease in families?
Is the variant in a critical, well-known functional domain or a mutation hot spot?
Are there well-established functional studies that support a damaging effect on the gene or protein product?
Does the variant follow the expected inheritance pattern?
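
Editor's illustration: the questions above can be kept as a small checklist that ties each one to evidence codes from the 2015 ACMG/AMP guidelines. This is only a minimal sketch. The pairing of questions to codes is the editor's annotation and was not part of the talk, and the names `EvidenceQuestion`, `QUESTIONS`, and `questions_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceQuestion:
    """One literature question and the ACMG/AMP codes it commonly informs."""
    question: str
    acmg_codes: tuple


# Assumption: this mapping is an editorial annotation, not the speaker's list.
QUESTIONS = [
    EvidenceQuestion("Variant frequency consistent with disorder frequency?",
                     ("BA1", "BS1", "PM2")),
    EvidenceQuestion("Prevalence in affected individuals increased versus controls?",
                     ("PS4",)),
    EvidenceQuestion("Phenotypes associated with the variant?", ("PP4",)),
    EvidenceQuestion("Co-segregation with disease in families?", ("PP1", "BS4")),
    EvidenceQuestion("Functional domain or mutation hot spot?", ("PM1",)),
    EvidenceQuestion("Well-established functional studies?", ("PS3", "BS3")),
    # Several criteria touch on inheritance; left unmapped in this sketch.
    EvidenceQuestion("Expected inheritance pattern?", ()),
]


def questions_for(code):
    """Return the questions whose answers bear on a given ACMG/AMP code."""
    return [q.question for q in QUESTIONS if code in q.acmg_codes]


print(questions_for("PS4"))
```

A checklist like this makes it easier to record which question each reference answered when documenting a classification.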

So these are the kinds of things you go into the literature for. So what I thought I’d do is illustrate with a couple of cases how important this search tool can be in helping you find information quickly.

This is a case: a patient had a medical history of breast and ovarian cancer. They ordered our hereditary cancer panel, and we detected a variant in PALB2, a deletion in that gene. We can go into Mastermind, search by all diseases, and put in the variant. One of the things you see right away on this screen is that when we enter this variant, there are two alternate variant nomenclatures associated with this change. So it’s a little bit of a hint when you do the search that there may be information under other names in the literature.

When we go to this next page, we see that with just the primary search term it only picked up five articles, but if you use the expanded search terms with those other names, you pick up 48 references. So it’s a really big difference. Again, it looks in many different ways in the literature for this variant.
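
Editor's illustration: the effect of searching on alternate nomenclature can be seen in a toy example. This is a minimal sketch using plain substring matching. It is not how Mastermind indexes the literature, and the variant names and article texts below are hypothetical placeholders, not the PALB2 variant from the talk.

```python
def count_matching_articles(articles, names):
    """Count articles whose text mentions at least one of the given names."""
    lowered = [name.lower() for name in names]
    return sum(any(name in text.lower() for name in lowered) for text in articles)


# Hypothetical placeholder names standing in for one variant's nomenclatures.
primary = "VARIANT_NAME_A"
alternates = ["VARIANT_NAME_B", "VARIANT_NAME_C"]

# Hypothetical article snippets.
articles = [
    "We report VARIANT_NAME_A in two families.",
    "The founder allele VARIANT_NAME_B was genotyped.",
    "Carriers of variant_name_c were identified.",
    "No variant of interest is named here.",
]

primary_hits = count_matching_articles(articles, [primary])
expanded_hits = count_matching_articles(articles, [primary] + alternates)
print(primary_hits, expanded_hits)
```

The same article set yields more hits once every known name for the change is included, which is the gap between the five and 48 references described above.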

This is the view you get when we go into these references. First of all, you can see all the references lined up by publication date. On this axis, we see the clinical impact of the journal, and the size of the circle gives you relevance to the search terms, so you have an idea right away which of these things might be most important to you. When you go down to this next step, you have the ability to export these references in a list. You can export them right into Google Scholar if you want to.

Next, this display tells you that the gene name has been referenced 79 times in this article, and the variant has been referenced twice, so you get an idea of whether just the gene or also the variant is discussed there. Down below you see the PubMed links. You can go right to the PDF if it’s available. In my workflow, I can go through PubMed and my permissions from the University library immediately give me access to all the free journals, so I can quickly get to the reference. Down below you get a little bit of context inside this reference of where this variant is mentioned. You can see here that this variant looks like it’s a Czech founder mutation, which becomes important when looking at the data, and you also see it was found in a Polish patient. So you get a little bit of an idea what this article might have right away. Down below you see some protein nomenclature and rsIDs for this reference.

The other important thing is that we can take this variant list and quickly break it down into specific things we’re looking for. You can break it down by ACMG guidelines, by clinical significance, genetic mechanisms, or other search terms. These are the kind of things you could pick under each of these categories.

You see a bunch of different categories on the ACMG guidelines in which you can immediately go look for information. In this case, I just picked the case controls, and you can see immediately that there are 14 references there under this area. So again you can quickly go to references that have the specific information you’re looking for.

When I go into case control studies, you see we have a reference here with a nice big dot. The gene is mentioned 62 times and the variant once. Again we see some information below, and when we go right to this article (remember, we’re looking for case controls) you can immediately see that they tested 807 patients and 1690 controls. You get the variant frequency in patients and controls. This was a founder mutation, so in some populations it may be a little higher than you predicted just based on the fact that it was a founder. So again, we’re quickly getting to some important information about this variant.
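
Editor's illustration: once a case-control paper gives carrier counts, the comparison of frequencies is simple arithmetic. This is a minimal sketch. The group sizes (807 patients, 1690 controls) come from the talk, but the carrier counts are hypothetical, since the talk does not state them.

```python
def carrier_frequency(carriers, total):
    """Fraction of individuals in a group who carry the variant."""
    return carriers / total


def odds_ratio(case_carriers, n_cases, control_carriers, n_controls):
    """Odds ratio from a 2x2 table of carriers and non-carriers."""
    case_non = n_cases - case_carriers
    control_non = n_controls - control_carriers
    if case_non == 0 or control_carriers == 0:
        raise ValueError("odds ratio undefined when a denominator cell is zero")
    return (case_carriers * control_non) / (case_non * control_carriers)


N_CASES = 807          # from the talk
N_CONTROLS = 1690      # from the talk
CASE_CARRIERS = 8      # hypothetical count, for illustration only
CONTROL_CARRIERS = 2   # hypothetical count, for illustration only

print(carrier_frequency(CASE_CARRIERS, N_CASES))
print(carrier_frequency(CONTROL_CARRIERS, N_CONTROLS))
print(odds_ratio(CASE_CARRIERS, N_CASES, CONTROL_CARRIERS, N_CONTROLS))
```

With a founder mutation, the control frequency in the founder population can be higher than expected, so the population the controls were drawn from matters when reading such a ratio.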

The next example is the same kind of medical history of breast and ovarian cancer. They ordered our panel and we detected a variant in BRCA1 that looks just like a consensus splice site variant. Normally you would just say “well, that’s at least likely pathogenic”, and that’s likely to be true, but again we can go into Mastermind. We enter this variant, and we see that there are 35 articles here and, quite notably, that in this one the gene name occurs quite often and there are over 70 mentions of this variant. So clearly this paper is on this variant, and it’s really important. If we go to that paper, I won’t make you read all the abstract, but basically what you find out right away is that this splice variant is on a haplotype with another variant. I’m just going to summarize some of the findings that are in that abstract for you:

First of all, the odds of causality, considering the case-control data, the segregation, and the tumor pathology, are very, very low. The splice variant is always in cis with this other variant. The haplotype causes exon skipping, mainly due to the other variant, not the splice variant. The functional analysis showed that this change resulted in a twenty to thirty percent in-frame transcript, and that was enough to give function, because it turns out these exons are not as important in BRCA1. The conclusion is that this variant should not be considered a pathogenic allele. So again, I go very quickly from thinking I have a very likely pathogenic allele to finding out, just in the first paper, that that’s not the case.

The other really important thing that Mastermind does for us: there are a lot of times we don’t find the information we need. We have a lot of VUSs, or things we have to keep track of, for our variant classifications. With Mastermind, you can set up Alerts based on the variants and the genes, and you can set the frequency. Below here are a couple of variants that are VUSs, for which we have Alerts. We get weekly messages on whether any new publications have come out on these variants. So this is a really important part for a clinical lab. Our customers are always asking us “How do you keep track of changes in classification? How can we count on you to tell us if a VUS turns into likely pathogenic or benign?” This is the tool we’re using to do that.
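
Editor's illustration: the idea behind a weekly alert is to report only publications that appeared since the last check. This is a minimal sketch of that logic and does not represent the Mastermind Alerts feature or any of its interfaces. The variant labels, titles, and dates are hypothetical placeholders.

```python
from datetime import date


def new_publications(publications, last_checked):
    """Return publications dated after the last check."""
    return [p for p in publications if p["published"] > last_checked]


# Hypothetical tracked VUSs with placeholder publication records.
tracked = {
    "VUS_EXAMPLE_1": [
        {"title": "Placeholder article A", "published": date(2019, 9, 30)},
        {"title": "Placeholder article B", "published": date(2019, 10, 10)},
    ],
    "VUS_EXAMPLE_2": [],
}

last_checked = date(2019, 10, 7)  # hypothetical date of the previous weekly check

alerts = {variant: new_publications(pubs, last_checked)
          for variant, pubs in tracked.items()}

# Variants with at least one new publication are due for re-review.
to_review = [variant for variant, pubs in alerts.items() if pubs]
print(to_review)
```

Only variants with new literature come back for re-review, so the lab does not have to re-search every VUS by hand.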

I hope I’ve shown you that Mastermind enables the genome analyst to quickly prioritize literature on gene variants. The search criteria are flexible and include ACMG criteria. I also heard the new release is going to be adding HPO terms. Gene and variant Alerts notify the lab when updated information is available, and it makes our variant interpretation a lot more efficient. Thank you.