In this guided training, we’ll dive into genomic analysis with an expert curator and an experienced variant scientist, who will demonstrate how they utilize the Mastermind Genomic Search Engine to rapidly classify variants with all relevant evidence. Driven by common questions from users, we will highlight helpful tips and tricks for navigating the literature with the world’s most comprehensive source of genomic evidence—so that you get the most out of your data.
The Mastermind® Genomic Search Engine is the first source for variant interpretation of cancer and germline disease, using an AI-driven approach to connect patient genetic data with relevant evidence from scientific literature. The world’s most comprehensive source for genomic evidence, Mastermind is used by over 1,000 diagnostic labs and integrated into 18 clinical-grade decision support platforms and reference databases across the globe.
You will learn how to:
Use Mastermind’s powerful Genomic Language Processing (GLP) algorithm to gain actionable insights from all published articles related to your search.
Reduce turnaround time and increase diagnostic yield with clinical prioritization and in-text evaluation functionalities.
Scale your caseload with advanced features such as protein-centric searching, genomic associations, ACMG/AMP criteria filtering, and alerts.
Welcome & Introductions
02:17 General Concepts
03:13 Sensitivity, then Specificity
04:21 Protein Centric Searches
05:36 Genetic Coordinate vs. Transcript/Protein-Specific Searches
07:48 Genomic Associations
10:31 Analysis Workflows
10:51 Synonymous Variants
– Within Splice Region
– Outside Splice Region
18:39 Non-synonymous Variants
– Frameshift Variants
– Missense Variants
– Nonsense Variants
33:17 CNV Searching
– CNV Searches Smaller Than a Full Gene
– CNV Searches Larger Than a Full Gene
Brittnee Jones, PhD
Director of Customer Success, Genomenon
With over a decade building and leading customer success teams across the NGS space, Brittnee ensures rapid product adoption and maximal value for Mastermind users.
Diane Nefcy, MA, BS
Senior Variant Analyst, Genomenon
An expert curator, Diane works on increasing the speed and accuracy of variant interpretation.
GARRETT: Hello, and welcome to today’s webinar, Ask the Masterminds: Insider Tips and Tricks for Getting the Most out of Mastermind. My name is Garrett Sheets, and I’ll be your host. In today’s webinar, an expert curator and an experienced variant scientist will guide you through how they use the Mastermind genomic search engine to rapidly classify variants with all relevant evidence. Driven by common questions from users, they will highlight helpful tips and tricks for navigating the medical literature so that you get the most out of your data.
If you don’t yet have a Mastermind account, you can sign up at the link below with a free trial of the professional edition. If you’re using the basic edition, you’ll want to talk to us about upgrading to get the most out of your subscription and access to the most features. We’re always here to answer your questions, so feel free to reach out to us at firstname.lastname@example.org to speak to a Mastermind specialist. Updated weekly, Mastermind identifies every gene, variant, disease, phenotype, therapy, and categorical keyword association across the full text of the world’s most comprehensive database of genomic publications. Mastermind delivers clinically prioritized search results and genomic insights into every article to speed clinical interpretation and genomic research for both cancer and mendelian disease.
We have a lot of great information to share, so let’s get right into introductions. Today, we’re joined by Dr. Brittnee Jones, Genomenon’s director of customer success. Hi, Brittnee, thanks for joining us today!
BRITTNEE: Hi, Garrett, thanks for the introduction, and glad to be here!
GARRETT: We also have Diane Nefcy, who is Genomenon’s senior variant analyst. Hi, Diane!
DIANE: Hi, Garrett, thanks so much!
GARRETT: Thank you both for being here to talk about some common questions that come up from our customers. We’re going to have both Brittnee and Diane get us started with some general concepts to get you comfortable with the Mastermind interface, and then we’re going to jump into some search examples. So let’s get started!
DIANE: This tutorial is intended to accompany a companion quick start guide that was written with the intent to introduce researchers to Mastermind and to ensure that you get the most out of your data and time while using it. You will find this guide in the handout section. Mastermind has been built from the ground up around certain and specific core features, and this construction shapes how data is presented as well as searched for. These features are designed with the intent of using literature evidence to develop real and actionable clinical and pharmacological insights while minimally affecting ease of access. These concepts are broadly defined as: sensitivity, then specificity; protein-centric searches; specificity via transcript or genetic coordinates; and genomic associations.
The first general concept is sensitivity, then specificity. This concept is first because all best practices for Mastermind use cases will first maximize sensitivity and then optimize specificity in a way that does not affect the original sensitivity. We will demonstrate how to do this in later examples. The rationale behind this feature is due to how false positive results can be addressed by reviewing the context in which each match was observed. Mastermind will also present to the user text surrounding each match. False negative results, however, are non-recoverable and not available for evaluation, since it is impossible to evaluate data that’s not there. This overall approach balances the issues of both false negatives and false positives, and the end result allows users the freedom to evaluate the quality of Mastermind’s evidence and then customize that evidence to suit their individual needs. Now, we will show you examples of the rest of these core concepts within the Mastermind interface.
The next general concept is protein-centric searches. To more closely describe the practical or biological effect of a given variant and thus maximize the probability of finding actionable evidence, Mastermind normalizes searches to the relevant protein description. This technique has the additional benefit of compensating for both searches that are highly specific on the cDNA level and those lacking specific observations in the literature. The end result ensures users will not miss evidence relevant to the biological impact of their desired variant. For example, a search for the specific cDNA variant BRAF:c.1799T>A would show results for V600E. A different specific cDNA variant, this time a deletion-insertion, will also show results for V600E, as they both result in the same amino acid change within the resulting protein on the same transcript.
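To make the protein-level normalization concrete, here is a minimal Python sketch of why two different cDNA changes can land on the same protein-level key. It is an illustration only, not Mastermind's implementation: the `protein_key` helper and the three-entry codon table are hypothetical, and the deletion-insertion shown (c.1799_1800delinsAA) is picked here as an example because the talk does not name the one used on screen. The sketch assumes BRAF codon 600 reads GTG across cDNA positions 1798 to 1800.

```python
# Illustrative only: a tiny codon table covering just the codons used below.
CODON_TO_AA = {"GTG": "V", "GAG": "E", "GAA": "E"}

# Assumption: BRAF codon 600 is GTG, spanning cDNA positions 1798-1800.
REF_CODON = "GTG"
CODON_NUMBER = 600
CODON_START = 1798  # cDNA position of the first base of the codon


def protein_key(ref_codon, codon_number, codon_start, cdna_start, new_bases):
    """Overwrite bases starting at cdna_start and return a protein-level key."""
    offset = cdna_start - codon_start
    alt_codon = ref_codon[:offset] + new_bases + ref_codon[offset + len(new_bases):]
    return f"{CODON_TO_AA[ref_codon]}{codon_number}{CODON_TO_AA[alt_codon]}"


# c.1799T>A changes GTG to GAG.
snv = protein_key(REF_CODON, CODON_NUMBER, CODON_START, 1799, "A")
# c.1799_1800delinsAA changes GTG to GAA (hypothetical example variant).
delins = protein_key(REF_CODON, CODON_NUMBER, CODON_START, 1799, "AA")

print(snv, delins)
```

Both calls produce the same key because GAG and GAA both encode glutamate, which is the reason literature describing either nucleotide change is relevant to the same protein-level search.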
This leads to the third general concept, which is specificity via transcript or genetic coordinates. In Mastermind, the practical purpose of a cDNA or protein query is to act as a filter on a predefined genetic transcript, or a set of transcripts if multiple transcripts contain that reference or position. As you saw in this previous example demonstrating how Mastermind normalizes searches to the protein level, you can see how Mastermind may, at times, prompt you with a couple of protein level choices for one cDNA variant. Each of these is a valid variant, but in different transcripts. Another way to describe this is that results for the same protein variant at either the cDNA or the protein level are delineated by pre-existing transcripts. This ensures that when transcript-based nomenclature is not specified, the results cannot be confused by the existence of multiple transcripts, and reduces ambiguity in the results that Mastermind sends to you.
Of course, because transcript-specific searches are such a core feature of Mastermind, you will be shown which transcript your variant result was mapped to. Here in the variants panel, you can see that both of these two valid protein level variants are selected, and the panel will automatically select and scroll down to your variant interest, wherein you can click the “more info” option to report to you that matching transcript. For a singular variant search, when there are two possible results that Mastermind presents you with, each is mutually exclusive to a specific transcript. As you can see here, these two sets of transcripts are mutually exclusive.
The last core general concept that shapes how Mastermind presents information is genomic associations. Mastermind uses machine learning to discover genetic associations that are supported by literature and computational evidence. This is how a user can discover novel connections between variants, diseases, therapies, and any other pertinent keyword or keywords. You can use these categories as association filters, which makes it possible to explore innovations with precision, right down to the specific genetic variant. However, some research projects begin with a broad question that steadily becomes more developed with time. Mastermind’s genomic associations interface allows for these investigations to begin even from their broadest point, and then continually refine them as the situation calls for. As you will see in this next example, multiple phenotype filters can be entered into the Mastermind interface to search for genes, diseases, or even other phenotypes within the literature that have been observed in association with these phenotype terms of interest.
For example, if we have a case with prolonged neonatal jaundice combined with emphysema (and in this case, you can see the instruction: “press shift to select,” and we execute the search), we can see that right away we’re defaulted to the disease tab. You have your emphysema phenotype of interest, but then right away, in the section of 40 results, you can see a probable disease cause of alpha-1 antitrypsin deficiency. Here in the phenotypes tab, you will find, already selected, your phenotypes of interest, as well as other phenotypes that may be related. It may be helpful to sort by article count, and then you can see right away, all this. A block of results with about 80 papers has a high likelihood of being associated with the disease that you have just determined will result in these phenotypes. You can also click on the gene tab, and right at the top, we have the causative disease, the causative gene, and then you can also view therapies that have been related to these two phenotypes.
BRITTNEE: Now that Diane has gone over the foundational aspects of Mastermind and how the data is presented and organized, we’ll go over some introductory analysis workflows recommended to help get you started. Keep in mind that these workflows are shaped by what we just talked about in the general concepts.
DIANE: All right, thanks very much, Brittnee! To start, we recommend that you begin searches with gene or variant information only. This will offer you maximum sensitivity for your gene or variant of interest. The next step that we recommend is to review the number of returned articles, seen up here on the right-hand side of your screen. You should decide if your group needs a more specific search. We here at Genomenon consider an ideal number of papers to be about 20, but this may not be the same for different research groups. You should decide what an efficient number is for your specific group and set that value accordingly. One of the easiest and most efficient ways to filter down a large body of papers is to act on the titles of those papers. Titles are very helpful, especially when it comes to academic literature. For very well published variants, you can search the titles for custom keywords in Mastermind. That feature is available here in the articles panel: simply click the filter word. For this example, we’ll use “splicing” as our keyword filter. As you can see, it’s brought the number down considerably. Please note that you can apply this same concept to introduce a variant search at any time by interacting with either of these two variant panels. As you can see, the number of variant matches here in this column will accurately predict the number of articles resulting from the addition of a variant filter.
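The title-filtering step described above can be pictured as a case-insensitive substring match over article titles. The Python sketch below is only an illustration under stated assumptions: the article records, the placeholder titles, and the `filter_by_title` and `needs_narrowing` helpers are hypothetical and do not reflect Mastermind's actual implementation or API. The target of about 20 articles is the figure mentioned in the talk, and each group would set its own value.

```python
# Hypothetical article records; the titles are placeholders, not real papers.
articles = [
    {"id": "A1", "title": "Aberrant splicing in a placeholder gene"},
    {"id": "A2", "title": "Placeholder cohort study"},
    {"id": "A3", "title": "Splicing assay for a placeholder variant"},
    {"id": "A4", "title": "Placeholder review of the literature"},
]

# Assumption: the group has settled on roughly 20 articles as a workable number.
TARGET_ARTICLE_COUNT = 20


def filter_by_title(records, keyword):
    """Keep records whose title contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [r for r in records if needle in r["title"].lower()]


def needs_narrowing(records, target=TARGET_ARTICLE_COUNT):
    """Return True when the result list is longer than the group's target."""
    return len(records) > target


splicing_hits = filter_by_title(articles, "splicing")
print([r["id"] for r in splicing_hits])
print(needs_narrowing(articles))
```

The point of the sketch is the order of operations: run the broad search first, check the count against the target, and only then apply a title keyword to narrow the list.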
Another great way to filter your results is to use Mastermind’s pre-made keywords. These are generated from commonly used terms in academic literature. The category is separate from the articles panel in this filter categories drop down because these filters will act on the entire full body of the papers that we have indexed. As you can see, there are four broad categories of keywords and each one has its subcategories here on the left. To speed up searching there are two quick reference features in this filter categories drop down. First, “enable all” is always offered for each category that you will find. Second, the estimated number of articles that will be returned for when the search is executed will be listed next to the keywords, right here. You can click “enable all” to enable everything. You can also use the “disable all” feature. For this example, we’ll go to the case study section and enable all there, and then execute the search. When you have filtered the results as you see fit, the next recommended step is to directly view the literature evidence found in the full body of the texts. You can do this by selecting an article from the articles panel, which, by default, is sorted based on predicted usefulness. This is the same as relevancy to your current filter and keywords. You’ll see the text matches and the surrounding sentences here in the full text matches panel. This enables you to see all sentence fragments that have been deemed related to your assessment, and will increase the number of visible sentence fragments. The “Show:” drop down menu located here allows you to filter this full-text matches panel by keyword type. Because we have keyword filters, we can click keywords, and you’ll see as you saw in the filtered category section, the case control keywords. 
When reviewing these sentences, it may be helpful to note the column that reads “matched” located here, as this allows you to confirm the different variant nomenclatures that Mastermind has observed. As you can see in this current paper, this can be for exactly the variant you searched for or other variant nomenclatures. Here, we’ll quickly scan through other search results, and you can see the other types of variants that are matched. As you can see here, it’s not just the gene name, but also the gene transcript that can be matched. Here, even rsIDs are matched.
Another quick visual feature of Mastermind is to mark with a target those articles that exactly match your specific search. This compensates for the default Mastermind behavior of bucketing all matches that result in the same protein variant. For this example, we’ll use a splice variant. We’ll click on the first article result. As alluded to previously, when you’ve defined your search by nucleotide specificity, Mastermind will prioritize all articles citing this exact variant. These articles will be at the top of the results list and marked with a target icon, as you can see here. You can confirm this by showing your variants and viewing the sentence fragments surrounding your variant of interest. Those that create the same protein-level variant here but, for any reason, do not match your specific nucleotide variant will not have this target and will be pushed to the bottom of the results list. As you can see, here are some results that match the protein variant but do not have the target icon. If we click on them, you can see why. This way, when nucleotide specificity is important, you don’t have to do much scrolling, and you can limit your review to the very first resulting articles.
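The ordering behavior described above, with exact nucleotide matches first and protein-level-only matches after them, can be sketched as a two-part sort key. This is a hedged illustration, not Mastermind's ranking logic: the records, the `relevance` scores, the `exact_match` flag, and the `prioritize` helper are all hypothetical.

```python
# Hypothetical search results. "exact_match" marks articles citing the exact
# nucleotide variant searched for; the others only share the protein-level variant.
results = [
    {"id": "A1", "relevance": 0.9, "exact_match": False},
    {"id": "A2", "relevance": 0.4, "exact_match": True},
    {"id": "A3", "relevance": 0.7, "exact_match": True},
    {"id": "A4", "relevance": 0.2, "exact_match": False},
]


def prioritize(records):
    """Sort exact nucleotide matches first, then by descending relevance."""
    return sorted(records, key=lambda r: (not r["exact_match"], -r["relevance"]))


ordered = prioritize(results)
for record in ordered:
    marker = "[target]" if record["exact_match"] else "        "
    print(marker, record["id"])
```

Note that article A1 has the highest relevance score but still sorts below both exact matches, which mirrors the idea that nucleotide specificity takes precedence when it has been specified in the search.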
BRITTNEE: Now that we’ve covered the general principles by which Mastermind is laid out and some basic analysis workflow, let’s go into how you might search for evidence for different types of variants. Let’s start maybe with the most commonly searched variant type, which is non-synonymous variants.
DIANE: Thanks so much! As Brittnee said, these are the more common types of variants. To start this section, we’ll start with this specific gene. Nonsense mutations are sometimes the most clear-cut coding variants to interpret, as they primarily cause loss of function events. To determine on a high level if loss of function is a known mechanism of disease for a specified gene, you can begin your search on this gene level and then add category keyword filters to narrow down the results to pertinent articles. Loss of function type filters are found in the filter categories in ACMG interpretation, specifically over here on the left. We’ll enable all filters and execute the search. This type of search pattern will prioritize the “variants” panel by variants observed by Mastermind to be associated with our loss of function keywords. Those with the most references are pushed to the top of the list. As you can see, there are two columns here. This first column is the total number of articles for just this variant, and this column is the total number of articles with this filter. Additionally, we’ll combine the previous workflow that you’ve seen and search titles for the word “loss.” So, as you can see, we’ve severely filtered down our results, and this will help you. Please note that in Mastermind, nonsense variants will uniformly use the X abbreviation, as you can see here in the prioritized variant panel. Frameshift variants also usually have a more clear-cut biological effect, and are simpler to interpret than missense variants.
In the Mastermind variants table, nomenclature is the index amino acid and the Mastermind key ending with the -fs suffix. You may also search within the variant table to see other variants within the same codon or near that codon. The codon is 410, so if we type that in, we can see all variants within the 410 codon, or we can type “fs” and filter based on just frameshift variants. In this case, if we sort by DNA position, we’ll also see surrounding frameshift variants. Furthermore, if you know the variant results in an early termination or nonsense event, you may also achieve results with the use of the loss of function type filters. From here, as before, you can review, filter, and sort the variants panel for all variants associated with known loss of function frameshift events in the literature.
Missense variants can be among the most challenging to interpret without direct evidence from previous cases or from functional studies that support the variant’s role in causing disease. To determine if missense variants are a known mechanism of disease for this gene, begin your search on the gene level by removing any other filters, then engage the relevant category keyword filters. For missense, those will be found under “genetic mechanism” in the variant tab. Because we have a large number of filtered articles still, we’ll do that same search by title. Here, the very second result starts with functional evaluation. If we commit to variants only, we can see severely defective in the N210S variant. If there are no articles for your variant of interest, sorting the variant table by position, as shown previously, will allow you to discover if other missense variants near or at the same position have been reported as pathogenic. So we’ll search for codon 443, and as you can see, Mastermind has found no results at this codon, but if we search a nearby codon, you can see that Mastermind does have some results. Because Mastermind’s variant panel will default the view to your selected variant, you can then sort by cDNA position and look at variants near codon 443.
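The variant table searches described above (typing a codon number or "fs" into the table, then looking at neighboring positions when the codon itself has no results) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the variant keys are invented, the key pattern is a simplification of the protein-level labels shown on screen, and `search_variants`, `codon_of`, and `nearby` are hypothetical helpers rather than Mastermind features.

```python
import re

# Hypothetical protein-level keys, loosely in the style shown in the variants panel.
variant_keys = ["R410fs", "R410Q", "L412fs", "G440R", "A445T", "Q398X"]

# Simplified pattern: one-letter amino acid, codon number, then the effect.
KEY_PATTERN = re.compile(r"^([A-Z])(\d+)(.+)$")


def codon_of(key):
    """Extract the codon number from a protein-level key."""
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"unrecognized key: {key}")
    return int(match.group(2))


def search_variants(keys, text):
    """Mimic typing text into the variant table search box (substring match)."""
    return [k for k in keys if text.lower() in k.lower()]


def nearby(keys, codon, window=5):
    """Return keys within `window` codons of the given codon, sorted by position."""
    hits = [k for k in keys if abs(codon_of(k) - codon) <= window]
    return sorted(hits, key=codon_of)


print(search_variants(variant_keys, "410"))
print(search_variants(variant_keys, "fs"))
print(search_variants(variant_keys, "443"))
print(nearby(variant_keys, 443))
```

In this toy data set, the search for codon 443 comes back empty while the neighborhood search still surfaces variants at codons 440 and 445, which is the situation Diane walks through for the missense example.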
BRITTNEE: Thanks, Diane, I really think these last few examples highlight how just incredibly powerful Mastermind is at finding information when there’s no articles or few articles that are going to inform you about your single variant. What you were just saying there at the end, about finding nearby variants, obviously, that informs the ACMG criteria, and that can be incredibly helpful. I feel like that’s where we got some of these example types. Why we’re searching some of these variants is to ask, what do you do if you’re looking for gene level information to answer the criteria, is loss of function a known mechanism of disease? Mastermind isn’t just useful at the variant level, but it’s incredibly useful with our powerful filter categories or keywords as we’ve been talking about. It’s incredibly useful at the gene level or including diseases in order to really be able to answer the criteria and score our classified variants as pathogenic.
DIANE: Absolutely, I totally agree. One interesting category of variant includes what would be on paper, a synonymous variant, but will biologically cause a splice event, and Mastermind can quickly help you resolve those as well.
For this example, we’ll use this synonymous variant, and even from the suggestions already, you can see that Mastermind has observed this synonymous variant in the literature, and can already tell you that it falls on a splice region. For codons that cross splice regions, synonymous variants are not necessarily benign, given that they can impact a splice junction. As shown before, the articles that Mastermind returns are grouped by their potential similar biological impact, which is another proof that you can use to confirm Mastermind’s sorting potential. To show that, we’ll click on the co-occurring matches in the variants panel. This will bring to the top of the list variants that co-occur in the same article. You can see over here that this synonymous variant and this splice variant at the same position co-occur in the same articles.
As you may be seeing, there’s a feature that allows you to discover or remind yourself straight from the variants panel where these splice sites are. This is the ability to filter straight in the variants panel. Variants within the splice acceptor or splice donor sites are grouped into sa and sd, respectively. Variants observed within the untranslated regions will be grouped into UTR.
BRITTNEE: I’ll jump in at this point and just add that, if you have any questions, I know sometimes it’s really hard to remember all of these different nomenclatures that Diane’s going through for you. We’ve tried to organize this information, and it can be found up there in the top right corner under help and then clicking FAQ. It’s just a quick reminder for all of the different shorthands that we’re using, so that you can refresh your memory of this after this presentation. Thanks, Diane.
DIANE: Awesome points. This filter for the variants panel allows you to look at or discover other potential splice variants within your gene. Practically speaking, this feature can speed up investigations concerning whether these types of protein effects are pathogenic, not only for your current variant, but also for other splice variants within the same gene. If you know the number of the codon that crosses the splice site (which you don’t necessarily have to with Mastermind, but if you do), you can type that codon number in to look at the other potential effects on the same codon. As you can see, all potential effects at this codon, 509, are listed, and sorting pushes non-relevant codons to the bottom, so you can see multiple instances of these types of variants in their variant forms in the literature.
As you may also guess, should there be no results for your variant of interest, you can expand up a level into a gene level search for these types of variants as well. You can expand to the gene level search by removing the variant of interest, and then engaging the category keyword filters to narrow down the results to pertinent articles. For splice type variants, there are several terms grouped within the genetic mechanism filter tab, which you can then leverage to prioritize where authors talk about splice events. As you can see, these filters are engaged with a boolean OR, so there’s probably going to be overlap, even though you select multiple categories. Other potentially useful filters would include those that involve the effects of splice site disruptions. These involve exon skipping or exon deletions. As such, those keywords are found in the exact same spot, under genetic mechanism in the variant subcategory, “exon skipping” and “exon deletion.” As before, any variants of interest that are associated with these two filtered categories will be pushed to the top of the variants list.
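The remark about filters being combined with a boolean OR has a practical consequence: the combined result count is usually smaller than the sum of the per-keyword estimates, because one article can match several keywords. The Python sketch below illustrates this with set union. The article IDs, the per-keyword hit sets, and the `combine_with_or` helper are all hypothetical and are not drawn from Mastermind.

```python
# Hypothetical sets of article IDs matched by each keyword filter.
keyword_hits = {
    "splicing": {"A1", "A2", "A3", "A5"},
    "exon skipping": {"A2", "A3", "A6"},
    "exon deletion": {"A3", "A7"},
}


def combine_with_or(hits, selected):
    """Boolean OR across keywords: an article matches if any keyword matches."""
    combined = set()
    for keyword in selected:
        combined |= hits[keyword]
    return combined


selected = ["splicing", "exon skipping", "exon deletion"]
combined = combine_with_or(keyword_hits, selected)
sum_of_counts = sum(len(keyword_hits[k]) for k in selected)

print(len(combined), sum_of_counts)
```

Because enabling more keywords under OR can only add articles and never remove them, enabling all keywords in a category is consistent with the "sensitivity first" approach recommended earlier in the talk.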
That concludes my portion of variant section searches and variant-centric search examples that we have for you in this tutorial video.
BRITTNEE: Thanks, Diane! I think, just to add in here, and what Diane was speaking to, it’s really that these filter categories at the top are incredibly powerful. If you’re not finding information about a specific variant of interest, really go back out to that gene level and start looking for more general principles that might affect your gene of interest. By indexing every variant and offering you every variant that’s in the gene on the left panel here, we can really help you search for other variants of the same type within your gene, that can inform your classification outside of someone having done an extensive study on your specific variant of interest. So again, please pay attention to the information on the left. It’s really powerful, not just to find information about your specific variant, but about the gene, and then to apply filter categories in order to inform you about the types of information you’re looking for.
Now, we’re going to actually skip ahead a little bit in the guide. For anybody that’s following along, we’re going to jump over to the CNVs section. We want to introduce a couple of quick points to highlight some of the functionality, and again, to assist in some of the searches that you may be doing around copy number variants.
Just like almost everything that Diane was doing, using those keywords, using diseases, using phenotypes, this is another professional-only feature within Mastermind. To search a copy number variant, we can do that by starting at the exon level or gene level. So I started with a deletion in LDLR in exon 7. You can see Mastermind recognizes that search, and gives it back to me, repeats it back to me, to ensure I understand that Mastermind did recognize that. Authors do not only use this type of language when they’re speaking about a copy number variant. We’re trying to predict what authors are saying in the literature, so we’re basically trying to read their mind and figure out how to index that information correctly and give it back to you with your search. In this case, there is information available about that deletion in LDLR exon 7, and we’ll get those repeated back to us down here again in those sentence fragments. I’m going to relaunch this search, as some authors actually use a slightly different nomenclature. Instead of saying a deletion within an exon, so exon 7, they’ll actually use the protein nomenclature, the p. type nomenclature that corresponds to that same region. In this case, I’m going to search for this variant that was a deletion between the start of that exon and the end of that exon. In this way, it’s two different methods to get at the same information, and that’s how you can get the most comprehensive set of information returned for any CNV that is smaller than a whole gene. We recommend for this workflow, try both the protein level as well as the exon level nomenclature when you’re searching for something smaller than a gene. When you get into searches that are much larger than a gene, that’s where you can type in, say, an entire gene, or even use chromosome coordinates if that’s what you have.
This last section that I want to go through is just highlighting some of the different filter categories. Diane went through different variant-specific scenarios for us. Now, I just want to speak to all of the different filters that we have available here in “Filter Categories.” You may understand what you’re looking for when you’re looking for this type of information, or different types of clinical questions. The most common ones that our users use here is functional changes. I’m looking for something that has an in vivo type model system change, and we actually group things in cell lines within that in vivo. Or, I’m looking for studies that speak to an in vitro, or an assay. Again, these are predictive words that authors use. One important thing to note here is this number. I know Diane spoke about this earlier, but this number really will tell you how many articles are going to be returned when I highlight these keywords. As she mentioned, we always recommend starting “most sensitive” and enabling all. The next most common, in addition to functional, is pedigrees and case studies. This is where you’re going to get keywords around a “proband,” an individual person. You can see, only 16 of the authors out of the 46 here use the word proband, that’s why we launched this comprehensive search, we’re talking about first degree relatives, we’re talking about family members.
The next section I wanted to highlight here is inheritance type filters. This is what you may want to use if you’re looking for information related to a gene that might be studied under both somatic as well as in an inherited context. This is under “inheritance pattern” here on the left, but in addition to just the word “inherited,” you’re also going to get all the keywords around things like “germline,” “mendelian,” or any of the different types of inheritance patterns that you might see for those genes. On the other hand, if you’re looking for a variant or a gene within a somatic-specific context, you can actually use the word “somatic” to try to pull out the articles that will be highlighting that gene within a somatic context. Down here at the bottom, I think Diane went through some of these others that are highly useful, you know, loss of function type variants. We’d be answering questions around, is loss of function a known mechanism of disease for my gene?
Finally, down here at the bottom, I will show the classification keywords. You can see very few, if any, authors actually use the scoring criteria themselves. However, at the very bottom, there are interpretation keywords. This can help pull those out, so that you’ll get that returned to you in those sentence fragments. In all of this, every time I’m adding some of these keywords — and I’m just going to submit with those chosen filters — there’s a huge reorganization of the relevance return here in the top right as well as the benefit that I get when I highlight one of these. As Diane showed, we can sort by match type, but here, I want to look at all matches. I get an incredible number of these sentence fragments so that I can really analyze this article without ever needing to jump out and see the article. I’m able to really determine if this is what I’m looking for to test this hypothesis. Is this the information that I’m looking for? If not, then I can just move to the next article.
I hope today everyone picked up some tips and tricks from our frequently asked questions that we’ve gotten about Mastermind. We really wanted to highlight some of the workflows that you can use in addition to searching for just a single nucleotide gene and variant. If you have any questions when you’re doing any type of search, the easiest way to get in contact with our support organization is simply to copy this URL right here. This tells us exactly what you were looking at, including the highlighted terms you had and the filters. Click this “contact us” button. If you can paste that right here, and then ask your question in this window, that will email our entire support organization, including Diane and myself, who can then help you with the workflow you’re looking for or define the type of information you’re looking for. Sometimes, it’s not just as simple as saying, how do I use the software, but, I’m looking for information about a loss of function variant within my gene of interest. We can try to highlight some of these workflows or point you to these sections in the documentation to ensure you can analyze variants as quickly as possible.
GARRETT: Brittnee and Diane, fantastic information! Thank you so much for offering your expertise on using Mastermind, and how customers can get the most out of their subscription. For everyone joining us, thank you so much! In case you missed something or you’re wanting to revisit a concept or a search example, we will send you a recording of this presentation later on today. In the meantime, however, if you have any questions, feel free to reach out to email@example.com. Also, in the handout section, you’ll see two documents that you can download. One is a user guide that reviews all the information we discussed today, and the other is a data sheet that shows how we stack up against our competitors. And of course, to get your Mastermind account, visit the link you see on the screen and get started searching. Thank you, and have a nice day!