What is the Mastermind Genomic Search Engine?
Mastermind is a comprehensive search and association engine to identify gene, variant, disease, phenotype, and therapy evidence from millions of scientific articles. These data are prioritized based on clinical relevance of the article’s content, and evidence of each citation is provided in the form of contextual sentence fragments from full-text literature.
What is Mastermind used for?
Mastermind is used in variant interpretation and biomarker discovery as a decision support tool or data aggregator and has applications in molecular diagnostics, genetics, and drug discovery.
What is Genomic Language Processing?
Built on the principles of Natural Language Processing (NLP), Genomic Language Processing (GLP) identifies every way that an author can describe a gene or variant and filters out erroneous information that can be mistaken for genomic data by incorporating knowledge of biology and the human genome. The genomic data in the scientific articles within Mastermind is processed through this patented GLP system.
What is the source of evidence for Mastermind?
PubMed, Full Text, and Supplemental Data. We index the titles, abstracts, and other PubMed metadata for all articles, along with the full text and supplemental data of articles relevant to genomics and Mendelian disease.
Is supplemental data included in the Mastermind database of scientific literature?
Yes! Mastermind now includes supplemental data in its indexing process. If you’d like to learn more, please contact us.
Does Mastermind provide access to full-text articles to users without journal subscriptions?
No. Mastermind provides online access to articles if they are available to you as open access or through a subscription. If you have access to the article, Mastermind will show you the full text publication within the application. If you do not have access to the article, Mastermind will provide a link to the publisher’s site to purchase the article.
How often is the Mastermind database updated?
Weekly. Mastermind performs weekly updates to its database by identifying the new content that has been published in the preceding week through PubMed and prioritizing this content for indexing.
Can results change from day to day on the same search in Mastermind?
Yes. Because Mastermind data is updated on a weekly basis and genomic data indexing is ongoing, new data can be added to the search results as new articles are indexed.
Are genes and variants cited in the tables and figures of full-text searches included in the Mastermind database?
Yes. Mastermind scans the entirety of the full-text in its search, including tables and figure captions. If data is contained directly in images, Mastermind does not index it. These instances tend to be rarer and happen more often with much older articles.
What kind of searches does Mastermind support?
Genes, variants, diseases, phenotypes, therapies, and categorical keywords. Mastermind supports searches by any of these concepts. Searching by disease, phenotype, or therapy shows all of the genes and variants associated with that disease, phenotype, or therapy. Searching by gene or variant shows all of the diseases that involve the gene or variant. Each of these searches allows users to see and interact with all articles associated with the search terms. The Advanced Search capabilities of Mastermind Professional Edition support user-defined text-based queries in combination with multi-parameter gene, variant, disease, phenotype, therapy, or keyword searches joined by Boolean operators.
How can I use the Advanced Search capabilities of Mastermind?
The Advanced Search capabilities are part of Mastermind Professional Edition and can be used to search for diseases, phenotypes, therapies, category keywords, or user-defined free-text terms. Advanced Search can also be used to perform multi-parametric searches for multiple gene, variant, disease, phenotype, or therapy combinations using AND or OR operators (Boolean search).
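The effect of AND and OR on a result set can be pictured with plain set operations. The sketch below is illustrative only: the `index` dictionary and its article IDs are invented, and it shows neither Mastermind's query syntax nor its implementation.

```python
# Hypothetical mini-index: search term -> set of article IDs mentioning it.
# All IDs are invented for illustration.
index = {
    "BRAF": {101, 102, 103, 104},
    "V600E": {102, 103, 105},
    "melanoma": {103, 104, 105, 106},
}

def search_and(*terms):
    """Articles mentioning every term (Boolean AND = set intersection)."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

def search_or(*terms):
    """Articles mentioning at least one term (Boolean OR = set union)."""
    return set().union(*(index.get(term, set()) for term in terms))

both = search_and("BRAF", "melanoma")
either = search_or("V600E", "melanoma")
print(sorted(both))
print(sorted(either))
```

AND narrows a search to articles that mention all of the terms, while OR widens it to articles that mention any of them.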
What are the differences between Mastermind Basic and Mastermind Professional Editions?
Many. While the data content remains the same between Mastermind Basic and Mastermind Professional, there are several differences between the two versions. High-volume users with more specific analytic needs would benefit from the Professional Edition. The key differences are outlined here in our plan comparison.
What gene formats and nomenclatures are searchable in Mastermind?
HGNC and others. Mastermind uses HUGO Gene Nomenclature Committee (HGNC; http://www.genenames.org/) nomenclature for gene symbol display. Additional synonyms are drawn from multiple other sources, including UniProt (http://www.uniprot.org/help/gene_name).
Do article counts include variant data only available in the full text or supplemental data?
Yes. Mastermind displays all the variants found, whether they were present in the title, abstract, PubMed metadata, or anywhere in the full text or supplemental data. Mastermind Professional Edition has an advanced search function that lets users switch between results drawn only from PubMed content or full text and results that also include supplemental data.
How can a user be sure that a zero result means there are no articles that mention the searched-for variant?
Pretty sure. No search engine will ever have 100% sensitivity, but we believe that Mastermind is much more sensitive than any other method or variant database software available. We have done promising benchmarks in the past:
…and making improvements to Mastermind to address this issue:
…and are assembling additional data now that speaks to the statistical confidence users can expect of a null result when using Mastermind.
How do I report a missing article or variant to Mastermind?
If an article is found to be missing or information from the article was misinterpreted by Mastermind, feel free to reach out to us through the “Contact Us” link within Mastermind or otherwise email us at email@example.com.
What variant formats and nomenclatures are supported in Mastermind?
HGVS and others. Mastermind searches the literature for any one of dozens of different variant nomenclatures – standardized (e.g. HGVS; http://www.hgvs.org/) or not, including HGVSp (protein format), HGVSc (cDNA format), HGVSg (genomic coordinates), rsID (dbSNP), or IVS nomenclatures. For data display, the protein coordinates of the variants are used preferentially to make it easier to find and interact with relevant articles, irrespective of nomenclature used by each author. Mastermind also supports searching by Copy Number Variation (CNV). See details on this function here.
How can I search for variants in Mastermind?
Variants can be entered by typing the variant name in HGVS (or any other valid variant nomenclature) using cDNA, protein, or genomic coordinates (using GRCh37), or by rsID or IVS nomenclature. Select your desired variant from the drop-down list.
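To make the different input styles concrete, here is a rough sketch that guesses which nomenclature a typed query uses. The regular expressions are simplified assumptions written for this example; they are not Mastermind's recognition rules, and real variant mentions are far more varied. The rsID in the usage lines is a made-up placeholder.

```python
import re

# Simplified, illustrative patterns (assumptions, not Mastermind's rules).
PATTERNS = [
    ("rsID", re.compile(r"^rs\d+$", re.IGNORECASE)),
    ("HGVSg", re.compile(r"^(NC_\d+\.\d+:)?g\.\S+$")),
    ("HGVSc", re.compile(r"^(NM_\d+\.\d+:)?c\.\S+$")),
    ("HGVSp", re.compile(r"^(NP_\d+\.\d+:)?p\.\S+$")),
    ("IVS", re.compile(r"^IVS\d+[+-]\d+\S*$", re.IGNORECASE)),
]

def guess_nomenclature(text):
    """Return the name of the first pattern that matches, else 'unknown'."""
    query = text.strip()
    for name, pattern in PATTERNS:
        if pattern.match(query):
            return name
    return "unknown"

for example in ["rs12345", "c.1799T>A", "p.V600E", "IVS2+1G>A"]:
    print(example, "->", guess_nomenclature(example))
```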
Which types of variants can be searched in Mastermind?
Mastermind can be used to search for coding variants including missense variants; insertion, deletion, and indel variants; nonsense variants; and frameshift variants. Mastermind will also search for non-coding variants affecting 5’- and 3’-untranslated regions (UTRs), splice donor/acceptor sites, splice regions, and introns as well as intergenic variants up- and down-stream of neighboring genes.
How are insertions and deletions, nonsense, frameshift, and non-coding variants displayed in Mastermind?
Mastermind uses shorthand identifiers to represent each type of variant in the protein space. The shorthands used by Mastermind are as follows:
- Insertions: “ins” — e.g. V600ins
- Deletions: “del” — e.g. V600del
- Indels: “delins” — e.g. V600delins
- Nonsense: “X” — e.g. V600X
- Frameshift: “fs” — e.g. V600fs
- Untranslated regions: “UTR” — e.g. 5’UTR or 3’UTR. Some genes also contain introns within the untranslated regions, so any of the splice and intron categories below may be appended to the UTR categories for untranslated regions that contain introns. For example, a splice donor “sd” variant that occurs in an intron inside the 5’UTR will appear as 5’UTRsd in Mastermind.
- Splice donor: “sd” — e.g. V168sd; these are variants affecting the 2-base region at the 5′ side of the intron. In the protein space, these are mapped to the nearest amino acid in the nearest coding neighbor at the 5′ side of the intron.
- Splice acceptor: “sa” — e.g. N581sa; these are variants affecting the 2-base region at the 3′ side of the intron. In the protein space, these are mapped to the nearest amino acid in the nearest coding neighbor at the 3′ side of the intron.
- Intronic: “int” — e.g. E46int; these are variants affecting any of the bases within the intron between the splice acceptor and splice donor sites. In the protein space, these are mapped to the nearest amino acid in the nearest coding neighbor.
- Intronic donor and acceptor sides: “intd” and “inta” — e.g. N581intd or N581inta; these are variants that occur in either the donor half or the acceptor half of the intronic “int” variant region between the splice donor and splice acceptor sites. These are more specific sub-divisions of the “int” category, so variants in either the “intd” or “inta” category will appear in the “int” category as well.
- Splice regions: “srd” and “sra” — e.g. N581srd or N581sra; these are variants surrounding the splice sites, from 1 to 3 bases into the exon and from 3 to 8 bases into the intron. The intronic part of the splice regions overlaps the intronic “int” classification, so splice region variants within the intron will also appear in the “int” category and in either the “intd” or “inta” category.
- Upstream genetic variant: “ugv” — only ugv; these are variants affecting the region of 5,000 bases upstream of the 5′ side of the gene.
- Downstream genetic variant: “dgv” — only dgv; these are variants affecting the region of 5,000 bases downstream of the 3′ side of the gene.
- Extensions: “ext” — e.g. A55ext; these are variants which span multiple exons within the gene.
The “Filter by” input in the “Variants” section of the Mastermind results page can be used to find such variants. For example, typing “ins” or “del” will filter the results for all insertions and deletions cited in that gene. Other shorthands can be used in the same way.
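The shorthand scheme above can be sketched as a small parser. This is an illustration under stated assumptions: the `categories` and `filter_by` helpers are hypothetical names, the label pattern is a simplification, and matching on whole categories is a choice made for this example. The actual “Filter by” input may behave differently, for instance by matching text.

```python
import re

# Shorthand suffixes from the list above.
KNOWN = {"ins", "del", "delins", "X", "fs", "sd", "sa",
         "int", "intd", "inta", "srd", "sra", "ext"}

# Per the list above, intd/inta variants also appear under "int".
ALSO_LISTED_UNDER = {"intd": {"int"}, "inta": {"int"}}

# Simplified label shape (an assumption): a residue such as V600,
# or a UTR prefix, followed by an optional alphabetic suffix.
LABEL = re.compile(r"^(?P<anchor>[A-Z]\d+|[35]['’]UTR)(?P<suffix>[A-Za-z]*)$")

def categories(label):
    """Return the set of shorthand categories a protein-space label falls in."""
    if label in ("ugv", "dgv"):
        return {label}
    match = LABEL.match(label)
    if not match:
        return set()
    suffix = match.group("suffix")
    if suffix and suffix not in KNOWN:
        return set()  # e.g. V600E is a substitution, not a shorthand category
    found = set()
    if "UTR" in match.group("anchor"):
        found.add("UTR")
    if suffix:
        found.add(suffix)
        found |= ALSO_LISTED_UNDER.get(suffix, set())
    return found

def filter_by(labels, shorthand):
    """Keep the labels that belong to the given shorthand category."""
    return [label for label in labels if shorthand in categories(label)]

labels = ["V600E", "V600ins", "V600del", "V600delins",
          "N581intd", "E46int", "5'UTRsd", "ugv"]
print(filter_by(labels, "int"))
print(filter_by(labels, "del"))
```

Matching on whole categories keeps “del” from also returning “delins” labels, which a plain substring match would include.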
Can we search variants on genomic positions in Mastermind?
Yes. Mastermind supports GRCh37 and GRCh38. To search for genomic coordinates in Mastermind, you can modify the URL directly or enter them in the search field in Mastermind, including the appropriate sequence identifier as in the link below.
Where (chr) can be taken from the list below and substituted into the URL:
For example, a search on chromosome 1 would look like the following:
This can also be typed into the search field (with no spaces): NC_000001.10:g.94508323G>A
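A search string of this shape can be assembled programmatically. The sketch below is a minimal illustration: `genomic_query` and the `ACCESSIONS` table are hypothetical names, and the table holds only the chromosome 1 GRCh37 identifier that appears in the example above. Identifiers for other chromosomes or for GRCh38 would need to be looked up and added.

```python
# Sequence identifiers keyed by (genome build, chromosome).
# Only the entry shown in the example above is included; others are
# deliberately left out rather than guessed.
ACCESSIONS = {("GRCh37", "1"): "NC_000001.10"}

def genomic_query(build, chrom, position, ref, alt):
    """Build a genomic substitution search string with no spaces."""
    key = (build, chrom)
    if key not in ACCESSIONS:
        raise KeyError(f"no sequence identifier on file for {build} chr{chrom}")
    if not (isinstance(position, int) and position > 0):
        raise ValueError("position must be a positive integer")
    for base in (ref, alt):
        if base not in ("A", "C", "G", "T"):
            raise ValueError("ref and alt must each be one of A, C, G, T")
    return f"{ACCESSIONS[key]}:g.{position}{ref}>{alt}"

query = genomic_query("GRCh37", "1", 94508323, "G", "A")
print(query)
```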
What is Relevance and how is it used to prioritize article results?
The Relevance of an article is a measure of both the clinical relevance of the article and its contextual relevance to your search query. Clinical relevance is informed by the content of the article, as well as the journal’s impact score, recency of the publication, and other metadata about the article. Contextual relevance covers factors such as how frequently the selected search terms are mentioned in the text of the article, how close together they appear, and where they appear in the article. Relevance is intended to be a relative and not an absolute estimation of the relevance of the content to your search queries.
This ranking is depicted in the impact plot: the size of each circle represents the relevance of the article to the selected key terms, so the larger the circle, the greater the relevance. By default, Mastermind will order the publications by their relevance in the Articles List.
What is the impact factor/impact plot and how can I use it to qualify or guide my results?
The impact factor (IF) of an academic journal is a measure of the average number of citations for articles published in that journal. It is frequently used as an estimate of the relative importance of a journal within its field. The impact factor plot shows each article by publication date along the x-axis and impact factor along the y-axis, with the relevance of the article to your search represented by the size of the bubble.
What are the Categories in Mastermind and how can they be used?
Categories are part of Mastermind Professional Edition. The default keyword categories in Mastermind Professional include ACMG Interpretation, Clinical Significance, Genetic Mechanism, and Significant Terms in Abstract. Each of these categories allows the user to display only those articles whose content is relevant to that category, based on the appearance of any of the category’s key terms. The ACMG Interpretation category contains a number of subcategories, such as Functional Studies and Segregation Information, that facilitate identification of articles useful for ACMG/AMP interpretation. The Significant Terms in Abstract category identifies keywords that are specifically associated with the content for the disease and gene in the original search. Mastermind produces this list of custom key terms by aggregating the content of each of these articles, performing a word frequency calculation, normalizing this list against the rest of scientific literature, and then ordering the terms by their frequency of occurrence in the content of interest. As an example, for the disease-gene association Melanoma-BRAF, this category includes therapies such as Vemurafenib and Imatinib, other genes such as GNAQ, and ancillary disease terms like uveal.
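The key-term steps described above (aggregate, count, normalize against background, rank) can be sketched in a few lines. This is a toy illustration, not Mastermind's algorithm: the tokenizer, the add-one smoothing, the ratio used for ranking, and the example sentences are all assumptions made for the sketch.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9-]+", text.lower())

def significant_terms(target_docs, background_docs, top_n=5):
    """Rank words by how over-represented they are in the target documents."""
    target = Counter(w for doc in target_docs for w in tokenize(doc))
    background = Counter(w for doc in background_docs for w in tokenize(doc))
    target_total = sum(target.values())
    background_total = sum(background.values())
    scores = {}
    for word, count in target.items():
        target_rate = count / target_total
        # Add-one smoothing so words absent from the background
        # do not cause a division by zero.
        background_rate = (background[word] + 1) / (background_total + 1)
        scores[word] = target_rate / background_rate
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]

# Invented example text, standing in for abstracts.
target_docs = ["vemurafenib response in braf melanoma",
               "braf mutation and vemurafenib"]
background_docs = ["the mutation in the gene",
                   "response in the disease and the gene"]
top_terms = significant_terms(target_docs, background_docs, top_n=3)
print(top_terms)
```

Dividing by the background rate is what pushes common words such as “in” down the list while terms specific to the search rise to the top.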
Does Mastermind include population frequency data or computational model predictions of pathogenicity?
No. However, Mastermind has a partnership with Saphetor to provide link-outs to VarSome, which aggregates much of this publicly available data. Additionally, Mastermind links out to Google Scholar for additional information from patents and conference abstracts. We also produce custom curations called “Mastermind Genomic Landscapes” that include this material. Inquire at firstname.lastname@example.org for more information.
Does Genomenon provide data analytics services using Mastermind?
Yes. Genomenon offers gene panel design and variant database assembly, in addition to custom services in clinical NGS experimental design and drug discovery or repurposing projects.
What browsers are currently supported by Mastermind?
Google Chrome is the preferred browser. For instance, to view the articles as PDFs in Mastermind, you will need to use Google Chrome and have the Mastermind extension installed from the Google Chrome Webstore:
If you do not have Google Chrome installed, you can download it from https://www.google.com/chrome/ and follow the download instructions for your computer platform.
Mastermind is also accessible in Firefox, Safari, and Internet Explorer 11, though some features may be limited.
Is Mastermind available through API access?
Yes. There is a Mastermind API to access some data programmatically. You will need a separate license key to use the Mastermind API. To learn more visit: https://mastermind.genomenon.com/api
In addition to the published API, custom APIs are available upon request.
Can Mastermind be used to identify mutational hotspots?
Yes. Mastermind is ideal for identifying evidence of hotspots in genes. The Variant Diagram plot is a visual representation of the variants discovered for the gene in question, as well as the relative number of associated articles. Using this feature, you can determine at a glance which regions are highly variable. This plot updates whenever your search terms change, or you can select a variant by clicking on its bar in the Diagram.
What is Disease-Specific Curated Content?
Disease-Specific Curated Content is a comprehensive and expertly curated set of variants for a gene. The variant data includes a summary of the published evidence, ACMG-based criteria applied to the evidence, and an ACMG-based provisional pathogenicity classification based on the evidence. The curated content also includes population data, in silico prediction models, and data intrinsic to the gene.
What part of the content is human curated?
Variant analysts review each article for evidence, ensure that the variant nomenclature is accurate and based on the canonical transcript, apply ACMG interpretation criteria to the summarized evidence, and perform a quality review of the data prior to the curated content being available in Mastermind.
How do I access curated content?
For variants with curated content, this content can be accessed by searching for a gene and variant. A provisional classification will be displayed for the variant and the curated content is available to be viewed by clicking “View Interpretation.”
Do I have to define a disease to view curated content?
No. You do not have to define a disease to view the curated content. In order to view the content, only a gene and variant need to be selected.
Which variants have curated content?
Currently, Disease-Specific Curated Content is available for published variants in ATP7B (associated with Wilson disease).
Does Mastermind have curated content for every variant for a curated gene?
Curated content is available for variants in the canonical transcript of a curated gene. For variants that are unpublished or for which the published information provides no useful information for classification, curated content may not be available.
Where does your population data come from?
Population data for curated variants is based on data from gnomAD.
How does Mastermind compare to HGMD for variant searching?
Favorably. Based on customer feedback, Mastermind is superior to HGMD in several respects, briefly described below. Mastermind is a more comprehensive database with a better user interface and more ready access to the evidence required to make informed variant interpretations.
- Gene Coverage – Mastermind identifies variants in any gene in the human genome, whereas HGMD is limited to around 11,000 genes.
- Disease Coverage – Mastermind identifies variants for somatically acquired diseases in addition to germline variants, whereas HGMD focuses on germline variants.
- Article Coverage – Mastermind has indexed more than 7.5 million genetic and disease-related full-text articles, whereas HGMD has indexed fewer than 100,000. Mastermind has also indexed all 32 million+ titles and abstracts from PubMed.
- Supplemental Material Coverage – Mastermind has indexed more than 1.2 million supplemental datasets, whereas HGMD has supplemental data for fewer than 29,000 articles.
- Update Frequency – Mastermind is updated every week with new data, whereas the free version of HGMD is updated every 3 months.
- Evidence – Mastermind shows the context of gene and variant mentions, whereas HGMD solely provides PMID numbers.
- Interpretations – Mastermind presents users with the data required to quickly reach their own conclusions, whereas HGMD provides their own assertions which do not conform to and are not useful for ACMG-style interpretations.
- See a side-by-side comparison.
How does Mastermind compare to ClinVar for variant searching?
Favorably. ClinVar has limited reach into the medical literature, and does not yet offer a comprehensive landscape of genes or variants, as the database is dependent on community engagement.
Does Mastermind provide variant interpretations or reports?
No. In contrast to ClinVar and HGMD, Mastermind does not draw its own conclusions about the clinical significance of individual variants, but rather provides the user with all the evidence necessary to make these conclusions on their own. Mastermind is therefore more properly considered a decision support tool. We do produce custom curations called “Mastermind Genomic Landscapes” that include provisional interpretations and pathogenicity calls. Inquire at email@example.com for more information.
Does Mastermind differentiate between positive and negative associations for diseases and genes or diseases and variants?
No. Mastermind searches for all mentions of a disease, gene, or variant, but does not draw conclusions about the nature of the association between the disease and variant or gene. We leave that to the experienced curators.
How does Mastermind compare to PubMed for variant searching?
Favorably. PubMed searches are restricted to exact variant matches, and the search results are restricted to variants mentioned in the titles or abstracts of articles. PubMed also does not distinguish between like variant names across different genes.
In addition to indexing every PubMed title and abstract, Mastermind indexes the entire full text of many millions of articles and the supplemental material for tens of thousands. By doing so, it is able to recognize gene synonyms and multiple conventional and unconventional variant nomenclatures to ensure maximal search sensitivity while utilizing context within the evidence to determine appropriate gene associations for optimal specificity. Learn more about Mastermind’s unique approach to maximizing sensitivity and specificity here:
How does Mastermind compare to Google Scholar for variant searching?
Favorably. Google Scholar variant searches are restricted to exact variant matches, and the search results are not prioritized according to clinical significance. Moreover, since Google Scholar has limited context awareness for mapping variants back to the correct genes, its results include many false positives.
In contrast, Mastermind prioritizes the search results for specific articles based on the clinical significance of each reference. Moreover, Mastermind is aware of all the different ways authors can describe genes or variants and indexes the articles using this information so users do not need to configure these searches on their own; users can use any single nomenclature they prefer and be assured of the sensitivity of their search results. Finally, Mastermind displays the result match in the context of the rest of the article content allowing users of Mastermind to very quickly determine the accuracy of the result.
You can read the results of our initial benchmark in our blog here:
Can I view full-text articles in Mastermind?
Yes, as long as the article is open access or you have licensed access to the text. If you do not have licensed access, Mastermind will display the match data from the full text and will redirect you to the publisher’s website if you’d like to purchase the article or subscribe to journal access.
How should I cite Mastermind in my paper?
When referring to the use of Mastermind within a sentence, please use the following text: “Mastermind Genomic Search Engine (https://www.genomenon.com/mastermind)”