Big Data for Rare Disease Pharma:
Comprehensive Genomic Landscapes to Accelerate Drug Development

Rare disease is anything but rare. There are over 7,000 conditions in this broad category that collectively impact the lives of millions of patients and their family members. Because of their rarity, it’s essential for pharma to have comprehensive access to the relatively little genetic research available.

A clearer and more complete understanding of the genetic cause of disease has profound implications for drug development. In fact, drug targets with human genetic evidence of disease association are twice as likely to lead to regulatory approval.

Understanding the genetic drivers of rare disease accelerates drug development at each stage of the process by:

  • Informing downstream research and discovery,
  • Guiding biomarker selection for clinical trial selection and segregation criteria, and
  • Providing documented evidence for CDx validation.

There is a proven process to assemble this essential genomic insight for rare diseases: Mastermind Genomic Landscapes. In this webinar, Genomenon co-founder and CSO Mark Kiel, MD, PhD, will use real-world examples to demonstrate how Mastermind Genomic Landscapes have empowered pharmaceutical researchers and translational teams to understand rare diseases at the molecular level.

Topics Discussed:

  • How Mastermind identifies 2-10X the number of pathogenic variants for specific rare diseases compared to results in genetic databases such as ClinVar.
  • How one rare disease pharma organization realized a 55-fold increase in genomic biomarkers to qualify patients for clinical trials.
  • How to accelerate the identification of genomic biomarkers for Companion Diagnostic (CDx) development, and assemble the supporting clinical evidence from the scientific literature.


Can you speak to the approach to cohort analysis that you take to determine the molecular drivers of the diseases?

Mark: That’s a great question. Just to summarize, a cohort is a group of patients who are united by some phenomenon. In this case I’m presuming that means disease: all of these patients have the same disease. If each of those patients has had some sort of genomic profiling, the question is, how do we know which gene, variant, set of variants, or set of genes is driving the pathogenicity of that disease, and how do we use that information to inform drug development? The first approach that we take is to aggregate all of that data and look for recurrence. In a simple example, hairy cell leukemia, if you were to assemble a cohort of those patients and sequence at the exome level, what you would see consistently across, I think, 99% of those hairy cell patients is a recurrent BRAF V600E mutation. That happens once in a century: you very rarely have a singular variant that is highly penetrant across all patients with a given disease. What happens more regularly is that the same gene is recurrently mutated, or the same pathway is recurrently mutated, or, separately, there’s a biological effect brought about by mutations or alterations in multiple different genes and pathways that converge on that effect.
Those are the different ways that you can derive meaning from a cohort analysis. Genomenon’s approach, after uniting that data and assessing the relationships I just talked about, is to orthogonally take the data in Mastermind and annotate the cohort data in those different categories with the pre-existing evidence. That allows you to prioritize which of those types of value you might see from the cohort, and then to test every putative hypothesis about this gene, this pathway, or this biological phenomenon actually contributing to disease in that cohort, all supported by the evidence from the Genomic Associations in the Mastermind database. We have a very salient, reproducible approach. I hesitate to say it’s hypothesis neutral, because it actually asks and answers every single possible hypothesis, and does so with the ballast of the evidence from the Mastermind database. If the asker of that question would like to talk in more detail about their specific challenge, I’d be happy to go into more depth.
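The recurrence idea in this answer can be illustrated with a minimal Python sketch. It is not Genomenon's or Mastermind's implementation: the toy cohort, the `PATHWAYS` mapping, and the `recurrence` helper are all hypothetical and exist only to show how the same calls can be counted at the variant, gene, and pathway level.

```python
from collections import Counter

# Hypothetical toy cohort: patient id -> list of (gene, protein change) calls.
cohort = {
    "P1": [("BRAF", "V600E"), ("GENE_X", "R10H")],
    "P2": [("BRAF", "V600E")],
    "P3": [("BRAF", "V600E"), ("GENE_Y", "Q24*")],
    "P4": [("MAP2K1", "K57N")],
}

# Hypothetical gene -> pathway lookup, invented for this illustration.
PATHWAYS = {"BRAF": "MAPK", "MAP2K1": "MAPK"}


def recurrence(cohort, key):
    """Fraction of patients carrying each feature, counted once per patient."""
    counts = Counter()
    for calls in cohort.values():
        # A set ensures a patient with two hits in one gene is counted once.
        counts.update({key(gene, change) for gene, change in calls})
    total = len(cohort)
    return {feature: n / total for feature, n in counts.items()}


variant_level = recurrence(cohort, lambda gene, change: (gene, change))
gene_level = recurrence(cohort, lambda gene, change: gene)
pathway_level = recurrence(cohort, lambda gene, change: PATHWAYS.get(gene, "unmapped"))
```

In this toy data no single variant or gene is present in every patient, but the pathway-level view is: patient P4 carries no BRAF variant yet still falls into the same hypothetical pathway, which is the kind of convergence the answer describes.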

What are some differences between analysis of constitutional diseases as compared to oncologic diseases when assembling genomic landscapes?

Mark: The most obvious difference would be the interpretation framework, although there’s a movement to unite them. I think the two frameworks are now talked about in the same breath as the ACMG/AMP framework, but they do differ in clinical practice.

ACMG is the American College of Medical Genetics and Genomics, and its guidelines govern interpretation of variants in the constitutional realm, including hereditary cancer. On the somatic cancer side, the Association for Molecular Pathology, or AMP, guidelines dictate how you interpret the variants. On the ACMG side, the data that is assessed comes in two flavors: internal to the case or the variant, and external to the case and the variant, which is where Genomenon brings in population frequency databases, in silico predictive models of pathogenicity, and all of the evidence from the publications. We take all of that information together and promulgate a provisional call for each of the variants that we’ve exhaustively identified by reviewing the data that’s indexed in Mastermind. So that’s ACMG. With AMP, the approach is a little simpler, but you have tiers: is this in the standard accepted guidelines, like NCCN or FDA? Otherwise, you look at whether it’s routine clinical practice that this variant is diagnostic and associated with a therapy. If that information doesn’t exist, there’s the assessment of information from other clinical studies, or functional corroboration that this variant is disease-causing, and that information leads to a tier 3 or tier 4. There’s just a different framework for assessing those two different types of diseases, but Genomenon is expert in both and can deliver a comprehensive Genomic Landscape for each. That’s the operational difference, from a content perspective.
I’ll say that in constitutional disease there’s usually a more idiosyncratic disease causation and surrounding clinical parameters that need to be assessed, whereas in oncology there’s more of a stereotypy: what you’re looking for is usually much more uniform, even though the cancer may be different. There’s less of an idiosyncratic or custom approach to assembling this data, though we can produce, and have produced, more custom annotations to the comprehensive Genomic Landscape for those somatic cancers.

The last thing I’ll say about how somatic cancer differs is that obviously cancer is economical and reuses mechanisms. When you’re putting together a data set with a specific disease indication, it’s often beneficial to take a step back and look at other diseases that also result at the hands of the gene that you’re specifically targeting, even if it’s not for your primary indication. As I said, given that it’s a comprehensive Genomic Landscape, we include all of those additional diseases that are peripheral to, but still informative for your core disease indication.

How are variants without empirical evidence incorporated into Genomic Landscapes?

Mark: If there’s no empirical evidence, they’re still included by virtue of the fact that they have been identified in large population studies, say, if they’re benign. Or consider a variant identified in a GWAS: it lacks proper empirical studies or evidence, but is nevertheless a variant that has information associated with it. Both of those types of variants are included.

We are also able to go out to locus-specific databases where users or clinicians have submitted variants that they’ve identified for which there’s no proper peer-reviewed study. Our approach in assembling a comprehensive Genomic Landscape is to bring in and aggregate all of those lines of evidence and then, as I mentioned throughout, decorate them with the empirical studies to strengthen those associations and provide ballast for, and confidence in, your decisions moving forward.

Another note about variants that you identify in a clinical workflow for which there’s no specific evidence about that selfsame variant: there’s benefit in assembling these comprehensive landscapes, because if related variants (variants related to the one for which there’s no empirical evidence) do in fact have empirical evidence, that evidence can support promoting this variant from uncertain significance to likely pathogenic. That kind of information can only be afforded by putting together a comprehensive Genomic Landscape, where the sheer burden and comprehensiveness of the evidence can promote those variants for which there’s otherwise no empirical evidence. That’s how I would answer that question, across those two different approaches.
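As a rough illustration of how evidence from related variants might be surfaced, here is a small Python sketch. It assumes, purely for illustration, that "related" means a different protein change at the same amino acid position in the same gene. The speaker does not define the term, and the `EVIDENCE` table, gene name, and helper functions are hypothetical. The sketch only flags candidates for expert review; it does not classify anything.

```python
import re

# Hypothetical evidence table: (gene, protein change) -> count of supporting publications.
EVIDENCE = {
    ("GENE1", "R100W"): 12,
    ("GENE1", "R100Q"): 3,
    ("GENE1", "G250S"): 0,
}


def residue(protein_change):
    """Return the amino acid position from a short-form change such as 'R100W'."""
    match = re.fullmatch(r"[A-Z*](\d+)[A-Z*]", protein_change)
    return int(match.group(1)) if match else None


def related_with_evidence(gene, protein_change, evidence=EVIDENCE):
    """List other changes at the same residue of the same gene that have evidence."""
    position = residue(protein_change)
    if position is None:
        return []
    return sorted(
        change
        for (other_gene, change), n_pubs in evidence.items()
        if other_gene == gene
        and change != protein_change
        and residue(change) == position
        and n_pubs > 0
    )


# A variant with no literature of its own, but with evidenced neighbors at residue 100.
candidates = related_with_evidence("GENE1", "R100L")
# A variant whose only same-residue neighbor has no evidence either.
no_support = related_with_evidence("GENE1", "G250D")
```

Whether evidence of this kind actually justifies reclassifying a variant of uncertain significance depends on the interpretation framework and on expert review.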

Then how can those landscapes help to better identify rare disease patients?

Mark: That’s a great question! Actually, I put in that last slide about Genomenon’s core business and how it bifurcates at the last minute, and this question allows me to go into a little more detail. In the hands of a clinician, whether a pathologist or a geneticist, or not even a clinician but a variant scientist, somebody looking at the genetics of a patient, there’s a need to have the most comprehensive information to maximize the diagnostic opportunity. If you don’t have a comprehensive landscape, you run the risk of this patient being missed even though the sequencing data contains an informative variant. If you don’t have a complete aggregation of all of this information, that selfsame variant, or the related variants that have evidence supporting it, would be missed by the clinician or the variant scientist looking at that information. All of the evidence that we assemble comes along with that variant when it is searched for in Mastermind’s genomic search engine, or otherwise through our partnerships with tertiary analytic software, where that information gets threaded through and fed forward and can inform the prioritization mechanism in those software packages to say: this is a variant that you can’t miss, here’s the evidence supporting why, and it’s associated with a rare disease, which by the way has an efficacious treatment or otherwise has clinical trials that the patient can be enrolled in.
That’s where we have this exponentiation of the value that we can bring to the genomics market writ large. Producing these comprehensive Genomic Landscapes for drug discovery and clinical trial design in pharma, bringing that comprehensive knowledge into the Mastermind search engine, and surfacing it to users, either direct users of Mastermind or through our partnerships with these different software providers, will ensure that those rare disease patients aren’t missed any more.

Can the Genomenon data be used with human microbiome information?

Mark: ‘Used with’ is pretty loaded and it’s pretty wide open, so I’ll answer it in a couple of different ways, but sticking to that wide open territory.

Referring back to the Genomic Association data that I mentioned before, Genomenon’s Genomic Landscapes needn’t be predicated on individual human variation. If the question, in the context of microbiome and infectious disease, was more about, ‘What are the species of infectious agent that are associated with different diseases (or different phenotypes)?’, that is an association that we can put together and have it be comprehensive. Our mastery of the empirical evidence allows us to ask and answer those types of questions. More specifically, there may be a question about the way that human genetics informs, or is informed by, different findings in metagenomics or the microbiome. When we produce the comprehensive Genomic Landscape, each of those variants comes to the fore as being associated with this disease or being found in this gene. If the question is, ‘Which of those variants have any empirical evidence to indicate a relationship between themselves and metagenomics?’, and that’s the specific custom annotation being asked for, that is definitely something that can be part of a Genomic Landscape deliverable. Whoever asked that question is welcome to arrange a follow-up with me; I can be reached at this email. If that answer was too vague, I welcome the opportunity to dig in deeper, talk specifically about your use case, and walk you through some of the examples that I was just lightly touching on in my answer.