Gaining Insight into Known and Novel Therapies to Drive Drug Discovery

Comprehensive Genomic Landscapes for Drug Development

A comprehensive view of published evidence linking genomics with known drug associations and outcomes is invaluable, but until now has been impossible to obtain without years of manual curation. Accessing genomic-drug associations allows researchers to uncover novel findings that inform drug discovery and design.

The addition of Therapeutic Associations to Mastermind Genomic Landscapes™ provides a comprehensive view of the therapeutic approaches associated with genes and variants found across the medical literature. These “Genomic Associations” enhance the ability of pharma and bio-pharma researchers to develop and test hypotheses about a drug’s mechanism of action or effectiveness in novel clinical circumstances.

In this webinar, Dr. Mark Kiel demonstrated how therapeutic associations between diseases, genes, variants, and phenotypes are a powerful tool in drug development. Dr. Kiel presented several use cases drawn from Genomenon’s own engagements with pharmaceutical clients and demonstrated how the therapeutic association data greatly increased the value of the data in helping to determine mechanism of action and clinical efficacy for the selected disease indications.

Topics discussed: 

  • How to find all previously tested therapeutic strategies and their outcomes for any indication
  • Methods for uncovering novel insight into mechanism of action evidence for precision medicine drug targeting
  • Ways to determine which off-label therapies have been associated with any gene or disease


Can you address polygenic disease and opportunities – for example type 2 diabetes (not MODY) and target drug therapy?

Mark: Sure! That’s a great point, because the examples that I highlighted here had either a monogenic disease causation or multiple genes causing disease, but each in a monogenic fashion. When we’re talking about proper polygenic disease, such as type 2 diabetes, the genes and their polymorphisms, or variants, cooperate with each other to lead, in aggregate, to a risk for developing the disease. The approach that we would take to produce the Genomic Landscapes differs, but only slightly: we would begin with an agnostic view of what genes may be associated with the disease. Instead of querying our database to set up the project by gene, we would set it up by disease, and so we would essentially ask of our very exhaustive database, what are all of the references that speak to the mechanism, the genetic cause, of diabetes? And then from that slice of our data, we would ask the question, which genes are recurrently mentioned in association with diabetes? Then even further, we would decorate each of those genes and prioritize them with the evidence from those references that indicated that they did lead to disease causation. That data decoration is disease-specific. As you might imagine, we would start with an awareness of the basic biology of the disease. We would also want to look into the clinical ways that the disease is assessed, as well as what types of functional assays from a research perspective have been used in the past to evince an association at the genetic level with disease causation.
To summarize: begin with the disease, produce an array of all of the genes that have previously been associated with the disease, annotate and prioritize the evidence based on what we understand the disease to be in its idiosyncrasies, and then look at all of that data in a sort of Pareto, or prioritized, approach to give you the most efficient insight into which genes or their associated variants are likely to be driving that disease. So it’s a little bit more complicated, and it’s far more complicated when you have to take a manual approach, but with Genomenon’s unique combination approach of using the automated capability that we have to assemble and annotate the data, and then using our infrastructure to maximize the efficiency of the manual review, that kind of work is something that could be carried out over the course of, say, several weeks.
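The disease-first workflow described in this answer can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not Genomenon's implementation: the reference records, the placeholder gene names (GENE_A and so on), the evidence labels, and the weights are all hypothetical and stand in for a disease-specific slice of the literature.

```python
from collections import defaultdict

# Hypothetical records standing in for references returned by a disease query.
# Each record lists the genes it mentions and the kinds of evidence it reports.
references = [
    {"id": "ref1", "genes": {"GENE_A", "GENE_B"}, "evidence": {"clinical"}},
    {"id": "ref2", "genes": {"GENE_A"}, "evidence": {"clinical", "functional"}},
    {"id": "ref3", "genes": {"GENE_C"}, "evidence": set()},
    {"id": "ref4", "genes": {"GENE_A", "GENE_C"}, "evidence": {"functional"}},
]

# Assumed weights; in practice these would reflect the biology of the disease.
EVIDENCE_WEIGHTS = {"clinical": 2.0, "functional": 1.5}


def prioritize_genes(refs, weights):
    """Count how often each gene recurs and rank genes by weighted evidence."""
    mentions = defaultdict(int)
    scores = defaultdict(float)
    for ref in refs:
        ref_weight = sum(weights.get(kind, 0.0) for kind in ref["evidence"])
        for gene in ref["genes"]:
            mentions[gene] += 1
            scores[gene] += ref_weight
    # Highest evidence score first, then most mentions, then name for stability.
    order = sorted(mentions, key=lambda g: (-scores[g], -mentions[g], g))
    return [(gene, mentions[gene], scores[gene]) for gene in order]


ranked = prioritize_genes(references, EVIDENCE_WEIGHTS)
for gene, n_refs, score in ranked:
    print(f"{gene}: {n_refs} reference(s), evidence score {score}")
```

Ranking by evidence score before raw mention count reflects the point made above: recurrence identifies candidate genes, but the supporting evidence decides their priority.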

Could I just use Mastermind to build my own landscape?

Mark: That’s a great question. You could, except that you would need to know what you were looking for beforehand or be able to avail yourself of the API, and there would still be a great deal of legwork that you’d need to do in interpreting all of that information. With infrastructure and software capabilities that we have internally and did not showcase, we’re able to produce those very large datasets and assess the meaningfulness of each of the data points that are assembled probably a hundred times faster than would be possible using the Mastermind software itself, or certainly at least an order of magnitude faster than if you were to build out capability with the API. Lauren’s goal in showcasing the software was to just give you a little bit of a flavor for the way that the data is indexed, both from the user interface for clinical purposes and from the API. Both of those tools are largely intended for clinical users who know what they’re looking for, either a specific compound or a specific question that they want to answer about a compound. The Genomic Landscapes that we can produce, by contrast, efficiently answer much broader questions, or even produce answers in a hypothesis-neutral setting, so I’ll commend whoever asked that question. Whoever had similar questions, I encourage you to visit the software, enter some queries, and see what types of results emerge. If you do have a large-scale, ambitious project that you’re thinking about, such as assembling a comprehensive Genomic Landscape, let’s have a conversation about that and I’ll show you some of the ways that we can really maximize the efficiency of that assembly process.

Can you speak to the approach to cohort analysis that was taken to determine the molecular drivers of the diseases?

Mark: Sure! It’s a bit of a memory jog, because I think some of those projects happened maybe five or six years ago. Those are the ones that I was at liberty to discuss. To describe Genomenon’s approach to cohort analysis, I’ll use the example of the JAK-STAT disease mechanism for causing T-cell prolymphocytic leukemia. When you’re ideally looking for a single genetic variant that causes the disease, what you’re actually more likely to find is some interrelatedness among the various variants that do cause disease. In that particular example, the relatedness was common to a genetic or protein pathway, the JAK-STAT pathway. So the approach that Genomenon takes in doing cohort analysis, at a high level, is to take all of the variants from among the patients comprising the cohort and cross-check the relatedness of every variant to every other variant in a sort of 2×2 matrix. What you find is that when there’s a relatedness across two of those variants, you can quantitate that relatedness. If they’re the self-same variant and it’s penetrant across multiple patients, that gets some higher prioritization, versus if it’s a neighboring variant in the same domain as another variant that’s seen in a couple of patients, versus common to a gene, versus common to a pathway, versus expressed in the same pattern or tissue or what have you. You can start to see tiers, or stages, or quanta of evidence supporting those associations. Genomenon’s approach is to ask and answer questions about the relatedness of pathways in a hypothesis-neutral way and then let the data tell you what pathways light up and show you the evidence for why that happens. The work that we did for the JAK-STAT data assembly took place over the course of a day or two, whereas if you didn’t have such a cogent data analysis strategy up front, you might spend a year or more looking at the data before a pattern emerged.
When you take that logical, systematized approach to having the data self-organize, and then you answer all those questions up front and prioritize the evidence according to those answers, the topmost prioritized pathway emerges by itself. That’s what happened in that JAK-STAT example and in other examples that, as I said, I’m not at liberty to discuss.
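The tiered, pairwise comparison described in this answer can be sketched in Python as follows. This is a minimal illustration, not Genomenon's method: the `Variant` fields, the tier scores (4 for the same variant down to 1 for a shared pathway), the placeholder names, and the simplifying assumption that each gene belongs to exactly one pathway are all hypothetical. The expression-pattern tier mentioned above is left out for brevity.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Variant(NamedTuple):
    patient: str
    gene: str
    change: str   # placeholder label for the specific variant
    domain: str
    pathway: str  # assumes one pathway per gene


def relatedness(a: Variant, b: Variant) -> int:
    """Return an assumed tier score for how closely two variants are related."""
    if a.gene == b.gene and a.change == b.change:
        return 4  # the self-same variant seen in different patients
    if a.gene == b.gene and a.domain == b.domain:
        return 3  # neighboring variants in the same domain
    if a.gene == b.gene:
        return 2  # same gene, different domain
    if a.pathway == b.pathway:
        return 1  # different genes in the same pathway
    return 0


def score_pathways(variants):
    """Sum pairwise relatedness across patients, grouped by pathway."""
    totals = defaultdict(int)
    for a, b in combinations(variants, 2):
        if a.patient == b.patient:
            continue  # only compare variants from different patients
        score = relatedness(a, b)
        if score:
            totals[a.pathway] += score  # a and b share a pathway when score > 0
    return dict(totals)


cohort = [
    Variant("P1", "GENE_A", "var1", "dom1", "PATHWAY_1"),
    Variant("P2", "GENE_A", "var1", "dom1", "PATHWAY_1"),
    Variant("P3", "GENE_A", "var2", "dom1", "PATHWAY_1"),
    Variant("P4", "GENE_B", "var3", "dom2", "PATHWAY_1"),
    Variant("P4", "GENE_C", "var4", "dom3", "PATHWAY_2"),
    Variant("P5", "GENE_D", "var5", "dom4", "PATHWAY_3"),
]

totals = score_pathways(cohort)
top_pathway = max(totals, key=totals.get)
print(top_pathway, totals[top_pathway])
```

No pathway is named in advance here; the pathway with the most cross-patient relatedness simply accumulates the highest total, which is the hypothesis-neutral behavior the answer describes.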

What are some differences between analysis of constitutional diseases, as compared to oncological when assembling the Genomic Landscapes?

Mark: I’ve already highlighted one of them. It’s sort of an operational difference, which is to say that they have different frameworks. They’re looking for the same types of evidence: clinical and functional. On the oncology side, there’s obviously a predicate for standard guidelines that already define which variants are disease-causing, things like the NCCN guidelines, and so that’s part of the AMP criteria and a place to start. But then when you start to look at all the rest of the data, looking for clinical information, looking for functional information, those are common to the AMP and ACMG guidelines, which actually are now sort of merged together and spoken of in the same breath. Probably the most salient difference is that in oncology, cancer is very economical and reuses mechanisms across multiple different disease types, as I alluded to with BRAF V600E. It’s almost always different in the context of constitutional disease, where there’s usually a one-to-one correspondence between a gene and the disease it causes. There’s a lot more idiosyncrasy associated with that disease and its genetic cause, and so you can’t use a lot of the paradigms in oncology that are common to multiple different gene types – you have to be thinking more specifically about the nature of that disease, and the nature of that specific gene, when you’re talking about a constitutional Genomic Landscape.

I hope I have convinced you in my discussion here that the Mastermind Genomic Search Engine and Genomenon can handle that readily given our computational approach and our ability to manually review that data highly efficiently.

How are variants without empirical evidence incorporated into genomic landscapes?

Mark: I’ll circle back to say that most of what I discussed when I talked about the evidence in the Genomic Landscape had to do with the empirical evidence in the medical literature, which is a necessary and challenging data substrate to put together, and which Mastermind has handled quite readily. We also incorporate population frequency data such as gnomAD, as well as in silico predictive models of pathogenicity such as PolyPhen-2, or SIFT, or a variety of others, so those are all included where appropriate in our Genomic Landscape. I think the question here is more specifically asking about the assessment of variants for which there’s no publication, such as variants that are identified in routine lab work in a patient who is seen to have the disease, where the lab determines it’s disease-causing and it’s submitted to some third-party database. Those data are also included in our effort when we produce these Comprehensive Landscapes to be maximally comprehensive. We’ve incorporated client-specific data for their eyes only, or, if they wish to disseminate it publicly, to be shared with the community at large. The same goes for any data from any locus-specific databases or any generic database of genetic variants – those are all included – they’re decorated with any of the empirical evidence that I highlighted throughout the presentation, but they’re also included if they have no such empirical evidence. Those are the different ways that we would address the comprehensiveness of the delivery of the data that we produce.
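One way to picture this is as a union over evidence sources, so that a variant appears in the landscape even when no publication mentions it. The Python sketch below is an illustration only: the source dictionaries, field names, variant labels, and values are hypothetical placeholders, not real gnomAD, PolyPhen-2, or SIFT output, and not Genomenon's data model.

```python
def build_landscape(literature, population_freq, in_silico, submissions):
    """Merge per-variant evidence, keeping variants that lack publications."""
    all_variants = (
        set(literature) | set(population_freq) | set(in_silico) | set(submissions)
    )
    return {
        variant: {
            "references": literature.get(variant, []),
            "population_frequency": population_freq.get(variant),
            "in_silico_prediction": in_silico.get(variant),
            "database_submissions": submissions.get(variant, []),
            # True only when at least one published reference exists.
            "has_empirical_evidence": bool(literature.get(variant)),
        }
        for variant in sorted(all_variants)
    }


# Hypothetical inputs keyed by a placeholder variant label.
literature = {"VAR_1": ["ref1", "ref2"]}
population_freq = {"VAR_1": 0.001, "VAR_2": 0.0}
in_silico = {"VAR_2": "damaging"}
submissions = {"VAR_3": ["lab_submission_1"]}

landscape = build_landscape(literature, population_freq, in_silico, submissions)
for variant, record in landscape.items():
    print(variant, record["has_empirical_evidence"])
```

Taking the union of keys rather than starting from the literature is the design choice that matters: VAR_3 has only a lab submission and still receives a record.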

How do your pharma clients focused on rare diseases use these comprehensive data in the Genomic Landscapes to better identify rare disease patients?

Mark: If we’re talking about rare disease, and I won’t talk at all about oncology, I’m inferring that what we’re talking about here is the challenge associated with identifying patients with a rare disease, who by definition are infrequent. That is a challenge for some of our Pharma clients, where there is a need to promote awareness of the disease and a need to get in contact, in some way, with the treating clinician to make them aware of this rare disease, which may not be well understood by the clinician, let alone any therapeutic approach for it. How do we connect those dots? The software that Lauren showcased, and I think you highlighted our user base now of 7,500 and then some clinical users, is integrated in more than a dozen next-generation sequencing platforms for clinical use, so Mastermind has a very broad and deep reach into the genomics diagnostic community, and some of our Pharma clients have engaged us in taking the data that we curate and surfacing it to the diagnosing and treating clinicians, geneticists, pathologists, oncologists, etc. As far as I know, this is a unique offering, where one company has connections and deep roots in the clinical space as well as deep connections and value that can be provided in pharma. What we’re increasingly seeing is a need for, and a benefit associated with, connecting those two offerings, and users of Mastermind in the clinic will see some of those features emerge in the coming months to address that very question that was asked here.