Watch an expert panel discuss how they used the new Mastermind Genomic Search Engine to mine the full text of the genomic literature for two key applications: variant interpretation and the development of evidence-based diagnostic gene panels.
Mark Kiel, Founder and Chief Scientific Officer of Genomenon, gives an overview of the Mastermind Genomic Search Engine and discusses a comprehensive, evidence-based cancer panel that was produced using automated machine learning techniques. The pan-hematopoietic cancer panel is a comprehensive cancer panel of more than 300 genes supported by specific literature citations from among millions of research publications.
Nikoletta Sidiropoulos and David Seward from the University of Vermont College of Medicine demonstrate their approach and the tools used to quickly and thoroughly mine the scientific literature to interpret variants in somatic cancer cases.
Victor Weigman from Q2 Solutions presents an evidence-based method that his team used to select the content for gene panels by mining millions of full-text genomic articles to identify disease-gene-variant associations. Dr. Weigman explains how he created an evidence-based gene panel in under a week with prioritized literature citations for each biomarker.
Good afternoon everyone, I’m Ed Winnick, Editor-in-Chief at GenomeWeb, and I’ll be your moderator today for this GenomeWeb webinar. The title of today’s webinar is Mining Genomic Literature for Variant Interpretation and Gene Panel Design, and it is sponsored by Genomenon. Our panelists are Mark Kiel, founder and CSO at Genomenon; Nikoletta Sidiropoulos, Medical Director of Genomic Medicine in the Department of Pathology and Laboratory Medicine at the University of Vermont Health Network; David Seward, assistant professor of Pathology and Laboratory Medicine, College of Medicine at the University of Vermont; and Victor Weigman, director of translational genomics at Q2 Solutions, a Quintiles-Quest joint venture. You may type in a question at any time during the webinar. You can do this through the control panel, which usually appears on the right side of your screen; click on the Q&A box on the upper right side of the control panel, and when you click on send, please select "all panelists." We will ask the panelists questions after the presentations have concluded, and with that I’ll let Mark begin with an intro to Mastermind.
Thank you, Ed, and welcome and thanks to the attendees. What I’m going to discuss today is what Genomenon has considered to be the other human genome project, which is to say the challenge of organizing genetic knowledge out of the medical literature. The purpose of my portion of this webinar is to provide a brief overview of Mastermind, our genomic search engine, for those of you who are not familiar with the tool yet, as well as to showcase how many of our users have incorporated Mastermind into their workflows.
So this is a diagram that’s undoubtedly familiar to most of you in attendance; it’s our conception of genomic diagnosis in both clinical and research workflows, going from sample to report or insight. The way that we’ve broken down this process is into a primary phase of analysis, which involves a physical process of converting DNA into information, and then a secondary analysis process, which is predominantly digital; both of these have been reduced to practice. They culminate in a tertiary analytic component, which involves interpretation. This challenge requires manual curation of genetic variants and is currently the analytic bottleneck in genomic diagnosis.
So this is the reason for that bottleneck: genetic information retrieval from the literature is made difficult by the sheer quantity of sources available, in the form of individual publications across many thousands of different publishers, as well as by the complexity and richness of the information that needs to be extracted and organized from these resources, that is to say, genetic and genomic information. Mastermind has addressed this challenge by indexing the genomic literature. We have scanned the twenty-nine million titles and abstracts that comprise the PubMed dataset, from which we’ve identified more than five million full-text articles that contain information relevant to genomics and genetics for complete indexing. From these full-text articles we have extracted and annotated disease-gene-variant associations across the entire spectrum of human disease, every gene, and every variant in the human genome.
The diseases include both cancer and constitutional diseases, and the variants include coding as well as non-coding variants, incorporating mentions of those variants at either the cDNA or the protein level. I’ll emphasize here that because this is a highly automated process, our algorithms for extracting and organizing this information continue to evolve to maximize the sensitivity and specificity of our results, and we provide increasingly sophisticated algorithms for computing and presenting the relevance of this information.
This is a screenshot of Mastermind, which is web-based software. Searching by disease is performed by users who are interested in seeing what genes or gene variants have been associated with that disease in the literature; conversely, searching by gene or gene variant lets users see what diseases those biomarkers have been associated with in the literature. The search can then be refined according to clinical relevance, depending on the circumstance the patient is experiencing. Several examples of this will be showcased by David Seward, as he performs them with his colleagues at UVM, including using Mastermind to perform multi-parameter searches. This is one example of a high-level search at the gene level, in this case DNMT3A, where Mastermind is returning all of the diseases that this gene has been associated with, along with the number of articles that mention each association. This is meant to exemplify the fact that Mastermind has a comprehensive understanding of all biomarker associations across the whole spectrum of human disease.
The data that I’m exemplifying here is useful for database assembly, as well as for downstream machine learning activities that we provide to a multitude of our clients. What I’m showcasing here is a typical association page that results from a more focused Mastermind search, where on the right is the article landscape for the results of that search, on the left is the variant landscape that results from that focused search, and at the top are the ways to refine that search to maximize the relevancy of the user’s results. In subsequent slides I’ll break out a couple of the feature sets in Mastermind. This set of features showcases the survey of the literature landscape that Mastermind provides to its users.
On the left you’ll see what we’ve referred to as an article impact factor plot, where each of those circular icons is a single reference that results from a user search for disease and/or gene and variant. On the x-axis is the publication date of each of those references. On the y-axis is the citation factor, or impact, of each of those journals, and the size of the icon for each reference suggests the relevance of the content based on the user’s search or search refinements. Over on the right is that information in list form, with each reference called out at the top by bibliographic information, including detailed information about the frequency of matches to a given search, in this case the gene search. Below that, to provide context for the relevance that Mastermind computes, we present the user’s search terms in the context of full-text sentence fragments where those terms were mentioned by the authors. This is useful for reference labs who are interpreting large gene panels or otherwise performing exome or genome sequencing interpretation activities.
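As a rough sketch of how the coordinates of such a plot could be derived from a list of search results, here is a small Python example. The record fields and values are illustrative assumptions for this sketch, not Mastermind's actual schema or API.

```python
from datetime import date

# Hypothetical article records; field names are assumptions, not Mastermind's schema.
articles = [
    {"pmid": "1", "pub_date": date(2014, 6, 1), "impact_factor": 5.2, "matches": 12},
    {"pmid": "2", "pub_date": date(2017, 1, 15), "impact_factor": 32.1, "matches": 3},
    {"pmid": "3", "pub_date": date(2010, 3, 9), "impact_factor": 2.4, "matches": 27},
]

def plot_coords(arts, base_size=20.0):
    """Map each article to (x, y, size): publication date on x, journal
    impact factor on y, and a marker size proportional to how often the
    search terms matched in the article."""
    max_matches = max(a["matches"] for a in arts)
    return [
        (a["pub_date"], a["impact_factor"], base_size * a["matches"] / max_matches)
        for a in arts
    ]

coords = plot_coords(articles)
```

The resulting tuples could be fed directly to any scatter-plot library, which is essentially what the interface renders.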
This next feature set showcases the gene variant landscape that Mastermind provides its users. In this example, for a particular gene search, on the left is an array of all of the mutations in this particular gene as they populate the linear axis of the protein, along the x-axis, juxtaposed with the functional domains of the protein, where each of these bars represents an individual variant allele; on the y-axis, on a log scale, is the number of references that mention that variant in the literature. That information is then provided on the right in list form that can be filtered, searched, or sorted.
Mastermind also shows whether mentions of any particular variant come from the full text or from a high-level presentation of the information in the titles and abstracts, and this usually reflects a 10- to 15-fold differential in favor of information being presented in the full text as compared to the title and abstract. Information like this can be examined in the software interface itself, or, by working with Genomenon’s team, users can be provided with batch-consumable databases. One salient example is a recent use case for a pharmaceutical company in Phase III FDA approval for the development of a drug to combat a specific disease indication: we were tasked with producing a comprehensive database of all variants for a hundred and fifty genes relevant to that particular disease indication, to provide a database of companion diagnostics for this pharmaceutical company.
Other ways to automate Mastermind include examining the data through data feeds or APIs, or otherwise engaging Genomenon to use Mastermind’s services, including gene panel design services, an example use case that Victor Weigman from Q2 Solutions will discuss later, as Genomenon and his group produced a gene panel design for acute myeloid leukemia. To better reify what Mastermind is able to do in the way of informing gene panel design, I invite attendees who are interested to visit the website below for a download of a white paper describing the Mastermind process for panel design.
To put it simply, we started with a particular disease indication, in this example pan-hematopoietic cancer diseases including leukemias and lymphomas. We then turned our attention to Mastermind to extract, organize, and annotate information for more than 300 gene candidates, and then output in a database all of the evidence support from prioritized references comprising the totality of Mastermind’s database. Lastly, before I turn it over to Ed, I’ll encourage any attendees who have not already registered their interest in receiving free access to Mastermind to go to the website provided below, sign up, and begin using Mastermind immediately.
So with that I’ll turn it back over to Ed. Okay, thank you, Mark. As a reminder to our webinar participants, if you have a question please type it into the Q&A box in the control panel. We’d also like to ask attendees to take a moment after the webinar has ended to take our exit survey and give us your feedback. I will now turn it over to Nikoletta Sidiropoulos.
Hello, and thank you, Ed, for the introduction. As the medical director of the Genomic Medicine Program at the University of Vermont Health Network, I’d like to start by giving you an overview of what our program is, what our network looks like, and what our practice in delivering clinical genomic laboratory services looks like, so that the ways Genomenon and Mastermind were relevant to us can be generalized to your own practices accordingly.
So with that, my personal background: I am trained as a doctor of medicine, and I’m a board-certified pathologist across the subspecialties listed on the slide, but my practice is primarily as a molecular genetic pathologist, and I function as the Medical Director of Genomic Medicine at our institution. I’m an assistant professor of Pathology and Laboratory Medicine, and through my practice of genomics I also have practice as a physician informaticist. In terms of what our practice and our network look like in Vermont: we made a decision as a leadership group in 2013, at the time when we were thinking about genomics, to skip over classic molecular genetic pathology as it was practiced before 2013, targeting single genes with a more limited scope, because we weren’t doing that practice. In 2013, with the resources we had in hand, we decided to proceed directly to developing a genomic medicine laboratory, which is CLIA-licensed and CAP-accredited, and based on our resource evaluation and relatively low volume, we service the northern regions of New York State and Vermont.
The population we serve is a little over a million, and with a relatively low volume it made a whole lot of sense for us, in order to get increasingly more information from limited tissue, to move toward next-generation sequencing technology as our backbone, using that on the wet bench; we then partnered with PierianDx to use the Clinical Genomicist Workspace for our dry bench and report generation. As for our core philosophies in developing this program: we performed a thorough stakeholder evaluation and secured our resources, a business plan had been proposed and signed off, and we also integrated our philosophies into the departmental strategic plan.
Our vision at the University of Vermont Health Network in this department is to practice clinical genomic medicine with the aim of improving health and healthcare. By doing so, we obviously generate a lot of data and a lot of biospecimens, and so we are developing these banks of data and fueling genomic translational research; through our clinical practice we’re driving genomic value and implementation research. Underneath it all, we are also extremely dedicated to genomic education: literacy around how and when to use genomic-based testing and then how to apply the results in clinical practice. We made the decision, as a philosophy, to push ourselves to go beyond developing just the genomic test and to really think about developing genomic care pathways. The way I think about a genomic care pathway is that it involves strategic integration of the best genomic technology with the people and processes in the laboratory to realize the promise of precision medicine for each unique patient.
It is in our business plan to develop care pathways to service oncology, pharmacogenomics, and inherited disease. As for some of our implementation considerations: we took a lot of effort to engage with institutional leadership to make this a reality, and with the end users, especially the clinical end users in particular. Who are the people who are using our results, and are we servicing their needs? Obviously our aim is to give the best possible service to patients, and we also feel we have an educational role and a proactive role in engaging with payers of all different types to ensure reimbursement. We made the decision to heavily invest in physician-trained genomic medicine staff. We have four people who sign out our genomic assay reports: three are board-certified molecular genetic pathologists, and one is a medical geneticist with a background in laboratory medicine.
In developing our tests, and assay design in particular, the content has to be biologically and clinically relevant to serve the end users in their clinical practice as they deliver patient care. In addition, I mentioned being able to interact with payers and educate them: how was this test built, what is the content of this test, why were these targets chosen, what is the clinical utility of these targets, and why is this going to impact patient care? In doing so, I think our group has done our due diligence in terms of performing adequate database searching and medical literature searching to support the targets of the assay. You’ve heard that this is a service Genomenon is offering, and you’ll hear more about that later in the webinar, but it also became incredibly clear that we put a lot of effort and resources into the medical interpretation of the variants that come out of our tests.
For the solid tumor gene panel, which is the first test that we launched, the genes of which are listed on the screen, we target every exon across those genes, and as you can imagine, we are trying to perform medical interpretation on the results. With that breadth of coverage across those genes, you get a lot of variants where it’s not entirely clear how to do an adequate medical interpretation; it’s hard to find information on a lot of these. I think we lock down every process in the clinical lab to the best of our ability with operating procedures, and being a CAP-accredited lab, everything is locked down.
But what we realized, in talking to our colleagues nationally and internationally, is that the exercise of searching the medical literature to annotate the clinical relevance of variants is, I think it’s fair to say, not a process that’s locked down in the lab. Dr. Seward and I may search in slightly different ways, and sometimes we interface and say, “Geez, well, you found this article and I didn’t; how do we lock this down?” So we really felt that Genomenon offered a solution where we could take it into our clinical practice and effectively standardize the way we were searching the medical literature. We also felt that any pathologist, or anyone signing out a report, who is going to make a statement that a variant has not been described in the literature previously certainly wants to make sure that’s true.
There have been many instances where variants have come up and we think we haven’t found anything, but then we use the Mastermind tool and sure enough we find an article out there, and we feel like we can more adequately stand behind our reports and that we’ve done our due diligence using this tool. So that’s an important consideration for us, as is scalability: our volumes are increasing, the number of tests we’re offering is increasing, and so scalability at the medical interpretation stage is a real challenge, I think, for all of us in this particular field. With that I will hand over the presentation to my colleague Dr. David Seward. Thank you very much.
Alright, thanks Nikki, that was a great introduction. I also want to pass along my thanks to Mark and Ed for the introductions earlier. Like Nikki, my clinical work is focused on molecular genetic pathology, and I’m one of the people who signs out the assay that Nikki described here at UVM. My goals for today are to give everyone a flavor, on a step-by-step basis, for what using Mastermind in the context of a genomic sign-out is like. To accomplish this I’ve picked a couple of cases I’ve signed out in the last six months or so that I think will help illustrate the benefits of using this software and how it has directly impacted my clinical sign-out work. As for the structure I’m going to use: I’ll first go over a case at a high level and explain some of the variants that we detected, then articulate the questions that I asked and addressed specifically using the Mastermind software, and then walk through screenshots of the actual searches I did, to give you a sense of how this information is presented and where I think the utility exists as far as pointing to the appropriate medical literature, which I can use to support the final reports and my communications with our oncologists.
Both of these cases are non-small cell lung cancer, which currently represents the highest volume in our practice as far as the samples we receive to sequence. In this first particular case, one of the two interesting variants we found was a well-described activating KRAS variant, the G12C. I’m not going to spend any time on searching for this variant, because I’m sure anyone who has signed out NGS assays, or any of the classic molecular tests, probably has a canned comment for the KRAS variants at this point. What was interesting about this case is that in addition to the activating KRAS variant, we identified a variant in the CDKN2A locus, which encodes both p16 and p14. This particular variant is predicted to be a missense variant at the 83rd position, a histidine-to-tyrosine substitution (H83Y) in p16, and I had not seen this variant before, so I’m going to use it as a starting point for using the Mastermind software.
So the first question I had, quite simply, was: has this variant been published in the literature, and does it have a functional impact? As many of you are likely aware, there are lots of publicly available databases where you can determine whether a particular variant has been sequenced in a tumor before, but frequently those do not offer any functional interpretation; it’s simply whether the variant has been identified previously or not. So my question around this particular variant was specifically whether there is a functional impact that I should try to incorporate into my report and communicate to the oncologist.
The strategy that I used in this particular case, as a first step, was, as Mark described, to keep maximum sensitivity by searching across all diseases, and then I specifically searched for the CDKN2A gene locus and the H83Y variant. In addition, I added a filtering step, which I’ll document on subsequent slides, that helped narrow the focus of the search. In this first window is the Mastermind landing page; as you can see, I’ve entered the CDKN2A gene and the variant of interest. After that first search there were a series of publications identified, and I then used the filtering tool at the top of the page, in this case the mechanism-of-action filter, because p16 is a tumor suppressor and I was specifically interested in whether this variant would lead to a loss of function, which is the category of alteration we are interested in for tumor suppressor genes.
As you can see, when I selected the loss-of-function filter on the top left of these options, there were eight relevant publications. Those are indicated in this next slide: on the top right are the eight publications, and on the bottom right are the sentence fragments. The way that I use this in my practice is that when I get a small list like this, I’ll start at the top, highlight that paper, the sentence fragments will populate at the bottom, and I basically just quickly peruse through those papers, scanning the sentence fragments for verbiage that would indicate the answer to my question. In this case I’m looking for whether this H83Y alteration results in loss of function, and as I scrolled over the 2014 publication from Oncotarget, the sentence fragment populated and said, essentially, yes: this mutation leads to an inability of p16, or CDKN2A, to induce cell cycle arrest.
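The fragment-scanning step described here, skimming sentence fragments for loss-of-function verbiage, can be mimicked in code roughly as follows. The fragments and keyword list below are hypothetical examples for illustration, not Mastermind's internal logic.

```python
# Hypothetical sentence fragments returned for a CDKN2A H83Y search.
fragments = [
    "the H83Y mutant showed an inability to induce cell cycle arrest",
    "H83Y was detected in 2 of 40 tumors in this cohort",
]

# Illustrative keyword list; a real curation workflow would be broader.
LOF_KEYWORDS = ("loss of function", "inability", "abolish", "inactivat", "fails to")

def flag_lof_fragments(frags):
    """Return the fragments whose wording suggests loss of function."""
    return [f for f in frags if any(kw in f.lower() for kw in LOF_KEYWORDS)]

print(flag_lof_fragments(fragments))
```

A keyword screen like this only surfaces candidate fragments; as described in the case, the full-text passage still needs to be read to weigh the supporting evidence.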
At this point, the way I use this in my practice is that I will then download the full-text article and read the relevant passages where this fragment was obtained, so that I can understand the context and evaluate the evidence that the particular publication is using to support its finding. In this particular publication I found that the evidence was strong, and I incorporated that information into my report. But that’s really just step one. What I’ve now identified is that, yes, I have an activating KRAS mutation and, in addition, an inactivating CDKN2A mutation, so the obvious next step is: is this relevant to progression, is it relevant for patient care, will it affect therapy, will it affect prognosis? To address that, using the same Mastermind software, I took advantage of the advanced search criteria, where I can essentially search the relevant literature for only those publications that address concomitant or combined situations, where KRAS and CDKN2A variants are studied together.
In this particular scenario, showing again the main search page and then the advanced search options, I’ve included both the KRAS gene and the CDKN2A gene and then used the keyword “combined.” I want to make it clear at this point that there’s still a little bit of art involved when you’re doing these higher-level searches, so I tried a few things: I tried “combined,” I tried “concomitant,” and I found a few others, so sometimes you have to be a little bit creative. In this case, using “combined” then led to a master list, and as you may expect, given that KRAS and CDKN2A are quite common, there was a lot of literature available, but I kept my focus on the upper right-hand corner of the article plot, because this is the area where the citations are the most recent and the impact factor of the journals is high.
I realize I sound a little bit like an elitist when I say this, but I do pay attention to impact factor, because my assumption is that those publications are likely to have been vetted and reviewed to a high degree. I realize that’s not always the case, but it is the way I choose to attack these searches. In so doing, I identified a Cell paper from 2015 that specifically addressed CDKN2A deletions in the context of tumors having activating KRAS mutations, and what this paper showed was a real possibility that these particular tumors would be sensitive to checkpoint kinase inhibitors, specifically concomitant therapy with CHK1 inhibitors and MK2 inhibitors. So I incorporated that information into my final report and had a conversation with the oncologist about potential clinical trials for this patient.
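The "upper right quadrant" heuristic used throughout these cases, favoring recent papers in high-impact journals, can be expressed as a simple ranking function. The weighting below is an arbitrary assumption for illustration, not a Mastermind algorithm, and the article records are hypothetical.

```python
from datetime import date

def prioritize(articles, today=date(2018, 1, 1)):
    """Sort articles so that recent, high-impact-factor papers come first.
    Recency and impact factor are weighted equally here, purely for
    illustration; a real triage would tune or replace these weights."""
    def score(a):
        age_years = (today - a["pub_date"]).days / 365.25
        recency = max(0.0, 10.0 - age_years)  # newer papers score higher
        return recency + a["impact_factor"]
    return sorted(articles, key=score, reverse=True)

# Hypothetical search results.
hits = [
    {"title": "older, low-impact", "pub_date": date(2005, 1, 1), "impact_factor": 3.0},
    {"title": "recent, high-impact", "pub_date": date(2015, 6, 1), "impact_factor": 28.7},
]
print(prioritize(hits)[0]["title"])  # the recent, high-impact paper ranks first
```

The design choice here mirrors the reasoning in the talk: impact factor is a proxy for vetting, not a guarantee, so the ranking is a starting point for reading, not a filter that discards anything.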
To move on to a second example, and I might be running a little bit over, so I’ll try to move through this efficiently: the second case is also a non-small cell lung cancer case. To give you the setting, this is a patient who at their initial diagnosis, a couple of years before this, was found to have an exon 19 deletion that was known to be sensitive to EGFR inhibitors, so this patient had been on an EGFR-targeted inhibitor and then progressed. As you might imagine, when we sequenced the tumor, the original exon 19 deletion, the L747_T751 deletion, was still identified, but in addition they had acquired the common resistance mutation, T790M. So again, this is a case where I see this frequently enough in my practice that I know what I’m going to say about this particular combination. The interesting part about this case is that in addition to the T790M mutation in EGFR, this patient had a secondary mutation in beta-catenin (CTNNB1); this was an S37C missense mutation, and at the time this case came to my desk I hadn’t seen this mutation before. So I used the same approach as in the previous case to determine whether this particular mutation was in the literature and whether there was a functional association.
My strategy, again starting at the main page, was to maximize my sensitivity by searching across all diseases, then searching for the specific variant S37C in beta-catenin, and then filtering those results using keywords, and I’ll show you how I did that. So again, from the landing page, searching beta-catenin S37C, a series of papers was found, but then I moved into the function filter and selected “mechanism” as a way to focus on papers where the functional relevance of this mutation in beta-catenin was described. In so doing, I was able to find a paper from 2007 where, as I again scanned the sentence fragments, it was very clear the paper stated that the S37C mutation in beta-catenin led to between a three- and seven-fold increase in transcriptional activation, suggesting there was a functional impact.
So again, at this point I filed this reference away for my report, and I asked the next question, which is: is there a relevance to having this EGFR mutation and the resistance mutation T790M in concert with an activating beta-catenin mutation, in this case in non-small cell lung cancers specifically? I utilized the advanced search options to look for both EGFR and beta-catenin in non-small cell lung cancer where the T790M mutation was present. This is a much more refined search. In performing it, I’m showing the article plot that Mark described before, and in the upper right-hand corner, which is always a great quadrant to start in, is a recent paper in a high-impact journal; this is Nature Genetics, highlighted here at the bottom right, discussing the evolution and clinical impact of co-occurring genetic alterations.
Within this paper they specifically address this phenomenon of activating beta-catenin mutations in the presence of EGFR mutations and the relationship therein, and I was able to read this paper and communicate the findings, which basically suggested that co-targeting EGFR mutations and beta-catenin mutations in concert may lead to better outcomes in this patient population.
That’s all I have to say for those two cases; I hope I was able to communicate a flavor of the way that we utilize this tool in our day-to-day practice and some of the benefits it may provide. I’d specifically like to mention what Nikki brought up during her talk, which is this ability to really feel as though we’ve done our due diligence with the literature searches, and not just putting random association words into Google or some other PubMed tool. We feel that, in our experience, this particular tool has been useful in generating appropriate literature to support our reports and practice. With that I will hand it over to Victor, and thanks again.
Thanks, David, for the opportunity to chat today. I’ll go in a little bit of a different direction from David and Nikki here and talk about the upstream part of the panels that would ultimately make it into the clinic and the pathology workflows that might follow.
For those who are not necessarily familiar with Q2 Solutions, we are a genomic laboratory suited for providing panel-based or sequencing services across the clinical trial continuum. While certainly a lot of these panels can be used in small academic medical centers for testing, we’re using these in clinical trials to either, in a pan-therapeutic case for a given cancer, match people to the right therapeutics based on their mutational landscape, or, in a pan-cancer case, where we’re running a trial for a given therapy. So we’re always being asked by clients who want to come and create panels with us: what’s the right one, should we use this vendor or not, is it too big or too small, how can we make them faster, cheaper, and so on.
The team that we’ve built here is specifically geared for expertise in molecular bioinformatic analyses, getting at what variants or mutations are present in the system, but not all of us are oncology experts in all things. So a lot of times when we’re making panels for something we’re not as familiar with, this becomes quite daunting, as we are experts in many different things but not necessarily your cancer. What I’m showing here is the kind of pipeline for looking at this, and I’m really focusing on the front left part, because that’s where we’re at right here, and I want to emphasize that step one, looking at the research and the biomarker identification, is really difficult in some cases. If it’s a genotyping panel and we know the couple of markers we’re going to be identifying to decide whether the patient should receive a therapy or not, that’s pretty straightforward; it’s well known, and there are a couple of well-known markers that would dictate some kind of action.
But in a lot of cases folks want panels that have multiple utilities, and the less time we spend on the left-hand side of the spectrum for a trial, the better, because after that, once we've made the reagents and we actually start running the assay and doing the bioinformatics, those steps have more finite timelines. Paring down that panel content is very difficult, and it's something we do get asked to do on occasion.
From that, we embarked on an exercise, in early 2015 now, to create our own comprehensive cancer panel. Around that time this was becoming pretty popular, and other companies were coming out with similar panels, but based on our clinical trial focus we wanted to hit every marker we could that dictated some kind of trial enrollment or druggable target. Great, we have some databases we could look in for that, but what became very difficult very quickly was how to gather all the information we didn't already know from the trial landscape.
Biological pathways were a big feature of this kind of work, your classic approaches that I would have used in my past bioinformatics life in graduate school, but the question was really how narrowed down this list should be. So I and another individual and some other folks sat down for several months pounding away: let's start with the markers we know are actively targeted by FDA-approved drugs, here are markers we know are coming up in the pipeline. That left us with a small list that we felt might be incomplete. It took a lot of cycles of high-end PhD time to come up with a panel that, in the end, we were really happy would serve a lot of different use cases for us. But going forward, every time a client wants us to do this, we can't charge for six months of two PhDs' time; that becomes a really intractable problem. Clients are coming to us for this, and I give a highlight here of wants and don't-wants: they want a panel even when one isn't already known, vetted, or provided by a vendor, and they want it turned around quickly.
In a lot of cases a trial might be therapy specific, but it might actually be focused on classifying a tumor, or on looking at risk-based markers, things that might dictate familial inheritance patterns. So we get very varied needs that this panel may have to serve, and of course cost becomes a factor, because next-generation sequencing is still fairly costly in a lot of cases, as does customization: I want this size panel for one trial and a different size for another, so again lots of variables. I show here on the right what I've seen most historically in panel requests, where the core question is: what are the guidelines, can I run a panel that maps onto NCCN guidelines for me, so I can make sure patient management is appropriately streamlined? Then, okay, which markers are going to dictate treatment? Others might be for classifying the tumor, and it expands and expands as we get further into this field.
We don't want to take a long time, and we can't take the few months I talked about earlier. So, about a year and a half ago, we had a large push from multiple different clients to create a custom AML panel, and as Mark alluded to with the white paper on leukemia, over the last six months to a year the number of myeloid and AML panels has really exploded. So we were seeing the same question every other vendor was seeing around mid-to-late 2015: we really don't have a great way to genomically profile AML.
When we received this request from clients we said, okay, from NCCN I can pull up what those markers are, that's great, and from this particular version of NCCN I've got seven genes. For our pharma clients and other folks that want to get at treatment response, it becomes a little trickier. There's the familiar TCGA work in the subtyping category, but that's a whole lot of genes that are individually mutated; AML has a lot of indication-specific or patient-specific mutations, so a hotspot approach gets trickier. And as I mentioned earlier, clients were asking because existing panels weren't sufficient: they weren't large enough, we can't subtype, we want a test that we can also use to detect minimal residual disease, and the list goes on.
So that's when I contacted the Genomenon folks to help with this process. To gloss over some of the details, what I did first was a demo of the tool in an afternoon. I said, okay, how quickly can I potentially build a panel doing this? Because we already had some tests that we ran in our laboratory here for FLT3 and DNMT3A, I put those in first, since they're pretty common, and as David spoke about before, I started at the top right, looking at things that are most recent and highest impact factor.
Right away, and you can't really see it at this size, there was a small dot in the very top right, the highest-impact hit, and highlighted on it was the publication on genomic classification and prognosis in AML. Fantastic. The same thing happened looking at FLT3: clicking the dot in the very top right led me to that same type of landmark paper, also on genomic profiling in AML. When I had actually searched for these kinds of terms before, genomic profiling of AML, druggable targets in AML, a lot of different Google searches, none of these publications came up in a couple of different attempts, so it had been really difficult to get to this. But as I said in my last bullet here, in less than a minute I had two papers to pull on that had a large swath of gene lists, subtyping markers, and so on. That's great. Of course I can't necessarily mine like this for every single marker I want to add, so in talking to Mark at Genomenon, I said: this is great, I found the papers I wanted, and I can already take a lot of markers here to put in a panel; how can I stretch this out and identify more items?
The slide here shows a typical figure from that paper, where you've got your biomarkers: some are complex mutations, some are translocations, copy-number losses, and so on, split out among risk profiles for patients. That is again outstanding, because it's what I'm looking for: I want to capture a lot of different classes of genomic variation, not just your SNVs and indels, because I want the panel to be broadly useful. So right away these papers were exactly what I was looking for, with the specific markers to throw into the panel.
Of course that's great, but these markers are really tricky; I'm not a geneticist, and they don't all speak the same lingo. So I asked Mark whether there was a way we could mine this further to identify goodies, and we wanted to go in at a high level by marker class. The reason I had found these two papers to begin with is that I was looking for prognostic markers as the first request, so I've highlighted here the prognostic circle. From that I got genes, and when I was looking at FLT3, the markers highlighted on the protein diagram were prognostic-related. Going further, I could then filter those for things that are treatment-related or diagnostic-related, and for me that added the kind of variability, going back to the concentric-circle graph, that showcased the wide range of uses a client might need. I could start classifying, at least here by hand, markers and genes that might be of interest, and that's great.
So, digging into the underlying database, how do I really start doing this wholesale? Working with the Genomenon team, we started by having them provide a list of publications for me that were most relevant for prognostic markers, diagnostic markers, or treatment markers specifically.
What came back from the underlying engine within Mastermind was a prioritized list of individual markers. So this isn't just the gene you see here in the top left for SNVs, like KIT; you're also seeing the specific marker, KIT D816V, as highly ranked. You have other ones here in FLT3, IDH, and NPM1, things that I knew right away should percolate to the top, and specifically the markers themselves, so that if I really want to make a tiny panel I can focus on only those markers, should a client need something like that.
But for something like AML, SNVs and indels are only part of the story; genomic rearrangements and fusions are a big part, and there's not a whole lot of familiar tooling for those. We were able to pull out that information for fusions and copy numbers very quickly, which was awesome. Going even further, and I didn't expand the figures here, we were able to go back to those papers in a lot of cases and, from the database, identify the genomic coordinates of those events, which I need in order to actually design the panel.
So, following this concentric-circle line of work, I took the union gene list from all of these markers that were prioritized in publications with associations to, in some cases, treatment, outcome, or diagnosis. The diagnostic question is really: is there a marker that's going to identify you not just as AML but as this particular subtype of AML? Which we did.
Then there are markers that might be risk factors, for clients that want to provide additional value from running that trial. So we made a union list of about 120 genes that can be classified across these different biomarker classes, treatment, diagnostic, prognostic, or risk-based, so we can mix and match, with prioritization of the individual markers, to provide a variable panel design right away. This whole process took a very short amount of time, a couple of weeks, to have something I could take back to several clients and say: here's our basic design, let's go ahead with this process. Which is great for us, because you don't want to wait more than a couple of weeks; you really want to start talking to folks about narrowing in the content. That used to be a hard conversation, unless you can prioritize based on these individual markers, ranked by publication, and provide that evidence for their inclusion, which is great.
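The union-list-and-ranking step Victor describes can be sketched in a few lines of Python. This is purely illustrative: the marker records, class tags, and citation counts below are invented (the real prioritization came from Mastermind's literature evidence), but it shows the mix-and-match idea of one master list re-ranked per client need.

```python
# Illustrative sketch with invented data: a master list of markers tagged by
# biomarker class, a union gene list across classes, and per-class ranking.
markers = [
    {"gene": "FLT3",   "variant": "ITD",    "classes": {"prognostic", "treatment"},  "citations": 412},
    {"gene": "NPM1",   "variant": "W288fs", "classes": {"prognostic", "diagnostic"}, "citations": 350},
    {"gene": "KIT",    "variant": "D816V",  "classes": {"treatment"},                "citations": 210},
    {"gene": "DNMT3A", "variant": "R882H",  "classes": {"prognostic"},               "citations": 198},
]

# Union gene list across all biomarker classes (the "about 120 genes" idea).
union_genes = sorted({m["gene"] for m in markers})

def rank_for(markers, wanted_class):
    """Markers in one biomarker class, ordered by literature support."""
    hits = [m for m in markers if wanted_class in m["classes"]]
    return sorted(hits, key=lambda m: m["citations"], reverse=True)

# A client interested in treatment gets the same content, re-ranked.
treatment_ranked = [f'{m["gene"]} {m["variant"]}' for m in rank_for(markers, "treatment")]
print(union_genes)
print(treatment_ranked)
```

The design point is that classification and ranking are kept separate from the content itself, so one master list can serve treatment-focused, prognostic, or diagnostic requests without redoing the literature work.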
So, to wrap up and allow some time for questions: we identified custom content. I'm not an expert in AML at all, I'm a solid-tumor guy, but it was very encouraging, in just a couple of minutes of an afternoon, to pull out hundreds of individual protein changes and markers that I might be able to add, so that was awesome. The context for each gene, is it prognostic, diagnostic, or treatment-related, was also great, and looking into combinations of markers tied to therapeutics can further enhance this, as we mentioned earlier.
Coming up with content quickly is, again, very valuable, and after that screening is done, as discussed, we're able to go back and say: here's your panel list; if we include these top 100 markers, not necessarily genes but markers, your panel is going to be this size, we can run this many samples at a time, and it's going to cost this much. That lets us start looking at some economic and actionable decision criteria for creating the panel, which is really great. With that I'll wrap up so we can take some extra time to answer your questions, and thank everybody for joining.
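The size-to-throughput-to-cost trade-off Victor mentions is simple arithmetic once the marker count is fixed. The sketch below shows the shape of that calculation; every number in it (per-marker footprint, target depth, run yield, run cost) is invented for illustration and will differ by platform and assay.

```python
# Back-of-the-envelope panel economics, with entirely invented parameters:
# marker count -> panel footprint -> samples per sequencing run -> cost/sample.
def panel_economics(n_markers, bp_per_marker=500, target_depth=1000,
                    run_yield_bp=120e9, run_cost=6000.0):
    footprint = n_markers * bp_per_marker        # panel size in bases
    per_sample = footprint * target_depth        # bases of data needed per sample
    samples_per_run = int(run_yield_bp // per_sample)
    cost_per_sample = run_cost / samples_per_run
    return footprint, samples_per_run, cost_per_sample

size, n, cost = panel_economics(100)             # e.g. the "top 100 markers" case
print(f"{size / 1000:.0f} kb panel, {n} samples/run, ${cost:.2f}/sample")
```

Shrinking or growing the marker list moves all three outputs at once, which is exactly the conversation with clients that the ranked marker list makes tractable.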
Closing and Q&A
Thank you, Victor. As a reminder to our webinar participants, if you have a question please type it into the Q&A box in the control panel. We'd like to ask the attendees to take a moment after the webinar has ended to take our exit survey and give us your feedback. We will now start the Q&A portion of the webinar.
The first question is for Mark. Mark, a participant would like to know how often the publications database in Mastermind is updated?
-Thank you, Ed, that's a great question. I apologize for neglecting to mention that during my portion, but Mastermind is updated on a weekly basis, both for all titles and abstracts that have been published in that interval week, and, at the full-text level, for any of those articles that contain genetic or genomic content. So the short answer is: every week.
Okay, thank you, Mark. The next one is also for you: one of the participants would like to know, is it possible to search the results by other keywords, such as treatment keywords?
-Excellent question. What Dr. Seward and Dr. Weigman exemplified were different approaches to prioritizing, filtering, and augmenting the relevance of search results; the typical lines of refinement include diagnostic, prognostic, and therapeutic significance. There are ways to provide user input to refine that search that are available in the Professional Edition of Mastermind. It's also an important place for me to emphasize that, as we continue to refine our algorithms and ways of parsing the data, for those of you who are interested in constitutional genomics, we are incorporating ACMG classification criteria into our categorizations based on keywords. So the short answer is: free-text search is available in the Professional Edition, along with a battery of additional categories and key terms, and, in a future state, ACMG classification criteria.
Okay, thanks, Mark. We have a question for Nikoletta: do you have any thoughts on how the information from Mastermind could be used to help educate physicians on genomic medicine?
-Yes. Obviously, as you heard in what David Seward presented, we are searching for medical literature that's pertinent to the clinical relevance of variants that we identify via our testing. What we end up doing is reviewing that literature, reviewing the content of the relevant articles that we identified via Mastermind, and summarizing that information in the clinical report. A lot of times there are discussions with the clinicians on the other end receiving these reports around exactly what information we are entering as text, and as a variant classification, into the report, so there is education at that level. Also, we're an academic medical center, so it's not unusual at all for residents, medical students, and fellows to be on service with us, and we are very careful to include PubMed IDs in our interpretations so that the end user always has the option to go back to the primary literature and cross-check us. So I think that also serves as a very rich body of information, which we primarily get by using Mastermind, for all sorts of levels of medical and clinical expertise: to go back and read primary literature, get educated on it, learn how a molecular genetic pathologist or medical geneticist interprets the primary literature identified via Mastermind, and then makes a laboratory medicine decision around how to import that into a clinical report. So hopefully that sheds some light on that question.
Okay, thank you, Nikki. Mark, we have another question for you: is there any chance that, rather than typing in individual variants, users could upload text files or Excel files with all the variants from a single patient at once, along with relevant information like disease?
-Great question. Mastermind does afford those automation capabilities. At present we have a VCF file annotation process: for every line item of data, which is just a genetic variant, the VCF processing will annotate those variants with the numbers of articles that mention them. Then, in an enhanced version of that VCF annotation process, there are ways that APIs can be utilized, or the VCF annotation can be modified, to include suspected diseases for the given clinical circumstance. So the short answer is that batch VCF processing is probably the easiest way to get that information, if not through API data feeds.
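As a rough illustration of the batch-VCF idea Mark describes, and not Genomenon's actual API or annotation format, here is a sketch that walks VCF lines and appends an article count to each variant's INFO field. The `article_count` function and its data are hypothetical stand-ins for a real Mastermind query, and the `GENE`/`HGVS`/`MM_ARTICLES` INFO keys are invented for the example.

```python
# Hypothetical stand-in for a real literature lookup (e.g. an API call).
def article_count(gene, hgvs):
    fake_db = {("FLT3", "D835Y"): 523, ("NPM1", "W288fs"): 341}
    return fake_db.get((gene, hgvs), 0)

def annotate_vcf_lines(lines):
    """Append an MM_ARTICLES=<n> tag to the INFO column of each variant line."""
    out = []
    for line in lines:
        if line.startswith("#"):           # pass header lines through untouched
            out.append(line)
            continue
        fields = line.split("\t")
        # Parse key=value pairs from the INFO column (column 8 in VCF).
        info = dict(kv.split("=") for kv in fields[7].split(";") if "=" in kv)
        n = article_count(info.get("GENE"), info.get("HGVS"))
        fields[7] += f";MM_ARTICLES={n}"
        out.append("\t".join(fields))
    return out

vcf = [
    "##fileformat=VCFv4.2",
    "13\t28592642\t.\tA\tT\t.\tPASS\tGENE=FLT3;HGVS=D835Y",
]
for row in annotate_vcf_lines(vcf):
    print(row)
```

The point of the batch shape is that every variant in a patient's file gets a literature count in one pass, so triage can start from the counts rather than from one-at-a-time searches.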
Okay, thanks, Mark. We have a question now that could be for Victor, but we'd also like David and Nikoletta to weigh in if they'd like. Victor, let's start with you on this one: can you quantify the return on investment or time savings that you achieved using Mastermind?
-Sure. For us, in that AML example, with Mark himself probing the back end of it, we were able to churn out the full union list in about two weeks, really one week. In actual time for me and the other individual, it might have been a few hours back and forth with Mark, discussing and doing some mining ourselves. So on my end I would say eight hours of one PhD's time to help filter and prioritize different lists based on those interaction items within the master list, and the other week was spent creating a way to rank the markers, so that when clients had an interest in treatment we could very quickly give them a marker list ranked by treatment or by prognosis. So the second week was spent structuring the panel in such a way that clients who were asking could easily shift the markers to suit their needs. All told, it might have been 30 to 40 hours of my guys' time versus something where my guys would have been out for a month or so of real time. For me that's pretty quantifiable, and since we get multiple different requests, I don't necessarily have folks I can take out for that long. Go ahead, David.

-I can address this from our side. When we were first introduced to the software, we had already been practicing without it for some time, so I'd done numerous side-by-sides where I'd either start with the way I had been searching prior to using Mastermind and then repeat the same search using Mastermind, or vice versa, trying to compare what articles I found and the time it took me to find them.
Now, the difficulty in trying to answer the question specifically about time savings has to do with what I refer to as my comfort level with my report. Before being convinced that Mastermind's sensitivity was appropriate, and before using Mastermind, I frequently would continue to search up to the point where the report was basically due and I needed to get it out; that was the timeline, however long I could keep reading. Because what you'll find with a lot of these variants is that there is a massive amount of literature, and if you don't try to narrow your search in a rational way, you can be reading papers for the rest of your career on some of these topics, and they're continually being updated. So where I found Mastermind to be particularly advantageous was the sense of search completion. After perusing the top hits and doing the filtering in Mastermind, I didn't have that sense that I had missed something, because in my analysis, whatever I found doing my searches through Google Scholar or any of the other search engines out there was inevitably captured by my searches in Mastermind. So it's not so much about time savings as it is about the feeling that my search has been complete. If that doesn't make sense, ask a more specific question and I can get to it. But for me it's about the sense of having done my due diligence, and in my side-by-side comparisons that has been achieved with Mastermind.
Okay thanks David. Nikki did you want to chime in on that question?
-I do. For me, as a medical director, being able to have a sense of confidence that we have standardized our process of how we're searching the literature has been a relief. I think any time you can lock something down and feel like there's consistency across the group in how we're doing our medical interpretation, that's an invaluable thing to have. So I'll keep it short, I understand we're up against the hour, but I did want to make that final point.
Okay, thanks, Nikki. And that is all the time we have today for the webinar. We'd like to thank our panelists Mark Kiel, Nikoletta Sidiropoulos, David Seward, and Victor Weigman, and our sponsor, Genomenon. If we didn't have time to answer your question, we will try to have the panelists answer it directly afterwards. As a reminder, please look out for the pop-up survey after you log out to provide your feedback. If you missed any part of this webinar or wish to listen to it again, a link to an archived version will be emailed to all attendees. Thank you for joining us for this GenomeWeb webinar.