Watch our latest webinar to learn how Genomenon is advancing Precision Therapeutics by unlocking Real-World Evidence (RWE) from decades of clinical literature.
Despite the vast trove of patient data and scientific insights contained in published research, extracting and organizing this information remains a significant challenge. Genomenon overcomes this challenge by combining genomics-focused AI with expert human curation and deep scientific expertise. This unique combination helps partners leverage this literature-driven RWE to drive discoveries that impact patient care and drug development.
During the discussion, our speakers take you through two case studies that show the breadth and depth of these insights, fueling everything from biomarker discovery and patient stratification to gene-disease curation and clinical trial design.
Rare Disease Use Case: Diagnostic Impact through Curation
Sarah Chang, Ph.D., Medical Strategy Lead at UCB, will detail how expert curation and classification of genetic variants in thymidine kinase 2 deficiency (TK2d) have expanded the disease’s knowledge base. Through meticulous expert curation, previously ambiguous or unrecognized variants are now clearly classified, providing the global clinical community with freely accessible, clinically actionable data. This resource not only accelerates accurate diagnosis but also reduces barriers to genetic testing, ultimately improving patient outcomes and informing care strategies.
Cancer Use Case: AI-Driven Insights for Drug Discovery
Dr. Mark Kiel, Genomenon Chief Scientific Officer, and Jonathan Eads, Genomenon VP of Technology, will demonstrate how Genomenon’s AI-powered knowledge graph indexes and analyzes clinical literature for RWE at scale. By intelligently linking therapies, phenotypes, genes, and variants across diverse disease areas, the platform enables comprehensive scientific analysis and precise patient characterization.
In one featured project, the platform processed over 50,000 full-text articles on colorectal cancer, uncovering novel correlations between patient demographics, genetic variants, treatments, and outcomes. These insights, including emerging biomarkers of immunotherapy response, provide actionable evidence that accelerates drug discovery and enables more targeted patient identification for precision therapeutics partners.
Broader Implications: Enabling Precision Therapeutics with Patient-Derived RWE
Whether the goal is to support the diagnosis of rare diseases or to drive targeted drug discovery in oncology, Genomenon enables partners to leverage patient-derived RWE from the literature. This capability positions therapeutic companies to make data-driven decisions, enhance precision medicine initiatives, and ultimately improve patient outcomes worldwide.
Complete the form to view the webinar recording:
Sarah Chang, PhD, is a medical affairs professional with over 20 years of experience in the pharmaceutical industry. At UCB, she leads initiatives focused on an ultra-rare mitochondrial disease, called thymidine kinase 2 deficiency (TK2d). Her work centers on advancing genetic understanding and improving diagnostic pathways to help patients receive earlier, more accurate diagnoses. Deeply committed to the rare disease community, Dr. Chang is driven by the belief that every patient—no matter how rare their condition—deserves timely care, meaningful support, and the hope of a better future.
As VP of Technology, Jonathan Eads spearheads Genomenon's software development team, leading the technical vision and implementation for the company's technologies, including its AI strategy. He brings 25 years of experience in genomics, drug discovery, bioinformatics, and AI/ML software application development, with extensive experience building and leading engineering teams to deliver operational excellence.
Dr. Mark Kiel is the co-founder and chief scientific officer at Genomenon, where he oversees the company's scientific direction and product development. Mark received his MD/PhD in Clinical Pathology at the University of Michigan. He founded Genomenon to address the challenge of connecting researchers with evidence in the literature to help diagnose and treat patients with rare genetic diseases and cancer.
MARK: Hello, everyone, and welcome to today's webinar: Real-World Evidence For Precision Therapeutics. My name is Mark Kiel, and I'm the chief science officer and founder at Genomenon. Before we begin, I've got some housekeeping issues to cover. First of all, this webinar will be recorded and shared by email to all registered attendees, and after the presentation, we will host a Q&A session, so I encourage you to submit your questions through the Q&A panel at the bottom of the Zoom screen at any point during the webinar. During the webinar, you'll see a handful of poll questions pop up, and I invite you to answer those questions. Those are useful for us to engage the audience as well as to gather meaningful input from you. Lastly, before we wrap up, we'll share a brief optional survey. This feedback is invaluable to us as we prepare additional content and make sure that these webinars are interesting and meeting the needs of the community.
So with that said, let's get started. As I mentioned, my name is Mark. I'm the chief science officer and founder of Genomenon. I'll be leading the conversation, but the real stars of the show here are presented on the screen. I'm very grateful to be joined by these two colleagues and clients of ours. First of all, let me introduce Jonathan Eads. He is our VP of Technology at Genomenon. Hi, Jonathan!
JONATHAN: Hi, Mark!
MARK: Jonathan leads software development, overseeing the team and the technical vision behind Genomenon's AI capability and software. He has decades of experience in genomics, drug discovery, bioinformatics, AI, machine learning, and software development.
I'm also very grateful to have Sarah Chang with us today. Hi, Sarah! Sarah is the medical strategy lead at UCB. She, too, has decades of experience, but in the pharmaceutical industry. At UCB, she's focused on an ultra-rare mitochondrial disease, thymidine kinase 2 deficiency. She's going to be talking about her work there, but enfolded in that interest, she has a deep and abiding interest in advancing genetic understanding, improving diagnostic pathways, and helping patients receive earlier and more accurate diagnoses. She believes that every patient deserves timely care, meaningful support, and hope for a brighter future. So I thank you, Sarah, and I thank you, Jonathan, for joining today.
Jonathan and Sarah will be highlighting two examples of how Genomenon is leveraging its own real-world evidence platform to solve challenges for both rare disease and for oncology. I will first be introducing Genomenon before our presenters join, and I'll be guiding some discussion questions as well. So, without further ado: Who is Genomenon? Genomenon provides genomic intelligence for clinical diagnostics and precision therapeutic development. In short, we simplify complex genetic data by turning it into actionable insights. We're able to do this through a unique combination of computational capabilities, illustrated on the left, and a whole team of extremely capable expert human curators. They're experts not only in genetics, but also in data curation for biomedical entities.
Underlying the power of this approach is a vast collection of real-world data from the clinical and scientific literature. This real-world data comprises patients and their journeys, and all of that associated information that captures their cases, whether they're patterned in case reports or pedigrees or case series or patient cohorts. That real-world data comes in the form of demographic information, phenotypic data — how those patients present — their clinical laboratory values, their treatments and their outcomes.
Harnessing that data from the medical literature involves, first, recognizing those individual data points, but then putting that information together in the relationships that collectively comprise what Jonathan and Genomenon refer to as G³, the Genomenon genomic graph database. This is a knowledge graph of all of those entities from that vast collection of clinical information, and all of their interconnectedness, their relationships as well. We put that information together and derive value for our customers and for the field itself in this arc.
Beginning on the left, there is the corpus of clinical literature, as I say, very vast. Many millions of full-text and supplemental data sets from which we extract meaningful genetic and patient-related information, including therapies and outcomes. Put that together in that knowledge graph to comprise all of the real-world data from the totality of human inquiry into clinical medicine as captured in the published literature. That information can be curated by our human experts to increase and maximize the accuracy and meaningfulness of that information. Collectively, we call that real-world evidence. Sarah will be talking about a flavor of that evidence, the variant landscape, but we also capture patient landscapes, which is to say, every patient published in any one of those scientific publications and all of their journeys, their individual journeys, as well as their collective journeys and cohorts.
Then, in addition to our team of curators, we have a team of scientists who can make meaning of that information and make it actionable. This unique combinatorial approach will be illustrated in both Sarah's and Jonathan's presentations. The way that we realize that value comes in a number of different use cases and applications. They're illustrated here with some of our clients in pharma. On the left, you can see how this real-world evidence at the variant level can help better determine disease prevalence, which is particularly useful for rare and ultra-rare diseases. It can also pattern inclusion criteria, including, at the genetic level, variants that, when found in patients, predict a likely response to precision therapeutics, as well as identify clinical and functional biomarkers.
We can do this at an unprecedented scale. Surveying, as I said, the vastness of the published literature, and capturing every minute detail of each one of these patient journeys, the last of which, the use case example, which Jonathan will illustrate, affords leveraging G³, the real-world data in Genomenon's data set to stratify patients, which is an exciting prospect that has a lot of applications, including in oncology, which Jonathan will highlight.
So before I introduce Sarah back on the stage, I'd like to draw a contrast and some parallels between these two use cases. Both of them leverage the computational capability that Genomenon brings to bear, as well as the human curators and scientists who review that information. One of them, Sarah's, will be focused on rare disease, an ultra-rare disease, TK2D, or thymidine kinase 2 deficiency. This involves curation of genetic variants and is useful for maximizing diagnosis of individuals with this rare disease. In contrast, Jonathan's use case example will highlight an oncology example. We'll focus on the data science aspects of leveraging the data that G³ produces, collecting that information at a patient level to inform strategies to better treat those individuals. Those similarities and parallels hopefully will come out as Sarah and Jonathan present their use cases.
So, Sarah, with that introduction, I invite you back on the stage. I'll go dark while you present your slides. I'll come back up and walk through a couple of questions with you, some that hopefully will come from the audience. Then I'll introduce Jonathan, and we'll continue. So, Sarah, take it away.
SARAH: Thank you so much, Mark, and I just want to start by saying I'm really excited to be here today, for two reasons. One is that I'll always take an opportunity to raise awareness of TK2D, which happens to be an autosomal recessive, very rare mitochondrial disease. And then, second, the main reason that I'm here is to talk about the variant landscaping project that we've conducted in partnership with you and your team, Mark, which we believe really has the impact and the potential to facilitate earlier diagnosis of patients down the line.
So I'll start with a little bit of background on thymidine kinase 2 deficiency, or TK2D, which is a rare, progressive, debilitating, and often life-threatening genetic mitochondrial myopathy, so a muscle weakness. It's caused by autosomal recessive mutations in the thymidine kinase 2, or TK2 gene within nuclear DNA. TK2D is associated with progressive proximal myopathy. That's a muscle weakness, and specifically, muscles closest to the center of your body (so think head, neck, trunk) as well as respiratory weakness. Respiratory failure actually happens to be the number one cause of death in people who have TK2 deficiency.
This disease happens to be associated with a very high mortality rate, in particular when symptoms present early in life. Many of the people who have TK2D lose the ability to walk, eat, and breathe independently. So it's a really devastating disease, and to add insult to injury, the diagnosis of TK2D can be very challenging. Mitochondrial diseases in general tend to have multi-organ involvement. TK2D has variable symptoms, and those symptoms tend to overlap with some other diseases that are phenotypically similar and more widely recognized, things like muscular dystrophy, Pompe disease, and spinal muscular atrophy, to name a few. If you were to go to the published literature and look up the diagnostic odyssey of people who have mitochondrial diseases, what you would see is that, on average, they see about eight physicians before they get to a diagnosis.
So again, it's really devastating, not only for the people who have these diseases, but their families and caregivers as well. Genetic testing is the most direct path to diagnosing TK2D. As is often the case in ultra-rare diseases, many of the genetic variants that are identified in affected individuals haven't been previously reported, and this furthers that diagnostic odyssey. Typically, these changes are classified as variants of uncertain significance, or VUS, which makes it challenging to interpret and classify many of the variants that are associated with ultra-rare conditions like TK2D.
In the world of rare diseases, surfacing all available evidence is really crucial, because even a single publication could determine whether a patient receives a definitive diagnosis. To support our understanding of TK2 deficiency, we partnered with the team at Genomenon, and systematically identified and evaluated all of the known genetic variants in the TK2 gene that could be linked to disease. The team at Genomenon used their Mastermind platform, which, as Mark mentioned previously, is powered by artificial intelligence, and their team was able to scan the entire body of scientific literature for any mention of TK2 variants. Ultimately, we curated evidence for 95 published variants. This evidence is now freely available on Mastermind, as well as ClinVar, for the entire scientific community.
For those of you who may be less familiar with ClinVar, that is a central resource, publicly available. It's generally used by clinicians, researchers, and diagnostic labs to interpret genetic test results. However, many variants, and in particular rare ones, are either missing or underrepresented in ClinVar. When we looked at our data set, what we saw is that approximately 40% of the Mastermind-curated variants were not previously represented in ClinVar. This is really interesting, and I think it highlights a gap that a lot of us who work in the rare disease setting are up against, which is that there are a lot of data available in the published literature that are not necessarily linked up with those commonly-used genetic databases like ClinVar, for example. So it's really important that we marry the information between those two sources to ensure that the broader community has access to it in a timely fashion to facilitate the early diagnosis of these patients.
Of the new variants that the team at Genomenon identified and pushed over to ClinVar, roughly 25%, or a quarter of them, were classified as pathogenic or likely pathogenic. I think this is also an important point, because that pathogenic or likely pathogenic designation is what's required to confirm a genetic disease diagnosis. Just to put a little more context to that: you could imagine a patient whose genetic test results come back with a known pathogenic variant on one allele and, on the other, a VUS, or variant of uncertain significance. That patient will not be able to move toward a disease diagnosis, because the VUS needs to be resolved first. Only when a pathogenic or likely pathogenic variant is present in biallelic form, since this is an autosomal recessive condition, can the patient then be declared to have the disease. So it's really important that those VUSs are resolved in a timely fashion.
One of the key findings that came out of this work that I am really excited about is on the bottom right hand side of the slide. It resulted in a decrease in the number of VUSs. That's because the Mastermind curation supported a likely pathogenic classification for a variant which previous ClinVar submissions had labeled as a VUS. This variant was a leucine-to-proline change in the amino acid sequence encoded by the TK2 gene. If you look back at the ClinVar database in 2022, you can see two submissions that flagged this as a VUS. But just last year, in 2024, a publication came out in the literature reporting a genetic analysis of people with TK2 deficiency. There was a patient in that report who was a compound heterozygote, carrying a well-known pathogenic variant in the TK2 gene on one allele and, on the other, the variant previously classified as a VUS. Because of that publication, the VUS was resolved in this patient, and the Genomenon team pushed the updated classification through to ClinVar.
The potential application moving forward is that if a future person is identified as having that variant and the lab goes to ClinVar, they will now see that it is likely pathogenic, which could facilitate an earlier diagnosis. So, to summarize: by submitting curated TK2 variants, what we're really doing is helping fill the gaps in the public record, standardize interpretations of variant pathogenicity, and improve diagnostic accuracy for TK2D. This is both a strategic and an impactful move. It also contributes to a more robust and equitable genomic knowledge base. So again, Mark, thank you to you and the team at Genomenon for partnering with us on this.
MARK: Well, thank you, Sarah. This is always exciting. I mean, your passion shines through for patients and their families who are dealing with this disease. It is very high touch, disease by disease, one disease at a time. One of the things that you highlighted that I really liked is, it's not just about getting the right data. It's about getting it in the right hands, having a vehicle to get it out into the world. That's a pragmatic challenge.
I want to talk more upstream about some of the other things that you talked about in the differential diagnosis, and especially seeing that diagnostic odyssey play out over eight visits, and probably as many years. What proportion of patients with TK2 variants do you find are delayed in their genetic sequencing? Are you seeing that happen less and less, and that, more and more, it's a question of finding the right variants and the right information for those variants? Or do you still see a gap there, in patients not yet getting sequencing of that gene, to recognize this as the causative factor?
SARAH: There's a lot to unpack there, Mark. Access is always going to be an issue. Sometimes people's insurance won't cover a genetic test, which extends their diagnostic odyssey. There's also the situation where a physician, for example, might not know which is the most appropriate test to order. Or maybe they'll order a panel, as opposed to a whole-exome or whole-genome test, and that panel might not include the TK2 gene. We do a lot of education to raise awareness of that. I think in general, in the US anyway, we're moving more towards whole-exome and whole-genome, so that will resolve several of the issues associated with panel testing. But again, this is an ultra-rare disease. As patients are identified, we're at a state where we have identified a lot of pathogenic variants for TK2, but that isn't yet the case in other disease states and areas.
We don't know what we don't know, either. When new variants do come in through these genetic testing results, they will need to be resolved. We've seen instances where that is the case, where a patient is phenotypically similar to TK2 deficiency, but they only have one pathogenic finding on their lab report. They need that resolution, and what we're hearing from physicians is, "well, I don't know what to do next. What types of assays are needed? How do we resolve it?" A lot of them are relying on ClinVar as their source of truth. As this study showed, when the data aren't matched up with the most current data in the literature, it makes things really challenging.
MARK: One other thing that you touched on is the word classification, which I quite like. It's different than variant interpretation, which is also different than evidence curation. That's a bit of a path, or, you know, a hierarchy. You curate evidence, you classify the variant based on that evidence, and you interpret the meaningfulness of the variant in the clinical context of the patient. I wonder if you can, using that framework, touch on the concept of VUS hotness. You talked about some of the evidence that we produced converting a VUS to a likely pathogenic classification, but also, some of those VUSs were promoted in their heat; there's some more evidence. Then, I like to think also that there's evidence just continually accreting in the published literature, functional studies, or new cases that are found. So can you put some context around TK2D in terms of ever amassing more of this information, and interpreting it in the context of the patient that's being seen by the physician?
SARAH: It's a great question, and these things do evolve over time, to your point. Generally speaking, when we look at these variants, they're on a range, if you will, from benign to pathogenic, with uncertain significance kind of in the middle. But there is a spectrum to that, and it changes over time. The more evidence that's accumulated, whether it be in the lab or with clinical data as well, it can progress those variants toward pathogenic, or in some cases, toward benign, for example. The work that you did did progress some of those, which I think is critically important, because as additional evidence accumulates down the line, it further supports the reclassification of those VUSs.
MARK: Yup. Maybe a last question is context. I understand, with every rare disease group that we work with, that the number of individuals is of great import, from multiple different perspectives. When you're estimating that, or calculating that information, you want to be sure that you're relying on the most up-to-date and the most accurate information. Can you give me a sense for UCB's approach to really understanding how rare is ultra-rare, and whether there's a pocket of patients or a subset of patients who just aren't known, because of challenges in finding that underlying information from which to calculate?
SARAH: I mean, it's a great point that you're raising. I think, understanding how common a disease is, or its prevalence, is a critical step for any therapeutic company. Not just UCB. But as we know from today's talk, it's not always straightforward. There are a lot of diseases that companies like UCB focus on that are rare. What this means is that there are very few diagnosed cases, often limited data, which makes it hard to get accurate numbers. Further to that point, many rare diseases, especially TK2D, may go undiagnosed or are misdiagnosed for many years. Patients might not have access to genetic testing. Their symptoms might be mistaken for other more common conditions. That leads to underreporting and gaps in the data.
Also, as we touched upon, the science of genetics is constantly evolving with new prediction tools, updated databases, and emerging patient data. This can change how we interpret those disease-causing variants. I think what it points to is the fact that prevalence estimates need regular updates. The work that you're doing is something that would be continued over time as new data are pushed out there. It's important to reassess. If you have a VUS that becomes pathogenic, then when you take that information and explore a database like gnomAD, for example, it could have implications for how many patients are out there. We're at the point for TK2D where we don't know exactly how many patients there are. We know it's going to evolve over time, though.
MARK: Yeah, it's one of the fun and frustrating things about genetics. It's very complicated. Then, when you layer that on top of epidemiology, it becomes even more complicated. These are real questions that we need to address to make the progress that is required to build out drug programs for rare diseases, recognizing those patients and ensuring they get the proper diagnosis. So thank you, Sarah, that was excellent. An excellent realization of the challenges that you're addressing every day for individuals with TK2D. I'll invite you to come back at the end, but, Jonathan, I'd invite you to come on stage here now.
Jonathan will be showcasing a different aspect of Genomenon's real-world data and real-world evidence capabilities. This one is focused on oncology, but it also has more of the artificial intelligence and data science component that he's going to be focusing on. So, Jonathan, I'll go dark again. Just tell me when you want me to advance the slides. I'll come back up and lead some questions.
JONATHAN: Great. Thank you, Mark. All right. The historical focus for Genomenon has been on genes and variants and classifying those variants, and we decided we wanted to widen the aperture and go beyond variants and genes to any clinically relevant entity. What I mean by entity is anything like a patient, a biomarker, a drug, a therapy; any of those items can be entities that you could find in the literature. Being able to extract those entities with a high degree of accuracy can yield powerful insights. So we developed an AI-based technology, we call it G³. There's both a graph component to it and an LLM component to it. The LLM component to it is focused on text extraction from the literature. There's two buckets of information that we extract. One of them I just described, which is the entities. The other is the relationship between the entities.
So on this slide, I've described 15 entities, one of which is patient. Entities can have properties associated with them. These are extractable items; properties associated with patients could include the patient's genotype or family relationships. There's also outcome. Outcome from a specific drug or therapy treatment is something we can target with an LLM for extraction, and an outcome could have a response associated with it. That response could be positive, negative, or partial. There's a whole range of outcomes that we are able to extract.
Relationships are going to be from one entity to another. So you could think of it as, you have a subject entity, a relationship predicate and an object entity. There's a few examples listed here on the right. So clinical trial would be an entity, and it tests a drug. Tests would be the relationship predicate, drug would be the object entity. A disease occurs in a patient, so the relationships between the entities are really valuable in deriving insights from the literature. This entity and relationship extraction is codified into what we refer to as a schema, and we can take any part of our corpus of full-text articles and scan that corpus with the schema and extract entities and relationships at very, very large scales.
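For readers who want a concrete picture of what this entity-and-relationship model looks like, here is a minimal sketch in Python. The class and field names are illustrative assumptions, not Genomenon's actual schema; the point is the subject-predicate-object structure Jonathan describes, with each triple tied back to its source article.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    """A clinically relevant entity extracted from an article (patient, drug, disease, ...)."""
    kind: str                                         # e.g. "patient", "drug", "clinical_trial"
    name: str                                         # e.g. "example-immunotherapy", "patient_42"
    properties: dict = field(default_factory=dict)    # e.g. {"genotype": "...", "age": 47}

@dataclass
class Relationship:
    """A subject-predicate-object triple linking two entities, tied to its source article."""
    subject: Entity
    predicate: str                                    # e.g. "tests", "occurs_in", "treated_with"
    obj: Entity
    source_pmid: str                                  # article the triple was extracted from

# Example triples mirroring the slide: "clinical trial tests drug", "disease occurs in patient".
trial = Entity("clinical_trial", "example-trial")
drug = Entity("drug", "example-immunotherapy", {"drug_class": "immunotherapy"})
patient = Entity("patient", "patient_42", {"age": 47, "sex": "male"})
disease = Entity("disease", "colorectal cancer")

triples = [
    Relationship(trial, "tests", drug, source_pmid="PMID-placeholder"),
    Relationship(disease, "occurs_in", patient, source_pmid="PMID-placeholder"),
]
```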
Cool technology, what can we do with it? We chose a case study, specifically, colon cancer. The problem we wanted to address is that colon cancer patients treated with immunotherapies exhibit a wide range of responses, from no effect to complete remission. The question we really wanted to answer is, can we use this LLM-based G³ technology to identify meaningful outcome correlations with colon cancer patients treated with immunotherapy? We're specifically looking for clinical characteristics correlated with positive drug treatment outcomes that we can look to potentially identify patient cohorts that would be a good fit for immunotherapy, based on a patient's intrinsic properties.
The large picture of the technology steps we're going to go through: we have a clinical genomics publication corpus, that's, you know, 11 million articles. We have a prompt and an entity-relationship schema that describes entities like patients, the drugs they were treated with, and their genetic variants, along with the relationships between them. When that schema is applied with an LLM, the LLM is capable of extracting that information from the literature corpus. We store all of those entities and relationships in a graph database, and then we analyze that graph and look for relevant real-world insights. We're going to do that for colon cancer. The technology itself is totally generic. It doesn't know anything about colon cancer specifically, but it knows a lot about how to extract a disease or how to extract patients. So this case study could be any disease or any gene.
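A rough sketch of that pipeline follows. The helper names `call_llm` and `graph_db` are hypothetical stand-ins for an LLM client and a graph database, not Genomenon's actual stack; the sketch only shows the shape of the flow from prompt plus schema to stored triples.

```python
import json

# Extraction prompt: the schema and article text are interpolated at call time.
EXTRACTION_PROMPT = """Extract every patient, drug, disease, variant, and outcome mentioned
in the article below, plus the relationships between them. Return JSON matching this schema:
{schema}

Article:
{article_text}
"""

def extract_entities_and_relationships(article_text, schema, call_llm):
    """Ask the LLM for structured entities/relationships from one full-text article."""
    prompt = EXTRACTION_PROMPT.format(schema=json.dumps(schema), article_text=article_text)
    return json.loads(call_llm(prompt))   # expected: {"entities": [...], "relationships": [...]}

def load_corpus_into_graph(corpus, schema, call_llm, graph_db):
    """Scan every article and store extracted triples with their article reference."""
    for article in corpus:                # corpus: iterable of {"pmid": ..., "text": ...}
        extraction = extract_entities_and_relationships(article["text"], schema, call_llm)
        for rel in extraction["relationships"]:
            graph_db.add_edge(rel["subject"], rel["object"],
                              predicate=rel["predicate"], pmid=article["pmid"])
```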
Okay, in our initial scan of the corpus, we got down to about 50,000 articles that mentioned colorectal cancer. We further filtered those by publications that were published after 1990. That's about the time when immunotherapies came on the scene. Then we have another aspect of the G³ AI technology, it can assign classification labels to articles. Two of those labels, clinical and functional, we used as filters. We only took papers with a clinical or functional label, and that got us down to 14,000 articles. That 14,000 article corpus we scanned with an entity and relationship schema, that is, extracting patients and the drugs that they were treated with, along with many other properties, genetic variants, phenotypes, etc.
We were able to extract a total of 530,000 entities and 610,000 relationships. Out of those 530,000 entities, 5,593 were individual colorectal cancer patients, real patients described in the literature diagnosed with colon cancer. Out of that patient pool, 459 of them were treated specifically with immunotherapies. One of the entities we extracted was drugs. We were able to also assign a drug class to those drugs, and one of those was immunotherapy, and that's how we were able to derive that patient cohort. So the 459 patient cohort we extract from the literature, that will be the cohort in the subsequent analysis that we're going to take a look at.
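To make the cohort-derivation step concrete, here is a minimal sketch using networkx as a stand-in graph store. The node and edge attributes are illustrative assumptions; the query mirrors the logic Jonathan describes, selecting colorectal cancer patients linked to at least one drug whose class is immunotherapy.

```python
import networkx as nx

G = nx.MultiDiGraph()
# Nodes carry a "kind" plus extracted properties; edges carry the relationship predicate.
G.add_node("patient_1", kind="patient", disease="colorectal cancer", sex="male", age=47)
G.add_node("example-immunotherapy", kind="drug", drug_class="immunotherapy")
G.add_edge("patient_1", "example-immunotherapy",
           predicate="treated_with", outcome="complete response")

def immunotherapy_cohort(graph: nx.MultiDiGraph) -> list[str]:
    """Return colorectal cancer patients with at least one immunotherapy treatment edge."""
    cohort = []
    for patient, attrs in graph.nodes(data=True):
        if attrs.get("kind") != "patient" or attrs.get("disease") != "colorectal cancer":
            continue
        for _, drug, edge in graph.out_edges(patient, data=True):
            if (edge.get("predicate") == "treated_with"
                    and graph.nodes[drug].get("drug_class") == "immunotherapy"):
                cohort.append(patient)
                break
    return cohort

print(immunotherapy_cohort(G))   # ['patient_1']
```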
Our first step that we went through was a dimensional reduction analysis. You see two faceted graph panels. Every dot on this graph represents a patient diagnosed with colon cancer and treated with immunotherapy. The colors on the graphs represent the patient treatment responses that were inferred by the AI. What I mean by "inferred by the AI," in the paper, you typically have some kind of natural language description of a patient's outcome from a given treatment, and we asked the LLM to take that natural language description and codify it into an enumeration of controlled strings. Those strings are the labels on the graph: for a treatment outcome of progressive disease, or negative response, or no response, or complete response. By summarizing the information in that way, we're able to statistically engage with it in a meaningful way.
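A minimal sketch of that summarization step, again assuming a hypothetical `call_llm` helper: the model is handed a fixed enumeration of outcome labels and asked to map the paper's free-text description onto exactly one of them, which is what makes the labels statistically usable downstream.

```python
RESPONSE_LABELS = [
    "complete response", "positive response", "partial response",
    "mixed response", "stable disease", "no response",
    "negative response", "progressive disease",
]

def codify_outcome(outcome_text: str, call_llm) -> str:
    """Map a natural-language outcome description onto one controlled label."""
    prompt = (
        "A paper describes a patient's response to treatment as follows:\n"
        f"{outcome_text}\n\n"
        "Answer with exactly one of these labels and nothing else: "
        + ", ".join(RESPONSE_LABELS)
    )
    label = call_llm(prompt).strip().lower()
    return label if label in RESPONSE_LABELS else "unclassified"   # guard against label drift
```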
The two facets from that dimensional reduction are non-responders, patients that did not respond to the treatment, and responders, patients that did respond: responders on the left, non-responders on the right. The treatment response breakdown in the non-responder facet is what you'd expect. The yellow-colored patients exhibited progressive disease as a treatment response, gray exhibited a negative response, and red exhibited no response, which is what we would expect to see for non-responders. In the responder facet, if you start on the right side of the graph and go to the left, we start with complete response, colored in green. Right next to that are patients that exhibited a positive response, in brown. Then we see partial response, mixed response, and stable disease; those are less cohesive clusters.
I think that was one of the really exciting things that came out of this analysis. These are very well-delineated linear clusters. For the most part, we can draw a vertical line right through our non-responder cluster. If you were to overlay those two facets, the yellow box on the left of responders would be where the non-responders would sit. For analysis purposes, we've started faceting them for non-responders and responders, because then we can overlay all manner of other patient-intrinsic properties, like demographics and genetics and phenotypes, and start to make observations about what's correlated with the non-responders and what's correlated with the responders.
Using this technology, we can accurately identify patients and determine outcomes from drug treatments that were described in the literature. This allows us to build very valuable patient cohorts that are derived from thousands of articles. Then, we can differentiate responders and non-responders by utilizing all the other types of entities that we extract, like demographics and genetics and phenotypes.
So, taking it one step further, let's actually overlay some of that additional patient-intrinsic information onto our two facets. The graph facet we're looking at now is the left side. All of the patients shown here responded positively to immunotherapy treatment. We were able to identify a cohort of men under 50 with right-sided cancer, with high tumor mutational burden and a microsatellite status of stable, that all responded well to immunotherapies. This cohort was spread across three publications with completely different authors from different institutions.
One of the things that makes it really interesting is that, in the literature, the general consensus is that a positive response to immunotherapy for colon cancer is typically correlated with left-sided cancer and a microsatellite status of unstable, or MSI-high. So this was a bit in contrast to the general understanding. One of the really unique things about the cohort is that it was only in men under the age of 50. Microsatellites, if you're not familiar with them, are short, repetitive sequences of DNA. Microsatellite instability is particularly important in colorectal cancer, and it basically occurs when the mismatch repair system is damaged, so errors in those repetitive sequences go uncorrected. It makes sense why that instability would be correlated with a positive response to immunotherapy: you're seeing a lot more genetic mutations. Microsatellite stable is where you can have a high mutational burden, but the mismatch repair system is still working correctly and hasn't been damaged.
This is an observation that's a little bit hidden in the literature right now, and it's easy to be hidden when we're talking about thousands and thousands of papers. This is really where we see the power of this technology. We can look across any quantity of papers at scale with a lot of precision around entities and their relationships, patients and their relationships to the drugs that they were treated with, along with all manner of other types of metadata. We're able to identify subpopulations of patients and identify clinical criteria to expand target patient populations using this technology.
For the approach and how we could deliver this as more of a generalized service: we can extract structured, real-world evidence data from our article corpus. Specifically, we can extract patient cohorts for any disease, or any gene, or really, using any criteria. We can extract all of that entity data and maintain all of the specific article reference associations. And we can have as many of these entity and relationship schemas as we want; we don't just have one that we have to work with. We can do very, very specialized properties.
In this colon cancer study, we had a base set of entities we go after, but then we added some very specific entities and properties around microsatellites and tumor mutational burden. We can go very specific to a disease, or we can go generic, or we can do both. We analyze that output. In this case, we were identifying new patient inclusion criteria for immunotherapy for colon cancer. Then we can deliver an actionable insight. In this case, it was a cohort that is, I think, a little bit hidden right now. So novel patient inclusion criteria can be used to identify more patients to target for therapy. That's really the large-scale value to be had with this particular application of the G³ technology.
MARK: All right, Jonathan, let me jump back on stage here. Thank you for that presentation. It's always exciting to hear the latest and greatest from you. First question for you is, you mentioned at the beginning that this is a representative example, and in no way is G³ just focused on oncology. Can you go deeper and talk about the topics and the scale that G³ can handle? I quite like your "widening the aperture" analogy or metaphor, because it allows more light to come in from different areas, but it still preserves the ability to focus, as I think you just said on that last slide. So if you could speak a little bit more to the types of things you're looking for, and can find.
JONATHAN: To date, we've used it to look at clinically relevant entities, like some of the ones I shared. Patients, their phenotypes, the drugs and therapies they were treated with. We're not limited by that at all. Really, the only limitation is the content of the 11 million article corpus we have. We can look for anything. We can look for any kind of entity. It could be a functional assay and its performance across a large population of patients. There's really not much limitation there. I'd also say we're not limited to the article corpus, so we can scan any kind of text data. This schema approach will work with other types of data. It could be EHR data, claims data, patient registry sources.
MARK: Yeah, so about that. If you can draw a comparison between, say, those other types of data — EHR, let's just choose that, for instance, that's very commonly used — and the literature, are they at odds with each other, or are they to be used and considered to be complementary to one another?
JONATHAN: I think they're definitely complementary, and I think there's a big value added by using the two in conjunction. EHR data are probably even larger in volume, but they really represent more real-time information about what's happening in the now, in a hospital, in a clinic. So it's very, very valuable information, but it can be lacking in critical scientific and experimental detail. It's also not going to have been subjected to something analogous to literature peer review. That's where bringing the literature corpus to the table in combination with EHRs offers a real value add. We can use the literature to extract critical scientific information for a patient cohort across thousands of publications. That type of information would be inaccessible in an EHR corpus. You could use that kind of information in conjunction with an EHR corpus to figure out what to scan the EHR corpus for in the first place.
MARK: Right? I mean, you're suggesting that there's multiple ways to pair them in series or in parallel, one before the other, even. It depends on what you're starting with, what your questions are, but there's a way to link them together to best effect.
JONATHAN: Yeah, exactly.
MARK: What I wonder about this is, sometimes the clinical literature can be taken for granted, because it's the air that we breathe as scientists and various different types of practitioners in this space. But you're illuminating how challenging it can be to find that information. It's not only time-consuming, it's sometimes so time-consuming that it's not even an endeavor that's selected. "We're never going to find this," say. What you're describing across the entire corpus allows this to be almost automated. The system that you set up just requires a question to be asked. I mean, there's obviously much more to it, but basically, you just have to have a will to ask that question. I've often thought about that. With respect to the power of AI, more broadly, it's increasingly less a question of, can we do it? It's more, what do we choose to do? Where do we choose to dedicate our energy? If you could speak to that in the context of real-world evidence and solving some of these real-world problems in drug development.
JONATHAN: Yeah, sure. Doing these large-scale scans across the corpus, and deriving entities and relationships into a graph model that you can derive insights from, opens some doors that are just not accessible otherwise. You know, look at the example we gave, 50,000 articles. I mean, that's 50,000 hours of read time, something like five or six years for one person. Then, how are you going to hold the millions of entities and relationships in mind at once? This is clearly a job for modern AI and computers. Also, we've all heard about hallucinations with LLMs. That is definitely an issue that needs to be accounted for. I feel like this technology really plays to LLM strengths: text extraction and text summarization. That's what we're doing here. We're not trying to infer a classification for a variant. That's where we get into the danger zone, where hallucinations become a problem. In our observations of LLM accuracy performance, it does reasonably well with text extraction and text summarization. So we're playing to the strengths, and tapping the literature for its full potential. There's value for target discovery, preclinical, clinical, postclinical. There are specific applications you can search the literature corpus for, and you're not going to get those insights without looking through it at scale.
MARK: You said the H word here, you said hallucinations. It's on a lot of people's minds, both for people who are in the know and for people who are, you know, casual outside observers. Can you speak to how you're controlling for that? You mentioned it here, let's go deeper into how you and your team are controlling for hallucinations? Although what you just said, we're in really good territory with what you're trying to do. But even still, how are you controlling for errors?
JONATHAN: So with the way we measure accuracy in this case, we build what we call a "truth set" of papers, that we can curate a set of entities and relationships from, and use that to measure things like recall, precision, F-score, accuracy, a standard set of performance metrics for utilizing an LLM or any kind of AI model. Some of the things we see — this is all very prompt-based engineering with LLMs — we will see the LLM misinterpret the prompt. Maybe it will call a structural variation a short length variation or something like that. The thing it extracted from the text will be correct, but it'll mislabel it. What we don't see is, "I'm going to invent a structural variation out of the blue that doesn't occur in this article that no one's ever seen before." We don't see that. That's across all of our entities and relationships. There are other applications that we've tried with LLMs, where we do see stuff like that, but I think, by really focusing on that text extraction piece first, that gets us into an area where the accuracy is reasonable for the insights we're trying to derive.
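As a concrete illustration of that kind of truth-set evaluation, here is a minimal sketch that compares extracted subject-predicate-object triples against hand-curated ones and computes precision, recall, and F1. The triples shown are placeholders, not real curated data.

```python
def score_extraction(predicted, truth):
    """Each item is a (subject, predicate, object) triple extracted from one paper."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Placeholder curated triples for one paper vs. what the LLM extracted.
curated = {("variant_X", "occurs_in", "patient_1"), ("patient_1", "treated_with", "drug_A")}
extracted = {("variant_X", "occurs_in", "patient_1")}
print(score_extraction(extracted, curated))   # {'precision': 1.0, 'recall': 0.5, 'f1': 0.666...}
```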
There's a second phase to this, where we've got all those entities and relationships in a graph database. That's where we can start to take a different AI approach, that looks like building models, probably more from scratch with very specific purposes, like looking for new gene-disease relationships based on graph topology. There's some exciting work to be done there. We're not quite there. We're really trying to get the entity relationship extraction piece right from the ground up.
MARK: Yeah, I like what you said. I don't even want to say they're hallucinations. They're just proper errors that you'd see, but that you can address and correct for as you adjust your prompts and the model, etc. In my experience as a lay person, I know well enough what you do, but I don't know how you do it. Oftentimes, when it gets it wrong, it gets it wrong in a meaningful way. Sometimes they're edge cases that you're asking it to assess, and wrong is, in that case, often relative. You know, you could see it go either way, and you can give a pass to the model for disagreeing, perhaps, with a human interpretation.
You also mentioned inferring. I know that there's some news about the ambition of AI understanding the whole of human inquiry, not just biomedicine, and "correcting errors" and drawing inferential conclusions across different things. What you're doing and talking about is very different. To be so bold, using AI, I think, would require a great deal of empiricism to check. We're still just talking about information and knowledge management. We're not actually testing things empirically, just in the context of an AI bubble. That empiricism is necessary, especially in clinical medicine. So I wonder if you know what I'm referring to, without naming names, and if you had any reaction to that bold comment.
JONATHAN: Yeah, at this point, we are not asking the LLM to come up with anything new. We're asking it to summarize natural language information in an article in a way that we can make statistically meaningful. The patient outcome example is a great one, where in an article, you'll maybe have a long-winded, natural language description of what occurred with this patient. We can't engage with that in a way that's statistically relevant, so we tell the LLM, hey, you read that, and we want you to assign — we give it an enumeration of possible string representations of that patient outcome. That's the only option the LLM has. You got to give us one of those, and you got to get it right. Then that gives us a means of doing a statistical study. But we are not asking the LLM to do something like infer that this patient is going to have this outcome beyond what's written in the literature. That's not part of this at all.
MARK: Right. Maybe one last question. And, Sarah, I invite you to the stage, because I've got one that will bring what Jonathan's talked about, and some things that you had talked about, too, together. But, Jonathan, a quick question from an audience member: you mentioned that you can ingest any accessible body of information. You suggested EHR. The question has to do with, say, abstracts, late-breaking abstracts, or otherwise, poster presentations as well.
JONATHAN: That's absolutely something we could do. There's some complexity there. Where do we find them? Where do we find them at scale? How do we reproducibly download them and extract them? We just need a text source. We can do some LLM OCR extractions, if it's in a PDF, those have their own host of issues associated with it, too, but definitely a possibility. It comes down to identifying the right source and going for it. I think we perform pretty well once we get into full-text, because they're formatted very similarly to a scientific article.
MARK: Yeah, I'd say, from a business perspective, it's just needing to know that that's of interest and there's potential value there. What I said before is, we're almost to the point where we can do anything we want, as long as we know it's valuable. That's where we should dedicate energy and turn our attention.
SARAH: Just to add to that, Mark, I can confirm that it is of value, especially within rare disease. What often happens is we'll see these individual case reports presented at conferences, and maybe they'll make it into the published literature in a peer-reviewed journal, maybe not. It might take 12 months, 18 months. So there's a time lag, too. Yeah, definitely beneficial.
JONATHAN: A quick question for you, Sarah. Do you see much rare disease showing up in bioRxiv? Do you find that a useful source of information?
SARAH: So, to be perfectly honest, I rarely refer to them. I occasionally do go into bioRxiv, and there is information there. It is usually earlier-stage, which is nice. To be perfectly transparent, I think, for me and my daily work, the more-used resource would be something like PubMed, where it has made its way to that peer-reviewed journal.
MARK: So, Sarah, I wanted to turn back our attention to variants, and I think you had mentioned this in the context of the promise of AI in better predicting the outcome. It's related to what Jonathan said: you're starting to get into grayer territory when you start to predict, as opposed to extract and present evidence. This is timely, because I think we're all well aware of DeepMind and AlphaMissense, and its predictive algorithms. My own understanding, from work that my team and I have done, is that it's really good at saying a benign variant is benign and really good at saying a pathogenic variant is pathogenic, but that's not the problem. The problem is with the VUS, and that's where it starts to have issues. It's unknown to me, at least, how improved the model is. I wonder, Sarah, in the context of de novo variants, do you think we'll ever get to a time in human history when we'll know every variant? We'll have empirical data from cases, say, and we'll just have hit an asymptote, and there's not any more to see, because we will have seen it all. Or do you think we're always going to have de novo variants, and there will be some need to rely on predictive mechanisms to interpret, let's say, the protein conformation or evolutionary conservation within that protein? Jonathan, I know you're interested in that. Sarah, you brought it up, so I'd like you to go first and talk about that need, and what you'd need to see to rely on those predictions. Then, Jonathan, if you could say how we might come to that state.
SARAH: Broadly speaking, every gene will be different. Some genes will be very large, so you have to think about saturating every base, basically, on that gene. At some point, theoretically, you should hit saturation for a given gene. How far away from today that is, I don't know. I think that there's always the potential, if you haven't completely hit that saturation point yet, that new variants will emerge. Then, as it relates to AI and predictive modeling: AI is evolving so rapidly that we're positioning ourselves for success, and we will know more. Will it be reliable? There's always going to be hesitancy around AI. Can people have confidence? I mean, trust takes time, and I think it will build, but it's hard to answer that affirmatively, one way or another. But I see a lot of potential.
MARK: Yeah, I like that you said trust takes time. It's also something to reflect on, that it's just information. Every piece of information has its own history and reliability, etc., etc. It's how you use it. It's not overusing something. It's couching it in the right context. Fortunately, we're amassing a lot of this information as a field, as you brought up before. We just want to make sure we harness it and make it usable and accessible to every practitioner. But, Jonathan, I know that you're interested in this. If you could touch on how we might do it, and also, if you have anything to say about how to rely on it.
JONATHAN: I think it's a harder problem to crack than a lot of people think. We certainly need a gene-specific model strategy. Each gene, and how it relates to disease pathways, metabolic pathways, and all of that, can be dramatically different. This isn't quite what you're talking about, but ultimately, to go beyond the classification of a variant and get to the point where we could, with a model, understand phenotypic consequences in a patient, I think we need way more data. We don't have an experimental means of doing that consistently, not that I know of. There's a lot of work in real-world data generation that needs to be done to get there. There are things models can do really well, like predict protein function from amino acid sequence. That nut has pretty much been cracked, and it can be done with a pretty high level of accuracy.
Where it gets murkier is when we're talking about a gene sequence. What's the consequence of that change in an actual human organism? Now we have many systems interacting, and I don't think there's enough insight from the training data to inform the model to get that right. I could see us going with gene-specific models, and getting much better at classifying the VUSs. We'll get there eventually. We're a little ways out, but I do think we'll get there. We want to get to where this change ends in this phenotype in a human. That's a ways out.
MARK: You said human twice there. My next question to follow up was going to be about cellular or organismal avatars, but like I said, it's about the trust. Every piece of data has a backstory, and how much confidence do you have in it? It's about amassing and assessing and checking your predictions in the future, before you can have that reliability and confidence to use that information.
JONATHAN: Well, I think taking a model system, some type of immortalized cell line or something like that, and just going to town and getting a ton of experimental data that relates genotype changes to phenotype, and using that as a training set, working in a controlled environment, that would be a path to getting there.
MARK: Maybe one last question here to wrap everything up together is, Sarah, we, in our conversation, had focused on the genetic variants, but I know that there's value in understanding other aspects of the patient journey for determining endpoints and watching the time course of disease, etc. Can you speak to your approach to getting some of that non-genetic patient information to help inform the work that you do in better identifying and diagnosing these individuals?
SARAH: Sure. We have a clinical development program where we have very specific endpoints, but in the context of a clinical trial, you are restricting what you're looking at. There are broader endpoints that are likely relevant to people with not only TK2D, but mitochondrial diseases in general. Due to the heterogeneity of these diseases, those might be different for different people. We are not there yet. There's a lot to be uncovered. There are hundreds of mitochondrial mutations that exist, and everyone looks different. That's an area of research. I know CHOP is doing a lot of research to that end, to really uncover what a good endpoint would look like for a specific patient type. We're continuing to do that work as well behind the scenes.
MARK: Yeah. Jonathan, I'll give you the last word. We talked about widening the aperture with G³, but it also allows you to do expansive studies. Sarah talked about hundreds of different diseases and looking at phenotypic outcomes and that sort of thing. I feel like that could be something that could only happen through a large effort from G³, that would otherwise defy manual development.
JONATHAN: That's the part for me that gets really exciting. I mean, we could work through a catalog of rare diseases, at least ones that are represented in the literature, and automate as far as we can, going all the way through: construct the patient cohort, construct the dimensional analysis results, and scale up the process of looking for these types of impactful insights. I think that's a realistic opportunity with the technology where it's at.
MARK: All right. So lots of hope, as you brought up, Sarah. Lots of hope for these rare disease patients, and other patients, like the oncology patients you brought up, Jonathan. Thank you both for dedicating your time here. Thank you, audience members, for joining us. Before signing off, we'll just ask you all to stay on to answer that two-part survey to provide us with that valuable feedback. Just as a reminder, the recording will be made available by email after the webinar. Thank you, Jonathan. Thank you, Sarah. Thank you, audience. Have a great day!
SARAH: Thank you.
We help provide insights into key genetic drivers of diseases and relevant biomarkers. By working together to understand this data, we enable scientists and researchers to make more informed decisions on programs of interest. To learn more about how we can partner together to find your genomic variant solutions, we invite you to click on the link below.