The AI in Business Podcast

Using Artificial Intelligence to Unlock Hidden Insights in Genomic Research - with Dr. Mark Kiel of Genomenon

The AI in Business Podcast • Daniel Faggella (Emerj)

Monday, November 24, 2025 • 27 min

What You'll Learn

  • Clinical and scientific literature contains a wealth of real-world data and insights on patient journeys, demographics, lab values, treatments, and outcomes that can inform drug development and clinical diagnostics.
  • Manually accessing and applying these insights is a slow, error-prone process due to the sheer volume and heterogeneous nature of the literature.
  • AI-powered solutions, including natural language processing, knowledge graphs, and human-in-the-loop curation, can help extract, organize, and connect the relevant information to enable faster, more reliable insights.
  • Curation is an essential AI capability, with human experts playing a key role in developing, testing, and iteratively improving the models to ensure the quality and reliability of the final insights.
  • Leveraging the real-world evidence in the literature can help pharmaceutical companies enhance disease understanding, guide trial design, and support label expansion.

Episode Chapters

1. Introduction

Overview of the guest, Genomenon, and the value of clinical and scientific literature as a source of real-world evidence.

2. Challenges in Leveraging Clinical Literature

Discusses the challenges pharmaceutical companies face in turning the vast amount of information in the literature into actionable insights, including the heterogeneous nature of the data.

3. AI-Powered Solutions

Explains the role of natural language processing, knowledge graphs, and human-in-the-loop curation in extracting, organizing, and connecting the relevant information from the literature.

4. The Value of Real-World Evidence

Discusses how leveraging the real-world evidence in the literature can benefit pharmaceutical companies in areas like disease understanding, trial design, and label expansion.

AI Summary

This episode discusses how pharmaceutical companies can leverage the wealth of real-world data and insights hidden in clinical and scientific literature to accelerate drug discovery and development. The guest, Dr. Mark Kiel of Genomenon, explains the value of this literature as a source of real-world evidence, the challenges in accessing and applying these insights at scale, and how AI-powered solutions can help overcome these challenges. The conversation covers the role of natural language processing, knowledge graphs, and human-in-the-loop curation in extracting actionable insights from the literature.

Topics Discussed

#Real-world data and evidence • #Genomic and clinical literature • #Natural language processing • #Knowledge graphs • #Human-in-the-loop curation

Frequently Asked Questions

What is "Using Artificial Intelligence to Unlock Hidden Insights in Genomic Research - with Dr. Mark Kiel of Genomenon" about?

This episode discusses how pharmaceutical companies can leverage the wealth of real-world data and insights hidden in clinical and scientific literature to accelerate drug discovery and development. The guest, Dr. Mark Kiel of Genomenon, explains the value of this literature as a source of real-world evidence, the challenges in accessing and applying these insights at scale, and how AI-powered solutions can help overcome these challenges. The conversation covers the role of natural language processing, knowledge graphs, and human-in-the-loop curation in extracting actionable insights from the literature.

What topics are discussed in this episode?

This episode covers the following topics: Real-world data and evidence, Genomic and clinical literature, Natural language processing, Knowledge graphs, Human-in-the-loop curation.

What is key insight #1 from this episode?

Clinical and scientific literature contains a wealth of real-world data and insights on patient journeys, demographics, lab values, treatments, and outcomes that can inform drug development and clinical diagnostics.

What is key insight #2 from this episode?

Manually accessing and applying these insights is a slow, error-prone process due to the sheer volume and heterogeneous nature of the literature.

What is key insight #3 from this episode?

AI-powered solutions, including natural language processing, knowledge graphs, and human-in-the-loop curation, can help extract, organize, and connect the relevant information to enable faster, more reliable insights.

What is key insight #4 from this episode?

Curation is an essential AI capability, with human experts playing a key role in developing, testing, and iteratively improving the models to ensure the quality and reliability of the final insights.

Who should listen to this episode?

This episode is recommended for anyone interested in real-world data and evidence, genomic and clinical literature, and natural language processing, as well as anyone who wants to stay updated on the latest developments in AI and technology.

Episode Description

Today's guest is Dr. Mark Kiel, Chief Science Officer and Founder at Genomenon. Genomenon is a genomics intelligence company that unlocks real-world evidence from biomedical literature to help pharmaceutical and clinical diagnostics companies inform precision medicine, accelerate patient diagnosis, and guide trial design and label expansion. Mark joins Emerj Editorial Director Matthew DeMello to explore how AI can streamline the extraction, organization, and interpretation of genomic and clinical data, enabling faster, more accurate decision-making in pharmaceutical R&D. He also shares practical approaches to integrating AI with human curation, improving workflow efficiency, and scaling insights for rare disease diagnosis, trial design, and drug development strategy. This episode is sponsored by Genomenon. Learn how brands work with Emerj and other Emerj Media options at emerj.com/ad1. Want to share your AI adoption story with executive peers? Click emerj.com/expert2 for more information and to be a potential future guest on the 'AI in Business' podcast!

Full Transcript

Welcome, everyone, to the AI in Business podcast. I'm Matthew DeMello, Editorial Director here at Emerj AI Research. Today's guest is Mark Kiel, Chief Science Officer and Founder at Genomenon. Genomenon is a real-world evidence and genomics intelligence company. Genomenon helps pharmaceutical and clinical diagnostics companies to inform precision medicine efforts and accelerate patient diagnosis by unlocking real-world evidence in biomedical literature. By combining AI with deep scientific expertise, Genomenon helps enhance disease understanding, guide trial design, and support label expansion. Dr. Kiel joins us on today's show to break down how pharmaceutical teams can turn genomic and clinical literature into structured, actionable evidence, and what it takes to make that information usable at scale. He explains where real-world data in the literature delivers unique value, why manual review bottlenecks limit what teams can ask and answer, and how AI systems can finally make those insights searchable, consistent, and reliable. Our conversation also covers the practical workflow changes that matter inside large life sciences organizations, how curation and AI work together, where human expertise remains essential, and what measurable advantages leaders can bring back to their R&D and clinical teams, from faster evidence gathering to new levels of disease coverage and trial design precision. Today's episode is sponsored by Genomenon. But first: interested in putting your AI product in front of household names in the Fortune 500? Connect directly with enterprise leaders at market-leading companies. Emerj can position your brand where enterprise decision makers turn for insight, research, and guidance. Visit emerj.com slash sponsor for more information. Again, that's emerj.com slash S-P-O-N-S-O-R. Also, are you driving AI transformation at your organization? Or maybe you're guiding critical decisions on AI investment strategy or deployment? Without further ado, here's our conversation with Mark. Mark, welcome to the program. It's a pleasure having you.
Yeah, thank you so much, Matt. Pleasure to be here.
Absolutely. We've had a lot of discussions on the show this year within the realm of drug discovery and clinical research. The answers to critical questions about patient populations, treatment responses, and disease mechanisms already exist, as we all know and the audience knows. The problem is just that it's scattered across millions of pages of peer-reviewed literature. And yet, for most pharmaceutical organizations, accessing and applying those insights remains a slow, manual, and error-prone process. Before we get into how AI can change that equation, I think it's important to just start with the foundation itself. What real value do you see in clinical literature as a source of real-world evidence, and how can it meaningfully shape the path from discovery to commercialization?
Yeah, it's a great question. And you had in your question some context. I'd like to clarify some of that context before I dive into the value of the literature. So the way I conceive of real-world data and real-world evidence and real-world insight is on a continuum. The real-world data being the raw material, the individual data points about individual patients, the clinical lab values, the phenotypes, etc. Real-world evidence is when that information gets organized into a database or some larger data structure. That's when you can really start to make sense of what that information means.
And then from that emerge real-world insights, which is to say answers to the questions that drive business value. And further setting that context, all of that information, real-world data, real-world evidence, is everything that happens in medicine outside of the context of highly structured clinical trials. Clinical trials are obviously essential for drug development, but there's a whole wealth of information and insight that's potentially achievable outside of the context of those rigorous trials. Some of those myriad sources for real-world data include EHRs, which is a common go-to, as well as claims data and registry data. But an important and often taken-for-granted resource is the medical literature, as you stated. So what I'd really like to clarify for your listeners is that there are real patients captured in these case reports, case studies, families, and cohorts. Their real patient journeys, and all of the real-world data that reflects those journeys, the demographic data, the way that they present in the clinic, the laboratory values, including the genotypes but also just routine laboratory values, the treatments and their outcomes, all of that information, particularly for rare disease and precision oncology patients, has been painstakingly aggregated by the clinician who saw them, the subject matter expert who works on that disease day in, day out. All of that information comes pre-aggregated and pre-vetted and has many applications in both pharma for drug development, as well as in clinical diagnostics. And I hope to go through several of those use case applications.
Absolutely. I know another gigantic use case outside of the literature that's hugely important across healthcare and life sciences is de-identified patient data. And what is the literature except de-identified, crystallized patient data for that explicit purpose? So just to tie a couple of loose ends from different series that we're conducting right now on clinical trials, with that backdrop, what challenges are you seeing pharmaceutical companies face in turning genomic and clinical literature into actionable insights?
Yeah. The first is a challenge and an asset: the sheer volume of this information. The way I conceive of the real-world data in the clinical and scientific literature is that it's a summary of all of the insight, the collected and peer-reviewed insight, across the many decades of inquiry. So there's a lot of information, a lot of patient lives are patterned in those studies, and just harnessing it, getting access to it, appreciating the heterogeneous ways that those insights, those real data, are patterned. That could mean the different types of articles, the different types of supporting databases, and just basically the way that all of that information is patterned in human language. So that sheer volume, the access to the resources, but also the complexity of the way that that information has been laid out, all of those create challenges that up until this point had to be addressed manually. And everybody knows, as you noted in your intro, manual review doesn't scale. And when you've got a volume problem and the only way to parse it was manually, you are prevented from asking and answering a lot of these questions, which is why we're here to talk about how AI is able to surmount some of those challenges by obviating the need to manually review a lot of that information.
Absolutely. And I think for this audience, too, as we're getting kind of closer to talking about the AI capabilities, I think what might occur first for them is, especially for the deterministic side, definitely, you know, machine learning, document processing, what goes into there, simply to read a lot of this material and drink it in. That seems like, oh, you don't need anything fancy. You don't need ChatGPT. You don't need anything generative. But it would occur to me you need something a little bit more advanced to prioritize and really understand, hey, this research, not that research, is the most relevant. Take us a little bit through: what are the AI capabilities that create that prioritization?
Yeah, that's great. It's especially true when you're thinking about genetics. With the complexities of, say, genetic language, for finding the right references to begin with, you can't simply ask a language model right now, show me all of the ways that this variant is described, and organize all these articles, and then help me answer questions within each of those articles. That is a challenge that's on the frontier of what off-the-shelf GPTs are able to do. So the tools that we've developed not only aggregate that information and extract the relevant units, the genes and the variants, but also the patient information, demographic data, phenotypes, et cetera. We have built tools and capabilities that first find that information within the aggregated content. So identifying that information, then extracting it out of those articles, and organizing and annotating it into those real-world evidence data sets. So we have a sophisticated natural language processing capability, what we call genomic language processing, that first finds all that information very sensitively. But then we leverage large language models, LLMs and LLM-RAG, to better understand those entities as we found them in context and reconcile them to standard ontologies, so we can start to have interoperable data for individual questions that we ask about this disease or that disease. And as I said in answer to the previous question, this is amortized across all human disease and the full breadth of human inquiry to understand all of those diseases. So maybe the last thing I'll say, as we're talking about going into the next stage of extracting benefit out of this data, is that it's not just finding those things and not just putting them individually into their context, but drawing connections or relationships across them in the form of a knowledge graph, both to recover the data, so that when there's a user query you can be sure that you're finding any and all of the relevant information to answer that question, and to make causal linkages for things that are directly patterned in the literature as well as things that are indirectly suggested in the literature. So that's, as I say, a frontier that we're pushing on now: to not just have data recovery, where we recognize what information is there, but also put the data together to make discoveries using AI capabilities.
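
As an aside for readers who want to picture that pipeline, here is a minimal Python sketch of the three stages described above: find gene and variant mentions, reconcile free-text phenotype mentions to a standard ontology, and connect the results in a small knowledge graph that can be queried across articles. Everything in it, the abstracts, the regexes, the HPO-style codes, and the lookup table standing in for the LLM/RAG reconciliation step, is a hypothetical simplification for illustration, not Genomenon's implementation.

```python
import re
from collections import defaultdict

# Toy abstracts standing in for full-text articles. PMIDs, variants, and
# phenotypes here are invented examples, not curated data.
ABSTRACTS = [
    ("PMID:0000001", "A 4-year-old male with seizures and developmental delay "
                     "carried the SCN1A variant c.2589+3A>T."),
    ("PMID:0000002", "We report two siblings with epileptic seizures harboring "
                     "SCN1A c.2589+3A>T; both responded to stiripentol."),
    ("PMID:0000003", "A child with muscular hypotonia and the MECP2 variant "
                     "c.473C>T is described."),
]

# Highly simplified "genomic language processing": a regex for gene symbols
# that precede an HGVS-style cDNA variant. Real extraction handles far more
# notation (protein changes, rsIDs, legacy nomenclature, and so on).
GENE_RE = re.compile(r"\b[A-Z][A-Z0-9]+(?=\s+(?:variant\s+)?c\.)")
VARIANT_RE = re.compile(r"c\.[0-9_+\-]+[ACGT]?>?[ACGT]?")

# Tiny lookup standing in for the LLM/RAG step that reconciles free-text
# phenotype mentions to a standard ontology; the HPO-style codes are illustrative.
PHENOTYPE_ONTOLOGY = {
    "seizures": "HP:0001250",
    "epileptic seizures": "HP:0001250",   # synonyms collapse to one concept
    "developmental delay": "HP:0001263",
    "muscular hypotonia": "HP:0001252",
}

def extract(text):
    """Return (gene, variant, [ontology phenotype codes]) for one abstract."""
    gene = GENE_RE.search(text)
    variant = VARIANT_RE.search(text)
    phenotypes = [code for term, code in PHENOTYPE_ONTOLOGY.items()
                  if term in text.lower()]
    return (gene.group() if gene else None,
            variant.group() if variant else None,
            phenotypes)

# Knowledge graph as adjacency sets keyed by (gene, variant), with typed edges.
graph = defaultdict(set)
for doc_id, text in ABSTRACTS:
    gene, variant, phenotypes = extract(text)
    if gene and variant:
        graph[(gene, variant)].add(("evidenced_by", doc_id))
        for code in phenotypes:
            graph[(gene, variant)].add(("associated_phenotype", code))

# Query: all evidence and phenotypes linked to one variant, aggregated across
# the corpus instead of read paper by paper.
for relation, target in sorted(graph[("SCN1A", "c.2589+3A>T")]):
    print(relation, target)
```

In a production system the regexes would be replaced by trained extraction models, the lookup table by retrieval-augmented ontology mapping, and the dictionary of sets by a real graph store, but the shape of the flow, extract, normalize, link, then query, stays the same.
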
Absolutely. And I always say in these episodes, you know, AI 101, when you're first getting started, manual processes are the enemy. And when you get to your black belt, it's actually a lot different, in that manual processes are not the enemy. It's that you want to choose what your manual processes are. You want to make those the things where your insights and judgment are the most valuable, or where your passion comes out, if maybe your workflow isn't necessarily life or death. But we're in healthcare and life sciences, so a little bit more intense, a little bit more serious. But for what you were describing there with the knowledge graphs, it sounds like at some point the baton gets handed back to the human, and it needs some feedback on, hey, here's me, the system, here's which connections I think are most important, and then it has to get that PhD-level feedback. Tell us: just where does the manual process end there? What does the human-in-the-loop experience look like? Is that for your team, or is that more for the users?
That's a fantastic question. And it allows me to make what I wonder is a controversial point, which is to say that curation is actually an AI capability. So for all of the steps that I described, we have the great benefit of having a large team of curators. They're very well trained in the ways of genetic curation, but also clinical information and annotation. And so they're used in that context as expert curators in multiple stages. First, in developing the data sets that are used to train those models and test those models. They're put to purpose in iterating and improving those, you know, setting up the models, but then going through and checking the results and incrementally improving them. And then, depending on the nature of the final deliverable and the use, our curators will review that information. Examples include regulatory submission and publication. If this is going toward making a multi-billion-dollar drug development decision, you had better be sure that the data coming out of the model is accurate and traceable and defensible from the references. So those scientific curators that we have on the team, as well as the scientists themselves, stand on the shoulders of the data that's produced by our AI systems, fit for purpose, depending on what our pharma clients and clinical diagnostic clients need.
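
The "curation is an AI capability" idea also lends itself to a small sketch: below, high-confidence extractions flow through automatically, low-confidence ones are routed to an expert, and each curated decision is kept as labeled data for the next round of model training and evaluation. The threshold, records, and decision table are all hypothetical; the point is the shape of the human-in-the-loop routing, not Genomenon's actual system.

```python
# Hypothetical model extractions: each carries a claim and a confidence score
# produced by the automated pipeline. Variants, claims, and scores are invented.
extractions = [
    {"variant": "SCN1A c.2589+3A>T", "claim": "associated with Dravet syndrome", "confidence": 0.97},
    {"variant": "MECP2 c.473C>T",    "claim": "associated with Rett syndrome",   "confidence": 0.62},
    {"variant": "BRCA1 c.68_69del",  "claim": "responds to PARP inhibition",     "confidence": 0.81},
]

REVIEW_THRESHOLD = 0.90   # below this, a curator re-checks the claim at the source

def curator_review(item):
    """Stand-in for an expert curator verifying a claim against the cited paper.
    In a real workflow this is a person; the decision table is simulated here."""
    decisions = {"MECP2 c.473C>T": True, "BRCA1 c.68_69del": False}
    return decisions.get(item["variant"], False)

auto_accepted, curated, training_examples = [], [], []

for item in extractions:
    if item["confidence"] >= REVIEW_THRESHOLD:
        auto_accepted.append(item)            # high confidence flows straight through
        continue
    item["curator_confirmed"] = curator_review(item)
    curated.append(item)
    # Every curated decision doubles as a labeled example for the next round
    # of model training and evaluation, which is how curation keeps improving the AI.
    training_examples.append({"text": f"{item['variant']} {item['claim']}",
                              "label": item["curator_confirmed"]})

print(f"{len(auto_accepted)} auto-accepted, {len(curated)} curator-reviewed, "
      f"{len(training_examples)} new labeled examples for retraining")
```

For deliverables like a regulatory submission or a publication, the same pattern simply tightens: the review step applies to everything, regardless of model confidence.
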
Makes a lot of sense. I think then, once you really get a sense of, okay, here's what the end game is going to feel like to the end user, then they have to go sell that back upstairs, as it always goes. I think the obvious benefit here is the reduction of time and effort anytime you're taking down a manual process. But what measurable benefits, even outside of the time spend, are we seeing leaders take from these systems to really sell back upstairs to the folks that are farther away from AI?
Yeah. Cheaper, better, faster, more, right? Right, right. The time and the cost savings, those are true. But let's talk a little bit more about the things that are not possible without AI, and that scalability I mentioned before. The ability to scale allows you to ask questions that you couldn't have otherwise conceived of being possible before. So what we're able to do from the real-world data in the literature, and our ability to access that information with our AI capability married to our curators, is catalog every human disease, and specifically every patient ever described with any one of those diseases, and all of those characteristics that I talked about earlier, the demographic, clinical, laboratory, treatment, and outcome data, all of that information very granularly captured using those AI capabilities in an automated way. So when you talk about the measurable benefits, you're not just talking about the time savings and the expense that would attend a manual extraction. People wouldn't even conceive of trying to do that for one disease, let alone for multiple different diseases. So when you think about that measurable benefit, it's the number of patients that you can extract, and the longitudinality of their patient journeys, and the granularity of the data that you can extract, as well as the customizability of the information that you're pulling out. Every disease has its own idiosyncrasies, different lab values, different test results, different treatments, and different endpoints that you're looking for. So all of that is possible at the scale that we're talking about only by leveraging AI capability. So I'd say the principal benefit is making what otherwise would be inconceivable relatively routine.
Absolutely. I think, even more than 2025 being the year of agentic or whatever the big headline is now kind of for the mainstream media normies out there, I think really, internally, the headline is that 2025 is the year of small-i infrastructure more than anything. Because really, the value you're saying here is, oh, yeah, we'll save you time, we'll save you costs, you're not doing a manual process anymore. But the real value is setting the foundation of scalability. And that's what we're really talking about, especially in these spaces across industries, as AI, small-i infrastructure, beyond the hammers, the nails, the physicals, the metal servers, the data centers, et cetera. So we've got our benefits together. Hopefully, we're upselling this to management. We're starting to see that foundation for scalability really be set within the life sciences enterprise. There's that sense of, you know, we've only gotten started on this beachhead right now. It's going to look a lot different a couple of years from now, not only when those organizations have built those systems of scale for which we have the opportunity now, but also when the entire industry has caught up. I always love, and this dates me, I'm in my late 30s, but it reminds me of internet adoption and seeing all the houses on the street go from ethernet to Wi-Fi. And we knew even 20, 25 years ago, oh yeah, there'll be this thing called Wi-Fi. Maybe I've seen one person have that, but everybody's got ethernet right now. And then you would see, five, eight years later, oh yeah, the whole block has Wi-Fi. And that makes for a different experience with the technology. What does the Wi-Fi moment look like for AI capabilities for genomic intelligence, when they're fully integrated into pharmaceutical R&D workflows?
Yeah, I've heard this said before. I think I'm going to butcher the actual quote, but I think the concept still stands: if you think you can automate drug development, that's as challenging as automating a trip to Mars. Right? So many individual pieces, and our AI is not superintelligent yet. It's intelligent in lanes. And so I would actually invite the speaker of that quote to conceive of how you could break out the trip to Mars into individual chunks. There are aspects of drug development that are routinizable. Individually, right now, they're challenging, but if you break it out into its component parts, there are many aspects, particularly those that have to do with information gathering and assessment, that do lend themselves to automation when you have the access, when you have the ability to scale, extract that data, and start to synthesize that data, and especially when you can marry it efficiently with curation. So I'm thinking of applications like divining endpoints. What's the optimal endpoint for this trial? What are the inclusion criteria for your trial? They're typically genetic biomarkers. Cataloging all of those and assessing them, that is routinizable with AI married to curation. When you're going out to regulatory bodies, say, for your actual submission, helping understand how many patients there are for any one given disease, the prevalence, that is paramount to making decisions about which drug program to pursue next. And that's fundamentally an information-gathering challenge. So when you think about doing the wet lab work, the empirical work, those are challenging stages on your trip to Mars. But a lot of that information, those questions that are germane to information assessment, those can be routinized. So the future that I'm envisioning, my Wi-Fi moment, will be when we can automate those aspects of drug development that are automatable and then dramatically scale up our ability to develop new drugs for more and more indications, which otherwise would not be economical. In rare disease, there aren't enough patients, or there's an underappreciation of how many patients there are, and it's going to cost too much to set up the infrastructure for drug development. If we can determine that there are more patients and reduce the cost to embark on some of those more routinizable aspects of drug development, we're going to see a flurry of new drugs in the coming years here, especially when you marry that to our ability to sequence faster and to sequence more individuals at scale in programs such as newborn sequencing initiatives. You can imagine how much more straightforward it will be to find those patients, understand what diseases they have, and know a priori, based on that data, where to turn your attention to develop new drugs for otherwise invisible rare indications.
Absolutely. And I think, again, maybe this is where folks are at kind of their blue belt, if I'm staying in the karate metaphor of AI. But you come to understand that there are certain ironic symmetries between what happens to your human team, which is that the silos come down, versus how you need to manage, you know, what AI tools you're using. That can be very components-based, as you're saying. A comparison, and let me know if I'm totally off base with this: when the generative AI explosion happened, it felt like, especially on the show, for the next nine months the entire conversation was foundational models and bespoke models. You'll have foundational models that, you know, are Noam Chomsky. They understand the whole of human language, but they make really bad call center agents. And you'll need smaller models that maybe don't understand the breadth of human language, but they are fine-tuned to answer questions in a call center. And just in that same way, you're breaking off components and you're solving AI with AI. We do this at Emerj with our fact-checking systems.
And we're already encountering that advice we hear from guests over and over again of, you know, it's going to be hard to stomach, it's going to be hard to sell, but you want to offset one AI model with another. It seems like you're really transplanting the silos that are between your human teams, which are really unhelpful, and you're trying to find the right silos for these AI tools to really hone their focus on specific tasks that get closer and closer to end users and clients. Is that basically kind of the comparison you're making with component pieces?
Yeah, and I'd clarify that it comes in two flavors: you're starting to constrain the content and the questions that you're asking. So we have talked about finding the right articles, so that you're not generally trained on science writ large, but you know what your field of view is, what the specific scope is, what the content is in those articles, and how to ask the context-specific questions to extract the most meaningful information, so that you're not thinking generically, you're thinking very specifically about those questions and where the answers are going to come from.
Right, and in no small part, it seems that that's really a huge role humans are going to be playing going forward, as we were saying, kind of with human-in-the-loop and making sure that there's expertise and that feedback to really verify these systems. But I think it's purely a human task to really decide, okay, as those silos move from where they are in our human teams to the models, the tools that we have to offset with each other to make sure that the end product is accurate and beneficial. I think that's purely going to be a human judgment. It'll be assisted more and more by AI as we go. But for all the talk we hear today of, oh, hey, humans are going to be displaced from this, from this, from this, yes, a lot of human displacement. But that, I think, is where long term we're going to see a lot of human value. A machine can't do that yet. And we'll put a "yet" on that. That's at least one of the places we don't see it moving anytime soon. But Mark, really fascinating stuff, especially when we're dealing in anything as complex as genomics and genomic data. Thank you so much for being on the show this week.
Yeah, thank you so much, Matt. It was fun.
Wrapping up today's episode, I think there were three critical takeaways for leaders overseeing data, R&D, and clinical strategy across the life sciences enterprise. First, clinical and genomic literature already holds deep, real-world patient data, but most teams lack the structure to use it effectively. Treating the literature as an organized data asset expands what analysis and decision-making can look like. Second, purpose-built AI tools for genomics do more than speed up review. They extract variants and phenotypes with sensitivity, align data to standard ontologies, map relationships through knowledge graphs, and pair it with expert curation to provide evidence that is traceable and decision-ready. Finally, the long-term gain is scalability. As more patient journeys enter the published record, teams with strong data foundations will be able to automate the repeatable steps in development, estimate patient populations with greater accuracy, and choose programs with more confidence. This is the path toward a modernized R&D workflow. Are you driving AI transformation at your organization? Or maybe you're guiding critical decisions on AI investments, strategy, or deployment?
If so, the AI in Business podcast wants to hear from you. Each year, Emerj AI Research features hundreds of executive thought leaders, everyone from the CIO of Goldman Sachs to the head of AI at Raytheon, and AI pioneers like Yoshua Bengio. With nearly a million annual listeners, AI in Business is the go-to destination for enterprise leaders navigating real-world AI adoption. You don't need to be an engineer or a technical expert to be on the show. If you're involved in AI implementation, decision-making, or strategy within your company, this is your opportunity to share your insights with a global audience of your peers. If you believe you can help other leaders move the needle on AI ROI, visit emerj.com and fill out our Thought Leader submission form. That's emerj.com; click on Be an Expert. You can also click the link in the description of today's show on your preferred podcast platform. That's emerj.com slash expert1. Again, that's emerj.com slash expert1. We look forward to featuring your story. If you enjoyed or benefited from the insights of today's episode, consider leaving us a review on Apple Podcasts and let us know what you learned, found helpful, or just liked most about the show. Also, don't forget to follow us on X, formerly known as Twitter, at Emerj, and that's spelled, again, E-M-E-R-J, as well as our LinkedIn page. I'm your host, at least for today, Matthew DeMello, Editorial Director here at Emerj AI Research. On behalf of Daniel Faggella, our CEO and Head of Research, as well as the rest of the team here at Emerj, thanks so much for joining us today. And we'll catch you next time on the AI in Business podcast.
[Outro music]
