The AI in Business Podcast

"Waking Up" Data in Clinical Workflows with AI - with Mathew Paruthickal of Sanofi

The AI in Business Podcast • Daniel Faggella (Emerj)

Tuesday, December 2, 2025 • 18m

What You'll Learn

  • Achieving the right balance between structured and unstructured data is crucial, requiring a 'crawl, walk, run' approach
  • Embedding governance, explainability, and auditability into AI systems from the beginning is critical for regulated industries like life sciences
  • Human experts must remain in the driver's seat, validating the medical accuracy and consistency of AI-generated outputs
  • Regulatory precision, patient safety, and compliance are non-negotiable priorities when scaling AI across life sciences teams
  • AI can accelerate and augment workflows like pharmacovigilance and medical writing, but human judgment and oversight are essential

Episode Chapters

1. Introduction

The host introduces the guest, Mathew Paruthickal, and the topic of how life sciences organizations can move from isolated digital tools to orchestrated, interoperable systems.

2. Balancing Structured and Unstructured Data

The discussion covers the key barriers to connecting structured and unstructured data, and the importance of a 'crawl, walk, run' approach.

3. Embedding Governance and Explainability

The guest explains the need to build governance, explainability, and auditability into AI systems from the start to ensure trust and accountability.

4. The Role of Humans in AI-Powered Engineering

The conversation explores how human experts must remain in the loop, validating AI outputs and ensuring regulatory compliance and patient safety.

5. Scaling AI Across Life Sciences Teams

The discussion covers the key priorities and use cases for AI in life sciences, such as pharmacovigilance and medical writing.

AI Summary

This episode explores how life sciences organizations can move from isolated digital tools to orchestrated, interoperable systems that enable structured and unstructured data to drive engineering workflows and operational decision-making, especially in highly regulated contexts. The discussion covers key barriers, such as balancing structured and unstructured data, building in governance and explainability from the start, and ensuring human oversight and accountability. The guest also shares insights on how the discipline of engineering is evolving to prioritize compliance, auditability, and real-time decision-making, with AI playing a supporting role to augment and empower human experts.

Key Points

  1. Achieving the right balance between structured and unstructured data is crucial, requiring a 'crawl, walk, run' approach
  2. Embedding governance, explainability, and auditability into AI systems from the beginning is critical for regulated industries like life sciences
  3. Human experts must remain in the driver's seat, validating the medical accuracy and consistency of AI-generated outputs
  4. Regulatory precision, patient safety, and compliance are non-negotiable priorities when scaling AI across life sciences teams
  5. AI can accelerate and augment workflows like pharmacovigilance and medical writing, but human judgment and oversight are essential

Topics Discussed

#Data integration #Explainable AI #Regulatory compliance #Human-AI collaboration #Engineering workflows

Frequently Asked Questions

What is "'Waking Up' Data in Clinical Workflows with AI - with Mathew Paruthickal of Sanofi" about?

This episode explores how life sciences organizations can move from isolated digital tools to orchestrated, interoperable systems that enable structured and unstructured data to drive engineering workflows and operational decision-making, especially in highly regulated contexts. The discussion covers key barriers, such as balancing structured and unstructured data, building in governance and explainability from the start, and ensuring human oversight and accountability. The guest also shares insights on how the discipline of engineering is evolving to prioritize compliance, auditability, and real-time decision-making, with AI playing a supporting role to augment and empower human experts.

What topics are discussed in this episode?

This episode covers the following topics: Data integration, Explainable AI, Regulatory compliance, Human-AI collaboration, Engineering workflows.

What is key insight #1 from this episode?

Achieving the right balance between structured and unstructured data is crucial, requiring a 'crawl, walk, run' approach

What is key insight #2 from this episode?

Embedding governance, explainability, and auditability into AI systems from the beginning is critical for regulated industries like life sciences

What is key insight #3 from this episode?

Human experts must remain in the driver's seat, validating the medical accuracy and consistency of AI-generated outputs

What is key insight #4 from this episode?

Regulatory precision, patient safety, and compliance are non-negotiable priorities when scaling AI across life sciences teams

Who should listen to this episode?

This episode is recommended for anyone interested in Data integration, Explainable AI, Regulatory compliance, and those who want to stay updated on the latest developments in AI and technology.

Episode Description

Today's guest is Mathew Paruthickal, Global Head of Data Architecture, Utilization, and AI Engineering at Sanofi. Founded in 1973, Sanofi is a French multinational pharmaceutical and healthcare company. Sanofi works in the research, development, and manufacturing of pharmaceuticals and vaccines. Mathew joins Emerj Editorial Director Matthew DeMello to explore how life sciences organizations can move from isolated digital tools to orchestrated, interoperable systems, and how engineering teams can bake in traceability, auditability, and human-in-the-loop governance from day one. Want to share your AI adoption story with executive peers? Visit emerj.com/expert2 for more information and to be a potential future guest on the 'AI in Business' podcast!

Full Transcript

Welcome, everyone, to the AI in Business podcast. I'm Matthew DeMello, Editorial Director here at Emerj AI Research. Today's guest is Mathew Paruthickal, Global Head of Data Architecture, Utilization, and AI Engineering at Sanofi. Mathew joins us on today's show to explore how life sciences organizations can move from isolated digital tools to orchestrated, interoperable systems that enable structured and unstructured data to drive engineering workflows and operational decision-making, especially in highly regulated contexts.

But first, are you driving AI transformation at your organization? Or maybe you're guiding critical decisions on AI investments, strategy, or deployment? If so, the AI in Business podcast wants to hear from you. Each year, Emerj AI Research features hundreds of executive thought leaders, everyone from the CIO of Goldman Sachs to the head of AI at Raytheon, and AI pioneers like Yoshua Bengio. With nearly a million annual listeners, AI in Business is the go-to destination for enterprise leaders navigating real-world AI adoption. You don't need to be an engineer or a technical expert to be on the show. If you're involved in AI implementation, decision-making, or strategy within your company, this is your opportunity to share your insights with a global audience of your peers. If you believe you can help other leaders move the needle on AI ROI, visit emerj.com and fill out our Thought Leader submission form. That's emerj.com, and click on "Be an Expert." You can also click the link in the description of today's show on your preferred podcast platform. That's emerj.com/expert1. Again, that's emerj.com/expert1.

This podcast is sponsored by Google. Hey, folks, I'm Amar, product and design lead at Google DeepMind. We just launched a revamped vibe coding experience in AI Studio that lets you mix and match AI capabilities to turn your ideas into reality faster than ever. Just describe your app, and Gemini will automatically wire up the right models and APIs for you. And if you need a spark, hit "I'm feeling lucky" and we'll help you get started. Head to ai.studio/build to create your first app.

Without further ado, here's our conversation with Mathew. Matt, welcome back to the program. It's a pleasure having you.

Thank you.

Absolutely. Taking it from the top: we spoke a lot in the last episode about where we're driving data into clinical trials. Taking a different view today, we're exploring more questions about engineering workflows, especially in life sciences. But in that last episode, you had mentioned how much of the task has to do with connecting structured and unstructured data. What do you see as the main barriers for life sciences firms today in connecting structured and unstructured data for operational use in engineering workflows, especially in regulated industries?

The first part is achieving the balance. We cannot do everything on day one, so you need to have a constant conversation with your team about the crawl, walk, run phases. And even in the run phase, it's all about optimization and transformation. In the crawl phase, we try to talk to them and show them the importance of data, and data being not just structured data; it's data and documents and all those frameworks, and why we need to actually collect it, ingest it, cleanse it, and store it in a clean place. We have to bake in governance right from day one. But to do all these things is going to take some time. Rome wasn't built in a day.
We're not going to build all these things in a day, so you have to show them what is possible, the art of the possible, as we call it. That takes a few weeks. So what we've done is take a few weeks to show them what it's capable of, and they see the value. And that's where you move on to the next phase, which is how you create a modular system you can control, and then you build it layer by layer, iterating and adding on, eventually building the full platform. So that's what it takes, you know. They see something very quickly, and then we iterate through the various versions into what we call the walk phase, and the run phase is where they are active on it. So we start with a small user base, we get the buy-in of key stakeholders, and they see the value. Then we build a business case around that, and we start layering it on. That's how, at least, we had success in ensuring that the engineering is evolving along with the business priority. So we involve the business early on, as often as possible. Every initiative starts with a co-authored problem statement; it's not just a tech scoping. We embed product thinking into engineering pipelines. Whatever it is they're trying to do, what is the real business benefit? That's exactly how we shape our engineering pipelines and projects.

Yes, very much a chorus we hear often on the show: you don't want to wait to jump into the pool of AI, you want to start at the shallow end. And especially for our friends in life sciences, I think that slow-but-steady-wins-the-race approach is especially important for driving patient outcomes. Along those lines, something we talked about a lot in the last episode is not only that sense of caution felt by regulated industries when it comes to these advanced technologies, but also that we want humans in the driver's seat. We see this probably more so in life sciences than in other sectors. But, of course, the same transformations are coming for the software development space, and I think there are a couple of different players there who believe firmly that, in the long term, software engineering is going to be a machines' game. It is a debate; the future is not settled. The future is unwritten, as one of my heroes loves to say. But I think especially for life sciences, even if that future opens up and we do find that software development in the future might be more machine-based than human-based, they're going to be slower to that future, simply because we want human beings in the driver's seat, not just for ethical or compliance reasons, but because healthcare and the Hippocratic Oath are, at least for the time being, best understood by human beings. Still, just taking that thousand-foot view from where you are in both data science and life sciences, how do you see the discipline of engineering evolving as life sciences organizations move from isolated digital tools towards orchestrated, interoperable systems that prioritize compliance, auditability, and real-time decision-making, as you were describing in our last episode?

So what we try to do is set boundaries, not bottlenecks.
So when we are encouraging teams to run experiments, anything on the tech side, I want to make sure that teams actually work with the business in understanding the business case, right? It's the data silos: when they're trying to create a prompt pipeline, what are the different data sources they use for that? There may be many disparate systems, right? EDC, CDMS, safety, regulatory, commercial. And we may have a lack of metadata around that when we're trying to build systems. So for anything that we're doing, any question that you ask, the AI has to be able to explain it. Explainability also has to be built in from day one; it's a day-one priority. And AI must prove not just accuracy, but also auditability. That's, again, a very key aspect of it, and that's where you need the human in the loop. For any and every engineering pipeline that is created, we create explainable AI frameworks, and we need the human in the loop to make sure it's not just accurate, but that you can also see where it came from. We're building auditability as part of the governance standard: Where did the data come from? Which document or data was actually referenced, and which version of that document was referenced for this purpose? Those are the things we're trying to enable so that the business sees what we're doing. They see exactly, okay, the AI's answer is grounded in truth, and you can audit it and go back to the original source to confirm it. Yes, they see it exactly how it is in the document.
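To make the auditability pattern described above concrete, here is a minimal sketch of an answer record that carries its own provenance, so a reviewer can trace a generated answer back to the exact document and version it was grounded in. The class names, field names, and example values are illustrative assumptions, not Sanofi's actual framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class SourceCitation:
    """One piece of evidence behind an AI-generated answer."""
    document_id: str       # identifier of the source document
    document_version: str  # the exact version that was retrieved
    passage: str           # the text span the answer was grounded in

@dataclass
class AuditedAnswer:
    """An AI answer plus the provenance needed to audit it."""
    question: str
    answer: str
    citations: list[SourceCitation]
    model_id: str
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def audit_trail(self) -> list[str]:
        """Human-readable lines a reviewer can trace back to the sources."""
        return [
            f'{c.document_id} (v{c.document_version}): "{c.passage}"'
            for c in self.citations
        ]

# A reviewer can confirm exactly which document version grounded the answer.
answer = AuditedAnswer(
    question="Which cohorts reported headaches for compound X?",
    answer="Two subjects in cohort A reported mild headache.",
    citations=[SourceCitation(
        document_id="safety-report-0042",  # hypothetical document ID
        document_version="2.1",
        passage="Cohort A: two subjects reported mild headache.",
    )],
    model_id="example-llm",
)
print("\n".join(answer.audit_trail()))
```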
It sounds like there are two sides to that table where you need human beings: humans in the loop who are experts to check the outputs of the system, but also humans in the loop who know what needs explaining, how to build an explainable system all the way up through the top, and what kinds of questions could possibly be asked. And those are two different skill sets: the doctor who needs to double-check all of the answers an AI system gives to ensure patient outcomes, but also maybe doctors and data scientists asking, all right, what if we start to see a wrong answer here, or a confusing answer here? How will we be able to show that human in the loop the entire way up the chain and give them the confidence to say this information is based on real science, it's not a hallucination? I'm curious, what roles do you see humans playing, not just in output verification or even explainability? What roles are you seeing human talent play in building proactive, scalable AI systems for engineering? And how should companies balance automation with human judgment and oversight on both sides of that table?

So AI can accelerate, right? AI can summarize; AI can even recommend. But in our industry, it's only a human that can actually certify, sign off, and be held accountable. So it's not just about compliance. Again, as I mentioned, it's about trust, traceability, and ethical responsibility. So when you're generating a narrative, they're going to be validating the medical accuracy of that in every single piece of content. And the way we give it to them is as editable components, so they can go and edit it if they don't see it right. The final sign-off is definitely going to be done by the human, especially in the cases we're working on: signal detection use cases, any interpretation of risk in the clinical context, ensuring consistency with an approved label. It is the human that's actually signing off. And generative AI is never replacing human judgment; we're pretty clear about that. We're just trying to amplify it, protect it, and document it.

And just as we see life sciences organizations begin to fully connect their engineering data ecosystems, what new capabilities and efficiencies are you seeing, or do you hope to see, being unlocked? And how should leaders prioritize business outcomes over side projects, those little flash projects you were talking about at the beginning of the last episode, when scaling AI across teams?

Yeah, so we try to prioritize certain things, right? Regulatory precision is absolutely key here, along with patient safety and compliance; these three things are non-negotiable for us. So even when you have to auto-generate, say, a clinical narrative, or you're trying to pre-fill a complex regulatory form, those are some of the trends we're actually seeing across thousands of use cases. It all comes down to: how do we empower pharmacovigilance? How do we empower medical writing and regulatory teams to shift from manual-heavy workflows to AI-assisted decision-making? Today we get a lot of patient reports, like adverse event reports, through very manual form entry and transcribed calls. What generative AI can help with is auto-extracting some of that data from the free-form text. And then after that, you have the ICSR documents, individual case safety reports, which are, again, structured reports about one adverse event. How do you pre-fill the ICSR documents using LLMs? We're trying to do fine-tuning of some of our models, using some reinforcement learning techniques to fine-tune them based on the history of cases we've actually seen, which means it's much faster and much more fine-tuned to the data that has to be provided. And the outcome of all this, of course, is generating high-quality clinical narratives from all this structured study data. So these are some of the cases where generative AI is making a huge difference in the ways of working in life sciences.
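As a rough illustration of the extraction step just described, free-text adverse event reports pre-filled into structured ICSR-style fields for human review, here is a minimal sketch. The field set, prompt, and `call_llm` helper are hypothetical stand-ins for whatever schema and LLM client a team actually uses.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class AdverseEventDraft:
    """Pre-filled ICSR-style fields; a human must validate and sign off on each one."""
    patient_age: Optional[int]
    suspect_drug: Optional[str]
    reaction: Optional[str]
    onset_date: Optional[str]

EXTRACTION_PROMPT = """Extract these fields from the adverse event report below.
Answer ONLY with JSON: patient_age, suspect_drug, reaction, onset_date.
Use null for any field the report does not state.

Report:
{report}"""

def call_llm(prompt: str) -> str:
    """Hypothetical LLM client; swap in your provider's API call here."""
    raise NotImplementedError

def draft_icsr_fields(free_text_report: str) -> AdverseEventDraft:
    """Auto-extract structured fields from a free-text adverse event report."""
    raw = call_llm(EXTRACTION_PROMPT.format(report=free_text_report))
    fields = json.loads(raw)
    return AdverseEventDraft(
        patient_age=fields.get("patient_age"),
        suspect_drug=fields.get("suspect_drug"),
        reaction=fields.get("reaction"),
        onset_date=fields.get("onset_date"),
    )
```

Note that the output is deliberately a draft: consistent with the episode, the extracted fields only pre-fill the form, and the final sign-off stays with a human reviewer.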
Yes. And in that last episode, which we'll have to link at the end of this one so everybody can refer back to it, you had mentioned proactive systems auto-detecting anomalies and writing audit instructions ahead of time. I'm curious, what lessons do you think engineering teams, no matter where they are, can learn from life sciences in making AI systems not just reactive to problems but proactive in surfacing risks, optimizing workflows, and ensuring safety at scale?

Yeah, you know, you need both. It's a two-part thing, right? While we're doing the technical enablement, it's a cultural shift, too. How do you build pipelines that listen to triggers? These are things that have not been done before; we've not really connected a business-world problem with a technical problem. So how do we enable that? The tech has to meet the cultural shift, because some of these things we're releasing, we're releasing back to the end user. So we're building the feedback loop: capturing the human overrides, any corrections, any approvals. They're all training signals, because when we capture every single piece of feedback the human gives when the AI responds, we capture it in our database, and that informs the training process. It's not a one-stop shop; it's an ongoing, incremental process, so that we improve the AI's accuracy over time. And that's what we're calling the few-shot examples. You start with a single prompt, and that's about it for a particular use case. But over time we're collecting a lot of the human feedback, and every action a reviewer takes can train the system to get better. That's exactly how we are designing pipelines: everything behind the scenes has a pipeline where things get better over time. The more people use it, the more questions they ask, the better the system gets. And not just that, we're also monitoring it, because if we cannot monitor it, we can't trust it, and regulators are not going to trust it. So monitoring is, again, implemented as a first-class citizen while we're building all these systems. Continuous improvement and compliance are baked into the system that's being built.
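Here is a minimal sketch of the feedback loop just described, under assumed names: every reviewer override, correction, or approval is recorded, and approved corrections are recycled as few-shot examples in later prompts, so the system improves the more it is used.

```python
from dataclasses import dataclass

@dataclass
class ReviewerAction:
    """One human override, correction, or approval of an AI output."""
    model_input: str
    model_output: str
    final_output: str  # what the reviewer actually signed off on
    approved: bool

class FeedbackStore:
    """Captures every reviewer action as a training signal."""
    def __init__(self) -> None:
        self._actions: list[ReviewerAction] = []

    def record(self, action: ReviewerAction) -> None:
        # Nothing is thrown away; every correction informs later training.
        self._actions.append(action)

    def few_shot_examples(self, limit: int = 3) -> list[tuple[str, str]]:
        """Most recent approved (input, signed-off output) pairs."""
        approved = [a for a in self._actions if a.approved]
        return [(a.model_input, a.final_output) for a in approved[-limit:]]

def build_prompt(task: str, store: FeedbackStore) -> str:
    """Prepend human-approved examples so output quality improves with use."""
    shots = "\n\n".join(
        f"Input: {i}\nOutput: {o}" for i, o in store.few_shot_examples()
    )
    header = f"{shots}\n\n" if shots else ""
    return f"{header}Input: {task}\nOutput:"
```

The same store can double as part of the audit log: a record of who approved what is the kind of monitoring evidence the episode suggests regulators will expect.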
Yes, and I think it's those capacities across the board, in every industry where we see these systems being developed, that capacity for self-learning, that gives folks, especially in less regulated industries, the idea that, oh, this won't involve humans after a little while. If that ever happens, and it really depends on the industry, the use case, and so many other factors, even if it does happen, we're still so far away from it; I don't even know if we really see it on the horizon. But Matt, very fascinating stuff, and especially illuminating to hear what's coming out of life sciences that can teach the rest of software engineering as a discipline. Thank you so much for being with us these past two episodes. It's been incredibly insightful.

It was an absolute pleasure talking to you, Matt.

Wrapping up today's episode, I think there were three critical takeaways for life sciences leaders from our conversation with Mathew Paruthickal, Global Head of Data Architecture, Utilization, and AI Engineering at Sanofi. First, connecting structured and unstructured data works best through a phased, modular build that shows value early and keeps engineering aligned with business needs. Second, trustworthy AI requires explainability, traceability, and human oversight to be built in from the start, not added after systems scale. Finally, proactive, self-improving workflows emerge when teams pair technical pipelines with cultural habits of feedback, monitoring, and continuous refinement.

Interested in putting your AI product in front of household names in the Fortune 500? Connect directly with enterprise leaders at market-leading companies. Emerj can position your brand where enterprise decision-makers turn for insight, research, and guidance. Visit emerj.com/sponsor for more information. Again, that's emerj.com slash S-P-O-N-S-O-R.

I'm your host, at least for today, Matthew DeMello, Editorial Director here at Emerj AI Research. On behalf of Daniel Faggella, our CEO and Head of Research, as well as the rest of the team here at Emerj, thanks so much for joining us today, and we'll catch you next time on the AI in Business podcast.
