Harnessing Multi Omics for Next Generation Drug Discovery
Tom Lanz
Senior Director of Multi-Omics and Biomarkers
Pfizer
Sunhwa Kim
Associate Scientific Director
AbbVie
Wei Wang
Director, Genomic Core Facility
Princeton University
Junmin Peng
Professor of Proteomics and Metabolomics
St. Jude Children's Research Hospital
Format: 1 hour webinar: 20 minute interview followed by 40 minute panel discussion
Thank you. Hi everyone. I would like to welcome everyone to this Thought Leader webinar on Harnessing Multi Omics for Next Generation Drug Discovery. In this session, we will explore how multi omics is changing early-stage drug discovery and development, from target identification to translational insight. We have the pleasure of introducing our Thought Leader, Tom Lanz, Senior Director of Multi-Omics and Biomarkers from Pfizer, who will share with us Pfizer's strategic approach to multi omic integration for 20 minutes, and thereafter, Tom will be moderating a panel discussion on how multi omics is transforming drug development and looking at the key opportunities and challenges.
So, Tom to start with, could you briefly describe your current work and how it intersects with multi omics and drug development?
Sure, thanks for that. My group is called Multi-Omics and Biomarkers. We sit in the drug safety organisation at Pfizer. We work with different parts of the portfolio and different programs, from early stage all the way through phase three and beyond, and in some cases even approved products. Usually we get involved when there's a need for a safety biomarker, or when there's some niche where different omic technologies can help to answer a question.
That's great. Thank you. So, in your view, how has multi omics transformed early-stage drug discovery in the past five years?
Sure. I think there are a couple of ways. For one, target ID has seen a huge impact. There's been an explosion of genetics for over a decade. Couple onto that the advancements in single cell data, where we're going from atlases like GTEx, where we have tissue-level expression data for any potential target you could look at, to the Human Cell Atlas and other consortia building single cell versions of a human atlas. That gives us better cellular resolution of target expression as we're trying to understand which target is best to go after in a particular pathway for a particular disease. We're getting much better context for how that target is expressed in terms of pharmacology, but also, from the drug safety perspective, for what else we might want to take a look at, because we know this target might be expressed in some other cell type or tissue that we hadn't anticipated before. Then, beyond the analysis portion, there's the generation of data. Functional genomics is becoming a bigger part of screening workflows. CRISPR screens are almost commonplace now. They're getting more advanced, and there are different variants of CRISPR to achieve different things. So I see the biggest impact in new target ID and new target validation, in part due to the explosion of technology and of the data available to inform new targets, help us understand the targets we have, and help us discover new targets going forward.
That's brilliant. Thank you. So where does multi omics deliver the most value? Is it from target ID to clinical translation, particularly through, you know, from phase one to phase three?
I think it has the potential to impact all of those. These days there are many more tools available for enhancing target ID, so that would be the space I would say is probably seeing the most bang for your buck, but we use it all the way through. As we're developing new in vitro and in vivo models, we're using omics to really understand those models and how they represent the disease area that we're trying to study in humans, building a translational bridge. And then in the clinic, there's lots of work in biomarker discovery going on. There are massive efforts like UK Biobank and others, where they're getting lots of patients together and merging genetics with protein data or expression data. So I think more and more companies are getting into the space of applying more exploratory omic endpoints in clinical trials, and being able to bring that data back and have a loop of learning.
So, actually, that brings me to the next question, which is, what are the most promising combinations of omics? We have genomics, transcriptomics, proteomics and metabolomics, for example. Which combinations are the most promising for actionable insights in discovery programs?
I think, again, that probably depends on what phase of discovery you're working on. But if you're in the target ID space, we're getting a lot out of merging genetics with either transcriptomics, generating eQTLs, or high-throughput proteomics like SomaLogic and Olink, where we're getting pQTL data to try to understand causal variants in disease that might be druggable.
As you get into your target space, if you have an epigenetic target, it certainly makes sense to be looking at epigenetics and methylation alongside transcriptomics, to understand the downstream effect of the epigenetic modification. In the single cell space, they've really been working hard to pair things like VDJ analysis of T cell and B cell receptors with RNA-seq. Those are still mostly at the RNA level. But then with CITE-seq and similar technologies, you can add some level of protein information to your single cell resolution data. So I think we're seeing lots of great data sets in this area, merging RNA and protein at the single cell level.
So, what are the key technical or operational challenges integrating multi-omic data into drug pipelines?
So, one thing that I see, especially for newer technologies, is cost. For short-read sequencing, the cost of bulk RNA-seq has gone down and down over the years, and it's become really accessible. At the single cell level, or for spatial transcriptomics and spatial proteomics, we haven't got to the point where the cost is really coming down yet. It can be 10 to 100 times more expensive to interrogate samples at that level of resolution. So I think those technologies are really great for building these atlases, building foundational data. But in my work, for example, nine times out of ten we'll use bulk sequencing instead of single cell to answer a question, or at least start with bulk, because the investment in time and money for a single cell or spatial approach is just much larger. We need to build confidence that it is going to be the right approach to give us an answer.
Thanks for that. Could you share a specific example where multi omics was pivotal in accelerating a drug candidate or reshaping a clinical strategy?
Sure. I've got a couple of examples, though they're probably still single omics. In the target ID space, there was a target years ago that we were interested in for schizophrenia. We knew what we wanted to go after and we had a tool compound, but the tool compound was non-selective and the literature was really mixed. So we did some LCM and RNA-seq, and we were able to really clearly define one NMDA receptor subtype over another. That drove establishing the screening strategy. If we were to have that same problem today, there are probably good single cell atlases we could go to and get an answer right away, without having to spend the months it took us to painstakingly curate these data. So I think that's a good example where we used omic data to figure out our target exactly, but it could probably be accelerated even further today. And then at late stage, we're using it for de-risking in certain circumstances. One example we had was AAV therapy. AAV is thought of as a non-integrating vector, but it does randomly integrate at a low rate. There was a paper that came out years ago in mice suggesting that these engineered AAVs can lead to hepatocellular carcinoma. So the question we had was, how can we show that our therapy is really not going to lead to developing a cancer? Rather than going through a lengthy carcinogenesis study, we looked at some of the techniques that were current in the space and developed a targeted DNA sequencing approach.
And so, from a short-term study, we were able to use DNA information, look at where the vector integrated, and use some bioinformatic tools to predict whether or not there were changes in clonality that would be indicative of a starting oncogenic response. This saved us maybe at least a year of having to do a long study, by having a more predictive, short-term approach driven by bioinformatics and DNA sequencing.
I see, that's brilliant. Thank you. So, talking about the clinical strategy. What advancements in AI, biomarkers, or precision medicine do you believe will elevate the impact of multi omics in phase two and three trials over the next two or three years?
So, there are so many single cell and spatial approaches and platforms right now. Again, I mentioned cost. If there was a breakthrough that made these a little more accessible and a little cheaper to implement, that's one thing. But I look at the evolution of NGS technologies over the years. The development of short-read sequencing really caused an explosion. It made sequencing accessible and cheap. There were so many people getting involved, so many new methods being developed. Just the simple idea of using DNA as a barcode was brilliant, and it enabled so many innovations, like single cell DNA barcoding, even for drug screening and proteomics. If there was some kind of breakthrough technology for one of the other omics, proteomics or epigenetics, something that put that omic on the same level playing field as NGS, that would be it. We see that proteomics does a lot, and you can interrogate most of the proteome with mass spec these days, but it still requires expensive instruments and in-depth expertise, whereas sequencing is much more accessible and high throughput at this point. So, in the same way that the simple use of DNA barcodes broke open the gates for so many sequencing methods, if we had something novel in the proteomic space, that's what I would like to see for elevating that.
That's interesting. Thank you. Thanks so much. So, what collaborations or ecosystem shifts do you think we'll need for multi omics to reach full potential in drug development?
So, one area I think would be important is that data sharing platforms may have to evolve. For years, we've been depositing sequencing data in GEO. Over the past few years, MassIVE has become a good repository for proteomic data. As we get to the spatial era, where you're merging images with transcriptomics or proteomics, that's going to require a whole different architecture. Right now these data sit in separate repositories, and we can pull them in, but if we want to be able to integrate these complex data sets, I don't know the best way to normalize them and make them shareable. So I see that as a place where there could be a good solution, or even a consortium that tackles this kind of problem in the future.
That's great. So, I've got one question left, which is a poignant one. What one piece of advice would you give to senior scientists or R&D leaders in biopharma and biotech, as well as academic institutions, who are looking to integrate multi omics meaningfully into their discovery strategy?
So, I would recommend embedding expertise in this area in your team, not just viewing it as, you know, we'll go to a service provider when we need sequencing or proteomics. First of all, there's a huge variety of tools, and they can be used to answer a huge number of questions. But if you're just dipping into it when you need a specific question answered, I think you won't get the same value out of it as if you had someone who was really well versed in the technology. Integrate the bioinformaticians and the biologists early to really improve the questions that you're asking, and that will improve the quality of the answers that you get.
Thank you very much Tom for the interview. Really appreciate it, and so now I'm going to move on to the panel discussion on how multi omics is transforming drug development - key opportunities and challenges. So, I'm going to introduce Tom again to moderate the panel discussion. Please kindly introduce the panellists.
So, we've got three panellists today, and maybe I'll go in the order I see on my screen.
Wei Wang, would you want to introduce yourself?
Hi everyone. I run the genomics core facility at Princeton University, so mostly, of course, we run next generation sequencing, including both short read on the Illumina platform and long read on the PacBio Revio platform. Using these, we have also been doing single cell sequencing of both RNA and chromatin by ATAC, as well as the combined multi-omic assay. Lately, we have also delved into spatial transcriptomics using the Xenium system from 10x Genomics. And on the proteomics side, the Olink system already uses sequencing as a readout to quantify protein targets, so that's also an area we have been working on.
Thank you. And now next, Junmin Peng.
Yeah, hi everyone. I work at St. Jude Children's Research Hospital in Memphis, Tennessee. I have been working on proteomics for more than 25 years, largely focusing on mass spec technology: using mass spec to develop the latest technology for profiling proteins and protein modifications, and applying those technologies to a wide variety of biological questions, from human samples and biofluids all the way to animal models and cellular models, to understand potential disease mechanisms and also discover potential biomarkers.
And we have Sunhwa Kim.
Yeah, good morning, good afternoon, good evening, wherever you are. I am Sunhwa Kim. First of all, thanks to Oxford Global for organising this very important discussion session. I am a precision medicine scientist at AbbVie. Leveraging omics, and raising questions about how to use omics for specific questions from pre-clinic to clinic and all the way to the patient, is my daily life; it's what I am facing all the time. So I'm really excited for this webinar, and I'm looking forward to having further discussions within this forum. Also, please feel free to reach out to me if you have any questions, even after this session.
Thank you.
Alright, so I think the first question that we had was one that was posed to me as well: what are some of the promising multi-omic combinations for target discovery? Sunhwa, maybe you want to start with that.
Yeah, thanks a lot, Tom. You already answered this pretty nicely, end to end: how we can combine, and what the most promising combinations of multi omics are. So I want to step back a little to where we started, give my view of what we are doing right now, and provide a little more sense of where we are and what I feel the current gaps are. As you mentioned, Tom, omics started a long time ago, right? Transcriptomics, proteomics, and now we are even raising our hands for metabolomics, lipidomics and, furthermore, genomics. In the past we were very happy: oh my gosh, I have this transcriptomic data. And we were also very excited: oh my gosh, I have this proteomic data. And we tried to cross-validate one spot here and another spot there. However, throughout our journey in drug discovery and target identification, we realized that human biology is not that simple. Because of that, integrating across these different omics to try to find a stronger signal is always better. To me, the question is not what the most powerful combination is. If we can combine genomic, transcriptomic, proteomic and even metabolomic data, and if we have tools to really combine them and build confidence, then I think this combination should occur for any kind of data set. For me, the most important thing is what you also raised in your interview, Tom. It's the question, right? How are we going to use these combinations?
How are we going to integrate this data and, beyond that, how are we going to use it? And at which stage of the target journey, of drug discovery or development, do we want to use it: a really early stage, or a clinical stage? Depending on the questions, and depending on how we will use the data, I think those really drive the approach: which combination of multi omics can provide the most valuable evidence for our next steps. Okay, so hopefully I answered your question.
Yes, I agree. I think pairing up the question with the right approach is definitely really key. And so maybe I'll pass that to Wei or Junmin, since I already talked about this one a little bit earlier.
Yeah, thanks. In regard to the combination, I think transcriptomics, the mRNA expression, will probably be sitting in the central position connecting the other omics, because, as the central dogma has already told us, the mRNA is the functional part of the genome. The genome may carry any kind of variation or aberration, but its effect is going to be reflected in mRNA expression, and the downstream translation into proteins can be affected by mRNA abundance, by mutations and by rearrangement through splicing. Oh, and epigenomics, which also regulates the expression of mRNA. So if you choose a pairwise multi omics, I would say in most cases transcriptomics would be included, except, of course, for proteomics and metabolomics, a pair that forms its own close functional relationship.
Junmin?
Yeah, thanks for the previous three summaries. I would like to put this in a slightly larger frame. We all know genomic information flows from DNA to RNA to protein, and proteins, as metabolic enzymes, control metabolism. But over the last 25 years we have all recognized that the metabolome can actually feed back to modify DNA, RNA and protein. That generates a huge network to control biological activity, what is now called the epigenome, the epitranscriptome and also the epiproteome, which we normally call protein post-translational modifications. So if we really want to understand the drivers of disease, you can get information from DNA mutation analysis, or from eQTL and pQTL analysis, all the way from DNA to RNA to protein, and modification QTLs. In addition to that, I think there is still missing information embedded in this very large-scale network. The reason is that when we do omics analysis, we are all looking for the major changes, right? We are looking for DEGs and DEPs, differentially expressed genes and proteins. However, we have to recognise that some of the major drivers may not change that much. They simply change activity, or change modifications, or change location, and we may not even cover those. In some of our work published previously with another group, when we simply combined the protein and RNA profiling data, we found almost no overlap in the major changes; there were very few. However, if you do network analysis, you find that one major pathway has changed, and that allowed us to chase down the main drivers.
On the other side, I work on neurodegenerative diseases, and the vast majority of neurodegenerative diseases are not directly caused by DNA changes, unlike cancer. And as we all know, for the most common diseases, like metabolic disease, heart disease and diabetes, there are genetic components, but they are not necessarily all controlled at the DNA level. It could be at the protein level. And then the environmental factors affect all those omics. Environmental factors largely affect us through small molecules: a small molecule can get into the body and modify our other biomolecules. So I think this tells us that eventually we have to assemble the omics together. I think now there's a major move, like the virtual cell, using AI technology to try to do those things. Those are clearly important directions, and I think now is the beginning of really working in those directions to integrate all the data together.
Great, thank you. I'll lead into the next question, which is, how best can we translate omic signals to actionable pathways in early-stage drug development? One way I think about this is a problem that comes up again and again as programs progress through the portfolio. As we look forward to phase one or phase two, we need to figure out what is going to be our pharmacodynamic biomarker, the thing we'll measure in patients that tells us that the target is engaged. We'll need additional signals beyond that to look at efficacy. But target engagement is often something that should be simple. Maybe in a screening assay you have something that is very robust, but as you get into animals, or look forward to what you can measure in people, it's a different story. Maybe you are looking at something with robust changes in the brain, and that's not going to be accessible when we get to humans. So this is one area where we've done a lot of work, using omics to try to identify pharmacodynamic biomarkers and look at the changes caused by your drug, so that we have something to carry into phase one. We can measure these one or two biomarkers and say, this tells us that, yes, we've engaged the target to the extent that we think we need to drive efficacy. So maybe I'll pass to Sunhwa, if you have other thoughts on how we can translate omic signals to actionable pathways for drug development.
Yeah, thanks a lot, Tom. You nailed pretty well how we are leveraging omics data to really understand the pathways, and why we are doing it. I want to add a slightly different nuance. Tom and everybody here, if we think about drug discovery in the past 100 years, there have not been that many successes, and there have been a lot of failures. However, as we discussed before, most of our targets came from some type of omics study, transcriptomics, genomics or proteomics, where we found some signal. So to me, in early-stage drug development, targets or drugs do not really fail because they lacked signals. We always started from some evidence. Beyond what Tom mentioned about leveraging the pathway, I want to emphasize that drug discovery and early-stage programs have failed because they lacked coherence in the mechanistic connections to the pathways. Starting from the genomic or DNA change, then the transcriptomic change that Junmin and Wei mentioned, then the proteomic change, and then further downstream the pathway change: this coherence, this mechanistic connectivity, is really, really important in target discovery and all the way to late stage. And from the late-stage perspective, and also the pathway perspective, going back to the PD biomarkers you mentioned, Tom, the pharmacodynamic biomarkers and how we can monitor drug activity are also very important. Let's just say we have an anti-IL-17. Depending on the mode of action of the drug, we are not necessarily looking for changes in IL-17 itself. We are looking for the best biology or pathways driven by IL-17, right?
So what that means is that we identify targets based on the omics, but we need to really deepen and widen our understanding of the broad pathways, not necessarily from whatever individual data is available, but for the target patient population that we are aiming for, and dig deep into those pathways to really track the drug activity in the early stages of drug discovery. That will increase the success from the target stage to the clinic, and all the way to the latest stage.
Thank you. So, Junmin, I wonder if you had any thoughts around this question too, on translating omics into pathways?
Yeah. We often say that omics measurement is still discovery analysis; it's not really hypothesis-driven research. In academia, people always say, oh yeah, you need a hypothesis, right? You may have heard that many times, and I think it is important, because the measurements we observe may or may not be the drivers, may or may not be the key components to follow. There are two approaches we're always actively thinking about. One is, can we obtain time-course, longitudinal data and really figure out what the early events are during the cellular response? That allows us to use the omics data to trace what occurs first, what is at the front, the earliest events, and that can probably help us figure out what the driver is. The second approach is perturbation, like a CRISPR screen, combining the CRISPR screen with an omics approach. That approach has been widely used, especially at St. Jude, where we're working on paediatric cancers, and they have done very large-scale perturbation analysis using CRISPR approaches in collaboration with the Broad Institute. They can even do context-dependent CRISPR screens. For example, if you already have a gene mutation, what is the synthetic lethality if you add another mutation on top of that? So I think combining genetic approaches with omics measurement would be a very promising approach to pin down those key drivers and actionable pathways for future development.
And Wei, any thoughts on this question?
Yeah, sure. I'd like to bring up another way of looking at this question of how to better translate omic signal discovery into actionable pathways. One important aspect is what the best tissue or specimen to assay would be, so that it better reflects or mimics the true diseased tissue. In early drug development the disease model is usually either cell lines or an animal model. But lately, organoids have become somewhere in the middle: practical to culture, while also having some sort of tissue texture, especially when multiple cell types are co-cultured together. In this somewhat tissue-like context, the pathway may be better identified through multi-omic approaches. This, of course, adds to the cost and the complexity of processing. So hopefully in the future it's going to be more feasible to apply.
I agree, that's my hope as well. So maybe the next question is thinking about some of the barriers to reverse translation: things like data integration, harmonisation and computational complexity. And we can add some technical confounds to this, such as sample quality, FFPE versus fresh frozen tissue in different kinds of technologies, and bulk versus spatial resolution.
So, what are some of the big barriers here that may be slowing things down or requiring, you know, something new to move the field forward, and so maybe again, Sunhwa I'll start with you on this one, if you have thoughts.
Yeah, thank you so much. Tom, you already touched on some of this reverse translation component during your interview at the beginning of this session. However, before we discuss the barriers, I want to speak a little more about reverse translation, because I believe it is the future, the area we need to expand, although we are already stepping into that space. I believe everybody in this audience has heard of reverse translation, and the words themselves tell you what it is. However, I don't know how many people really know the power of reverse translation. Basically, reverse translation brings the bedside to the bench. It's the opposite direction. It means integrating and deepening our understanding and knowledge from real world data sets, patient data sets and clinical data sets, and bringing whatever knowledge we learned at the bedside back to the bench. The most important thing in reverse translation is, again, as I mentioned before, how we are going to use those data points, the real world data, and what our questions are. Tom, I think you and I have both worked a lot on reverse translation in different organisations. Everybody's working in this space, and it has huge potential. It's the real world. It's patients. As I mentioned, I'm a precision medicine scientist, so these data sets are what I deal with day to day. And to me they are really precious, because I can learn the patient segments.
So nowadays, if we step back and talk about targets and drug development, not all patients need new drugs, because we already have many lines of medicines, and some patients, like, it's only 40 patients, are already responding to existing medicines. So, stepping back, reverse translation allows us to study the patients who are not responding. That's number one: patients who are in need. Reverse translation also allows us to study drugs that are not working: where the drug works, where it does not, and why not. And reverse translation allows us to study disease segmentation: early stage, active stage or late stage, and which stage has unmet needs, right? So in that sense, I am dealing with these data sets to practise reverse translation every day, okay? And then omics. Of course, data is the only source of evidence I can work with, right? Genomics, transcriptomics and proteomics, and we are even using metabolomics and lipidomics. In terms of a barrier perspective, Tom, it's very obvious, right? Let's say I want to do some spatial omics: is FFPE better, or is frozen better? And within FFPE, depending on how the tissue was handled or the condition in which it was preserved, spatial omics may not be possible, or your signal is not really good, right? So I think the barrier is obvious at this moment. What I want to say is, before you do any studies, you need to keep in mind how you are going to use the samples, and based on that you select the form in which to preserve your tissues. Depending on your questions, you may also collect in different ways. Sometimes you collect as RNA. Sometimes you collect as protein.
Sometimes you want to maintain the tissue structure fully. So, what I want to say about the challenges and barriers for reverse translation is that the first ones reside in the tissue collection component. That's the first step, and we cannot mitigate it at the end. You always need to think about it beforehand. Another barrier is that we are going to have tons of data, but the question is how we are going to integrate it, right? Integrating transcriptomics and proteomics sounds easy; however, it is not that easy, because they use different technologies and they probably have different data points. And then there's also adding what Junmin mentioned, metabolomics and lipidomics. To me, that was always very challenging. The reason is that with metabolomics and lipidomics we have these small chemicals, and working out what is upstream and downstream of a small molecule is not that easy, because it is involved everywhere, right? So there are potentially three different barriers. The first one is the collection of the samples, the second one is the integration of the omics we talked about, and the third one is raising the proper questions beforehand. Those are some of the key areas where I see challenges and caveats all the time, and the areas that bother me all the time from the reverse translation perspective.
Yeah, so I'm going to add one more complexity in there as well: species differences. There's a lot that we can do in human, where we have a tremendously well annotated genome and proteome, and mouse is probably second to that. But in drug safety, oftentimes we're working in rats and monkeys as well, and we can't always use the same platforms. Sometimes there are gaps in the annotated genomes and proteomes. So I think it's great that we're collecting all these data in human, but we sometimes need to figure out how best to translate to the species we need to answer our question. Sometimes that's just as simple as, you know, there's a panel that's been developed for human, there's a panel for mouse, and there's nothing in between, so we'll sometimes try to adapt or come up with a different solution. Maybe I'll hand over to Wei, if you have thoughts on this question of barriers or complexities for reverse translation.
Yeah, I think you and Sunhwa have already pretty much indicated all these challenges. I'd say the solution corresponds to where these challenges arise. That is, all these potential challenges had better be considered at the very beginning of the clinical trial design, so that the specimens can be collected as planned for future analysis using multi omics tools. Computation, data integration and normalisation can only be applied to data that has actually been obtained, and without the right initial specimen some technologies are not even possible, right? So, between fixed FFPE tissue and fresh frozen tissue: if the original sample can be preserved as fresh frozen, then in the future, if fixation is needed, it's possible to convert the fresh frozen to FFPE. But if it's the opposite, there's no way to get back to fresh frozen. So plan these multi omic applications for reverse translation, as Sunhwa has said, as early as possible in the clinical trial design.
Thank you, Junmin?
Yeah, I want to follow up a little on Wei's idea about the samples. I would like to emphasise power analysis for the sample size of the cohort, right? I think many of the studies now are retrospective. Basically, you already have the patient samples, and then you analyse them. But in reality it is better to design those earlier, because the human population is so heterogeneous, and any of the measurements we make are confounded by so many factors. Discussing with statisticians a little earlier, before we do the omics analysis, always helps. And then, everyone has some pilot analysis. Based on your small pilot study, you can evaluate the variation in all those measurements, and based on that you can do a power analysis to figure out how big the cohort should be to get meaningful data. Another point, as Tom mentioned, is species differences. I think those are clearly a major problem. However, with technology development it is now possible to compare different species at the omics level with all those omics tools. At least we can understand which disease pathways are conserved or shared between species, and which pathways are not, or are actually absent in a given species. I think that can be addressed a little, or at least alleviated, at this stage.
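The pilot-based power analysis described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the panel's method: it assumes a two-group comparison of a single measurement, a normal approximation, and equal group sizes. The function name `samples_per_group`, the pilot values and the minimum difference are all hypothetical.

```python
import math
from statistics import NormalDist, stdev


def samples_per_group(pilot_values, min_difference, alpha=0.05, power=0.8):
    """Approximate cohort size per group for a two-sided, two-group comparison.

    Uses the standard normal-approximation formula
        n = 2 * ((z_(1 - alpha/2) + z_power) * sd / min_difference) ** 2
    where sd is the variability estimated from a small pilot study.
    """
    sd = stdev(pilot_values)                        # variation seen in the pilot
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_power = NormalDist().inv_cdf(power)           # desired statistical power
    n = 2 * ((z_alpha + z_power) * sd / min_difference) ** 2
    return math.ceil(n)


# Hypothetical pilot measurements of one analyte (arbitrary units).
pilot = [10, 12, 14]

# Cohort size needed to detect a difference equal to one pilot standard deviation.
print(samples_per_group(pilot, min_difference=2))
```

A real omics study tests thousands of features at once, so a statistician would normally also adjust `alpha` for multiple testing; that step is left out of this sketch.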
And while you're up, let me ask you the next question: your thoughts on either the tumour microenvironment or the disease tissue microenvironment. What are some of the challenges, and what is the clinical relevance of interrogating these microenvironment alterations with omics?
Maybe I'll get started with the tumour microenvironment. The tumour itself is already very heterogeneous, because it contains multiple subclones arising from accumulated mutations and continued selection. The tumour also sits in a complex tissue environment. So there's the term tumour microenvironment, which includes the stromal cells that support the tumour, the endothelial cells that form the blood vessel system, and a lot of infiltrating immune cells of all kinds, like B cells, T cells, macrophages and the like. All these different cell types reside close to each other to form a complex tumour micro-ecosystem, with complex interactions and interplay between all these components. So I'd say the clinical relevance is obvious, because a drug's effect, say for anti-tumour drugs, is going to be determined not only by the tumour's own characteristics, but also by all the surrounding cells, especially the immune cells.
So, I would say the advancement in multi omics is more of an opportunity to investigate the tumour microenvironment than a challenge, because these multi omics tools can be used somewhat like a molecular microscope to look into the cells in their tissue context. Initially, multi omics assays were developed using larger amounts of biomolecules, mostly DNA and RNA. As the field developed, the input amount kept being reduced with improved amplification methods, so the assays can now readily be applied at the single cell level.
So, for the tumour microenvironment, one powerful strategy is to dissociate the complex tissue into single cells and assay each single cell's properties, such as mRNA gene expression, as well as genomic mutations and epigenomic chromatin accessibility. The next level is to assay these multi omic targets in a spatial context, which is the so-called spatial transcriptomics and spatial proteomics. Believe it or not, this kind of analysis in the tumour microenvironment was already being applied even before the genome era, 30 or even 40 years ago, simply using the microscope and classical histology, by staining proteins with antibodies or hybridising DNA or RNA with probes. What the latest multi omic assays have enabled is highly multiplexed target detection. For mRNA, it's now feasible to detect up to 5,000, or soon nearly 10,000, mRNA targets, while only a handful of proteins can be detected together. For spatial proteomics, it's amazing that mass spec can already be combined with imaging to assay both small molecules like lipids and peptides, and even proteins, and be combined with the histology images.
So, I guess Junmin will shed more light on this, on the proteomics.
Thanks. So Junmin, your thoughts on the disease tissue microenvironment?
Yeah, we work on neurodegeneration, so I'll just use this as an example. When we look at the brains of human patients, you still see that the majority of the cells are not affected by pathology. You only see a subset of the cells affected by pathology like amyloid and tau aggregation.
First, technologically, I think that if you analyse a specific disease-affected area, the signal is much better than with bulk analysis. And the technology has now become much better. For example, if we use laser capture microdissection to capture a single neuron, we can now analyse around 4,000 proteins. That's actually incredible, right? Because consider that one cell has less than about one nanogram of protein, and protein cannot be amplified easily. So I think laser capture microdissection combined with mass spec has now become a major tool in spatial omics studies, and it is becoming more and more popular. Theoretically, in mass spec, you only need one molecule to be detected under near-vacuum conditions. So I think the technology still has room to develop in the future.
Second, I want to emphasise that biologically, when we compare single cell RNA analysis with single cell protein analysis, we recognise that there are many proteins involved in cell-to-cell communication. For example, we found RNA changes in astrocytes, but we didn't find the protein at all in astrocytes; instead we found protein changes in another cell type, like microglia, in the same environment. That tells us that in those microenvironments we should not only focus on the cell itself, but really think about cell-to-cell communication: how small molecules or proteins are used to communicate between cells. That will probably be another area that can potentially be addressed or perturbed in the future.
Thanks. So maybe I'll hand it back to Cerlin to wrap up the panel portion of this webinar.
Yes. So, thank you so much for the discussion. There's been some very important topics covered. We shall certainly transcribe what's been discussed. And you know, it'd be really good if we then, you know, can share it with you and also the community. So, thank you ever so much for your contribution. It's indeed great to have all of you to share your ideas regarding this area.
So, from the webinar itself, we have about five or six questions posted by our registrants. The first one is: is methylation data considered?
So, I think it definitely depends on what your question is. There are many, many questions where we don't really care about methylation. But for epigenetic targets, this is going to be critical, you know, for things that are affecting methylation directly or indirectly. But maybe I'll see if one of the panellists wants to take a stab at that.
Yeah, that's a really important question, right? Methylation studies and epigenetic studies started in academia a long time ago. However, industry is now really leveraging them. We are doing a lot of methylation studies, and I think not just us but the scientific community as a whole really appreciates cell-free DNA methylation, especially in the cancer area, right? So methylation matters not only for the area Tom mentioned, the target per se. Methylation as a PD (pharmacodynamic) biomarker, methylation for patient stratification, methylation for chasing some of the efficacy component: that is a really hot area right now. And, as I mentioned before, oncology is the first disease area that is leveraging a lot of this methylation. However, I see the future of methylation in much broader disease areas: immunology, potentially neuroscience, and some respiratory diseases.
That's good. Thank you. Thank you for that. So, the second question is - what type of genomics data do you integrate?
What type of genomics data? So, I think, well, for, you know, eQTL and pQTL, a lot of those big studies are still relying on SNP chips, microarray DNA data. But maybe I'll hand off to someone else, if there are other types of genomics data you're commonly integrating with protein or RNA.
Yeah, we do some. We use some of the genomic data to generate protein variants, to identify the protein variants corresponding to the genomic changes. We call that the proteogenomics field. That's a kind of subfield of proteomics.
Yeah. So, I'm not actually a data scientist, right? I'm a precision medicine scientist. The way our team leverages genomics is like what Junmin mentioned. We try to connect the genomic changes, such as SNPs, to the potential changes at the translation level and the proteome level, right? Of course, this is very important for target identification as part of an integrated multi omic approach. However, precision medicine scientists like me are really leveraging it to see whether there is any patient segmentation we can do, because, as I mentioned, not all patients are in need of new medications. We need to find the patients who are in need, who are not responding to any existing medicines, and potentially the patients who will respond to a new treatment, right? For that aspect, the connection between a SNP in our target and the related translational or proteomic changes is really important. So that's how we are usually trying to leverage the genomic data, regardless of the source that you mentioned, Tom.
Brilliant. Thank you. I've got the last question, I think, because the last two questions are pertaining to novel biomarker discovery and patient stratification which is going to be covered in our latest series of webinars. So, this third question, which is the last question – which is how do you correlate signals across omic data types, and at which levels do you perform integration?
So Junmin, you started talking about signals where you're seeing RNA changes in one cell type and protein changes in another. How do you deal with integrating datasets like that, where you have the additional complexity of cell-to-cell communication?
So, we basically prefer to take time course data. You could take a very short time course, right? But if you work on animals, the time course is months. And in that case, for example, when we compare the omics data from mouse models, we obtain single cell RNA data and also cell-type-level proteomics; we haven't got to single cell proteomics yet. Basically, we sort the cells by FACS and then use laser capture microdissection. In this way, we can understand whether the RNA changes are preserved at the protein level, or whether the proteome serves as a buffer. Cells can regulate the protein itself at the degradation level and also at the post-translational, or maybe translational, level. So you see RNA changes, but you don't see protein changes. Actually, that's very common in the cancer field: with some gene duplications, you don't see protein changes at all, right? So I think that raises another dimension: measuring protein turnover becomes pretty important. We have a new approach for doing this now, and we found that many of the RNA-protein inconsistencies are actually regulated at the protein degradation level.
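One simple way to make this RNA-versus-protein comparison concrete is to label each gene by whether its RNA and protein changes agree. The sketch below is only an illustration: the function name `classify_concordance`, the gene names, the log2 fold-change values and the threshold of 1.0 are all hypothetical, and it is not the speaker's pipeline.

```python
def classify_concordance(rna_lfc, protein_lfc, threshold=1.0):
    """Label each gene by how its RNA and protein log2 fold changes compare.

    Only genes measured in both data sets are labelled. The threshold is an
    arbitrary cut-off for calling a change; a real analysis would use
    statistical testing rather than a fixed value.
    """
    labels = {}
    for gene in rna_lfc.keys() & protein_lfc.keys():
        rna, prot = rna_lfc[gene], protein_lfc[gene]
        rna_changed = abs(rna) >= threshold
        prot_changed = abs(prot) >= threshold
        if rna_changed and prot_changed:
            # Both layers change: check whether they move in the same direction.
            labels[gene] = "concordant" if rna * prot > 0 else "opposite"
        elif rna_changed:
            # RNA changes without a protein change: a candidate for buffering,
            # for example through protein degradation.
            labels[gene] = "rna_only"
        elif prot_changed:
            labels[gene] = "protein_only"
        else:
            labels[gene] = "unchanged"
    return labels


# Made-up log2 fold changes for five placeholder genes.
rna = {"gene_A": 2.0, "gene_B": 1.5, "gene_C": 0.1, "gene_D": -1.2, "gene_E": 0.2}
protein = {"gene_A": 1.8, "gene_B": 0.1, "gene_C": 0.0, "gene_D": 1.4, "gene_E": 1.3}

for gene, label in sorted(classify_concordance(rna, protein).items()):
    print(gene, label)
```

Genes labelled `rna_only` are the ones where a protein turnover measurement, as mentioned above, would be most informative.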
That's brilliant. Thank you. Thanks very much. Are there any additional answers to this question?
I'd say one factor to pay special attention to when integrating different omic data is that you've got to look at the distribution of each type of data set, as well as its information content, which is kind of its weight or importance, so that the true usefulness of the different omics data can be evaluated in a reasonable way. Because, say, an SNV is binary, and mRNA expression is a continuum. With these kinds of data, you have to find a sensible way to scale them and put them together, to evaluate their relative importance and contribution to the disease analysis.
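As a small illustration of the scaling point, one common option (an assumption here, not something the panel prescribed) is to z-score each feature so that a binary SNV call and a continuous expression value end up on a comparable scale before they are combined. The helper name `zscore` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Centre a feature on zero and scale it to unit standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A feature that never varies carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Made-up data for four samples: one binary SNV call and one mRNA level each.
snv_present = [0, 1, 0, 1]
mrna_level = [2.0, 8.0, 4.0, 10.0]

# After scaling, each sample becomes one row with both features on the same scale.
rows = list(zip(zscore(snv_present), zscore(mrna_level)))
for row in rows:
    print(row)
```

Z-scoring only addresses scale. It does not address the different information content of each data type, which would need separate weighting or a model that treats binary and continuous inputs differently.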
So, I want to add a little bit more, one sentence only. We are trying to use a lot of AI and machine learning to integrate these multi omics data. It is not easy. It is very, very difficult. And depending on how you do it, your output will be different, right? So when we talk about integrating these omics, I mean, we speak about it pretty easily. However, it's not easy. We have many different types of AI, many different types of machine learning, many different types of algorithms that we have tried for integration, and we have not settled on one. I think this field is still emerging, so please stay tuned. However, to the person who raised this question: your interest is really valuable, because that's the future, and that's how we can maximise the value of our existing data for future drug discovery and development.
That's brilliant. Thank you very much. That's a good conclusion to the webinar. Thank you so much, Tom, Sunhwa, Junmin and Wei, for joining this webinar. I hope you've enjoyed it and learned from each other, and it's great to have had you join us. So, we are concluding the webinar now, and thank you for your time. See you soon. Thanks so much. Thank you. Goodbye.