0:07 

Yiannis CEO and Co-founder. 

 
0:10 
I will give you a 20-minute rundown on how AI can be used as a tool for scientists in biomarker discovery, dealing with the key problems that exist in that process. 

 
0:24 
So it'll be a ride through a couple of slides and then happy to take any questions at the end. 

 
0:29 
We also have a booth where you can see a demo of the system in real life. 

 
0:33 
So I start with a couple of numbers that you all probably are familiar with. 

 
0:38 
10.5. That's the number of years it takes an asset to get from phase one to regulatory approval, as an average over all disease areas. 

 
0:47 
9.1. That's the number of years it takes with an effective biomarker strategy. 

 
0:52 
And now the key question is, of course, how can AI help here to further accelerate this? 

 
0:58 
Because as you know, every day counts for the patients who are waiting for the drugs out there. 

 
1:03 
So I want to give you an overview of three things today. 

 
1:07 
One, I want to make the case that proper biomarker discovery and identification is a knowledge problem, and it's a really difficult problem to solve correctly. 

 
1:18 
Then I want to dive into why it is important that AI should be there to support the human and not vice versa, and make the case for how that works in real life. 

 
1:29 
And then I'll go through two use cases with you: how that can be applied, and how we have seen it being applied by thousands of researchers that we work with. 

 
1:41 
OK, so let's start quickly. 

 
1:42 
You know what kind of biomarkers are out there? 

 
1:44 
I mean, all of you, I mean, you are all the experts here. 

 
1:47 
You probably know all this, but this is the point here, that there are a lot of different types of biomarkers. 

 
1:52 
There are those that are more in the later stages of development, like prognostic and predictive biomarkers. 

 
1:59 
There are those that are proximal biomarkers for target engagement, safety biomarkers, risk biomarkers, diagnostic biomarkers. 

 
2:05 
So there are all sorts of different biomarkers and these are different types of action and reaction between entities at the end of the day. 

 
2:14 
So they are important in drug development for different key use cases that you're familiar with. 

 
2:21 
And I will make the case that AI can help in a variety of different use cases that exist to identify these biomarkers and qualify them in your programmes. 

 
2:33 
That's a really difficult problem. 

 
2:35 
However, the scientist's reality is that the last innovation in science happened pretty much in 1997. 

 
2:44 
That was the invention of PubMed. Thanks to that, you don't have to go to libraries anymore to find your papers. 

 
2:51 
It's a great tool, obviously. 

 
2:53 
And you know, humans have been using tools over centuries and millennia to improve ourselves. 

 
2:59 
That was one huge improvement. 

 
3:02 
Since then, in that knowledge process of finding evidence and then qualifying it, removing bias, not much has changed. 

 
3:11 
Of course there are some expert tools, especially for technology-inclined people, but for the pharmacologists, the biologists, the medical directors, there's essentially mostly PubMed. That's the status quo. 

 
3:25 
On the other hand, scientific literature, documents, patents and clinical trials have been increasing exponentially; tens of thousands have been added since PubMed was released. 

 
3:37 
And the question becomes, of course, so what do you do now? 

 
3:40 
You know, how do you deal with all this data? 

 
3:43 
And the case here is to say that biomarker discovery fundamentally is underpinned by a knowledge problem. 

 
3:50 
And there are certain problems in biology. 

 
3:53 
To start with, biology is very complex, as you all know. It's multi-dimensional. 

 
3:58 
In many cases there are multiple networks that somehow interrelate with each other in order to conclude something. 

 
4:05 
There's a lot of complexity there. 

 
4:07 
At the same time there's hidden evidence in biology; it's what we call in computer science a long-tail problem. 

 
4:15 
So the things that you're looking for are not the common knowledge, but the things that are in the long tail. 

 
4:21 
So scarce knowledge. 

 
4:22 
It's very rare to find that there are outliers in there, right? 

 
4:27 
And these outliers, finding them is exceedingly difficult. 

 
4:30 
If you have tens of millions of documents out there, internal and external, there's data overload. 

 
4:36 
Bias is inevitably introduced by humans, because you cannot read 10,000 documents; you can only read maybe 500, or only 100 abstracts. 

 
4:46 
You introduce bias, your own bias, and you only select the things that you believe you will find something in, right? 

 
4:53 
So that is the consequence for dealing with the data. 

 
4:56 
And there's poor traceability. 

 
4:58 
You don't know what your colleagues are doing or what they have done before if you're in a research organisation together. 

 
5:05 
So you tend to, over the years, repeat maybe experiments or make the same mistakes that your colleagues have done before. 

 
5:12 
So these are a lot of different problems. 

 
5:15 
Now the good news is that all these, let's say, knowledge processing problems are very well suited to artificial intelligence: it can process that knowledge for you, and then you can use it. 

 
5:28 
The way you can imagine this, and that is how we do it at Causaly, is that there is an AI that can read similarly to a human. 

 
5:38 
It understands all these relationships between cause and effect, between action and reaction from text. 

 
5:44 
And then you can ask questions. 

 
5:46 
So you have to imagine a system that has read everything, and you can ask it a question. 

 
5:49 
Now the question that was asked here is: what are potential targets related to the pathogenesis of Sjogren's syndrome? 

 
5:57 
That was the question here. 

 
5:58 
You get like a dendrogram. 

 
6:00 
You can see you do not get documents, because for Sjogren's syndrome that is probably a few thousand documents. 

 
6:08 
What you're getting here is a list of potential targets and you're getting a snippet there and it's from a paper from the conclusion section. 

 
6:16 
And it says, "Our results suggest that IL-22 may play a pro-inflammatory role in the pathogenesis of pSS", right? 

 
6:24 
So the reason why you can ask that question and get a dendrogram and not 10,000 documents is that the AI has read everything and you can interrogate that data. 

 
6:34 
Now, the important thing here is that the AI is a tool for you. 

 
6:39 
It is like, you know, what happened with PubMed: you don't have to go to the library anymore. 

 
6:43 
That's great. 

 
6:44 
Here you don't have to read everything, and you can, you know, separate the signal from the noise, because there's a lot of noise in finding the right documents or the right evidence. 

 
6:57 
So we work with great companies together. 

 
6:59 
We're very fortunate to work with 10 of the top 20 pharma companies, literally with thousands of scientists, not only in biomarker identification and discovery but also in target ID, non-clinical safety, all these questions that are underpinned by biological problems. 

 
7:16 
We're working with them. 

 
7:18 
We have very happy users. 

 
7:19 
Scientists actually are adopting our solution and they're using it, which is a very rare thing if you have looked into science organisations, not only into commercial ones but also into non commercial organisations. 

 
7:33 
So let's put ourselves into the shoes of a scientist, to show you what that looks like in reality and how we have seen it being done by our users around the world. 

 
7:46 
So that's James. 

 
7:48 
James has a project designing a clinical trial investigating a treatment for luminal A breast cancer, in the phase of discovery. 

 
7:56 
And these are a couple of steps that James would be doing, highly simplified. 

 
8:01 
Obviously, it's more complex behind it. 

 
8:04 
But we want to identify diagnostic biomarkers to define inclusion and exclusion criteria. 

 
8:09 
And for that, we need to explore what are the known and emerging biomarkers, which ones of them are validated, studied in humans. 

 
8:16 
And then we want to specifically find those ones related to the luminal A subtype and not to luminal B. 

 
8:24 
So these are like very specific questions for our use case. 

 
8:27 
So let's start with question number one. Question number one is: what are the known and emerging diagnostic biomarkers for luminal A breast cancer? 

 
8:36 
The typical thing that you would do at this moment is to go into some sort of PubMed-like system and make a search, right, for luminal A and maybe targets, biomarkers*, I mean, all sorts of search strategies that you can do. 

 
8:50 
And normally you would get a lot of documents, and then you would have to start reading, find biomarkers and work out whether what you're reading there makes sense. In a system, again, that has read everything, you can just ask: what are the emerging biomarkers? And what you're getting here effectively is a timeline. 

 
9:08 
So the biomarkers are graphed here on a timeline from 2014 up to 2023. 

 
9:15 
And you can see the latest one is DNM2, which emerged for the very first time in January 2023 in the scientific literature, or in the corpus, and it's coming out of this publication. 

 
9:28 
And there is the sentence that says, "These observations suggest the key role of Dyn2 in the progression of luminal A breast cancer", right? 
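
A minimal sketch of how such a first-emergence timeline could be computed, assuming the system has already extracted (biomarker, publication date) mentions from the corpus. The function name `first_emergence`, the placeholder `marker_a` and all dates except the month of the DNM2 entry are invented for illustration; this is not a description of how Causaly builds its timeline.

```python
from datetime import date


def first_emergence(mentions):
    """Return (biomarker, earliest date) pairs, oldest first.

    `mentions` is an iterable of (biomarker_name, publication_date) tuples.
    """
    first = {}
    for name, published in mentions:
        # Keep only the earliest date seen for each biomarker.
        if name not in first or published < first[name]:
            first[name] = published
    return sorted(first.items(), key=lambda item: item[1])


# Hypothetical mentions; only "DNM2, January 2023" comes from the talk.
mentions = [
    ("marker_a", date(2016, 5, 1)),
    ("DNM2", date(2023, 1, 1)),
    ("marker_a", date(2014, 3, 1)),
    ("DNM2", date(2023, 6, 1)),
]

timeline = first_emergence(mentions)
for name, published in timeline:
    print(published.isoformat(), name)
```

The last entry of the sorted list is the most recently emerged biomarker, which is the one the scientist would open first to check the method and the model organism.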

 
9:36 
So that might be a relevant document for you or not. 

 
9:40 
But what the AI does here, it doesn't tell you this is the Holy Grail. 

 
9:44 
It tells you: here it is, investigate it further. Now you can dive into it, and you can look into the experimental method and the model organism. 

 
9:53 
You can think, do I believe this or not? 

 
9:55 
That is still a scientific exercise to do that, right? 

 
9:59 
But you found it effectively immediately from thousands of documents. 

 
10:04 
So once you have found your biomarker, the next question you would ask is which biomarkers have been studied in humans. 

 
10:12 
So there might be a very long list of biomarkers, and the ones studied in humans perhaps carry higher validation for you. 

 
10:19 
And what you can do is, of course, you can set a filter and say: from these 500 biomarkers, show me only the ones that are in human models, OK, so in humans. 

 
10:29 
And then the list gets streamlined and it gets sorted again. 

 
10:34 
You as a scientist can choose how to sort this list. 

 
10:38 
You can sort it by most recent, by most studied, by some other characteristics that exist there in the background. 

 
10:45 
You can choose them. 
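
The filter-then-sort step can be sketched as follows, assuming each biomarker hit carries the species it was studied in, the year it first appeared and a paper count. The `BiomarkerHit` fields, the `filter_and_sort` helper and the sample records are hypothetical and only illustrate the idea of the user choosing the filter and the sort key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiomarkerHit:
    name: str
    species: str      # organism the evidence comes from
    first_year: int   # year the biomarker first appeared
    n_papers: int     # number of supporting papers


# Sort keys the user can choose between (assumed names).
SORT_KEYS = {
    "most_recent": lambda hit: hit.first_year,
    "most_studied": lambda hit: hit.n_papers,
}


def filter_and_sort(hits, species="human", sort_by="most_recent"):
    """Keep hits for one species, then sort descending by the chosen key."""
    selected = [hit for hit in hits if hit.species == species]
    return sorted(selected, key=SORT_KEYS[sort_by], reverse=True)


# Invented sample data.
hits = [
    BiomarkerHit("marker_a", "human", 2019, 40),
    BiomarkerHit("marker_b", "mouse", 2023, 3),
    BiomarkerHit("marker_c", "human", 2022, 7),
]

print([hit.name for hit in filter_and_sort(hits)])
print([hit.name for hit in filter_and_sort(hits, sort_by="most_studied")])
```

Keeping the sort keys in a plain dictionary makes the ranking explicit and inspectable, which is the point being made about the scientist staying in control.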

 
10:47 
So the important thing, and that's why this talk is under human-centred AI, is that you are in the driving seat, you have autonomy and you have full transparency of what the AI is doing. 

 
11:00 
It's working for you and not the opposite, right? 

 
11:03 
So we do that. 

 
11:05 
We now select our biomarker, or dive a bit deeper into the one that could be of interest to us. 

 
11:11 
And now the last question is I want to have only the biomarkers that are unique to luminal A, to the luminal A breast cancer subtype. 

 
11:21 
This is a really difficult question to answer normally, because it effectively requires you to have read everything and found all the biomarkers: the ones related to the luminal B subtype and the ones related to the luminal A subtype. 

 
11:38 
And then you can see here this intersection are the common biomarkers. 

 
11:42 
You don't want the common ones. 

 
11:44 
The question is, you want the unique ones, right? 

 
11:46 
So you're looking here at this set of biomarkers related only to luminal A, and you can find some examples in there, like cortactin. 

 
11:55 
There are common biomarkers as well, but we don't want those. 
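
The comparison described here is, at its core, a set difference and a set intersection. A minimal sketch, assuming the biomarkers for each subtype have already been extracted into sets: cortactin is the example named in the talk, while `marker_x`, `marker_y`, `marker_z` and the `split_by_subtype` helper are invented placeholders.

```python
def split_by_subtype(markers_a, markers_b):
    """Split two biomarker collections into unique and shared groups."""
    a, b = set(markers_a), set(markers_b)
    return {
        "only_a": a - b,   # unique to the first subtype
        "common": a & b,   # shared by both subtypes
        "only_b": b - a,   # unique to the second subtype
    }


# Hypothetical extraction results for the two subtypes.
luminal_a = {"cortactin", "marker_x", "marker_y"}
luminal_b = {"marker_y", "marker_z"}

result = split_by_subtype(luminal_a, luminal_b)
print(sorted(result["only_a"]))
print(sorted(result["common"]))
```

The set operations themselves are trivial; the hard part the talk points to is producing complete, evidence-backed sets for each subtype in the first place.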

 
12:00 
So getting this kind of information, if at all, is normally quite a lengthy bioinformatics project. 

 
12:09 
That is if you have access to those resources, and if they can pull it out. With a system that is AI-based, you can ask this question and get this; it's a screenshot from Causaly. 

 
12:21 
You can get this in two seconds. 

 
12:23 
This view here works with any kind of disease, with any question that you want answered; you can do a common-cause and a common-effect analysis, essentially. 

 
12:32 
So this is what happened here. 

 
12:33 
So these were our three steps. 

 
12:36 
And what is the advantage? 

 
12:38 
I mentioned a couple of things here already. While in an AI-based system like Causaly you find hundreds of biomarkers, the related search in PubMed gives you thousands of papers, right? 

 
12:52 
And you cannot, you will not read these papers. 

 
12:55 
Even if you had the time, you would probably not invest it in reading three and a half thousand papers, right? 

 
13:00 
So this is a huge advantage. 

 
13:02 
And of course there's a time advantage, but there's also the advantage that, even if you actually had time to read 5,000 papers, it is very likely that you would not remember paper number 1 after you read paper number 3,000, right? 

 
13:18 
So it's not humanly possible. 

 
13:20 
And that's fine because that's not the key competence of a scientist. 

 
13:25 
A great scientist is very sceptical and can qualify evidence, right? 

 
13:30 
Which is the difficult part actually and not finding the evidence to begin with. 

 
13:34 
So that was one use case and I have one more before we wrap up. 

 
13:38 
That's Emma. 

 
13:39 
That's another project, here in early preclinical, and that's about target engagement. 

 
13:46 
We want to demonstrate proof of concept of BCL-2 inhibitors to treat hematologic malignancies. 

 
13:51 
Again, we have a couple of steps. 

 
13:53 
We want to identify biomarkers to evaluate target modulation or target engagement. 

 
13:58 
And for that, we want to see the relationships between BCL-2 and another downstream entity, look at pathways, and then maybe design an assay approach for how to do this. 

 
14:12 
So that's a very typical workflow again that we have seen our users do. 

 
14:17 
So how do we do this? 

 
14:18 
We start with number one. 

 
14:20 
The question is what biomarkers are closely linked to the mechanism of action of BCL-2. 

 
14:26 
Effectively, what you search for here, and you type something like this in our system, is pretty much what you would maybe also type in PubMed, right? 

 
14:36 
So you type "biomarker BCL-2 knockout mice", and you're searching for biomarkers affected by BCL-2 in knockout mice. 

 
14:43 
And what you're getting is a list, right? 

 
14:46 
And you're getting a list here. 

 
14:48 
And the list is underpinned by evidence, here from the results section. 

 
14:51 
BCL-2 reduction is further associated with the activation of caspase 3. 

 
14:57 
All right. 

 
14:58 
So that's what you get here. 

 
14:59 
It's one of the biomarkers in that list. 

 
15:02 
Again, you can sort them, you can rank them, you can deal with them in different ways. 

 
15:06 
It's transparent for you how to do this. 

 
15:09 
So you found one biomarker here that you want to look into in more detail. 

 
15:15 
The next one is a much more complicated scenario. 

 
15:19 
You can look into pathways, right, that go out to a distance of two from your biomarker. 

 
15:28 
So what you see here is that you can generate hypotheses. You can say this is BCL-2 here, and BCL-2 activates or induces the BAX protein, and that one is related to apoptosis. 

 
15:43 
So, my apologies, you can't read it all here, but this is apoptosis, BAX and BCL-2. 

 
15:49 
And you could now hypothesise that this link is somehow related to apoptosis. 

 
15:54 
And of course it is; in this case BCL-2 is known to be regulating or modulating apoptosis. 

 
16:00 
So that's a clear one. 

 
16:01 
But imagine you were looking at a target on which almost nothing has been studied. 

 
16:08 
So there's no literature, there's no prior knowledge out there. 

 
16:12 
There's very little known about it. 

 
16:14 
You can look at these cascades and then conclude that apoptosis could be an effect that you would be looking for in your assays when you are modulating the target. 
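
Looking at cascades out to a distance of two can be sketched as a bounded walk over a directed relationship graph. This is an illustrative sketch only: the `paths_within` helper and the adjacency-dictionary representation are assumptions, and the tiny graph contains just the BCL-2, BAX and apoptosis chain mentioned on the slide.

```python
def paths_within(graph, start, max_hops=2):
    """Return every simple path from `start` with 1..max_hops edges.

    `graph` maps an entity name to the list of entities it acts on.
    """
    paths = []
    frontier = [[start]]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for neighbour in graph.get(path[-1], []):
                if neighbour in path:
                    continue  # skip cycles so paths stay simple
                new_path = path + [neighbour]
                paths.append(new_path)
                next_frontier.append(new_path)
        frontier = next_frontier
    return paths


# Toy graph with only the relationships named in the talk.
graph = {
    "BCL-2": ["BAX"],
    "BAX": ["apoptosis"],
}

chains = paths_within(graph, "BCL-2")
for chain in chains:
    print(" -> ".join(chain))
```

For a poorly studied target, the two-hop endpoints of such a walk are the candidate downstream effects a scientist might then qualify and consider for assay readouts.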

 
16:27 
So this is a really advanced way of how you can do these things and you can only do them with AI based systems. 

 
16:36 
And finally, you remember in step number one we took, for example, caspase 3. 

 
16:44 
And now the question is, you know, is a given biomarker assayable? 

 
16:48 
And one of the key questions to ask there is: where is this biomarker being expressed? 

 
16:53 
So how are you going to find that? Again, as I said, biomarker identification is underpinned by a knowledge problem, right? 

 
17:03 
So how are you going to find the cells where a certain biomarker is being expressed? If a system has read everything, you can just ask these questions, like cells and tissues affected by caspase 3, and you get here a long list of different cells like hepatocytes or myocytes, cardiac myocytes, et cetera. 

 
17:25 
And then you can click on them and you can interrogate that evidence and hopefully choose something that makes sense for your assay design. 

 
17:34 
So, very quickly again, you can get to the results. 

 
17:37 
That's the headline here. 

 
17:41 
There are again, some numbers here. 

 
17:42 
We don't have to go into them in detail, but you can see here there are like a lot of cell types that we have in the system and you can just go through it and quickly find that evidence for you. 

 
17:55 
So finally, I'm going to leave you with a couple of things. 

 
17:58 
It's my last slide. 

 
18:01 
As I said, biomarker discovery, or anything really in preclinical research and discovery, is underpinned fundamentally by a knowledge problem. 

 
18:12 
The tools that we have as scientists, they are a bit outdated. 

 
18:17 
And while we are fighting against an exponential increase of data and text, internal, external and commercial evidence, it's time to upgrade the software stack that a scientist has at her disposal. 

 
18:33 
And AIs can do that. 

 
18:35 
But it's really important that the AI is there to support the scientist and not vice versa. 

 
18:42 
That means the scientist needs to have full control, full transparency, and is in the driver's seat for interrogating evidence and qualifying the evidence themselves. 

 
18:53 
And the AI is just there to make your life easier, right? 

 
18:57 
So you can separate signal from noise when you're looking at thousands of documents, or potentially tens of thousands. 

 
19:06 
That's all I had. 

 
19:07 
Thank you very much.