0:02 

Thank you very much for the introduction and, of course, for the opportunity today to present a range of single cell omics approaches that we are currently implementing in our core unit. 

 
0:19 
This is the NGS Integrative Genomics Core Unit at the University Medical Centre Göttingen. 

 
0:24 
And of course, the data we present today have been generated in close collaboration with our users, researchers in Göttingen and at other institutions. 

 
0:37 
The rapid evolution in omics technologies has dramatically changed the way we are investigating human diseases, offering unprecedented insights into the molecular foundations of complex diseases, for example cancer, a range of heart disorders or neurodegenerative diseases. 

 
1:03 
By integrating omics data, genomics, transcriptomics, epigenomics, proteomics, researchers can now explore with much more precision pathogenic or disease mechanisms. 

 
1:19 
And in particular, when we are integrating single cell omics, independent of RNA or DNA approaches, we can have a closer look at the landscape at single cell resolution, promoting, of course, the discovery of new biomarkers and, in particular, improving the accuracy of molecular phenotyping. 

 
1:48 
And this, of course, leads to an improvement in patient outcome. 

 
1:53 
What I would like to highlight today is the way that we are currently integrating different data sets of omics data. 

 
2:01 
And I would like to give you 2 examples. 

 
2:04 
The first of them is the development of single cell DNA sequencing that we recently implemented by using the Shasta chemistry. 

 
2:16 
We chose that especially because of the quality of the amplification step for single cells, which is the most critical step in single cell DNA sequencing. 

 
2:25 
And the goal was to investigate copy number variants or small structural variants at the single cell level. 

 
2:34 
And for that, in order to validate our findings, what we did was to perform in parallel standard whole genome sequencing using short read platforms such as Illumina. 

 
2:46 
But we even included the use of long reads, in this particular case the synthetic long reads recently launched by Illumina. 

 
2:57 
But today I would like to focus much more on the performance of very well done single cell RNA sequencing. 

 
3:05 
Why very well done? Because this kind of data set is generated in a full length manner, sequencing deeper than conventional single cell RNA sequencing, so that we can simultaneously apply variant calling analysis, determining variants, and perform traditional differential gene expression analysis as well.  

 
3:31 
The most critical step when we start thinking about how to do a well done single cell analysis is, of course, how to isolate cells, living cells, single cells, and to discard debris, cell doublets, cell clusters and so on, which you can never check beforehand with conventional methods.  

 
3:53 
We chose the ICELL8 and now the Shasta instrument because we have the opportunity to check single cell quality by imaging. The Shasta instrument allows the use of three different channels, so that we can, for example, by staining with DAPI and Texas Red, discriminate beautifully between living and dead cells, and even check quality prior to library preparation and prior to sequencing. 

 
4:23 
The other advantage is that we tested a broad range of sizes, starting with nuclei, but even using cells that are very unusual in morphology and in size. And the new Shasta instrument has a further possibility. 

 
4:42 
Now, this is the recent launch: it has the possibility to work with a combinatorial indexing strategy, which increases the number of cells that you can analyse per chip or per experiment from 1,600 to nearly 100,000 cells. 

 
5:03 
In my opinion, the most important thing is the chemistry, and the reason for choosing the SMART or Shasta chemistry is the flexibility in library preparation: you can access 3' end approaches. 

 
5:18 
These are very similar to droplet technologies. Or you can choose SMART full length approaches, allowing you to analyse isoforms or perform variant calling. 

 
5:35 
And very interesting is the new RNA Shasta protocol, because it's the very first one that allows you to perform stranded total RNA sequencing, which is very important because in many human diseases long non-coding RNAs become very important. 

 
5:54 
And by using that you can increase the detection of both coding and long non-coding RNAs. 

 
6:02 
And then, of course, you would like to sequence them. 

 
6:06 
The flexibility in sequencing is really great because you can choose single-end or paired-end mode. 

 
6:14 
And you can choose different read lengths from 50 to 150, making it possible to mix the samples with other standard NGS runs and, of course, reducing the price in that case. This is an example of single cell DNA sequencing performed with the Shasta chemistry. 

 
6:35 
In this particular case, we validated the data and chose, of course, for the very beginning, very simple samples with an identical background: in this case, the retinal cell line RPE1, which we treated with DMSO as a negative control and with an MPS1 inhibitor, already known to induce random aneuploidy. 

 
7:01 
In this particular case, we dispensed the cells, which underwent quality control with DAPI and Texas Red. 

 
7:09 
In this particular case. 

 
7:10 
And the good thing is that the system from the dispensing and the whole library preparation is fully automated. 

 
7:19 
In this particular case, that means doing the amplification and the incorporation of the indexes. 

 
7:25 
And as you can see, at the end we pool all libraries. 

 
7:29 
And in this particular case we performed in parallel the synthetic long reads from Illumina as well as standard whole genome sequencing. 

 
7:38 
And we sequenced on one and a half S4 flow cells because of the huge amount of data. 

 
7:45 
And at the end you can see the primary data analysis, visualised as a typical heat map produced with AneuFinder, and you can beautifully see the difference between the positive and the negative control. 

 
8:01 
We are currently focusing especially on a range of studies of long non-coding RNAs performed with heart and brain, and in this particular case I am presenting today a recent publication from 2024 on a very relevant long non-coding RNA, PRDM16-DT, that was specifically expressed in astrocytes. 

 
8:32 
This study was done with postmortem tissues, and it is just to demonstrate how good the performance is when searching for specific RNAs. Long non-coding RNAs are tissue specific, and in this specific case most of the long non-coding RNAs are moderately expressed. 

 
8:51 
So you also need a very sensitive approach to detect most of them. 

 
9:00 
This study has been done together with Andre Fischer. 

 
9:03 
He is the director of the German Centre for Neurodegenerative Diseases in Göttingen, and we plan to do many more samples, and now we are including heart. And this is just to give you an example of very unusual cells that we could access in a whole cell manner. 

 
9:25 
In this particular case, these are cells from the peripheral nervous system. These cells exhibit prominent axonal regrowth and have a size between 200 and 500 microns. 

 
9:40 
And as you can see by imaging, the cells we could access, this was with the ICELL8, were about 800 cells within one chip. 

 
9:52 
Very important: I mentioned before that with this instrument we can work with nuclei and with very large cells. 

 
10:01 
And of course when you are working with heart the investigation of cardiomyocytes is imperative. 

 
10:09 
And the idea was to start performing studies with whole cardiomyocytes, which are again unusual in morphology, and they have a size between 60 and over 260 microns. 

 
10:28 
And in this specific case, the very first image illustrates some nuclei, and the middle one the cardiomyocytes, in this case enriched from mouse samples. 

 
10:39 
And what we have to take into account is that most of the cardiomyocytes, about 30, it depends on mouse or human, but most of them are polynucleated. 

 
10:50 
And for that reason, it's much better to perform this kind of approach with whole cells. 

 
10:58 
And in the last case, these are normal stem cells differentiated to cardiomyocytes, which are totally standard for the instrument. 

 
11:09 
And here, just to give you two examples of enriched cardiomyocytes: the very first publication we did was in 2020, and in 2023 we could access different subtypes of cardiomyocytes when they undergo a stress situation. 

 
11:30 
Now we are coming to the second part of the presentation. 

 
11:34 
And what I would like to say is that it is really very impressive when you can access different kinds of things by using one data set. 

 
11:46 
And again, by doing the full length approach, we performed this study with nuclei and discovered that using full length sequencing on nuclei gives access to a lot of intergenic and intronic regions. 

 
12:10 
Of course, you know that intronic sequences are nearly half of the complete genome. 

 
12:18 
And for that we tested how many of the pre-messenger RNAs are detectable when we are doing this kind of approach. 

 
12:29 
And nearly 30% of the sequences were intronic and intergenic sequences. 

 
12:36 
And we confirmed that we could cover a lot of pre-messenger RNAs in the nuclei. 

 
12:44 
And the idea was to access SNP calling and focus especially on the underreported intronic variants, especially deep intronic variants, and by analysing that we accessed some genes that undergo pathogenic splicing or missplicing in cancer. 

 
13:08 
So I have to say, pancreatic cancer is the best example for testing that kind of study. 

 
13:16 
And we reported in this publication from 2024 a large number of genes that exhibit intron retention or pseudoexon activation, of course changing the protein. 

 
13:33 
And the question is whether the protein is still there or not, or whether it will probably be degraded. 

 
13:38 
This we have not tested yet, but we could assess the relevance of intronic variants situated in the proximal region, very near to the exon. 

 
13:50 
But much more interesting was to evaluate the significance of an intronic variant situated very deep in the intron. 

 
13:59 
But on the other hand, you can proceed with classical transcriptomic analysis, improving the phenotyping. 

 
14:07 
This classification is very old, the basal-like and the classical-like. 

 
14:13 
I have to say that the real phenotyping of cancer is much more complex than that. 

 
14:19 
But this is just to give you an example of how the clustering looks in organoids. 

 
14:25 
Here I will explain what exactly we did and the way that you can, of course, access very novel biomarkers. 

 
14:34 
In this particular study, we proceeded with single nuclei RNA sequencing using primary PDAC cells from patients and working with the clinical models, which are PDX and organoids. 

 
14:49 
And the idea was to corroborate the transcriptional concordance between the original tumour and the models. 

 
14:59 
And when we are clustering by patient, in this case I'm only showing three patients, you can see clear clustering for organoids, PDX and tumours. 

 
15:09 
And you can even see that the tumour, the primary tumour is the only tissue which is very heterogeneous. 

 
15:17 
And most of the organoids became mostly classical-like. 

 
15:22 
And doing that, you can beautifully analyse biomarkers from the basal-like cells as well as the classical ones. 

 
15:30 
And again, here we found out that the most relevant markers for basal-like cells were long non-coding RNAs such as MALAT1 or NEAT1. 

 
15:43 
In parallel. 

 
15:44 
Starting with the coding variants, we detected shared variants between all tumours investigated, about 78,000 variants including insertions and deletions. 

 
15:58 
And as you can see most of them are located in intronic regions. 

 
16:02 
Of course, this is half of the genome, and about 80% are related to protein-coding genes. 

 
16:12 
And as you can see long non-coding RNAs are still very prominent in this data set, including 121 small RNAs for example. 

 
16:23 
So very sensitive assay and the question was how to analyse intronic variants. 

 
16:30 
And the first thing we did was to normalise the data based on intron length, because this is very variable. 

 
16:36 
And then we classified by position: splice site (SS), within the two base pairs after the exon; proximal regions, within the 40 base pairs near to the exon; and the rest is deep. 
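
A minimal sketch of this position-based classification and of the length normalisation, using the 2 bp and 40 bp cut-offs mentioned above. The coordinate convention (1-based, inclusive), the choice to measure distance to the nearest exon boundary, the treatment of "proximal" as 3 to 40 bp, and all function names are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter

SPLICE_SITE_BP = 2   # cut-off from the talk: within 2 bp of the exon
PROXIMAL_BP = 40     # cut-off from the talk: within 40 bp of the exon


def classify_intronic_variant(pos, intron_start, intron_end):
    """Classify a variant by its distance to the nearest exon boundary.

    Assumed convention: 1-based inclusive coordinates, where intron_start
    and intron_end are the first and last intronic bases.
    """
    if not intron_start <= pos <= intron_end:
        raise ValueError("variant is not inside the intron")
    # Distance 1 means the intronic base directly adjacent to an exon.
    distance = min(pos - intron_start, intron_end - pos) + 1
    if distance <= SPLICE_SITE_BP:
        return "splice_site"
    if distance <= PROXIMAL_BP:
        return "proximal"
    return "deep"


def variants_per_kb(n_variants, intron_start, intron_end):
    """Normalise a variant count by intron length (variants per kilobase)."""
    length = intron_end - intron_start + 1
    return n_variants / (length / 1000)


if __name__ == "__main__":
    intron = (1001, 6000)  # toy intron of 5,000 bp
    positions = [1001, 1002, 1003, 1040, 1041, 3500, 5961, 5999]  # toy variants
    counts = Counter(classify_intronic_variant(p, *intron) for p in positions)
    print(dict(counts))
    print(variants_per_kb(len(positions), *intron))
```

Normalising per kilobase is one simple way to make long and short introns comparable; the exact normalisation used in the study is not stated in the talk.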

 
16:52 
And we classified that and performed for example pathway enrichment analysis. 

 
16:57 
And what is really very beautiful to see is that the well reported variants within the proximal region, very near to the exon, are much more related to RNA polymerase transcriptional elongation, which makes sense. 

 
17:14 
And with the deep ones we found, for example in this particular PDAC case, histone H3K4 trimethylation activity. 

 
17:25 
So now I will present the most important slide in the presentation. 

 
17:30 
This is a traditional heat map, you know, based on UMAP. When we are integrating all data from primary PDAC, we could assess 15 common markers in all tumours for the basal-like cells, which are the more aggressive type in the tumour, and nearly 78 markers for the classical one. 

 
17:54 
And it gets very interesting when you integrate the information from the intronic variants. 

 
18:02 
We found out that basal-like tumour cells are associated with genes exhibiting a high frequency of intronic variants in PDAC, and this is the best finding. 

 
18:13 
The problem was to convince people that the way to perform phenotyping is much more complex than that, and the good news was that in 2024 other scientific reports confirmed our findings. 

 
18:32 
An alternative splicing signature defines the basal-like phenotype and predicts worse clinical outcome in pancreatic cancer. 

 
18:40 
So this is totally concordant with the findings that we got by single cell analysis. 

 
18:48 
And in this particular case we introduced public data. 

 
18:52 
In this case it was the cBioPortal, taking all pancreatic samples, and the points in green are concordant findings between the public data and our own data from the Clinical Research Unit 5002. 

 
19:13 
This is about pancreatic cancer and genome dynamics. 

 
19:19 
We sequenced nearly a hundred samples in this consortium, and we put everything together, our data and the public data, and found a set of 22 genes that are not only mutated: the mutation is related to splicing. And you can see the Kaplan-Meier curve showing survival with and without mutations in these 22 genes. 
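
For readers unfamiliar with how such a survival curve is computed, here is a minimal sketch of the Kaplan-Meier product-limit estimator. The follow-up times and event flags are made-up toy numbers, not data from the study or from cBioPortal, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with a death.

    times:  follow-up time per patient
    events: 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Toy group: five patients, one censored at time 12.
    for t, s in kaplan_meier([5, 8, 12, 12, 20], [1, 1, 1, 0, 1]):
        print(t, round(s, 3))
```

Computing one such curve for patients with mutations in the gene set and one for patients without gives the two lines of the plot; comparing them statistically would additionally need a test such as the log-rank test, which is not shown here.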

 
19:47 
This has been done with the cBioPortal again. 

 
19:51 
And lastly, I would like to illustrate how you can really integrate genotype and phenotype. 

 
20:01 
This is another study that we performed with dermal fibroblasts from progeria patient samples. We took six different phenotypes, because we knew beforehand that the progeria phenotypes are totally different, and performed ICELL8 full length sequencing. 

 
20:23 
And as you can beautifully see, there are clusters based on patients. 

 
20:29 
In this study we included two controls from dermal fibroblasts. 

 
20:34 
And just to give you an example, what we are specifically doing is this: when you can separate phenotypes, you can search for patient-specific variants and, of course, non-synonymous mutations. 

 
20:52 
We corroborated these data with very well done bulk RNA sequencing as well as whole genome sequencing. 

 
20:59 
So all variants that we reported in these studies have been described in the SNP database and were corroborated with our own bulk sequencing as well as our own whole genome sequencing. 

 
21:15 
And the beautiful part comes if you would like to analyse the possible relevance of the mutations. 

 
21:21 
What you can do is then perform differential expression analysis between mutated and non-mutated cells from one given phenotype. 
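
The grouping step behind such a comparison can be sketched as follows: cells of one phenotype are split by whether they carry the variant, and expression is compared between the two groups. This sketch only computes a log2 fold change of mean expression with a pseudocount; the data layout, gene names and function name are hypothetical, and a real analysis would add a statistical test and multiple-testing correction.

```python
import math
from statistics import mean


def log2_fold_changes(expr, mutated_cells, pseudocount=1.0):
    """Compare mean expression between mutated and non-mutated cells.

    expr:          {gene: {cell_id: normalised expression value}}
    mutated_cells: set of cell ids in which the variant was called
    Returns {gene: log2((mean_mutated + pc) / (mean_non_mutated + pc))}.
    """
    result = {}
    for gene, per_cell in expr.items():
        mut = [v for cell, v in per_cell.items() if cell in mutated_cells]
        wt = [v for cell, v in per_cell.items() if cell not in mutated_cells]
        if not mut or not wt:
            continue  # both groups are needed for a comparison
        ratio = (mean(mut) + pseudocount) / (mean(wt) + pseudocount)
        result[gene] = math.log2(ratio)
    return result


if __name__ == "__main__":
    # Toy values for four cells of one phenotype (not data from the study).
    expr = {
        "GENE_A": {"c1": 7, "c2": 7, "c3": 1, "c4": 1},
        "GENE_B": {"c1": 3, "c2": 3, "c3": 3, "c4": 3},
    }
    print(log2_fold_changes(expr, mutated_cells={"c1", "c2"}))
```

Genes ranked by such a comparison are what would then be passed to a pathway analysis.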

 
21:31 
And in this particular case, you can see crucial pathways that we found by using, in this case, Reactome. 

 
21:39 
But the most important thing is that the pathways correlate with the transcriptional analysis that we performed conventionally, in parallel, with the same data set. 

 
21:52 
So I would like to thank Argyris Papantonis. 

 
21:56 
Our lab is working very closely with the Translational Epigenetics laboratory. 

 
22:02 
He's an expert in, for example, Hi-C, chromatin and epigenetics. 

 
22:08 
And of course, I thank all researchers, groups and institutions. As a core unit, we have to work very closely with them in order to get material and to develop methods like those shown here. 

 
22:24 
Thank you very much.