Henry Cope works as a bioinformatician at both the Royal Marsden and Great Ormond Street Hospitals in London, helping to improve their clinical genomics workflows. Each of these hospitals uses genetic screening and analysis for different ends. The Royal Marsden looks for somatic genetic variation in tumour samples, while Great Ormond Street focuses more on diagnostic variants.

The clinical genomics workflow starts with samples being collected and sequenced via next-generation sequencing. The sequencing data are automatically processed to identify variants, and the results are uploaded to a web server where clinical scientists can add their interpretations. A report is then generated and sent to the clinician, who can assign a clinical outcome such as a change to treatment, further testing, or a diagnosis.

Increased use of the NHS genetic testing directory has driven up sample numbers in recent years, with the Royal Marsden Hospital receiving up to 3,500 in some months. This growth in volume has created a need to automate and streamline processes to make them more efficient.

Artificial intelligence and machine learning technologies are among the tools being used to automate the clinical genomics workflow. Transformer-based AI models, such as large language models (LLMs), rely on self-attention mechanisms, which capture long-range contextual relationships between tokens. The transformer architecture allows complex tasks to be automated, including tasks in the genomics workflow.
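To make the idea of self-attention concrete, here is a minimal sketch in plain Python. It is an illustration, not anything shown in the talk: the function names and toy embeddings are invented, and the learned query, key, and value projections used by real transformers are left out.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings):
    """Scaled dot-product self-attention with identity projections.

    Assumption: real models first apply learned query, key, and value
    matrices; they are omitted here to keep the sketch short.
    """
    dim = len(embeddings[0])
    outputs, weights = [], []
    for query in embeddings:
        # Score this token against every token in the sequence, near or far.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted mix of all token vectors.
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, embeddings)) for i in range(dim)]
        )
    return outputs, weights

# Toy 2-dimensional embeddings for a four-token sequence (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
```

In this toy sequence the first and last tokens have identical embeddings, so the first token attends to the last as strongly as it attends to itself, despite the distance between them. This is the long-range behaviour described above.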

Cope highlighted a project which aimed to integrate LLMs into genomic reporting. Once the bioinformatics pipeline has produced its report, a clinical scientist checks it and writes it up as a summary. The project aimed to streamline this summary-writing step using AI, particularly for straightforward cases, to alleviate a bottleneck in the workflow. Retrieval-augmented generation is used to enhance model accuracy by providing context from structured guidelines and scientific literature. Cope described how this helped to prevent errors and hallucinations in AI-generated summaries.
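The retrieval step in retrieval-augmented generation can be sketched roughly as follows. This is a simplified illustration, not the project's implementation: word overlap stands in for the embedding-based similarity search a production system would more likely use, the snippets in `KNOWLEDGE_BASE` are invented placeholders, and the function names are hypothetical. No model is called; the sketch only builds the prompt that a model would receive.

```python
import re

def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query and return the top k.

    Assumption: word overlap is a stand-in for vector similarity search.
    """
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(case_description, documents, k=2):
    """Prepend the retrieved context to the case so the model can ground its answer."""
    context = retrieve(case_description, documents, k)
    context_block = "\n".join(f"- {snippet}" for snippet in context)
    return (
        "Use only the context below to draft a clinical summary.\n"
        f"Context:\n{context_block}\n"
        f"Case: {case_description}\n"
    )

# Placeholder snippets, invented for illustration only.
KNOWLEDGE_BASE = [
    "Placeholder guideline: how to report a somatic variant found in a tumour sample.",
    "Placeholder guideline: how to word a summary when no variant is detected.",
    "Placeholder note: sample handling and storage requirements.",
]

prompt = build_prompt("Somatic variant detected in tumour sample", KNOWLEDGE_BASE, k=1)
```

Because the prompt carries the most relevant guideline text and instructs the model to rely on it, the generated summary is less likely to drift into unsupported statements.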

Looking forward, Cope also highlighted the potential for language models to query databases by translating natural-language questions into structured SQL queries. This could assist in bioinformatics development and indicates a broad scope for AI applications in genomics.
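One way such a natural-language-to-SQL step might be wired up is sketched below, using Python's built-in `sqlite3` module. Everything specific here is an assumption made for illustration: `generate_sql` is a hypothetical stub that returns a fixed query where a language model would be called, and the `variants` table and its contents are invented. The guard that only allows a single `SELECT` statement reflects a common precaution when executing model-generated SQL.

```python
import sqlite3

def generate_sql(question):
    """Hypothetical stand-in for a language model call; returns a fixed query."""
    return "SELECT gene, COUNT(*) FROM variants GROUP BY gene ORDER BY gene"

def run_read_only(connection, sql):
    """Execute generated SQL only if it is a single SELECT statement."""
    statement = sql.strip().rstrip(";")
    if not statement.upper().startswith("SELECT") or ";" in statement:
        raise ValueError("Only single SELECT statements are allowed")
    return connection.execute(statement).fetchall()

# Invented in-memory table standing in for a real variant database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE variants (sample_id TEXT, gene TEXT)")
connection.executemany(
    "INSERT INTO variants VALUES (?, ?)",
    [("S1", "GENE_A"), ("S2", "GENE_A"), ("S3", "GENE_B")],
)

rows = run_read_only(connection, generate_sql("How many variants per gene?"))
```

Keeping the generated query behind a read-only check means a badly formed or unexpected statement is rejected before it reaches the database.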