Google has announced DeepSomatic, an AI tool that identifies cancer-related mutations in tumour genetic sequences more accurately than existing methods.
Cancer starts when the controls governing cell division malfunction. Finding the specific genetic mutations driving a tumour’s growth is essential for creating effective treatment plans. Doctors now regularly sequence tumour cell genomes from biopsies to inform treatments that can target how a particular cancer grows and spreads.
Published in Nature Biotechnology, this work presents a tool that uses convolutional neural networks to identify genetic variants in tumour cells with greater accuracy than current methods. Google has made both DeepSomatic and the high-quality training dataset created for it openly available.
The challenge of somatic variants
Cancer genetics is complex. Genome sequencing can reveal the genetic variations in a cancer, but distinguishing real variants from sequencing errors is difficult, and this is where an AI tool can provide welcome assistance. Most cancers are driven by ‘somatic’ variants acquired after birth rather than ‘germline’ variants inherited from parents.
Somatic mutations happen when environmental factors like UV light damage DNA, or when random errors occur during DNA replication. When these variants alter normal cell behaviour, they can cause uncontrolled replication, driving cancer development and progression.
Identifying somatic variants is harder than finding inherited ones because they can exist at low frequencies within tumour cells, sometimes at rates lower than the sequencing error rate itself.
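To make the scale of that problem concrete, here is a minimal arithmetic sketch. The read counts and the 1% error rate are made-up illustrative numbers, not figures from Google's work, and the function names are hypothetical.

```python
def variant_allele_fraction(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a position that support the variant."""
    return alt_reads / depth


def expected_error_reads(depth: int, error_rate: float) -> float:
    """Rough number of reads expected to show a wrong base by chance."""
    return depth * error_rate


# Illustrative numbers only: 3 supporting reads at 500x depth,
# with an assumed per-base sequencing error rate of 1%.
vaf = variant_allele_fraction(alt_reads=3, depth=500)
noise = expected_error_reads(depth=500, error_rate=0.01)

# The variant has fewer supporting reads than chance errors alone
# could plausibly produce, so a simple threshold cannot separate them.
variant_is_below_noise = 3 < noise
```

In this toy case the real variant sits below the noise floor, which is why read counts alone are not enough to call it.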
How DeepSomatic works
In clinical settings, scientists sequence both tumour cells from a biopsy and normal cells from the patient. DeepSomatic spots the differences, identifying variations in tumour cells that aren’t inherited. These variations reveal what’s fuelling the tumour’s growth.
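The underlying idea of comparing tumour against normal can be pictured as a set difference. The sketch below is a deliberate simplification and not how DeepSomatic itself works (it uses a neural network rather than exact matching); the `Variant` type, the function name, and the coordinates are all hypothetical.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def tumour_specific(tumour_calls: Set[Variant], normal_calls: Set[Variant]) -> Set[Variant]:
    """Keep variants seen in the tumour but absent from the normal sample."""
    return set(tumour_calls) - set(normal_calls)


# Made-up coordinates for illustration only.
normal = {Variant("chr1", 1000, "A", "G")}
tumour = {
    Variant("chr1", 1000, "A", "G"),  # also in normal, so inherited
    Variant("chr2", 2000, "C", "T"),  # tumour only, a somatic candidate
}

candidates = tumour_specific(tumour, normal)
```

Exact subtraction like this breaks down when sequencing errors or low-frequency variants are present, which is the gap a learned model is meant to fill.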
The model converts raw genetic sequencing data from both tumour and normal samples into images representing various data points, including the sequencing data and its alignment along the chromosome. A convolutional neural network analyses these images to differentiate between the standard reference genome, the individual’s normal inherited variants, and cancer-causing somatic variants while filtering out sequencing errors. The output is a list of cancer-related mutations.
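A toy sketch of what turning aligned reads into an image-like grid can look like is shown below. The two channels used here (a base code and a mismatch flag) and the name `encode_pileup` are assumptions for illustration; the article does not specify DeepSomatic's actual channels, and real reads also carry gaps and quality scores that this sketch ignores.

```python
from typing import List, Tuple

# Hypothetical numeric codes for bases; 0 means "no read coverage".
BASE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode_pileup(
    reference: str,
    reads: List[Tuple[int, str]],
    window_start: int,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Encode ungapped reads as two grids: one row per read, one column per position.

    reads holds (alignment_start, sequence) pairs in genome coordinates.
    Returns (base_channel, mismatch_channel).
    """
    width = len(reference)
    base_channel, mismatch_channel = [], []
    for start, sequence in reads:
        base_row = [0] * width
        mismatch_row = [0] * width
        for offset, base in enumerate(sequence):
            col = start - window_start + offset
            if 0 <= col < width:
                base_row[col] = BASE_CODES.get(base, 0)
                mismatch_row[col] = int(base != reference[col])
        base_channel.append(base_row)
        mismatch_channel.append(mismatch_row)
    return base_channel, mismatch_channel


# Two short reads over a four-base reference window starting at position 100.
bases, mismatches = encode_pileup("ACGT", [(100, "ACGT"), (101, "CTT")], window_start=100)
```

Grids like these, built for both the tumour and the normal sample, are the kind of input a convolutional network can scan for patterns that separate true variants from noise.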
DeepSomatic can also work in ‘tumour-only’ mode when normal cell samples are unavailable, which happens frequently with blood cancers like leukaemia. This makes the tool applicable across many research and clinical scenarios.
Training a more precise AI cancer research tool
Training an accurate AI model requires high-quality data. For its AI tool, Google and its partners at the UC Santa Cruz Genomics Institute and the National Cancer Institute created a benchmark dataset called CASTLE. They sequenced tumour and normal cells from four breast cancer samples and two lung cancer samples.
These samples were analysed using three leading sequencing platforms to create a single, accurate reference dataset by combining the outputs and removing platform-specific errors. The data shows how even the same cancer type can have vastly different mutational signatures, information that can help predict patient response to specific treatments.
DeepSomatic models performed better than other established methods across all three major sequencing platforms. The tool excelled at identifying complex mutations called insertions and deletions, or ‘indels’. For these variants, DeepSomatic achieved a 90% F1-score on Illumina sequencing data, compared to 80% for the next-best method. The improvement was more dramatic on Pacific Biosciences data, where DeepSomatic scored over 80% while the next-best tool scored less than 50%.
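For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision (how many reported variants are real) and recall (how many real variants are found). The short sketch below uses made-up counts, not the benchmark's actual data.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, from raw call counts."""
    called = true_positives + false_positives
    real = true_positives + false_negatives
    precision = true_positives / called if called else 0.0
    recall = true_positives / real if real else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 90 correct calls, 10 false calls, 10 missed variants.
example_f1 = f1_score(true_positives=90, false_positives=10, false_negatives=10)
```

Because both false calls and missed variants pull the score down, a tool cannot reach a high F1 simply by reporting more variants or by being overly cautious.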
The AI also performed well when analysing challenging samples. Testing included a breast cancer sample preserved as formalin-fixed paraffin-embedded (FFPE) tissue, a common method that can introduce DNA damage and complicate analysis. It was also tested on data from whole exome sequencing (WES), a more affordable method that sequences only the roughly 1% of the genome that codes for proteins. In both scenarios, DeepSomatic outperformed other tools, suggesting its utility for analysing lower-quality or historical samples.
An AI tool for all cancers
The AI tool has shown it can apply its learning to cancer types it wasn’t trained on. When used to analyse a sample of glioblastoma, an aggressive brain cancer, it successfully pinpointed the few variants known to drive the disease. In a partnership with Children’s Mercy in Kansas City, it analysed eight samples of paediatric leukaemia, finding the previously known variants and identifying 10 new ones, despite working with tumour-only samples.
Google hopes research labs and clinicians will adopt this tool to better understand individual tumours. By detecting known cancer variants, it could help guide choices for existing treatments. By identifying new ones, it could lead to new therapies. The goal is to advance precision medicine and deliver more effective treatments to patients.
The post Google AI tool pinpoints genetic drivers of cancer appeared first on AI News.