Poster Presentation ESA-SRB 2023 in conjunction with ENSA

Nanosplit – Allele-specific read sorting programme for Oxford nanopore long read sequencing platform. (#407)

Teruhito Ishihara 1 , Christian Belton 1 , Gavin Kelsey 1
  1. Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom

Genomic imprinting, a complex and important epigenetic phenomenon in mammals, causes a subset of genes to be expressed in a parent-of-origin-specific manner. Comprehensive searches for these imprinted genes in mice have been conducted primarily by allelic expression analysis of hybrid samples using Illumina-based sequencing platforms. However, this approach has limitations, including fragmentation of transcripts and PCR bias. Fragmentation loses the knowledge of which splice variant each read originated from, and PCR bias can skew the allelic representation of sequencing libraries. Direct cDNA/RNA sequencing on the Oxford Nanopore long-read platform, which avoids both fragmentation and PCR bias, is a preferred alternative approach. However, while the programme SNPsplit can be used for Illumina-based platforms to separate sequencing reads by parental genome based on strain-specific single nucleotide polymorphisms (SNPs), no equivalent programme exists for the Oxford Nanopore platform. In this study, we developed a new Python-based programme, Nanosplit, to perform this allele-specific read sorting on nanopore long-read sequencing datasets. We performed nanopore sequencing on embryos and placentas of hybrid samples obtained from crosses between the C57BL/6 and CAST/Ei strains. Correct allelic sorting was confirmed by examining parental origin differences in known imprinted genes such as Peg10 and Igf2r. The Nanosplit programme also allowed us to identify tissue-specific and/or isoform-dependent mono-allelically expressed genes that could not have been identified without long-read sequencing. This study demonstrates the ability of Nanosplit to further advance the analysis of allelic expression and the identification of novel imprinted genes.