Brain Isoform

We used long-read transcriptome sequencing to characterise the structure and abundance of full-length transcripts in the human cortex, using samples from donors ranging in age from 6 weeks post-conception to 83 years. We identified thousands of novel transcripts, with dramatic differences in the diversity of expressed transcripts between prenatal and postnatal cortex. A large proportion of these previously uncharacterised transcripts have high coding potential, with corresponding peptides detected in proteomic data. Novel putative coding sequences are highly conserved and overlap de novo mutations in genes linked with neurodevelopmental disorders in individuals with relevant clinical phenotypes. Our findings underscore the potential of novel coding sequences to harbour clinically relevant variants, offering new insights into the genetic architecture of human disease. Please refer to Bamford et al. (2024) for more details.

Database and annotations

Our cortical transcript annotations are available as a resource to the research community via an online database: http://www.isoforms.com. Long-read expression data (PEXT score) across all samples is available in the new LRBrainCoverage app.

UCSC tracks corresponding to the data presented in the paper are available here. Three alternative versions of the dataset are also available: one processed with default SQANTI3 parameters to exclude non-canonical splice junctions, one filtered for 10 reads across 10 samples, and one processed separately using the Bambu analysis pipeline (Chen Y. et al. 2023). Finally, for a subset of samples we have generated direct (native) RNA data.
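
As an illustration of the read-support filter mentioned above, the following is a minimal sketch, not the pipeline used to produce the released dataset. It assumes one possible reading of "10 reads across 10 samples": a transcript is kept if it has at least 10 reads in each of at least 10 samples. The function name, the input layout (transcript ID mapped to a list of per-sample read counts) and the parameter names are all hypothetical.

```python
from typing import Dict, List


def filter_transcripts(
    counts: Dict[str, List[int]],
    min_reads: int = 10,
    min_samples: int = 10,
) -> Dict[str, List[int]]:
    """Keep transcripts with >= min_reads reads in >= min_samples samples.

    Assumption: this threshold interpretation is illustrative only and
    may differ from the filtering applied to the published dataset.
    """
    kept = {}
    for transcript_id, per_sample in counts.items():
        # Count the samples in which this transcript meets the read threshold.
        n_supporting = sum(1 for c in per_sample if c >= min_reads)
        if n_supporting >= min_samples:
            kept[transcript_id] = per_sample
    return kept


if __name__ == "__main__":
    # Toy counts with hypothetical transcript IDs; each list is one value per sample.
    toy = {
        "tx_A": [12] * 10,        # 10 samples with >= 10 reads: kept
        "tx_B": [12] * 9 + [3],   # only 9 supporting samples: dropped
    }
    print(sorted(filter_transcripts(toy)))
```

Lowering `min_reads` or `min_samples` gives a more permissive set of transcripts, at the cost of including isoforms with weaker read support.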

Raw sequencing data is available in the Sequence Read Archive (SRA) database (https://www.ncbi.nlm.nih.gov/sra) under accession numbers PRJNA1117615 and PRJNA1129050. Processed intermediate data and targeted probe panels are available for download at Zenodo.