Human cortex isoform data

We used long-read transcriptome sequencing to characterise the structure and abundance of full-length transcripts in the human cortex from donors aged 6 weeks post-conception to 83 years old. We identified thousands of novel transcripts, with dramatic differences in the diversity of expressed transcripts between prenatal and postnatal cortex. A large proportion of these previously uncharacterised transcripts have high coding potential, with corresponding peptides detected in proteomic data. Novel putative coding sequences are highly conserved and overlap de novo mutations in genes linked with neurodevelopmental disorders in individuals with relevant clinical phenotypes. Our findings underscore the potential of novel coding sequences to harbor clinically relevant variants, offering new insights into the genetic architecture of human disease. Please refer to Bamford et al. (2024) for more details.

Database and annotations

A summary of isoform expression values across samples (PEXT scores) is available in the LRBrainCoverage app. 

All UCSC tracks corresponding to data presented in the paper are available on the ‘Human Cortex Transcriptome’ Track Hub or individually:

  1. Default annotations related to the transcripts described in our paper here (minimum 10 reads across 10 samples with full junction support from the recount3 database, with the exception of full splice match (FSM) and incomplete splice match (ISM) transcripts).
  2. A version of the dataset which is processed with default SQANTI3 parameters to exclude non-canonical splice junctions. 
  3. A version of the dataset which has been filtered to include very rare transcripts (minimum 2 reads across 2 samples).
  4. A version of the dataset processed using the Bambu analysis pipeline (Chen Y. et al. 2023). All tracks have been filtered to require a minimum of two full-length reads across two samples and to exclude all reads flagged as artefacts by SQANTI3.
  5. Tracks generated from direct (native) RNA sequencing on a subset of samples.
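
The read-support thresholds quoted in the list above (10 reads across 10 samples for the default set, 2 reads across 2 samples for the rare-transcript set) can be illustrated with a minimal sketch. It assumes one plausible reading of those thresholds: a transcript is kept when its total read count reaches the minimum and it is detected in at least the minimum number of samples. The function names, the count-table layout and the toy values are hypothetical and are not taken from the published pipeline; the recount3 junction-support check and the FSM/ISM exception are not modelled.

```python
from typing import Dict, List


def passes_read_filter(counts: List[int], min_reads: int, min_samples: int) -> bool:
    """Assumed rule: enough reads in total AND detected in enough samples."""
    total_reads = sum(counts)
    samples_detected = sum(1 for c in counts if c > 0)
    return total_reads >= min_reads and samples_detected >= min_samples


def filter_transcripts(
    count_table: Dict[str, List[int]], min_reads: int, min_samples: int
) -> Dict[str, List[int]]:
    """Keep only the transcripts that satisfy the read-support rule."""
    return {
        transcript_id: counts
        for transcript_id, counts in count_table.items()
        if passes_read_filter(counts, min_reads, min_samples)
    }


# Hypothetical toy table: full-length read counts per transcript across 10 samples.
toy_counts = {
    "tx_A": [2, 1, 1, 1, 1, 1, 1, 1, 1, 1],   # 11 reads, 10 samples
    "tx_B": [5, 4, 0, 0, 0, 0, 0, 0, 0, 0],   # 9 reads, 2 samples
    "tx_C": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],   # 2 reads, 2 samples
    "tx_D": [12, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # 12 reads, 1 sample
}

default_set = filter_transcripts(toy_counts, min_reads=10, min_samples=10)
rare_set = filter_transcripts(toy_counts, min_reads=2, min_samples=2)

print(sorted(default_set))
print(sorted(rare_set))
```

Under this reading, lowering the thresholds only ever adds transcripts, so the rare-transcript set is a superset of the default set. A transcript seen in a single sample is dropped from both sets regardless of its depth.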

Raw sequencing data is available in the Sequence Read Archive (SRA) database (https://www.ncbi.nlm.nih.gov/sra) under accession numbers PRJNA1117615 and PRJNA1129050. 

Processed intermediate data and targeted probe panels are available for download at Zenodo.