Location: Plant, Soil and Nutrition Research
Title: Pan-genomic approaches to consistent annotation of rice genomes
Author
CHOUGULE, KAPEEL - Cold Spring Harbor Laboratory
LU, ZHENUAN - Cold Spring Harbor Laboratory
OLSON, ANDREW - Cold Spring Harbor Laboratory
WEI, SHARON - Cold Spring Harbor Laboratory
Ware, Doreen
Submitted to: Plant and Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 1/13/2023
Publication Date: N/A
Citation: N/A
Interpretive Summary:
Technical Abstract: Since the first rice genome was sequenced 20 years ago, new genomes representing a wider range of varieties have been sequenced to explore agriculturally important traits, with higher contiguity and completeness. With lower sequencing costs and improved sequencing chemistry and assembly algorithms, producing a high-quality rice genome assembly has become a commodity practice. As we transition from a single reference to multiple-reference pangenomes, many challenges remain after assembly, including accurately predicting gene structures and assigning locus identifiers. Because annotation algorithms are predictive and curated or accession-specific transcript evidence is often lacking, most annotation tools lack either the sensitivity or the specificity to predict gene structure accurately. Protein structure is well conserved across the grass phylogeny, and even more so among accessions within a species. Based on this, we developed an annotation protocol that builds a pan-gene index from representative pan-gene models selected through comparative analysis of protein-coding gene family trees. We have benchmarked this protocol across other grasses but present results in rice here. We propagate these pan-genes onto genome assemblies of other, unannotated rice accessions using Liftoff and update the gene structures with available transcriptome evidence using PASA. To support the projected models, we curated and included evidence from the Nipponbare reference transcriptome, comprising ESTs and full-length mRNAs that were filtered for intron retention and clustered using CD-HIT. In addition, we demonstrate the power of accession-specific rice full-length transcripts for improving gene structures and capturing alternative isoforms. Supported by USDA-ARS #8062-21000-044-00D.
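As an illustration of the step that selects representative pan-gene models, the following is a minimal Python sketch. The abstract does not state the selection criterion, so the rule used here (pick the family member whose protein length is closest to the family median, with ties broken by gene ID) is purely an assumption, as are the function name, the family IDs, and the gene IDs.

```python
from statistics import median


def pick_representatives(families):
    """Pick one representative gene model per gene family.

    `families` maps a family ID to a dict of {gene ID: protein length}.
    ASSUMPTION: the representative is the member whose protein length is
    closest to the family median; this rule is illustrative only and is
    not taken from the abstract.
    """
    representatives = {}
    for family_id, members in families.items():
        family_median = median(members.values())
        # Sort key: distance from the median first, then gene ID so that
        # ties are resolved deterministically.
        representatives[family_id] = min(
            members,
            key=lambda gene_id: (abs(members[gene_id] - family_median), gene_id),
        )
    return representatives


# Hypothetical gene families with made-up protein lengths.
families = {
    "fam1": {"geneA": 300, "geneB": 310, "geneC": 120},
    "fam2": {"geneD": 500, "geneE": 498},
}
reps = pick_representatives(families)
print(reps)
```

In a real protocol the selection would presumably draw on the gene family trees themselves rather than on length alone; the sketch only shows the shape of a pan-gene index as a mapping from family to one representative model.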