
Title: Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution

Author
LI, FUGUANG - Cotton Research Institute - China
FAN, GUANGYI - Beijing Genome Institute
LU, CUIRUI - Cotton Research Institute - China
XIAO, GUANGHUI - Peking University
ZOU, CHANGSONG - Cotton Research Institute - China
Kohel, Russell
MA, ZHIYING - Agricultural University of Hebei
SHANG, HAIHONG - Cotton Research Institute - China
MA, XIONGFENG - Cotton Research Institute - China
WU, JIANYONG - Cotton Research Institute - China
LIANG, XINMING - BGI Shenzhen
HUANG, GAI - Peking University
Percy, Richard
LIU, KUN - Cotton Research Institute - China
YANG, WEIHUA - Cotton Research Institute - China
CHEN, WENBIN - BGI Shenzhen
DU, XIONGMING - Cotton Research Institute - China
SHI, CHENGCHENG - BGI Shenzhen
YUAN, YOULU - Cotton Research Institute - China
YE, WUWEI - Cotton Research Institute - China
LIU, XIN - BGI Shenzhen
ZHANG, XUEYAN - Cotton Research Institute - China
LIU, WEIQING - BGI Shenzhen
WEI, HENGLING - Cotton Research Institute - China
WEI, SHOUJUN - Cotton Research Institute - China
HUANG, GUODONG - BGI Shenzhen
ZHANG, XIANLONG - Huazhong Agricultural University
ZHU, SHUIJIN - Zhejiang University
ZHANG, HE - BGI Shenzhen
SUN, FENGMING - BGI Shenzhen
WANG, XINGFEN - Agricultural University of Hebei
LIANG, JIE - BGI Shenzhen
WANG, JIAHAO - BGI Shenzhen
HE, QIANG - BGI Shenzhen
HUANG, LEIHUAN - BGI Shenzhen
WANG, JUN - BGI Shenzhen
CUI, JINJIE - Cotton Research Institute - China
SONG, GUOLI - Cotton Research Institute - China
WANG, KUNBO - Cotton Research Institute - China
XU, XUN - Beijing Genome Institute
Yu, John
ZHU, YUXIAN - Wuhan University
YU, SHUXUN - Cotton Research Institute - China

Submitted to: Nature Biotechnology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/15/2015
Publication Date: 5/1/2015
Citation: Li, F., Fan, G., Lu, C., Xiao, G., Zou, C., Kohel, R.J., Ma, Z., Shang, H., Ma, X., Wu, J., Liang, X., Huang, G., Percy, R.G., Liu, K., Yang, W., Chen, W., Du, X., Shi, C., Yuan, Y., Ye, W., Liu, X., Zhang, X., Liu, W., Wei, H., Wei, S., Huang, G., Zhang, X., Zhu, S., Zhang, H., Sun, F., Wang, X., Liang, J., Wang, J., He, Q., Huang, L., Wang, J., Cui, J., Song, G., Wang, K., Xu, X., Yu, J., Zhu, Y., Yu, S. 2015. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nature Biotechnology. 33:524-530.

Interpretive Summary: Complete DNA sequencing of commercial Upland cotton (Gossypium hirsutum) is difficult because its genome is relatively large and complex. Much of this complexity arises because Upland cotton is the product of hybridization between two ancestral species. In this study we sequenced, assembled, and analyzed the world's most important cultivated Upland cotton genome. Doing so required that we first sequence the parental species of Upland cotton. Among many benefits, the DNA sequence information will support gene discovery for important agronomic and quality traits and will facilitate high-resolution association studies, leading to molecular markers that expedite breeding and selection for traits of interest. The sequencing of the Upland cotton genome will also lay the foundation for understanding the evolutionary and functional significance of multi-genome plants.

Technical Abstract: Genetic and genomic analyses of Upland cotton (Gossypium hirsutum) are difficult because it has a complex allotetraploid (AADD; 2n = 4x = 52) genome. Here we sequenced, assembled and analyzed the world's most important cultivated cotton genome with 246.2 gigabases (Gb) of clean data obtained using whole-genome shotgun sequencing technology and an additional set of 100,187 bacterial artificial chromosomes (BACs) representing about five-fold genome coverage. A total of 89.2% of the 2,173 Mb of scaffolds was anchored and oriented to 26 pseudochromosomes with assistance from a high-resolution genetic map comprising 39,662 single-nucleotide polymorphism (SNP) markers. Upland cotton is suggested to have originated from paleo-hexaploidy of a eudicot, followed by successive polyploidization and finally the fusion of A and D genome ancestral species 1-2 million years ago (MYA). The allotetraploid genome contained 76,943 protein-coding genes that displayed a high degree of conserved gene order with the two reported ancestral diploid cotton genomes. Transposable elements (TEs) accounted for 67.2% of the allotetraploid genome, and Dt-originated TEs (in which 't' indicates tetraploid) seemed more active than those of the At subgenome after the allopolyploidization. Gene loss studies through quartet alignment indicated that genome downsizing occurred in the allotetraploid cotton shortly after the allopolyploidization and that the Dt subgenome evolved faster than the At subgenome. The evolution of the Upland cotton genome was found to be coupled with the evolution of different regulatory mechanisms for two important gene families, Cellulose Synthase (CesA) and 1-aminocyclopropane-1-carboxylic acid oxidase 1 and 3 (ACO1,3), that are essential for the production of long and spinnable fiber cells used by the modern textile industry.
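
As a rough illustration of the assembly figures quoted in the abstract, the short Python sketch below converts the anchored percentage into megabases. It is only arithmetic on the two numbers reported above (2,173 Mb of scaffolds, 89.2% anchored); the function name `split_anchored` is hypothetical and is not part of any pipeline used in the study.

```python
def split_anchored(total_scaffold_mb: float, anchored_fraction: float) -> tuple[float, float]:
    """Return (anchored_mb, unanchored_mb) for a scaffold assembly.

    Hypothetical helper for illustration only; inputs are the figures
    quoted in the abstract, not values recomputed from the assembly.
    """
    anchored_mb = total_scaffold_mb * anchored_fraction
    unanchored_mb = total_scaffold_mb - anchored_mb
    return anchored_mb, unanchored_mb


# Figures as reported in the abstract.
TOTAL_SCAFFOLD_MB = 2173.0   # total scaffold length, in Mb
ANCHORED_FRACTION = 0.892    # share anchored to the 26 pseudochromosomes

anchored, unanchored = split_anchored(TOTAL_SCAFFOLD_MB, ANCHORED_FRACTION)
print(f"Anchored:   {anchored:.1f} Mb")
print(f"Unanchored: {unanchored:.1f} Mb")
```

Under these figures, roughly 1,938 Mb of scaffold sequence sits on pseudochromosomes and about 235 Mb remains unplaced.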