
Title: The complete nucleotide sequence of the coffee (Coffea arabica L.) chloroplast genome: organization and implications for biotechnology and phylogenetic relationships among angiosperms.

Authors:
Samson, Nalapalli - University of Central Florida
Bausher, Michael
Lee, Seung-Bum - University of Central Florida
Jansen, Robert - University of Texas, Austin
Daniell, Henry - University of Central Florida

Submitted to: Plant Biotechnology Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/23/2006
Publication Date: 2/12/2007
Citation: Samson, N., Bausher, M.G., Lee, S., Jansen, R.K., Daniell, H. 2007. The complete nucleotide sequence of the coffee (Coffea arabica L.) chloroplast genome: organization and implications for biotechnology and phylogenetic relationships among angiosperms. Plant Biotechnology Journal. 5(2):339-353.

Interpretive Summary: This report describes the sequencing and phylogenetic analysis of the coffee (Coffea arabica) chloroplast genome. Coffee is the first member of the Rubiaceae family to have its chloroplast genome sequenced. A chloroplast genome is sequenced for several reasons, one of which is to determine the species' relationship to other plant groups. The sequence data can also be used to identify regions of the genome that are suitable targets for the insertion of foreign genes. Inserting foreign genes into the chloroplast genome offers potential for the improvement of crop plants.

Technical Abstract: The chloroplast genome sequence of Coffea arabica L., the first sequenced member of the family Rubiaceae (the fourth largest family of angiosperms), is reported. The genome is 155,189 bp in length and includes a pair of inverted repeats of 25,943 bp each, separated by a small single copy region of 18,137 bp and a large single copy region of 85,166 bp. Of the 130 genes present in the genome, 112 are unique and 18 are duplicated in the inverted repeat. The coding region comprises 79 protein genes, 29 tRNA genes and 4 rRNA genes; 18 genes contain introns (3 of them with two introns). Repeat analysis revealed five direct and three inverted repeats of 30 bp or longer with sequence identity > 90%. The coffee genome differs from that of tobacco in having rps19 truncated in the IR-A region. Furthermore, whole genome comparisons identified large indels (> 500 bp) in several intergenic spacer regions and introns in the Solanaceae, including the trnE-UUC – trnT-GGU spacer, the ycf4 – cemA spacer, the trnI-GAU intron, and the rrn5 – trnR-ACG spacer. Phylogenetic analyses based on DNA sequences of 61 protein-coding genes for 35 taxa, performed using both maximum parsimony and maximum likelihood methods, strongly support the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids, asterids, eurosids II, and euasterids I and II. Gentianales, represented here by Coffea (Rubiaceae), is only the second order sampled from the euasterid I clade. The availability of the complete chloroplast genome of coffee provides regulatory and intergenic spacer sequences for use in chloroplast genetic engineering to improve this important crop.
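
The figures quoted in the Technical Abstract can be checked against one another. The short Python sketch below is not part of the publication. It assumes the usual chloroplast layout of one large single copy region, one small single copy region and two copies of the inverted repeat, and it uses only the numbers given above. The constant and function names are illustrative.

```python
# Consistency check on the numbers quoted in the Technical Abstract.
# All lengths are in base pairs (bp). Names are illustrative assumptions,
# not taken from the publication.

GENOME_LENGTH = 155_189
INVERTED_REPEAT = 25_943      # length of each of the two inverted repeat copies
SMALL_SINGLE_COPY = 18_137
LARGE_SINGLE_COPY = 85_166

TOTAL_GENES = 130
DUPLICATED_IN_IR = 18
PROTEIN_GENES = 79
TRNA_GENES = 29
RRNA_GENES = 4


def region_sum() -> int:
    """Sum of the single copy regions plus both inverted repeat copies."""
    return LARGE_SINGLE_COPY + SMALL_SINGLE_COPY + 2 * INVERTED_REPEAT


def unique_genes() -> int:
    """Unique genes counted by type (protein, tRNA, rRNA)."""
    return PROTEIN_GENES + TRNA_GENES + RRNA_GENES


if __name__ == "__main__":
    print("regions sum to genome length:", region_sum() == GENOME_LENGTH)
    print("unique genes:", unique_genes())
    print("unique + duplicated equals total:",
          unique_genes() + DUPLICATED_IN_IR == TOTAL_GENES)
```

Under these assumptions the region lengths add up to the reported genome length, and the per-type gene counts add up to the reported 112 unique genes, which together with the 18 duplicated genes gives the stated total of 130.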