
Title: The Pacific Biosciences de novo assembled genome from a parthenogenetic New Zealand wild population of the longhorned tick, Haemaphysalis longicornis Neumann, 1901

Author
Guerrero, Felicito
Bendele, Kylie
GHAFFARI, NOUSHIN - Texas A&M Agrilife
GUHLIN, JOSEPH - University Of Otago
GEDYE, KRISTENE - Massey University
LAWRENCE, KEVIN - Massey University
DEARDEN, PETER - University Of Otago
HARROP, THOMAS - University Of Otago
HEATH, ALLEN - Agresearch
LUN, YANNI - Texas A&M Agrilife
METZ, RICHARD - Texas A&M Agrilife
TEEL, PETE - Texas A&M University
Perez De Leon, Adalberto - Beto
BIGGS, PATRICK - Massey University
POMROY, WILLIAM - Massey University
JOHNSON, CHARLES - Texas A&M Agrilife
BLOOD, PHILIP - Non ARS Employee
BELLGARD, STANLEY - Landcare Research
TOMPKINS, DANIEL - Non ARS Employee

Submitted to: Data in Brief
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/25/2019
Publication Date: 10/4/2019
Citation: Guerrero, F., Bendele, K.G., Ghaffari, N., Guhlin, J., Gedye, K.R., Lawrence, K.E., Dearden, P.K., Harrop, T.W., Heath, A.C., Lun, Y., Metz, R.P., Teel, P., Perez De Leon, A.A., Biggs, P.J., Pomroy, W.E., Johnson, C.P., Blood, P.D., Bellgard, S.E., Tompkins, D.M. 2019. The Pacific Biosciences de novo assembled genome from a parthenogenetic New Zealand wild population of the longhorned tick, Haemaphysalis longicornis Neumann, 1901. Data in Brief. 27:104602. https://doi.org/10.1016/j.dib.2019.104602.
DOI: https://doi.org/10.1016/j.dib.2019.104602

Interpretive Summary: The cattle tick Haemaphysalis longicornis feeds on a wide range of mammalian hosts, including cattle, deer, sheep, goats, and horses, and can transmit a number of tick-borne diseases. This tick has recently established populations in at least 8 states of the US, and the geographical source of the US outbreak is still unknown. To assist in the development of control technologies for this tick, a New Zealand-USA consortium was established to sequence, assemble, and annotate its genome using samples obtained from New Zealand. In New Zealand, the tick is largely parthenogenetic, with all individuals having identical (or nearly identical) genomes. The tick's genome is very large, and sequencing ticks with identical genomes greatly assists the assembly of the sequencing reads into a useful genome sequence. Very high molecular weight genomic DNA was sequenced on the long-read PacBio Sequel platform. Twenty-eight SMRT cells produced a total of 21.3 million reads, which were assembled with the Canu assembler on a reserved node with access to 12 TB of RAM, running for over 24 days. The final assembly consisted of 34,211 contigs with an average contig length of 215,205 bp. The tick's genome size was estimated from the assembly to be ~7.3 Gbp, more than twice the size of the human genome. The quality of the annotated genome was assessed by a bioinformatics approach known as BUSCO analysis, which found 95% of the BUSCO gene set in the tick genome. Only 48 BUSCOs were missing and only 9 were fragmented. The raw sequencing reads and the assembled and annotated contigs were submitted to the National Center for Biotechnology Information.
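
The genome size estimate quoted above can be cross-checked from the contig count and average contig length. The following is a minimal sketch of that arithmetic only, not the authors' estimation method; the function name is hypothetical, and the human genome size of roughly 3.1 Gbp is an assumption that does not come from this record.

```python
# Figures quoted in the summary above.
CONTIG_COUNT = 34_211
MEAN_CONTIG_LENGTH_BP = 215_205

# Assumption (not from this record): haploid human genome of roughly 3.1 Gbp.
HUMAN_GENOME_GBP = 3.1


def assembly_span_bp(contig_count: int, mean_length_bp: int) -> int:
    """Total assembled bases implied by a contig count and a mean contig length."""
    return contig_count * mean_length_bp


span_bp = assembly_span_bp(CONTIG_COUNT, MEAN_CONTIG_LENGTH_BP)
span_gbp = span_bp / 1e9
ratio_to_human = span_gbp / HUMAN_GENOME_GBP

print(f"Implied assembly span: {span_bp:,} bp ({span_gbp:.2f} Gbp)")
print(f"Ratio to assumed human genome size: {ratio_to_human:.2f}x")
```

Because the reported average contig length is presumably rounded, the product is only an approximation of the true assembly span, but it agrees with the stated figure of roughly 7.3 Gbp.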

Technical Abstract: The cattle tick Haemaphysalis longicornis feeds on a wide range of mammalian hosts, including cattle, deer, sheep, goats, and horses, and can transmit a number of tick-borne diseases. This tick has recently established populations in at least 8 states of the US, and the geographical source of the US outbreak is still unknown. A New Zealand-USA consortium was established to sequence, assemble, and annotate the genome of this tick using samples obtained from New Zealand. In New Zealand, the tick is largely parthenogenetic, and this trait was deemed useful for genome assembly. Very high molecular weight genomic DNA was sequenced on the long-read PacBio Sequel platform. Twenty-eight SMRT cells produced a total of 21.3 million reads, which were assembled with Canu on a reserved node with access to 12 TB of RAM, running for over 24 days. The final assembly consisted of 34,211 contigs with an average contig length of 215,205 bp. The tick's genome size was estimated from the assembly to be ~7 Gbp. The quality of the annotated genome was assessed by BUSCO analysis, an approach that provides quantitative measures of the quality of an assembled genome. Over 95% of the BUSCO gene set was found in the tick genome. Only 48 of the 1,066 BUSCO genes were missing and only 9 were present in a fragmented condition. The raw sequencing reads and the assembled and annotated contigs were submitted to the National Center for Biotechnology Information.
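
The BUSCO counts above can be turned into percentages with a short calculation. This is an illustrative sketch rather than output from the BUSCO tool; the function name is hypothetical, and it assumes that every BUSCO gene that is neither missing nor fragmented was recovered complete. The abstract does not say whether "found" includes fragmented genes, so both readings are computed.

```python
# Counts quoted in the abstract above.
TOTAL_BUSCOS = 1066
MISSING = 48
FRAGMENTED = 9


def busco_summary(total: int, missing: int, fragmented: int) -> dict:
    """Derive complete-gene count and percentages from BUSCO category counts.

    Assumes (not stated in the source) that all genes that are neither
    missing nor fragmented were recovered as complete.
    """
    complete = total - missing - fragmented
    return {
        "complete": complete,
        "complete_pct": 100 * complete / total,
        # "Found" read as complete plus fragmented, i.e. everything not missing.
        "found_pct": 100 * (total - missing) / total,
    }


summary = busco_summary(TOTAL_BUSCOS, MISSING, FRAGMENTED)
print(f"Complete: {summary['complete']} ({summary['complete_pct']:.1f}%)")
print(f"Complete + fragmented: {summary['found_pct']:.1f}%")
```

Under these assumptions, the "over 95%" figure matches the reading in which fragmented genes count as found, while the complete-only share falls slightly below 95%.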