findmap.f90 | Align sequence reads to reference map, call previous variants, and identify new variants | ||
Downloads | Version 2.2 programs, example and test outputs, and executables (released December 10, 2018)
|
||
Version 2.1 programs, example outputs, and executables (released October 1, 2018) |
|||
Version 2 programs, example outputs, and executables (released May 31, 2018) |
|||
Version 1 programs, example outputs, and executables (released July 19, 2016; last updated July 28, 2016) |
|||
Version 0 programs, example outputs, and executables (beta version; released January 8, 2016) |
|||
Inputs | reference.fa | Standard fasta format for the reference genome: > as the 1st character of the line that starts each new chromosome, followed by 50-byte lines of ACGT (or N for unknown bases), or acgt for repeated sections. All programs in this series treat lowercase and uppercase as the same because storemap identifies, counts, stores, and links repeated k-mers to each other while hashing the reference map. |
|
variants.prior | Lists all previously known SNPs and indels. Insertions are reported 1 base to the left of the 1st base where they differ from the reference genome, reading left to right. Deletions are reported at their detected location, not 1 base to the left. Use variants.sas to reformat the 1000 Bull Genomes variant file. Format: chr# location vartype (SNP, INS, or DEL) variant# length alternate_allele |
||
fastq.filelist | List of DNA source names such as source1, source2, etc., along with numeric IDs | ||
source1.1.fq, source1.2.fq, source2.1.fq, source2.2.fq, etc. |
Standard fastq format for paired-end reads, with reads 1 and 2 of each pair at the same position in 2 separate files for each DNA source | ||
*.options | Program control file with user-defined options | ||
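The variants.prior layout listed above (chr# location vartype variant# length alternate_allele) can be read with a few lines of code. The following is a minimal Python sketch, not part of the findmap package: it assumes the six fields are separated by whitespace and that every row carries all six, neither of which is stated on this page. The names `PriorVariant` and `parse_prior_line` are hypothetical.

```python
from typing import NamedTuple


class PriorVariant(NamedTuple):
    """One row of variants.prior (field names are illustrative)."""
    chrom: str        # chr#, kept as text in case it is not purely numeric
    location: int     # position on the chromosome
    vartype: str      # SNP, INS, or DEL
    variant_id: int   # variant#
    length: int       # length of the variant
    alt_allele: str   # alternate allele


VALID_TYPES = {"SNP", "INS", "DEL"}


def parse_prior_line(line: str) -> PriorVariant:
    """Parse one row, assuming six whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}: {line!r}")
    chrom, location, vartype, variant_id, length, alt = fields
    vartype = vartype.upper()
    if vartype not in VALID_TYPES:
        raise ValueError(f"unknown variant type: {vartype}")
    # Upper- and lowercase bases are treated as the same, so normalize here.
    return PriorVariant(chrom, int(location), vartype,
                        int(variant_id), int(length), alt.upper())


# Made-up row for illustration only; not taken from a real variant file.
row = parse_prior_line("1 12345 SNP 7 1 g")
```

Keeping `chrom` as text rather than an integer is a cautious choice, since the page does not say how sex chromosomes or unplaced contigs are numbered.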
Outputs | storemap.unf | Hash table, etc., for reference map | |
reference.unf | Unformatted map for faster input | ||
variant.readdepth | Number of ref and alt alleles, 1 row/variant. Format: variant# chr# var_location ref# alt# |
||
individual.readdepth | Format: ID# chip# #SNPs. Read counts for A and B alleles are stored in 1-byte hexadecimal format (input format for imputation program findhap4; VanRaden et al., 2015) |
||
segments.found | Alignments, errors, and known variant locations for segments where paired-end locations differ by less than fraglen. Format: segment# pair# direction chr# segment_location num_alts num_errs (var_locations var_type) (err_locations err_base) |
||
segments.lost | Same format as segments.found, but for segments where paired-end locations do not match | ||
segments.newindels | Locations and properties of new indels detected (those not already in variants.prior). Format: segment# pair# direction chr# seg_location indel_size indel_location bases (inserted or deleted) |
||
SNPs.new | Summary of new SNPs, including read depth and number of alternate alleles found. Locations can have >1 row if differing alternate alleles are observed. Format: chr# SNP_location read_depth num_alt ref_allele alt_allele |
||
indels.new | Summary of new indels, including read depth and number of alternate alleles found. Locations can have >1 row if differing alternate alleles are observed. Format: chr# indel_location read_depth num_alt ref_allele alt_allele |
||
variants.all | Combined list of prior and new variants in same format as variants.prior | ||
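As an illustration of how the variant.readdepth output (variant# chr# var_location ref# alt#) might be consumed downstream, here is a short Python sketch. It assumes whitespace-separated fields and integer allele counts, which this page does not confirm. The alternate-allele fraction is only an example summary, not something findmap itself reports, and `VariantDepth` and `parse_depth_line` are hypothetical names.

```python
from typing import NamedTuple, Optional


class VariantDepth(NamedTuple):
    """One row of variant.readdepth (field names are illustrative)."""
    variant_id: int   # variant#
    chrom: str        # chr#
    location: int     # var_location
    ref_count: int    # ref#: reads carrying the reference allele
    alt_count: int    # alt#: reads carrying the alternate allele

    @property
    def alt_fraction(self) -> Optional[float]:
        """Share of reads with the alternate allele, or None at zero depth."""
        total = self.ref_count + self.alt_count
        return self.alt_count / total if total else None


def parse_depth_line(line: str) -> VariantDepth:
    """Parse one row, assuming five whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, found {len(fields)}: {line!r}")
    variant_id, chrom, location, ref_count, alt_count = fields
    return VariantDepth(int(variant_id), chrom, int(location),
                        int(ref_count), int(alt_count))


# Made-up row for illustration only: 6 reference reads and 2 alternate reads.
depth = parse_depth_line("7 1 12345 6 2")
fraction = depth.alt_fraction  # 2 / (6 + 2) = 0.25
```

Returning `None` for variants with no reads avoids a division by zero and keeps uncovered variants distinguishable from variants with only reference reads.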
References | 2019 | VanRaden, P.M., D.M. Bickhart, and J.R. O'Connell. Calling known variants and identifying new variants while rapidly aligning sequence data. J. Dairy Sci. 102:3216–3229. | |
2016 | VanRaden, P.M., and D.M. Bickhart. Fast single-pass alignment and variant calling using sequencing data. Plant Anim. Genome XXIV Conf., San Diego, CA, Jan. 9–13, W161. | Presentation slides |
VanRaden, P.M., D.M. Bickhart, and J.R. O'Connell. Identifying and calling insertions, deletions, and single-base mutations efficiently from sequence data. J. Dairy Sci. 99(E-Suppl. 1):140(abstr. 0302). | Presentation slides |
||
2015 | VanRaden, P.M., C. Sun, and J.R. O'Connell. Fast imputation using medium- or low-coverage sequence data. BMC Genet. 16:82. | ||
2014 | Daetwyler, H.D., A. Capitan, H. Pausch, P. Stothard, R. van Binsbergen, R.F. Brøndum, X. Liao, A. Djari, S.C. Rodriguez, C. Grohs, D. Esquerré, O. Bouchez, M.N. Rossignol, C. Klopp, D. Rocha, S. Fritz, A. Eggen, P.J. Bowman, D. Coote, A.J. Chamberlain, C. Anderson, C.P. Van Tassell, I. Hulsegge, M.E. Goddard, B. Guldbrandtsen, M.S. Lund, R.F. Veerkamp, D.A. Boichard, R. Fries, and B.J. Hayes. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nature Genet. 46:858–865. |
VanRaden, P.M., and C. Sun. Fast imputation using medium- or low-coverage sequence data. Proc. 10th World Congr. Genet. Appl. Livest. Prod., 179. | Presentation slides |
||
License | Fortran package findmap.f90 is public domain and was developed with U.S. taxpayer funding. Accurate results are not guaranteed. Please report any bugs to paul.vanraden@usda.gov. You may modify, improve, use, and redistribute the code to anyone for any purpose. Or, you can ask Paul to make changes that could benefit U.S. evaluations and other users. | ||
Paul VanRaden
Animal Genomics and Improvement Laboratory
Agricultural Research Service, USDA