Author
Leesburg, Vicki
Macneil, Michael
Submitted to: Western Section of Animal Science Proceedings
Publication Type: Proceedings
Publication Acceptance Date: 5/15/2004
Publication Date: 6/15/2004
Citation: Reisenauer, V.L., Macneil, M.D. 2004. SAS® tools to facilitate QTL discovery. Western Section of Animal Science Proceedings 55:87-89.
Interpretive Summary: Genetic linkage maps have been developed for several species to aid the identification of chromosomal regions that may influence traits of economic importance. Computing is an integral component of these studies. Our objective was to develop data management tools using SAS® that facilitate searching, management, evaluation, analysis, and distribution of genotypic data. This software development effort has been process oriented and modular in structure. The applications have become more complex as the project has progressed from data accumulation and entry to verification and ultimately to mapping. Seven SAS® applications were developed: A1) marker_in, A2) animal_in, A3) plate_in, A4) genotype_in, A5) compare, A6) crimap_in, and A7) simplify. A1-A4 generate a series of data files. The file from A1 is keyed by marker name and contains the marker's chromosome number and map position. Files from A2-A4 are keyed by animal number. The file from A2 contains the sire, dam, and sex of each animal. The file from A3 contains the microtiter plate number and cell containing DNA from each animal and the electrophoresis gel lane assigned to that animal. A4 facilitates entry of genotypic scores, generating a separate file for each scorer: 1XYZgen or 2XYZgen, where XYZ is the marker name. A5 compares the files from A4 and produces four new files. The 'zero' file identifies animals with genotypes not scored by either scorer. The 'discrepancy' file identifies animals with genotypes on which the two scorers did not agree. The 'zero' and 'discrepancy' files also contain the microtiter plate location for that animal's DNA. The 'good' file identifies animals and their genotype when both scorers agreed.
The 'good-zero' file merges data from the 'good' and 'zero' files. A6 assembles data for a chromosome from the 'good-zero' files and creates an input file for CRI-MAP. A7 simplifies CRI-MAP output for resolving non-inheritances, using animal ID and marker to generate a file containing animal ID, related marker genotype, and sire and dam IDs and genotypes. Following database design principles, redundancy across data files has been minimized. SAS is used as the platform for managing the data owing to its power, its implementation across numerous operating systems and computing platforms, and its widespread use in the animal science research community.
Technical Abstract: The objective was to develop data management tools using SAS® that facilitate searching for quantitative trait loci (QTL). Genotypes were surveyed for six F1 bulls, and 159 markers were selected that were informative in at least four bulls and provided inter-marker intervals of less than 20 cM. Genotypes (N=162,816) generated using PCR® and Li-COR GeneReader 4200® are scored by two people. Seven SAS® applications were developed: A1) marker_in, A2) animal_in, A3) plate_in, A4) genotype_in, A5) compare, A6) crimap_in, and A7) simplify. A1-A4 generate a series of data files. The file from A1 is keyed by marker name and contains the marker's chromosome number and map position. Files from A2-A4 are keyed by animal number. The file from A2 contains the sire, dam, and sex of each animal. The file from A3 contains the microtiter plate number and cell containing DNA from each animal and the electrophoresis gel lane assigned to that animal. A4 facilitates entry of genotypic scores, generating a separate file for each scorer: 1XYZgen or 2XYZgen, where XYZ is the marker name. A5 compares the files from A4 and produces four new files. The 'zero' file identifies animals with genotypes not scored by either scorer. The 'discrepancy' file identifies animals with genotypes on which the two scorers did not agree.
The 'zero' and 'discrepancy' files also contain the microtiter plate location for that animal's DNA. The 'good' file identifies animals and their genotype when both scorers agreed. The 'good-zero' file merges data from the 'good' and 'zero' files. A6 assembles data for a chromosome from the 'good-zero' files and creates an input file for CRI-MAP. A7 simplifies CRI-MAP output for resolving non-inheritances, using animal ID and marker to generate a file containing animal ID, related marker genotype, and sire and dam IDs and genotypes. The WINDOW and DISPLAY statements, the TRUNCOVER option of the INFILE statement, and the CALL SYMPUT routine of the macro facility were used in several of these applications. The applications of SAS® described here save labor and improve data integrity in conducting whole-genome searches for QTL.
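The comparison performed by A5 can be illustrated with a short sketch. It is written in Python rather than SAS and is not the authors' code; it only mirrors the four outputs described above. The function name `compare_scores`, the use of `(0, 0)` to represent an unscored genotype, the dictionary layout, and the choice to treat a genotype scored by only one scorer as a discrepancy are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

Genotype = Tuple[int, int]      # two allele codes for one marker
Location = Tuple[str, str]      # (microtiter plate number, cell)

# Assumption: an unscored genotype is recorded as (0, 0).
UNSCORED: Genotype = (0, 0)


def compare_scores(
    scorer1: Dict[int, Genotype],
    scorer2: Dict[int, Genotype],
    plate_location: Dict[int, Location],
):
    """Split one marker's scores, keyed by animal number, into the
    'zero', 'discrepancy', 'good', and 'good-zero' groups."""
    zero: Dict[int, Optional[Location]] = {}
    discrepancy: Dict[int, Tuple[Genotype, Genotype, Optional[Location]]] = {}
    good: Dict[int, Genotype] = {}

    for animal in sorted(set(scorer1) | set(scorer2)):
        g1 = scorer1.get(animal, UNSCORED)
        g2 = scorer2.get(animal, UNSCORED)
        where = plate_location.get(animal)

        if g1 == UNSCORED and g2 == UNSCORED:
            # Neither scorer recorded a genotype: keep the plate location
            # so the DNA sample can be found and rerun.
            zero[animal] = where
        elif g1 == g2:
            # Both scorers agree on the genotype.
            good[animal] = g1
        else:
            # Assumption: any other mismatch, including a genotype scored
            # by only one scorer, counts as a discrepancy.
            discrepancy[animal] = (g1, g2, where)

    # 'good-zero' merges the agreed genotypes with the unscored animals.
    good_zero: Dict[int, Genotype] = dict(good)
    for animal in zero:
        good_zero[animal] = UNSCORED

    return zero, discrepancy, good, good_zero
```

Keying every table by animal number, as the files from A2-A4 are, lets the plate location be looked up at the moment a problem is found, so it does not need to be stored alongside each score.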