
Title: SOFIA: an R package for enhancing genetic visualization with Circos

Author
DIAZ-GARCIA, LUIS - University Of Wisconsin
COVARRUBIAS-PAZARAN, GIOVANNY - University Of Wisconsin
SCHLAUTMAN, BRANDON - University Of Wisconsin
Zalapa, Juan

Submitted to: Journal of Heredity
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/9/2017
Publication Date: 4/8/2017
Publication URL: https://handle.nal.usda.gov/10113/5695460
Citation: Diaz-Garcia, L., Covarrubias-Pazaran, G., Schlautman, B., Zalapa, J. 2017. SOFIA: An R package for enhancing genetic visualization with Circos. Journal of Heredity. 108(4):443-448. doi: 10.1093/jhered/esx023.

Interpretive Summary: Visualizing data from any stage of genetic and genomic research is one of the most useful approaches for detecting potential errors, ensuring accuracy and reproducibility, and presenting the resulting data. We therefore developed a software package named SOFIA for plotting a variety of genetic data types concisely for data exploration and presentation. The program is simple to use and requires minimal coding experience, even for complex figures that incorporate high-dimensional genetic information, and it allows simultaneous analysis and visual exploration of genomic and genetic data. It is also flexible in formatting and configuration, can be automated, and produces publication-quality figures. SOFIA is a useful software tool for genetic and genomic researchers with little computational expertise.

Technical Abstract: Visualizing data from any stage of genetic and genomic research is one of the most useful approaches for detecting potential errors, ensuring accuracy and reproducibility, and presenting the resulting data. Currently, software such as Circos, ClicO FS, and RCircos, among others, provides tools for plotting a variety of genetic data types concisely for data exploration and presentation. However, each of these programs has one or more disadvantages that limit its usability for data exploration or for constructing publication-quality figures, such as inflexibility in formatting and configuration, reduced image quality, lack of potential for automation, or the need for high-level computational expertise. We therefore developed the R package SOFIA, which leverages the capabilities of Circos by manipulating data, preparing configuration files, and running the Perl-native Circos directly from the R environment with minimal user intervention. Integrating R and Circos in SOFIA has several advantages. R is a powerful, mid-level programming language widely used in the genetic and genomic research community, while Circos has proven to be an innovative tool for arranging genomic data into aesthetically pleasing, publication-quality circular figures. Producing Circos figures in R with SOFIA is simple, requires minimal coding experience, even for complex figures that incorporate high-dimensional genetic information, and allows simultaneous analysis and visual exploration of genomic and genetic data in a single programming environment.
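
The workflow described in the abstract (manipulate data, prepare configuration files, run Circos) can be illustrated with a minimal sketch. This is not SOFIA's code or API: it is written in Python rather than R, and the function names, file names, and linkage-group lengths are all hypothetical. It assumes only the standard Circos karyotype line format (`chr - ID LABEL START END COLOR`), the stock `etc/` include files that ship with Circos, and the `circos -conf` command-line invocation.

```python
import shutil
import subprocess
from pathlib import Path


def karyotype_lines(chrom_lengths, color="grey"):
    """Format one Circos karyotype line per chromosome or linkage group."""
    return [
        f"chr - {name} {name} 0 {length} {color}"
        for name, length in chrom_lengths.items()
    ]


def circos_conf(karyotype_file):
    """Build a minimal Circos configuration that draws ideograms only."""
    return "\n".join([
        f"karyotype = {karyotype_file}",
        "<ideogram>",
        "<spacing>",
        "default = 0.005r",
        "</spacing>",
        "radius = 0.90r",
        "thickness = 20p",
        "fill = yes",
        "</ideogram>",
        "<image>",
        "<<include etc/image.conf>>",
        "</image>",
        "<<include etc/colors_fonts_patterns.conf>>",
        "<<include etc/housekeeping.conf>>",
    ]) + "\n"


def write_inputs(chrom_lengths, out_dir):
    """Write karyotype.txt and circos.conf; return the path to circos.conf."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    karyotype = "\n".join(karyotype_lines(chrom_lengths)) + "\n"
    (out / "karyotype.txt").write_text(karyotype)
    conf = out / "circos.conf"
    conf.write_text(circos_conf("karyotype.txt"))
    return conf


def run_circos(conf_path):
    """Run Circos if it is on PATH; return False when it is not installed."""
    exe = shutil.which("circos")
    if exe is None:
        return False
    # Run inside the output directory so relative file names resolve.
    subprocess.run([exe, "-conf", conf_path.name], cwd=conf_path.parent, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical linkage-group lengths, for illustration only.
    conf = write_inputs({"lg1": 100, "lg2": 250}, "circos_out")
    print("Circos ran:", run_circos(conf))
```

A wrapper of this kind keeps data preparation and plotting in one script, which is the convenience the abstract attributes to driving Circos from R.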