
Title: Predicting protein submitochondrial locations using a K-Nearest neighbor method based on the Bit-Score weighted euclidean distance

Author
Hu, Jing - Franklin and Marshall College
Yan, Xianghe

Submitted to: Bioinformatics Research and Applications International Symposium Proceedings Series
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/1/2014
Publication Date: 4/15/2014
Citation: Hu, J., Yan, X. 2014. Predicting protein submitochondrial locations using a K-Nearest neighbor method based on the Bit-Score weighted euclidean distance. Bioinformatics Research and Applications International Symposium Proceedings Series. DOI: 10:1007/9783-319.

Interpretive Summary: Mitochondria are unusual organelles that act as the powerhouse of a cell and function like its digestive system. However, little is known about how their proteins reside within the intricate network of mitochondrial membranes. Mitochondria are enclosed by an envelope of two membrane layers and consist of several distinct locations. Predicting protein subcellular localization can deepen knowledge of protein-protein interactions and support the identification of protein function. In this study, we present a computer-based method for predicting protein localization sites and assigning proteins to their specific locations within the mitochondria. Because mitochondria are structurally and chemically similar to bacteria, this case study could also provide an excellent way of developing and applying new algorithms and software for predicting bacterial protein function.

Technical Abstract: Mitochondria are essential subcellular organelles found in eukaryotic cells. Knowing a protein’s subcellular or sub-subcellular location provides in-depth insight into the microenvironment where it interacts with other molecules and is crucial for inferring the protein’s function. It is therefore important to predict the submitochondrial localization of mitochondrial proteins. In this study, we introduce a K-nearest neighbor method based on a novel bit-score weighted Euclidean distance, which is calculated from an extended version of pseudo-amino acid composition. We further improved the method by applying a heuristic feature selection process. Using the selected features, the final method achieved 93% overall accuracy on the SubMito dataset, which is higher than or comparable to other state-of-the-art methods. On a larger, recently curated dataset, the method achieved a consistent 90% overall accuracy.
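
The general shape of a K-nearest neighbor classifier over a weighted Euclidean distance can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the distance formula, so the per-feature `weights` argument is a hypothetical stand-in for whatever bit-score-derived weighting the paper uses, and plain amino acid composition replaces the extended pseudo-amino acid composition. The function names, the toy sequences, and the labels are all invented for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Fraction of each standard amino acid in seq (a simplified feature
    vector; the paper uses an extended pseudo-amino acid composition)."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]

def weighted_euclidean(x, y, weights):
    """Euclidean distance with one non-negative weight per feature.
    ASSUMPTION: how the weights relate to bit scores is not stated in
    the abstract; here they are supplied by the caller."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))

def knn_predict(query, train, weights, k=3):
    """Majority vote among the k training items closest to query.
    train is a list of (feature_vector, label) pairs."""
    nearest = sorted(
        train, key=lambda item: weighted_euclidean(query, item[0], weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy sequences and labels, invented purely for illustration.
    toy = [
        ("LLLLAAVVLL", "membrane-like"),
        ("LLVVAALLIL", "membrane-like"),
        ("KKEEDDRRKK", "soluble-like"),
        ("DDEEKKRRDE", "soluble-like"),
    ]
    train = [(composition(s), label) for s, label in toy]
    uniform = [1.0] * len(AMINO_ACIDS)  # placeholder weights
    print(knn_predict(composition("LLAAVVLLLA"), train, uniform, k=3))
```

With uniform weights this reduces to ordinary K-nearest neighbor classification; a feature selection step such as the one the abstract mentions could be approximated in this sketch by setting the weights of discarded features to zero.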