
Research Project: Metabolic and Epigenetic Regulation of Nutritional Metabolism

Location: Children's Nutrition Research Center

Title: A comprehensive and integrative approach to MeCP2 disease transcriptomics

Author
TROSTLE, ALEXANDER - Texas Children's Hospital
LI, LUCIAN - Texas Children's Hospital
KIM, SEON-YOUNG - Texas Children's Hospital
WANG, JIASHENG - Texas Children's Hospital
AL-OURAN, RAMI - Texas Children's Hospital
YALAMANCHILI, HARI - Children's Nutrition Research Center (CNRC)
LIU, ZHANDONG - Texas Children's Hospital
WAN, YING-WOOI - Texas Children's Hospital

Submitted to: International Journal of Molecular Sciences
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/16/2023
Publication Date: 3/7/2023
Citation: Trostle, A., Li, L., Kim, S., Wang, J., Al-Ouran, R., Yalamanchili, H.K., Liu, Z., Wan, Y. 2023. A comprehensive and integrative approach to MeCP2 disease transcriptomics. International Journal of Molecular Sciences. 24(6). Article 5122. https://doi.org/10.3390/ijms24065122.
DOI: https://doi.org/10.3390/ijms24065122

Interpretive Summary: Investigating rare diseases like Rett syndrome is akin to piecing together a complex jigsaw puzzle. Each piece of the puzzle is a piece of information drawn from one of many research studies. Scientists typically must collect these pieces themselves by running a range of experiments, which is time-consuming, requires specialized expertise, and demands substantial resources. Moreover, the data that scientists gather do not always fit together neatly. To tackle these issues, we have created MECP2pedia. Think of it as a meticulously organized, comprehensive repository of public gene activity data related to the MECP2 gene, which is linked to Rett syndrome. As we assembled MECP2pedia, we discovered recurring patterns in how genes respond when MECP2 is disrupted. We identified which genes consistently increase or decrease in activity and verified that these observations hold true across different studies. We also found that individual studies often miss vital clues because of the limited scope of their investigations. Further, we identified a significant problem in the form of "batch effects": the distortions or biases that can creep in when, for example, a detective pieces together evidence collected by different colleagues. In summary, while integrating data from diverse sources is a formidable task, it is also immensely valuable. Scientists can gain a more complete understanding of a disease by looking at a wide range of data. Our methodology therefore offers a more efficient way to probe biological questions, much as a detective is more successful when working through a collaborative, well-organized approach.

Technical Abstract: Mutations in MeCP2 result in a crippling neurological disease, but we lack a clear picture of MeCP2's molecular role. Individual transcriptomic studies yield inconsistent sets of differentially expressed genes. To overcome these issues, we demonstrate a methodology to analyze all modern public data. We obtained relevant raw public transcriptomic data from GEO and ENA, then processed them homogeneously (QC, alignment to reference, differential expression analysis). We present a web portal to interactively access the mouse data, and we discovered a commonly perturbed core set of genes that transcends the limitations of any individual study. We then found functionally distinct, consistently up- and downregulated subsets within these genes and some bias in their location. We present this common core of genes as well as focused cores for upregulated genes, downregulated genes, cell fraction models, and some tissues. We observed enrichment for this mouse core in MeCP2 models from other species and observed overlap with ASD models. By integrating and examining transcriptomic data at scale, we have uncovered the true picture of this dysregulation. The vast scale of these data enables us to analyze signal-to-noise, evaluate a molecular signature in an unbiased manner, and demonstrate a framework for future disease-focused informatics work.
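
To make the idea of a "commonly perturbed core set of genes" concrete, the sketch below shows one simple way such a core could be derived from per-study differential expression results. It is an illustration only, not the authors' method: the function name `core_genes`, the thresholds `min_studies` and `min_agreement`, and the toy study and gene names are all assumptions introduced here. It uses a plain vote count: a gene joins the core if it is differentially expressed in enough studies and its direction of change agrees across most of them.

```python
from collections import Counter


def core_genes(studies, min_studies=2, min_agreement=0.8):
    """Return genes that are recurrently perturbed in a consistent direction.

    studies: dict mapping a study id to a dict of {gene: log2 fold change},
             containing only genes already called significant in that study.
    min_studies and min_agreement are hypothetical thresholds, not values
    taken from the publication.
    """
    up = Counter()    # number of studies in which each gene is upregulated
    down = Counter()  # number of studies in which each gene is downregulated

    for de_genes in studies.values():
        for gene, log2fc in de_genes.items():
            if log2fc > 0:
                up[gene] += 1
            elif log2fc < 0:
                down[gene] += 1

    core = {}
    for gene in sorted(set(up) | set(down)):
        n_up, n_down = up[gene], down[gene]
        total = n_up + n_down
        if total < min_studies:
            continue  # seen in too few studies to count as recurrent
        if n_up / total >= min_agreement:
            core[gene] = "up"
        elif n_down / total >= min_agreement:
            core[gene] = "down"
        # otherwise the direction is inconsistent across studies; skip it
    return core


# Toy input with made-up study ids, gene names, and fold changes.
toy = {
    "study_A": {"GeneX": 1.2, "GeneY": -0.8, "GeneZ": 0.5},
    "study_B": {"GeneX": 0.9, "GeneY": -1.1},
    "study_C": {"GeneX": 1.5, "GeneZ": -0.4},
}

print(core_genes(toy))  # {'GeneX': 'up', 'GeneY': 'down'}
```

In this toy input, GeneZ is excluded because its direction flips between studies, which is the kind of inconsistency a single study cannot reveal. A vote count like this ignores effect sizes and batch effects, so it should be read as a conceptual aid rather than a substitute for the homogeneous reprocessing described above.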