Location: Children's Nutrition Research Center
2023 Annual Report
Objectives
Objective 1: Use transgenic mouse models, microdissection, nuclear sorting, next-generation sequencing and innovative computational approaches to alter DNA methylation in specific subpopulations of hypothalamic neurons and evaluate lifelong effects on energy metabolism, food intake, and physical activity; isolate specific neuronal (and potentially non-neuronal) hypothalamic cell types to evaluate cell type-specific alterations in DNA methylation in established models of nutritional programming.
Objective 2: Advance understanding of the causes of interindividual epigenetic variation and consequences for human energy balance by conducting target-capture bisulfite sequencing in multiple tissues from an existing cohort of molecularly-phenotyped individuals to determine associations between genetic variation, epigenetic variation, and gene expression at human metastable epialleles; identify human metastable epialleles that predict risk of obesity by exploiting existing longitudinal cohorts of metabolically-phenotyped individuals; assess how DNA methylation at obesity-associated metastable epialleles is affected by maternal periconceptional nutrition.
Objective 3: Determine the functional impact of folic acid supplementation and establish the functional role of age-related p16 epimutation in genetically and epigenetically engineered mouse models of colon cancer and in intestinal carcinogenesis.
New Project (HY):
Objective 1. Create a multi-omic nutritional data-sharing portal to meet the unmet demand for efficient access to the large volumes of heterogeneous multi-omic data held across research labs and centers.
Objective 2. Integrate heterogeneous multi-omic datasets such as genetic (SNPs), transcriptomic, epigenetic, proteomic, metabolomic and microbiome to infer molecular network structures illustrating eating disorder dynamics.
Objective 3. Decode genetic and epigenetic patterns of disordered eating using machine learning methods.
Approach
Developmental programming occurs when nutrition and other environmental exposures affect prenatal or early postnatal development, causing structural or functional changes that persist to influence health throughout life. Researchers are working to understand epigenetic mechanisms of developmental programming. Epigenetic mechanisms regulate cell-type specific gene expression, are established during development, and persist for life. Importantly, nutrition during prenatal and early postnatal development can induce epigenetic changes that persist to adulthood. We focus on DNA methylation because this is the most stable epigenetic mechanism. The inherent cell-type specificity of epigenetic regulation motivates development of techniques to isolate and study specific cell types of relevance to obesity and digestive diseases. These projects integrate both detailed studies of animal models and characterization of epigenetic mechanisms in humans. We will use mouse models of developmental epigenetics in the hypothalamus to understand cell type-specific epigenetic mechanisms mediating developmental programming of body weight regulation. Mouse models will also be used to investigate how folic acid intake affects epigenetic mechanisms regulating intestinal epithelial stem cell (IESC) development and characterize the involvement of these mechanisms in metabolic programming related to obesity, inflammation, and gastrointestinal cancer. In human studies, we will identify human genomic loci at which interindividual variation in DNA methylation is both sensitive to maternal nutrition in early pregnancy and associated with risk of later weight gain. An improved understanding of how nutrition affects developmental epigenetics should eventually lead to the creation of early-life nutritional interventions to improve human health. 
Scientists will also elucidate the molecular interplay of the epigenome and transcriptome in aberrant eating behaviors using robust genome-wide computational analyses. They will conduct a multi-omic integrative study to systematically decipher how DNA methylation and histone modifications regulate alternative splicing and alternative polyadenylation in disordered eating. Novel machine learning approaches will be designed to address specific analytical challenges.
Progress Report
Epigenetics describes the molecular mechanisms that enable different cell types (containing the same DNA) to develop and maintain very different structures and functions. We focus on DNA methylation, the most stable epigenetic mark, which is central to understanding how nutrition before conception and during embryonic, fetal, and postnatal development influences disease risk throughout life (developmental programming). Objective 1 focuses on mouse models of epigenetic development in the hypothalamus to understand programming of obesity. We tested whether early postnatal overnutrition induces epigenetic changes within specific subclasses of neurons in the hypothalamus. We used transgenic mice in which one type of hypothalamic neuron, the Agrp neuron, is fluorescently tagged (NPY-GFP mice). This allowed us to specifically isolate Agrp neurons from mice that were overnourished postnatally and compare them with Agrp neurons from normally fed mice. We are working with the Baylor Human Genome Sequencing Center to optimize preparation of the DNA samples for whole-genome DNA methylation analysis.
Objective 2 focuses on identifying human metastable epialleles and assessing their associations with obesity. We have shown that systemic interindividual epigenetic variants in humans can have a stochastic (probabilistic) component and a genetic component, and can also be influenced by periconceptional nutrition. Therefore, rather than limit our focus to canonical metastable epialleles (at which interindividual variation in DNA methylation is largely independent of genetic variation), in 2019 we introduced a new term: Correlated Regions of Systemic Interindividual Variation in DNA methylation (CoRSIVs). With the help of an industry partner, we designed a panel of custom 'baits' to target and capture the human genomic regions corresponding to the nearly 10,000 CoRSIVs we discovered. We validated our approach using peripheral blood DNA samples from the same 12 individuals, documenting excellent quantitative reproducibility across the entire panel. We then performed target-capture bisulfite-seq at baseline on 200 adults from Starr County, Texas (100 weight stable and 100 'gainers'). The data were of high quality (genomic coverage, etc.), and analysis is underway.
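As an illustration of how quantitative reproducibility across a capture panel might be summarized, the sketch below computes a Pearson correlation between two sets of regional methylation measurements. The function name and the methylation values are hypothetical, and the report does not state which reproducibility metric was actually used.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical percent-methylation values for five regions, measured twice
# on the same DNA sample (illustrative numbers, not data from the report).
replicate_1 = [12.0, 45.5, 78.0, 33.2, 90.1]
replicate_2 = [13.1, 44.0, 79.5, 31.8, 89.0]

r = pearson_r(replicate_1, replicate_2)
print(f"replicate correlation r = {r:.3f}")
```

A correlation near 1 across all targeted regions would indicate that repeated measurements of the same sample agree closely.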
Objective 3 examines the link between dietary factors and cancer-causing epigenetic regulation. We used a mouse model of colon cancer based on two common molecular alterations: Apc mutation and p16 epimutation. We found that mice fed a methyl-donor supplemented diet had shortened overall survival compared to controls. We detected a greater number of tumors within the middle and distal regions of the small intestine of the supplemented mice. In both the small intestine and the colon, dietary supplementation led to larger tumors compared to controls. To gain insight into the cause-and-effect relationship, we performed comparative metabolomic analysis using both serum and tumor samples. We saw an increase in the metabolites involved in one-carbon metabolism (i.e., glycine betaine, dimethylglycine, and methionine). We compared gut microbiome profiles and concluded that the tumor-promoting phenotype was not due to alterations in bacterial composition. We analyzed DNA methylation across a panel of tumor suppressor genes and found that methyl-donor supplementation caused the most robust pro-methylation effect at the p16 promoter. These results show that p16 epimutation is modulated by dietary methyl-donor supplementation, leading to increased colon cancer risk. Our findings provide evidence concerning the safety of folic acid fortification in the general population.
Scientists can now profile biomolecules at different levels, including genes, proteins, metabolites, and epigenetic factors. By integrating and analyzing these diverse datasets, we can gain deeper insights into how biological systems work. The challenge lies in harmonizing these data: varying tools and standards hinder the efficient access and uniform processing that are essential for sharing and exploring multi-omic data. Objective 4 seeks to create a user-friendly portal that facilitates the sharing and exploration of multi-omic nutritional data, enabling researchers to access and process data more efficiently, promoting collaboration, and advancing knowledge of nutrition's impact on biological processes. We established our computational infrastructure to store and process publicly available datasets and streamlined our analytical pipelines. We collected data both from our own team in Houston and from public online sources such as the Gene Expression Omnibus (GEO) database, including RNA sequencing (RNA-Seq), Chromatin Immunoprecipitation Sequencing (ChIP-Seq), Reduced Representation Bisulfite Sequencing (RRBS), and Whole-Genome Bisulfite Sequencing (WGBS) data. We curated multiple datasets from GEO, considering factors such as organism, tissue, and genotype. For consistency, all RNA-Seq, ChIP-Seq, and RRBS datasets were processed through a standardized pipeline, and we used ComBat-Seq to address batch effects, a common source of bias when merging data generated by different laboratories. Finally, we implemented an Integrative Database Framework that organizes scientific information, including gene expression profiles, and helps scientists quickly find and use data from many sources.
This platform meets the unmet demand for efficient access to the large volumes of heterogeneous data held across various labs and presents the data online. It can accelerate research by overcoming the data barrier for bench and computational researchers.
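To make the idea of a batch effect concrete, the sketch below removes a per-batch mean shift from one gene's log-scale expression values. This is a deliberately simplified stand-in, not ComBat-Seq itself (which fits a negative binomial regression model to raw counts); the function name, batch labels, and values are hypothetical. It also assumes that biological groups are balanced across batches, since otherwise centering would remove real biological signal along with the batch shift.

```python
from collections import defaultdict

def center_batches(log_expr, batches):
    """Remove per-batch mean shifts from one gene's log-expression values.

    log_expr: log-scale expression values, one per sample.
    batches:  batch label for each sample (same order as log_expr).
    Returns the values with every batch re-centered on the overall mean.
    """
    if len(log_expr) != len(batches):
        raise ValueError("log_expr and batches must have the same length")
    overall_mean = sum(log_expr) / len(log_expr)

    # Group values by batch to compute each batch's own mean.
    values_by_batch = defaultdict(list)
    for value, batch in zip(log_expr, batches):
        values_by_batch[batch].append(value)
    batch_mean = {b: sum(v) / len(v) for b, v in values_by_batch.items()}

    # Subtract the batch mean, then add back the overall mean.
    return [v - batch_mean[b] + overall_mean for v, b in zip(log_expr, batches)]

# Hypothetical example: the same gene measured in two labs, where labB
# reads systematically higher than labA.
values = [5.0, 6.0, 9.0, 10.0]
labels = ["labA", "labA", "labB", "labB"]
adjusted = center_batches(values, labels)
print(adjusted)
```

After adjustment, both batches share the same mean while the within-batch differences between samples are preserved.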
Sub-objective 5A aims to decode the regulatory network of alternative splicing and epigenetic changes in disordered eating. We evaluated available algorithms, including CrypSplice, rMATS (replicate Multivariate Analysis of Transcript Splicing), MISO (Mixture of Isoforms), MAJIQ (Modeling Alternative Junction Inclusion Quantification), and SUPPA (Super-Fast Pipeline for Alternative splicing analysis). We selected rMATS for its ability to detect multiple forms of alternative splicing from RNA-Seq data. The rMATS algorithm classifies splicing events into five categories: skipped exons, retained introns, mutually exclusive exons, alternative 5' splice sites, and alternative 3' splice sites. For the nutrient-linked datasets, we used rMATS to infer nutrient-dependent splicing changes. We also processed epigenetic datasets, including RRBS and WGBS, using a Bismark and methylKit pipeline, which allows us to examine DNA methylation changes. The next step is to integrate the identified splicing changes with causal epigenetic changes that could potentially be regulated through nutritional interventions. This integration will show the relationship between alternative splicing and epigenetic regulation in disordered eating.
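Splicing changes of the kind described above are commonly summarized as an exon inclusion level (percent spliced in, PSI) and its difference between conditions. The sketch below shows a length-normalized form of that calculation for a single skipped-exon event. The function name, read counts, and effective lengths are hypothetical, and the sketch does not reproduce the statistical test that rMATS applies.

```python
def inclusion_level(inc_reads, skip_reads, inc_len, skip_len):
    """Length-normalized inclusion level (PSI) for one splicing event.

    inc_reads / skip_reads: reads supporting the inclusion / skipping isoform.
    inc_len / skip_len:     effective lengths of the two isoform-specific regions.
    Returns a value in [0, 1], or None when no reads support either isoform.
    """
    if inc_len <= 0 or skip_len <= 0:
        raise ValueError("effective lengths must be positive")
    inc_density = inc_reads / inc_len
    skip_density = skip_reads / skip_len
    total = inc_density + skip_density
    if total == 0:
        return None  # event not covered; PSI is undefined
    return inc_density / total

# Hypothetical counts for the same exon under two dietary conditions.
psi_control = inclusion_level(inc_reads=80, skip_reads=20, inc_len=2, skip_len=1)
psi_treated = inclusion_level(inc_reads=30, skip_reads=35, inc_len=2, skip_len=1)
delta_psi = psi_treated - psi_control
print(f"PSI control={psi_control:.2f}, treated={psi_treated:.2f}, delta={delta_psi:.2f}")
```

Dividing by effective length keeps the inclusion isoform, which usually spans more junctions, from being over-counted relative to the skipping isoform.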
Sub-objective 5B focuses on decoding the relationship between alternative polyadenylation (APA) patterns and changes in eating behavior. We reviewed current computational methods for analyzing APA patterns in RNA-Seq data, including DaPars (Dynamic Analysis of Alternative Polyadenylation from RNA-seq), APAlyzer, and TAPAS (Tool for Alternative PolyAdenylation Site analysis). Noting their limitations, we upgraded our tool, PolyA-miner, using advanced machine learning techniques. The improved version better accommodates bulk RNA-Seq datasets. Specifically, it efficiently tracks changes in APA using beta-binomial modeling and vector projections. Our tool is less susceptible to inherent data variation and can therefore identify novel APA sites that might otherwise remain undetected. We will next investigate how nutrient-dependent epigenetic changes influence feeding behavior by studying APA changes. This will shed light on the interplay between APA and epigenetic patterns in disordered eating and help uncover critical connections between molecular mechanisms and nutritional influences in eating disorders.
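For readers unfamiliar with beta-binomial modeling, the sketch below evaluates the beta-binomial probability of observing a given number of reads at a proximal polyadenylation site out of all reads for a gene. It illustrates the distribution only and is not PolyA-miner's implementation; the function names, parameter values, and read counts are hypothetical.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def beta_binomial_pmf(k, n, alpha, beta):
    """P(k of n reads map to the proximal site) when the site-usage
    fraction itself varies between samples as Beta(alpha, beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if not 0 <= k <= n:
        return 0.0
    return exp(log_choose(n, k)
               + log_beta(k + alpha, n - k + beta)
               - log_beta(alpha, beta))

# Hypothetical gene with 20 reads and an average proximal usage of
# alpha / (alpha + beta) = 0.25.
n_reads, alpha, beta = 20, 2.0, 6.0
pmf = [beta_binomial_pmf(k, n_reads, alpha, beta) for k in range(n_reads + 1)]
mean_reads = sum(k * p for k, p in enumerate(pmf))
print(f"expected proximal reads: {mean_reads:.2f}")
```

Compared with a plain binomial, the extra Beta layer lets site usage vary between replicates, which is one reason such models tolerate the overdispersion typical of RNA-Seq counts.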
Objective 6 is dedicated to decoding the genetic and epigenetic patterns of disordered eating using machine learning methods. DNA methylation is an epigenetic regulator of gene expression programs and is susceptible to alterations caused by environmental exposures, nutrition, aging, and pathogenesis. Traditional DNA methylation analyses are predominantly qualitative rather than quantitative and are challenged by the complex, high-dimensional, and non-linear nature of the data. Efficient frameworks for interpreting DNA methylation data are therefore needed. We explored various machine learning models and developed an approach based on neighboring sequence grammar. Our model demonstrated promising performance in capturing various patterns in DNA methylation and their associations with gene expression. This has the potential to uncover deeper insights into epigenetic regulation and to advance our understanding of various biological phenomena, including nutrition-dependent feeding behavior.
Accomplishments
1. Researchers discover trouble in the epigenetics toolbox. Epigenetics describes the molecular mechanisms that enable our different cell types to develop and stably maintain different structures and functions. For more than a decade, researchers worldwide have been performing population studies to detect associations between DNA methylation (the most stable epigenetic mark) and disease, and nearly all of these studies have used the same commercial methylation arrays. Scientists at the Children's Nutrition Research Center (CNRC) in Houston, Texas, reported that these arrays are not appropriate for population epigenetics because 95% of the genomic sites they target do not show appreciable interindividual variation among humans (without interindividual variation, detecting associations is impossible). Additionally, CNRC scientists validated an innovative approach for studying Correlated Regions of Systemic Interindividual epigenetic Variation (CoRSIVs, which they discovered in 2019) and demonstrated the superiority of targeting CoRSIVs by documenting over 70-fold more genetic influence on human DNA methylation than had been previously documented. These advances call into question the results of over 1,000 studies of population epigenetics conducted over the last decade. A new product now being marketed makes this technology available to epigenetic epidemiologists worldwide, helping the field move forward.
2. A robust, integrative database framework for finding and integrating heterogeneous data. Investigating the complexities of nutritional benefits and rare diseases is akin to piecing together an intricate jigsaw puzzle in which each piece represents information gathered from a multitude of research studies. Despite numerous analytical standards, datasets, and tools, the scientific community often struggles to make efficient use of these vast data sources. To address these challenges, researchers at the Children's Nutrition Research Center in Houston, Texas, developed an integrative database framework that organizes scientific information, including gene expression profiles. A significant problem they tackled is that of "batch effects," which can introduce biases when integrating evidence from different scientists. The framework identifies and accounts for these effects, enabling the discovery of recurring patterns in gene behavior that individual studies might miss because of their limited scope. This approach allows scientists to gain a more comprehensive understanding of specific nutrients or diseases by examining a wide range of data. It offers a more efficient pathway for the scientific community to explore critical biological questions, ultimately benefiting farmers and consumers through precise dietary recommendations and nutritional planning.