
Publications search

Found 37003 matches. Displaying 71-80
Kim J, Lee C, Ko BJ, Yoo DA, Won S, Phillippy AM, Fedrigo O, Zhang GJ, Howe K, Wood J, Durbin R, Formenti G, Brown S, Cantin L, Mello CV, Cho S, Rhie A, Kim H, Jarvis ED

False gene and chromosome losses in genome assemblies caused by GC content variation and repeats

GENOME BIOLOGY 2022 SEP 27; 23(1):? Article 204
Background: Many short-read genome assemblies have been found to be incomplete and contain mis-assemblies. The Vertebrate Genomes Project has been producing new reference genome assemblies with an emphasis on being as complete and error-free as possible, which requires utilizing long reads, long-range scaffolding data, new assembly algorithms, and manual curation. A more thorough evaluation of the recent references relative to prior assemblies can provide a detailed overview of the types and magnitude of improvements. Results: Here we evaluate new vertebrate genome references relative to the previous assemblies for the same species and, in two cases, the same individuals, including a mammal (platypus), two birds (zebra finch, Anna's hummingbird), and a fish (climbing perch). We find that up to 11% of genomic sequence is entirely missing in the previous assemblies. In the Vertebrate Genomes Project zebra finch assembly, we identify eight new GC- and repeat-rich micro-chromosomes with high gene density. The impact of missing sequences is biased towards GC-rich 5′-proximal promoters and 5′ exon regions of protein-coding genes and long non-coding RNAs. Between 26 and 60% of genes include structural or sequence errors that could lead to misunderstanding of their function when using the previous genome assemblies. Conclusions: Our findings reveal novel regulatory landscapes and protein coding sequences that have been greatly underestimated in previous assemblies and are now present in the Vertebrate Genomes Project reference genomes.
Kawano Y, Edwards M, Huang YM, Bilate AM, Araujo LP, Tanoue T, Atarashi K, Ladinsky MS, Reiner SL, Wang HH, Mucida D, Honda K, Ivanov II

Microbiota imbalance induced by dietary sugar disrupts immune-mediated protection from metabolic syndrome

CELL 2022 SEP 15; 185(19):3501-+
How intestinal microbes regulate metabolic syndrome is incompletely understood. We show that intestinal microbiota protects against development of obesity, metabolic syndrome, and pre-diabetic phenotypes by inducing commensal-specific Th17 cells. High-fat, high-sugar diet promoted metabolic disease by depleting Th17-inducing microbes, and recovery of commensal Th17 cells restored protection. Microbiota-induced Th17 cells afforded protection by regulating lipid absorption across intestinal epithelium in an IL-17-dependent manner. Diet-induced loss of protective Th17 cells was mediated by the presence of sugar. Eliminating sugar from high-fat diets protected mice from obesity and metabolic syndrome in a manner dependent on commensal-specific Th17 cells. Sugar and ILC3 promoted outgrowth of Faecalibaculum rodentium that displaced Th17-inducing microbiota. These results define dietary and microbiota factors posing risk for metabolic syndrome. They also define a microbiota-dependent mechanism for immuno-pathogenicity of dietary sugar and highlight an elaborate interaction between diet, microbiota, and intestinal immunity in regulation of metabolic disorders.
Amin M, Ott J, Gordon D, Wu RL, Postolache TT, Vergare M, Gragnoli C

Comorbidity of Novel CRHR2 Gene Variants in Type 2 Diabetes and Depression

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES 2022 SEP; 23(17):? Article 9819
The corticotropin-releasing hormone receptor 2 (CRHR2) gene encodes CRHR2, contributing to the hypothalamic-pituitary-adrenal stress response and to hyperglycemia and insulin resistance. CRHR2-/- mice are hypersensitive to stress, and the CRHR2 locus has been linked to type 2 diabetes and depression. While CRHR2 variants confer risk for mood disorders, MDD, and type 2 diabetes, they have not been investigated in familial T2D and MDD. In 212 Italian families with type 2 diabetes and depression, we tested 17 CRHR2 single nucleotide polymorphisms (SNPs), using two-point parametric-linkage and linkage-disequilibrium (i.e., association) analysis (models: dominant-complete-penetrance-D1, dominant-incomplete-penetrance-D2, recessive-complete-penetrance-R1, recessive-incomplete-penetrance-R2). We detected novel linkage/linkage-disequilibrium/association to/with depression (3 SNPs/D1, 2 SNPs/D2, 3 SNPs/R1, 3 SNPs/R2) and type 2 diabetes (3 SNPs/D1, 2 SNPs/D2, 2 SNPs/R1, 1 SNP/R2). All detected risk variants are novel. Two depression-risk variants within one linkage-disequilibrium block replicate each other. Two independent novel SNPs were comorbid while the most significant conferred either depression- or type 2 diabetes-risk. Although the families were primarily ascertained for type 2 diabetes, depression-risk variants showed higher significance than type 2 diabetes-risk variants, implying CRHR2 has a stronger role in depression-risk than type 2 diabetes-risk. In silico analysis predicted variants' dysfunction. CRHR2 is for the first time linked to/in linkage-disequilibrium/association with depression-type 2 diabetes comorbidity and may underlie the shared genetic pathogenesis via pleiotropy.
Ko BJ, Lee C, Kim J, Rhie A, Yoo DA, Howe K, Wood J, Cho S, Brown S, Formenti G, Jarvis ED, Kim H

Widespread false gene gains caused by duplication errors in genome assemblies

GENOME BIOLOGY 2022 SEP 27; 23(1):? Article 205
Background: False duplications in genome assemblies lead to false biological conclusions. We quantified false duplications in popularly used previous genome assemblies for platypus, zebra finch, and Anna's hummingbird, and in their new counterparts of the same species generated by the Vertebrate Genomes Project, whose pipeline attempted to eliminate false duplications through haplotype phasing and purging. These assemblies are among the first generated by the Vertebrate Genomes Project where there was a prior chromosomal level reference assembly to compare with. Results: Whole genome alignments revealed that 4 to 16% of the sequences are falsely duplicated in the previous assemblies, impacting hundreds to thousands of genes. These lead to overestimated gene family expansions. The main source of the false duplications is heterotype duplications, where the haplotype sequences were relatively more divergent than other parts of the genome, leading the assembly algorithms to classify them as separate genes or genomic regions. A minor source is sequencing errors. Ancient ATP nucleotide binding gene families have a higher prevalence of false duplications compared to other gene families. Although present in a smaller proportion, we observe false duplications remaining in the Vertebrate Genomes Project assemblies that can be identified and purged. Conclusions: This study highlights the need for more advanced assembly methods that better separate haplotypes and sequence errors, and the need for cautious analyses on gene gains.
Garg A, Krueger JG

Raising the bar for efficacy in hidradenitis suppurativa: a rationale for combination targeted therapies

BRITISH JOURNAL OF DERMATOLOGY 2022 SEP; 187(3):414-415
Zhang YF, Wang XL, Xu CH, Liu N, Zhang L, Zhang YM, Xie YY, Zhang YL, Huang QH, Wang L, Chen Z, Chen SJ, Roeder RG, Shen SH, Xue K, Sun XJ

A direct comparison between AML1-ETO and ETO2-GLIS2 leukemia fusion proteins reveals context-dependent binding and regulation of target genes and opposite functions in cell differentiation

FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY 2022 SEP 7; 10(?):? Article 992714
The ETO-family transcriptional corepressors, including ETO, ETO2, and MTGR1, are all involved in leukemia-causing chromosomal translocations. In every case, an ETO-family corepressor acquires a DNA-binding domain (DBD) to form a typical transcription factor: the DBD binds to DNA, while the ETO moiety manifests transcriptional activity. A directly comparative study of these "homologous" fusion transcription factors may clarify their similarities and differences in regulating transcription and leukemogenesis. Here, we performed a side-by-side comparison between AML1-ETO and ETO2-GLIS2, the most common fusion proteins in M2- and M7-subtypes of acute myeloid leukemia, respectively, by inducible expression of them in U937 leukemia cells. We found that, although AML1-ETO and ETO2-GLIS2 can use their own DBDs to bind DNA, they share a large proportion of genome-wide binding regions dependent on other cooperative transcription factors, including the ETS-, bZIP- and bHLH-family proteins. AML1-ETO acts as either transcriptional repressor or activator, whereas ETO2-GLIS2 mainly acts as activator. The repressor-versus-activator functions of AML1-ETO might be determined by the abundance of cooperative transcription factors/cofactors on the target genes. Importantly, AML1-ETO and ETO2-GLIS2 differentially regulate key transcription factors in myeloid differentiation including PU.1 and C/EBP beta. Consequently, AML1-ETO inhibits, but ETO2-GLIS2 facilitates, myeloid differentiation of U937 cells. This function of ETO2-GLIS2 is reminiscent of a similar effect of MLL-AF9 as previously reported. Taken together, this directly comparative study between AML1-ETO and ETO2-GLIS2 in the same cellular context provides insights into context-dependent transcription regulatory mechanisms that may underlie how these seemingly "homologous" fusion transcription factors exert distinct functions to drive different subtypes of leukemia.
Ilanges A, Shiao R, Shaked J, Luo JD, Yu XF, Friedman JM

Brainstem ADCYAP1(+) neurons control multiple aspects of sickness behaviour

NATURE 2022 SEP 22; 609(7928):761-+
Infections induce a set of pleiotropic responses in animals, including anorexia, adipsia, lethargy and changes in temperature, collectively termed sickness behaviours(1). Although these responses have been shown to be adaptive, the underlying neural mechanisms have not been elucidated(2-4). Here we use a set of unbiased methodologies to show that a specific subpopulation of neurons in the brainstem can control the diverse responses to a bacterial endotoxin (lipopolysaccharide (LPS)) that potently induces sickness behaviour. Whole-brain activity mapping revealed that subsets of neurons in the nucleus of the solitary tract (NTS) and the area postrema (AP) acutely express FOS after LPS treatment, and we found that subsequent reactivation of these specific neurons in FOS2A-iCreERT2 (also known as TRAP2) mice replicates the behavioural and thermal component of sickness. In addition, inhibition of LPS-activated neurons diminished all of the behavioural responses to LPS. Single-nucleus RNA sequencing of the NTS-AP was used to identify LPS-activated neural populations, and we found that activation of ADCYAP1(+) neurons in the NTS-AP fully recapitulates the responses elicited by LPS. Furthermore, inhibition of these neurons significantly diminished the anorexia, adipsia and locomotor cessation seen after LPS injection. Together these studies map the pleiotropic effects of LPS to a neural population that is both necessary and sufficient for canonical elements of the sickness response, thus establishing a critical link between the brain and the response to infection.
Formenti G, Abueg L, Brajuka A, Brajuka N, Gallardo-Alba C, Giani A, Fedrigo O, Jarvis ED

Gfastats: conversion, evaluation and manipulation of genome sequences using assembly graphs

BIOINFORMATICS 2022 SEP 2; 38(17):4214-4216
Motivation: With the current pace at which reference genomes are being produced, the availability of tools that can reliably and efficiently generate genome assembly summary statistics has become critical. Additionally, with the emergence of new algorithms and data types, tools that can improve the quality of existing assemblies through automated and manual curation are required. Results: We sought to address both these needs by developing gfastats, as part of the Vertebrate Genomes Project (VGP) effort to generate high-quality reference genomes at scale. Gfastats is a standalone tool to compute assembly summary statistics and manipulate assembly sequences in FASTA, FASTQ or GFA [.gz] format. Gfastats stores assembly sequences internally in a GFA-like format. This feature allows gfastats to seamlessly convert FAST* to and from GFA [.gz] files. Gfastats can also build an assembly graph that can in turn be used to manipulate the underlying sequences following instructions provided by the user, while simultaneously generating key metrics for the new sequences.
Caradonna SG, Paul MR, Marrocco J

An allostatic epigenetic memory on chromatin footprints after double-hit acute stress

NEUROBIOLOGY OF STRESS 2022 SEP; 20(?):? Article 100475
Stress induces allostatic responses, whose limits depend on genetic background and the nature of the challenges. Allostatic load reflects the cumulation of these responses over the course of life. Acute stress is usually associated with adaptive responses, although, depending on the intensity of the stress and individual differences, some may experience maladaptive coping that persists through life and may influence subsequent responses to stressful events, as is the case of post-traumatic stress disorder. We investigated the behavioral traits and epigenetic signatures in a double-hit mouse model of acute stress in which heterotypic stressors (acute swim stress and acute restraint stress) were applied within a 7-day interval period. The ventral hippocampus was isolated to study the footprints of chromatin accessibility driven by exposure to double-hit stress. Using ATAC sequencing to determine regions of open chromatin, we showed that depending on the number of acute stressors, several gene sets related to development, immune function, cell starvation, translation, the cytoskeleton, and DNA modification were reprogrammed in both males and females. Chromatin accessibility for transcription factor binding sites showed that stress altered the accessibility for androgen, glucocorticoid, and mineralocorticoid receptor binding sites (AREs/GREs) at the genome-wide level, with double-hit stressed mice displaying a profile unique from either single hit of acute stress. The investigation of AREs/GREs adjacent to gene coding regions revealed several stress-related genes, including Fkbp5, Zbtb16, and Ddc, whose chromatin accessibility was affected by prior exposure to stress. These data demonstrate that acute stress is not truly acute because it induces allostatic signatures that persist in the epigenome and may manifest when a second challenge hits later in life.
De novo gene origination, where a previously nongenic genomic sequence becomes genic through evolution, is increasingly recognized as an important source of novelty. Many de novo genes have been proposed to be protein-coding, and a few have been experimentally shown to yield protein products. However, the systematic study of de novo proteins has been hampered by doubts regarding their translation without the experimental observation of protein products. Using a systematic, mass-spectrometry-first computational approach, we identify 993 unannotated open reading frames with evidence of translation (utORFs) in Drosophila melanogaster. To quantify the similarity of these utORFs across Drosophila and infer phylostratigraphic age, we develop a synteny-based protein similarity approach. Combining these results with reference datasets on tissue- and life stage-specific transcription and conservation, we identify different properties amongst these utORFs. Contrary to expectations, the fastest-evolving utORFs are not the youngest evolutionarily. We observed more utORFs in the brain than in the testis. Most of the identified utORFs may be of de novo origin, even accounting for the possibility of false-negative similarity detection. Finally, sequence divergence after an inferred de novo origin event remains substantial, suggesting that de novo proteins turn over frequently. Our results suggest that there is substantial unappreciated diversity in de novo protein evolution: many more may exist than previously appreciated; there may be divergent evolutionary trajectories, and they may be gained and lost frequently. All in all, there may not exist a single characteristic model of de novo protein evolution, but instead, there may be diverse evolutionary trajectories.