N Nature Genetics · Dec 03, 2025 ADAR1 editing is necessary for only a small subset of cytosolic dsRNAs to evade MDA5-mediated autoimmunity Endogenous long double-stranded RNAs (dsRNAs), which are not edited by the RNA editing enzyme ADAR1, may activate the antiviral dsRNA receptor MDA5 to trigger interferon-mediated immune responses. Among the large number of endogenous long dsRNAs, the key substrates that activate MDA5—termed as immunogenic dsRNAs—remain largely unidentified. Here we reveal that human immunogenic dsRNAs constitute a surprisingly small fraction of all cellular dsRNAs. We found that these immunogenic dsRNAs were highly enriched in mRNAs and depleted of introns, consistent with their role as cytosolic MDA5 substrates. We validated the MDA5-dependent immunogenicity of these dsRNAs, which was dampened following ADAR1-mediated RNA editing. Notably, immunogenic dsRNAs were enriched at genetic susceptibility loci associated with common inflammatory diseases, implying their functional importance. We anticipate that a focused analysis of immunogenic dsRNAs will enhance our understanding and treatment of cancer and inflammatory diseases, where the roles of dsRNA editing and sensing are increasingly recognized. Immunogenetics Transcriptomics biology
N Nature Genetics · Nov 27, 2025 Passenger mutations link cellular origin and transcriptional identity in human lung adenocarcinomas DNA damage is preferentially repaired in expressed genes; thus, genome-wide correlations between somatic mutation patterns and normal cell transcription may reflect tumor cell origins. Accordingly, we found that aggregate lung adenocarcinoma (LUAD) and squamous cancer (LUSC) somatic mutation density associated most strongly with distal (alveolar) and proximal (basal) lung cell-type-specific gene expression, respectively, consistent with presumed LUAD and LUSC cell origins. Analyzing individual genomes, 21% of LUADs bore mutational footprints of proximal airway origins, with 38% classified as ambiguous. Distal origin LUADs, enriched for KRAS and STK11 drivers, occurred mainly in smokers; proximal origin LUADs, enriched for EGFR drivers, were more common in never-smokers. Ambiguous origin LUADs showed APOBEC signatures and SMARCA4 alterations. TP53 mutant LUADs with non-distal cell origins preferentially exhibited non-distal transcriptional identity. Our study reveals a complex interplay between lineage and identity in LUAD evolution and offers a scalable strategy to infer tumor origins in human cancers. Genomics Non-small-cell lung cancer Transcriptomics biology
N Nature Genetics · Nov 27, 2025 A genome-wide association study of mass spectrometry proteomics using a nanoparticle enrichment platform Most studies to date of protein quantitative trait loci (pQTLs) have relied on affinity proteomics platforms, which provide only limited information about the targeted protein isoforms and may be affected by genetic variation in their epitope binding. Here we show that mass spectrometry (MS)-based proteomics can complement these studies and provide insights into the role of specific protein isoform and epitope-altering variants. Using the Seer Proteograph nanoparticle enrichment MS platform, we identified and replicated new pQTLs in a genome-wide association study of proteins in blood plasma samples from two cohorts and evaluated previously reported pQTLs from affinity proteomics platforms. We found that >30% of the evaluated pQTLs were confirmed by MS proteomics to be consistent with the hypothesis that genetic variants induce changes in protein abundance, whereas another 30% could not be replicated and are possibly due to epitope effects, although alternative explanations for nonreplication need to be considered on a case-by-case basis. Genetics research Genome-wide association studies Proteomics biology
N Nature Genetics · Nov 24, 2025 Spatially resolved multi-omics of human metabolic dysfunction-associated steatotic liver disease Metabolic dysfunction-associated steatotic liver disease (MASLD) is a leading cause of chronic liver disease worldwide. We generated single-cell and spatial transcriptomic and metabolomic maps from 61 human livers, including controls (n = 10), metabolic dysfunction-associated steatotic liver (MASL) (n = 17) and metabolic dysfunction-associated steatohepatitis (MASH) (n = 34). We identified microphthalmia-associated transcription factor (MITF) as a key regulator of the lipid-handling capacity of lipid-associated macrophages (LAMs), and further revealed a hepato-protective role of LAMs mediated through hepatocyte growth factor secretion. Unbiased deconvolution of spatial transcriptomics delineated a fibrosis-associated gene program enriched in advanced MASH, suggesting profibrotic crosstalk between central vein endothelial and hepatic stellate cells within fibrotic regions. Mass spectrometry imaging-based spatial metabolomics demonstrated MASLD-specific accumulation of phospholipids, potentially linked to lipoprotein-associated phospholipase A2-mediated phospholipid metabolism in LAMs. This spatially resolved multi-omics atlas of human MASLD, which can be queried at the Human Masld Spatial Multiomics Atlas, provides a valuable resource for mechanistic and therapeutic studies. Gene expression Liver diseases biology
N Nature Genetics · Nov 24, 2025 Proteome-wide model for human disease genetics Missense variants remain a challenge in genetic interpretation owing to their subtle and context-dependent effects. Although current prediction models perform well in known disease genes, their scores are not calibrated across the proteome, limiting generalizability. To address this knowledge gap, we developed popEVE, a deep generative model combining evolutionary and human population data to estimate variant deleteriousness on a proteome-wide scale. popEVE achieves state-of-the-art performance without overestimating the burden of deleterious variants and identifies variants in 442 genes in a severe developmental disorder cohort, including 123 novel candidates. These genes are functionally similar to known disease genes, and their variants often localize to critical regions. Remarkably, popEVE can prioritize likely causal variants using only child exomes, enabling diagnosis even without parental sequencing. This work provides a generalizable framework for rare disease variant interpretation, especially in singleton cases, and demonstrates the utility of calibrated, evolution-informed scoring models for clinical genomics. Computational biology and bioinformatics Genetic association study Neurodevelopmental disorders biology
N Nature Genetics · Nov 20, 2025 Scalable and accurate rare variant meta-analysis with Meta-SAIGE Meta-analysis enhances the power of rare variant association tests by combining summary statistics across several cohorts. However, existing methods often fail to control type I error for low-prevalence binary traits and are computationally intensive. Here we introduce Meta-SAIGE—a scalable method for rare variant meta-analysis that accurately estimates the null distribution to control type I error and reuses the linkage disequilibrium matrix across phenotypes to boost computational efficiency in phenome-wide analyses. Simulations using UK Biobank whole-exome sequencing data show that Meta-SAIGE effectively controls type I error and achieves power comparable to pooled individual-level analysis with SAIGE-GENE+. Applying Meta-SAIGE to 83 low-prevalence phenotypes in UK Biobank and All of Us whole-exome sequencing data identified 237 gene–trait associations. Notably, 80 of these associations were not significant in either dataset alone, underscoring the power of our meta-analysis. Bioinformatics Genetics research Genome-wide association studies Software
N Nature Genetics · Nov 18, 2025 Adenine DNA methylation associated with transcriptionally permissive chromatin is widespread across eukaryotes DNA methylation is a key regulator of eukaryotic genomes, most commonly through 5-methylcytosine (5mC). In contrast, the existence and function of N6-methyladenine (6mA) in eukaryotes have been controversial, with conflicting reports resulting from methodological artifacts. Nevertheless, some unicellular lineages, including ciliates, early-branching fungi and the alga Chlamydomonas, show robust 6mA signals, raising questions about their origin and evolutionary role. Here we apply Oxford Nanopore sequencing to profile 6mA at base-pair resolution across 18 unicellular eukaryotes representing all major supergroups. We find that robust 6mA patterns occur only in species that encode the adenine methyltransferase AMT1. Notably, 6mA consistently accumulates downstream of transcriptional start sites, positioned between H3K4me3-marked nucleosomes, indicating a conserved association with transcriptional activation. Our results support the idea that the last eukaryotic common ancestor had a dual methylation system, with transcription-linked 6mA and repressive 5mC, which has been repeatedly simplified in both multicellular and unicellular lineages through the loss of the AMT1 pathway. Epigenetics Epigenomics Gene regulation
N Nature Genetics · Nov 18, 2025 Disentangling the architectural and non-architectural functions of CTCF and cohesin in gene regulation Cohesin- and CTCF-mediated chromatin loops facilitate enhancer–promoter and promoter–promoter interactions, but their impact on global gene regulation remains debated. Here we show that acute removal of cohesin or CTCF in mouse cells dysregulates hundreds of genes. Cohesin depletion primarily downregulates CBP/p300-dependent putative enhancer targets, whereas CTCF loss both up- and downregulates enhancer targets. Beyond loop anchoring, CTCF directly modulates transcription, acting as an activator or repressor depending on its binding position and orientation at promoters. Mechanistically, when activating, CTCF increases DNA accessibility and promotes RNA polymerase II recruitment; when repressing, it prevents RNA polymerase II binding without altering chromatin accessibility. Promoter-bound CTCF activates housekeeping genes essential for cell proliferation. CTCF’s transcriptional activation function—but not its loop anchoring role—is shared with its vertebrate-specific paralog, CTCFL. These findings reconcile architectural and non-architectural roles of cohesin and CTCF, offering a unified model for their functions in enhancer-dependent and enhancer-independent transcription control. Epigenetics Gene regulation
N Nature Genetics · Nov 17, 2025 African-ancestry-specific variant IKKβ p.Glu502Lys confers high lupus risk Cutaneous lupus erythematosus (CLE) is an autoimmune disease of the skin, occurring with or without systemic lupus erythematosus (SLE). People with African ancestry have a higher risk than people with other ancestries of developing lupus¹ but have been underrepresented in genetic studies. We whole-genome-sequenced 27,820 Americans with genetically inferred African ancestry from the Diverse Ancestry Cohort, including people with CLE (n = 211) and/or SLE (n = 574). We discovered an association with a rare missense variant in IKBKB, rs115698972 G>A, IKKβ E502K, exclusive to people with African ancestry, conferring an odds ratio (OR) of 5.4 for CLE and 3.3 for SLE. These associations replicated in the All of Us and VA Million Veteran Research Programs for CLE (OR_meta = 3.8, P_meta = 5.3 × 10⁻²⁰, n = 1,243) and SLE (OR_meta = 3.2, P_meta = 1.0 × 10⁻¹⁹, n = 1,697). In this cohort, IKKβ E502K accounts for 10.4% of CLE cases and 6.4% of SLE cases, confers a high lupus risk, and contributes substantially to the disease prevalence among people with African ancestry. This highlights the value of including diverse ancestries in genetic association studies. Genome-wide association studies Systemic lupus erythematosus
N Nature Genetics · Nov 14, 2025 Uniform dynamics of cohesin-mediated loop extrusion in living human cells Most animal genomes are partitioned into topologically associating domains (TADs), created by cohesin-mediated loop extrusion and defined by convergently oriented CCCTC-binding factor (CTCF) sites. The dynamics of loop extrusion and its regulation remain poorly characterized in vivo. Here we tracked the motion of TAD anchors in living human cells to visualize and quantify cohesin-dependent loop extrusion across multiple endogenous genomic regions. We show that TADs are dynamic structures whose anchors are brought in proximity about once per hour and for 6–19 min (~16% of the time). Moreover, TADs are continuously extruded by multiple cohesin complexes. Remarkably, despite strong differences in Hi-C patterns across chromatin regions, their dynamics is consistent with the same density, residence time and speed of cohesin. Our results suggest that TAD dynamics is primarily governed by the location and affinity of CTCF sites, enabling genome-wide predictive models of cohesin-dependent chromatin interactions. Bioinformatics Biophysics Genetic engineering Genetics Microscopy
N Nature Genetics · Nov 14, 2025 Genome-wide association study and polygenic risk prediction of hypothyroidism We performed a genome-wide meta-analysis of hypothyroidism (113,393 cases and 1,065,268 controls), free thyroxine (191,449 individuals) and thyroid-stimulating hormone (482,873 individuals). We identified 350 loci associated with hypothyroidism, including 179 not previously reported, 29 of which were linked through thyroid-stimulating hormone. We found that many hypothyroidism risk loci regulate blood cell counts and the circulating inflammasome, and through multiple gene-mapping strategies, we prioritized 259 putative causal genes enriched in immune-related functions. We developed a polygenic risk score (PRS) based on more than 115,000 hypothyroidism cases to address diagnostic challenges in individuals with or at risk of thyroid hormone deficiency. We show that the highest predictive accuracy for hypothyroidism was achieved when combining the PRS with thyroid hormones and thyroid-peroxidase autoantibodies, and that the PRS was able to stratify risk of progression among individuals with subclinical hypothyroidism. These findings demonstrate the potential for a hypothyroidism PRS to support the prediction of disease progression and onset in thyroid hormone deficiency. Genome-wide association studies Thyroid diseases
N Nature Genetics · Nov 13, 2025 Genome-wide association analyses identify distinct genetic architectures for early-onset and late-onset depression Major depressive disorder (MDD) is a common and heterogeneous disorder of complex etiology. Studying more homogeneous groups stratified according to clinical characteristics, such as age of onset, can improve the identification of the underlying genetic causes and lead to more targeted treatment strategies. We leveraged Nordic biobanks with longitudinal health registries to investigate differences in the genetic architectures of early-onset (eoMDD; n = 46,708 cases) and late-onset (loMDD; n = 37,168 cases) MDD. We identified 12 genomic loci for eoMDD and two for loMDD. Overall, the two MDD subtypes correlated moderately (genetic correlation, r_g = 0.58) and differed in their genetic correlations with related traits. These findings suggest that eoMDD and loMDD have partially distinct genetic signatures, with a specific developmental brain signature for eoMDD. Importantly, we demonstrate that polygenic risk scores (PRS) for eoMDD predict suicide attempts within the first 10 years after the initial diagnosis: the absolute risk for suicide attempt was 26% in the top PRS decile, compared to 12% and 20% in the bottom decile and the intermediate group, respectively. Taken together, our findings can inform precision psychiatry approaches for MDD. Depression Population genetics
N Nature Genetics · Nov 12, 2025 Computationally efficient meta-analysis of gene-based tests using summary statistics in large-scale genetic studies Meta-analysis of gene-based tests using single-variant summary statistics is a powerful strategy for genetic association studies. However, current approaches require sharing the covariance matrix between variants for each study and trait of interest. For large-scale studies with many phenotypes, these matrices can be cumbersome to calculate, store and share. Here, to address this challenge, we present REMETA—an efficient tool for meta-analysis of gene-based tests. REMETA uses a single sparse covariance reference file per study that is rescaled for each phenotype using single-variant summary statistics. We develop new methods for binary traits with case–control imbalance, and to estimate allele frequencies, genotype counts and effect sizes of burden tests. We demonstrate the performance and advantages of our approach through meta-analysis of five traits in 469,376 samples in UK Biobank. The open-source REMETA software will facilitate meta-analysis across large-scale exome sequencing studies from diverse studies that cannot easily be combined. Genome-wide association studies Software
N Nature Genetics · Nov 11, 2025 Stable clonal contribution of lineage-restricted stem cells to human hematopoiesis Dynamic steady-state lineage contribution of human hematopoietic stem cell (HSC) clones needs to be assessed over time. However, clonal contribution of HSCs has only been investigated at single time points and without assessing the critical erythroid and platelet lineages. Here we screened for somatic mutations in healthy aged individuals, identifying expanded HSC clones accessible for lineage tracing of all major blood cell lineages. In addition to HSC clones with balanced contribution to all lineages, we identified clones with all myeloid lineages but no or few B and T lymphocytes or all myeloid lineages and B cells but no T cells. No other lineage restriction patterns were reproducibly observed. Retrospective phylogenetic inferences uncovered a ‘hierarchical’ pattern of descendant subclones more lineage biased than their ancestral clone and a more common ‘stable’ pattern with descendant subclones showing highly concordant lineage contributions with their ancestral clone, despite decades of separation. Prospective lineage tracing confirmed remarkable stability over years of HSC clones with distinct lineage replenishment patterns. Ageing Stem-cell research
N Nature Genetics · Nov 07, 2025 Genetic basis of flavor complexity in sweet corn Sweet corn is an important vegetable crop consumed globally. However, the genetic differentiation between field corn and sweet corn, and the impact of breeding on the metabolite composition and flavor (other than sweetness) of sweet corn, remain poorly understood. Here we assembled a cultivated sweet-corn genome de novo and re-sequenced 295 diverse sweet-corn inbred lines. We examined the genetic architecture of sweet-corn kernel quality by combining genetic, metabolite and expression profiling methodologies. New genes (for example, ZmAPS1, ZmSK1 and ZmCRR5) and metabolites associated with flavor and consumer preference were identified, highlighting important target flavor metabolites, including sugars, acids and volatiles. These findings provide valuable knowledge and targets for future genetic breeding of sweet-corn flavor, and to balance grain yield and quality and contribute to our broader understanding of crop diversification. Genome-wide association studies Plant genetics Population genetics
N Nature Genetics · Nov 07, 2025 TGF-β builds a dual immune barrier in colorectal cancer by impairing T cell recruitment and instructing immunosuppressive SPP1+ macrophages Transforming growth factor β (TGF-β) signaling in the tumor microenvironment predicts resistance to immune checkpoint blockade (ICB). While TGF-β inhibition enhances ICB efficacy in murine cancer models, clinical trials have yet to demonstrate benefit, underscoring the need to better understand its immunoregulatory roles across disease contexts. Using mouse models of advanced colorectal cancer and patient-derived data, we demonstrate that TGF-β impairs antitumor immunity at multiple levels in liver metastases. It acts directly on T cells to block recruitment of peripheral memory CD8+ T cells, thereby limiting the effectiveness of ICB. Concurrently, TGF-β instructs tumor-associated macrophages to suppress clonal expansion of newly arrived T cells by inducing SPP1 expression. This extracellular matrix protein promotes collagen deposition and accumulation of tumor-associated macrophages and fibroblasts, ultimately driving ICB resistance. Our findings reveal how TGF-β coordinates immunosuppressive mechanisms across innate and adaptive immune compartments to promote metastasis, offering new avenues to improve immunotherapy in colorectal cancer. Colon cancer Immunosurveillance Metastasis
N Nature Genetics · Nov 05, 2025 Epigenetically driven and early immune evasion in colorectal cancer evolution Immune system control is a principal hurdle in cancer evolution. The temporal dynamics of immune evasion remain incompletely characterized, and how immune-mediated selection interrelates with epigenome alteration is unclear. Here we infer the genome- and epigenome-driven evolutionary dynamics of tumor-immune coevolution within primary colorectal cancers (CRCs). We utilize a multiregion multiomic dataset of matched genome, transcriptome and chromatin accessibility profiling from 495 single glands (from 29 CRCs) supplemented with high-resolution spatially resolved neoantigen sequencing data and multiplexed imaging of the tumor microenvironment from 82 microbiopsies within 11 CRCs. Somatic chromatin accessibility alterations contribute to accessibility loss of antigen-presenting genes and silencing of neoantigens. Immune escape and exclusion occur at the outset of CRC formation, and later intratumoral differences in immuno-editing are negligible or exclusive to sites of invasion. Collectively, immune evasion in CRC follows a ‘Big Bang’ evolutionary pattern, whereby it is acquired close to transformation and defines subsequent cancer-immune evolution. Cancer Epigenomics Gastrointestinal cancer Genome informatics
N Nature Genetics · Nov 05, 2025 Genome-scale CRISPR screens identify PTGES3 as a direct modulator of androgen receptor function in advanced prostate cancer The androgen receptor (AR) is a critical driver of prostate cancer (PCa). Here, to study regulators of AR protein levels and oncogenic activity, we developed a live-cell quantitative endogenous AR fluorescent reporter. Leveraging this AR reporter, we performed genome-scale CRISPRi flow cytometry sorting screens to systematically identify genes that modulate AR protein levels. We identified and validated known AR protein regulators, including HOXB13 and GATA2, and also unexpected top hits including PTGES3—a poorly characterized gene in PCa. PTGES3 repression resulted in loss of AR protein, cell-cycle arrest and cell death in AR-driven PCa models. Clinically, analysis of PCa data demonstrates that PTGES3 expression is associated with AR-directed therapy resistance. Mechanistically, we show PTGES3 binds directly to AR, regulates AR protein stability and is necessary for AR function in the nucleus at AR target genes. PTGES3 represents a potential therapeutic target for overcoming known mechanisms of resistance to existing AR-directed therapies in PCa. Cancer Prostate cancer
N Nature Genetics · Nov 04, 2025 Multi-ancestry genome-wide association analyses of polycystic ovary syndrome Polycystic ovary syndrome (PCOS), the leading endocrine disorder in women of reproductive age, is highly heritable, yet its polygenic architecture remains poorly understood. Here we conducted a genome-wide association study on 12,419 Chinese women with PCOS and 34,235 controls, followed by a multi-ancestry meta-analysis with up to 13,773 European cases and 411,088 controls, identifying 94 independent loci, 73 of which were previously unreported. Despite different evolutionary pressures, Chinese and European ancestries showed substantial genetic overlap. Integrative functional analyses prioritized regulatory variants controlling gene activity in specific tissues, disease-causing genes including anti-Müllerian hormone (AMH), and biological pathways involving ligand-binding domain interactions and peroxisome proliferator-activated receptor gamma (PPARG) signaling. We identified granulosa cells as particularly important in PCOS development. Our genetics-driven drug discovery approach revealed multiple drug targets and repurposing opportunities, enabling personalized treatment strategies. These results enhance our understanding of the molecular basis of PCOS, paving the way for precision medicine. Genome-wide association studies Polycystic ovary syndrome
N Nature Genetics · Nov 04, 2025 Genetic associations with educational fields Educational field choices shape careers, wellbeing and the societal skill distribution, yet genetic influences on what people study remain poorly understood. Here we show that genetic factors are associated with educational field specializations using genome-wide association studies (GWASs) across 463,134 individuals from Finland, Norway and the Netherlands (effective n between 40,072 and 317,209). We identified 17 independent genome-wide significant variants linked to 7 of 10 educational fields, with average heritability of 7%. The genetic signal is specific to field choice rather than educational level, persisting after controlling for years of schooling and confounding factors. By examining genetic clustering across specializations, we uncovered two key dimensions: technical versus social and practical versus abstract. We performed GWASs of these components and demonstrated distinct genetic correlations with personality, behavior and socioeconomic status. Our findings demonstrate that genomic research can illuminate ‘horizontal’ stratification, revealing insights into vocational interests and social sorting beyond traditional attainment measures. Behavioural genetics Genetics research Genome-wide association studies Psychiatric disorders
N Nature Genetics · Nov 03, 2025 Integrated metabolomic and transcriptomic analyses identify MYB genes regulating key metabolites and agronomic traits in upland cotton Gossypium hirsutum Understanding early embryonic development is fundamental for unraveling plant cell differentiation and organogenesis. Here we integrate multiomics data from 403 upland cotton ovules to identify 2,960 metabolic quantitative trait loci and 24,485 expression quantitative trait loci. A key locus, ME_A07, influences 252 known metabolite levels and the expression of 4,293 genes, with the MYB gene GhTT2_A07 identified as a central regulator, potentially regulated by a 520 kb inversion. GhTT2_A07 orchestrated both primary and secondary metabolite biosynthesis, influencing agronomic traits. Another locus, ME_A06, driven by the MYB gene Proanthocyanidin Regulator (GhPAR), modulates proanthocyanin content and suggests an ecological adaptation. GhTT2_A07 and GhPAR exhibit both shared and distinct expression profiles, contributing variably to fiber quality and yield. These findings highlight the critical role of MYB genes in the early development of cotton ovules and fibers, offering comprehensive multiomics resources that advance cotton research and molecular breeding. Gene regulation Genome-wide association studies Metabolomics
N Nature Genetics · Nov 03, 2025 Liability threshold model-based disease risk prediction based on electronic health record phenotypes Electronic health records have been increasingly adopted as useful resources for genomic research. However, case–control labeling of clinical data from electronic health records is challenging and most studies utilize phenotype codes to define case/control labels, resulting in suboptimal downstream analyses. Here we describe the liability threshold phenotypic integration, a method combining genetic relatedness with phenotypic data, including binary and continuous traits such as diagnosis codes, family disease history, laboratory measurements and biomarkers, to derive new continuous phenotypes for target diseases. The model utilizes an automatic trait selection algorithm that increases performance in disease risk prediction and provides insights into nontarget traits associated with the target disease. Our simulations and applications to the eMERGE network and the UK Biobank data demonstrate consistent performance gains in disease risk prediction and genome-wide association study power compared to conventional phenotype codes, models that solely incorporate family history and the phenotype imputation method SoftImpute, with similar false-positive rate control. Genetics research Genome-wide association studies
N Nature Genetics · Nov 03, 2025 Functionally dominant hotspot mutations of mitochondrial ribosomal RNA genes in cancer The vast majority of recurrent somatic mutations arising in tumors affect protein-coding genes in the nuclear genome. Here, through population-scale analysis of 14,106 whole tumor genomes, we report the discovery of highly recurrent mutations affecting both the small (12S, MT-RNR1) and large (16S, MT-RNR2) mitochondrial RNA subunits of the mitochondrial ribosome encoded within mitochondrial DNA (mtDNA). Compared to non-hotspot positions, mitochondrial rRNA hotspots preferentially affected positions under purifying selection in the germline and demonstrated structural clustering within the mitoribosome at mRNA and tRNA interacting positions. Using precision mtDNA base editing, we engineered models of an exemplar MT-RNR1 hotspot mutation, m.1227G>A. Multimodal profiling revealed a heteroplasmy-dependent decrease in mitochondrial function and loss of respiratory chain subunits from a heteroplasmic dosage of ~10%. Mutations of conserved positions in ribosomal RNA that disrupt mitochondrial translation therefore represent a class of functionally dominant, pathogenic mtDNA mutations that are under positive selection in cancer genomes. Cancer Genetics
N Nature Genetics · Oct 31, 2025 An African ancestry-specific nonsense variant in CD36 is associated with a higher risk of dilated cardiomyopathy The high burden of dilated cardiomyopathy (DCM) in individuals of African descent remains incompletely explained. Here, to explore a genetic basis, we conducted a genome-wide association study in 1,802 DCM cases and 93,804 controls of African genetic ancestry (AFR). A nonsense variant (rs3211938:G) in CD36 was associated with increased risk of DCM. This variant, believed to be under positive selection due to a protective role in malaria resistance, is present in 17% of AFR individuals but <0.1% of European genetic ancestry (EUR) individuals. Homozygotes for the risk allele, who comprise ~1% of the AFR population, had approximately threefold higher odds of DCM. Among those without clinical cardiomyopathy, homozygotes exhibited an 8% absolute reduction in left ventricular ejection fraction. In AFR, the DCM population attributable fraction for the CD36 variant was 8.1%. This single variant accounted for approximately 20% of the excess DCM risk in individuals of AFR compared to those of EUR. Experiments in human induced pluripotent stem cell-derived cardiomyocytes demonstrated that CD36 loss of function impairs fatty acid uptake and disrupts cardiac metabolism and contractility. These findings implicate CD36 loss of function and suboptimal myocardial energetics as a prevalent cause of DCM in individuals of African descent. Cardiomyopathies Genome-wide association studies
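The carrier and homozygote frequencies quoted for the CD36 variant in the entry above can be checked against each other with a back-of-the-envelope calculation. The sketch below assumes Hardy-Weinberg proportions and reads "present in 17% of AFR individuals" as the fraction of people carrying at least one copy. Both are assumptions of this illustration rather than statements from the study, and the function name is hypothetical.

```python
import math


def allele_and_homozygote_freq(carrier_fraction):
    """Infer allele and homozygote frequencies from a carrier fraction.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch):
    carriers = 1 - (1 - q)^2, so q = 1 - sqrt(1 - carriers),
    and homozygotes = q^2.
    """
    if not 0.0 <= carrier_fraction <= 1.0:
        raise ValueError("carrier_fraction must lie in [0, 1]")
    q = 1.0 - math.sqrt(1.0 - carrier_fraction)
    return q, q * q


# 17% is the carrier fraction as interpreted from the abstract.
q, hom = allele_and_homozygote_freq(0.17)
print(f"allele frequency ~ {q:.3f}, expected homozygotes ~ {hom:.2%}")
```

Under these assumptions the expected homozygote fraction comes out just under 1%, which is in line with the "~1%" quoted in the abstract.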
N Nature Genetics · Oct 30, 2025 Transcription factor switching drives subtype-specific pancreatic cancer Emerging evidence suggests that lineage-specifying transcription factors control the progression of pancreatic ductal adenocarcinoma (PDAC). We have discovered a transcription factor switching mechanism involving the poorly characterized orphan nuclear receptor HNF4G and the putative pioneer factor FOXA1, which drives PDAC progression. Using our unbiased protein interactome discovery approach, we identified HNF4A and HNF4G as reproducible, FOXA1-associated proteins, in both preclinical models and Whipple surgical samples. In the primary tumor context, we consistently find that the dominant transcription factor is HNF4G, where it functions as the driver. A molecular switch occurs in advanced disease, whereby HNF4G expression or activity decreases, unmasking FOXA1’s transcriptional potential. Derepressed FOXA1 drives late-stage disease by orchestrating metastasis-specific enhancer–promoter loops to regulate the expression of metastatic genes. Overall survival is influenced by HNF4G and FOXA1 activity in primary tumor growth and in metastasis, respectively. We suggest that the existence of stage-dependent transcription factor activity, triggered by molecular compartmentalization, mediates the progression of PDAC. Cancer
N Nature Genetics · Oct 29, 2025 Real-time dynamic polygenic prediction for streaming data Polygenic risk scores (PRSs) are promising tools for advancing precision medicine. However, existing PRS construction methods rely on static summary statistics derived from genome-wide association studies, which are often updated at lengthy intervals. With genetic data and health outcomes continuously being generated, the current PRS training and deployment paradigm is suboptimal in maximizing prediction accuracy for incoming patients in healthcare settings. We introduce real-time PRS-CS (rtPRS-CS), which enables online, dynamic refinement and standardization of PRS as each new sample is collected. We perform extensive simulations to evaluate rtPRS-CS across various genetic architectures and training sample sizes. Leveraging quantitative traits from two large-scale biobanks, we show that rtPRS-CS can integrate massive streaming data to enhance PRS prediction over time. We further apply rtPRS-CS to 22 schizophrenia cohorts across seven Asian regions, demonstrating the clinical utility of rtPRS-CS in dynamically capturing health status changes and predicting disease risk across diverse genetic ancestries. Genetic association study Schizophrenia
N Nature Genetics · Oct 29, 2025 Spatiotemporal gene expression and cellular dynamics of the developing human heart Heart development relies on topologically orchestrated cellular transitions and interactions, many of which remain poorly characterized in humans. Here, we combined unbiased spatial and single-cell transcriptomics with imaging-based validation across postconceptional weeks 5.5 to 14 to uncover the molecular landscape of human early cardiogenesis. We present a high-resolution transcriptomic map of the developing human heart, revealing the spatial arrangements of 31 coarse-grained and 72 fine-grained cell states organized into distinct functional niches. Our findings illuminate key insights into the formation of the cardiac pacemaker-conduction system, heart valves and atrial septum, and uncover unexpected diversity among cardiac mesenchymal cells. We also trace the emergence of autonomic innervation and provide the first spatial account of chromaffin cells in the fetal heart. Our study, supported by an open-access spatially centric interactive viewer, offers a unique resource to explore the cellular and molecular blueprint of human heart development, offering links to genetic causes of heart disease. Cardiovascular diseases Gene expression profiling Organogenesis
N Nature Genetics · Oct 29, 2025 Precise control of chromatin loop extrusion enhances sustainable green revolution yield in rice Continuous and excessive use of inorganic fertilizers underlies current global crop production; therefore, reducing fertilizer use while increasing crop productivity is critical for ensuring agricultural sustainability and food security. Here we show that the natural variant of RCN2, a rice TERMINAL FLOWER 1/CENTRORADIALIS homolog, enhances photosynthesis, nitrogen assimilation and grain yield by restricting chromatin loop extrusion. RCN2 competitively inhibits the interaction between growth-repressing DELLA proteins and SQUAMOSA PROMOTER BINDING PROTEIN-LIKE transcription factors, breaking the green revolution trade-off between plant growth and metabolism of carbon and nitrogen. We demonstrate that targeting CCCTC-containing insulator elements at the RCN2 locus not only decouples tillering and panicle branching without affecting beneficial semi-dwarfism, but also improves source-to-sink carbon allocation and nitrogen-use efficiency, consequently increasing harvest index and rice yield at low nitrogen fertilization levels. Precise modulation of loop extrusion thus enables new breeding strategies to reduce nitrogen fertilizer use in high-yield cereal crops. Plant breeding Plant genetics
N Nature Genetics · Oct 29, 2025 Cell-type- and locus-specific epigenetic editing of memory expression Epigenetic mechanisms have long been proposed to act as molecular mnemonics1,2,3, but whether the epigenetic makeup of a single genomic site can guide learnt behaviors remains unknown. Here we combined CRISPR-based epigenetic editing tools4,5 with c-Fos-driven engram technologies6,7 to address this question in memory-bearing neuronal ensembles. Focusing on the promoter of Arc, which encodes a master regulator of synaptic plasticity8, we found that its locus-specific and temporally controlled epigenetic editing is necessary and sufficient to regulate memory expression. Such effects occurred irrespective of the memory phase (during the initially labile period after learning and for fully consolidated memories) and were reversible within subject, testifying to their inherent plasticity. These findings provide a proof-of-principle that site-specific epigenetic dynamics are causally implicated in memory expression. Behavioural genetics Epigenetics Neuroscience
N Nature Genetics · Oct 27, 2025 Cellular states associated with metastatic organotropism and survival in patients with pancreatic ductal adenocarcinoma Most patients with localized pancreatic ductal adenocarcinoma (PDAC) experience recurrence after resection. Analysis of 744 patients with resected PDAC revealed that patients with initial isolated liver-metastatic recurrence (n = 100) had significantly worse overall survival than those with initial isolated lung-metastatic recurrence (n = 31). Using single-nucleus RNA sequencing in a representative cohort, we found that transcriptional profiles of primary cancer cells with liver-metastatic recurrence and lung-metastatic recurrence were correlated with those of normal liver and lung parenchymal cells, respectively, suggesting adoption of organ-specific metastatic programs at the primary site. These signatures were confirmed in transcriptomes of PDAC lung and liver metastases, primary lung and liver tumors, and organotropic PDAC xenograft models. These signatures were independent of large genomic events, and analysis of large-scale tumor profiling data showed no genetic alterations predictive of recurrence patterns. Additional analyses suggested that metastatic recurrence may be determined early in tumorigenesis and influenced by tumor-infiltrating immune cells. Thus, pre-existing cellular states within primary tumors appear to guide organ-specific metastatic relapse. Metastasis Pancreatic cancer Translational research
N Nature Genetics · Oct 27, 2025 Maternal factor OTX2 regulates human embryonic genome activation and early development Transcription factors (TFs) are instrumental in kickstarting embryonic genome activation (EGA) in many species, yet their regulatory roles in human embryos remain poorly understood. Here, we show that OTX2, a maternally provided PRD-like homeobox TF, is required for proper human EGA and early development. At the four-cell stage, OTX2 promotes activation of key EGA genes, including TPRX1 and TPRX2, and the EGA-associated repeat HERVL-int and MLT2A1. At EGA targets, OTX2 directly binds promoters and putative enhancers, many of which overlap with Alu and MaLR repetitive elements containing the OTX2 motif, and promotes chromatin accessibility. The transcriptome and developmental defects upon OTX2 knockdown are partially rescued by overexpression of TPRX1 and TPRX2. Finally, joint knockdown of OTX2 and TPRXL, encoding another maternal PRD-like homeobox TF, exacerbates chromatin opening and EGA defects at the 8C stage. These findings establish OTX2 as a crucial maternal TF that awakes the genome at the beginning of human life. DNA sequencing Embryogenesis Gene regulation Transcriptomics
N Nature Genetics · Oct 27, 2025 A pangenome of maize provides genetic insights into drought resistance Drought poses a severe threat to the stability of crop yields. It is crucial to identify genetic resources and decipher the molecular mechanisms underlying drought resistance in crops. Here we generated high-quality genome assemblies of 25 maize germplasms exhibiting substantial variation in drought resistance. Combined with 31 additional maize genome sequences, a comprehensive pangenome analysis was performed. Rare allelic variations and extensive regulatory diversity were revealed in abscisic acid-related or drought-related genes, which may contribute to the diversity in drought resistance among germplasms. Furthermore, we identified three genes, ZmUGE2, ZmSIL2 and ZmASI3, that enhance maize drought resistance by strengthening mechanical support of the cell wall, regulating stress-responsive gene expression and coordinating male and female inflorescence development, respectively. Thus, this study provides valuable insight into the genetic control of drought resistance in maize at different growing stages. The expanded maize pangenome information serves as a valuable resource for maize genomic research. Genomics Plant physiology
N Nature Genetics · Oct 22, 2025 Pathogenic UNC13A variants cause a neurodevelopmental syndrome by impairing synaptic function The UNC13A gene encodes a presynaptic protein that is crucial for setting the strength and dynamics of information transfer between neurons. Here we describe a neurodevelopmental syndrome caused by germline coding or splice-site variants in UNC13A. The syndrome presents with variable degrees of developmental delay and intellectual disability, seizures of different types, tremor and dyskinetic movements and, in some cases, death in early childhood. Using assays with expression of UNC13A variants in mouse hippocampal neurons and in Caenorhabditis elegans, we identify three mechanisms of pathogenicity, including reduction in synaptic strength caused by reduced UNC13A protein expression, increased neurotransmission caused by UNC13A gain-of-function and impaired regulation of neurotransmission by second messenger signalling. Based on a strong genotype–phenotype-functional correlation, we classify three UNC13A syndrome subtypes (types A–C). We conclude that the precise regulation of neurotransmitter release by UNC13A is critical for human nervous system function. Genetic testing Neurodevelopmental disorders Neuroscience
N Nature Genetics · Oct 21, 2025 Multi-modal spatial characterization of tumor immune microenvironments identifies targetable inflammatory niches in diffuse large B cell lymphoma Diffuse large B cell lymphomas (DLBCLs) are a heterogeneous group of malignancies that can arise in lymph nodes or extranodal locations, including immune-privileged sites. Here, we applied highly multiplexed spatial transcriptomics and proteomics together with genomic profiling to characterize the immune microenvironment architecture of 78 DLBCL tumors. We define seven distinct cellular niches, each characterized by unique cellular compositions, spatial organizations and patterns of intercellular communication associated with niche-specific phenotypes of both T cells and tumor B cells. Among these, DLBCLs from immune-privileged sites showed abundant T cell infiltration into diffuse niches, where immune cells were intermixed with tumor B cells and bore transcriptional hallmarks of activation and effector function, suggesting that they may be primed for anti-tumor immunity. Spatial characterization of the DLBCL immune microenvironment, therefore, reveals cellular niches that foster divergent patterns of cell–cell interactions contributing to the phenotypic heterogeneity of both niche-resident tumor and immune cells. Cancer microenvironment Translational research
N Nature Genetics · Oct 21, 2025 Genetic diversity and evolution of rice centromeres Understanding the driving force of centromere dynamics is crucial for deciphering the complexity of eukaryotic evolution and speciation. Here we assembled 67 rice genomes from the Oryza AA group and analyzed >800 nearly complete centromeres. Through de novo annotation of centromeric satellite CEN155 sequences and employing a progressive compression strategy, we quantified the local homogenization and multilayer structures of rice satellite arrays. Our results indicate that genetic innovations in rice centromeres primarily arise from structural variations and centrophilic retrotransposon insertions. The single-base substitution rate in rice centromeres appears to be lower relative to that in chromosome arms. Comparisons of CEN155 arrays, retrotransposons and functional centromeres highlight their dynamic but correlated interplay. Contrary to the KARMA model for Arabidopsis centromere evolution, we propose a hypothesis that retrotransposon invasion probably contributes to the decline of progenitor centromeric satellite arrays and promotes centromere repositioning, as evidenced by extended CENH3 chromatin immunoprecipitation sequencing enrichment beyond the native satellite arrays. Genomics Plant genetics Sequencing
N Nature Genetics · Oct 21, 2025 Complete genome assemblies of two mouse subspecies reveal structural diversity of telomeres and centromeres It has been more than 20 years since the publication of the C57BL/6J mouse reference genome, which has been a key catalyst for understanding the biology of mammalian diseases. However, the mouse reference genome still lacks telomeres and centromeres, contains 281 chromosomal sequence gaps and only partially represents many biomedically relevant loci. Here we present the first telomere-to-telomere (T2T) mouse genomes for two key inbred strains, C57BL/6J and CAST/EiJ. These T2T genomes reveal substantial variability in telomere and centromere sizes and structural organization. We thus add an additional 213 Mb of new sequence to the reference genome, which contains 517 protein-coding genes. We also examined two important but incomplete loci in the mouse genome—the pseudoautosomal region (PAR) on the sex chromosomes and KRAB zinc-finger protein loci. We identified distant locations of the PAR boundary, different copy numbers and sizes of segmental duplications and a multitude of amino acid substitution mutations in PAR genes. DNA sequencing Personalized medicine
N Nature Genetics · Oct 20, 2025 Modeling heterogeneity in single-cell perturbation states enhances detection of response eQTLs Identifying response expression quantitative trait loci (reQTLs) can help to elucidate mechanisms of disease associations. Typically, such studies model the effect of perturbation as discrete conditions. However, perturbation experiments usually affect perturbed cells heterogeneously. Here we show that modeling of per-cell perturbation state enhances detection of reQTLs. We use single-cell data to study the effect of perturbations with influenza A virus, Candida albicans, Pseudomonas aeruginosa and Mycobacterium tuberculosis on gene regulation. We found on average 36.9% more reQTLs by accounting for single-cell heterogeneity compared to the standard discrete reQTL model. For example, we detected a decrease in the expression quantitative trait loci effect for PXK with influenza A virus. Furthermore, we found that, on average, 25% of reQTLs have cell-type-specific effects. For example, the reQTL effect for RPS26 was stronger in B cells. Our work provides a general model for more accurate reQTL identification and underscores the value of modeling cell-level variation. Gene regulation Genetics
N Nature Genetics · Oct 20, 2025 Shifted assembly and function of mSWI/SNF family subcomplexes underlie targetable dependencies in dedifferentiated endometrial carcinomas The mammalian (m)SWI/SNF family of chromatin remodelers govern cell type-specific chromatin accessibility and gene expression and assemble as three distinct complexes: canonical BRG1-associated or BRM-associated factor (cBAF), poly(bromo)-associated BAF (PBAF) and noncanonical BAF (ncBAF). ARID1A and ARID1B are paralog subunits that specifically nucleate the assembly of cBAF complexes and are frequently co-mutated in highly aggressive dedifferentiated or undifferentiated endometrial carcinomas (DDEC/UECs). Here in cellular models and primary human tumors, we find that ARID1A and/or ARID1B (ARID1A/B) deficiency-mediated cBAF loss results in increased ncBAF and PBAF biochemical abundance and chromatin-level functions to maintain the DDEC oncogenic state. Furthermore, treatment with clinical-grade SMARCA4 and/or SMARCA2 ATPase inhibitors markedly attenuates DDEC cell proliferation and tumor growth in vivo and synergizes with carboplatin-based chemotherapy to extend survival. These findings reveal the oncogenic contributions of shifted mSWI/SNF family complex stoichiometry and resulting gene-regulatory dysregulation and suggest therapeutic utility of mSWI/SNF small molecule inhibitors in DDEC/UECs and other cBAF-disrupted cancer types. Endometrial cancer Epigenetics
N Nature Genetics · Oct 17, 2025 Genotyping sequence-resolved copy number variation using pangenomes reveals paralog-specific global diversity and expression divergence of duplicated genes Copy number variable (CNV) genes are important in evolution and disease, yet their sequence variation remains a blind spot in large-scale studies. We present ctyper, a method that leverages pangenomes to produce allele-specific copy numbers with locally phased variants from next-generation sequencing samples. Benchmarking on 3,351 CNV genes and 212 challenging medically relevant (CMR) genes, ctyper captures 96.5% of phased variants with ≥99.1% correctness of copy number in CNV genes and 94.8% of phased variants in CMR genes. Ctyper takes 1.5 h to genotype a genome on one CPU. The ctyper genotypes give a 4.81-fold improvement in predictions of gene expression compared to known expression quantitative trait locus (eQTL) variants. Allele-specific expression quantified divergent expression in 7.94% of paralogs and tissue-specific biases in 4.68%. We found reduced expression of SMN2 due to SMN1 conversion, potentially affecting spinal muscular atrophy, and increased expression of translocated duplications of AMY2B. Overall, ctyper enables biobank-scale genotyping of CNV and CMR genes. Gene expression Genomics Sequence annotation Software
N Nature Genetics · Oct 17, 2025 Locityper enables targeted genotyping of complex polymorphic genes The human genome contains many structurally variable polymorphic loci, including several hundred disease-associated genes, almost inaccessible for accurate variant calling. Here we present Locityper, a tool capable of genotyping such challenging genes using short-read and long-read whole-genome sequencing. For each target, Locityper recruits and aligns reads to locus haplotypes, for instance, extracted from a pangenome, and finds the likeliest haplotype pair by optimizing read alignment, insert size and read depth profiles. Across 256 challenging medically relevant loci, Locityper achieves a median quality value (QV) above 35 from both long-read and short-read data, outperforming state-of-the-art Illumina and PacBio HiFi variant calling pipelines by 10.9 and 1.7 points, respectively. Furthermore, Locityper provides access to hyperpolymorphic HLA genes and other gene families, including KIR, MUC and FCGR. With its low running time of 1 h 35 m per sample at eight threads, Locityper is scalable to biobank-sized cohorts, enabling association studies for previously intractable disease-relevant genes. Genome informatics Genomics
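To put the quality values in the Locityper entry above into more familiar terms, the sketch below converts between QV and per-base error rate. It assumes QV is Phred-scaled (QV = -10 log10 of the error rate), which is the usual convention but is not defined in the abstract itself. The function names are illustrative.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Per-base error rate implied by a Phred-scaled quality value."""
    return 10.0 ** (-qv / 10.0)


def error_rate_to_qv(error_rate: float) -> float:
    """Phred-scaled quality value for a per-base error rate in (0, 1]."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must lie in (0, 1]")
    return -10.0 * math.log10(error_rate)


def fold_error_reduction(qv_gain: float) -> float:
    """Factor by which the error rate drops for a given gain in QV points."""
    return 10.0 ** (qv_gain / 10.0)


print(f"QV 35 -> error rate {qv_to_error_rate(35):.2e}")
print(f"+10.9 QV -> {fold_error_reduction(10.9):.1f}-fold fewer errors")
print(f"+1.7 QV  -> {fold_error_reduction(1.7):.2f}-fold fewer errors")
```

Under the Phred assumption, a QV of 35 corresponds to roughly one error per 3,000 bases, and the reported gains of 10.9 and 1.7 points translate into roughly 12-fold and 1.5-fold fewer errors, respectively.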
N Nature Genetics · Oct 16, 2025 Meta-analysis reveals differences in somatic alterations by genetic ancestry across common cancers Genetic similarity of populations (or genetic ancestry) is associated with differences in somatic alterations in cancers. We meta-analyze two targeted panel sequencing cohorts with 275,605 samples from 14 cancer types. Here we find a recurrent depletion of TERT promoter mutations in patients of African and East Asian ancestry across multiple cancers. Several clinically actionable alterations, such as ERBB2 mutations in lung adenocarcinoma and MET mutations in papillary renal cell carcinoma, occur at a higher frequency in patients of non-European ancestry. Furthermore, in both cohorts, we show depletions in total driver alterations in non-European ancestries in multiple cancer types, potentially reflecting biases in current panel-based testing that prioritize established targets derived from predominantly patients of European ancestry. Our study highlights a need to increase population diversity in genomic studies to find new drivers and enhance precision oncology interventions for all populations. Genetics research Personalized medicine Tumour biomarkers
N Nature Genetics · Oct 16, 2025 Characterization of induced cohesin loop extrusion trajectories in living cells Cohesin (SMC1–SMC3–RAD21) constantly extrudes DNA loops to organize chromosomes into structural domains, pausing and anchoring at specific DNA-bound CTCF molecules. To study the detailed consequences of cohesin loop extrusion, we developed TArgeted Cohesin Loader (TACL) for controlled pan-cellular activation of chromatin loop formation at defined genomic locations in living cells. With TACL, we show that highly complex looping networks can exist, with extruding cohesin complexes that block each other, drive cohesin queuing and induce loop anchoring at nearly all CTCF-bound sites. TACL loops extend upon acute depletion of STAG2, PDS5A or WAPL. Activated cohesin loop extrusion hinders local gene transcription and can alter chromatin accessibility and H3K27ac distribution. TACL shows that the loading/extrusion complex NIPBL–MAU2 can be transported by cohesin to CTCF sites but, together with SMC1, to enhancers in a RAD21-independent manner. TACL thus enables studying the consequences of activated loop extrusion at defined genomic locations. Epigenetics Epigenomics
N Nature Genetics · Oct 14, 2025 Activity-based selection for enhanced base editor mutational scanning Base editing is a CRISPR-based technology that enables high-throughput, nucleotide-level functional interrogation of the genome that is essential for understanding the genetic basis of human disease and informing therapeutic development. Base editing screens have emerged as a powerful experimental approach, yet significant cell-to-cell variability in editing efficiency introduces noise that may obscure meaningful results. Here we develop a co-selection method that enriches for cells with high base editing activity, substantially increasing editing efficiency at a target locus. We evaluate this activity-based selection method against a traditional screening approach by tiling guide RNAs across TP53, demonstrating its enhanced capacity to pinpoint specific mutations and protein regions of functional importance. We anticipate that this modular selection method will enhance the resolution of base editing screens across many applications. High-throughput screening Mutagenesis
N Nature Genetics · Oct 13, 2025 Statistical construction of calibrated prediction intervals for polygenic score-based phenotype prediction Accurately quantifying uncertainty in predicted phenotypes from polygenic score (PGS)-based applications is essential for reliable clinical interpretation of PGS, supporting effective disease risk assessment and informed decision-making. Here, we present PredInterval, a nonparametric method for constructing well-calibrated prediction intervals. PredInterval is compatible with any PGS method, takes either individual-level data or summary statistics as input and relies on information from quantiles of phenotypic residuals through cross-validation to achieve well-calibrated coverage of true phenotypic values across diverse genetic architectures. We apply PredInterval to analyze 17 traits in real-data applications, where PredInterval not only represents the sole method achieving well-calibrated prediction coverage across traits, but it also offers a principled approach to identify high-risk individuals using prediction intervals, leading to an average improvement of identification rates by 8.7–830.4% compared with existing approaches. Overall, PredInterval represents a robust and versatile tool for enhancing the clinical utility of PGS. Genome-wide association studies Statistics
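The PredInterval entry above describes building prediction intervals from quantiles of phenotypic residuals obtained through cross-validation. The sketch below illustrates that general idea only. It is not the PredInterval algorithm: the simple linear calibration of phenotype on PGS, the fold scheme, the symmetric interval and all function names are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cv_residual_interval(pgs: Sequence[float], pheno: Sequence[float],
                         pgs_new: float, alpha: float = 0.05,
                         n_folds: int = 5, seed: int = 0) -> Tuple[float, float]:
    """Symmetric (1 - alpha) interval from out-of-fold absolute residuals."""
    n = len(pgs)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    abs_resid: List[float] = []
    for k in range(n_folds):
        held_out = set(order[k::n_folds])  # every n_folds-th shuffled index
        train = [i for i in range(n) if i not in held_out]
        slope, intercept = fit_line([pgs[i] for i in train],
                                    [pheno[i] for i in train])
        # Residuals are computed only on samples the fit has not seen.
        abs_resid.extend(abs(pheno[i] - (slope * pgs[i] + intercept))
                         for i in held_out)
    abs_resid.sort()
    # Finite-sample-adjusted empirical quantile of the absolute residuals.
    rank = min(n, math.ceil((1.0 - alpha) * (n + 1)))
    half_width = abs_resid[rank - 1]
    slope, intercept = fit_line(pgs, pheno)
    centre = slope * pgs_new + intercept
    return centre - half_width, centre + half_width
```

On simulated data where the phenotype is a linear function of the PGS plus independent noise, intervals built this way should cover close to 95% of new phenotypes at alpha = 0.05. Real traits and PGS methods may violate the assumptions used here.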
N Nature Genetics · Oct 10, 2025 Population-scale gene-based analysis of whole-genome sequencing provides insights into metabolic health In addition to its coverage of the noncoding genome, whole-genome sequencing (WGS) may better capture the coding genome than exome sequencing. Here we sought to exploit this and identify new rare, protein-coding variants associated with metabolic health in WGS data (n = 708,956) from the UK Biobank and All of Us studies. Identified genes highlight new biological mechanisms, including protein-truncating variants (PTVs) in the DNA double-strand break repair gene RIF1 that have a substantial effect on body mass index (2.66 kg m⁻², s.e. 0.43, P = 3.7 × 10⁻¹⁰). UBR3 is an intriguing example where PTVs independently increase body mass index and type 2 diabetes risk. Furthermore, PTVs in IRS2 have a substantial effect on type 2 diabetes (odds ratio 6.4 (3.7–11.3), P = 9.9 × 10⁻¹⁴, 34% case prevalence among carriers) and were also associated with chronic kidney disease independent of diabetes status, suggesting an important role for IRS2 in maintaining renal health. Our study demonstrates that large-scale WGS provides new mechanistic insights into human metabolic phenotypes through improved capture of coding sequences. Genetics research Genome-wide association studies Obesity Type 2 diabetes
N Nature Genetics · Oct 10, 2025 Single-cell multi-omic and spatial profiling of esophageal squamous cell carcinoma reveals the immunosuppressive role of GPR116+ pericytes in cancer metastasis Tumor metastasis leads to most cancer deaths. However, how cellular diversity and dynamic cooperation within the tumor microenvironment contribute to metastasis remains poorly understood. Here we leverage single-cell multi-omics (16 samples, 117,169 cells) and spatial transcriptomics (five samples, 195,366 cells) to uncover the cellular and spatial architecture of esophageal squamous cell carcinoma (ESCC), and characterize an immunosuppressive GPR116+ pericyte subset promoting tumor metastasis and immunotherapy resistance. GPR116+ pericyte enrichment is transcriptionally regulated by PRRX1, evidenced by pericyte-specific Prrx1 knockout mice. Mechanistically, GPR116+ pericytes secrete EGFL6 to bind integrin β1 on cancer cells, activating the NF-κB pathway to facilitate metastasis. Serum EGFL6 serves as a noninvasive biomarker for the diagnosis and prognosis of several tumors. Blocking integrin β1 suppresses metastasis and improves immunotherapy response in animal models of ESCC. Collectively, we provide a spatially resolved landscape of the prometastatic tumor microenvironment in ESCC and highlight the biological and clinical importance of GPR116+ pericytes, proposing potential innovative therapeutic strategies for metastatic cancers. Cancer microenvironment Metastasis Oesophageal cancer Tumour immunology
N Nature Genetics · Oct 10, 2025 Nucleotide dependency analysis of genomic language models detects functional elements Deciphering how nucleotides in genomes encode regulatory instructions and molecular machines is a long-standing goal. Genomic language models (gLMs) implicitly capture functional elements and their organization from genomic sequences alone by modeling probabilities of each nucleotide given its sequence context. However, discovering functional genomic elements from gLMs has been challenging due to the lack of interpretable methods. Here we introduce nucleotide dependencies, which quantify how nucleotide substitutions at one genomic position affect the probabilities of nucleotides at other positions. We demonstrate that nucleotide dependencies are more effective at indicating the deleteriousness of genetic variants than alignment-based conservation and gLM reconstruction. Dependency analysis accurately detects regulatory motifs and highlights bases in contact within RNAs, including pseudoknots and tertiary structure contacts, revealing new, experimentally validated RNA structures. Finally, we leverage dependency maps to reveal critical limitations of several gLM architectures and training strategies. Altogether, nucleotide dependency analysis opens a new avenue for discovering and studying functional elements and their interactions in genomes. Computational biology and bioinformatics Genetics Molecular biology
N Nature Genetics · Oct 10, 2025 Spatial signatures for predicting immunotherapy outcomes using multi-omics in non-small cell lung cancer Non-small cell lung cancer (NSCLC) shows variable responses to immunotherapy, highlighting the need for biomarkers to guide patient selection. We applied a spatial multi-omics approach to 234 advanced NSCLC patients treated with programmed death 1-based immunotherapy across three cohorts to identify biomarkers associated with outcome. Spatial proteomics (n = 67) and spatial compartment-based transcriptomics (n = 131) enabled profiling of the tumor immune microenvironment (TIME). Using spatial proteomics, we identified a resistance cell-type signature including proliferating tumor cells, granulocytes, vessels (hazard ratio (HR) = 3.8, P = 0.004) and a response signature, including M1/M2 macrophages and CD4 T cells (HR = 0.4, P = 0.019). We then generated a cell-to-gene resistance signature using spatial transcriptomics, which was predictive of poor outcomes (HR = 5.3, 2.2, 1.7 across Yale, University of Queensland and University of Athens cohorts), while a cell-to-gene response signature predicted favorable outcomes (HR = 0.22, 0.38 and 0.56, respectively). This framework enables robust TIME modeling and identifies biomarkers to support precision immunotherapy in NSCLC. Cancer microenvironment Non-small-cell lung cancer Tumour biomarkers
N Nature Genetics · Oct 03, 2025 A genetic map of human metabolism across the allele frequency spectrum Genetic studies of human metabolism have been limited in scale and allelic breadth. Here we provide a data-driven map of the genetic regulation of circulating small molecules and lipoprotein characteristics (249 traits) measured using proton nuclear magnetic resonance spectroscopy across the allele frequency spectrum in ~450,000 individuals. Trans-ancestral meta-analyses identify 29,824 locus–metabolite associations mapping to 753 regions with effects largely consistent between men and women and large ancestral groups represented in UK Biobank. We observe and classify extreme genetic pleiotropy, identify regulators of lipid metabolism, and assign effector genes at >100 loci through rare-to-common allelic series. We propose roles for genes less established in metabolic control (for example, SIDT2), genes characterized by phenotypic heterogeneity (for example, APOA1) and genes with specific disease relevance (for example, VEGFA). Our study demonstrates the value of broad, large-scale metabolomic phenotyping to identify and characterize regulators of human metabolism. Epidemiology Genome-wide association studies
N Nature Genetics · Oct 03, 2025 Dissecting the impact of transcription factor dose on cell reprogramming heterogeneity using scTF-seq Reprogramming often yields heterogeneous cell fates, yet the underlying mechanisms remain poorly understood. To address this, we developed single-cell transcription factor sequencing (scTF-seq), a single-cell technique that induces barcoded, doxycycline-inducible TF overexpression and quantifies TF dose-dependent transcriptomic changes. Applied to mouse embryonic multipotent stromal cells, scTF-seq generated a gain-of-function atlas for 384 mouse TFs, identifying key regulators of lineage specification, cell cycle control and their interplay. Leveraging single-cell resolution, we uncovered how TF dose shapes reprogramming heterogeneity, revealing both dose-dependent and stochastic cell state transitions. We classified TFs into low-capacity and high-capacity groups, with the latter further subdivided by dose sensitivity. Combinatorial scTF-seq demonstrated that TF interactions can shift from synergistic to antagonistic depending on the relative dose. Altogether, scTF-seq enables the dissection of TF function, dose and cell fate control, providing a high-resolution framework to understand and predict reprogramming outcomes, advancing gene regulation research and the design of cell engineering strategies. This study introduces single-cell transcription factor (TF) sequencing, a single-cell barcoded and doxycycline-inducible TF overexpression approach that reveals dose-sensitive functional classes of TFs and cellular heterogeneity by mapping TF dose-dependent transcriptomic changes during the reprogramming of mouse embryonic multipotent stromal cells. Bioinformatics High-throughput screening RNA sequencing Stem cells Transcriptomics
N Nature Genetics · Oct 03, 2025 Targeting histone H2B acetylated enhanceosomes via p300/CBP degradation in prostate cancer Prostate cancer is driven by oncogenic transcription factor enhanceosomes comprising chromatin and epigenetic regulators. The lysine acetyltransferases p300 and CREB-binding protein (CBP) are key cofactors that activate enhancers through histone acetylation. Here we identify p300/CBP-mediated multisite histone H2B N-terminal acetylation (H2BNTac) as a defining feature of oncogenic enhanceosomes in androgen receptor (AR)-positive prostate cancer. p300/CBP are essential for AR and ETS transcription factor ERG transcriptional activity, and their dual degradation eliminates H2BNTac and histone H3 lysine 27 acetylation at hyperactive enhancers, leading to stronger suppression of oncogenic transcription than targeting either paralog or bromodomain alone. Cytotoxicity profiling across >900 cell lines revealed that tumors with high H2BNTac, including AR-positive prostate cancer, are selectively dependent on p300/CBP. In preclinical models, systemic p300/CBP degradation inhibited tumor growth, synergized with AR antagonists and showed no evident toxicity. These findings position H2BNTac as an epigenetic marker of enhancer addiction and establish dual p300/CBP degradation as a promising therapeutic strategy for enhancer-driven cancers. Epigenetics Prostate cancer Targeted therapies
N Nature Genetics · Sep 30, 2025 Limited overlap between genetic effects on disease susceptibility and disease survival Understanding disease progression is of high biological and clinical interest. Unlike disease susceptibility, whose genetic basis has been abundantly studied, less is known about the genetics of disease progression and its overlap with disease susceptibility. Considering nine common diseases (n_cases ranging from 11,980 to 124,682) across seven biobanks, we systematically compared genetic architectures of susceptibility and progression, defined as disease-specific mortality. We identified only one locus substantially associated with disease-specific mortality and showed that, at a similar sample size, more genome-wide significant loci can be identified in a genome-wide association study of disease susceptibility. Variants substantially affecting disease susceptibility were weakly or not associated with disease-specific mortality. Moreover, susceptibility polygenic scores (PGSs) were weak predictors of disease-specific mortality, while a PGS for general lifespan was substantially associated with disease-specific mortality for seven of nine diseases. We explored alternative definitions of disease progression and found that genetic signals for macrovascular complications in type 2 diabetes overlap with similar phenotypes in the general population; however, these effects are attenuated. Overall, our findings indicate limited similarity in genetic effects between disease susceptibility and disease-specific mortality, suggesting that larger sample sizes, different measures of progression, or the integration of related phenotypes from the general population is needed to identify the genetic underpinnings of disease progression. Clinical genetics Genetic association study
N Nature Genetics · Sep 22, 2025 Rapid epigenomic classification of acute leukemia Acute leukemia requires precise molecular classification and urgent treatment. However, standard-of-care diagnostic tests are time-intensive and do not capture the full spectrum of acute leukemia heterogeneity. Here, we developed a framework to classify acute leukemia using genome-wide DNA methylation profiling. We first assembled a comprehensive reference cohort (n = 2,540 samples) and defined 38 methylation classes. Methylation-based classification matched standard-pathology lineage classification in most cases and revealed heterogeneity in addition to that captured by genetic categories. Using this reference, we developed a neural network (MARLIN; methylation- and AI-guided rapid leukemia subtype inference) for acute leukemia classification from sparse DNA methylation profiles. In retrospective cohorts profiled by nanopore sequencing, high-confidence predictions were concordant with conventional diagnoses in 25 out of 26 cases. Real-time MARLIN classification in patients with suspected acute leukemia provided accurate predictions in five out of five cases, which were typically generated within 2 h of sample receipt. In summary, we present a framework for rapid acute leukemia classification that complements and enhances standard-of-care diagnostics. Acute lymphocytic leukaemia Acute myeloid leukaemia
N Nature Genetics · Sep 22, 2025 Bryophytes hold a larger gene family space than vascular plants After 500 million years of evolution, extant land plants compose the following two sister groups: the bryophytes and the vascular plants. Despite their small size and simple structure, bryophytes thrive in a wide variety of habitats, including extreme conditions. However, the genetic basis for their ecological adaptability and long-term survival is not well understood. A comprehensive super-pangenome analysis, incorporating 123 newly sequenced bryophyte genomes, reveals that bryophytes possess a substantially greater diversity of gene families than vascular plants. This includes a higher number of unique and lineage-specific gene families, originating from extensive new gene formation and continuous horizontal transfer of microbial genes over their long evolutionary history. The evolution of bryophytes’ rich and diverse genetic toolkit, which includes new physiological innovations like unique immune receptors, likely facilitated their spread across different biomes. These newly sequenced bryophyte genomes offer a valuable resource for exploring alternative evolutionary strategies for terrestrial success. High-throughput screening Plant molecular biology
N Nature Genetics · Sep 18, 2025 Genetic and epigenetic screens in primary human T cells link candidate causal autoimmune variants to T cell networks Genetic variants associated with autoimmune diseases are highly enriched within putative cis-regulatory regions of CD4+ T cells, suggesting that they could alter disease risk through changes in gene regulation. However, very few genetic variants have been shown to affect T cell gene expression or function. Here we tested >18,000 autoimmune disease-associated variants for allele-specific effects on expression using massively parallel reporter assays in primary human CD4+ T cells. We find 545 variants that modulate expression in an allele-specific manner (emVars). Primary T cell emVars greatly enrich for likely causal variants, are mediated by common upstream pathways and their putative target genes are highly enriched within a lymphocyte activation network. Using bulk and single-cell CRISPR-interference screens, we confirm that emVar-containing T cell cis-regulatory elements modulate both known and previously unappreciated target genes that regulate T cell proliferation, providing plausible mechanisms by which these variants alter autoimmune disease risk. Autoimmune diseases Functional genomics Gene regulation Immunogenetics
N Nature Genetics · Sep 18, 2025 Pan-UK Biobank genome-wide association analyses enhance discovery and resolution of ancestry-enriched effects Large biobanks, such as the UK Biobank (UKB), enable massive phenome by genome-wide association studies that elucidate genetic etiology of complex traits. However, people from diverse genetic ancestry groups are often excluded from association analyses due to concerns about population structure introducing false positive associations. Here we generate mixed model associations and meta-analyses across genetic ancestry groups, inclusive of a larger fraction of the UK Biobank than previous efforts, to produce freely available summary statistics for 7,266 traits. We build a quality control and analysis framework informed by genetic architecture. Overall, we identify 14,676 significant loci (P < 5 × 10^−8) in the meta-analysis that were not found in the EUR genetic ancestry group alone, including new associations, for example between CAMK2D and triglycerides. We also highlight associations from ancestry-enriched variation, including a known pleiotropic missense variant in G6PD associated with several biomarker traits. We release these results publicly alongside frequently asked questions that describe caveats for interpretation of results, enhancing available resources for interpretation of risk variants across diverse populations. Genome-wide association studies Population genetics
N Nature Genetics · Sep 17, 2025 Genome-wide association meta-analysis of childhood ADHD symptoms and diagnosis identifies new loci and potential effector genes We performed a genome-wide association meta-analysis (GWAMA) of 290,134 attention-deficit/hyperactivity disorder (ADHD) symptom measures of 70,953 unique individuals from multiple raters, ages and instruments (ADHD_SYMP). Next, we meta-analyzed the results with a study of ADHD diagnosis (ADHD_OVERALL). ADHD_SYMP returned no genome-wide significant variants. We show that the combined ADHD_OVERALL GWAMA identified 39 independent loci, of which 17 were new. Using a recently developed gene-mapping method, Fine-mapped Locus Assessment Model of Effector genes, we identified 22 potential ADHD effector genes implicating several new biological processes and pathways. Moderate negative genetic correlations (r_g < −0.40) were observed with multiple cognitive traits. In three cohorts, polygenic scores (PGSs) based on ADHD_OVERALL outperformed PGSs based on ADHD symptoms and diagnosis alone. Our findings support the notion that clinical ADHD is at the extreme end of a continuous liability that is indexed by ADHD symptoms. We show that including ADHD symptom counts helps to identify new genes implicated in ADHD. Behavioural genetics Genome-wide association studies
N Nature Genetics · Sep 15, 2025 Accelerated Bayesian inference of population size history from recombining sequence data This study introduces population history learning by averaging sampled histories (PHLASH), a new method for inferring population history from whole-genome sequence data. It works by drawing random, low-dimensional projections of the coalescent intensity function from the posterior distribution of a pairwise sequentially Markovian coalescent-like model and averaging them together to form an accurate and adaptive estimator. On simulated data, PHLASH tends to be faster and have lower error than several competing methods, including SMC++, MSMC2 and FITCOAL. Moreover, it provides automatic uncertainty quantification and leads to new Bayesian testing procedures for detecting population structure and ancient bottlenecks. The key technical advance is a new algorithm for computing the score function (gradient of the log likelihood) of a coalescent hidden Markov model, which has the same computational cost as evaluating the log likelihood. PHLASH has been released as an easy-to-use Python software package and leverages graphics processing unit acceleration when available. Population genetics Software
N Nature Genetics · Sep 10, 2025 DNA methylation cooperates with genomic alterations during non-small cell lung cancer evolution Aberrant DNA methylation has been described in nearly all human cancers, yet its interplay with genomic alterations during tumor evolution is poorly understood. To explore this, we performed reduced representation bisulfite sequencing on 217 tumor and matched normal regions from 59 patients with non-small cell lung cancer from the TRACERx study to deconvolve tumor methylation. We developed two metrics for integrative evolutionary analysis with DNA and RNA sequencing data. Intratumoral methylation distance quantifies intratumor DNA methylation heterogeneity. MR/MN classifies genes based on the rate of hypermethylation at regulatory (MR) versus nonregulatory (MN) CpGs to identify driver genes exhibiting recurrent functional hypermethylation. We identified DNA methylation-linked dosage compensation of essential genes co-amplified with neighboring oncogenes. We propose two complementary mechanisms that converge for copy number alteration-affected chromatin to undergo the epigenetic equivalent of an allosteric activity transition. Hypermethylated driver genes under positive selection may open avenues for therapeutic stratification of patients. Epigenetics Non-small-cell lung cancer
N Nature Genetics · Sep 08, 2025 Multiancestry brain pQTL fine-mapping and integration with genome-wide association studies of 21 neurologic and psychiatric conditions To understand shared and ancestry-specific genetic control of brain protein expression and its ramifications for disease, we mapped protein quantitative trait loci (pQTLs) in 1,362 brain proteomes from African American, Hispanic/Latin American and non-Hispanic white donors. Among the pQTLs that multiancestry fine-mapping MESuSiE confidently assigned as putative causal pQTLs in a specific population, most were shared across the three studied populations and are referred to as multiancestry causal pQTLs. These multiancestry causal pQTLs were enriched for exonic and promoter regions. To investigate their effects on disease, we modeled the 858 multiancestry causal pQTLs as instrumental variables using Mendelian randomization and genome-wide association study results for neurologic and psychiatric conditions (21 traits in participants with European ancestry, 10 in those with African ancestry and 4 in Hispanic participants). We identified 119 multiancestry pQTL–protein pairs consistent with a causal role in these conditions. Remarkably, 29% of the multiancestry pQTLs in these pairs were coding variants. These results lay an important foundation for the creation of new molecular models of neurologic and psychiatric conditions that are likely to be relevant to individuals across different genetic ancestries. Alzheimer's disease Depression Gene expression profiling Proteome informatics
N Nature Genetics · Sep 08, 2025 Robust and accurate Bayesian inference of genome-wide genealogies for hundreds of genomes The Ancestral Recombination Graph (ARG), which describes the genealogical history of a sample of genomes, is a vital tool in population genomics and biomedical research. Recent advancements have substantially increased ARG reconstruction scalability, but they rely on approximations that can reduce accuracy, especially under model misspecification. Moreover, they reconstruct only a single ARG topology and cannot quantify the considerable uncertainty associated with ARG inferences. Here, to address these challenges, we introduce SINGER (sampling and inferring of genealogies with recombination), a method that accelerates ARG sampling from the posterior distribution by two orders of magnitude, enabling accurate inference and uncertainty quantification for hundreds of whole-genome sequences. Through extensive simulations, we demonstrate SINGER’s enhanced accuracy and robustness to model misspecification compared to existing methods. We demonstrate the utility of SINGER by applying it to individuals of British and African descent within the 1000 Genomes Project, identifying signals of population differentiation, archaic introgression and strong support for ancient polymorphism in the human leukocyte antigen region shared across primates. Population genetics Software
N Nature Genetics · Sep 05, 2025 Genetic variants affecting RNA stability influence complex traits and disease risk Gene expression is modulated jointly by transcriptional regulation and messenger RNA stability, yet the latter is often overlooked in studies on genetic variants. Here, leveraging metabolic labeling data (Bru/BruChase-seq) and a new computational pipeline, RNAtracker, we categorize genes as allele-specific RNA stability (asRS) or allele-specific RNA transcription events. We identify more than 5,000 asRS variants among 665 genes across a panel of 11 human cell lines. These variants directly overlap conserved microRNA target regions and allele-specific RNA-binding protein sites, illuminating mechanisms through which stability is mediated. Furthermore, we identified causal asRS variants using a massively parallel screen (MapUTR) for variants that affect post-transcriptional mRNA abundance, as well as through CRISPR prime editing approaches. Notably, asRS genes were enriched significantly among a multitude of immune-related pathways and contribute to the risk of several immune system diseases. This work highlights RNA stability as a critical, yet understudied mechanism linking genetic variation and disease. Gene expression Transcriptomics
N Nature Genetics · Sep 05, 2025 A multi-tissue single-cell expression atlas in cattle Systematic characterization of the molecular states of cells in livestock tissues is essential for understanding the cellular and genetic mechanisms underlying economically and ecologically important physiological traits. Here, as part of the Farm Animal Genotype-Tissue Expression (FarmGTEx) project, we describe a comprehensive reference map including 1,793,854 cells from 59 bovine tissues in calves and adult cattle, spanning both sexes, which reveals intra-tissue and inter-tissue cellular heterogeneity in gene expression, transcription factor regulation and intercellular communication. Integrative analysis with genetic variants that underpin bovine monogenic and complex traits uncovers cell types of relevance, such as spermatocytes, responsible for sperm motility and excitatory neurons for milk fat yield. Comparative analysis reveals similarities in gene expression between cattle and humans, allowing for the detection of relevant cell types to study human complex phenotypes. This Cattle Cell Atlas will serve as a key resource for cattle genetics and genomics, selective breeding and comparative biology. Data processing Transcriptomics
N Nature Genetics · Sep 04, 2025 DNA methylation influences human centromere positioning and function Maintaining the epigenetic identity of centromeres is essential to prevent genome instability. Centromeres are epigenetically defined by the histone H3 variant CENP-A. Prior work in human centromeres has shown that CENP-A is associated with regions of hypomethylated DNA located within large arrays of hypermethylated repeats, but the functional importance of these DNA methylation (DNAme) patterns remains poorly understood. To address this, we developed tools to perturb centromeric DNAme, revealing that it causally influences CENP-A positioning. We show that rapid loss of methylation results in increased binding of centromeric proteins and alterations in centromere architecture, leading to aneuploidy and reduced cell viability. We also demonstrate that gradual centromeric DNA demethylation prompts a process of cellular adaptation. Altogether, we find that DNAme causally influences CENP-A localization and centromere function, offering mechanistic insights into pathological alterations of centromeric DNAme. Epigenetics
N Nature Genetics · Aug 28, 2025 Telomere attrition becomes an instrument for clonal selection in aging hematopoiesis and leukemogenesis The mechanisms through which mutations in splicing factor genes drive clonal hematopoiesis (CH) and myeloid malignancies, and their close association with advanced age, remain poorly understood. Here we show that telomere maintenance plays an important role in this phenomenon. First, by studying 454,098 UK Biobank participants, we find that, unlike most CH subtypes, splicing-factor-mutant CH is more common in those with shorter genetically predicted telomeres, as is CH with mutations in PPM1D and the TERT gene promoter. We go on to show that telomere attrition becomes an instrument for clonal selection in advanced age, with splicing factor mutations ‘rescuing’ HSCs from critical telomere shortening. Our findings expose the lifelong influence of telomere maintenance on hematopoiesis and identify a potential shared mechanism through which different splicing factor mutations drive leukemogenesis. Understanding the mechanistic basis of these observations can open new therapeutic avenues against splicing-factor-mutant CH and hematological or other cancers. Ageing Genetics research Haematological cancer
N Nature Genetics · Aug 27, 2025 Cross-biobank generalizability and accuracy of electronic health record-based predictors compared to polygenic scores Comparison of electronic health record-based phenotype risk scores (PheRS) and polygenic scores (PGS) across 13 common diseases and three biobank-based studies indicates that PheRS and PGS may provide additive benefits for risk prediction.
N Nature Genetics · Aug 27, 2025 Chromatin dynamics of a large-sized genome provides insights into polyphenism and X0 dosage compensation of locusts Chromosome-level genome assemblies of migratory and desert locusts, coupled with epigenomic profiling of migratory locusts, reveal chromatin dynamics underlying polyphenism and X-linked dosage compensation following autosomal gene translocation.
N Nature Genetics · Aug 27, 2025 Inactivation of β-1,3-glucan synthase-like 5 confers broad-spectrum resistance to Plasmodiophora brassicae pathotypes in cruciferous plants This study implicates GSL5 inactivation in high, broad-spectrum resistance to the clubroot pathogen Plasmodiophora brassicae in Arabidopsis thaliana, Brassica napus, Brassica oleracea and Brassica rapa.
N Nature Genetics · Aug 26, 2025 ERG-driven prostate cancer initiation is cell-context dependent and requires KMT2A and DOT1L Despite the high prevalence of ERG transcription factor translocations in prostate cancer, the mechanism of tumorigenicity remains poorly understood. Using lineage tracing, we find the tumor-initiating activity of ERG resides in a subpopulation of murine basal cells that coexpress luminal genes (BasalLum) and not in the larger population of ERG+ luminal cells. Upon ERG activation, BasalLum cells give rise to highly proliferative intermediate (IM) cells with stem-like features that coexpress basal, luminal, hillock and club marker genes, before transitioning to Krt8+ luminal cells. Transcriptomic analysis of ERG+ human prostate cancers confirms the presence of rare ERG+ BasalLum cells, as well as IM cells whose presence is associated with a worse prognosis. Single-cell analysis revealed a chromatin state in ERG+ IM cells enriched for STAT3 transcription factor binding sites and elevated expression of KMT2A/MLL1 and DOT1L, all three of which are essential for ERG-driven tumorigenicity in vivo. In addition to providing translational opportunities, this work illustrates how single-cell approaches combined with lineage tracing can identify cancer vulnerabilities not evident from bulk analysis. Cancer stem cells Cancer therapy Epigenetics Gene regulation
N Nature Genetics · Aug 26, 2025 Mutational landscape of triple-negative breast cancer in African American women African American (AA) women have the highest incidence of triple-negative breast cancer (TNBC) among all racial groups, but are underrepresented in cancer genomic studies. In 462 AA women with TNBC, we characterized the tumor mutational landscape by whole-exome sequencing and RNA sequencing. We unveiled a high-resolution mutational portrait of TNBC in AA women reminiscent of that in Asian and non-Hispanic white women, with no evidence of associations of mutational features with African ancestry. We also made some distinctive discoveries, including an almost complete dominance of TP53 mutations, low frequency of PIK3CA mutations and mutational signature-based subtypes with etiologic and prognostic significance. These findings do not support major racial differences in TNBC biology at the level of somatic mutations. Our study contributes considerably to diversifying the knowledge base of breast cancer genomics and provides insights into the disease etiology, disparities and therapeutic vulnerability of TNBC in AA women. Breast cancer
N Nature Genetics · Aug 22, 2025 Tamoxifen induces PI3K activation in uterine cancer Mutagenic processes and clonal selection contribute to the development of therapy-associated secondary neoplasms, a known complication of cancer treatment. The association between tamoxifen therapy and secondary uterine cancers is uncommon but well established; however, the genetic mechanisms underlying tamoxifen-driven tumorigenesis remain unclear. We find that oncogenic PIK3CA mutations, common in spontaneously arising estrogen-associated de novo uterine cancer, are significantly less frequent in tamoxifen-associated tumors. In vivo, tamoxifen-induced estrogen receptor stimulation activates phosphoinositide 3-kinase (PI3K) signaling in normal mouse uterine tissue, potentially eliminating the selective benefit of PI3K-activating mutations in tamoxifen-associated uterine cancer. Together, we present a unique pathway of therapy-associated carcinogenesis in which tamoxifen-induced activation of the PI3K pathway acts as a non-genetic driver event, contributing to the multistep model of uterine carcinogenesis. While this PI3K mechanism is specific to tamoxifen-associated uterine cancer, the concept of treatment-induced signaling events may have broader applicability to other routes of tumorigenesis. Breast cancer Endometrial cancer
N Nature Genetics · Aug 22, 2025 Precise modulation of BRG1 levels reveals features of mSWI/SNF dosage sensitivity Mammalian switch/sucrose nonfermentable (mSWI/SNF) complex regulates chromatin accessibility and frequently shows alterations due to mutation in cancer and neurological diseases. Inadequate expression of mSWI/SNF in heterozygous mice can lead to developmental defects, indicating dosage-sensitive effects of mSWI/SNF. However, how its dosage affects function has remained unclear. Using a targeted protein degradation system, we investigated its dosage-sensitive effects by precisely controlling protein levels of BRG1, the ATPase subunit of the mSWI/SNF complex. We found that binding of BRG1 to chromatin exhibited a linear response to the BRG1 protein level. Although chromatin accessibility at most promoters and insulators was largely unaffected by BRG1 depletion, 44% of enhancers, including 84% of defined superenhancers, showed reduced accessibility. Notably, half of the BRG1-regulated enhancers, particularly superenhancers, exhibited a buffered response to BRG1 loss. Consistently, transcription exhibited a predominantly buffered response to changes in BRG1 levels. Collectively, our findings demonstrate a genomic feature-specific response to BRG1 dosage, shedding light on the dosage-sensitive effects of mSWI/SNF complex defects in cancer and other diseases. Cancer Epigenetics Epigenomics Gene regulation Stem cells
N Nature Genetics · Aug 20, 2025 Temporal genomic dynamics shape clinical trajectory in multiple myeloma Multiple myeloma evolution is characterized by the accumulation of genomic drivers over time. To unravel this timeline and its impact on clinical outcomes, we analyzed 421 whole-genome sequences from 382 patients. Using clock-like mutational signatures, we estimated a time lag of two to four decades between the initiation of events and diagnosis. We demonstrate that odd-numbered chromosome trisomies in patients with hyperdiploidy can be acquired simultaneously with other chromosomal gains (for example, 1q gain). We show that hyperdiploidy is acquired after immunoglobulin heavy chain translocation when both events co-occur. Finally, patients with early 1q gain had adverse outcomes similar to those with 1q amplification (>1 extra copy), but fared worse than those with late 1q gain. This finding underscores that the 1q gain prognostic impact depends more on the timing of acquisition than on the number of copies gained. Overall, this study contributes to a better understanding of the life history of myeloma and may have prognostic implications. Data processing Myeloma Oncogenes
N Nature Genetics · Aug 20, 2025 Super-pangenome analyses across 35 accessions of 23 Avena species highlight their complex evolutionary history and extensive genomic diversity Common oat, belonging to the genus Avena with 30 recognized species, is a nutritionally important cereal crop and high-quality forage worldwide. Here, we construct a genus-level super-pangenome of Avena comprising 35 high-quality genomes from 14 cultivated oat accessions and 21 wild species. The fully resolved phylogenomic analysis unveils the origin and evolutionary scenario of Avena species, and the super-pangenome analysis identifies 26.62% and 59.93% specific genes and haplotypes in wild species. We delineate the landscape of structural variations (SVs) and the transcriptome profile based on 1,401 RNA-sequencing (RNA-seq) samples from diverse abiotic stress treatments in oat. We highlight the crucial role of SVs in modulating gene expression and shaping adaptation to diverse stresses. Further combining SV-based genome-wide association studies (GWASs), we characterize 13 candidate genes associated with drought resistance such as AsARF7, validated by transgenic oat lines. Our study provides unprecedented genomic resources to facilitate genomic, evolution and molecular breeding research in oat. Gene expression Genomics Plant breeding Transcriptomics
N Nature Genetics · Aug 20, 2025 Single-cell DNA methylome and 3D genome atlas of human subcutaneous adipose tissue The cell-type-level epigenomic landscape of human subcutaneous adipose tissue (SAT) is not well characterized. Here, we elucidate the epigenomic landscape across SAT cell types using snm3C-seq. We find that SAT CG methylation (mCG) displays pronounced hypermethylation in myeloid cells and hypomethylation in adipocytes and adipose stem and progenitor cells, driving nearly half of the 705,063 differentially methylated regions (DMRs). Moreover, TET1 and DNMT3A are identified as plausible regulators of the cell-type-level mCG profiles. Both global mCG profiles and chromosomal compartmentalization reflect SAT cell-type lineage. Notably, adipocytes display more short-range chromosomal interactions, forming complex local 3D genomic structures that regulate transcriptional functions, including adipogenesis. Furthermore, adipocyte DMRs and A compartments are enriched for abdominal obesity genome-wide association study (GWAS) variants and polygenic risk, while myeloid A compartments are enriched for inflammation. Together, we characterize the SAT single-cell-level epigenomic landscape and link GWAS variants and partitioned polygenic risk of abdominal obesity and inflammation to the SAT epigenome. Epigenetics Epigenomics Obesity Sequencing
N Nature Genetics · Aug 19, 2025 A comparison of 27 Arabidopsis thaliana genomes and the path toward an unbiased characterization of genetic polymorphism Making sense of whole-genome polymorphism data is challenging, but it is essential for overcoming the biases in SNP data. Here we analyze 27 genomes of Arabidopsis thaliana to illustrate these issues. Genome size variation is mostly due to tandem repeat regions that are difficult to assemble. However, while the rest of the genome varies little in length, it is full of structural variants, mostly due to transposon insertions. Because of this, the pangenome coordinate system grows rapidly with sample size and ultimately becomes 70% larger than the size of any single genome, even for n = 27. Finally, we show how short-read data are biased by read mapping. SNP calling is biased by the choice of reference genome, and both transcriptome and methylome profiling results are affected by mapping reads to a reference genome rather than to the genome of the assayed individual. Plant genetics Population genetics
N Nature Genetics · Aug 18, 2025 Tracing the evolution of single-cell 3D genomes in Kras-driven cancers Although three-dimensional (3D) genome structures are altered in cancer, it remains unclear how these changes evolve and diversify during cancer progression. Leveraging genome-wide chromatin tracing to visualize 3D genome folding directly in tissues, we generated 3D genome cancer atlases of oncogenic Kras-driven mouse lung adenocarcinoma (LUAD) and pancreatic ductal adenocarcinoma. Here we define nonmonotonic, stage-specific alterations in 3D genome compaction, heterogeneity and compartmentalization as cancers progress from normal to preinvasive and ultimately to invasive tumors, discovering a potential structural bottleneck in early tumor progression. Remarkably, 3D genome architectures distinguish morphologic cancer states in single cells, despite considerable cell-to-cell heterogeneity. Analyses of genome compartmentalization changes not only showed that compartment-associated genes are more homogeneously regulated but also elucidated prognostic and dependency genes in LUAD, as well as an unexpected role for Rnf2 in 3D genome regulation. Our results highlight the power of single-cell 3D genome mapping to identify diagnostic, prognostic and therapeutic biomarkers in cancer. Epigenomics Gene regulation Molecular imaging Non-small-cell lung cancer Tumour biomarkers