N Nature Methods · Dec 04, 2025 C-COMPASS: a user-friendly neural network tool profiles cell compartments at protein and lipid levels Systematic proteomic organelle profiling methods including protein correlation profiling and LOPIT have advanced our understanding of cellular compartmentalization. To manage the complexity of organelle profiling data, we introduce C-COMPASS, a user-friendly open-source software that employs a neural network-based regression model to predict the spatial cellular distribution of proteins. C-COMPASS handles complex multilocalization patterns and integrates protein abundance to model organelle composition changes across conditions. We apply C-COMPASS to mice with humanized livers to elucidate organelle remodeling during metabolic perturbations. Additionally, by training neural networks with co-generated marker protein profiles, C-COMPASS extends spatial profiling to lipids, overcoming the lack of organelle-specific lipid markers, allowing for determination of localization and tracking of lipid species across different compartments. This provides integrated snapshots of organelle lipid and protein compositions. Overall, C-COMPASS offers an accessible tool for multiomic studies of organelle dynamics without needing advanced computational skills, empowering researchers to explore new questions in lipidomics, proteomics and organelle biology. Computational platforms and environments Organelles Proteomics biology mouse experiments
N Nature Methods · Dec 04, 2025 Latent space-based network analysis for brain–behavior linking in neuroimaging We propose a latent space-based statistical network analysis (LatentSNA) method that implements network science in a generative Bayesian framework, preserves neurologically meaningful brain topology and improves statistical power for imaging biomarker detection. LatentSNA (1) addresses the lack of power and inflated type II errors in current analytic approaches when detecting imaging biomarkers, (2) allows unbiased estimation of the influence of biomarkers on behavioral variants, (3) quantifies uncertainty and evaluates the likelihood of estimated biomarker effects against chance and (4) improves brain–behavior prediction in new samples as well as the clinical utility of neuroimaging findings. LatentSNA is broadly applicable across multiple imaging modalities and outcome measures in developing, aging and transdiagnostic cohorts, totaling 8,003 to 11,861 participants. LatentSNA achieves substantial accuracy gains (averaging 110–150%) and replicability improvements (averaging 153%) over existing approaches in moderate to large datasets. As a result, LatentSNA elucidates how network topology is implicated in brain–behavior relationships. Network topology Statistical methods
N Nature Methods · Dec 04, 2025 Deep Imputation for Skeleton data (DISK) for behavioral science Pose estimation methods and motion capture systems have opened doors to quantitative measurements of animal kinematics. While animal behavior experiments are expensive and complex, tracking errors sometimes make large portions of the experimental data unusable. Here our deep learning method, Deep Imputation for Skeleton data (DISK), uncovers dependencies between keypoints and their dynamics to impute missing tracking data without the help of any manual annotations. We demonstrate the utility and performance of DISK on seven animal skeletons including multi-animal setups. The imputed recordings allow us to detect more episodes of motion, such as steps, and obtain more statistically robust results when comparing these episodes between experimental conditions. In addition, by learning to impute the missing content, DISK learns meaningful representations of the data capturing, for example, underlying actions. This stand-alone imputation package, available at https://github.com/bozeklab/DISK.git/, is applicable to outputs of tracking methods (marker-based or markerless) and allows for varied types of downstream analysis. Computational neuroscience Machine learning
N Nature Methods · Dec 03, 2025 Atom-level enzyme active site scaffolding using RFdiffusion2 Designing new enzymes typically begins with idealized arrangements of catalytic functional groups around a reaction transition state, then attempts to generate protein structures that precisely position these groups. Current AI-based methods can create active enzymes but require predefined residue positions and rely on reverse-building residue backbones from side-chain placements, which limits design flexibility. Here we show that a new deep generative model, RoseTTAFold diffusion 2 (RFdiffusion2), overcomes these constraints by designing enzymes directly from functional group geometries without specifying residue order or performing inverse rotamer generation. RFdiffusion2 successfully generates scaffolds for all 41 active sites in a diverse benchmark, compared to 16 using previous methods. We further design enzymes for three distinct catalytic mechanisms and identify active candidates after experimentally testing fewer than 96 sequences in each case. These results highlight the potential of atomic-level generative modeling to create de novo enzymes directly from reaction mechanisms. Enzymes Protein design biology
N Nature Methods · Dec 02, 2025 Molecular-scale isotropic 3D super-resolution microscopy via interference localization Three-dimensional (3D) nanoscale imaging reveals the detailed morphology of subcellular structures; however, conventional single-molecule localization microscopy is constrained by limited axial resolution. Here we introduce ROSE-3D, an interferometric localization approach that enables isotropic 3D super-resolution imaging with uniform performance across the entire depth of field. Compared with conventional astigmatism-based methods, ROSE-3D improves lateral localization precision by 2–6 times and axial precision by 3.5–8 times over a depth of field of approximately 1.2 μm. Leveraging its multicolor and whole-cell imaging capabilities, ROSE-3D resolves, in situ, the nanoscale organization of nuclear lamins and the assemblies of mitochondrial fission-related protein DRP1. These results establish ROSE-3D as a powerful tool for interrogating nanoscale cellular architecture. Fluorescence imaging Super-resolution microscopy biology
N Nature Methods · Dec 02, 2025 CaBLAM: a high-contrast bioluminescent Ca²⁺ indicator derived from an engineered Oplophorus gracilirostris luciferase Monitoring intracellular calcium is central to understanding cell signaling across nearly all cell types and organisms. Fluorescent genetically encoded calcium indicators (GECIs) remain the standard tools for in vivo calcium imaging, but require intense excitation light, leading to photobleaching, background autofluorescence and phototoxicity. Bioluminescent GECIs, which generate light enzymatically, eliminate these artifacts but have been constrained by low dynamic range and suboptimal calcium affinities. Here we show that CaBLAM (‘calcium bioluminescence activity monitor’), an engineered bioluminescent calcium indicator, achieves an order-of-magnitude improvement in signal contrast and a tunable affinity matched to physiological cytosolic calcium. CaBLAM enables single-cell and subcellular activity imaging at video frame rates in cultured neurons and sustained imaging over hours in awake, behaving animals. These capabilities establish CaBLAM as a robust and general alternative to fluorescent GECIs, extending calcium imaging to regimes where excitation light is undesirable or infeasible. Bioluminescence imaging Cellular neuroscience biology
N Nature Methods · Dec 02, 2025 Parallel stopped-flow interrogation of diverse biological systems at the single-molecule scale Single-molecule imaging techniques have provided unprecedented insights into functional changes in composition and conformation across diverse biological systems. As with other biophysical methods, single-molecule fluorescence and Förster resonance energy transfer investigations are typically limited to examination of one sample at a time. Consequently, experimental throughput is restricted, and experimental variances are introduced that can obscure functional distinctions in closely related systems. Here, to address these limitations, we introduce parallel rapid exchange single-molecule fluorescence and single-molecule Förster resonance energy transfer to enable simultaneous steady-state and pre-steady-state interrogations of diverse systems. Using this approach, we elucidate the timing of distinct conformational events underpinning β-arrestin1 activation, unmask antibiotic-induced impacts on messenger RNA decoding fidelity and demonstrate that endogenously encoded ribosomal RNA sequence variation modulates antibiotic sensitivity. This generalizable and scalable method promises to broaden the scope and reproducibility of quantitative single-molecule interrogations of biomolecular function. Fluorescence imaging Fluorescence resonance energy transfer Proteins Single-molecule biophysics Translation biology
N Nature Methods · Nov 28, 2025 ExoSloNano: multimodal nanogold labels for identification of macromolecules in live cells and cryo-electron tomograms In situ cryo-electron microscopy (cryo-EM) enables the direct interrogation of structure–function relationships by resolving macromolecular structures in their native cellular environment. Recent progress in sample preparation, imaging and data processing has enabled the identification and determination of large biomolecular complexes. However, the majority of proteins are of a size that still eludes identification in cellular cryo-EM data, and most proteins exist in low copy numbers. Therefore, novel tools are needed for cryo-EM to identify macromolecules across multiple size scales (from microns to nanometers). Here we introduce nanogold probes for detecting specific proteins using correlative light and electron microscopy, cryo-electron tomography (cryo-ET) and resin-embedded electron microscopy. These nanogold probes can be introduced into live cells, in a manner that preserves intact molecular networks and cell viability. We use this ExoSloNano system to identify both cytoplasmic and nuclear proteins by room-temperature electron microscopy, and resolve associated structures by cryo-ET. By providing high-efficiency protein labeling in live cells and molecular specificity within cryo-ET tomograms, ExoSloNano expands the proteome available to electron microscopy. Cellular imaging Cryoelectron tomography Fluorescence imaging Sensors and probes biology
N Nature Methods · Nov 28, 2025 DeepCor: denoising fMRI data with contrastive autoencoders Functional magnetic resonance imaging (fMRI) allows noninvasive measurement of neural activity with high spatial resolution. However, fMRI data are affected by noise. Here we introduce and evaluate a denoising method (DeepCor) that utilizes deep generative models to disentangle and remove noise. The method is applicable to data from single participants. DeepCor outperforms other state-of-the-art denoising approaches on a variety of simulated datasets. In real fMRI data, DeepCor enhances BOLD signal responses to face stimuli, outperforming CompCor by 215%. Cognitive neuroscience Computational biology and bioinformatics
N Nature Methods · Nov 28, 2025 Assessment of computational methods in predicting TCR–epitope binding recognition T cell receptors (TCRs) play a vital role in immune recognition by binding specific epitopes. Accurate prediction of TCR–epitope interactions is fundamental for advancing immunology research. Although numerous computational methods have been developed, a comprehensive evaluation of their performance remains lacking. Here we assessed 50 state-of-the-art TCR–epitope prediction models using 21 datasets covering 762 epitopes and hundreds of thousands of binding TCRs. Our analysis revealed that the source of negative TCRs substantially impacts model accuracy, with external negatives potentially introducing uncontrolled confounders. Model performance generally improved with more TCRs per epitope, highlighting the importance of large and diverse datasets. Models incorporating multiple features typically outperformed those using only complementarity-determining region 3β information, yet all struggled to generalize to unseen epitopes. The use of independent test sets proved crucial for unbiased assessment on both seen and unseen epitopes. These insights will guide the development of more accurate and generalizable TCR–epitope prediction models for real-world applications. Adaptive immunity Computational models biology
N Nature Methods · Nov 27, 2025 A comprehensive foundation model for cryo-EM image processing Cryogenic electron microscopy (cryo-EM) has become a premier technique for determining high-resolution structures of biological macromolecules. However, its broad application is constrained by the demand for specialized expertise. Here, to address this limitation, we introduce the Cryo-EM Image Evaluation Foundation (Cryo-IEF) model, a versatile tool pre-trained on ~65 million cryo-EM particle images through unsupervised learning. Cryo-IEF performs diverse cryo-EM processing tasks, including particle classification by structure, pose-based clustering and image quality assessment. Building on this foundation, we developed CryoWizard, a fully automated single-particle cryo-EM processing pipeline enabled by fine-tuned Cryo-IEF for efficient particle quality ranking. CryoWizard resolves high-resolution structures across samples of varied properties and effectively mitigates the prevalent challenge of preferred orientation in cryo-EM. Cryoelectron microscopy Machine learning Proteins biology
N Nature Methods · Nov 27, 2025 FX-Cell: a method for single-cell RNA sequencing on difficult-to-digest and cryopreserved plant samples Single-cell RNA sequencing in plants requires the isolation of high-quality protoplasts—cells devoid of cell walls. However, many plant tissues and organs are resistant to enzymatic digestion, posing a significant barrier to advancing single-cell multi-omics in plant research. Furthermore, for field-grown crops, the lack of immediate laboratory facilities presents another major challenge for timely protoplast preparation. Here, to address these limitations, we developed FX-Cell and its derivatives, FXcryo-Cell and cryoFX-Cell, to enable single-cell RNA sequencing with both difficult-to-digest and cryopreserved plant samples. By optimizing the fixation buffer and minimizing RNA degradation, our approach ensures efficient cell wall digestion at high temperatures while maintaining high-quality single cells, even after long-term storage at −80 °C, and circumvents use of nuclei, which are not representative of the pool of translatable messenger RNAs. We successfully constructed high-quality cell atlases for rice tiller nodes, rhizomes of wild rice and maize crown roots grown under field conditions. Moreover, these methods enable the accurate reconstruction of plant acute wounding responses at single-cell resolution. Collectively, these advancements expand the applicability of plant single-cell genomics across a wider range of species and tissues, paving the way for comprehensive Plant Cell Atlases for plant species. Plant development RNA sequencing biology
N Nature Methods · Nov 27, 2025 Nondestructive X-ray tomography of brain tissue ultrastructure Maps of biological tissues at subcellular detail are key for understanding how organs function. X-ray nanotomography is a promising alternative to volume electron microscopy: it has the potential to nondestructively image millimeter-sized samples at ultrastructural resolution within a few days. A fundamental barrier is that the intense X-rays required for imaging also deform and disintegrate the tissue samples. Here we show a combination of solutions that overcome this barrier: we used a cryogenic and stable sample stage, tailored nonrigid tomographic reconstruction algorithms and an epoxy resin developed for the nuclear and aerospace industry. Tissue samples were resistant to radiation doses exceeding 1.15 × 10¹⁰ Gy, and sub-40 nm isotropic resolution allowed identifying axon bundles, dendrites and synapses in mouse brain tissue without physical sectioning. Using volume electron microscopy, we demonstrate that tissue ultrastructure remains intact after X-ray imaging. Together, this unlocks the potential of X-ray tomography for high-resolution tissue imaging. Imaging Neuroscience biology mouse experiments
N Nature Methods · Nov 26, 2025 A highly photostable monomeric red fluorescent protein for dual-color 3D STED and time-lapse 3D SIM imaging Highly photostable red fluorescent proteins (RFPs) are invaluable for dual-color fluorescence microscopy, including super-resolution microscopy. Here we present mScarlet3‑S2, an RFP that exhibits a 29-fold improvement in photostability over its predecessor, mScarlet3, and outperforms other existing RFPs. This high photostability enables prolonged 2D and 3D imaging using both structured illumination microscopy and stimulated emission depletion microscopy. Using mScarlet3‑S2, we achieved over 150 Z-stacks in 3D STED imaging, revealing the architecture of the endoplasmic reticulum (ER) in detail. Key findings facilitated by mScarlet3‑S2 include nonplanar ER junctions, nuclear envelope (NE) invaginations, 3D maps of ER–NE contacts, diverse contact morphotypes (punctate, ribbon-like and branched) and polarized ER–NE junction distributions. These findings redefine our structural understanding of the ER–NE interface and demonstrate the value of mScarlet3‑S2 in revealing subcellular complexity. Fluorescence imaging Super-resolution microscopy biology
N Nature Methods · Nov 24, 2025 Helixer: ab initio prediction of primary eukaryotic gene models combining deep learning and a hidden Markov model The accurate identification of genes is vital for understanding biological function, yet this remains challenging across many newly sequenced or less-studied species. Here we present Helixer, an artificial intelligence-based tool for ab initio gene prediction that delivers highly accurate gene models across fungal, plant, vertebrate and invertebrate genomes. Unlike traditional methods, Helixer operates without requiring additional experimental data such as RNA sequencing, making it broadly applicable to diverse species. We show that Helixer’s pretrained models achieve accuracy on par with or exceeding current tools, producing gene annotations that closely match expert-curated references across multiple evaluation metrics. Its design enables immediate use on genomes without retraining, providing an efficient, accessible solution for genome annotation in both research and applied settings. The tool is available as open-source software for local installation via GitHub. It is also available through an online web interface and the Galaxy ToolShed. Computational biology and bioinformatics Genome informatics Machine learning biology
N Nature Methods · Nov 24, 2025 Scalable spatial single-cell transcriptomics and translatomics in 3D thick tissue blocks Characterizing the transcriptional and translational gene expression patterns at the single-cell level within their three-dimensional (3D) tissue context is essential for revealing how genes shape tissue structure and function in health and disease. However, most existing spatial profiling techniques are limited to 5–20 µm thin tissue sections. Here, we developed Deep-STARmap and Deep-RIBOmap, which enable 3D in situ quantification of thousands of gene transcripts and their corresponding translation activities, respectively, within 60–200-µm thick tissue blocks. This is achieved through scalable probe synthesis, hydrogel embedding with efficient probe anchoring and robust cDNA crosslinking. We first utilized Deep-STARmap in combination with multicolor fluorescent protein imaging for simultaneous molecular cell typing and 3D neuron morphology tracing in the mouse brain. We also demonstrate that 3D spatial profiling facilitates comprehensive and quantitative analysis of tumor–immune interactions in human skin cancer. Gene expression Molecular neuroscience RNA Transcriptomics Tumour heterogeneity biology mouse experiments
N Nature Methods · Nov 24, 2025 TIRTL-seq: deep, quantitative and affordable paired TCR repertoire sequencing The specificity of T cells is determined by T cell receptor (TCR) α and β chain sequences. While bulk TCR sequencing enables cost-effective repertoire profiling without chain pairing information, single-cell approaches provide paired data but are costly and limited in throughput. Here we present throughput-intensive rapid TCR library sequencing (TIRTL-seq), an experimental and computational methodology for paired TCR repertoire sequencing (TCR-seq). TIRTL-seq is based on the parallel generation of hundreds of TCR libraries in 384-well plates at less than US$200 per plate, allowing cohort-scale paired TCR-seq studies. We benchmarked TIRTL-seq against state-of-the-art bulk TCR-seq and 10x Genomics Chromium technologies on longitudinal samples and identified severe acute respiratory syndrome coronavirus 2- and Epstein–Barr virus-specific clonal expansions after infection with distinct dynamics. TIRTL-seq offers a universal protocol scalable from a single cell to millions of T cells per sample, simultaneously delivering both precise clonal frequency estimation and accurate TCR chain pairing, combining the strengths of bulk and single-cell TCR-seq. TIRTL-seq is a high-throughput method for paired T cell receptor sequencing at the cohort scale. Adaptive immunity Immunological techniques Sequencing Software Systems biology biology
N Nature Methods · Nov 20, 2025 4Pi-SIMFLUX: 4Pi single-molecule localization microscopy with structured illumination Single-molecule localization microscopy (SMLM) has transformed biological imaging by enabling nanoscale visualization of intricate subcellular structures. However, conventional three-dimensional SMLM techniques typically exhibit lower axial resolution than lateral resolution, hindering isotropic investigations. Interferometric approaches, such as 4Pi-SMLM, enhance axial resolution by approximately fivefold through dual-objective coherent fluorescence detection, surpassing lateral resolution. Here we present 4Pi-SIMFLUX, which integrates structured illumination into 4Pi-SMLM to double its lateral resolution, achieving near-isotropic three-dimensional localization precision of 2–3 nm. We demonstrate that 4Pi-SIMFLUX breaks the 10-nm resolution barrier in biological samples, resolving microtubule ultrastructure and nuclear pore complexes with exceptional detail and clarity, while accounting for label size and localization density. Furthermore, it enables simultaneous multicolor imaging for interrogating multiple cellular components and high-fidelity, whole-cell visualization that captures comprehensive spatial organization. 4Pi-SIMFLUX effectively bridges the axial–lateral resolution gap, establishing a robust tool for molecular-scale imaging in complex cellular environments. Organelles Super-resolution microscopy
N Nature Methods · Nov 18, 2025 Light-induced extracellular vesicle and particle adsorption The role of extracellular vesicles (EVs) and particles (EPs/EVPs) in human health and disease has garnered considerable attention over the past two decades. However, while several types of EVPs are known to interact dynamically with the extracellular matrix and there is great potential value in producing high-fidelity EVP micropatterns, there are currently no label-free, scalable and tunable platform technologies with this capability. We introduce light-induced extracellular vesicle and particle adsorption (LEVA) as a powerful solution to study surface-bound EVPs. The versatility of LEVA is demonstrated using GFP–EV standards, EVs from conventional and bioreactor cultures, DiFi exomeres and Escherichia coli EVs, with the resulting patterns used for single-EV fluorescence imaging, cell migration on migrasome-mimetic trails and bacterial EV-mediated neutrophil swarming. LEVA will rapidly advance our understanding of extracellular matrix protein- and surface-bound EVPs and should encourage researchers from many disciplines to create new biomimetic, immunoengineering and other assays. LEVA is a label-free immobilization method for studying surface-bound extracellular vesicles. Cell biology Fluorescence imaging Lab-on-a-chip Nanoparticles
N Nature Methods · Nov 18, 2025 Multiplexed ultrasound imaging of gene expression Acoustic reporter genes (ARGs) have enabled imaging of gene expression with ultrasound, which provides high resolution access to deep, optically opaque living tissues. However, unlike their fluorescent counterparts, ARGs have so far been limited to a single ‘sound’, preventing multiplexed imaging of cellular states or populations. Here we use rational protein design and directed evolution to develop two new ARGs that can be distinguished from each other based on their acoustic pressure-response profiles, enabling ‘two-tone’ ultrasound imaging of gene expression. We demonstrate the utility of multiplexed ARGs for delineating bacterial cell species and cell states in vitro, and then apply them towards imaging distinct subpopulations of probiotics in the mouse gastrointestinal tract and of tumor-colonizing bacterial agents in vivo. Just as the first wavelength-shifted derivatives of fluorescent proteins opened a vivid world for optical microscopy, our next-generation acoustic proteins set the stage for a rich symphony of ultrasound signals from living subjects. Molecular imaging Ultrasound
N Nature Methods · Nov 18, 2025 ImmunoMatch learns and predicts cognate pairing of heavy and light immunoglobulin chains The development of stable antibodies formed by compatible heavy (H) and light (L) chain pairs is crucial in both in vivo maturation of antibody-producing cells and ex vivo designs of therapeutic antibodies. We present ImmunoMatch, a machine-learning framework trained on paired H and L sequences from human B cells to identify molecular features underlying chain compatibility. ImmunoMatch distinguishes cognate from random H–L pairs and captures differences associated with κ and λ light chains, reflecting B cell selection mechanisms in the bone marrow. We apply ImmunoMatch to reconstruct paired antibodies from spatial VDJ sequencing data and study the refinement of H–L pairing across B cell maturation stages in health and disease. We find further that ImmunoMatch is sensitive to sequence differences at the H–L interface. These insights provide a computational lens into the broader biological principles governing antibody assembly and stability. Adaptive immunity Lymphocytes Machine learning Software
N Nature Methods · Nov 17, 2025 A multi-omics molecular landscape of 30 tissues in aging female rhesus macaques A systematic investigation of aging patterns across virtually all major tissues in nonhuman primates, our evolutionarily closest relatives, can provide valuable insights into tissue aging in humans, which is still elusive largely due to the difficulty in sampling. Here, we generated and analyzed multi-omics data, including transcriptome, proteome and metabolome, from 30 tissues of 17 female rhesus macaques (Macaca mulatta) aged 3–27 years. We found that certain molecular features, such as increased inflammation, are consistent across tissues and align with findings in mice and humans. We further revealed that tissue aging in macaques is asynchronous and can be classified into two distinct types, with one type exhibiting more pronounced aging degree, likely associated with decreased mRNA translation efficiency, and predominantly contributing to whole-body aging. This work provides a comprehensive molecular landscape of aging in nonhuman primate tissues and links translation efficiency to tissue-specific aging. Genetic models Proteomics Sequencing Transcriptomics Zoology
N Nature Methods · Nov 13, 2025 Confocal Airy beam oblique light-sheet tomography for brain-wide cell type distribution and morphology Advanced brain-wide mapping is critical for addressing complex questions in neuroscience. However, current imaging methods are limited by throughput, resolution and signal-to-noise ratio, constraining their broader applicability. Here, we present confocal Airy beam integrated with single-photon oblique light-sheet tomography (CAB-OLST): a system that integrates single-photon excitation with a scanned Airy beam light sheet, virtual slit detection and automated mechanical sectioning. CAB-OLST enables high-throughput, high-resolution and high-signal-to-noise ratio volumetric imaging, achieving an optical resolution of 0.77 μm × 0.49 μm × 2.61 μm. This allows for mouse brain-wide cell type distribution mapping at a voxel size of 0.37 μm × 0.37 μm × 1.77 μm in 10 h and single-neuron projectome imaging with a voxel size of 0.26 μm × 0.26 μm × 1.06 μm over 58 h. Compared to existing light-sheet and point-scanning systems, CAB-OLST provides a scalable and robust platform for comprehensive neuronal morphology reconstruction and high-precision cell atlas generation. Confocal Airy beam integrated with single-photon oblique light-sheet tomography (CAB-OLST) is a high-throughput imaging approach for brain-wide mapping of neurons, as demonstrated in cleared mouse brains. Fluorescence imaging Light-sheet microscopy Mouse Neuroscience
N Nature Methods · Nov 13, 2025 Jaxley: differentiable simulation enables large-scale training of detailed biophysical models of neural dynamics Biophysical neuron models provide insights into cellular mechanisms underlying neural computations. A central challenge has been to identify parameters of detailed biophysical models such that they match physiological measurements or perform computational tasks. Here we describe a framework for simulating biophysical models in neuroscience—Jaxley—which addresses this challenge. By making use of automatic differentiation and GPU acceleration, Jaxley enables optimizing large-scale biophysical models with gradient descent. Jaxley can learn biophysical neuron models to match voltage or two-photon calcium recordings, sometimes orders of magnitude more efficiently than previous methods. Jaxley also makes it possible to train biophysical neuron models to perform computational tasks. We train a recurrent neural network to perform working memory tasks, and a network of morphologically detailed neurons with 100,000 parameters to solve a computer vision task. Jaxley improves the ability to build large-scale data- or task-constrained biophysical models, creating opportunities for investigating the mechanisms underlying neural computations across multiple scales. Computational biophysics Computational neuroscience
N Nature Methods · Nov 13, 2025 Bin Chicken: targeted metagenomic coassembly for the efficient recovery of novel genomes The recovery of microbial genomes from metagenomic datasets has provided genomic representation for hundreds of thousands of species from diverse biomes. However, low-abundance microorganisms are often missed due to insufficient genomic coverage. Here we present Bin Chicken, an algorithm that substantially improves genome recovery through automated, targeted selection of metagenomes for coassembly based on shared marker gene sequences derived from raw reads. Marker gene sequences that are divergent from known reference genomes can be further prioritized, providing an efficient means of recovering highly novel genomes. Applying Bin Chicken to public metagenomes and coassembling 800 sample groups recovered 77,562 microbial genomes, including the first genomic representatives of 6 phyla, 41 classes and 24,028 species. These genomes expand the genomic tree of life and uncover a wealth of novel microbial lineages for further research. Data mining Genome informatics Metagenomics Microbial genetics Software
N Nature Methods · Nov 13, 2025 MISO: microfluidic protein isolation enables single-particle cryo-EM structure determination from a single cell colony Single-particle cryogenic electron microscopy (cryo-EM) enables reconstruction of atomic-resolution 3D maps of proteins by visualizing thousands to millions of purified protein particles embedded in vitreous ice. This corresponds to picograms of purified protein, which can potentially be isolated from a few thousand cells. Hence, cryo-EM holds the potential of a very sensitive analytical method for delivering high-resolution protein structure as a readout. In practice, millions of times more starting biological material is required to prepare cryo-EM grids. Here we show that using a micro isolation (MISO) method, which combines microfluidics-based protein purification with cryo-EM grid preparation, cryo-EM structures of soluble bacterial and eukaryotic membrane proteins can be solved starting from less than 1 µg of a target protein and progressing from cells to cryo-EM grids within a few hours. This scales down the amount of starting biological material hundreds to thousands of times, opening possibilities for the structural characterization of hitherto inaccessible proteins. Cryoelectron microscopy Membrane proteins Single-molecule biophysics
N Nature Methods · Nov 13, 2025 Stimulus-modulated approach to steady state (SASS): a flexible paradigm for event-related fMRI Functional magnetic resonance imaging (fMRI) studies discard the initial volumes acquired during the approach of the magnetization to its steady-state value. Here we leverage the higher temporal signal-to-noise ratio of these initial volumes to increase the sensitivity of event-related fMRI experiments. We introduce acquisition-free periods (AFPs) that permit the full recovery of the magnetization, followed by an acquisition block of fMRI volumes. An appropriately placed stimulus in the AFP produces a blood oxygenation level-dependent response that peaks during the initial high temporal signal-to-noise ratio phase of the acquisition block. Using humans and monkeys (Callithrix jacchus) at different field strengths, we demonstrate up to a ~50% reduction in the number of trials needed to achieve a given statistical threshold relative to conventional fMRI. The silent AFP can be exploited for the presentation of auditory stimuli or uncontaminated electrophysiological recording and its variable duration allows aperiodic stimulus or response-locked signal averaging as well as gating to physiology or motion. Neurophysiology Neuroscience
N Nature Methods · Nov 11, 2025 Universal consensus 3D segmentation of cells from 2D segmented stacks Cell segmentation is the foundation of a wide range of microscopy-based biological studies. Deep learning has revolutionized two-dimensional (2D) cell segmentation, enabling generalized solutions across cell types and imaging modalities. This has been driven by the ease of scaling up image acquisition, annotation and computation. However, three-dimensional (3D) cell segmentation, requiring dense annotation of 2D slices, still poses substantial challenges. Manual labeling of 3D cells to train broadly applicable segmentation models is prohibitive. Even in high-contrast images, annotation is ambiguous and time-consuming. Here we develop a theory and toolbox, u-Segment3D, for 2D-to-3D segmentation, compatible with any 2D method generating pixel-based instance cell masks. u-Segment3D translates and enhances 2D instance segmentations to a 3D consensus instance segmentation without training data, as demonstrated on 11 real-life datasets, comprising >70,000 cells, spanning single cells, cell aggregates and tissue. Moreover, u-Segment3D is competitive with native 3D segmentation, even exceeding it when cells are crowded and have complex morphologies. Cellular imaging Image processing Machine learning Software
N Nature Methods · Nov 07, 2025 nELISA: a high-throughput, high-plex platform enables quantitative profiling of the inflammatory secretome Existing high-plex protein measurement tools compromise on quantification, precision and cost efficiency. Here, to address this, we present nELISA, a platform that combines a DNA-mediated, bead-based sandwich immunoassay with advanced multicolor bead barcoding. Antibody pairs are preassembled on target-specific, barcoded beads, which ensures spatial separation between noncognate assays. Detection antibodies are tethered via flexible single-stranded DNA to enable efficient ternary sandwich formation. Detection is achieved through toehold-mediated strand displacement, where fluorescently labeled DNA oligos simultaneously untether and label detection antibodies. nELISA delivers sub-picogram-per-milliliter sensitivity across seven orders of magnitude. Using a 191-plex inflammation panel, we profiled cytokine responses in 7,392 peripheral blood mononuclear cell samples, generating ~1.4 million protein measurements and revealing over 440 robust cytokine responses, including previously unreported effects. nELISA thus provides a simple, scalable and cost-efficient solution for large-scale, high-fidelity phenotypic screening. Biosensors High-throughput screening Proteomic analysis Proteomics
N Nature Methods · Nov 07, 2025 Monod: model-based discovery and integration through fitting stochastic transcriptional dynamics to single-cell sequencing data Single-cell RNA sequencing analysis centers on illuminating cell diversity and understanding the transcriptional mechanisms underlying cellular function. These datasets are large, noisy and complex. Current analyses prioritize noise removal and dimensionality reduction to tackle these challenges and extract biological insight. We propose an alternative, physical approach to leverage the stochasticity, size and multimodal nature of these data to explicitly distinguish their biological and technical facets while revealing the underlying regulatory processes. With the Python package Monod, we demonstrate how nascent and mature RNA counts, present in most published datasets, can be meaningfully ‘integrated’ under biophysical models of transcription. By using variation in these modalities, we can identify transcriptional modulation not discernible through changes in average gene expression, quantitatively compare mechanistic hypotheses of gene regulation, analyze transcriptional data from different technologies within a common framework and minimize the use of opaque or distortive normalization and transformation techniques. Computational biophysics Computational models Software Transcriptomics
N Nature Methods · Nov 03, 2025 ESPRESSO: spatiotemporal omics based on organelle phenotyping Omics technologies, such as genomics, transcriptomics, proteomics and metabolomics methods, have been instrumental in improving our understanding of complex biological systems by providing high-dimensional phenotypes of cell populations and single cells. Despite fast-paced advancements, these methods are limited in their ability to include a temporal dimension. Here, we introduce ESPRESSO (Environmental Sensor Phenotyping RElayed by Subcellular Structures and Organelles), a technique that provides single-cell, high-dimensional phenotyping resolved in space and time. ESPRESSO combines fluorescent labeling, advanced microscopy and image and data analysis methods to extract morphological and functional information from organelles at the single-cell level. We validate ESPRESSO’s methodology and its application across numerous cellular systems for the analysis of cell type, stress response, differentiation and immune cell polarization. We show that ESPRESSO can correlate phenotype changes with gene expression, and demonstrate its applicability to 3D cultures, offering a path to improved spatially and temporally resolved biological exploration of cellular states. Cellular imaging High-throughput screening Organelles
N Nature Methods · Nov 03, 2025 STORIES: learning cell fate landscapes from spatial transcriptomics using optimal transport In dynamic biological processes such as development, spatial transcriptomics is revolutionizing the study of the mechanisms underlying spatial organization within tissues. Inferring cell fate trajectories from spatial transcriptomics profiled at several time points has thus emerged as a critical goal, requiring novel computational methods. Wasserstein gradient flow learning is a promising framework for analyzing sequencing data across time, built around a neural network representing the differentiation potential. However, existing gradient flow learning methods face challenges in analyzing spatially resolved transcriptomic data. Here, we propose STORIES, a method that uses an extension of Optimal Transport to learn a spatially informed potential. We benchmark our approach using three large Stereo-seq spatiotemporal atlases and demonstrate superior spatial coherence compared to existing approaches. Finally, we provide an in-depth analysis of axolotl neural regeneration and mouse gliogenesis, recovering gene trends for known markers such as Nptx1 in neuron regeneration and Aldh1l1 in gliogenesis and additional putative drivers. Computational models Differentiation Software Transcriptomics
N Nature Methods · Nov 03, 2025 A portable poison exon for small-molecule control of mammalian gene expression The ability to precisely control gene expression using small-molecule drugs is a valuable tool in research and has important therapeutic potential. However, existing systems are often limited by the toxicity of the drugs and the need to alter gene sequences or endogenous regulatory elements. Here, we introduce Cyclone (acyclovir-controlled poison exon), an acyclovir-controlled poison exon cassette that can be used for small-molecule control of both transgene and endogenous gene expression. Cyclone is a portable ‘intron–poison exon–intron’ element that can be inserted into nearly any gene and is completely removed upon acyclovir treatment, leaving the native transcript intact. Cyclone offers tunable, reversible gene expression with nearly undetectable background and a ~295-fold activation. We also present Pac-Cyclone, a cassette that simplifies the generation of cell lines with acyclovir-controlled endogenous gene expression. Finally, we demonstrate the programmability of Cyclone, underscoring its potential for developing diverse genetic circuits controlled by various ligands. Genetic engineering Riboswitches RNA Synthetic biology Translation
N Nature Methods · Nov 03, 2025 Whole-brain reconstruction of fiber tracts based on cytoarchitectonic organization Mapping of axon trajectories is crucial for understanding brain organization. Using whole-brain high-throughput fluorescence imaging, we developed a cytoarchitecture-based link estimation (CABLE) method for accurate fiber tract mapping at cellular resolution. This method infers the fiber direction from the inherent anisotropy of the nucleus or soma shape and spatial arrangement of adjacent cells. The inferred fiber tracts were validated by tracing virally labeled individual axons in the monkey brain. This CABLE method could disentangle complex intersecting or bending fibers that were uncertain in diffusion magnetic resonance imaging tractography, allowing accurate brain-wide fiber tract reconstruction in marmoset and macaque brains. Finally, we applied CABLE for rapid mapping of axon fiber abnormalities in diseased neonatal human brain tissues, establishing a path for high-resolution brain mapping of fiber tracts in the human brain. Cellular neuroscience Image processing Light-sheet microscopy Magnetic resonance imaging
N Nature Methods · Nov 03, 2025 Squidiff: predicting cellular development and responses to perturbations using a diffusion model Single-cell sequencing has revolutionized our understanding of cellular heterogeneity and responses to environmental stimuli. However, mapping transcriptomic changes across diverse cell types in response to various stimuli and elucidating underlying disease mechanisms remains challenging. Here we present Squidiff, a diffusion model-based generative framework that predicts transcriptomic changes across diverse cell types in response to environmental changes. We demonstrate the robustness of Squidiff across cell differentiation, gene perturbation and drug response prediction. Through continuous denoising and semantic feature integration, Squidiff learns transient cell states and predicts high-resolution transcriptomic landscapes over time and conditions. Furthermore, we applied Squidiff to model blood vessel organoid development and cellular responses to neutron irradiation and growth factors. Our results demonstrate that Squidiff enables in silico screening of molecular landscapes and cellular state transitions, facilitating rapid hypothesis generation and providing valuable insights into the regulatory principles of cell fate decisions. Biotechnology Computational models Machine learning Stem-cell differentiation
N Nature Methods · Oct 30, 2025 High-resolution imaging mass cytometry to map subcellular structures Imaging mass cytometry (IMC) is a powerful multiplexed imaging technology used to investigate cell phenotypes and spatial organization of tissue in health and disease. The spatial resolution of IMC is presently at 1 µm, enabling the resolution of single cells and large subcellular compartments but not submicrometer intracellular structures. Here we report a method to improve the resolution of IMC so that it approaches that of light microscopy. High-resolution IMC (HR-IMC) uses an oversampling approach coupled with point-spread function-based deconvolution to achieve a resolution below 350 nm. We demonstrate the performance of HR-IMC in resolving subcellular structures, such as nuclear foci and mitochondrial networks previously undetectable with IMC, and applied it to visualize chemotherapy-induced perturbation of patient-derived ovarian cancer cells. HR-IMC extends highly multiplex IMC analyses into the subcellular regime, enabling analysis of cell biological features and characteristics of disease. Data acquisition Imaging
N Nature Methods · Oct 30, 2025 Nicheformer: a foundation model for single-cell and spatial omics Tissue makeup depends on the local cellular microenvironment. Spatial single-cell genomics enables scalable and unbiased interrogation of these interactions. Here we introduce Nicheformer, a transformer-based foundation model trained on both human and mouse dissociated single-cell and targeted spatial transcriptomics data. Pretrained on SpatialCorpus-110M, a curated collection of over 57 million dissociated and 53 million spatially resolved cells across 73 tissues on cellular reconstruction, Nicheformer learns cell representations that capture spatial context. It excels in linear-probing and fine-tuning scenarios for a newly designed set of downstream tasks, in particular spatial composition prediction and spatial label prediction. Critically, we show that models trained only on dissociated data fail to recover the complexity of spatial microenvironments, underscoring the need for multiscale integration. Nicheformer enables the prediction of the spatial context of dissociated cells, allowing the transfer of rich spatial information to scRNA-seq datasets. Overall, Nicheformer sets the stage for the next generation of machine-learning models in spatial single-cell analysis. Computational models Machine learning Software Transcriptomics
N Nature Methods · Oct 29, 2025 Annotating the genome at single-nucleotide resolution with DNA foundation models Genome annotation models that directly analyze DNA sequences are indispensable for modern biological research, enabling rapid and accurate identification of genes and other functional elements. Current annotation tools are typically developed for specific element classes and trained from scratch using supervised learning on datasets that are often limited in size. Here we frame the genome annotation problem as multilabel semantic segmentation and introduce a methodology for fine-tuning pretrained DNA foundation models to segment 14 different genic and regulatory elements at single-nucleotide resolution. We leverage the self-supervised pretrained model Nucleotide Transformer to develop a general segmentation model, SegmentNT, capable of processing DNA sequences up to 50-kb long and that achieves state-of-the-art performance on gene annotation, splice site and regulatory elements detection. We also integrated in our framework the foundation models Enformer and Borzoi, extending the sequence context up to 500 kb and enhancing performance on regulatory elements. Finally, we show that a SegmentNT model trained on human genomic elements generalizes to different species, and a multispecies SegmentNT model achieves strong generalization across unseen species. Our approach is readily extensible to additional models, genomic elements and species. Genomics Machine learning Software
N Nature Methods · Oct 27, 2025 Improved reconstruction of single-cell developmental potential with CytoTRACE 2 While single-cell RNA sequencing has advanced our understanding of cell fate, identifying molecular hallmarks of potency—a cell’s ability to differentiate into other cell types—remains a challenge. Here we introduce CytoTRACE 2, an interpretable deep learning framework for predicting absolute developmental potential from single-cell RNA sequencing data. Across diverse platforms and tissues, CytoTRACE 2 outperformed previous methods in predicting developmental hierarchies, enabling detailed mapping of single-cell differentiation landscapes and expanding insights into cell potency. Cancer genomics Machine learning Software Stem cells Transcriptomics
N Nature Methods · Oct 23, 2025 PHLOWER leverages single-cell multimodal data to infer complex, multi-branching cell differentiation trajectories Computational trajectory analysis is a key task for inferring differentiation trees from single-cell data. An open challenge is the prediction of complex and multi-branching trees from multimodal data. To address this challenge, we present PHLOWER (decomposition of the Hodge Laplacian for inferring trajectories from flows of cell differentiation), which leverages the harmonic component of the Hodge decomposition on simplicial complexes to infer trajectory embeddings from single-cell multimodal data. These natural representations of cell differentiation facilitate the estimation of their underlying differentiation trees. We evaluate PHLOWER through benchmarking with multi-branching differentiation trees and using kidney organoid multimodal and spatial single-cell data. These demonstrate the power of PHLOWER in both the inference of complex trees and the identification of transcription factors regulating off-target cells in kidney organoids. Thus, PHLOWER enables inference of complex branching trajectories and prediction of transcriptional regulators by leveraging multimodal data. Experimental models of disease Gene regulatory networks
N Nature Methods · Oct 22, 2025 scooby: modeling multimodal genomic profiles from DNA sequence at single-cell resolution Understanding how regulatory sequences shape gene expression across individual cells is a fundamental challenge in genomics. Joint RNA sequencing and epigenomic profiling provides opportunities to build models capturing sequence determinants across steps of gene expression. However, current models, developed primarily for bulk omics data, fail to capture the cellular heterogeneity and dynamic processes revealed by single-cell multimodal technologies. Here, we introduce scooby, a framework to model genomic profiles of single-cell RNA-sequencing coverage and single-cell assay for transposase-accessible chromatin using sequencing insertions from sequence at single-cell resolution. For this, we leverage the pretrained multiomics profile predictor Borzoi and equip it with a cell-specific decoder. Scooby recapitulates cell-specific expression levels of held-out genes and identifies regulators and their putative target genes. Moreover, scooby allows resolving single-cell effects of bulk expression quantitative trait loci and delineating their impact on chromatin accessibility and gene expression. We anticipate scooby to aid unraveling the complexities of gene regulation at the resolution of individual cells. Computational models Machine learning Software Transcriptomics
N Nature Methods · Oct 20, 2025 CELLECT: contrastive embedding learning for large-scale efficient cell tracking Quantitative analysis of large-scale cellular behaviors plays an increasingly crucial role in understanding mechanisms of diverse physiopathological processes, but achieving cell tracking with both high performance and efficiency in practical applications remains a challenge. Here we introduce CELLECT, a contrastive embedding learning method for large-scale efficient cell tracking, and demonstrate it on the Caenorhabditis elegans dataset in the Cell Tracking Challenge. By contrastive learning of latent embeddings of diverse cellular structures, a CELLECT model pretrained on a single public dataset can be effectively applied across different imaging modalities and species with broad generalization. Using advanced two-photon imaging, CELLECT enables real-time 3D tracking of large-scale B cells with frequent divisions during germinal center formation in a mouse lymph node, quantitative identification of cell–bacterium interactions in the mouse spleen and high-fidelity extraction of neural signals during strong nonrigid motions. We believe that these results demonstrate broad applications of CELLECT in immunology, pathology and neuroscience. Fluorescence imaging Lymphocytes Software Systems biology
N Nature Methods · Oct 15, 2025 gReLU: a comprehensive framework for DNA sequence modeling and design Deep learning models trained on DNA sequences can predict cell-type-specific regulatory activity, reveal cis-regulatory grammar, prioritize genetic variants and design synthetic DNA. However, building and interpreting these models correctly remains difficult, and models and software built by different groups are often not interoperable. Here we present gReLU, a comprehensive software framework that enables advanced sequence modeling pipelines, including data preprocessing, modeling, evaluation, interpretation, variant effect prediction and regulatory element design. gReLU advances deep-learning-based modeling and analysis of DNA sequences with comprehensive toolsets and versatile applications. Genomics Machine learning Software
N Nature Methods · Oct 13, 2025 Multitask benchmarking of single-cell multimodal omics integration methods Single-cell multimodal omics technologies have empowered the profiling of complex biological systems at a resolution and scale that were previously unattainable. These biotechnologies have propelled the fast-paced innovation and development of data integration methods, leading to a critical need for their systematic categorization, evaluation and benchmarking. Navigating and selecting the most pertinent integration approach poses a considerable challenge, contingent upon the tasks relevant to the study goals and the combination of modalities and batches present in the data at hand. Understanding how well each method performs multiple tasks, including dimension reduction, batch correction, cell type classification and clustering, imputation, feature selection and spatial registration, and at which combinations will help guide this decision. Here we develop a much-needed guideline on choosing the most appropriate method for single-cell multimodal omics data analysis through a systematic categorization and comprehensive benchmarking of current methods. The stage 1 protocol for this Registered Report was accepted in principle on 30 July 2024. The protocol, as accepted by the journal, can be found at https://springernature.figshare.com/articles/journal_contribution/Multi-task_benchmarking_of_single-cell_multimodal_omics_integration_methods/26789902. Computational models Data integration Software Transcriptomics
N Nature Methods · Oct 13, 2025 Deep generative modeling of sample-level heterogeneity in single-cell genomics Single-cell genomic studies were recently conducted on hundreds of samples exhibiting complex designs. These data have tremendous potential for discovering how sample- or tissue-level phenotypes relate to cellular and molecular composition. However, current analyses are often based on simplified representations of these data by averaging information across cells. We present multi-resolution variational inference (MrVI), a deep generative model designed to realize the potential of cohort studies at the single-cell level. MrVI tackles two fundamental, intertwined problems: stratifying samples into groups and evaluating the cellular and molecular differences between groups, without requiring predefined cell states. Leveraging its single-cell perspective, MrVI detects clinically relevant stratifications of cohorts of people with COVID-19 or inflammatory bowel disease that are manifested in only certain cellular subsets, enabling new discoveries that would otherwise be overlooked. MrVI can de novo identify groups of small molecules with similar biochemical properties and evaluate their effects on cellular composition and gene expression in large-scale perturbation studies. MrVI is an open-source tool at scvi-tools.org. Machine learning Software Statistical methods Transcriptomics
N Nature Methods · Oct 08, 2025 Automated classification of cellular expression in multiplexed imaging data with Nimbus Multiplexed imaging offers a powerful approach to characterize the spatial topography of tissues in both health and disease. To analyze such data, the specific combination of markers that are present in each cell must be enumerated to enable accurate phenotyping, a process that often relies on unsupervised clustering. We constructed the Pan-Multiplex (Pan-M) dataset containing 197 million distinct annotations of marker expression across 15 different cell types. We used Pan-M to create Nimbus, a deep learning model to predict marker positivity from multiplexed image data. Nimbus is a pretrained model that uses the underlying images to classify marker expression of individual cells as positive or negative across distinct cell types, from different tissues, acquired using different microscope platforms, without requiring any retraining. We demonstrate that Nimbus predictions capture the underlying staining patterns of the full diversity of markers present in Pan-M, and that Nimbus matches or exceeds the accuracy of previous approaches that must be retrained on each dataset. We then show how Nimbus predictions can be integrated with downstream clustering algorithms to robustly identify cell subtypes in image data. We have open-sourced Nimbus and Pan-M to enable community use at https://github.com/angelolab/Nimbus-Inference. Image processing Machine learning Software
N Nature Methods · Oct 08, 2025 Cell tracking with accurate error prediction Cell tracking is an indispensable tool for studying development by time-lapse imaging. However, existing cell trackers cannot assign confidence to predicted tracks, which prohibits fully automated analysis without manual curation. We present a fundamental advance: an algorithm that combines neural networks with statistical physics to determine cell tracks with error probabilities for each step in the track. From these, we can obtain error probabilities for any tracking feature, from cell cycles to lineage trees, that function like P values in data interpretation. Our method, OrganoidTracker 2.0, greatly speeds up tracking analysis by limiting manual curation to rare low-confidence tracking steps. Importantly, it also enables fully automated analysis by retaining only high-confidence track segments, which we demonstrate by analyzing cell cycles and differentiation events at scale for thousands of cells in multiple intestinal organoids. Our approach brings cell dynamics-based organoid screening within reach and enables transparent reporting of cell-tracking results and associated scientific claims. Confocal microscopy Differentiation Image processing Software Statistical methods
N Nature Methods · Oct 03, 2025 Spatiotemporal focusing enables all-optical in situ histology of heterogeneous tissue Living systems embody heterogeneous tissues with complex opto-mechanical properties. Achieving organ-scale, diffraction-limited volumetric imaging that faithfully captures in vivo architecture requires minimizing sample deformation and preserving vascular and neuronal continuity across delicate tissue interfaces. As a solution to this problem, we developed a robotic nonlinear optical system for iterative multiphoton microscopy and opto-micromachining. Adaptive control enabled days-long autonomous operation, while spatiotemporal line-focused ablation increased the machining efficiency by 100-fold over prior configurations. Using the intact murine craniocerebral system as a test bed, our approach demonstrates the potential for whole-body submicrometer resolution imaging and anatomical reconstruction. Mouse Multiphoton microscopy Neuroscience Optical imaging
N Nature Methods · Oct 03, 2025 All-at-once RNA folding with 3D motif prediction framed by evolutionary information Structural RNAs exhibit a vast array of recurrent short three-dimensional (3D) elements found in loop regions involving non-Watson–Crick interactions that help arrange canonical double helices into tertiary structures. Here we present CaCoFold-R3D, a probabilistic grammar that predicts these RNA 3D motifs (also termed modules) jointly with RNA secondary structure over a sequence or alignment. CaCoFold-R3D uses evolutionary information present in an RNA alignment to reliably identify canonical helices (including pseudoknots) by covariation. Here we further introduce the R3D grammars, which also exploit helix covariation that constrains the positioning of the mostly noncovarying RNA 3D motifs. Our method runs predictions over an almost-exhaustive list of over 50 known RNA motifs (‘everything’). Motifs can appear in any nonhelical loop region (including three-way, four-way and higher junctions) (‘everywhere’). All structural motifs as well as the canonical helices are arranged into one single structure predicted by one single joint probabilistic grammar (‘all-at-once’). Our results demonstrate that CaCoFold-R3D is a valid alternative for predicting the all-residue interactions present in a RNA 3D structure. CaCoFold-R3D is fast and easily customizable for novel motif discovery and shows promising value both as a strong input for deep learning approaches to all-atom structure prediction as well as toward guiding RNA design as drug targets for therapeutic small molecules. Computational models Machine learning Non-coding RNAs Riboswitches
N Nature Methods · Oct 02, 2025 Foundation model for efficient biological discovery in single-molecule time traces Single-molecule fluorescence microscopy (SMFM) can reveal important biological insights. However, uncovering rare but critical intermediates often demands manual inspection of time traces and iterative ad hoc approaches. To facilitate systematic and efficient discovery from SMFM time traces, we introduce META-SiM, a transformer-based foundation model pretrained on diverse SMFM analysis tasks. META-SiM rivals best-in-class algorithms on a broad range of tasks including trace classification, segmentation, idealization and stepwise photobleaching analysis. Additionally, the model produces embeddings that encapsulate detailed information about each trace, which the web-based META-SiM Projector (https://www.simol-projector.org) casts into lower-dimensional space for efficient whole-dataset visualization, labeling, comparison and sharing. Combining this Projector with the objective metric of local Shannon entropy enables rapid identification of condition-specific behaviors, even if rare or subtle. Applying META-SiM to an existing single-molecule Förster resonance energy transfer dataset, we discover a previously undetected intermediate state in pre-mRNA splicing. META-SiM removes bottlenecks, improves objectivity and both systematizes and accelerates biological discovery in single-molecule data. Machine learning Single-molecule biophysics
N Nature Methods · Oct 01, 2025 Giotto Suite: a multiscale and technology-agnostic spatial multiomics analysis ecosystem Emerging spatial multiomics technologies provide an increasingly large amount of information content at multiple scales. However, it remains challenging to efficiently represent and harmonize diverse spatial datasets. Here we present Giotto Suite, a suite of modular packages that provides scalable and extensible end-to-end solutions for multiscale and multiomic data analysis, integration and visualization. At its core, Giotto Suite is centered around an innovative data framework, allowing the representation and integration of spatial omics data in a technology-agnostic manner. Giotto Suite integrates molecular, morphology, spatial and annotated feature information to create a responsive and flexible workflow, as demonstrated by applications to several state-of-the-art spatial technologies. Furthermore, Giotto Suite builds upon interoperable interfaces and data structures that bridge the established fields of genomics and spatial data science in R, thereby enabling independent developers to create custom-engineered pipelines. As such, Giotto Suite creates an immersive and multiscale ecosystem for spatial multiomic data analysis. Computational platforms and environments Software Transcriptomics
N Nature Methods · Oct 01, 2025 Fourier-based three-dimensional multistage transformer for aberration correction in multicellular specimens High-resolution tissue imaging is often compromised by sample-induced optical aberrations that degrade resolution and contrast. Although wavefront sensor-based adaptive optics (AO) can measure these aberrations, such hardware solutions are typically complex, expensive to implement and slow when serially mapping spatially varying aberrations across large fields of view. Here we introduce AOViFT (adaptive optical vision Fourier transformer), a machine learning-based aberration sensing framework built around a three-dimensional multistage vision transformer that operates on Fourier domain embeddings. AOViFT infers aberrations and restores diffraction-limited performance in puncta-labeled specimens with substantially reduced computational cost, training time and memory footprint compared to conventional architectures or real-space networks. We validated AOViFT on live gene-edited zebrafish embryos, demonstrating its ability to correct spatially varying aberrations using either a deformable mirror or postacquisition deconvolution. By eliminating the need for the guide star and wavefront sensing hardware and simplifying the experimental workflow, AOViFT lowers technical barriers for high-resolution volumetric microscopy across diverse biological samples. Computational models Machine learning
N Nature Methods · Oct 01, 2025 HippoMaps: multiscale cartography of human hippocampal organization The hippocampus has a specialized microarchitecture, is situated at the nexus of multiple macroscale functional networks, contributes to numerous cognitive as well as affective processes and is highly susceptible to brain pathology across common disorders. These features make the hippocampus a model to understand how brain structure covaries with function, in both health and disease. Here we introduce HippoMaps, an open access toolbox and online data warehouse for the mapping and contextualization of subregional hippocampal data in the human brain (http://hippomaps.readthedocs.io). HippoMaps capitalizes on a unified hippocampal unfolding approach as well as shape intrinsic registration capabilities to allow for cross-participant and cross-modal data aggregation. We initialize this repository with a combination of hippocampal data spanning three-dimensional ex vivo histology, ex vivo 9.4-Tesla magnetic resonance imaging (MRI), as well as in vivo structural MRI and resting-state functional MRI obtained at 3 Tesla and 7 Tesla, together with intracranial encephalography recordings in patients with epilepsy. All code, data and tools are openly available online, with the aim of fostering further community contributions. Computational neuroscience Data integration Image processing Learning and memory
N Nature Methods · Sep 29, 2025 Highly multiplexed 3D profiling of cell states and immune niches in human tumors Diseases such as cancer involve alterations in cell proportions, states and interactions, as well as complex changes in tissue morphology and architecture. Histopathological diagnosis of disease and most multiplexed spatial profiling relies on inspecting thin (4–5 µm) specimens. Here we describe a high-plex cyclic immunofluorescence method for three-dimensional tissue imaging and use it to show that few, if any, cells are intact in conventional thin tissue sections, reducing the accuracy of cell phenotyping and interaction analysis. However, three-dimensional cyclic immunofluorescence of sections eightfold to tenfold thicker enables accurate morphological assessment of diverse protein markers in intact tumor, immune and stromal cells. Moreover, the high resolution of this confocal approach generates images of cells in a preserved tissue environment at a level of detail previously limited to cell culture. Precise imaging of cell membranes also makes it possible to detect and map cell–cell contacts and juxtacrine signaling complexes in immune cell niches. Cancer Cancer microenvironment Cell signalling Cellular imaging Tumour heterogeneity
N Nature Methods · Sep 29, 2025 InterPLM: discovering interpretable features in protein language models via sparse autoencoders Despite their success in protein modeling and design, the internal mechanisms of protein language models (PLMs) are poorly understood. Here we present a systematic framework to extract and analyze interpretable features from PLMs using sparse autoencoders. Training sparse autoencoders on ESM-2 embeddings, we identify thousands of interpretable features highlighting biological concepts including binding sites, structural motifs and functional domains. Individual neurons show considerably less conceptual alignment, suggesting PLMs store concepts in superposition. This superposition persists across model scales and larger PLMs capture more interpretable concepts. Beyond known annotations, ESM-2 learns coherent patterns across evolutionarily distinct protein families. To systematically analyze these numerous features, we developed an automated interpretation approach using large language models for feature description and validation. As practical applications, these features can accurately identify missing database annotations and enable targeted steering of sequence generation. Our results show PLM representations can be decomposed into interpretable components, demonstrating the feasibility and utility of mechanistically interpreting these models. Protein analysis Software
N Nature Methods · Sep 29, 2025 Uncovering hidden protein modifications with native top-down mass spectrometry Protein modifications drive dynamic cellular processes by modulating biomolecular interactions, yet capturing these modifications within their native structural context remains a significant challenge. Native top-down mass spectrometry promises to preserve the critical link between modifications and interactions. However, current methods often fail to detect uncharacterized or low-abundance modifications, limiting insights into proteoform diversity. To address this gap, we introduce precise and accurate Identification Of Native proteoforms (precisION), an interactive end-to-end software package that leverages a robust, data-driven fragment-level open search to detect, localize and quantify ‘hidden’ modifications within intact protein complexes. Applying precisION to four therapeutically relevant targets—PDE6, ACE2, osteopontin (SPP1) and a GABA transporter (GAT1)—we discover undocumented phosphorylation, glycosylation and lipidation, and resolve previously uninterpretable density in an electron cryo-microscopy map of GAT1. As an open-source software package, precisION offers an intuitive means for interpreting complex protein fragmentation data. This tool will empower the community to unlock the potential of native top-down mass spectrometry, advancing integrative structural biology, molecular pathology and drug development. Protein analysis Proteins Proteomics
N Nature Methods · Sep 25, 2025 Merging conformational landscapes in a single consensus space with FlexConsensus algorithm Structural heterogeneity analysis in cryogenic electron microscopy is experiencing a breakthrough in estimating more accurate, richer and interpretable conformational landscapes derived from experimental data. The emergence of new methods designed to tackle the heterogeneity challenge reflects this new paradigm, enabling users to gain a better understanding of protein dynamics. However, the question of how intrinsically different heterogeneity algorithms compare remains unsolved, which is crucial for determining the reliability, stability and correctness of the estimated conformational landscapes. Here, to overcome the previous challenge, we introduce FlexConsensus: a multi-autoencoder neural network able to learn the commonalities and differences among several conformational landscapes, enabling them to be placed in a shared consensus space with enhanced reliability. The consensus space enables the measurement of reproducibility in heterogeneity estimations, allowing users to either focus their analysis on particles with a stable estimation of their structural variability or concentrate on specific particle subsets detected by only certain methods. Image processing Machine learning Software
N Nature Methods · Sep 25, 2025 Single-cell multi-omic detection of DNA methylation and histone modifications reconstructs the dynamics of epigenomic maintenance DNA methylation and histone modifications encode epigenetic information. Recently, major progress was made to measure either mark at a single-cell resolution; however, a method for simultaneous detection is lacking, preventing study of their interactions. Here, to bridge this gap, we developed scEpi2-seq. Our technique provides a readout of histone modifications and DNA methylation at the single-cell and single-molecule level. Application in a cell line with the FUCCI cell cycle reporter system reveals how DNA methylation maintenance is influenced by the local chromatin context. In addition, profiling of H3K27me3 and DNA methylation in the mouse intestine yields insights into epigenetic interactions during cell type specification. Differentially methylated regions also demonstrated independent cell-type regulation in addition to H3K27me3 regulation, which reinforces that CpG methylation acts as an additional layer of control in facultative heterochromatin. DNA sequencing Epigenetics
N Nature Methods · Sep 25, 2025 EpiAgent: foundation model for single-cell epigenomics Although single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) enables the exploration of the epigenomic landscape that governs transcription at the cellular level, the complicated characteristics of the sequencing data and the broad scope of downstream tasks mean that a sophisticated and versatile computational method is urgently needed. Here we introduce EpiAgent, a foundation model pretrained on our manually curated large-scale Human-scATAC-Corpus. EpiAgent encodes chromatin accessibility patterns of cells as concise ‘cell sentences’ and captures cellular heterogeneity behind regulatory networks via bidirectional attention. Comprehensive benchmarks show that EpiAgent excels in typical downstream tasks, including unsupervised feature extraction, supervised cell type annotation and data imputation. By incorporating external embeddings, EpiAgent enables effective cellular response prediction for both out-of-sample stimulated and unseen genetic perturbations, reference data integration and query data mapping. Through in silico knockout of cis-regulatory elements, EpiAgent demonstrates the potential to model cell state changes. EpiAgent is further extended to directly annotate cell types in a zero-shot manner. Computational models Data integration Machine learning Software
N Nature Methods · Sep 23, 2025 Common to rare transfer learning (CORAL) enables inference and prediction for a quarter million rare Malagasy arthropods DNA-based biodiversity surveys result in massive-scale data, including up to millions of species—of which, most are rare. Making the most of such data for inference and prediction requires modeling approaches that can relate species occurrences to environmental and spatial predictors, while incorporating information about their taxonomic or phylogenetic placement. Even if the scalability of joint species distribution models to large communities has greatly advanced, incorporating hundreds of thousands of species has not been feasible to date, leading to compromised analyses. Here we present a ‘common to rare transfer learning’ (CORAL) approach, based on borrowing information from the common species to enable statistically and computationally efficient modeling of both common and rare species. We illustrate that CORAL leads to much improved prediction and inference in the context of DNA metabarcoding data from Madagascar, comprising 255,188 arthropod species detected in 2,874 samples. Ecology Statistical methods
N Nature Methods · Sep 23, 2025 Kilohertz volumetric imaging of in vivo dynamics using squeezed light field microscopy Volumetric functional imaging of transient cellular signaling and motion dynamics is often limited by hardware bandwidth and the scarcity of photons under short exposures. To overcome these challenges, we introduce squeezed light field microscopy (SLIM), a computational imaging approach that rapidly captures high-resolution three-dimensional light signals using only a single, low-format camera sensor. SLIM records over 1,000 volumes per second across a 550-µm diameter field of view and 300-µm depth, achieving 3.6-µm lateral and 6-µm axial resolution. Here we demonstrate its utility in blood cell velocimetry within the embryonic zebrafish brain and in freely moving tails undergoing high-frequency swings. Millisecond-scale temporal resolution further enables precise voltage imaging of neural membrane potentials in the leech ganglion and hippocampus of behaving mice. Together, these results establish SLIM as a versatile and robust tool for high-speed volumetric microscopy across diverse biological systems. Fluorescence imaging Microscopy Optical imaging
N Nature Methods · Sep 23, 2025 Dose-efficient cryo-electron microscopy for thick samples using tilt-corrected scanning transmission electron microscopy Cryogenic electron microscopy is a powerful tool in structural biology. In thick specimens, challenges arise as an exponentially larger fraction of the transmitted electrons lose energy from inelastic scattering and can no longer be properly focused as a result of chromatic aberrations in the post-specimen optics. Rather than filtering out the inelastic scattering at the price of reducing potential signal, as is done in energy-filtered transmission electron microscopy, we show how a dose-efficient and unfiltered image can be rapidly obtained using tilt-corrected bright-field scanning transmission electron microscopy data collected on a pixelated detector. Enhanced contrast and a 3–5× improvement in dose efficiency are observed for two-dimensional images of intact bacterial cells and large organelles using tilt-corrected bright-field scanning transmission electron microscopy compared to energy-filtered transmission electron microscopy for thicknesses beyond 500 nm. As a proof of concept for the technique’s performance in structural determination, we present a single-particle analysis map at sub-nanometer resolution for a highly symmetric virus-like particle determined from 789 particles. Cellular imaging Cryoelectron microscopy Scanning electron microscopy
N Nature Methods · Sep 22, 2025 RNA-stabilized coat proteins for sensitive and simultaneous imaging of distinct single mRNAs in live cells RNA localization and regulation are critical for cellular function, yet many live RNA imaging tools suffer from limited sensitivity due to background emissions from unbound probes. Here we introduce conditionally stable variants of MS2 and PP7 coat proteins (which we name dMCP and dPCP) designed to decrease background in live-cell RNA imaging. Using a protein engineering approach that combines circular permutation and degron masking, we generated dMCP and dPCP variants that rapidly degrade except when bound to cognate RNA ligands. These enhancements enabled the sensitive visualization of single mRNA molecules undergoing differential regulation within various subcompartments of live cells. We further demonstrate dual-color imaging with orthogonal MS2 and PP7 motifs, allowing simultaneous low-background visualization of distinct RNA species within the same cell. Overall, this work provides versatile, low-background probes for RNA imaging, which should have broad utility in the imaging and biotechnological utilization of MS2-containing and PP7-containing RNAs. Protein design RNA probes
N Nature Methods · Sep 18, 2025 GPU-accelerated homology search with MMseqs2 Rapidly growing protein databases demand faster sensitive search tools. Here the graphics processing unit (GPU)-accelerated MMseqs2 delivers 6× faster single-protein searches than CPU methods on 2 × 64 cores, speeds previously requiring large protein batches. For larger query batches, it is the most cost-effective solution, outperforming the fastest alternative method by 2.4-fold with eight GPUs. It accelerates protein structure prediction with ColabFold 31.8× over the standard AlphaFold2 pipeline and protein structure search with Foldseek by 4–27×. MMseqs2-GPU is available under an open-source license at https://mmseqs.com/. Hardware and infrastructure Protein analysis Protein function predictions Protein structure predictions Software
N Nature Methods · Sep 15, 2025 Spatial gene expression at single-cell resolution from histology using deep learning with GHIST The increased use of spatially resolved transcriptomics provides new biological insights into disease mechanisms. However, the high cost and complexity of these methods are barriers to broader application. Consequently, methods have been created to predict spot-based gene expression from routinely collected histology images. Recent benchmarking showed that current methodologies have limited accuracy and spatial resolution, constraining translational capacity. Here, we introduce GHIST, a deep learning-based framework that predicts spatial gene expression at single-cell resolution by leveraging subcellular spatial transcriptomics and synergistic relationships between multiple layers of biological information. We validated GHIST using public datasets and The Cancer Genome Atlas data, demonstrating its flexibility across different spatial resolutions and superior performance. Our results underscore the utility of in silico generation of single-cell spatial gene expression measurements and the capacity to enrich existing datasets with a spatially resolved omics modality, paving the way for scalable multi-omics analysis and biomarker identification. Computational models Functional genomics Gene expression Image processing Machine learning
N Nature Methods · Sep 15, 2025 Cancer subclone detection based on DNA copy number in single-cell and spatial omic sequencing data Somatic mutations such as copy number alterations accumulate during cancer progression, driving intratumor heterogeneity that impacts therapy effectiveness. Understanding the characteristics and spatial distribution of genetically distinct subclones is essential for unraveling tumor evolution and improving cancer treatment. Here we present Clonalscope, a subclone detection method using copy number profiles, applicable to spatial transcriptomics and single-cell sequencing data. Clonalscope implements a nested Chinese Restaurant Process to identify de novo tumor subclones, which can incorporate prior information from matched bulk DNA sequencing data for improved subclone detection and malignant cell labeling. On single-cell RNA sequencing and single-cell assay for transposase-accessible chromatin using sequencing data from gastrointestinal tumors, Clonalscope successfully labeled malignant cells and identified genetically different subclones with thorough validations. On spatial transcriptomics data from various primary and metastasized tumors, Clonalscope labeled malignant spots, traced subclones and identified spatially segregated subclones with distinct differentiation levels and expression of genes associated with drug resistance and survival. Cancer genomics Genomics Software Statistical methods Tumour heterogeneity
N Nature Methods · Sep 15, 2025 De novo discovery of conserved gene clusters in microbial genomes with Spacedust Metagenomics has revolutionized environmental and human-associated microbiome studies. However, the limited fraction of proteins with known biological processes and molecular functions presents a major bottleneck. In prokaryotes and viruses, evolution favors keeping genes participating in the same biological processes colocalized as conserved gene clusters. Conversely, conservation of gene neighborhood indicates functional association. Here we present Spacedust, a tool for systematic, de novo discovery of conserved gene clusters. To find homologous protein matches, Spacedust uses fast and sensitive structure comparison with Foldseek. Partially conserved clusters are detected using novel clustering and order conservation P values. We demonstrate Spacedust’s sensitivity with an all-versus-all analysis of 1,308 bacterial genomes, identifying 72,843 conserved gene clusters containing 58% of the 4.2 million genes. It recovered 95% of antiviral defense system clusters annotated by the specialized tool PADLOC. Spacedust’s high sensitivity and speed will facilitate the annotation of large numbers of sequenced bacterial, archaeal and viral genomes. Genome informatics Metagenomics Software
N Nature Methods · Sep 15, 2025 Integrating diverse experimental information to assist protein complex structure prediction by GRASP Protein complex structure prediction is crucial for understanding of biological activities and advancing drug development. While various experimental methods can provide structural insights into protein complexes, the knowledge obtained is often sparse or approximate. A general tool is needed to integrate limited experimental information for high-throughput and accurate prediction. Here we introduce GRASP to efficiently and flexibly incorporate diverse forms of experimental information. GRASP outperforms existing tools in handling both simulated and real-world experimental restraints including those from crosslinking, covalent labeling, chemical shift perturbation and deep mutational scanning. For example, GRASP excels at predicting antigen–antibody complex structures, even surpassing AlphaFold3 when using experimental deep mutational scanning or covalent-labeling restraints. Beyond its accuracy and flexibility in restrained structure prediction, GRASP’s ability to integrate multiple forms of restraints enables integrative modeling. We also showcase its potential in modeling protein structural interactome under near-cellular conditions using previously reported large-scale in situ crosslinking data for mitochondria. Cryoelectron microscopy Machine learning Protein structure predictions Solution-state NMR
N Nature Methods · Sep 15, 2025 MSnLib: efficient generation of open multi-stage fragmentation mass spectral libraries Untargeted high-resolution mass spectrometry is a key tool in clinical metabolomics, natural product discovery and exposomics, with compound identification remaining the major bottleneck. Currently, the standard workflow applies spectral library matching against tandem mass spectrometry (MS2) fragmentation data. Multi-stage fragmentation (MSn) yields more profound insights into substructures, enabling validation of fragmentation pathways; however, the community lacks open MSn reference data of diverse natural products and other chemicals. Here we describe MSnLib, a machine learning-ready open resource of >2 million spectra in MSn trees of 30,008 unique small molecules, built with a high-throughput data acquisition and processing pipeline in the open-source software mzmine. Mass spectrometry Metabolomics
N Nature Methods · Sep 15, 2025 Scaling up spatial transcriptomics for large-sized tissues: uncovering cellular-level tissue architecture beyond conventional platforms with iSCALE Recent advances in spatial transcriptomics (ST) technologies have transformed our ability to profile gene expression while preserving crucial spatial context within tissues. However, existing ST platforms are constrained by high costs, long turnaround times, low resolution, limited gene coverage and inherently small tissue capture areas, which hinder their broad applications. Here we present iSCALE, a method that reconstructs large-scale, super-resolution gene expression landscapes and automatically annotates cellular-level tissue architecture in samples exceeding capture areas of current ST platforms. The performance of iSCALE was assessed by comprehensive evaluations involving benchmarking experiments, immunohistochemistry staining and manual annotations by pathologists. When applied to multiple sclerosis human brain samples, iSCALE uncovered lesion-associated cellular characteristics undetectable by conventional ST experiments. Our results demonstrate the utility of iSCALE in analyzing large tissues by enabling unbiased annotation, resolving cell type composition, mapping cellular microenvironments and revealing spatial features beyond the reach of standard ST analysis or routine histopathological assessment. Gene expression analysis Machine learning RNA sequencing Transcriptomics
N Nature Methods · Sep 11, 2025 Coupling CRISPR scanning with targeted chromatin accessibility profiling using a double-stranded DNA deaminase Genome editing enables sequence-function profiling of endogenous cis-regulatory elements, driving understanding of their mechanisms. However, these approaches lack direct, scalable readouts of chromatin accessibility across long single-molecule chromatin fibers. Here we leverage double-stranded DNA cytidine deaminases to profile chromatin accessibility at endogenous loci of interest through targeted PCR and long-read sequencing, a method we term targeted deaminase-accessible chromatin sequencing (TDAC-seq). With high sequence coverage at targeted loci, TDAC-seq can be integrated with CRISPR perturbations to link genetic edits and their effects on chromatin accessibility on the same single chromatin fiber at single-nucleotide resolution. We employed TDAC-seq to parse CRISPR edits that activate fetal hemoglobin in human CD34+ hematopoietic stem and progenitor cells (HSPCs) during erythroid differentiation as well as in pooled CRISPR and base-editing screens tiling an enhancer controlling the globin locus. We further scaled the method to interrogate 947 variants in a GFI1B-linked enhancer associated with myeloproliferative neoplasm risk in a single pooled CRISPR experiment in CD34+ HSPCs. Together, TDAC-seq enables high-resolution sequence-function mapping of single-molecule chromatin fibers by genome editing. Chromatin structure DNA sequencing Epigenomics
N Nature Methods · Sep 11, 2025 Biophysics-based protein language models for protein engineering Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose mutational effect transfer learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure and energetics. We fine-tune METL on experimental sequence–function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL’s ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering. Machine learning Protein design
N Nature Methods · Sep 10, 2025 CLEM-Reg: an automated point cloud-based registration algorithm for volume correlative light and electron microscopy Volume correlative light and electron microscopy (vCLEM) is a powerful imaging technique that enables the visualization of fluorescently labeled proteins within their ultrastructural context. Currently, vCLEM alignment relies on time-consuming and subjective manual methods. This paper presents CLEM-Reg, an algorithm that automates the three-dimensional alignment of vCLEM datasets by leveraging probabilistic point cloud registration techniques. Point clouds are derived from segmentations of common structures in each modality, created by state-of-the-art open-source methods. CLEM-Reg drastically reduces the registration time of vCLEM datasets to a few minutes and achieves correlation of fluorescent signal to submicron target structures in electron microscopy on three newly acquired vCLEM benchmark datasets. CLEM-Reg was then used to automatically obtain vCLEM overlays to unambiguously identify TGN46-positive transport carriers involved in protein trafficking between the trans-Golgi network and plasma membrane. Datasets are available on EMPIAR and BioStudies, and a napari plugin is provided to aid end-user adoption. CLEM-Reg automates the three-dimensional alignment of volume correlative light and electron microscopy datasets by leveraging probabilistic point cloud registration techniques for fast and accurate results across diverse datasets. Cellular imaging Image processing
N Nature Methods · Sep 08, 2025 Scvi-hub: an actionable repository for model-driven single-cell analysis The growing availability of single-cell omics datasets presents new opportunities for reuse, while challenges in data transfer, normalization and integration remain a barrier. Here we present scvi-hub: a platform for efficiently sharing and accessing single-cell omics datasets using pretrained probabilistic models. It enables immediate execution of fundamental tasks like visualization, imputation, annotation and deconvolution on new query datasets using state-of-the-art methods, with massively reduced storage and compute requirements. We show that pretrained models support efficient analysis of large references, including the CZI CELLxGENE Discover Census. Scvi-hub is built within the scvi-tools open-source environment and integrated into scverse. Scvi-hub offers a scalable and user-friendly framework for accessing and contributing to a growing ecosystem of ready-to-use models and datasets, thus putting the power of atlas-level analysis at the fingertips of a broad community of users. Machine learning Software Statistical methods Transcriptomics
N Nature Methods · Sep 08, 2025 WISDEM: a hybrid wireless integrated sensing detector for simultaneous EEG and MRI Concurrent recording of electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) signals reveals cross-scale neurovascular dynamics crucial for explaining fundamental linkages between function and behaviors. However, MRI scanners generate artifacts for EEG detection. Despite existing denoising methods, cabled connections to EEG receivers are susceptible to environmental fluctuations inside MRI scanners, creating baseline drifts that complicate EEG signal retrieval from the noisy background. Here we show that a wireless integrated sensing detector for simultaneous EEG and MRI can encode fMRI and EEG signals on distinct sidebands of the detector’s oscillation wave for detection by a standard MRI console over the entire duration of the fMRI sequence. Local field potential and fMRI maps are retrieved through low-pass and high-pass filtering of frequency-demodulated signals. From optogenetically stimulated somatosensory cortex in ChR2-transfected Sprague Dawley rats, positive correlation between evoked local field potential and fMRI signals validates strong neurovascular coupling, enabling cross-scale brain mapping with this two-in-one transducer. Electroencephalography – EEG Magnetic resonance imaging Neuroscience Rat
N Nature Methods · Sep 03, 2025 Reproducible single-cell annotation of programs underlying T cell subsets, activation states and functions T cells recognize antigens and induce specialized gene expression programs (GEPs), enabling functions like proliferation, cytotoxicity and cytokine production. Traditionally, different T cell classes are thought to exhibit mutually exclusive responses, including TH1, TH2 and TH17 programs. However, single-cell RNA sequencing has revealed a continuum of T cell states without clearly distinct subsets, necessitating new analytical frameworks. Here, we introduce T-CellAnnoTator (TCAT), a pipeline that improves T cell characterization by simultaneously quantifying predefined GEPs capturing activation states and cellular subsets. Analyzing 1,700,000 T cells from 700 individuals spanning 38 tissues and five disease contexts, we identify 46 reproducible GEPs reflecting core T cell functions including proliferation, cytotoxicity, exhaustion and effector states. We experimentally demonstrate new activation programs and apply TCAT to characterize activation GEPs that predict immune checkpoint inhibitor response across multiple tumor types. Our software package starCAT generalizes this framework, enabling reproducible annotation in other cell types and tissues. Adaptive immunity Computational biology and bioinformatics Sequencing Systems biology
N Nature Methods · Sep 03, 2025 Unified mass imaging maps the lipidome of vertebrate development Embryo development entails the formation of anatomical structures with distinct biochemical compositions. Compared with the wealth of knowledge on gene regulation, our understanding of metabolic programs operating during embryogenesis is limited. Mass spectrometry imaging (MSI) has the potential to map the distribution of metabolites across embryo development. Here we established uMAIA, an analytical framework for the joint analysis of large MSI datasets, which enables the construction of multidimensional metabolomic atlases. Employing this framework, we mapped the four-dimensional (4D) distribution of over a hundred lipids at micrometric resolution in Danio rerio embryos. We discovered metabolic trajectories that unfold in concert with morphogenesis and revealed spatially organized biochemical coordination overlooked by bulk measurements. Interestingly, lipid mapping revealed unexpected distributions of sphingolipid and triglyceride species, suggesting their involvement in pattern establishment and organ development. Our approach empowers a new generation of whole-organism metabolomic atlases and enables the discovery of spatially organized metabolic circuits. Data integration Embryogenesis Lipidomics Mass spectrometry Metabolomics
N Nature Methods · Aug 30, 2025 Machine learning approaches for protein structure prediction Accurate protein structure prediction remains one of the most challenging problems in computational biology. We present a new machine learning framework that combines deep neural networks with physics-based constraints to achieve unprecedented accuracy in protein folding predictions. Our model, trained on a comprehensive dataset of protein structures, demonstrates superior performance compared to existing methods and provides new insights into protein folding mechanisms. Structural Biology Machine Learning Protein Science
N Nature Methods · Aug 28, 2025 Laser flash melting cryo-EM samples to overcome preferred orientation Sample preparation remains a bottleneck for protein structure determination by cryo-electron microscopy. A frequently encountered issue is that proteins adsorb to the air–water interface of the sample in a limited number of orientations. This makes it challenging to obtain high-resolution reconstructions, or may even cause projects to fail altogether. We have previously observed that laser flash melting and revitrification of cryo-EM samples reduces preferred orientation for large, symmetric particles. Here we demonstrate that our method can in fact be used to scramble the orientation of proteins of a range of sizes and symmetries. The effect can be enhanced for some proteins by increasing the heating rate during flash melting or by depositing amorphous ice onto the sample prior to revitrification. This also allows us to shed light onto the underlying mechanism. Our experiments establish a set of tools for overcoming preferred orientation that can be easily integrated into existing workflows. Cryoelectron microscopy Proteins
N Nature Methods · Aug 27, 2025 Functional phenotyping of genomic variants using joint multiomic single-cell DNA–RNA sequencing This study introduces SDR-seq, a droplet-based single-cell DNA–RNA sequencing platform, enabling the study of gene expression profiles linked to both noncoding and coding variants.
N Nature Methods · Aug 26, 2025 Reconstruction of a connectome of single neurons in mouse brains by cross-validating multi-scale multi-modality data Brain networks, or connectomes, have inspired research at macro-, meso- and micro-scales. However, the rise of single-cell technologies necessitates inferring connectomes consisting of individual neurons projecting throughout the brain. Here, we present a scalable approach to map single-neuron connectivity at the whole-brain scale using two complementary methods. We first generated an arbor-net by probabilistically pairing dendritic and axonal arbors of 20,247 neurons registered to the Allen Brain Atlas. We also produced a bouton-net based on 2.57 million putative axonal boutons from 1,877 fully reconstructed neurons and probabilistic pairing of these full-morphology datasets. Cross-validation of both networks showed statistical consistency in spatially and anatomically modular distributions of neuronal connections, corresponding to functional modules in the mouse brain. We found that single-neuron connections correlated more strongly with gene coexpression than the full-brain mesoscale connectome. Our network analysis, comparing the connectomes with alternative brain architectures, identified nonrandom subnetwork patterns. Overall, our data indicate rich granularity and strong modular diversity in mouse brain networks. Computational biology and bioinformatics Neuroscience
N Nature Methods · Aug 26, 2025 A realistic phantom dataset for benchmarking cryo-ET data annotation Cryo-electron tomography (cryo-ET) is a powerful technique for imaging molecular complexes in their native cellular environments. However, identifying the vast majority of molecular species in cellular tomograms remains prohibitively difficult. Machine learning (ML) methods provide an opportunity to automate the annotation process, but algorithm development has been hindered by the lack of large, standardized datasets. Here we present an experimental phantom dataset with comprehensive ground-truth annotations for six molecular species to spur new algorithm development and benchmark existing tools. This annotated dataset is available on the CryoET Data Portal with infrastructure to streamline access for methods developers across fields. Cryoelectron tomography Data acquisition Machine learning Protein databases Proteins
N Nature Methods · Aug 26, 2025 DeepMVP: deep learning models trained on high-quality data accurately predict PTM sites and variant-induced alterations Post-translational modifications (PTMs) are critical regulators of protein function, and their disruption is a key mechanism by which missense variants contribute to disease. Accurate PTM site prediction using deep learning can help identify PTM-altering variants, but progress has been limited by the lack of large, high-quality training datasets. Here, we introduce PTMAtlas, a curated compendium of 397,524 PTM sites generated through systematic reprocessing of 241 public mass-spectrometry datasets, and DeepMVP, a deep learning framework trained on PTMAtlas to predict PTM sites for phosphorylation, acetylation, methylation, sumoylation, ubiquitination and N-glycosylation. DeepMVP substantially outperforms existing tools across all six PTM types. Its application to predicting PTM-altering missense variants shows strong concordance with experimental results, validated using literature-curated variants and cancer proteogenomic datasets. Together, PTMAtlas and DeepMVP provide a robust platform for PTM research and a scalable framework for assessing the functional consequences of coding variants through the lens of PTMs. Genomics Machine learning Post-translational modifications Proteome informatics Proteomics
N Nature Methods · Aug 21, 2025 A versatile miniature two-photon microscope enabling multicolor deep-brain imaging Here we present FHIRM-TPM 3.0, a 2.6 g miniature two-photon microscope capable of multicolor deep-brain imaging in freely behaving mice. The system was integrated with a broadband anti-resonant hollow-core fiber featuring low transmission loss, minimal dispersion from 700 nm to 1,060 nm and high tolerance of laser power. By correcting chromatic and spherical aberrations and optimizing the fluorescence collection aperture, we achieved cortical neuronal imaging at depths exceeding 820 μm and, using a GRIN lens, hippocampal Ca2+ imaging at single dendritic spine resolution. Moreover, we engineered three interchangeable parfocal objectives, allowing for a tenfold scalable field of view up to 1 × 0.8 mm², with lateral resolutions ranging from 0.68 μm to 1.46 μm. By multicolor imaging at excitation wavelengths of 780 nm, 920 nm and 1,030 nm, we investigated mitochondrial and cytosolic Ca2+ activities relative to the deposition of amyloid plaques in the cortex of awake APP/PS1 transgenic mice. Thus, FHIRM-TPM 3.0 provides a versatile imaging system suitable for diverse brain imaging scenarios. Ca2+ imaging Mouse Multiphoton microscopy Neuroscience
N Nature Methods · Aug 18, 2025 High-throughput profiling of chemical-induced gene expression across 93,644 perturbations In this Resource, we present an extensive dataset of chemical-induced gene signatures (CIGS), encompassing expression patterns of 3,407 genes regulating key biological processes in 2 human cell lines exposed to 13,221 compounds across 93,664 perturbations. This dataset encompasses 319,045,108 gene expression events, generated through 2 high-throughput technologies: the previously documented high-throughput sequencing-based high-throughput screening (HTS2) and the newly developed highly multiplexed and parallel sequencing (HiMAP-seq). Our results show that HiMAP-seq is comparable to RNA sequencing, but can profile the expression of thousands of genes across thousands of samples in one single test by utilizing a pooled-sample strategy. We further illustrate CIGS’s utility in elucidating the mechanism of action of unannotated small molecules, like ligustroflavone and 2,4-dihydroxybenzaldehyde, and in identifying perturbation-induced cell states, such as those resistant to ferroptosis. The full dataset is publicly accessible at https://cigs.iomicscloud.com/. Chemical genetics Genetic techniques RNA sequencing Transcriptomics