
Diet–microbiome associations in 10,068 individuals from the Human Phenotype Project to guide personalized nutrition

Abstract

Diet is a major environmental factor influencing the human gut microbiome. However, the effects of specific foods and dietary patterns on microbial composition, diversity and function are not fully understood, limiting progress toward personalized dietary strategies. Here, leveraging 10,068 participants from the Human Phenotype Project with app-based diet logs and shotgun metagenomics, we predicted diet–microbiome associations at species-level resolution. Diet significantly predicted microbial diversity (richness r = 0.26, Shannon Index r = 0.24), the relative abundance of 669 of 724 species tested (92.4%, false discovery rate <0.05), and 313 of 320 pathways (97.8%, false discovery rate <0.05). Feature attribution identified distinct food–microbe links, including coffee with Lawsonibacter asaccharolyticus (r = 0.43), yogurt with Streptococcus thermophilus (r = 0.42) and milk with Bifidobacterium species (r = 0.31–0.36). In parallel, broader dietary patterns, especially the degree of food processing, emerged as predictors of microbial diversity and composition. We also show that diet–microbiome associations persist over four years, with 82.5% of species exhibiting significant longitudinal tracking between predicted and observed abundances. Finally, we developed an exploratory analysis for simulating personalized dietary interventions with predicted microbiome shift effects that are associated with improvements in cardiometabolic health. Our findings demonstrate that diet is strongly associated with microbiome composition, diversity and function, and highlight its potential for guiding personalized interventions.
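To make the quantities reported in the abstract concrete, the following minimal sketch computes species richness and the Shannon index from a vector of relative abundances, and applies a Benjamini-Hochberg adjustment to a set of P values. It is an illustration only, not the authors' implementation: the function names are hypothetical, the zero detection threshold for richness is an assumption, and the input values used in the usage note are made up.

```python
import numpy as np


def richness(rel_abund, detection=0.0):
    """Count species whose relative abundance exceeds `detection` (assumed threshold)."""
    return int(np.sum(np.asarray(rel_abund, dtype=float) > detection))


def shannon_index(rel_abund):
    """Shannon index H = -sum(p_i * ln p_i), taken over species with p_i > 0."""
    p = np.asarray(rel_abund, dtype=float)
    p = p[p > 0]
    if p.size == 0:
        return 0.0
    p = p / p.sum()  # renormalize so the proportions sum to 1
    return float(-np.sum(p * np.log(p)))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale each sorted P value by m / rank.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working from the largest P value downwards.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out
```

For example, `shannon_index([0.25, 0.25, 0.25, 0.25])` equals ln 4 for four equally abundant species, and a species would count as significant at a false discovery rate of 0.05 when its adjusted value from `benjamini_hochberg` falls below 0.05.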
Data availability

The raw individual-level gut metagenome data and basic phenotypes (age, sex and BMI) used in this study are available at the European Genome–Phenome Archive (https://ega-archive.org/) under accession EGAS00001007204. Individual-level genetics data used in this study are available at the European Genome–Phenome Archive (https://ega-archive.org/) under accession number EGAS00001008040. Owing to privacy and ethical considerations set forth by the IRB committee of the study, all other datasets that are part of the Human Phenotype Project (HPP), such as blood test results and anthropometric measurements, are accessible to researchers at universities and other research institutions at https://humanphenotypeproject.org/data-access. These data were made available in this manner in the Nature Medicine publication on the HPP (ref. 18). Interested bona fide researchers should contact info@pheno.ai, who will then provide an agreement that allows access to the data for noncommercial purposes. Data access will then be provided through an account in a secure cloud-based environment.

Code availability

Code used in this study is available via GitHub at https://github.com/TmrSegev/diet-microbiome. A permanent archive of the code version used for this analysis is available via Zenodo at https://doi.org/10.5281/zenodo.18959226 (ref. 58).

References

Sanz, Y. et al. The gut microbiome connects nutrition and human health. Nat. Rev. 
Gastroenterol. Hepatol. 22, 534–555 (2025). Kim, Y. S., Unno, T., Kim, B.-Y. & Park, M.-S. Sex differences in gut microbiota. World J. Mens Health 38, 48–60 (2020). Rothschild, D. et al. Environment dominates over host genetics in shaping human gut microbiota. Nature 555, 210–215 (2018). Gacesa, R. et al. Environmental factors shaping the gut microbiome in a Dutch population. Nature 604, 732–739 (2022). Carmody, R. N., Varady, K. & Turnbaugh, P. J. Digesting the complex metabolic effects of diet on the host and microbiome. Cell 187, 3857–3876 (2024). Hills, R. D. et al. Gut microbiome: profound implications for diet and disease. Nutrients 11, 1613 (2019). Arora, T. et al. Microbial fermentation of flaxseed fibers modulates the transcriptome of GPR41-expressing enteroendocrine cells and protects mice against diet-induced obesity. Am. J. Physiol. Endocrinol. Metab. 316, E453–E463 (2019). McDonald, D. American Gut: an open platform for citizen science microbiome research. mSystems 3, e00031–18 (2018). Simpson, H. L. & Campbell, B. J. Review article: dietary fibre–microbiota interactions. Aliment. Pharmacol. Ther. 42, 158–179 (2015). Dostal Webster, A. et al. Influence of short-term changes in dietary sulfur on the relative abundances of intestinal sulfate-reducing bacteria. Gut Microbes 10, 447–457 (2019). Malesza, I. J. et al. High-fat, western-style diet, systemic inflammation, and gut microbiota: a narrative review. Cells 10, 3164 (2021). Rinninella, E. et al. The role of diet in shaping human gut microbiota. Best Pract. Res. Clin. Gastroenterol. 62–63, 101828 (2023). Vangay, P. et al. US immigration westernizes the human gut microbiome. Cell 175, 962–972 (2018). Sonnenburg, J. L. & Bäckhed, F. Diet–microbiota interactions as moderators of human metabolism. Nature 535, 56–64 (2016). Fackelmann, G. et al. Gut microbiome signatures of vegan, vegetarian and omnivore diets and associated health outcomes across 21,561 individuals. Nat. Microbiol. 10, 41–52 (2025). 
Gundogdu, A. & Nalbantoglu, O. U. The role of the Mediterranean diet in modulating the gut microbiome: a review of current evidence. Nutrition 114, 112118 (2023). Ross, F. C. et al. The interplay between diet and the gut microbiome: implications for health and disease. Nat. Rev. Microbiol. 22, 671–686 (2024). Reicher, L. Deep phenotyping of health–disease continuum in the Human Phenotype Project. Nat. Med. 31, 3191–3203 (2025). Chiuve, S. E. et al. Alternative dietary indices both strongly predict risk of chronic disease. J. Nutr. 142, 1009–1018 (2012). Karanja, N. M. et al. Descriptive characteristics of the dietary patterns used in the Dietary Approaches to Stop Hypertension Trial. J. Am. Diet. Assoc. 99, S19–S27 (1999). Stubbendorff, A. et al. A systematic evaluation of seven different scores representing the EAT–Lancet reference diet and mortality, stroke, and greenhouse gas emissions in three cohorts. Lancet Planet. Health 8, e391–e401 (2024). Satija, A. & Hu, F. B. Plant-based diets and cardiovascular health. Trends Cardiovasc. Med. 28, 437–441 (2018). Abu-Saad, K. et al. Adaptation and predictive utility of a Mediterranean diet screener score. Clin. Nutr. 38, 2928–2935 (2019). Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. In Proc. 31st International Conference on Neural Information Processing Systems (eds. Guyon, I. et al.) 3149–3157 (Curran Associates, 2017). de la Cuesta-Zuluaga, J. et al. Age- and sex-dependent patterns of gut microbial diversity in human adults. mSystems 4, e00261–19 (2019). Falony, G. et al. Population-level analysis of gut microbiome variation. Science 352, 560–564 (2016). Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020). Belda, I. et al. A multi-omics approach for understanding the effects of moderate wine consumption on human intestinal health. Food Funct. 12, 4152–4164 (2021). Benjamini, Y. & Hochberg, Y. 
Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol. 57, 289–300 (1995). Manghi, P. et al. Coffee consumption is associated with intestinal Lawsonibacter asaccharolyticus abundance and prevalence across multiple cohorts. Nat. Microbiol. 9, 3120–3134 (2024). Fijan, S. Microorganisms with claimed probiotic properties: an overview of recent literature. Int. J. Environ. Res. Public Health 11, 4745–4767 (2014). Aslam, H. et al. The effects of dairy and dairy derivatives on the gut microbiota: a systematic literature review. Gut Microbes 12, 1799533 (2020). JanssenDuijghuijsen, L. et al. Changes in gut microbiota and lactose intolerance symptoms before and after daily lactose supplementation in individuals with the lactase nonpersistent genotype. Am. J. Clin. Nutr. 119, 702–710 (2024). Cooper, D. N., Martin, R. J. & Keim, N. L. Does whole grain consumption alter gut microbiota and satiety?. Healthcare 3, 364–392 (2015). Mano, F. et al. The effect of white rice and white bread as staple foods on gut microbiota and host metabolism. Nutrients 10, 1323 (2018). Ben-Yacov, O. et al. Gut microbiome modulates the effects of a personalised postprandial-targeting (PPT) diet on cardiometabolic markers: a diet intervention in pre-diabetes. Gut 72, 1486–1496 (2023). Frioux, C. et al. Enterosignatures define common bacterial guilds in the human gut microbiome. Cell Host Microbe 31, 1111–1125. (2023). Carlino, N. et al. Unexplored microbial diversity from 2,500 food metagenomes and links with the human microbiome. Cell 187, 5775–5795 (2024). Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery3. Elife 10, e65088 (2021). Tan, J. et al. in Advances in Immunology Vol. 121 (ed Alt, F. W.) 91–119 (Academic, 2014). Hamer, H. M. et al. Review article: the role of butyrate on colonic function. Aliment. Pharmacol. Ther. 27, 104–119 (2008). 
Zhou, T. et al. High-resistant starch and low-glutelin content 1 rice benefits gut function in obese patients. Front. Sustain. Food Syst. 8, 1364403 (2024). Martinez Tuppia, C. et al. In vitro human gastrointestinal digestibility and colonic fermentation of wheat sourdough and yeast breads. Foods 13, 3014 (2024). Htet, T. D. et al. Rationale and design of a randomised controlled trial testing the effect of personalised diet in individuals with pre-diabetes or type 2 diabetes mellitus treated with metformin. BMJ Open 10, e037859 (2020). Ben-Yacov, O. et al. Personalized postprandial glucose response–targeting diet versus mediterranean diet for glycemic control in prediabetes. Diabetes Care 44, 1980–1991 (2021). Larsen, P. E. & Dai, Y. Modeling interaction networks between host, diet, and bacteria predicts obesogenesis in a mouse model. Front. Mol. Biosci. 9, 1059094 (2022). Bancil, A. S. et al. Food additive emulsifiers and their impact on gut microbiome, permeability, and inflammation: mechanistic insights in inflammatory bowel disease. J. Crohns Colitis 15, 1068–1079 (2021). Naimi, S., Viennois, E., Gewirtz, A. T. & Chassaing, B. Direct impact of commonly used dietary emulsifiers on human gut microbiota. Microbiome 9, 66 (2021). Leeuwendaal, N. K., Stanton, C., O’Toole, P. W. & Beresford, T. P. Fermented foods, health and the gut microbiome. Nutrients 14, 1527 (2022). Wastyk, H. C. et al. Gut-microbiota-targeted diets modulate human immune status. Cell 184, 4137–4153 (2021). Lichtenstein, A. H. et al. 2021 Dietary guidance to improve cardiovascular health: a scientific statement from the american heart association. Circulation 144, e472–e487 (2021). Luna-Castillo, K. P. et al. The effect of dietary interventions on hypertriglyceridemia: from public health to molecular nutrition evidence. Nutrients 14, 1104 (2022). Asnicar, F. et al. Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals. Nat. Med. 27, 321–332 (2021). 
Taylor, B. C. et al. Consumption of fermented foods is associated with systematic differences in the gut microbiome and metabolome. mSystems 5, e00901–e00919 (2020). Leviatan, S., Shoer, S., Rothschild, D., Gorodetski, M. & Segal, E. An expanded reference map of the human gut microbiome reveals hundreds of previously unknown species. Nat. Commun. 13, 3863 (2022). Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011). Blanco-Míguez, A. et al. Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Nat. Biotechnol. 41, 1633–1644 (2023). Segev, T. TmrSegev/diet-microbiome: v1.1.0. Zenodo https://doi.org/10.5281/zenodo.18959226 (2026).

Acknowledgements

We thank the members of the Segal group for fruitful discussions. E.S. is supported by the Crown Human Genome Center; the Larson Charitable Foundation New Scientist Fund; the Else Kröner Fresenius Foundation; the White Rose International Foundation; the Ben B. and Joyce E. Eisenberg Foundation; the Nissenbaum Family; Marcos Pinheiro de Andrade and Vanessa Buchheim; Lady Michelle Michels; Aliza Moussaieff; and grants funded by the Minerva Foundation, with funding from the Federal German Ministry for Education and Research and by the European Research Council and the Israel Science Foundation. Funded by the European Union Project 101095540 IMMEDIATE. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

T.S. conceived the project, designed and conducted all analyses, interpreted the results and wrote the paper. D.B. 
designed and conducted the simulated dietary intervention and microbiome-phenotype prediction analyses, and wrote the paper. L.Z. and H.R. guided the project and advised on analyses. A.G. provided consultation and support for data processing. M.R. and D.K. advised on diet adherence score development. D.S.-B. helped write the manuscript. A.W. developed protocols, and oversaw sample collection and processing. E.S. conceived and directed the project and analyses, designed the analyses, interpreted the results and wrote the paper.

Ethics declarations

Competing interests

H.R. is an employee at Pheno.AI Ltd, a biomedical data science company from Tel-Aviv, Israel. E.S. is a paid consultant of Pheno.AI Ltd. The other authors declare no competing interests.

Peer review

Peer review information

Nature Medicine thanks Leo Lahti and Tao Zuo for their contribution to the peer review of this work. Primary Handling Editor: Ming Yang, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Extended diversity and abundance predictions. a. Comparison of species abundance prediction performance between models trained with covariates only (x-axis) and with all features (y-axis). Each dot represents a species; the red dashed line indicates the identity line. b. Histogram of Pearson correlations (LGBM) between predicted and observed species abundances, in the test set. A red line indicates the 99th percentile of correlations obtained from permutation testing, highlighting significant correlations. c. Comparison of species abundance prediction performance between ridge regression (x-axis) and LightGBM (y-axis). Each dot represents a species; the red dashed line indicates the identity line. d. Diversity targets linear model versus LGBM model. 
Pearson correlation coefficients (r) for ridge linear regression models (blue) and LightGBM models (orange) trained to predict species richness and Shannon diversity. Grey dots represent permuted Pearson’s r with all features, and the dashed line indicates the 99th percentile of permutations. Extended Data Fig. 2 SHAP summary plots of highly predicted microbial species. SHAP summary plots for highly predicted species show feature contributions sorted by importance; SHAP direction was derived from Spearman correlation directions. Extended Data Fig. 3 Presence of diet-predictive microbial species in foods from the curated Food Metagenomic Database (cFMD). a. A heatmap showing microbial species detected across 16 food items (columns) and 38 species with significant predictive associations in the cohort (rows). Colors indicate mean relative abundance (MetaPhlAn-derived), with warmer shades reflecting higher abundance. Asterisks mark species-food pairs where the species was detected (abundance > 0). Several key associations were confirmed: Streptococcus thermophilus was abundant in yogurt, consistent with its role in fermentation; milk contained Bifidobacterium longum and Bifidobacterium adolescentis; and UBA11774 sp003507655, primarily predicted by nut intake, was detected in dark chocolate. Conversely, some strong dietary predictors (for example, Lawsonibacter asaccharolyticus for coffee) were not found in food sequencing. Extended Data Fig. 4 Microbiome pathways are highly predicted by diet. a. Histogram of Pearson correlations (LGBM) between predicted and observed pathway abundances. A red line indicates the 99th percentile of correlations obtained from permutation testing, highlighting significant correlations. b. Pearson correlations of pathway predictions from baseline models (covariates) compared with models including all features. c. Distribution of feature importance across all modeled pathways (n = 320). 
Box plots display the mean absolute SHAP values for each predictor across all pathway models, sorted by average importance. Higher values indicate a stronger contribution to predicting pathways. Boxes denote midline as the median; box as interquartile range; whiskers as 1.5x interquartile range; points indicate outliers. d-f. Directional SHAP values for representative pathway groups. Grey dots denote individual pathways, bars show mean signed SHAP values. d, Biosynthetic pathways. e, Carbohydrate and plant polysaccharide degradation pathways. f, Butyrate producing pathways. Extended Data Fig. 5 Main predictors heatmap of SHAP values for the most food-predictable pathways. a. Main predictors heatmap of SHAP values for pathways with the highest predictive gain from baseline (Δr > 0.05), and the dietary features with the maximal importance. We clustered pathways in their SHAP value profiles across foods. Absolute SHAP values < 0.01 were masked. Extended Data Fig. 6 External validation of diet-microbiome models in independent cohorts. a,b. Validation in the Australian PREDICT cohort at baseline (n = 118) using models retrained on shared nutrient-level features. a. Pearson’s r for alpha diversity predictions. Bar heights represent performance for models based on age, sex and BMI, HPP internal cross-validation, HPP held-out test set, and external validation on the Australian cohort. Grey dots represent Pearson’s r from permuted predictions; the dashed horizontal line indicates the 99th percentile of permutations. b. Histogram of Pearson correlations for predicted species relative abundances in the Australian cohort compared to HPP test set predictions averaged across five subsamples of equal size. The red dotted line indicates the 99th percentile of correlations obtained from permutation testing. c-e. Validation in the Personalized Nutrition Project (PNP) dietary intervention cohort (n = 188) using models retrained on shared food-level features. c. 
Pearson correlations for alpha diversity metrics, comparing PNP baseline and intervention samples to HPP benchmarks. d,e. Histograms of Pearson correlations for species relative abundance predictions in PNP baseline (d) and intervention (e) samples compared to size-matched HPP test samples. Extended Data Fig. 7 Correlation of observed versus predicted longitudinal changes in microbial abundance. a, b. Association of predicted and observed microbial changes at the two-year (a) and four-year (b) follow-ups. Each point represents a microbial species, with mean predicted change plotted against mean observed change. Species predicted to increase tended to rise in abundance, whereas those predicted to decrease declined, supporting the validity of diet-microbiome links over time. Associations were evaluated using Pearson correlation coefficients (r) and two-sided P-values, adjusted for multiple comparisons using the Benjamini-Hochberg method (r > 0, FDR < 0.05). Extended Data Fig. 8 Predictive performance and variance partitioning of cardiometabolic phenotypes. a. Pearson correlations between predicted and observed values for clinically relevant health phenotypes (n = 10) and the composite Cardiometabolic Index (CMI). For each phenotype, circles show the predictive performance of a baseline model including age and sex (green) and a microbiome-informed model including TSS-normalized and log-transformed species abundances in addition to age and sex (orange). Lines connect the paired models for each phenotype, with numbers above the lines indicating the absolute improvement in Pearson correlation (Δr). Five-fold cross-validation was used to generate predictions. All phenotypes shown were significantly predicted by the microbiome-informed model (FDR < 0.05). b. Variance partitioning of the predictive signal attributable to diet versus the microbiome. 
For each phenotype (n = 11, including CMI), the total improvement in explained variance (ΔR² beyond the baseline model of age and sex) was decomposed into variance explained uniquely by diet, uniquely by the microbiome, and by shared diet–microbiome information (reflecting predictive signal that cannot be uniquely assigned because diet and microbiome features are correlated). Colored dots indicate individual phenotypes; box centers indicate medians; boxes indicate interquartile ranges (IQR); whiskers indicate the full range. The unique diet component contributed the largest improvement (median ΔR² = 0.152), significantly exceeding the shared component (median ΔR² = 0.085; P = 0.005, two-sided Wilcoxon signed-rank test). The unique microbiome component was negligible (median ΔR² = 0.006) and significantly lower than the unique diet component (P = 0.010, two-sided Wilcoxon signed-rank test). Extended Data Fig. 9 Personalized dietary simulated interventions predict microbiome shifts and triglyceride reductions. a. Personalized dietary recommendations for a representative individual (56-60 years old male). Arrows indicate foods predicted to increase or decrease relative to baseline intake, with percent changes shown in the legend. b. Predicted microbial shifts for the same individual, showing the ten species with the largest absolute fold changes. c. Most frequent food changes recommended across the cohort (n = 2,070 participants from the test set). Percentages show how many individuals who ate each food were advised to change it. Recommended changes are given as percent of baseline calories; each point is one individual. Points at -100% indicate complete removal of that food, and clustering at -100% reflects foods for which complete removal is commonly recommended. Boxes denote midline as the median; box as interquartile range; whiskers as 1.5x interquartile range. d. 
Microbial species most frequently predicted to change in response to the recommended diets (n = 2,070 participants from the test set). Percentages show how many individuals carrying each species were predicted to see a change. Boxes denote midline as the median; box as interquartile range; whiskers as 1.5x interquartile range. e. Number of recommended food changes per individual. Most people received only a few changes, with a median of two. f. Predicted changes in triglyceride levels across individuals. The dashed line marks no change, and the arrow highlights a predicted 3.2 mg/dL decrease in blood triglyceride concentration for the representative individual shown in panels a and b. Illustrations in a and b created in BioRender; Aharoni, S. https://biorender.com/qtbac40 (2026). Extended Data Fig. 10 Personalized dietary simulated interventions predict microbiome shifts and VAT reductions. a. Personalized dietary recommendations for a representative individual (61-65 years old female). Arrows indicate foods predicted to increase or decrease relative to baseline intake, with percent changes shown in the legend. b. Predicted microbial shifts for the same individual, showing the ten species with the largest absolute fold changes. c. Most frequent food changes recommended across the cohort (n = 2,070 participants from the test set). Percentages show how many individuals who ate each food were advised to change it. Recommended changes are given as percent of baseline calories; each point is one individual. Points at -100% indicate complete removal of that food, and clustering at -100% reflects foods for which complete removal is commonly recommended. Boxes denote midline as the median; box as interquartile range; whiskers as 1.5x interquartile range. d. Microbial species most frequently predicted to change in response to the recommended diets (n = 2,070 participants from the test set). 
Percentages show how many individuals carrying each species were predicted to see a change. Boxes denote midline as the median; box as interquartile range; whiskers as 1.5x interquartile range; points indicate outliers. e. Number of recommended food changes per individual. Most people received only a few changes, with a median of three. f. Predicted changes in visceral adipose tissue (VAT) mass across individuals. The dashed line marks no change, and the arrow highlights a predicted 22 g decrease in VAT mass for the representative individual shown in panels a and b. Illustrations in a and b created in BioRender; Aharoni, S. https://biorender.com/qtbac40 (2026).

Supplementary information

Supplementary Information (PDF): Extended data figure legends and Supplementary Fig. 1.
Supplementary Table 1 (XLSX): Diet adherence scores calculation.
Supplementary Table 2 (CSV): Foods diet classification and NOVA score.
Supplementary Table 3 (CSV): Linear regression models for diet pattern comparisons.
Supplementary Table 4 (CSV): Longitudinal analysis with linear mixed-effect models.
Supplementary Table 5 (XLSX): Correlation of observed versus predicted longitudinal changes in microbial abundance.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Segev, T., Barak, D., Zahavi, L. et al. Diet–microbiome associations in 10,068 individuals from the Human Phenotype Project to guide personalized nutrition. Nat Med (2026). https://doi.org/10.1038/s41591-026-04312-x
