AI tool adjusts for ancestral bias in genetic data


  • Research Highlight

Genomics

Nature Biotechnology

volume 43, page 501 (2025)

Human ancestry has a considerable impact on gene expression, but genomic datasets for disease analysis severely underrepresent non-European populations, thereby limiting the advancement of precision medicine. In a paper in Nature Communications, Smith et al. introduce a machine learning tool to mitigate the effects of ancestral bias in transcriptomic data.

The tool, called PhyloFrame, creates ancestry-aware signatures of disease by integrating population genomics data with smaller, disease-relevant training datasets. PhyloFrame first fits a logistic regression model with a LASSO penalty to obtain an initial set of disease-relevant genes. It then uses population genomics data to help compensate for shifts in data distribution caused by differences in human ancestry. In short, PhyloFrame projects the initial disease signature onto a functional interaction network, extending the network to include the first and second neighbors of each signature gene. This expanded gene set is then filtered by a statistic termed enhanced allele frequency (EAF), which captures population-specific allelic enrichment in healthy tissue, to identify ancestrally diverse genes that interact with the original signature. For each ancestry, a selected subset of genes with high EAF and variable expression in the training data is added to the PhyloFrame signature. Retraining the model with the forced inclusion of these equitable genes yields a signature of disease that generalizes to all populations, even those not represented in the training data.
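The network-expansion and filtering steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, gene labels, population labels and scores below are hypothetical, EAF and expression variance are passed in as precomputed values because this highlight does not specify how they are calculated, and the "top k per ancestry" rule is an assumed stand-in for the selection criterion. The initial LASSO fit and the final retraining are omitted.

```python
from collections import deque


def expand_signature(signature, network, max_hops=2):
    """Return the signature genes plus every gene within max_hops of one.

    `network` is an adjacency mapping (gene -> iterable of neighbors);
    max_hops=2 corresponds to first and second neighbors.
    """
    hops = {gene: 0 for gene in signature}
    queue = deque(hops)
    while queue:
        gene = queue.popleft()
        if hops[gene] == max_hops:
            continue  # do not walk past the second neighbors
        for neighbor in network.get(gene, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[gene] + 1
                queue.append(neighbor)
    return set(hops)


def select_equitable_genes(candidates, eaf, expr_variance, top_k=1, min_variance=0.0):
    """For each ancestry, keep the top_k candidates ranked by EAF.

    Assumption: genes must also show expression variance above min_variance
    in the training data; ties are broken alphabetically for determinism.
    """
    selected = {}
    for ancestry, scores in eaf.items():
        eligible = [
            g for g in candidates
            if g in scores and expr_variance.get(g, 0.0) > min_variance
        ]
        eligible.sort(key=lambda g: (-scores[g], g))
        selected[ancestry] = eligible[:top_k]
    return selected


# Hypothetical toy inputs: a chain-shaped network G1 - G2 - G3 - G4.
network = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2", "G4"], "G4": ["G3"]}
signature = ["G1"]  # stand-in for the genes chosen by the initial LASSO model
eaf = {
    "pop_a": {"G2": 0.1, "G3": 0.8, "G4": 0.9},
    "pop_b": {"G2": 0.7, "G3": 0.2},
}
expr_variance = {"G2": 0.5, "G3": 0.4, "G4": 0.9}

expanded = expand_signature(signature, network)
candidates = expanded - set(signature)
equitable = select_equitable_genes(candidates, eaf, expr_variance)

# Genes that a retrained model would be forced to include.
forced_signature = sorted(set(signature).union(*equitable.values()))
```

In this toy example G4 has the highest EAF for `pop_a` but lies three hops from the signature, so it never becomes a candidate; the network step restricts the search to genes that plausibly interact with the original signature before EAF is considered.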


Author information

Authors and Affiliations

  1. Nature Biotechnology https://www.nature.com/nbt/

    Iris Marchal

Corresponding author

Correspondence to Iris Marchal.

About this article


Cite this article

Marchal, I. AI tool adjusts for ancestral bias in genetic data.
Nat Biotechnol 43, 501 (2025). https://doi.org/10.1038/s41587-025-02651-7


  • DOI: https://doi.org/10.1038/s41587-025-02651-7
