Oxford,  2025 — Researchers from the Leverhulme Centre for Demographic Science (LCDS) at the University of Oxford and Harvard University have unveiled a pioneering statistical method that addresses one of the most persistent and overlooked challenges in genetic research: participation bias.

Published Friday last week in the Proceedings of the National Academy of Sciences (PNAS), the study, “Participation bias in the estimation of heritability and genetic correlation”, was co-authored by LCDS scientists Stefania Benonisdottir and Augustine Kong, alongside lead author Shuang Song and co-author Jun S. Liu, both from Harvard University.

“Participation is often treated as a side issue, but it’s genetically informative in itself,” said Dr. Shuang Song, lead author of the study. “Our method provides a novel and robust way to correct for this bias — using information on the genetics of participation missed by others — and reveals how deeply participation affects the genetic estimates we rely on.”

A Statistical Breakthrough for Population Genetics

The team developed a statistical framework that models the genetic and non-genetic bases of participation bias. To adjust estimates of heritability and genetic correlation, the method combines two sources of information: the genetic basis of participation, which can be deduced from the participants alone, and known differences in trait distributions between participants and the target population. Other adjustment methods use only the latter, which forces them to rely on strong assumptions that are likely, and in some cases demonstrated, to be invalid. The new approach avoids those assumptions and disentangles genetic and non-genetic correlations between participation and other traits.

By applying their method to data from the UK Biobank — one of the world’s largest genetic studies — the researchers found that participation bias causes consistent underestimation of heritability and genetic correlation across several key phenotypes, including:

  • Body mass index (BMI)
  • Educational attainment
  • Smoking status
  • Income and employment status

Without adjustment, estimates for these traits may significantly misrepresent the genetic architecture underpinning them.
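To give a feel for why a self-selected sample can understate genetic signal, here is a toy simulation. It is not the authors' method and does not use UK Biobank data. It assumes a single trait with a true heritability of 0.5, a hypothetical participation rule in which people with higher trait values are more likely to take part, and it uses the squared correlation between the simulated genetic value and the trait as a simple stand-in for a heritability estimate. All names and parameter values are illustrative assumptions.

```python
import math
import random


def squared_correlation(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def simulate(n=100_000, h2=0.5, selection_strength=1.0, seed=0):
    """Toy model (assumption): trait = genetic value + environment.

    Participation is decided by a hypothetical liability,
    selection_strength * trait + noise, and a person takes part
    when that liability is positive.
    Returns (population estimate, participant-only estimate, participation rate).
    """
    rng = random.Random(seed)
    g_all, y_all, g_part, y_part = [], [], [], []
    for _ in range(n):
        g = rng.gauss(0.0, math.sqrt(h2))        # genetic component
        e = rng.gauss(0.0, math.sqrt(1.0 - h2))  # non-genetic component
        y = g + e                                # trait with variance 1
        liability = selection_strength * y + rng.gauss(0.0, 1.0)
        g_all.append(g)
        y_all.append(y)
        if liability > 0.0:                      # this person participates
            g_part.append(g)
            y_part.append(y)
    return (
        squared_correlation(g_all, y_all),
        squared_correlation(g_part, y_part),
        len(g_part) / n,
    )


if __name__ == "__main__":
    population, participants, rate = simulate()
    print(f"participation rate:          {rate:.2f}")
    print(f"estimate, whole population:  {population:.3f}")
    print(f"estimate, participants only: {participants:.3f}")
```

In this sketch the participant-only estimate comes out lower than the whole-population estimate, because selection narrows the range of trait values in the sample. Real analyses estimate heritability from genotype data with far more involved methods, and the size and even the direction of the bias depend on how participation relates to the genetic and non-genetic parts of each trait, which is the relationship the published framework is designed to model.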

“This method allows us to see more clearly into the genetic structure of human traits — even when working with imperfect, biased samples,” said Professor Augustine Kong. “It’s a step forward in ensuring that large genetic studies like UK Biobank produce accurate and equitable scientific insight.”

Implications for Future Research

Together with a previous study by the two LCDS scientists, this study demonstrates that participation itself is a complex behavioural trait with its own genetic signature. As a result, studies that ignore this bias risk drawing flawed conclusions about the genetics of other traits. This insight is particularly timely as global biobanks continue to scale up and seek to generalize findings across populations.

The authors argue that this framework can be readily applied to other datasets such as Our Future Health, and even extended to studies involving multiple ancestries, offering a scalable solution to a fundamental statistical challenge in human genetics and sample surveys.

Support and Acknowledgments

This research was supported by the Economic and Social Research Council (ES/W002116/1), a Leverhulme Research Centres Grant (RC-2018-003), the Li Ka Shing Foundation, the Goodger and Schorstein Scholarship, and Nuffield College. It used UK Biobank data under Application No. 68672.

📄 Read the full open-access article in PNAS:
https://www.pnas.org/doi/full/10.1073/pnas.2425530122