Fine-mapping results for 29 blood cell phenotypes in 408,113 participants from UK Biobank
Blood cells play essential roles in human health, underpinning physiological processes such as immunity, oxygen transport, and clotting, which cause a significant health burden when perturbed. Here we share Genome-Wide Association Study (GWAS) fine-mapping results from 408,113 European-ancestry participants in UK Biobank, in whom we discovered 5,106 new genetic variants independently associated with 29 blood cell phenotypes. These variants cover the full allele frequency spectrum of variation impacting hematopoiesis.
We used a Bayesian fine-mapping method (FINEMAP, http://www.christianbenner.com, see below) that accounts for multiple independent signals. Of the 16,643 autosomal associations, we fine-mapped 3,100 (19%) to a single putative causative variant (>95% posterior probability) and more than half (n=9,149; 55%) to fewer than 10 variants. Here you can browse these results by variant, genomic position or gene.
Example
In our manuscript we mention the variant rs72928038, which maps to intron 1 of the lymphoid transcription factor BACH2 (Richer et al., 2016) and colocalizes with an H3K27ac histone QTL in CD4+ T-cells (Kundu et al., 2020). Its minor allele A (minor allele frequency = 0.18) is strongly associated with decreased lymphocyte count and was fine-mapped with a posterior probability of 0.78. For more information about this variant and its role, click here. Click here to view this example in the browser.
Methods
Statistical fine-mapping was performed in the UK Biobank (UKBB) cohort using FINEMAP v1.3.1 (Benner et al. 2016) [http://www.christianbenner.com/]. Input windows were defined as ±250 kb around each conditionally independent signal. Where multiple sentinel variants generated overlapping windows, these were merged, resulting in window sizes ranging from 500 kb to 1.38 Mb. The number of conditionally independent signals in each window was used as prior knowledge for the maximum number of causative variants to be searched (--n-causal-snps option), and the prior standard deviation for effect sizes was set to 0.08 (--prior-std option). The LD structure was computed from the same samples included in the GWAS analysis. 95% credible sets were defined as the minimal sets of variants jointly covering at least 95% of the posterior probability of including the true causative signals.
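The window construction and credible-set definition described above can be illustrated with a minimal Python sketch. This is not the pipeline used for the analysis: the function names (merge_windows, credible_set), the variant identifiers and the posterior values are all hypothetical, positions are assumed to lie on a single chromosome, and the credible-set function assumes a single signal whose per-variant posterior probabilities sum to 1. The per-window signal count returned by merge_windows corresponds to the value that would be passed as the maximum number of causative variants.

```python
from typing import Dict, List, Tuple

FLANK_BP = 250_000  # +/- 250 kb around each conditionally independent signal


def merge_windows(sentinel_positions: List[int],
                  flank: int = FLANK_BP) -> List[Tuple[int, int, int]]:
    """Build flanking windows around sentinels on ONE chromosome and merge
    any that overlap. Returns (start, end, n_signals) per merged window."""
    merged: List[List[int]] = []
    for pos in sorted(sentinel_positions):
        start, end = max(1, pos - flank), pos + flank
        if merged and start <= merged[-1][1]:
            # Overlaps the previous window: extend it and count the signal.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2] += 1
        else:
            merged.append([start, end, 1])
    return [(s, e, n) for s, e, n in merged]


def credible_set(posteriors: Dict[str, float],
                 coverage: float = 0.95) -> List[str]:
    """Smallest set of variants whose summed posterior probability reaches
    `coverage` (simplified single-signal case, an assumption of this sketch)."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    total = 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical sentinel positions (bp); the first two windows overlap.
    for start, end, n_signals in merge_windows([1_000_000, 1_300_000, 5_000_000]):
        print(start, end, n_signals)

    # Hypothetical posterior probabilities for one signal.
    example = {"var_a": 0.78, "var_b": 0.15, "var_c": 0.05, "var_d": 0.02}
    print(credible_set(example))
```

In this sketch a lone sentinel yields a 500 kb window, matching the lower bound of the window sizes reported above, while nearby sentinels produce one longer merged window with a correspondingly higher signal count.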
For more details about phenotypes, methods and results please refer to our pre-print: citation

Patrick K. Albers