Notes for XMAP

Key contribution: XMAP, a statistical method for cross-population fine-mapping of common causal SNPs that leverages genetic diversity (different LD structures across populations) and accounts for confounding bias, addressing the following challenges:

  1. Strong linkage disequilibrium among variants can limit the statistical power and resolution of fine-mapping. [Solved by LDSC and SuSiE]
  2. Simultaneously searching for multiple causal variants is computationally expensive. [Solved by SuSiE]
  3. Confounding bias hidden in GWAS summary statistics can produce spurious signals. [Adjusted by the inflation factor $c$; see the sketch right after this list]
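A minimal sketch (not XMAP's implementation; the function and variable names are hypothetical) of how an inflation factor $\hat{c}$, e.g. an LDSC intercept, can be folded into the covariance of the GWAS marginal effect estimates so that uniformly inflated test statistics widen the likelihood instead of being fine-mapped as causal signals. This mirrors the $\hat{c}_i$-scaled terms that appear in the E-step updates below.

```python
import numpy as np

def summary_stat_covariance(R, s, c_hat):
    """Covariance of the marginal effect estimates b_hat assumed in the likelihood:
    c_hat * S R S, where S = diag(s) holds the GWAS standard errors and R is the
    LD correlation matrix. c_hat = 1 corresponds to no confounding; c_hat > 1
    (e.g. an LDSC intercept above 1) inflates the noise level so that uniformly
    elevated association signals are not mistaken for causal variants."""
    S = np.diag(s)
    return c_hat * S @ R @ S
```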

It also integrates the polygenic component $\boldsymbol{\phi}$ to capture the genetic background effects.

It can be integrated with single-cell data to identify trait-relevant cell populations at single-cell resolution.

Advantages over existing methods: greater statistical power, better calibration of the false positive rate, and substantially higher computational efficiency for identifying multiple causal signals.

Algorithm: Variational expectation-maximization (VEM).

Model and Algorithm

XMAP Model for Individual-level Data

XMAP Model for Summary-level Data
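A hedged reconstruction (my own, inferred from the variational updates in the Algorithm section below rather than copied from the paper) of the summary-level likelihood and priors, with population index $i=1,2$:

\begin{equation}
\hat{\mathbf{b}}_{i} \mid \boldsymbol{\beta}, \boldsymbol{\phi}_{i} \sim \mathcal{N}\!\left(\hat{\mathbf{s}}_{i} \mathbf{R}_{i} \hat{\mathbf{s}}_{i}^{-1}\Big(\sum_{k=1}^{K} \beta_{i k} \boldsymbol{\gamma}_{k}+\boldsymbol{\phi}_{i}\Big),\; c_{i}\, \hat{\mathbf{s}}_{i} \mathbf{R}_{i} \hat{\mathbf{s}}_{i}\right),
\end{equation}
with priors
\begin{equation}
\boldsymbol{\gamma}_{k} \sim \operatorname{Mult}\!\left(1, \tfrac{1}{p}\mathbf{1}_{p}\right), \qquad
\begin{bmatrix}\beta_{1k}\\ \beta_{2k}\end{bmatrix} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{k}), \qquad
\begin{bmatrix}\boldsymbol{\phi}_{1}\\ \boldsymbol{\phi}_{2}\end{bmatrix} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Omega} \otimes \mathbf{I}_{p}),
\end{equation}
where $\hat{\mathbf{s}}_{i}$ denotes the diagonal matrix of standard errors of population $i$. The exact form used in the paper should be checked against the original text.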

Algorithm and Parameter Estimation

Denote the collection of unknown parameters $\boldsymbol{\theta}=\{\boldsymbol{\Sigma}, \boldsymbol{\Omega}, c_{1}, c_{2}\}$, and the collections of latent variables $\boldsymbol{\phi}=\{\boldsymbol{\phi}_{1}, \boldsymbol{\phi}_{2}\}$, $\boldsymbol{\gamma}=\{\boldsymbol{\gamma}_{k}\}_{k=1,\ldots,K}$, and $\boldsymbol{\beta}=\{\beta_{1 k}, \beta_{2 k}\}_{k=1,\ldots,K}$. Obtain the parameter estimates $\hat{\boldsymbol{\theta}}$ and identify causal SNPs with the posterior:
\begin{equation} \operatorname{Pr}(\boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi} \mid \hat{\mathbf{b}}, \hat{\mathbf{s}}, \mathbf{R} ; \hat{\boldsymbol{\theta}})=\frac{\operatorname{Pr}(\hat{\mathbf{b}}, \boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi} \mid \hat{\mathbf{s}}, \mathbf{R} ; \hat{\boldsymbol{\theta}})}{\operatorname{Pr}(\hat{\mathbf{b}} \mid \hat{\mathbf{s}}, \mathbf{R} ; \hat{\boldsymbol{\theta}})}. \end{equation}

  1. First step: Apply LDSC to estimate the parameters $c_{1}$, $c_{2}$, and $\boldsymbol{\Omega}$.
    • For $\boldsymbol{\Omega}$, the diagonal terms $\omega_{1}$ and $\omega_{2}$ are estimated by the per-SNP heritabilities of the corresponding populations using LDSC. The off-diagonal term $\omega_{12}$ is estimated by the per-SNP co-heritability obtained via bivariate LDSC.
    • The inflation constants $c_{1}$ and $c_{2}$ are estimated by the LDSC intercepts of the two populations.
  2. Second step: Variational expectation-maximization (VEM) algorithm to estimate $\boldsymbol{\Sigma}$.
    • Derive a lower bound of the logarithm of the marginal likelihood:
      \begin{equation}\begin{aligned}
      \log \operatorname{Pr}\left(\hat{\mathbf{b}} \mid \hat{\mathbf{s}}, \mathbf{R} ; \hat{\boldsymbol{\Omega}}, \hat{c}_{1}, \hat{c}_{2}, \boldsymbol{\Sigma}\right) & \geq \sum_{\boldsymbol{\gamma}} \iint q(\boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi}) \log \frac{\operatorname{Pr}\left(\hat{\mathbf{b}}, \boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi} \mid \hat{\mathbf{s}}, \mathbf{R}; \hat{\boldsymbol{\Omega}}, \hat{c}_{1}, \hat{c}_{2}, \boldsymbol{\Sigma}\right)}{q(\boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi})} \, d\boldsymbol{\beta} \, d\boldsymbol{\phi} \\
      & = \mathbb{E}_{q}\left[\log \operatorname{Pr}\left(\hat{\mathbf{b}}, \boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi} \mid \hat{\mathbf{s}}, \mathbf{R} ; \hat{\boldsymbol{\Omega}}, \hat{c}_{1}, \hat{c}_{2}, \boldsymbol{\Sigma}\right)-\log q(\boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi})\right] \\
      & \equiv \mathcal{L}_{q}(\boldsymbol{\Sigma}),
      \end{aligned}\end{equation}
    • Factorizable formulation of the mean field variational approximation:
      \begin{equation} q(\boldsymbol{\gamma}, \boldsymbol{\beta}, \boldsymbol{\phi})=\prod_{k=1}^{K} q\left(\mathbf{b}_{1 k}, \mathbf{b}_{2 k}\right) q(\boldsymbol{\phi})=\prod_{k=1}^{K} q\left(\boldsymbol{\gamma}_{k}\right) q\left(\beta_{1 k}, \beta_{2 k} \mid \boldsymbol{\gamma}_{k}\right) q(\boldsymbol{\phi}), \end{equation}
      where $q\left(\mathbf{b}_{1 k}, \mathbf{b}_{2 k}\right)=q\left(\boldsymbol{\gamma}_{k}\right) q\left(\beta_{1 k}, \beta_{2 k} \mid \boldsymbol{\gamma}_{k}\right)$ and $q(\boldsymbol{\phi})$ are the distributions of $\left\{\mathbf{b}_{1 k}, \mathbf{b}_{2 k}\right\}$ and $\boldsymbol{\phi}$ under the variational approximation, respectively.
    • E-step: Variational distributions at the $t$-th iteration are given as:
      \begin{equation}\begin{aligned}
      & q\left(\boldsymbol{\gamma}_{k} \mid \boldsymbol{\Sigma}^{(t)}\right)=\operatorname{Mult}\left(1, \tilde{\boldsymbol{\pi}}_{k}\right), \\
      & q\left(\left[\begin{array}{l} \beta_{1 k} \\ \beta_{2 k} \end{array}\right] \,\middle|\, \gamma_{k j}=1, \boldsymbol{\Sigma}^{(t)}\right)=\mathcal{N}\left(\tilde{\boldsymbol{\mu}}_{k j}, \tilde{\boldsymbol{\Sigma}}_{k j}\right), \\
      & q\left(\left[\begin{array}{l} \boldsymbol{\phi}_{1} \\ \boldsymbol{\phi}_{2} \end{array}\right] \,\middle|\, \boldsymbol{\Sigma}^{(t)}\right)=\mathcal{N}(\tilde{\boldsymbol{v}}, \tilde{\boldsymbol{\Lambda}}),
      \end{aligned}\end{equation}
      where $\tilde{\boldsymbol{\pi}}_{k}=\left[\tilde{\pi}_{k 1}, \ldots, \tilde{\pi}_{k p}\right]^{T} \in[0,1]^{p}$, $\tilde{\boldsymbol{\Sigma}}_{k j} \in \mathbb{R}^{2 \times 2}$, $\tilde{\boldsymbol{\mu}}_{k j} \in \mathbb{R}^{2}$, $\tilde{\boldsymbol{\Lambda}} \in \mathbb{R}^{2 p \times 2 p}$, and $\tilde{\boldsymbol{v}} \in \mathbb{R}^{2 p}$ are variational parameters, which are given as
      \begin{equation}\begin{aligned}
      \tilde{\pi}_{k j} & =\operatorname{softmax}\left(-\log p+\frac{1}{2} \log \left|\tilde{\boldsymbol{\Sigma}}_{k j}\right|+\frac{1}{2} \tilde{\boldsymbol{\mu}}_{k j}^{T} \tilde{\boldsymbol{\Sigma}}_{k j}^{-1} \tilde{\boldsymbol{\mu}}_{k j}\right), \\
      \tilde{\boldsymbol{\Sigma}}_{k j} & =\left[\begin{array}{cc} \tilde{\sigma}_{k j, 1}^{2} & \tilde{\sigma}_{k j, 12} \\ \tilde{\sigma}_{k j, 12} & \tilde{\sigma}_{k j, 2}^{2} \end{array}\right]=\left(\left[\begin{array}{cc} \frac{r_{1 j j}}{\hat{c}_{1} \hat{s}_{1 j}^{2}} & 0 \\ 0 & \frac{r_{2 j j}}{\hat{c}_{2} \hat{s}_{2 j}^{2}} \end{array}\right]+\left(\boldsymbol{\Sigma}_{k}^{(t)}\right)^{-1}\right)^{-1}, \\
      \tilde{\boldsymbol{\mu}}_{k j} & =\left[\begin{array}{l} \tilde{\mu}_{k j, 1} \\ \tilde{\mu}_{k j, 2} \end{array}\right]=\tilde{\boldsymbol{\Sigma}}_{k j}\left(\left[\begin{array}{l} \frac{\hat{b}_{1 j}}{\hat{c}_{1} \hat{s}_{1 j}^{2}} \\ \frac{\hat{b}_{2 j}}{\hat{c}_{2} \hat{s}_{2 j}^{2}} \end{array}\right]-\left[\begin{array}{cc} \frac{\mathbf{R}_{1 j}^{T}}{\hat{c}_{1} \hat{s}_{1 j}^{2}} & \mathbf{0} \\ \mathbf{0} & \frac{\mathbf{R}_{2 j}^{T}}{\hat{c}_{2} \hat{s}_{2 j}^{2}} \end{array}\right]\left(\sum_{k^{\prime} \neq k}^{K} \tilde{\boldsymbol{\mu}}_{k^{\prime} j} \otimes \tilde{\boldsymbol{\pi}}_{k^{\prime}}+\tilde{\boldsymbol{v}}\right)\right), \\
      \tilde{\boldsymbol{\Lambda}} & =\left(\left[\begin{array}{cc} \frac{\hat{\mathbf{s}}_{1}^{-1} \mathbf{R}_{1} \hat{\mathbf{s}}_{1}^{-1}}{\hat{c}_{1}} & \mathbf{0} \\ \mathbf{0} & \frac{\hat{\mathbf{s}}_{2}^{-1} \mathbf{R}_{2} \hat{\mathbf{s}}_{2}^{-1}}{\hat{c}_{2}} \end{array}\right]+\hat{\boldsymbol{\Omega}}^{-1} \otimes \mathbf{I}_{p}\right)^{-1}, \\
      \tilde{\boldsymbol{v}} & =\tilde{\boldsymbol{\Lambda}}\left(\left[\begin{array}{c} \frac{\hat{\mathbf{s}}_{1}^{-2} \hat{\mathbf{b}}_{1}}{\hat{c}_{1}} \\ \frac{\hat{\mathbf{s}}_{2}^{-2} \hat{\mathbf{b}}_{2}}{\hat{c}_{2}} \end{array}\right]-\left[\begin{array}{cc} \frac{\hat{\mathbf{s}}_{1}^{-1} \mathbf{R}_{1} \hat{\mathbf{s}}_{1}^{-1}}{\hat{c}_{1}} & \mathbf{0} \\ \mathbf{0} & \frac{\hat{\mathbf{s}}_{2}^{-1} \mathbf{R}_{2} \hat{\mathbf{s}}_{2}^{-1}}{\hat{c}_{2}} \end{array}\right]\left(\sum_{k=1}^{K} \tilde{\boldsymbol{\mu}}_{k j} \otimes \tilde{\boldsymbol{\pi}}_{k}\right)\right),
      \end{aligned}\end{equation}
      where softmax denotes the softmax function, which ensures $\sum_{j=1}^{p} \tilde{\pi}_{k j}=1$, and $\otimes$ is the Kronecker product. The lower bound $\mathcal{L}_{q}(\boldsymbol{\Sigma})$ can be analytically evaluated as
      \begin{aligned}
      \mathcal{L}_{q}\left(\boldsymbol{\Sigma} \mid \boldsymbol{\Sigma}^{(t)}\right) & =\left(\sum_{k}^{K} \tilde{\boldsymbol{\mu}}_{k j} \otimes \tilde{\boldsymbol{\pi}}_{k}+\tilde{\boldsymbol{v}}\right)^{T}\left[\begin{array}{c} \frac{\hat{\mathbf{s}}_{1}^{-2} \hat{\mathbf{b}}_{1}}{\hat{c}_{1}} \\ \frac{\hat{\mathbf{s}}_{2}^{-2} \hat{\mathbf{b}}_{2}}{\hat{c}_{2}} \end{array}\right]-\frac{1}{2}\left(\sum_{k}^{K} \tilde{\boldsymbol{\mu}}_{k j} \otimes \tilde{\boldsymbol{\pi}}_{k}+\tilde{\boldsymbol{v}}\right)^{T}\left[\begin{array}{cc} \frac{\hat{\mathbf{s}}_{1}^{-1} \mathbf{R}_{1} \hat{\mathbf{s}}_{1}^{-1}}{\hat{c}_{1}} & \mathbf{0} \\ \mathbf{0} & \frac{\hat{\mathbf{s}}_{2}^{-1} \mathbf{R}_{2} \hat{\mathbf{s}}_{2}^{-1}}{\hat{c}_{2}} \end{array}\right]\left(\sum_{k}^{K} \tilde{\boldsymbol{\mu}}_{k j} \otimes \tilde{\boldsymbol{\pi}}_{k}+\tilde{\boldsymbol{v}}\right) \\
      & -\sum_{j}^{p} \frac{r_{1 j j}}{2 \hat{c}_{1} \hat{s}_{1 j}^{2}} \sum_{k}^{K} \tilde{\pi}_{k j}\left(\tilde{\mu}_{k j, 1}^{2}+\tilde{\sigma}_{k j, 1}^{2}\right)-\sum_{j}^{p} \frac{r_{2 j j}}{2 \hat{c}_{2} \hat{s}_{2 j}^{2}} \sum_{k}^{K} \tilde{\pi}_{k j}\left(\tilde{\mu}_{k j, 2}^{2}+\tilde{\sigma}_{k j, 2}^{2}\right) \\
      & +\frac{1}{2} \sum_{k}^{K}\left(\tilde{\boldsymbol{\mu}}_{k j} \otimes \tilde{\boldsymbol{\pi}}_{k}\right)^{T}\left[\begin{array}{cc} \frac{\hat{\mathbf{s}}_{1}^{-1} \mathbf{R}_{1} \hat{\mathbf{s}}_{1}^{-1}}{\hat{c}_{1}} & \mathbf{0} \\ \mathbf{0} & \frac{\hat{\mathbf{s}}_{2}^{-1} \mathbf{R}_{2} \hat{\mathbf{s}}_{2}^{-1}}{\hat{c}_{2}} \end{array}\right]\left(\tilde{\boldsymbol{\mu}}_{k j} \otimes \tilde{\boldsymbol{\pi}}_{k}\right)-\frac{1}{2} \sum_{k}^{K} \sum_{j}^{p} \tilde{\pi}_{k j} \operatorname{Tr}\left(\boldsymbol{\Sigma}_{k}^{-1}\left(\tilde{\boldsymbol{\Sigma}}_{k j}+\tilde{\boldsymbol{\mu}}_{k j} \tilde{\boldsymbol{\mu}}_{k j}^{T}\right)\right) \\
      & -\frac{p}{2} \log |2 \pi \hat{\boldsymbol{\Omega}}|-\frac{1}{2} \tilde{\boldsymbol{v}}^{T}\left(\hat{\boldsymbol{\Omega}}^{-1} \otimes \mathbf{I}_{p}\right) \tilde{\boldsymbol{v}}-\frac{1}{2} \operatorname{Tr}\left(\left(\left[\begin{array}{cc} \frac{\hat{\mathbf{s}}_{1}^{-1} \mathbf{R}_{1} \hat{\mathbf{s}}_{1}^{-1}}{\hat{c}_{1}} & \mathbf{0} \\ \mathbf{0} & \frac{\hat{\mathbf{s}}_{2}^{-1} \mathbf{R}_{2} \hat{\mathbf{s}}_{2}^{-1}}{\hat{c}_{2}} \end{array}\right]+\hat{\boldsymbol{\Omega}}^{-1} \otimes \mathbf{I}_{p}\right) \tilde{\boldsymbol{\Lambda}}\right) \\
      & +\sum_{j}^{p} \sum_{k}^{K} \tilde{\pi}_{k j} \log \frac{1}{p}-\sum_{j}^{p} \sum_{k}^{K} \tilde{\pi}_{k j} \log \tilde{\pi}_{k j}+\frac{1}{2} \sum_{j}^{p} \sum_{k}^{K} \tilde{\pi}_{k j}\left(\log \left|\tilde{\boldsymbol{\Sigma}}_{k j}\right|-\log \left|\boldsymbol{\Sigma}_{k}\right|\right)+\frac{1}{2} \log |\tilde{\boldsymbol{\Lambda}}|+\text{ constant, }
      \end{aligned}
      where $\operatorname{Tr}(\mathbf{B})$ denotes the trace of the square matrix $\mathbf{B}$, and the constant term does not involve $\boldsymbol{\Sigma}$.
    • M-step: Solve $\frac{\partial \mathcal{L}_{q}}{\partial \boldsymbol{\Sigma}_{k}}=\mathbf{0}$ to obtain the update equation of $\boldsymbol{\Sigma}_{k}$ (a compact numerical sketch of these E- and M-step updates follows this list):
      \begin{equation} \boldsymbol{\Sigma}_{k}^{(t+1)}=\sum_{j=1}^{p} \tilde{\pi}_{k j}\left(\tilde{\boldsymbol{\mu}}_{k j} \tilde{\boldsymbol{\mu}}_{k j}^{T}+\tilde{\boldsymbol{\Sigma}}_{k j}\right) \end{equation}
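As referenced in the M-step bullet above, here is a minimal numpy sketch (my own, with hypothetical names, not the XMAP package) of the per-SNP E-step updates for a single effect $k$ and the M-step update of $\boldsymbol{\Sigma}_k$. It assumes the LD matrices have unit diagonals ($r_{ijj}=1$) and that the contributions of the other $K-1$ effects and the polygenic component have already been residualized out of the inputs `b1`, `b2`.

```python
import numpy as np

def e_step_single_effect(b1, b2, s1, s2, c1, c2, Sigma_k):
    """Update (Sigma_tilde_kj, mu_tilde_kj, pi_tilde_kj) for SNPs j = 1..p.
    b1, b2: residualized marginal effect estimates; s1, s2: standard errors;
    c1, c2: inflation factors; Sigma_k: current 2x2 effect-size covariance."""
    p = b1.shape[0]
    Sigma_k_inv = np.linalg.inv(Sigma_k)
    Sigma_tilde = np.zeros((p, 2, 2))
    mu_tilde = np.zeros((p, 2))
    log_w = np.zeros(p)  # unnormalized log of pi_tilde_kj
    for j in range(p):
        # Per-SNP precision from the two GWASs, deflated by the inflation factors
        A_j = np.diag([1.0 / (c1 * s1[j] ** 2), 1.0 / (c2 * s2[j] ** 2)])
        Sigma_tilde[j] = np.linalg.inv(A_j + Sigma_k_inv)
        rhs_j = np.array([b1[j] / (c1 * s1[j] ** 2), b2[j] / (c2 * s2[j] ** 2)])
        mu_tilde[j] = Sigma_tilde[j] @ rhs_j
        log_w[j] = (-np.log(p)
                    + 0.5 * np.linalg.slogdet(Sigma_tilde[j])[1]
                    + 0.5 * mu_tilde[j] @ np.linalg.solve(Sigma_tilde[j], mu_tilde[j]))
    pi_tilde = np.exp(log_w - log_w.max())
    pi_tilde /= pi_tilde.sum()  # softmax over SNPs
    return Sigma_tilde, mu_tilde, pi_tilde

def m_step_Sigma_k(Sigma_tilde, mu_tilde, pi_tilde):
    """Sigma_k^(t+1) = sum_j pi_kj (mu_kj mu_kj^T + Sigma_kj)."""
    outer = mu_tilde[:, :, None] * mu_tilde[:, None, :]
    return np.einsum("j,jab->ab", pi_tilde, outer + Sigma_tilde)
```

A full VEM loop would alternate this E-step over $k=1,\ldots,K$ (recomputing the residuals each time) and the closed-form update of the polygenic posterior $(\tilde{\boldsymbol{v}}, \tilde{\boldsymbol{\Lambda}})$ with the M-step, until $\mathcal{L}_q$ converges.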

Identification of Causal Variants and Construction of Credible Sets (Output)
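A hedged sketch of the SuSiE-style outputs that XMAP's sum-of-single-effects structure suggests: the per-SNP posterior inclusion probability (PIP) combines the $K$ single-effect weights $\tilde{\pi}_{kj}$, and a level-$\rho$ credible set for effect $k$ is the smallest set of SNPs whose weights sum to at least $\rho$. Whether XMAP applies additional filtering (e.g. a purity check on credible sets) is not covered in these notes; the names below are illustrative.

```python
import numpy as np

def pips(pi_tilde):
    """pi_tilde: (K, p) array of posterior inclusion weights, one row per effect.
    PIP_j = 1 - prod_k (1 - pi_tilde[k, j])."""
    return 1.0 - np.prod(1.0 - pi_tilde, axis=0)

def credible_set(pi_tilde_k, level=0.99):
    """Smallest set of SNP indices whose weights for effect k sum to >= level
    (weights are assumed to sum to 1 over SNPs)."""
    order = np.argsort(pi_tilde_k)[::-1]
    csum = np.cumsum(pi_tilde_k[order])
    n_keep = int(np.searchsorted(csum, level)) + 1
    return order[:n_keep]
```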

Conclusion

Comparisons with other methods (DAP-G, FINEMAP, SuSiE, SuSiE-inf, PAINTOR, MsCAVIAR, and SuSiEx in the simulation study; SuSiE, SuSiE-inf, and SuSiEx in the real data analysis) show that XMAP has three key features: greater statistical power, better-calibrated false positive rates, and substantially higher computational efficiency for identifying multiple causal signals.

Simulation Study

Fig. 2: a Manhattan plots. b Heat maps showing the absolute correlations between the three causal SNPs and their nearby SNPs in two populations. c Comparisons of FDR control. d,e CPU timings. f Comparisons of statistical power.
Fig. 3: a Comparison of FDR control. b Estimated LDSC intercepts. c Comparisons of ROC curves. d An illustrative example with a single causal signal.

Real Data Analysis

  1. LDL GWASs:
    • Data: GWASs of AFR and EAS (by GLGC) and EUR (by UKBB and GLGC).

    • LD reference: For AFR, the LD matrices were estimated using 3,072 African individuals from UKBB as reference samples (a minimal LD-matrix sketch follows Fig. 4 below).

    • Confounding bias: The LDSC intercepts estimated from all LDL GWASs were not substantially different from one, suggesting negligible confounding bias here.
    • Credibility: Evaluated the replication rate using an independent LDL GWAS from the EUR population.
    • Improvement of power: Use rs900776 as an example to show that the improvement in fine-mapping power and resolution by XMAP comes from leveraging genetic diversity. [Should we compare the results of XMAP in the 3 populations separately and altogether?]
Fig. 4: a # causal signals identified by XMAP and SuSiE with different PIP thresholds. b The LD score distribution of putative causal SNPs identified by XMAP. c-f Fine-mapping of locus 21.4 Mbp–22.4 Mbp in chromosome 8. g Absolute correlation in EUR and AFR among the SNPs within the level-99% credible set. The SNP rs900776 is highlighted in the heat map.
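As noted in the LD-reference bullet above, a minimal illustration (hypothetical names; real pipelines also harmonize alleles and often apply shrinkage/regularization) of estimating an LD correlation matrix from a reference genotype panel:

```python
import numpy as np

def ld_from_reference(G):
    """G: (n_ref, p) genotype dosage matrix for the reference samples
    (e.g. ancestry-matched individuals such as the African UKBB subset above).
    Returns the p x p SNP-SNP Pearson correlation matrix R."""
    return np.corrcoef(G, rowvar=False)
```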
  2. Height GWASs: well known to be affected by population structure.
    • Aim: Investigate the ability of XMAP to correct confounding bias and reduce false positive signals.
    • Process: First apply fine-mapping methods to the discovery GWAS datasets, then evaluate credibility in replication datasets from different population backgrounds (a small replication-rate sketch follows Fig. 5 below).
    • Discovery data (strong confounding bias): EUR GWAS from UKBB and a Chinese GWAS.
    • Replication data (negligible confounding bias): a within-sibship GWAS from the European population, known to be less confounded by population structure, and a GWAS from the BBJ cohort (EAS background).
    • Result: rs2053005 could be a false positive; XMAP was able to exclude this signal by correcting the confounding bias.
Fig. 5: a-d Overview of replication analyses of high-PIP fine-mapped SNPs across populations: bar charts of the fraction and number of fine-mapped SNPs with p-value < 5e−8 in the EUR sibship and BBJ replication cohorts, and bar charts of the PIP distribution (computed by SuSiE) for fine-mapped SNPs in the same replication cohorts. e-i Fine-mapping of locus 66.55 Mbp–66.85 Mbp in chromosome 15.
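As referenced in the Process bullet above, a small sketch (hypothetical names) of the replication-rate metric summarized in Fig. 5: the fraction of fine-mapped SNPs (discovery PIP above a threshold) that reach genome-wide significance in an independent replication GWAS.

```python
import numpy as np

def replication_rate(discovery_pips, replication_pvals,
                     pip_threshold=0.9, p_threshold=5e-8):
    """Fraction of fine-mapped SNPs (PIP >= pip_threshold in the discovery
    analysis) whose replication p-values pass genome-wide significance."""
    fine_mapped = discovery_pips >= pip_threshold
    if not np.any(fine_mapped):
        return np.nan
    return float(np.mean(replication_pvals[fine_mapped] < p_threshold))
```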
  3. Multiple causal signals: XMAP was able to identify multiple causal signals within a locus.
    • Data: Same as in 2 (the height GWASs).
    • Results: XMAP, MsCAVIAR, and SuSiEx robustly identified multiple causal signals when the sample size decreased. XMAP is robust to the choice of $K$, the maximum number of causal signals.
Fig. 6: a Distributions of the number of putative causal SNPs identified by XMAP under different PIP thresholds. b,c The p-value / PIP distributions in the sibship GWAS replication cohort, with the PIP threshold set to 0.9. d-h A demonstrative example using the locus 130.2 Mbp–130.5 Mbp in chromosome 6; rs1415701 and rs6569648 had high probability of being causal.
  4. Single-cell data integration: XMAP results can be effectively integrated with single-cell datasets to identify disease/trait-relevant cells.
    • Data: Blood traits, together with an scATAC-seq dataset encompassing multiple hematopoietic lineages.
    • Process: ...
    • Result: Better interpretation of risk variants in their relevant cellular context, gaining biological insights into causal mechanisms at single-cell resolution.

Limitations

  1. Assumptions: XMAP assumes that the causal variants are shared across populations, which may not be true for some signals (same as PAINTOR and MsCAVIAR).
  2. Disproportionate distribution of causal variants: Causal variants are reported to be distributed disproportionately in the genome, depending on the functional context of the genomic regions.
  3. Gene-level effects: Gene-level effects can be more stably shared across populations, as compared to SNP-level effects. Leveraging the genetic diversity at the gene-level for fine-mapping can be an interesting direction.

Reference

Cai M, Wang Z, Xiao J, et al. XMAP: Cross-population fine-mapping by leveraging genetic diversity and accounting for confounding bias. Nature Communications, 2023, 14(1): 6870.

TO-DO

  1. Follow up on the bracketed question in Real Data Analysis item 1 (compare XMAP results across the 3 populations separately and jointly).