
Finimom employs a beta-binomial prior for model dimension $d$:

$$\mathbb{P}(d = k \mid a, b) = \binom{p}{k}\frac{B(a + k, p - k + b)}{B(a, b)}, \quad a, b > 0, \quad d = 1, \dots, K,$$

where $p$ is the number of variants, $K$ is the maximum model size, and $B(\cdot, \cdot)$ is the beta function. Priors of this form with $a = 1$ and $b = p^u$, $u > 1$, are discussed in Castillo and van der Vaart (2012) and in Castillo et al. (2015). The parameter $u$ controls how much prior mass is placed on smaller models: larger values of $u$ give more prior mass to smaller models.
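To make the formula concrete, the following is a minimal base-R sketch that evaluates it on the log scale and renormalises over the allowed model sizes. The helper `log_bb_prior()` is hypothetical and written only for illustration; it is not finimom's `dbb()`, and the values of `p` and `K` are illustrative.

```r
# Hypothetical helper (not part of finimom): log prior mass for model size k,
# written term by term from the beta-binomial formula above.
log_bb_prior <- function(k, p, a, b) {
  lchoose(p, k) + lbeta(a + k, p - k + b) - lbeta(a, b)
}

p <- 363   # number of variants (illustrative)
K <- 10    # maximum model size
a <- 1
u <- 1.5
b <- p^u

# The prior is restricted to k = 1, ..., K, so renormalise over that range.
log_w <- log_bb_prior(seq_len(K), p = p, a = a, b = b)
prior <- exp(log_w - max(log_w))  # subtract the maximum for numerical stability
prior <- prior / sum(prior)

round(prior, 4)
```

Working with `lchoose()` and `lbeta()` keeps the computation stable when `b = p^u` is large, where the beta functions themselves would underflow.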

Using a linkage disequilibrium (LD) matrix from an external dataset tends to increase the false positive rate. In our formulation, the parameter $u$ provides a flexible way to adjust for this. The default values are $u = 2$ when using an in-sample LD matrix and $u = 2.25$ when using an out-of-sample LD matrix.
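As a rough illustration of what the two defaults imply, the sketch below compares the prior mass placed on models with more than one variant under $u = 2$ and $u = 2.25$. It assumes $a = 1$, a maximum model size of 10, and $p = 363$ (the size of the example dataset used later in this document); `log_bb_prior()` and `prior_size()` are hypothetical helpers, not finimom functions.

```r
# Hypothetical helpers (not part of finimom), written from the prior formula.
log_bb_prior <- function(k, p, a, b) {
  lchoose(p, k) + lbeta(a + k, p - k + b) - lbeta(a, b)
}

# Normalised prior over model sizes 1, ..., K for a given u (with b = p^u).
prior_size <- function(u, p, K = 10, a = 1) {
  log_w <- log_bb_prior(seq_len(K), p = p, a = a, b = p^u)
  w <- exp(log_w - max(log_w))
  w / sum(w)
}

p <- 363  # assumed number of variants

# Prior probability that the model contains more than one variant.
mass_gt1 <- sapply(
  c(in_sample = 2, out_of_sample = 2.25),
  function(u) sum(prior_size(u, p)[-1])
)
mass_gt1
```

The larger out-of-sample default leaves less prior mass on multi-variant models, which is the sense in which it counteracts the extra false positives from an external LD matrix.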

We demonstrate the prior for model dimension using the example dataset:

library(finimom)

(p <- length(exampledata$betahat))
#> [1] 363

maxsize <- 10

a <- 1
u <- 1.5

val <- exp(sapply(seq_len(maxsize), dbb, p = p, a = a, b = p^u))
(val <- val/sum(val))
#>  [1] 9.502616e-01 4.727099e-02 2.345333e-03 1.160564e-04 5.727772e-06
#>  [6] 2.819359e-07 1.384076e-08 6.776590e-10 3.309028e-11 1.611478e-12

plot(val, type = "b", ylim = c(0, 1))

And for different values of $u$:


us <- c(1.05, 1.5, 2, 2.25)

vals <- lapply(us, function(u) {
  b <- p^u
  out <- exp(sapply(seq_len(maxsize), dbb, a = a, p = p, b = b))
  # Renormalise over model sizes 1, ..., maxsize and return the result
  out / sum(out)
})

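plot(vals[[1]], type = "b", ylim = c(0, 1), xlab = "Model size", ylab = "Prior probability")
invisible(lapply(2:4, function(i) lines(vals[[i]], type = "b", lty = i)))
# Line types 1 to 4 correspond to the entries of us, in order
legend("topright", legend = paste("u =", us), lty = seq_along(us), pch = 1)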

The same on a log scale:


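plot(vals[[1]], type = "b", log = "y", ylim = range(unlist(vals)), xlab = "Model size", ylab = "Prior probability")
invisible(lapply(2:4, function(i) lines(vals[[i]], type = "b", lty = i)))
# Line types 1 to 4 correspond to the entries of us, in order
legend("bottomleft", legend = paste("u =", us), lty = seq_along(us), pch = 1)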
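On the log scale the prior decays almost linearly in the model size. One way to see this: for $a = 1$, the ratio of successive prior probabilities simplifies to $\mathbb{P}(d = k + 1) / \mathbb{P}(d = k) = (p - k) / (p - k - 1 + b)$, which changes very little with $k$ when $k \ll p$. The sketch below checks this closed form numerically against the formula, assuming $p = 363$ and $u = 1.5$; `log_bb_prior()` is a hypothetical helper, not a finimom function.

```r
p <- 363
a <- 1
u <- 1.5
b <- p^u
k <- 1:9

# Hypothetical helper (not part of finimom): log prior mass for model size k.
log_bb_prior <- function(k) {
  lchoose(p, k) + lbeta(a + k, p - k + b) - lbeta(a, b)
}

# Ratio P(d = k + 1) / P(d = k): closed form for a = 1 versus direct evaluation.
ratio_closed <- (p - k) / (p - k - 1 + b)
ratio_direct <- exp(log_bb_prior(k + 1) - log_bb_prior(k))

all.equal(ratio_closed, ratio_direct)
range(ratio_closed)
```

Because the ratio is nearly constant, each additional variant costs roughly the same multiplicative penalty, and that penalty becomes more severe as $u$ (and hence $b$) increases.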

References

Castillo and van der Vaart (2012). Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences. The Annals of Statistics.

Castillo et al. (2015). Bayesian linear regression with sparse priors. The Annals of Statistics.

Session information


sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] finimom_0.2.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.37     desc_1.4.3        R6_2.5.1          fastmap_1.2.0    
#>  [5] xfun_0.49         cachem_1.1.0      knitr_1.48        htmltools_0.5.8.1
#>  [9] rmarkdown_2.29    lifecycle_1.0.4   cli_3.6.3         sass_0.4.9       
#> [13] pkgdown_2.1.1     textshaping_0.4.0 jquerylib_0.1.4   systemfonts_1.1.0
#> [17] compiler_4.4.2    highr_0.11        tools_4.4.2       ragg_1.3.3       
#> [21] evaluate_1.0.1    bslib_0.8.0       Rcpp_1.0.13-1     yaml_2.3.10      
#> [25] jsonlite_1.8.9    rlang_1.1.4       fs_1.6.5