--- title: GWASapi User Guide output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{GWASapi User Guide} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8](inputenc) --- ```{r knitr_options, include=FALSE} library(knitr) opts_chunk$set(eval=FALSE) options(width=95) ``` The [GWASapi](https://github.com/rqtl/GWASapi) package provides access to the [NHGRI-EBI catalog of GWAS summary statistics](https://www.ebi.ac.uk/gwas). For details on the API, see [its documentation](https://www.ebi.ac.uk/gwas/summary-statistics/docs/), as well as [Pjotr Prins](https://thebird.nl/)'s [documentation at github](https://github.com/pjotrp/racket-summary-stats). ## Installation You can install GWASapi from [GitHub](https://github.com/rqtl/GWASapi). You first need to install the [remotes](https://remotes.r-lib.org). ```{r install_remotes} install.packages("remotes") ``` Then use `remotes::install_github()` to install GWASapi. ```{r install_with_remotes} library(remotes) install_github("rqtl/GWASapi") ``` Load the package with `library()`. ```{r load_package} library(GWASapi) ``` ## Lists of things The purpose of the GWASapi package is to provide access to summary statistics for human GWAS. First, you can get lists of studies and traits that are available. To get lists of studies, use `list_studies()`. The default is to return just 20 studies. You can control that limit with the argument `size`. You can also use `start` to step through the full set. ```{r list_studies} list_studies(size=5) ``` To retrieve _all_ studies, set a higher limit ```{r list_all_studies} all_studies <- list_studies(size=2000) ``` To get a list of traits, use `list_traits()`. Again the default is to return just 20 values. To get all traits, use the `size` argument. ```{r list_all_traits} all_traits <- list_traits(size=2000) ``` The traits are returned as identifiers like `"EFO_0000249"`. To get a description of a trait, you can use the ontology lookup service, for example Chromosomes are stored as integers 1-24. ```{r list_chr} list_chr() ``` ## Get associations To get associations for a specific variant by its rs-number, use `get_variant()`. If you know the chromosome it is on, you'll get faster results by providing the chromosome. And again, the default is to return just 20 values, so use the `size` and `start` arguments if you want a comprehensive list. The result is a data frame with columns such as `base_pair_location`, `p_value`, `study_accession`, and `trait`. ```{r get_variant} result <- get_variant("rs2228603", 19, size=5) ``` Use the arguments `p_lower` and `p_upper` to focus on associations with p-value in a specified range. For example, to get all of the associations with p-value < 10^-10^, you would do: ```{r get_variant_pupper} result <- get_variant("rs2228603", 19, p_upper=1e-10) ``` To get associations for a specific region, use `get_asso()`. For example, to get the region from 19.2 Mbp to 19.3 Mbp on chr 19: ```{r get_asso} result <- get_asso(chr=19, bp_lower=19200000, bp_upper=19300000) ``` You can restrict those results to a particular study. ```{r get_asso_by_study} result <- get_asso(chr=19, bp_lower=19200000, bp_upper=19300000, study="GCST000392") ``` To get associations for a given trait, use `get_trait_asso()`. You can't restrict this to a given chromosome region. ```{r get_trait_asso} result <- get_trait_asso("EFO_0001360", p_upper=1e-100, size=5) ```