Title: | Microbiome Mixture Analysis |
---|---|
Description: | Evaluate whether a microbiome sample is a mixture of two samples, by fitting a model for the number of read counts as a function of single nucleotide polymorphism (SNP) allele and the genotypes of two potential source samples. Lobo et al. (2021) <doi:10.1093/g3journal/jkab308>. |
Authors: | Karl W Broman [aut, cre] |
Maintainer: | Karl W Broman <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.4 |
Built: | 2024-11-22 03:21:10 UTC |
Source: | https://github.com/kbroman/mbmixture |
Perform a parametric bootstrap to assess whether there is significant evidence that a sample is a mixture.
bootstrapNull( tab, n_rep = 1000, interval = c(0, 1), tol = 0.000001, check_boundary = TRUE, cores = 1, return_raw = TRUE )
tab |
Dataset of read counts as 3d array of size 3x3x2, genotype in first sample x genotype in second sample x allele in read. |
n_rep |
Number of bootstrap replicates |
interval |
Interval to which each parameter should be constrained |
tol |
Tolerance for convergence |
check_boundary |
If TRUE, explicitly check the boundaries of |
cores |
Number of CPU cores to use, for parallel calculations.
(If |
return_raw |
If TRUE, return the raw results. If FALSE, just return the p-value. |
If return_raw=FALSE, a single numeric value (the p-value). If return_raw=TRUE, a vector of length n_rep with the LRT statistics from each bootstrap replicate.
data(mbmixdata)
# just 100 bootstrap replicates, as an illustration
bootstrapNull(mbmixdata, n_rep=100)
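As an illustration of how a vector of bootstrap LRT statistics can be turned into a p-value, here is a minimal base R sketch. The values in `null_lrt` and `obs_lrt` are made up for the example, and the two formulas shown are common conventions; which one, if either, the package uses internally is not stated here.

```r
# Hypothetical stand-in for the vector returned with return_raw=TRUE:
# LRT statistics from replicates simulated under the null hypothesis
null_lrt <- c(0.1, 0.5, 1.2, 2.3, 0.0, 3.8, 0.7, 1.9, 0.2, 5.1)

# Hypothetical LRT statistic observed for the real data
obs_lrt <- 3.0

# Empirical p-value: share of null replicates at least as large as observed
pval <- mean(null_lrt >= obs_lrt)

# A common variant that adds one to numerator and denominator,
# so the estimate can never be exactly zero
pval_adj <- (sum(null_lrt >= obs_lrt) + 1) / (length(null_lrt) + 1)
```

With only a modest number of replicates, the smallest nonzero value the first formula can give is 1/n_rep, which is one reason to increase n_rep for real analyses.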
Perform a parametric bootstrap to get estimated standard errors.
bootstrapSE( tab, n_rep = 1000, interval = c(0, 1), tol = 0.000001, check_boundary = FALSE, cores = 1, return_raw = FALSE )
tab |
Dataset of read counts as 3d array of size 3x3x2, genotype in first sample x genotype in second sample x allele in read. |
n_rep |
Number of bootstrap replicates |
interval |
Interval to which each parameter should be constrained |
tol |
Tolerance for convergence |
check_boundary |
If TRUE, explicitly check the boundaries of |
cores |
Number of CPU cores to use, for parallel calculations.
(If |
return_raw |
If TRUE, return the raw results. If FALSE, just return the estimated standard errors. |
If return_raw=FALSE, a vector of two standard errors. If return_raw=TRUE, a matrix of size n_rep x 2 with the detailed bootstrap results.
data(mbmixdata)
# just 100 bootstrap replicates, as an illustration
bootstrapSE(mbmixdata, n_rep=100)
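The relationship between the raw n_rep x 2 matrix and the two standard errors can be sketched in base R as below. The matrix `boot_est` is simulated purely for illustration, and its column names `p` and `e` are assumptions rather than the package's documented output.

```r
set.seed(1)
n_rep <- 100

# Hypothetical stand-in for the raw bootstrap results:
# one row per replicate, one column per parameter estimate
boot_est <- cbind(p = rnorm(n_rep, mean = 0.74, sd = 0.01),
                  e = rnorm(n_rep, mean = 0.002, sd = 0.0002))

# Bootstrap standard errors: standard deviation of each column
se <- apply(boot_est, 2, sd)

# The raw results also allow other summaries, e.g. a percentile interval for p
ci_p <- quantile(boot_est[, "p"], c(0.025, 0.975))
```

Keeping the raw matrix is useful when you want summaries other than the standard deviation, such as the percentile interval shown in the last line.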
Calculate the log likelihood for the microbiome sample mixture model at particular values of p and e.
mbmix_loglik(tab, p, e = 0)
tab |
Dataset of read counts as 3d array of size 3x3x2, genotype in first sample x genotype in second sample x allele in read. |
p |
Contaminant probability (proportion of mixture coming from the second sample). |
e |
Sequencing error rate. |
The log likelihood evaluated at p and e.
data(mbmixdata)
mbmix_loglik(mbmixdata, p=0.74, e=0.002)
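To give a feel for what such a log likelihood might look like, here is a base R sketch. It is not the package's source code. It assumes that genotypes are ordered AA, AB, BB, that the third dimension is ordered (allele A, allele B), that a read from a homozygote shows the wrong allele with probability e, and that each read comes from the second sample with probability p. The function name `sketch_loglik` is hypothetical.

```r
# Sketch of a two-sample mixture log likelihood (assumptions in the text above)
sketch_loglik <- function(tab, p, e = 0) {
    stopifnot(length(dim(tab)) == 3, all(dim(tab) == c(3, 3, 2)))

    # P(read shows allele B | genotype AA, AB, BB), with sequencing error e
    prB <- c(e, 0.5, 1 - e)

    ll <- 0
    for (i in 1:3) {        # genotype in first sample
        for (j in 1:3) {    # genotype in second sample
            # read comes from sample 2 with probability p, else from sample 1
            qB <- (1 - p) * prB[i] + p * prB[j]
            pr <- c(1 - qB, qB)
            n <- tab[i, j, ]
            # use only non-empty cells so that 0 * log(0) never produces NaN
            ll <- ll + sum(n[n > 0] * log(pr[n > 0]))
        }
    }
    ll
}

# Toy data: 30 A reads and 70 B reads at SNPs where sample 1 is AA, sample 2 is BB
toy <- array(0, dim = c(3, 3, 2))
toy[1, 3, ] <- c(30, 70)
ll_07 <- sketch_loglik(toy, p = 0.7)
ll_05 <- sketch_loglik(toy, p = 0.5)
```

Under these assumptions and with e = 0, the toy data reduce to a binomial likelihood in p, so the value at p = 0.7 is higher than at p = 0.5.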
Example dataset for mbmixture package.
data(mbmixdata)
Dataset of read counts as 3d array of size 3x3x2, genotype in first sample x genotype in second sample x allele in read.
data(mbmixdata)
mle_pe(mbmixdata)
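To make the 3x3x2 layout concrete, the following base R sketch builds a small array of the same shape by hand. The dimension names and the counts are hypothetical; the labels actually used in mbmixdata may differ.

```r
geno <- c("AA", "AB", "BB")

# Empty table: genotype in sample 1 x genotype in sample 2 x allele in read
tab <- array(0, dim = c(3, 3, 2),
             dimnames = list(sample1 = geno, sample2 = geno,
                             allele = c("A", "B")))

# Made-up read counts for two genotype combinations
tab["AA", "BB", ] <- c(30, 70)
tab["AB", "AB", ] <- c(52, 48)

# Total read depth for each pair of genotypes (a 3x3 matrix)
depth <- apply(tab, c(1, 2), sum)
```

Cells where the two samples have different genotypes are the informative ones, since identical genotypes predict the same allele frequencies whichever sample the reads came from.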
Calculate the MLE of the sequencing error rate e for a fixed value of the contaminant probability p.
mle_e( tab, p = 0.05, interval = c(0, 1), tol = 0.000001, check_boundary = FALSE )
tab |
Dataset of read counts as 3d array of size 3x3x2, genotype in first sample x genotype in second sample x allele in read. |
p |
Assumed value for the contaminant probability |
interval |
Interval to which each parameter should be constrained |
tol |
Tolerance for convergence |
check_boundary |
If TRUE, explicitly check the boundaries of |
A single numeric value, the MLE of e, with the log likelihood as an attribute.
data(mbmixdata)
mle_e(mbmixdata, p=0.74)
Calculate the MLE of the contaminant probability p for a fixed value of the sequencing error rate e.
mle_p( tab, e = 0.002, interval = c(0, 1), tol = 0.000001, check_boundary = FALSE )
tab |
Dataset of read counts as 3d array of size 3x3x2, genotype in first sample x genotype in second sample x allele in read. |
e |
Assumed value for the sequencing error rate |
interval |
Interval to which each parameter should be constrained |
tol |
Tolerance for convergence |
check_boundary |
If TRUE, explicitly check the boundaries of |
A single numeric value, the MLE of p, with the log likelihood as an attribute.
data(mbmixdata)
mle_p(mbmixdata, e=0.002)
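The roles of interval, tol, and check_boundary in a one-parameter fit can be illustrated with a base R sketch built on `stats::optimize()`. Whether the package uses `optimize()` internally is an assumption here, and both `toy_loglik` and `maximize_1d` are hypothetical names introduced for the example.

```r
# Toy log likelihood in p with e = 0: 30 A reads and 70 B reads at SNPs
# where sample 1 is AA and sample 2 is BB
toy_loglik <- function(p) 30 * log(1 - p) + 70 * log(p)

maximize_1d <- function(f, interval = c(0, 1), tol = 1e-6,
                        check_boundary = FALSE) {
    # Search the interior of the interval for the maximum
    out <- optimize(f, interval = interval, maximum = TRUE, tol = tol)
    est <- out$maximum
    ll <- out$objective

    if (check_boundary) {
        # optimize() does not evaluate the endpoints, so compare them explicitly
        for (b in interval) {
            ll_b <- f(b)
            if (is.finite(ll_b) && ll_b > ll) {
                est <- b
                ll <- ll_b
            }
        }
    }

    # Return the estimate with the log likelihood attached as an attribute
    structure(est, loglik = ll)
}

fit <- maximize_1d(toy_loglik, check_boundary = TRUE)
```

For this toy likelihood the maximum is in the interior, near p = 0.7, and both endpoints give a log likelihood of -Inf, so the boundary check leaves the estimate unchanged.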
Find the joint MLEs of p and e for the microbiome mixture model.
mle_pe( tab, interval = c(0, 1), tol = 0.000001, check_boundary = FALSE, SE = FALSE )
tab |
Dataset of read counts as 3d array of size 3x3x2, genotype in first sample x genotype in second sample x allele in read. |
interval |
Interval to which each parameter should be constrained |
tol |
Tolerance for convergence |
check_boundary |
If TRUE, explicitly check the boundaries of |
SE |
If TRUE, get estimated standard errors. |
A vector containing the estimates of p and e along with the evaluated log likelihood and likelihood ratio test statistics for the hypotheses p=0 and p=1.
data(mbmixdata)
mle_pe(mbmixdata)
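The likelihood ratio test statistics for p=0 and p=1 can be illustrated with a simplified base R sketch. As a simplification it holds e fixed at 0.01 rather than re-estimating it under each hypothesis, so it is not a reproduction of what `mle_pe()` computes. The toy likelihood and the names `lrt_p0` and `lrt_p1` are hypothetical.

```r
e <- 0.01  # sequencing error rate, held fixed for this illustration

# Toy log likelihood: 30 A reads and 70 B reads at SNPs where sample 1 is AA
# and sample 2 is BB; qB is the probability that a read shows allele B
toy_loglik <- function(p) {
    qB <- (1 - p) * e + p * (1 - e)
    30 * log(1 - qB) + 70 * log(qB)
}

# Maximize over p in (0, 1)
fit <- optimize(toy_loglik, interval = c(0, 1), maximum = TRUE, tol = 1e-6)

# LRT statistic: twice the gap between the maximized log likelihood
# and the log likelihood under each null value of p
lrt_p0 <- 2 * (fit$objective - toy_loglik(0))
lrt_p1 <- 2 * (fit$objective - toy_loglik(1))
```

In this toy example both statistics are large, and the one for p=0 is larger because 70% B reads is further from what a pure first sample would produce than from what a pure second sample would.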