| Title: | Integrated Interface of Bayesian Single Index Models using 'nimble' |
|---|---|
| Description: | Provides tools for fitting Bayesian single index models with flexible choices of priors for both the index and the link function. The package implements model estimation and posterior inference using efficient MCMC algorithms built on the 'nimble' framework, allowing users to specify, extend, and simulate models in a unified and reproducible manner. The following methods are implemented in the package: Antoniadis et al. (2004) <https://www.jstor.org/stable/24307224>, Wang (2009) <doi:10.1016/j.csda.2008.12.010>, Choi et al. (2011) <doi:10.1080/10485251003768019>, Dhara et al. (2019) <doi:10.1214/19-BA1170>, McGee et al. (2023) <doi:10.1111/biom.13569>. |
| Authors: | Seowoo Jung [aut, cre], Eun-kyung Lee [aut] |
| Maintainer: | Seowoo Jung <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.0.4 |
| Built: | 2026-05-28 08:17:56 UTC |
| Source: | https://github.com/seowoo-jung/bayessim |
Create a fitted bsim object by combining a BayesSIM
setup object with MCMC samples returned by runMCMC().
as_bsim(object, mcmc.out, ...) ## S3 method for class 'bsimSetup' as_bsim(object, mcmc.out, ...)as_bsim(object, mcmc.out, ...) ## S3 method for class 'bsimSetup' as_bsim(object, mcmc.out, ...)
object |
A |
mcmc.out |
MCMC output corresponding to the result of a call to |
... |
Additional arguments passed to other methods. |
This function is mainly intended for workflows where the model structure
and the MCMC sampling are performed separately. It collects the MCMC draws across chains, and returns an object of class "bsim"
that can be used with generic functions such as summary(), plot(), and predict().
An object of class "bsim" containing posterior samples,
point estimates, fitted values, and related model information.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
Extracts fitted result that contains posterior mean/median prediction of response variable, and standard error, credible interval of each data point in data.frame.
## S3 method for class 'bsimPred' as.data.frame(x, ...)## S3 method for class 'bsimPred' as.data.frame(x, ...)
x |
An object of class "bsimPred" which is the result of |
... |
Additional arguments passed to other methods. |
data.frame of prediction information.
data("DATA1") simdata2 <- data.frame(DATA1$X, y = DATA1$y) fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) pred <- predict(fit_one, se.fit = TRUE, interval = "credible",level = 0.8) results <- as.data.frame(pred) summary(results)data("DATA1") simdata2 <- data.frame(DATA1$X, y = DATA1$y) fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) pred <- predict(fit_one, se.fit = TRUE, interval = "credible",level = 0.8) results <- as.data.frame(pred) summary(results)
Fitting a single–index model in single integrated function.
BayesSIM( formula, data, indexprior = "fisher", link = "bspline", prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) BayesSIM_setup( formula, data, indexprior = "fisher", link = "bspline", prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) ## S3 method for class 'bsim' print(x, digits = 3, ...) ## S3 method for class 'bsimSetup' print(x, digits = 3, ...)BayesSIM( formula, data, indexprior = "fisher", link = "bspline", prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) BayesSIM_setup( formula, data, indexprior = "fisher", link = "bspline", prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) ## S3 method for class 'bsim' print(x, digits = 3, ...) ## S3 method for class 'bsimSetup' print(x, digits = 3, ...)
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
indexprior |
Index vector prior among |
link |
Link function among |
prior |
Optional named list of prior settings. Further descriptions are in every specific model's Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in every specific model's Details section. |
method |
Character, |
lowerB |
This parameter is only for |
upperB |
This parameter is only for |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
x |
A fitted |
digits |
Number of digits to display. |
... |
Additional arguments. |
Integrated function for Bayesian single-index model. Default model is von-Mises Fisher distribution for index vector with B-spline link function.
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
bsFisher(), bsSphere(), bsPolar(), bsSpike(),
gpFisher(), gpSphere(), gpPolar(), gpPolarHigh(), gpSpike()
set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version - bsFisher fit1 <- BayesSIM(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version- bsFisher models <- BayesSIM_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version - bsFisher fit1 <- BayesSIM(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version- bsFisher models <- BayesSIM_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
Fits a single-index model
where the link is represented by B-spline and the
index vector has von Mises–Fisher prior.
bsFisher( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsFisher_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )bsFisher( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsFisher_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single–index model uses a -order polynomial spline with interior knots as follows:
on with
and . are spline coefficients and , are boundary knots where ,
and .
Priors
von Mises–Fisher prior on the index with direction and concentration.
Conditioned on and , the link coefficients follow
normal distribution with estimated mean vector and
covariance , where is the B-spline basis design matrix.
Inverse gamma prior on with shape parameter and rate parameter .
Sampling
Random walk metropolis algorithm is used for index vector . Given , and are sampled from posterior distribution. Further sampling method is described in Antoniadis et al(2004).
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector: von Mises–Fisher prior for the projection vector (index).
index_direction is a unit direction vector of the von Mises–Fisher distribution.
By default, the estimated vector from projection pursuit regression is assigned.
index_dispersion is the positive concentration parameter. By default, 150 is assigned.
Link function: B-spline basis and coefficient of B-spline setup.
basis: For the basis of B-spline, link_basis_df is the number of basis functions (default 21),
link_basis_degree is the spline degree (default 2) and
link_basis_delta is a small jitter for boundary knots spacing control (default 0.001).
beta: For the coefficient of B-spline, multivariate normal prior is assigned with mean link_beta_mu, and covariance link_beta_cov.
By default, is assigned.
Error variance (sigma2): An Inverse gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
(default sigma2_shape = 0.001, sigma2_rate = 100)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.
Index vector: Initial unit index vector . By default, the vector is randomly sampled from the von Mises–Fisher prior.
Link function: Initial spline coefficients (link_beta). By default,
is computed,
where is the B-spline basis design matrix.
Error variance (sigma2): Initial scalar error variance (default 0.01).
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
Antoniadis, A., Grégoire, G., & McKeague, I. W. (2004). Bayesian estimation in single-index models. Statistica Sinica, 1147-1164.
Hornik, K., & Grün, B. (2014). movMF: an R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58, 1-31.
set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsFisher(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsFisher_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsFisher(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsFisher_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
Fits a single-index model
where the link is represented by B-spline link function and the
index vector is parameterized on the unit sphere via a one-to-one polar transformation.
bsPolar( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsPolar_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )bsPolar( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsPolar_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single–index model uses a -order polynomial spline with interior knots as follows:
on with
and . are spline coefficient and , are boundary knots where ,
and . lies on the unit sphere ()
to ensure identifiability and is parameterized via a one-to-one polar transformation with angle , where is the dimension of predictor.
Sampling is performed on the angular parameters defining
the index vector.
The mapping is
The vector is then scaled to unit length.
Priors
is dimension of polar angle of index vector and re-scaled Beta distribution on is assigned.
Conditioned on and , the link coefficients follow
normal distribution with estimated mean vector and
covariance , where is the B-spline basis design matrix.
Inverse gamma prior on with shape parameter and rate parameter .
Sampling Samplers are automatically assigned by nimble.
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector:
Only shape parameter index_psi_alpha of dimension vector is needed since rate parameters is computed to satisfy .
By default, the shape parameter for each element of the polar vector is set to 5000.
Link function: B-spline basis and coefficient of B-spline setup.
basis: For the basis of B-spline, link_basis_df is the number of basis functions (default 21), link_basis_degree is the spline degree (default 2) and
link_basis_delta is a small jitter for boundary-knot spacing control (default 0.001).
beta: For the coefficient of B-spline, multivariate normal prior is assigned with mean link_beta_mu, and covariance link_beta_cov.
By default, is assigned.
Error variance (sigma2): An Inverse gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
(default sigma2_shape = 0.001, sigma2_rate = 100)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.
Index vector: Initial vector of polar angle index_psi ( dimension). Each element of angle is between 0 and .
By default, it is randomly draw from uniform distribution.
Link function: Initial spline coefficients(link_beta). By default,
is computed,
where is the B-spline basis design matrix.
Error variance (sigma2): Initial scalar error variance (default 0.01).
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
Antoniadis, A., Grégoire, G., & McKeague, I. W. (2004). Bayesian estimation in single-index models. Statistica Sinica, 1147-1164.
Hornik, K., & Grün, B. (2014). movMF: an R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58, 1-31.
Dhara, K., Lipsitz, S., Pati, D., & Sinha, D. (2019). A new Bayesian single index model with or without covariates missing at random. Bayesian analysis, 15(3), 759.
set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsPolar(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsPolar_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsPolar(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsPolar_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
Fits a single-index model
where the link is represented by B-spline link and the
index vector is on half-unit sphere.
bsSphere( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsSphere_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )bsSphere( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsSphere_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single–index model uses a -order polynomial spline with interior knots and intercept as follows:
on with
and . are spline coefficients and , are boundary knots where ,
and . Variable selection is encoded by a binary vector , equivalently
setting components of to zero when .
Priors
The variable selection indicator has Beta–Bernoulli hierarchy prior. Beta hyper-prior on the Bernoulli–inclusion probability ,
inducing where .
are shape and rate parameter of beta distribution.
Free‑knot prior: the number of knots with mean . The knot locations have a Dirichlet prior on the scaled interval .
Index vector prior is uniform on the half‑sphere of dimension with first nonzero positive.
Conjugate normal–inverse-gamma on enables analytic integration for RJMCMC with covariance .
Sampling Posterior exploration follows a hybrid RJMCMC with six move types:
add/remove predictor , update , add/delete/relocate a knot. The update is a random‑walk
Metropolis via local rotations in a two‑coordinate subspace. Knot changes use simple proposals with tractable acceptance ratios.
Further sampling method is described in Wang (2009).
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector: index_nu_r1, index_nu_r2 gives the shape and rate parameter of beta-binomial prior, respectively. (default index_nu_r1 = 1, index_nu_r2 = 1).
Link function: B-spline knots, basis and coefficient setup.
knots: Free-knot prior for the spline. link_knots_lambda_k is the Poisson mean for the number of
interior knot (default 5). link_knots_maxknots is the maximum number of interior knots.
If link_knots_maxknots is NULL, the number of interior knots is randomly drawn from a Poisson distribution.
basis: For the basis of B-spline, link_basis_degree is the spline
degree (default 2).
beta: For the coefficient of B-spline,
By default, link_beta_mu is a zero vector, link_beta_tau is set to the sample size,
and link_beta_Sigma0 is the identity matrix of dimension ,
where is the number of interior knots and is the spline order.
Error variance (sigma2): Inverse gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
Small values for shape and rate parameters yield a weakly-informative prior on . (default sigma2_shape = 0.001, sigma2_rate = 0.001)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.
Index vector: index_nu is binary vector indicating active predictors for the index.
index is initial unit-norm index vector (automatically normalized if necessary, with the first nonzero element made positive for identifiability).
The vector length must match the number of predictor.
Ideally, positions where index_nu has a value of 1 should correspond to nonzero elements in ; elements corresponding to index_nu = 0 will be set to zero.
Link function: link_k is initial number of interior knots. link_knots is initial vector of interior knot positions in , automatically scaled to the true boundary.
Length of this vector should be equal to the initial value of k.
link_beta is initial vector of spline coefficients. Length should be equal to the initial number of basis functions with intercept ().
Error variance (sigma2): Initial scalar error variance. (default 0.01)
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
Wang, H.-B. (2009). Bayesian estimation and variable selection for single index models. Computational Statistics & Data Analysis, 53, 2617–2627.
Hornik, K., & Grün, B. (2014). movMF: an R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58, 1-31.
set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsSphere(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsSphere_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsSphere(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsSphere_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
Fits a single-index model
where the link is represented by B-spline link function and the
index vector has spike-and-slab prior.
bsSpike( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsSpike_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )bsSpike( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) bsSpike_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single–index model uses a -order polynomial spline with interior knots as follows:
on with
and . are spline coefficient and and are boundary knots where ,
and . is a p-dimensional index vector subject to a spike-and-slab
prior for variable selection with binary indicator variable .
Priors
The variable selection indicator has Beta–Bernoulli hierarchy prior. Beta hyper-prior on the Bernoulli–inclusion probability ,
inducing where .
are shape and rate parameter of beta distribution.
Slab coefficients have normal distribution with zero mean and variance.
Conditioned on and , the link coefficients follow
normal distribution with estimated mean vector and
covariance , where is the B-spline basis design matrix.
Inverse gamma prior on with shape parameter and rate parameter .
Sampling Samplers are automatically assigned by nimble.
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector: index_nu_r1, index_nu_r2 gives the shape and rate parameter of beta-binomial prior, respectively.
For slab prior, normal distribution with zero mean is assigned for selected variables . index_sigma_theta is for variance of , and it is assigned 0.25 by default.
Link function: B-spline basis and coefficient of B-spline setup.
basis: For the basis of B-spline, link_basis_df is the number of basis functions (default 21), link_basis_degree is the spline degree (default 2) and link_basis_delta is a small jitter for boundary-knot spacing control (default 0.01).
beta: For the coefficient of B-spline, multivariate normal prior is assigned with mean link_beta_mu, and covariance link_beta_cov. By default,
Error variance (sigma2): Inverse gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
(default sigma2_shape = 0.001, sigma2_rate = 100)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.
Index vector:
index_pi Initial selecting variable probability. (default: 0.5)
index_nu Initial vector of inclusion indicators . By default, each nu is randomly drawn by
index Initial vector of index. By default, each element of index vector, which is chosen by , is proposed by normal distribution.
Link function: Initial spline coefficients (link_beta). By default,
is computed,
where is the B-spline basis design matrix.
Error variance (sigma2): Initial scalar error variance (default 0.01).
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
Antoniadis, A., Grégoire, G., & McKeague, I. W. (2004). Bayesian estimation in single-index models. Statistica Sinica, 1147-1164.
Hornik, K., & Grün, B. (2014). movMF: an R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58, 1-31.
McGee, G., Wilson, A., Webster, T. F., & Coull, B. A. (2023). Bayesian multiple index models for environmental mixtures. Biometrics, 79(1), 462-474.
set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsSpike(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsSpike_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- bsSpike(y ~ ., data = simdata, niter = 5000, nburnin = 1000, nchain = 1) # Split version models <- bsSpike_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
BayesSIM
Computes posterior summaries of the single-index model index vector
from a fitted BayesSIM. Users may choose either the
posterior mean or median as the point estimate.
## S3 method for class 'bsim' coef(object, method = c("mean", "median"), se = FALSE, ...)## S3 method for class 'bsim' coef(object, method = c("mean", "median"), se = FALSE, ...)
object |
A fitted object of |
method |
Character string indicating the summary statistic to compute.
Options are |
se |
Logical value whether computing standard error for index estimates.
If |
... |
Additional arguments passed to other methods. |
A numeric vector or data.frame of estimated coefficient and standard error of the index vector.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
Compiles a nimble model object and a corresponding (uncompiled) MCMC algorithm and returns the compiled pair.
compileModelAndMCMC(object)compileModelAndMCMC(object)
object |
Class |
The function first compiles nimble model object, then compiles nimble sampler.
Both compiled model and compiled MCMC samplers are returned as a list.
A list with two elements:
modelthe compiled NIMBLE model (external pointer object).
mcmcthe compiled MCMC function/algorithm bound to the model.
nimbleModel,
configureMCMC,
buildMCMC,
compileNimble,
runMCMC
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
Concrete compressive strength dataset from the UCI Machine Learning Repository. No missing variables and there are 8 quantitative inputs and 1 quantitative output.
data(concrete)data(concrete)
Numeric data.frame of size 1030 8.
Numeric. Cement content (kg/m).
Numeric. Blast furnace slag (kg/m).
Numeric. Fly ash (kg/m).
Numeric. Mixing water (kg/m).
Numeric. Superplasticizer (kg/m).
Numeric. Coarse aggregate (kg/m).
Numeric. Fine aggregate (kg/m).
Numeric. Curing age (days; typically 1–365).
Numeric. Concrete compressive strength (MPa).
Source Concrete Compressive Strength in UCI Machine Learning Repository. This data is integrated by experimental data from 17 different sources to check the realiability of the strength. This dataset compiles experimental concrete mixes from multiple studies and is used to predict compressive strength and quantify how mixture ingredients and curing age affect that strength.
Variables.
Cement, Blast Furnace Slag, Fly Ash, Water, Superplasticizer, Coarse Aggregate, Fine Aggregate: quantities in kg per m of mixture.
Age: curing time in days (1–365).
Target(strength): compressive strength in MPa.
Yeh, I. (1998). Concrete Compressive Strength [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5PK67.
Yeh, I. (1998). Modeling of strength of high-performance concrete using artificial neural networks. Cement and Concrete research, 28(12), 1797-1808.
data(concrete) str(concrete) plot(density(concrete$strength), main = "Concrete compressive strength (MPa)")data(concrete) str(concrete) plot(density(concrete$strength), main = "Concrete compressive strength (MPa)")
Synthetic data from a single–index model
with and .
The index vector is and
.
data(DATA1)data(DATA1)
Numeric matrix of size 200 4, entries i.i.d.
.
Numeric vector of length 200.
data(DATA1) str(DATA1)data(DATA1) str(DATA1)
BayesSIM
Computes fitted values from a BayesSIM, using either the
posterior mean or median of the estimated link function with index values.
Fitted values can be returned on the latent scale or on the linear
predictor scale.
## S3 method for class 'bsim' fitted( object, type = c("latent", "linpred"), method = c("mean", "median"), ... )## S3 method for class 'bsim' fitted( object, type = c("latent", "linpred"), method = c("mean", "median"), ... )
object |
A fitted object of |
type |
Character string indicating the scale on which fitted values
are returned. Default is
|
method |
Character string specifying the summary statistic used to
compute the fitted values. Options are |
... |
Additional arguments passed to other methods. |
A numeric vector of fitted values.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
compileModelAndMCMC
Return compiled nimble model object and a corresponding MCMC samplers.
getModel(object) getSampler(object)getModel(object) getSampler(object)
object |
The result of |
getModel returns compiled Nimble model object.
getSampler returns compiled Nimble sampler object, directly using in runMCMC function.
nimbleModel,
configureMCMC,
buildMCMC,
compileNimble,
runMCMC
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
Functions for getting list of initial values of the nimble model.
getInit(object)getInit(object)
object |
A fitted object of |
The list of initial values are returned. This can be helpful to use when you use BayesSIM_setup.
You should be aware of that if you want to get more than 1 chain of MCMC samples, you should change nchain argument in BayesSIM_setup.
The output of initial values would be different, depending on the number of chain.
You can apply BayesSIM object when you want to check the list of initial values.
BUGS code of the model definition.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
Functions for identifying definition of the nimble model.
getModelDef(object)getModelDef(object)
object |
A fitted object of |
The function that contain Bayes SIM model structure in nimble. This function is for advanced users. There are several functions used in the model definition.
transX_fisher, transX_sp: Making B-spline basis.
dvMFnim: Distribution of von Mises-Fisher.
dKnotsSimple: Distribution of the free knots for bsSphere.
dunitSphere: Distribution of unit sphere.
alphaTheta: One-to-one polar transformation. Making index vector from individual angular vector psi.
expcov_gpSphere, expcov_gpPolar, expcov_gpSpike: Covariance kernel of each model.
expcov_gpSphere uses squared-exponential kernel, expcov_gpPolar uses OU process kernel, and expcov_gpSpike uses squared-exponential including its own parameter, .
Xlinear: Making linear combination with index vector.
BUGS code of the model definition.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- bsFisher_setup(y ~ ., data = simdata2) # Get model definition getModelDef(models)simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- bsFisher_setup(y ~ ., data = simdata2) # Get model definition getModelDef(models)
Functions for retrieving the variables that can be monitored.
getVarMonitor(object, type = c("name", "list"))getVarMonitor(object, type = c("name", "list"))
object |
A fitted object of |
type |
Options for variables. By default, |
The function returns a list of variables that can be used in monitors2 in the bayesSIM function.
You can also refer to getModelDef to understand the model structure and designate necessary variables.
Stochastic nodes of the model are recommended to be monitored.
A vector of variables that can be monitored in the model.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) # Get monitorable variables getVarMonitor(models) # Get the list of variables with dimension getVarMonitor(models, type = "list")simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) # Get monitorable variables getVarMonitor(models) # Get the list of variables with dimension getVarMonitor(models, type = "list")
BayesSIM
Generic function applied to BayesSIM.
It extracts goodness of fit of the BayesSIM.
GOF(object) ## S3 method for class 'bsim' GOF(object, ...)GOF(object) ## S3 method for class 'bsim' GOF(object, ...)
object |
A fitted object of |
... |
Additional arguments passed to other methods. |
Mean squared error of model with mean of MCMC sample is applied.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
Fits a single–index model , where
the index lies on the unit sphere with von Mises-Fisher prior, and the link is represented
with Gaussian process.
gpFisher( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpFisher_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )gpFisher( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpFisher_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single-index model uses Gaussian process with zero mean and and covariance kernel as a link function, where is index value.
Index vector should be length 1.
Priors
von Mises–Fisher prior on the index with direction and concentration.
Covariance kernel: Amplitude, , follows log normal distribution with mean and variance .
Length-scale parameter follows gamma distribution with shape parameter and rate parameter .
Inverse gamma prior on with shape parameter and rate parameter .
Sampling All sampling parameters' samplers are assigned by nimble.
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector: von Mises–Fisher prior for the projection vector (index).
index_direction is a unit direction vector of the von Mises–Fisher distribution.
By default, the estimated vector from projection pursuit regression is assigned.
index_dispersion is the positive concentration parameter. By default, 150 is assigned.
Link function:
Length-scale:Gamma distribution is assigned for length-scale parameter, .
link_lengthscale_shape is shape parameter (default 1/8) and link_lengthscale_rate is rate parameter of lengthscale. (default 1/8)
Amplitude: Log-normal distribution is assigned for amplitude parameter, .
link_amp_a is mean (default -1), and link_amp_b is variance. (default 1)
Error variance (sigma2): An inverse-gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
(default sigma2_shape = 1, sigma2_rate = 1)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.
Index vector (index): Initial unit index vector . By default, the vector is sampled from the von Mises–Fisher prior.
Link function: link_lengthscale is initial scalar length-scale parameter. (default: 0.1)
link_amp is initial scalar amplitude parameter. (default: 1)
Error variance (sigma2): Initial scalar error variance. (default: 1)
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
Antoniadis, A., Grégoire, G., & McKeague, I. W. (2004). Bayesian estimation in single-index models. Statistica Sinica, 1147-1164.
Choi, T., Shi, J. Q., & Wang, B. (2011). A Gaussian process regression approach to a single-index model. Journal of Nonparametric Statistics, 23(1), 21-36.
Hornik, K., & Grün, B. (2014). movMF: an R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58, 1-31.
set.seed(123) N <- 60; p <- 2 x1 <- runif(N, -3, 5) x2 <- runif(N, -3, 5) beta1 <- 0.45; beta2 <- sqrt(1-beta1^2) sigma <- sqrt(0.0036) xlin <- x1*beta1 + x2*beta2 eta <- 0.1*xlin + sin(0.5*xlin)^2 y <- rnorm(N, eta, sigma) x <- matrix(c(x1, x2), ncol = 2) simdata <- data.frame(x = x, y = y) colnames(simdata) <- c("X1", "X2", "y") # One tool version fit1 <- gpFisher(y ~ ., data = simdata, nchain = 1, niter = 1000, nburnin = 100) # Split version models <- gpFisher_setup(y ~ ., data = simdata, nchain = 1) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 1000, nburnin = 100, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)set.seed(123) N <- 60; p <- 2 x1 <- runif(N, -3, 5) x2 <- runif(N, -3, 5) beta1 <- 0.45; beta2 <- sqrt(1-beta1^2) sigma <- sqrt(0.0036) xlin <- x1*beta1 + x2*beta2 eta <- 0.1*xlin + sin(0.5*xlin)^2 y <- rnorm(N, eta, sigma) x <- matrix(c(x1, x2), ncol = 2) simdata <- data.frame(x = x, y = y) colnames(simdata) <- c("X1", "X2", "y") # One tool version fit1 <- gpFisher(y ~ ., data = simdata, nchain = 1, niter = 1000, nburnin = 100) # Split version models <- gpFisher_setup(y ~ ., data = simdata, nchain = 1) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 1000, nburnin = 100, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
Fits a single–index model , where
the index is specified and computed via a one-to-one polar
transformation, and the link is represented with a Gaussian process.
gpPolar( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpPolar_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpPolarHigh( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpPolarHigh_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )gpPolar( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpPolar_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpPolarHigh( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpPolarHigh_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single–index model is specified as ,
where the index vector lies on the unit sphere with () with non-zero first component
to ensure identifiability and is parameterized via a one-to-one polar transformation with angle .
The mapping is
The vector is then scaled to unit length.
Sampling is performed on the angular parameters defining
the index vector. The link function is modeled by a Gaussian process
prior with zero mean and an Ornstein–Uhlenbeck (OU) covariance kernel
, where is a bandwidth (smoothness)
parameter and is index value ().
Priors
is dimension of polar angle of index vector and re-scaled Beta distribution on is assigned for prior.
Prior for (bandwidth parameter) is discrete uniform of equally spaced grid points in ].
Inverse gamma prior on with shape parameter and rate parameter .
Sampling For gpPolar, is sampled by Metropolis-Hastings and updated with ,
is chosen by grid search method that maximizes likelihood,
are sampled from their full conditional
distributions using Gibbs sampling.
Since is sampled by grid search, more than 5 dimension of variables gpPolarHigh is recommended.
For gpPolarHigh, all sampling parameters' samplers are assigned by nimble.
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector: Only shape parameter index_psi_alpha of dimension vector is needed since rate parameters is computed to satisfy .
By default, the shape parameter for each element of the polar vector is set to 5000.
Link function:
link_kappa_min is minimum value of kappa (default 0.5), link_kappa_max is maximum value of kappa (default 4),
and link_kappa_grid_width is space (default 0.1).
Error variance (sigma2): An Inverse gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
(default sigma2_shape = 2, sigma2_rate = 0.01)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.
Index vector: Initial vector of polar angle index_psi with dimension. Each element of angle is between 0 and .
Link function: Initial scalar scale parameter of covariance kernel link_kappa. (default: 2)
Error variance (sigma2): Initial scalar error variance. (default: 0.01)
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
Dhara, K., Lipsitz, S., Pati, D., & Sinha, D. (2019). A new Bayesian single index model with or without covariates missing at random. Bayesian analysis, 15(3), 759.
library(MASS) N <- 100 # Sample Size p <- 3 mu <- c(0,0,0) rho <- 0 Cx <- rbind(c(1,rho,rho), c(rho,1,rho), c(rho, rho,1)) X <- mvrnorm(n = N, mu=mu, Sigma=Cx, tol=1e-8) alpha <- c(1,1,1) alpha <- alpha/sqrt(sum(alpha^2)) z <- matrix(0,N) z <- X %*% alpha z <- z[,1] sigma <- 0.3 f <- exp(z) y <- f + rnorm(N, 0, sd=sigma) # adding noise y <- y-mean(y) f_all <- f x_all <- X y_all <- y simdata <- cbind(x_all, y, f) simdata <- as.data.frame(simdata) colnames(simdata) = c('x1', 'x2', 'x3', 'y','f') prior_gpPolar <- prior_param(indexprior = "polar", link = "gp", link_kappa_max = 2, link_kappa_grid_width = 0.05) # One tool version fit1 <- gpPolar(y ~ x1 + x2 + x3, data = simdata, prior = prior_gpPolar, niter = 3000, nburnin = 2000, nchain = 1) fit2 <- gpPolarHigh(y ~ x1 + x2 + x3, data = simdata, niter = 3000, nburnin = 2000, nchain = 1) # Split version models1 <- gpPolar_setup(y ~ x1 + x2 + x3, data = simdata, prior = prior_gpPolar) models2 <- gpPolarHigh_setup(y ~ x1 + x2 + x3, data = simdata, prior = prior_gpPolar) Ccompile1 <- compileModelAndMCMC(models1) Ccompile2 <- compileModelAndMCMC(models2) sampler1 <- getSampler(Ccompile1) sampler2 <- getSampler(Ccompile2) initList1 <- getInit(models1) initList2 <- getInit(models2) mcmc.out1 <- runMCMC(sampler1, niter = 3000, nburnin = 2000, thin = 1, nchains = 1, setSeed = TRUE, init = initList1, summary = TRUE, samplesAsCodaMCMC = TRUE) mcmc.out2 <- runMCMC(sampler2, niter = 3000, nburnin = 2000, thin = 1, nchains = 1, setSeed = TRUE, init = initList2, summary = TRUE, samplesAsCodaMCMC = TRUE) fit1_split <- as_bsim(models1, mcmc.out1) fit2_split <- as_bsim(models2, mcmc.out2)library(MASS) N <- 100 # Sample Size p <- 3 mu <- c(0,0,0) rho <- 0 Cx <- rbind(c(1,rho,rho), c(rho,1,rho), c(rho, rho,1)) X <- mvrnorm(n = N, mu=mu, Sigma=Cx, tol=1e-8) alpha <- c(1,1,1) alpha <- alpha/sqrt(sum(alpha^2)) z <- matrix(0,N) z <- X %*% alpha z <- z[,1] sigma <- 0.3 f <- exp(z) y <- f + rnorm(N, 0, sd=sigma) # adding noise y <- y-mean(y) f_all <- f x_all <- X y_all <- y simdata <- cbind(x_all, y, f) simdata <- as.data.frame(simdata) colnames(simdata) = c('x1', 'x2', 'x3', 'y','f') prior_gpPolar <- prior_param(indexprior = "polar", link = "gp", link_kappa_max = 2, link_kappa_grid_width = 0.05) # One tool version fit1 <- gpPolar(y ~ x1 + x2 + x3, data = simdata, prior = prior_gpPolar, niter = 3000, nburnin = 2000, nchain = 1) fit2 <- gpPolarHigh(y ~ x1 + x2 + x3, data = simdata, niter = 3000, nburnin = 2000, nchain = 1) # Split version models1 <- gpPolar_setup(y ~ x1 + x2 + x3, data = simdata, prior = prior_gpPolar) models2 <- gpPolarHigh_setup(y ~ x1 + x2 + x3, data = simdata, prior = prior_gpPolar) Ccompile1 <- compileModelAndMCMC(models1) Ccompile2 <- compileModelAndMCMC(models2) sampler1 <- getSampler(Ccompile1) sampler2 <- getSampler(Ccompile2) initList1 <- getInit(models1) initList2 <- getInit(models2) mcmc.out1 <- runMCMC(sampler1, niter = 3000, nburnin = 2000, thin = 1, nchains = 1, setSeed = TRUE, init = initList1, summary = TRUE, samplesAsCodaMCMC = TRUE) mcmc.out2 <- runMCMC(sampler2, niter = 3000, nburnin = 2000, thin = 1, nchains = 1, setSeed = TRUE, init = initList2, summary = TRUE, samplesAsCodaMCMC = TRUE) fit1_split <- as_bsim(models1, mcmc.out1) fit2_split <- as_bsim(models2, mcmc.out2)
Fits a single–index model , where
the index lies on the unit sphere, and the link is represented
with Gaussian process.
gpSphere( formula, data, prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpSphere_setup( formula, data, prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )gpSphere( formula, data, prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpSphere_setup( formula, data, prior = NULL, init = NULL, method = "FB", lowerB = NULL, upperB = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
method |
Character, |
lowerB |
This parameter is only for |
upperB |
This parameter is only for |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single-index model uses Gaussian process with zero mean and and covariance kernel as a link function, where is index value.
Index vector should be length 1.
Priors
von Mises–Fisher prior on the index with direction and concentration.
Covariance kernel: Amplitude, , follows log normal distribution with mean and variance .
Length-scale parameter follows gamma distribution with shape parameter and rate parameter .
Inverse-Gamma prior on with shape parameter and rate parameter .
Sampling In the fully Bayesian approach, , , and
are updated via the Metropolis–Hastings algorithm, while and
are sampled using Gibbs sampling.
In the empirical Bayes approach, , , ,
and are estimated by maximum a posteriori (MAP), and
is sampled from its full conditional posterior distribution.
In the empirical Gibbs sampler, , , and
are estimated by MAP, whereas and are sampled
via Gibbs sampling.
For estimation via MAP, effective sample size or potential scale reduction factor is meaningless.
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector: Nothing to assign.
Link function:
Length-scale:Gamma distribution is assigned for length-scale parameter, .
link_lengthscale_shape is shape parameter (default 1/8) and link_lengthscale_rate is rate parameter of lengthscale. (default 1/8)
Amplitude: Log-normal distribution is assigned for amplitude parameter, .
link_amp_a is mean (default -1), and link_amp_b is variance. (default 1)
Error variance (sigma2): inverse gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
(default sigma2_shape = 1, sigma2_rate = 1)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param.
Index vector (index): Initial unit index vector. By default, vector is randomly drawn from normal distribution and standardized.
Link function: link_lengthscale is initial scalar length-scale parameter. (default: 0.1)
link_amp is initial scalar amplitude parameter. (default: 1)
Error variance (sigma2): Initial scalar error variance. (default: 1)
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
Choi, T., Shi, J. Q., & Wang, B. (2011). A Gaussian process regression approach to a single-index model. Journal of Nonparametric Statistics, 23(1), 21-36.
set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- gpSphere(y ~ ., method = "EB", data = simdata, niter = 1000, nburnin = 100) # Split version models <- gpSphere_setup(y ~ ., method = "EB", data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 1000, nburnin = 100, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) # The estimates computed by MAP - standard error of the esitmate is meaningless. summary(fit2)set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- gpSphere(y ~ ., method = "EB", data = simdata, niter = 1000, nburnin = 100) # Split version models <- gpSphere_setup(y ~ ., method = "EB", data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 1000, nburnin = 100, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) # The estimates computed by MAP - standard error of the esitmate is meaningless. summary(fit2)
Fits a single-index model ,
where index vector has a spike and slab prior and
the link is represented by Gaussian process and the
gpSpike( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpSpike_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )gpSpike( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE ) gpSpike_setup( formula, data, prior = NULL, init = NULL, monitors = NULL, niter = 10000, nburnin = 1000, thin = 1, nchain = 1, setSeed = FALSE )
formula |
an object of class formula. Interaction term is not allowed for single-index model. |
data |
an data frame. |
prior |
Optional named list of prior settings. Further descriptions are in Details section. |
init |
Optional named list of initial values. If the values are not assigned, they are randomly sampled from prior or designated value. Further descriptions are in Details section. |
monitors |
Optional character vector of additional monitor nodes. To check the names of the nodes, fit the |
niter |
Integer. Total MCMC iterations (default |
nburnin |
Integer. Burn-in iterations (default |
thin |
Integer. Thinning for monitors (default |
nchain |
Integer. Number of MCMC chains (default |
setSeed |
Logical or numeric argument. Further details are provided in runMCMC |
Model The single–index model is specified as ,
where is a p-dimensional index vector subject to a spike-and-slab
prior for variable selection. The link function is modeled
using a Gaussian process prior with zero mean and squared exponential covariance
kernel ,
where determines the smoothness of .
The covariance kernel is re-parameterized to where
and
.
Therefore, is sampled in MCMC.
Priors
The variable selection indicator has Beta–Bernoulli hierarchy prior. Beta hyper-prior on the Bernoulli–inclusion probability ,
inducing where .
are shape and rate parameter of beta distribution.
Slab coefficients have normal distribution with zero mean and variance.
GP precision follows gamma distribution with shape parameter , and rate parameter .
Error precision follows gamma distribution with shape parameter , and rate parameter .
Sampling A random walk Metropolis algorithm is used to sample
and a Metropolis-Hastings algorithm is used for the main parameters .
The variance is directly sampled from posterior distribution.
is not directly sampled by MCMC.
Prior hyper-parameters These are the prior hyper-parameters set in the function. You can define new values for each parameter in prior_param.
Index vector: index_r1, index_r2 gives the shape and rate parameter of beta-binomial prior, respectively.
For slab prior, normal distribution with zero mean is assigned for selected variables . index_sigma_theta is for variance of , and it is assigned 0.25 by default.
Link function: Inverse gamma prior is assigned for hyper-parameters
link_inv_lambda_shape is shape parameter and link_inv_lambda_rate is rate parameter of inverse gamma distribution.
(default link_inv_lambda_shape = 1, link_inv_lambda_rate = 0.1)
Error variance (sigma2): An Inverse gamma prior is assigned to
where sigma2_shape is shape parameter and sigma2_rate is rate parameter of inverse gamma distribution.
(default sigma2_shape = 0.001, sigma2_rate = 100)
Initial values These are the initial values set in the function. You can define new values for each initial value in init_param
Index vector:
index_pi: Initial selecting variable probability. (default: 0.5)
index_nu: Initial vector of inclusion indicators . By default, each index_nu is randomly drawn by
index: Initial vector of index. By default, each element of index vector, which is chosen by indicator, is proposed by normal distribution.
Link function: Initial scalar of lambda (link_inv_lambda) for covariance kernel of Gaussian process.
Error variance (sigma2): Initial scalar error variance. (default: 0.01)
A list typically containing:
coefficientsMean estimates of index vector. Return list of model_setup does not include it.
sesMean standard error of index vector. Return list of model_setup does not include it.
residualsResiduals with mean estimates of fitted values. Return list of model_setup does not include it.
fitted.valuesMean estimates of fitted values. Return list of model_setup does not include it.
linear.predictorsMean estimates of single-index values. Return list of model_setup does not include it.
gofGoodness-of-fit. Return list of model_setup function does not include it.
samplesPosterior draws of variables for computing fitted values of the model, including , .
Return list of model_setup does not include it.
inputList of data used in modeling, formula, input values for prior and initial values, and computation time without compiling.
defModelNimble model object.
defSamplerNimble sampler object.
modelNameName of the model.
McGee, G., Wilson, A., Webster, T. F., & Coull, B. A. (2023). Bayesian multiple index models for environmental mixtures. Biometrics, 79(1), 462-474.
set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- gpSpike(y ~ ., data = simdata, niter = 5000, nburnin = 1000) # Split version models <- gpSpike_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)set.seed(123) n <- 200; d <- 4 theta <- c(2, 1, 1, 1); theta <- theta / sqrt(sum(theta^2)) f <- function(u) u^2 * exp(u) sigma <- 0.5 X <- matrix(runif(n * d, -1, 1), nrow = n) index_vals <- as.vector(X %*% theta) y <- f(index_vals) + rnorm(n, 0, sigma) simdata <- data.frame(x = X, y = y) colnames(simdata) <- c(paste0("X", 1:4), "y") # One tool version fit1 <- gpSpike(y ~ ., data = simdata, niter = 5000, nburnin = 1000) # Split version models <- gpSpike_setup(y ~ ., data = simdata) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) fit2 <- as_bsim(models, mcmc.out) summary(fit2)
init_param is a convenience helper that constructs a nested initial value list
for a given combination of index vector and link function.
It starts from the model-specific default prior, and then overwrites only those components for
which the user supplies non-NULL arguments.
This allows users to modify selected hyper-parameters without having to know or manually reconstruct the full nested prior list structure.
init_param( indexprior, link, index = NULL, index_nu = NULL, index_psi = NULL, index_pi = NULL, link_beta = NULL, link_k = NULL, link_knots = NULL, link_lengthscale = NULL, link_amp = NULL, link_kappa = NULL, link_inv_lambda = NULL, sigma2 = NULL )init_param( indexprior, link, index = NULL, index_nu = NULL, index_psi = NULL, index_pi = NULL, link_beta = NULL, link_k = NULL, link_knots = NULL, link_lengthscale = NULL, link_amp = NULL, link_kappa = NULL, link_inv_lambda = NULL, sigma2 = NULL )
indexprior |
Character scalar indicating the prior for the index.
Typically one of |
link |
Character scalar indicating the link function family.
Typically |
index, index_nu, index_psi, index_pi
|
Optional initial values for index and related parameter values. |
link_beta, link_k, link_knots, link_lengthscale, link_amp, link_kappa, link_inv_lambda
|
Optional initial values for components under link functions. |
sigma2 |
Optional numeric scalar giving the initial value of |
init_param(indexprior, link) can be used to obtain the random initial values
list for the requested combination of index prior and link function.
For any argument that is not NULL, the corresponding field in the nested prior list is overwritten.
The detailed meaning and recommended choices for each initial values depend on the specific model, index vector and link function. For those details, please refer to the documentation of the corresponding model-fitting functions.
A nested list with components index, link, and
sigma2.
bsFisher(), bsSphere(), bsPolar(), bsSpike(),
gpFisher(), gpSphere(), gpPolar(), gpPolarHigh(), gpSpike()
## Default initial values for Fisher index + B-spline link: i0 <- init_param("fisher", "bspline") ## Modify only a few initial values: i1 <- init_param( indexprior = "fisher", link = "bspline", index = c(1, 0, 0), # initial direction of the index link_beta = rep(0, 21), # initial values for spline coefficients sigma2 = 0.1 # initial value for sigma^2 ) ## Example with GP link: i2 <- init_param( indexprior = "sphere", link = "gp", link_lengthscale = 0.2, # initial GP length-scale link_amp = 1.5, # initial GP amplitude sigma2 = 1 # initial variance )## Default initial values for Fisher index + B-spline link: i0 <- init_param("fisher", "bspline") ## Modify only a few initial values: i1 <- init_param( indexprior = "fisher", link = "bspline", index = c(1, 0, 0), # initial direction of the index link_beta = rep(0, 21), # initial values for spline coefficients sigma2 = 0.1 # initial value for sigma^2 ) ## Example with GP link: i2 <- init_param( indexprior = "sphere", link = "gp", link_lengthscale = 0.2, # initial GP length-scale link_amp = 1.5, # initial GP amplitude sigma2 = 1 # initial variance )
BayesSIM
Provides traceplot for objects of BayesSIM.
nimTraceplot(x, ...)nimTraceplot(x, ...)
x |
A fitted object of |
... |
Further arguments passed to |
Traceplots for MCMC samples are displayed. By default, the index vector and error variance are only included in the summary. Extra monitor variables are included, if specified.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
BayesSIM
Produce diagnostic plots for a fitted Bayesian single-index model.
## S3 method for class 'bsim' plot(x, method = c("mean", "median"), interval = TRUE, alpha = 0.95, ...) ## S3 method for class 'bsimPred' plot(x, ...)## S3 method for class 'bsim' plot(x, method = c("mean", "median"), interval = TRUE, alpha = 0.95, ...) ## S3 method for class 'bsimPred' plot(x, ...)
x |
A fitted object of |
method |
Character string specifying the summary used for the
posterior fitted values. Options are |
interval |
A logical value indicating whether a credible interval is included in the fitted plot. Default is |
alpha |
Numeric value between 0 and 1 specifying the credible level.
By default, |
... |
Additional arguments passed to underlying plotting functions. |
The function internally calls predict() on the fitted model object
to obtain posterior summaries of . Predicted value of is .
If interval = TRUE, the function requests posterior credible intervals
and overlays them on the fitted curve.
If interval = FALSE, only the posterior mean or median curve is drawn.
The output consists of two plots:
Observed vs Predicted plot: a diagnostic scatter plot comparing actual outcomes with posterior fitted values to visually assess model fit.
Fitted curve plot: posterior as a function of the
estimated single index, optionally with % credible intervals.
predict.bsim(), summary.bsim()
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
BayesSIM_setup
Produce diagnostic plots for a raw data.
## S3 method for class 'bsimSetup' plot(x, select = NULL, ...)## S3 method for class 'bsimSetup' plot(x, select = NULL, ...)
x |
A fitted object of |
select |
A vector of index or name of variables in data. |
... |
Additional arguments passed to underlying plotting functions. |
The function displays scatter plots of the response variable against each predictor variable. Each panel includes a smoothed trend line to help assess the marginal relationship before fitting the model.
A ggplot object is returned invisibly, allowing further modification if needed.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) plot(models, select = c("X1", "X2"))simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) plot(models, select = c("X1", "X2"))
BayesSIM
Generate predictions from a fitted Bayesian single-index model.
## S3 method for class 'bsim' predict( object, newdata = NULL, se.fit = FALSE, type = c("response", "latent", "index"), method = c("mean", "median"), interval = c("none", "credible"), level = 0.95, ... ) ## S3 method for class 'bsimPred' print(x, ...)## S3 method for class 'bsim' predict( object, newdata = NULL, se.fit = FALSE, type = c("response", "latent", "index"), method = c("mean", "median"), interval = c("none", "credible"), level = 0.95, ... ) ## S3 method for class 'bsimPred' print(x, ...)
object |
A fitted object of |
newdata |
Optional data frame for which predictions should be made.
If |
se.fit |
A logical value indicating whether standard errors are required.
Default is |
type |
Character string specifying the type of prediction.
|
method |
Character string determining the posterior summary used for
point predictions. Options are |
interval |
Character string indicating whether a credible interval should be returned. Default is
|
level |
Numeric value between 0 and 1 specifying the credible level.
|
... |
Additional arguments. |
x |
An object of class "bsimPred" which is the result of |
This method extracts MCMC posterior samples stored in a BayesSIM object and computes posterior summaries of:
the posterior predictive response (type "response"),
the latent link function evaluation (type "latent"), or
the single index (type "index").
The key distinction is that "response" incorporates the posterior
variability of the error term , whereas "latent" represents
the noiseless conditional expectation computed directly
from the link function and the posterior draws of .
When interval = "credible", the returned object includes lower and upper
credible bounds computed via posterior quantiles for the chosen prediction
scale.
If newdata is supplied, predictions are evaluated at the new covariate
values by computing the corresponding posterior index
and applying the link function.
A list containing at least the following components:
fittedPosterior mean or median predictions. If se.fit = TRUE or
interval = "credible", standard error and lower, upper bounds of the credible interval is also included.
trueyTrue response value of test data. When true response value is not available, NULL is saved.
idxValueLinear index value is saved where is estimated by method.
levelCredible level.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
prior_param is a convenience helper that constructs a nested prior list
for a given combination of index prior and link function.
It starts from the model-specific default prior, and then overwrites only those components for
which the user supplies non-NULL arguments.
This allows users to modify selected hyper-parameters without having to know or manually reconstruct the full nested prior list structure.
prior_param( indexprior, link, index_direction = NULL, index_dispersion = NULL, index_nu_r1 = NULL, index_nu_r2 = NULL, index_psi_alpha = NULL, index_sigma_theta = NULL, index_r1 = NULL, index_r2 = NULL, link_basis_df = NULL, link_basis_degree = NULL, link_basis_delta = NULL, link_knots_lambda_k = NULL, link_knots_maxknots = NULL, link_beta_mu = NULL, link_beta_cov = NULL, link_beta_tau = NULL, link_beta_Sigma0 = NULL, link_lengthscale_shape = NULL, link_lengthscale_rate = NULL, link_amp_a = NULL, link_amp_b = NULL, link_kappa_min = NULL, link_kappa_max = NULL, link_kappa_grid_width = NULL, link_inv_lambda_shape = NULL, link_inv_lambda_rate = NULL, sigma2_shape = NULL, sigma2_rate = NULL )prior_param( indexprior, link, index_direction = NULL, index_dispersion = NULL, index_nu_r1 = NULL, index_nu_r2 = NULL, index_psi_alpha = NULL, index_sigma_theta = NULL, index_r1 = NULL, index_r2 = NULL, link_basis_df = NULL, link_basis_degree = NULL, link_basis_delta = NULL, link_knots_lambda_k = NULL, link_knots_maxknots = NULL, link_beta_mu = NULL, link_beta_cov = NULL, link_beta_tau = NULL, link_beta_Sigma0 = NULL, link_lengthscale_shape = NULL, link_lengthscale_rate = NULL, link_amp_a = NULL, link_amp_b = NULL, link_kappa_min = NULL, link_kappa_max = NULL, link_kappa_grid_width = NULL, link_inv_lambda_shape = NULL, link_inv_lambda_rate = NULL, sigma2_shape = NULL, sigma2_rate = NULL )
indexprior |
Character scalar indicating the prior for the index.
Typically one of |
link |
Character scalar indicating the link function family.
Typically |
index_direction, index_dispersion, index_nu_r1, index_nu_r2, index_psi_alpha, index_sigma_theta, index_r1, index_r2
|
Optional overrides for hyper-parameters related to the index prior. |
link_basis_df, link_basis_degree, link_basis_delta
|
Optional overrides for the B-spline basis hyper-parameters, such as the effective degrees of freedom, spline degree, and penalty parameter. |
link_knots_lambda_k, link_knots_maxknots
|
Optional overrides for the B-spline knot-selection hyper-parameters in, used for models with adaptive knot placement. |
link_beta_mu, link_beta_cov, link_beta_tau, link_beta_Sigma0
|
Optional overrides for the prior on spline coefficients. The detailed interpretation of these hyper-parameters depends on the specific model and is described in the documentation of each model-fitting function. |
link_lengthscale_shape, link_lengthscale_rate
|
Optional overrides for the hyper-parameters of the GP length-scale prior. |
link_amp_a, link_amp_b
|
Optional overrides for the hyper-parameters of the GP amplitude prior. |
link_kappa_min, link_kappa_max, link_kappa_grid_width
|
Optional overrides for the hyper-parameters in used in models with polar index and GP-type link,
to control the grid or support for the concentration parameter |
link_inv_lambda_shape, link_inv_lambda_rate
|
Optional overrides for spike-and-slab–type GP link priors. |
sigma2_shape, sigma2_rate
|
Optional overrides for the inverse-gamma prior on the observation
variance |
prior_param(indexprior, link) can be used to obtain the default prior
list for the requested combination of index prior and link function.
For any argument that is not NULL, the corresponding field in the nested prior list is overwritten.
The detailed meaning and recommended choices for each hyper-parameter depend on the specific model, prior of index vector and link function. For those details, please refer to the documentation of the corresponding model-fitting functions.
A nested list with top-level elements index, link, and
sigma2, suitable for passing to the prior argument of the
various BayesSIM model fitting functions.
bsFisher(), bsSphere(), bsPolar(), bsSpike(),
gpFisher(), gpSphere(), gpPolar(), gpPolarHigh(), gpSpike()
## Default prior for Fisher index + B-spline link: p0 <- prior_param("fisher", "bspline") ## Modify only a few hyper-parameters: p1 <- prior_param( indexprior = "fisher", link = "bspline", sigma2_shape = 0.5, link_basis_df = 30, index_direction = c(1, 0, 0) )## Default prior for Fisher index + B-spline link: p0 <- prior_param("fisher", "bspline") ## Modify only a few hyper-parameters: p1 <- prior_param( indexprior = "fisher", link = "bspline", sigma2_shape = 0.5, link_basis_df = 30, index_direction = c(1, 0, 0) )
BayesSIM
Returns the model residuals based on the posterior fitted values of a
BayesSIM. Residuals can be computed using either the posterior mean or median fitted values.
## S3 method for class 'bsim' residuals(object, method = c("mean", "median"), ...)## S3 method for class 'bsim' residuals(object, method = c("mean", "median"), ...)
object |
A fitted object of |
method |
Character string specifying the summary statistic used to
compute the fitted values. Options are |
... |
Additional arguments passed to other methods. |
A numeric vector of residuals. ()
can be mean or median of MCMC samples.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
BayesSIM
Provides a summary for BayesSIM.
## S3 method for class 'bsim' summary(object, ...) ## S3 method for class 'summary.bsim' print(x, digits = 3, ...)## S3 method for class 'bsim' summary(object, ...) ## S3 method for class 'summary.bsim' print(x, digits = 3, ...)
object |
A fitted object of |
... |
Further arguments passed. |
x |
A summary output of |
digits |
The minimum number of significant digits to be printed. |
A list of summary statistics for MCMC samples, including data.frame table for the results.
Each row corresponds to a model parameter, and columns report the statistics.
The function summarizes posterior MCMC samples by reporting key statistics, including:
Posterior mean and median
Empirical standard deviation
95% credible interval (lower and upper quantiles)
Potential scale reduction factor (Rhat) for multiple chains
Effective sample size (ESS)
By default, the index vector and error variance are only included in the summary.
If variable selection methods are used, such as uniform sphere and spike-and-slab prior,
the indicator vector (nu) is also included.
Note that the potential scale reduction factor for nu can be reported as
NaN or Inf, since the indicator rarely changes during the MCMC run.
If the model is fitted with single chain, both all.chain and chain have identical information.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) # 1. One tool version fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Check median index vector estimates with standard errors coef(fit_one, method = "median", se = TRUE) # Fitted index values of median prediction fitted(fit_one, type = "linpred", method = "median") # Residuals of median prediction residuals(fit_one, method = "median") # Summary of the model summary(fit_one) # Convergence diagnostics nimTraceplot(fit_one) # Goodness of fit GOF(fit_one) # Fitted plot plot(fit_one) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) plot(pred) # 2. Split version models <- BayesSIM_setup(y ~ ., data = simdata2) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes exactly the same as the class of "fit_one" object and apply generic functions. fit_split <- as_bsim(models, mcmc.out)
Provides a summary for BayesSIM prediction.
## S3 method for class 'bsimPred' summary(object, ...) ## S3 method for class 'summary.bsimPred' print(x, digits = 3, ...)## S3 method for class 'bsimPred' summary(object, ...) ## S3 method for class 'summary.bsimPred' print(x, digits = 3, ...)
object |
An object of class "bsimPred" which is the result of |
... |
Further arguments passed. |
x |
A summary output of class "bsimPred". |
digits |
The minimum number of significant digits to be printed. |
Performance of prediction with point predicted values.
Metrics are including RMSE, MAE, Mean bias, and coverage. In terms of coverage, it is only available
when the "bsimPred" includes interval, corresponding to interval = "credible" options for predict function.
It indicates mean coverage rate of data points whether confidence intervals include true response values.
In this case, all chains are combined for the computation.
simdata2 <- data.frame(DATA1$X, y = DATA1$y) fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) summary(pred) plot(pred)simdata2 <- data.frame(DATA1$X, y = DATA1$y) fit_one <- BayesSIM(y ~ ., data = simdata2, niter = 5000, nburnin = 1000, nchain = 1) # Prediction with 95% credible interval at new data newx <- data.frame(X1 = rnorm(10), X2 = rnorm(10), X3 = rnorm(10), X4 = rnorm(10)) pred <- predict(fit_one, newdata = newx, interval = "credible", level = 0.95) summary(pred) plot(pred)
BayesSIM_setup
Provides a summary for BayesSIM_setup.
## S3 method for class 'bsimSetup' summary(object, ...) ## S3 method for class 'summary.bsimSetup' print(x, digits = 1, ...)## S3 method for class 'bsimSetup' summary(object, ...) ## S3 method for class 'summary.bsimSetup' print(x, digits = 1, ...)
object |
A fitted object of |
... |
Further arguments passed. |
x |
A summary output of |
digits |
The minimum number of significant digits to be printed. |
Summarize the model setup information of an object created by
a setup function (e.g., BayesSIM_setup),
including the model name, data dimensions, prior settings, and sampling options.
An object of class summary.bsimSetup, which is a list containing:
modelName |
The name of the fitted model. |
n |
The number of observations. |
p |
The number of predictors. |
prior |
A list of prior settings used in the model. |
samplingOptions |
A list of MCMC sampling options, including |
model |
The compiled nimble model object. |
simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) summary(models) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes the class of "bsim". fit_split <- as_bsim(models, mcmc.out)simdata2 <- data.frame(DATA1$X, y = DATA1$y) models <- BayesSIM_setup(y ~ ., data = simdata2) summary(models) Ccompile <- compileModelAndMCMC(models) nimSampler <- getSampler(Ccompile) initList <- getInit(models) mcmc.out <- runMCMC(nimSampler, niter = 5000, nburnin = 1000, thin = 1, nchains = 1, setSeed = TRUE, inits = initList, summary = TRUE, samplesAsCodaMCMC = TRUE) # "fit_split" becomes the class of "bsim". fit_split <- as_bsim(models, mcmc.out)