The parametric bootstrap is a powerful, general tool to obtain confidence intervals for estimates from parametric models.
Parametric bootstrap versus alternatives
The parametric bootstrap is used less often than the nonparametric bootstrap, because it relies heavily on the assumed parametric model. I focus on the parametric bootstrap here, because it builds excellent intuition for “what we’re up to” when we are estimating standard errors.
Algorithm: Parametric Bootstrap Estimator
Input: observed data \(x_1, x_2, \dots, x_n\), a parametric model \(f(\theta)\), an estimator \(\hat{\theta}\), and a quantity of interest \(\hat{\tau} = \tau(\hat{\theta})\).¹

For each bootstrap iteration \(b = 1, 2, \dots, B\):²

1. Generate a bootstrap sample \(x_1^{*(b)}, x_2^{*(b)}, \dots, x_n^{*(b)} \sim f(\hat{\theta})\). That is, simulate a new data set from the fitted parametric model.³
2. Fit the model to the simulated data set \(x_1^{*(b)}, x_2^{*(b)}, \dots, x_n^{*(b)}\) and compute \(\hat{\theta}^{*(b)}\).
3. Compute \(\hat{\tau}^{*(b)} = \tau \left( \hat{\theta}^{*(b)}\right)\), which is just \(\hat{\theta}^{*(b)}\) itself if no transformation is needed.

Finally, summarize the empirical distribution of \(\hat{\tau}^{*(1)}, \hat{\tau}^{*(2)}, \dots, \hat{\tau}^{*(B)}\), using the SD to estimate the SE or the 5th/95th percentiles to obtain a 90% confidence interval.
1 This works for any statistic that is a function of \(x_1, \dots, x_n\), but we’ll mainly use ML estimators and direct transformations of those.
2 We typically use \(B = 2,000\) as a good starting point.
3 For example, if we use an exponential model and estimate the rate as \(\hat{\lambda}\), then we would simulate each data set from an exponential distribution with rate \(\hat{\lambda}\). The first data set is \(x_1^{*(1)}, \dots, x_n^{*(1)}\), the second data set is \(x_1^{*(2)}, \dots, x_n^{*(2)}\), and so on. In R, that would be rexp(n, rate = lambda_hat).
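Putting the steps above together for the exponential example in the footnote, a minimal sketch might look like this (the sample size of 25, the seed, and the "observed" data are illustrative assumptions, not from the text):

```r
# illustrative "observed" data -- n = 25 and the true rate of 2 are assumptions
set.seed(1234)
n <- 25
x <- rexp(n, rate = 2)

# fit the model: the ML estimator of the exponential rate is 1/mean(x)
lambda_hat <- 1/mean(x)

# steps 1-3, repeated B times
B <- 2000
bs_est <- numeric(B)
for (b in 1:B) {
  x_star <- rexp(n, rate = lambda_hat)  # simulate from the fitted model
  bs_est[b] <- 1/mean(x_star)           # re-fit: ML estimate for this bootstrap data set
}

# summarize: SD estimates the SE; 5th/95th percentiles give a 90% CI
sd(bs_est)
quantile(bs_est, probs = c(0.05, 0.95))
```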
Importantly, we lean heavily on the assumption that we have a good model of the distribution of the data. (The predictive distribution allows us to assess this.) There’s also a nonparametric bootstrap, which is much more popular. We consider that later in the semester.
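As a quick illustration of the kind of check the predictive distribution allows (a sketch with made-up data; the exponential setup, sample sizes, and summary statistic are all assumptions): simulate data sets from the fitted model and ask whether the observed data look like a typical draw.

```r
# a sketch of a simple predictive check, using made-up data
set.seed(42)
obs <- rexp(50, rate = 1.5)  # stand-in for observed data
lambda_hat <- 1/mean(obs)    # fitted exponential model

# simulate 1,000 data sets from the fitted model; record a summary statistic
sim_max <- replicate(1000, max(rexp(50, rate = lambda_hat)))

# if the observed maximum falls far in the tails of the simulated maxima,
# the fitted model reproduces the data poorly
mean(sim_max > max(obs))  # tail proportion; values near 0 or 1 signal misfit
```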
5.1 Example: Toothpaste Cap Problem
The code below implements the parametric bootstrap for the toothpaste cap problem. For 2,000 iterations, it draws 150 observations from a Bernoulli distribution with \(\hat{\pi} = \frac{8}{150}\). For each iteration, it computes the ML estimate of \(\pi\) for the bootstrapped data set. Then it uses the SD of the bootstrap estimates to estimate the SE and computes the 5th and 95th percentiles of the bootstrap estimates to obtain a 90% confidence interval.
```r
n_bs <- 2000
bs_est <- numeric(n_bs)  # a container for the estimates
for (i in 1:n_bs) {
  bs_y <- rbinom(150, size = 1, prob = 8/150)
  bs_est[i] <- mean(bs_y)
}
print(sd(bs_est), digits = 2)  # se estimate
```
[1] 0.018
print(quantile(bs_est, probs =c(0.05, 0.95)), digits =2) # 90% ci
5% 95%
0.027 0.087
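For the Bernoulli model, the bootstrap SE can be checked against the familiar analytic formula \(\sqrt{\hat{\pi}(1 - \hat{\pi})/n}\):

```r
# analytic SE for the ML estimator of a Bernoulli probability
pi_hat <- 8/150
n <- 150
sqrt(pi_hat * (1 - pi_hat)/n)  # close to the bootstrap estimate of about 0.018
```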
We leave an evaluation of this confidence interval (i.e., does it capture \(\pi\) 90% of the time?) until later.
5.2 Example: Beta Distribution
Now let’s apply the parametric bootstrap to a two-parameter model: the beta distribution.
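The code that follows assumes a simulated data set y of size n already exists. As noted below, it comes from a beta distribution with \(\alpha = 5\) and \(\beta = 2\); a sketch of that step (the sample size of 100 and the seed are assumptions, not from the text):

```r
# simulate a fake data set from a beta distribution
# alpha = 5 and beta = 2 come from the text; n = 100 and the seed are assumptions
set.seed(1234)
alpha <- 5
beta <- 2
n <- 100
y <- rbeta(n, shape1 = alpha, shape2 = beta)
```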
First, we find the ML estimates of the two shape parameters by numerically maximizing the log-likelihood.
```r
# obtain ml estimates
log_lik_fn <- function(par = c(2, 2), y) {
  a <- par[1]  # pulling these out makes the code a bit easier to follow
  b <- par[2]
  log_lik_i <- dbeta(y, shape1 = a, shape2 = b, log = TRUE)
  log_lik <- sum(log_lik_i)
  return(log_lik)
}
opt <- optim(par = c(3, 3), fn = log_lik_fn, y = y,
             control = list(fnscale = -1))
ml_est <- opt$par
print(ml_est, digits = 3)
```
[1] 5.46 1.91
Now let’s use those ML estimates to perform a parametric bootstrap and find 95% CIs for the shape parameters.
```r
# obtain parametric bootstrap 95% ci for alpha and beta
n_bs <- 2000
bs_est <- matrix(NA, nrow = n_bs, ncol = 2)  # a container for the estimates
for (i in 1:n_bs) {
  bs_y <- rbeta(n, shape1 = ml_est[1], shape2 = ml_est[2])
  bs_opt <- optim(par = c(3, 3), fn = log_lik_fn, y = bs_y,
                  control = list(fnscale = -1))
  bs_est[i, ] <- bs_opt$par
}
ci <- apply(bs_est, MARGIN = 2, quantile, probs = c(0.025, 0.975))
print(ci, digits = 3)  # 95% ci
```
[,1] [,2]
2.5% 4.25 1.52
97.5% 7.52 2.58
If we instead care about the mean of the beta distribution (which is \(\frac{\alpha}{\alpha + \beta}\)), we can use the parametric bootstrap to obtain a confidence interval for that quantity as well. Since we drew the fake data set from a beta distribution with \(\alpha = 5\) and \(\beta = 2\), the true mean is \(\frac{\alpha}{\alpha + \beta} = \frac{5}{7} \approx 0.71\).
```r
# obtain parametric bootstrap 90% ci for mean
n_bs <- 2000
bs_est <- numeric(n_bs)  # a container for the estimates
for (i in 1:n_bs) {
  bs_y <- rbeta(n, shape1 = ml_est[1], shape2 = ml_est[2])
  bs_opt <- optim(par = c(3, 3), fn = log_lik_fn, y = bs_y,
                  control = list(fnscale = -1))
  bs_alpha <- bs_opt$par[1]
  bs_beta <- bs_opt$par[2]
  bs_est[i] <- bs_alpha/(bs_alpha + bs_beta)
}
print(quantile(bs_est, probs = c(0.05, 0.95)), digits = 2)  # 90% ci
```
5% 95%
0.72 0.77
print(sd(bs_est), digits =2) # estimate of SE
[1] 0.015
```r
# true mean
print(alpha/(alpha + beta), digits = 2)
```
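As a sanity check, the plug-in estimate of the mean computed from the ML estimates printed earlier sits inside the bootstrap interval:

```r
# plug-in estimate of the mean, using the ML estimates printed above
ml_mean <- 5.46/(5.46 + 1.91)
print(ml_mean, digits = 2)  # [1] 0.74
```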