Multiple imputation (MI) is a popular approach to handling missing data. In the final part of MI, inferences for parameter estimates are made based on simple rules developed by Rubin. These rules rely on the analyst having a calculable standard error for their parameter estimate for each imputed dataset. This is fine for standard analyses, e. However, for many analyses, analytic standard errors are not available, or are prohibitively difficult to derive. For such methods, if there were no missing data, an attractive approach for finding standard errors and confidence intervals is the method of bootstrapping.
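For readers who have not seen them written out, here is a minimal Python sketch of Rubin's rules for a scalar parameter. The function name `pool_rubin` and the numbers fed into it are made up for illustration; the only inputs are the per-imputation point estimates and their squared standard errors, which is exactly why the rules need a calculable standard error for each imputed dataset.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool M per-imputation estimates and squared SEs with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need M >= 2 estimates with matching variances")
    pooled = statistics.fmean(estimates)        # MI point estimate
    within = statistics.fmean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)    # between-imputation variance (M - 1 denominator)
    total = within + (1 + 1 / m) * between      # total variance
    return pooled, math.sqrt(total)


# Made-up estimates and squared standard errors from M = 3 imputed datasets.
est, se = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(round(est, 3), round(se, 3))
```

The sketch returns only the pooled estimate and its standard error; the degrees-of-freedom calculation used for t-based intervals is left out.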

They consider a number of possible ways of combining bootstrapping and MI. The two main approaches are either to first impute missing data, and then use bootstrapping to obtain an estimate of the within-imputation SE for each imputed dataset, or, to bootstrap the original data, and apply MI separately to each bootstrapped dataset.

The MI point estimate found from each bootstrap is then used to construct bootstrap SEs and confidence intervals in the usual way. In contrast, the approach which uses bootstrapping to find an SE for each imputed dataset results in SEs which are much too large and confidence intervals which over-cover.
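To make the "bootstrap first, then MI" idea concrete, here is a rough Python sketch for estimating a mean when some values are missing. It is not the procedure from the paper: the imputation model in `impute_once` is a deliberately simplistic stand-in (normal draws matching the observed mean and SD), and every function name, default and data value is an assumption made for illustration.

```python
import random
import statistics


def impute_once(sample, rng):
    """Hypothetical stand-in imputation model: fill each missing value (None)
    with a normal draw matching the observed mean and SD of this sample."""
    observed = [v for v in sample if v is not None]
    mu, sd = statistics.fmean(observed), statistics.stdev(observed)
    return [v if v is not None else rng.gauss(mu, sd) for v in sample]


def mi_point_estimate(sample, n_imputations, rng):
    """MI point estimate of the mean: average over the imputed datasets."""
    return statistics.fmean(
        statistics.fmean(impute_once(sample, rng)) for _ in range(n_imputations)
    )


def boot_then_mi(data, n_boot=200, n_imputations=5, seed=0):
    """Bootstrap the incomplete data, apply MI within each resample, and
    summarise the resulting MI point estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=len(data))   # values drawn with replacement
        if sum(v is not None for v in resample) < 2:
            continue  # the toy imputation model cannot be fitted; skip this resample
        estimates.append(mi_point_estimate(resample, n_imputations, rng))
    estimates.sort()
    se = statistics.stdev(estimates)                 # bootstrap SE of the MI estimate
    lo = estimates[int(0.025 * len(estimates))]      # approximate 2.5th percentile
    hi = estimates[int(0.975 * len(estimates)) - 1]  # approximate 97.5th percentile
    return se, (lo, hi)


# Made-up data: 40 values with every fourth one set to missing.
rng = random.Random(1)
toy = [rng.gauss(10, 2) for _ in range(40)]
toy = [None if i % 4 == 0 else v for i, v in enumerate(toy)]
se, (lo, hi) = boot_then_mi(toy)
print(f"bootstrap SE = {se:.3f}, 95% percentile interval = ({lo:.2f}, {hi:.2f})")
```

Note that no within-imputation SE is ever computed here: the variability comes entirely from the spread of the MI point estimates across bootstrap resamples.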

I think this is a very valuable investigation and paper, and recommend those who need to combine bootstrapping with MI to read it. One aspect I cannot quite understand is their explanation for why MI followed by bootstrapping (MI boot) doesn't work.

On page 11 they state (correctly, I think) that applying the bootstrap to the m'th imputed dataset estimates the variance of the m'th imputed point estimate of theta, conditional on the observed data and the imputed values in the m'th imputation - i. They then write: "These estimates are not identical to the variance which we need to apply 3. Combining the M estimates is not meaningful because the quantity we use in our calculation is not but rather M different quantities which are all not unconditional on the missing and imputed data."

I don't really understand this - is the total posterior variance of theta, which is what we want to estimate with Rubin's rules. But the within-imputation variance is supposed to be the average, across the predictive distribution of the missing data, of the posterior variance of theta conditional on the imputed and observed data. And I can't see why this wouldn't be the case.

If anyone can explain this to me, please add a comment! I haven't read it yet, but I believe they may have resolved my query above.

Any suggestions about how this approach i. But in principle it ought not to be tricky: 1 manually generate m bootstrap resampled datasets. Hope that helps.

Hi Jonathan - I was one of your colleagues at the London School.

Bootstrapping is any test or metric that uses random sampling with replacement, and falls under the broader class of resampling methods.

Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, etc.) to sample estimates. Bootstrapping estimates the properties of an estimator (such as its variance) by measuring those properties when sampling from an approximating distribution.

One standard choice for an approximating distribution is the empirical distribution function of the observed data. In the case where a set of observations can be assumed to be from an independent and identically distributed population, this can be implemented by constructing a number of resamples with replacement, of the observed data set and of equal size to the observed data set.

It may also be used for constructing hypothesis tests. It is often used as an alternative to statistical inference based on the assumption of a parametric model when that assumption is in doubt, or where parametric inference is impossible or requires complicated formulas for the calculation of standard errors.

The bootstrap was published by Bradley Efron in "Bootstrap methods: another look at the jackknife"[5] [6] [7] inspired by earlier work on the jackknife. As the population is unknown, the true error in a sample statistic against its population value is unknown.

## Combining bootstrapping with multiple imputation

As an example, assume we are interested in the average or mean height of people worldwide. We cannot measure all the people in the global population, so instead we sample only a tiny part of it, and measure that.

Assume the sample is of size N ; that is, we measure the heights of N individuals. From that single sample, only one estimate of the mean can be obtained. In order to reason about the population, we need some sense of the variability of the mean that we have computed. The bootstrap sample is taken from the original by using sampling with replacement e.

This process is repeated a large number of times (typically 1,000 or 10,000 times), and for each of these bootstrap samples we compute its mean (each of these is called a bootstrap estimate).

We now can create a histogram of bootstrap means. This histogram provides an estimate of the shape of the distribution of the sample mean from which we can answer questions about how much the mean varies across samples. The method here, described for the mean, can be applied to almost any other statistic or estimator. A great advantage of bootstrap is its simplicity.
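The procedure described above can be sketched in a few lines of Python using only the standard library. The heights are made-up values, and the function name `bootstrap_means`, the choice of 1,000 resamples and the ten-bin text histogram are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_means(sample, n_boot=1000, seed=0):
    """Return the mean of each of n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]


# Made-up sample of N = 10 heights in centimetres.
heights_cm = [158.0, 162.5, 165.0, 167.5, 170.0, 171.0, 174.5, 178.0, 181.0, 185.5]
means = bootstrap_means(heights_cm)
boot_se = statistics.stdev(means)   # bootstrap estimate of the SE of the mean
print(f"sample mean = {statistics.fmean(heights_cm):.2f}, bootstrap SE = {boot_se:.2f}")

# Crude text histogram of the bootstrap means: ten equal-width bins,
# one '#' per ten bootstrap estimates.
lo, hi = min(means), max(means)
bins = [0] * 10
for m in means:
    idx = min(int((m - lo) / (hi - lo) * 10), 9)
    bins[idx] += 1
for i, count in enumerate(bins):
    left = lo + i * (hi - lo) / 10
    print(f"{left:7.2f} | {'#' * (count // 10)}")
```

Swapping `statistics.fmean` for another statistic (a median, say) inside `bootstrap_means` is all that is needed to apply the same idea to a different estimator.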

It is a straightforward way to derive estimates of standard errors and confidence intervals for complex estimators of the distribution, such as percentile points, proportions, odds ratio, and correlation coefficients. Bootstrap is also an appropriate way to control and check the stability of the results. Although for most problems it is impossible to know the true confidence interval, bootstrap is asymptotically more accurate than the standard intervals obtained using sample variance and assumptions of normality.

Although bootstrapping is (under some conditions) asymptotically consistent, it does not provide general finite-sample guarantees. The result may depend on the representative sample. The apparent simplicity may conceal the fact that important assumptions are being made when undertaking the bootstrap analysis e.

## Example of Bootstrapping

Also, the bootstrapping can be time-consuming. The number of bootstrap samples recommended in literature has increased as available computing power has increased.

If the results may have substantial real-world consequences, then one should use as many samples as is reasonable, given available computing power and time.

Increasing the number of samples cannot increase the amount of information in the original data; it can only reduce the effects of random sampling errors which can arise from a bootstrap procedure itself. Moreover, there is evidence that numbers of samples greater than lead to negligible improvements in the estimation of standard errors. However, Athreya has shown [20] that if one performs a naive bootstrap on the sample mean when the underlying population lacks a finite variance (for example, a power law distribution), then the bootstrap distribution will not converge to the same limit as the sample mean.
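The point about the number of bootstrap samples can be illustrated with a small simulation. In this sketch, which uses made-up data and arbitrary settings (a sample of 30 normal draws, 25 versus 1,000 resamples, 20 repetitions), the whole bootstrap is repeated several times on the same fixed sample, so any disagreement between repetitions is resampling noise from the bootstrap procedure itself.

```python
import random
import statistics


def bootstrap_se(sample, n_boot, rng):
    """Bootstrap estimate of the standard error of the sample mean."""
    n = len(sample)
    means = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]
    return statistics.stdev(means)


rng = random.Random(42)
sample = [rng.gauss(0.0, 1.0) for _ in range(30)]   # one fixed, made-up sample

spread = {}
for n_boot in (25, 1000):
    # Repeat the whole bootstrap 20 times on the SAME sample; only the
    # resampling noise differs between repetitions.
    repeats = [bootstrap_se(sample, n_boot, rng) for _ in range(20)]
    spread[n_boot] = statistics.stdev(repeats)
    print(n_boot, round(statistics.fmean(repeats), 3), round(spread[n_boot], 4))
```

The average SE estimate should be similar for both settings, while the run-to-run spread shrinks as the number of resamples grows. Neither setting adds information about the population beyond what the 30 observations contain.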

I would like to perform a path analysis, a confirmatory factor analysis, or a structural equation model. As of this writing, SPSS for Windows does not currently support modules to perform the analyses you describe. If your models of interest are small, the free student version may be sufficient to meet your needs. For larger models, you will need to purchase your own copy of AMOS.

If you have further questions, email stat. Every time I run a model, I get a box that pops up on my screen that mentions an error Cannot paste to clipboard. What should I do about this error? We are currently working with Smallwaters Corporation's technical support group to correct the error. The error does not appear to influence model results or model fitting activities in any way beyond the annoyance of having to dismiss the extra dialog box at the end of the model fitting run. Once you dismiss the error dialog box, you may view your model fitting results as usual.

I have data from two different groups of research participants. One group of participants took my survey in the fall semester while the second group took my survey in the spring semester. I've come up with what I think is a good confirmatory factor analysis model based on the fall data and I want to see if that model holds in the spring data. I believe this is called a "multiple group" analysis. If you do not, see our online AMOS tutorial.

This FAQ also assumes that you know how to use AMOS to perform nested model comparisons and that you understand the assumptions and principles underlying nested model comparisons. Multiple group analysis in structural equation modeling is very useful because it allows you to compare multiple samples across the same measurement instrument or multiple population groups e.

AMOS allows you to test whether your groups meet the assumption that they are equal by examining whether different sets of path coefficients are invariant. In other words, you will be testing whether path coefficients in your model are equal for your groups.

You can test the equalities of variables' variances, means, and intercepts, as well as the covariances between variables, and the equalities of path coefficients across two or more groups. Before you begin testing invariance across groups, you should assess carefully your overall sample size and the equality of sample sizes across groups. Since the multiple group analysis estimates more parameters than a single group analysis, you will need proportionally more cases for a multiple group analysis to ensure stable parameter estimates and replicable results.

For instance, if you had cases for a single group analysis, you would want at least cases for an analysis that used two groups. Furthermore, your analysis should ideally have equal numbers of cases in each group. Little is known about the impact of sharply unequal group sizes on results obtained from a multiple group SEM analysis, except that larger groups will exert more influence on the results than smaller groups.

This property of multiple group analysis is not especially problematic if the group sizes mirror the proportion of individuals' group membership in the population from which the sample was drawn. On the other hand, if the sample sizes are not proportional to population sizes, errors of inference may be more likely to occur. Different assumptions of group equality can be tested and they are often tested in a particular order Bollen, For illustrative purposes, this example will consider one assumption of group equality.

Consider example r from the AMOS program example set.

Dear Dr. I know that both can provide standard errors that are robust to nonnormality, and bootstrapping can even provide asymmetric confidence intervals. What are the pros and cons of each?

Are there situations in which it is better to use one compared to the other? I am not confident that bootstrapping can solve the power problems accompanying a small sample size though. Thank you for your view on this! Bengt O. As opposed to MLR, bootstrapping offers non-symmetric confidence intervals which can be important with parameter estimates that have non-normal sampling distributions, such as for variances and indirect effects, particularly for small samples.

I don't recall papers making direct comparisons between MLR and bootstrap. Anyone else? There is also the possibility to do Bayesian estimation which has the same advantage as bootstrap.

You may want to try all 3 approaches to get a feeling for the range of results. One of the paths was found to be significant when MLR was used but non-significant with bootstrapping. Hence, considering the discrepancy in the results I am not sure which of the two methods to use. I would appreciate any input on this. Thank you! Bootstrapping may be more accurate, but perhaps conservative. It may fall somewhere in between.

When I use more bootstrap samples e. So, I will probably use bootstrap with and also try the Bayes estimator. Thank you for your help! I am also looking at diagnostics statistics - using the outliers option to examine outliers and influential cases and the Tech10 option to request standardized residuals; in addition to plots of standardized residuals against standardized predicted values to test for homoscedasticity.

Requests for SAVE will be ignored. Request for TECH10 is ignored. Thanks for your help! Thanks a lot Professor Muthen. I'd like to kindly ask two follow-up questions. Many thanks. I use bootstrapping to evaluate the confidence intervals of the indirect effects present in the model. However, I also noticed other differences between the two methods. Would this be an indication that the bootstrap method is too conservative in my case?

Bayes estimator yields more similar p-values to MLR. Send the relevant outputs to Support along with your license number. When using bias-corrected bootstraps to estimate indirect effects, could one also say that the direct effects in the same mediation model were estimated using bias corrected bootstraps?

Thank you. No, the effects are still ML point estimates.

### AMOS - interpretation of bootstrap indirect effects for one-tailed hypothesis


Thread starter: adakwa. Start date: Nov 8. Tags: bootstrapping, confidence interval, indirect effects, one-tail test, p-value.

This method reports results for two-tailed significance, but my hypotheses are directional. With bootstrapping, significance is assessed based on confidence intervals - though a p-value is also provided.

For regression, I would normally divide the p-value by two to get the one-tailed result. Would this be appropriate with tests of indirect effects too? In my case the two-tailed p value is. Also, since, for these types of tests, CI rather than p-value is reported, is there an equivalent adjustment for the confidence intervals?

Thanks in advance for any help you can give. CB Super Moderator Nov 8, Hi there, welcome to the forum! Sorry for the delay in releasing your post - it was caught in our spam filter for some reason.

Stay pure. Stay poor. Nov 9, I haven't used AMOS before, but can you post your output so we can see what you are writing about. Thanks Cowboybear As mentioned, my hypotheses are directional - that is one variable is predicted to increase the other rather than just be related to it, so my understanding is that a one-tailed test is appropriate.

I have generally been reporting one-tailed test results for my regressions. However, with AMOS, it is not possible to specify a one-tailed test, and with bootstrapping for indirect effects, usually the CI and not the p-value is reported.

If you know otherwise, I'd be interested in your experience. Sure and thanks for your time. Please see attachments. Your output was a little cryptic to me. CB Super Moderator Nov 9,

This post intends to introduce the basics of mediation analysis and does not explain statistical details. For details, please refer to the articles at the end of this post. This research example is made up for illustration purposes. I think, however, grades are not the real reason that happiness increases.

This is a typical case of mediation analysis.

Self-esteem is a mediator that explains the underlying mechanism of the relationship between grades (IV) and happiness (DV). Before we start, please keep in mind that, as with any other regression analysis, mediation analysis does not imply causal relationships unless it is based on experimental design. To analyze mediation: 1. Use either the Sobel test or bootstrapping for significance testing.

This post will show examples using R, but you can use any statistical software. They are just three regression analyses! Step 1. We want X to affect Y.

If there is no relationship between X and Y, there is nothing to mediate. Although this is what Baron and Kenny originally suggested, this step is controversial. Step 2. We want X to affect M. If X and M have no relationship, M is just a third variable that may or may not be associated with Y. A mediation makes sense only if X affects M. Step 3. If a mediation effect exists, the effect of X on Y will disappear or at least weaken when M is included in the regression.

The effect of X on Y goes through M. If the effect of X on Y still exists, but in a smaller magnitude, M partially mediates between X and Y (partial mediation). The example shows a full mediation, yet a full mediation rarely happens in practice. Once we find these relationships, we want to see whether this mediation effect is significantly different from zero.
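The three regression steps and a bootstrap test of the mediation effect can be sketched as follows. This is plain Python rather than the R used in the post, the data are simulated with arbitrary coefficients, and the helper names `cov` and `mediation_paths` are made up for illustration. The slopes are computed from closed-form least-squares formulas for one and two predictors.

```python
import random
import statistics


def cov(u, v):
    """Sample covariance (n - 1 denominator)."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def mediation_paths(x, m, y):
    """Return (c, a, b, c_prime) from the three least-squares regressions."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    c = sxy / sxx                               # Step 1: Y ~ X (total effect)
    a = sxm / sxx                               # Step 2: M ~ X
    det = sxx * smm - sxm ** 2
    b = (smy * sxx - sxy * sxm) / det           # Step 3: Y ~ X + M, slope of M
    c_prime = (sxy * smm - smy * sxm) / det     # Step 3: slope of X (direct effect)
    return c, a, b, c_prime


# Simulated data with arbitrary, made-up coefficients.
rng = random.Random(7)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.1 * xi + 0.5 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

c, a, b, c_prime = mediation_paths(x, m, y)
print(f"total = {c:.3f}, direct = {c_prime:.3f}, indirect (a*b) = {a * b:.3f}")

# Bootstrap the indirect effect a*b by resampling cases with replacement.
indirect = []
for _ in range(1000):
    idx = rng.choices(range(n), k=n)
    _, a_s, b_s, _ = mediation_paths(
        [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]
    )
    indirect.append(a_s * b_s)
indirect.sort()
ci = (indirect[24], indirect[974])   # approximate 95% percentile interval
print(f"95% bootstrap CI for the indirect effect: ({ci[0]:.3f}, {ci[1]:.3f})")
```

With linear least-squares fits, the total effect equals the direct effect plus `a * b` exactly, which is a handy sanity check on the three regressions. If the percentile interval excludes zero, the indirect effect is judged significant at the corresponding level.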

Note that the Total Effect in the summary 0. The direct effect ADE, 0. However, the suggested steps help you understand how it works! Mediation analysis is not limited to linear regression; we can use logistic regression or polynomial regression and more. Also, we can add more variables and relationships, for example, moderated mediation or mediated moderation. However, if your model is very complex and cannot be expressed as a small set of regressions, you might want to consider structural equation modeling instead.

Bootstrapping is a powerful statistical technique. It is especially useful when the sample size that we are working with is small.

Under usual circumstances, sample sizes of less than 40 cannot be dealt with by assuming a normal distribution or a t distribution. Bootstrap techniques work quite well with samples that have fewer than 40 elements.


The reason for this is that bootstrapping involves resampling. These kinds of techniques assume nothing about the distribution of our data. Bootstrapping has become more popular as computing resources have become more readily available.

This is because in order for bootstrapping to be practical a computer must be used. We will see how this works in the following example of bootstrapping. We begin with a statistical sample from a population that we know nothing about. Although other statistical techniques used to determine confidence intervals assume that we know the mean or standard deviation of our population, bootstrapping does not require anything other than the sample.

For purposes of our example, we will assume that the sample is 1, 2, 4, 4, We now resample with replacement from our sample to form what are known as bootstrap samples. Each bootstrap sample will have a size of five, just like our original sample.

Since we are randomly selecting and then are replacing each value, the bootstrap samples may be different from the original sample and from each other. For examples that we would run into in the real world, we would do this resampling hundreds if not thousands of times.

In what follows below, we will see an example of 20 bootstrap samples:. Since we are using bootstrapping to calculate a confidence interval for the population mean, we now calculate the means of each of our bootstrap samples. These means, arranged in ascending order are: 2, 2. We now obtain from our list of bootstrap sample means a confidence interval.
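A minimal Python sketch of this last step is shown below. The five sample values are hypothetical (they are not the sample from the example above), the seed is arbitrary, and a 90% confidence level is assumed for illustration. With 20 sorted bootstrap means, dropping the lowest and highest value leaves the middle 90%.

```python
import random
import statistics

sample = [3, 5, 6, 8, 13]   # hypothetical values, not the article's sample
rng = random.Random(2024)

# Draw 20 bootstrap samples, each the same size as the original sample.
boot_samples = [rng.choices(sample, k=len(sample)) for _ in range(20)]
boot_means = sorted(statistics.fmean(s) for s in boot_samples)

# Trim 5% of the means from each end: with 20 means that is one value per end.
trim = len(boot_means) // 20
ci_90 = (boot_means[trim], boot_means[-trim - 1])
print(boot_means)
print(ci_90)
```

In real applications the number of bootstrap samples would be in the hundreds or thousands, and the same trimming logic applies with proportionally more values removed from each end.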

For our example above we have a confidence interval of 2.

Courtney K. Taylor, Ph.D., Professor of Mathematics. Updated January 06,

## Comments