During the last decade, the missing data treatments reported in JEP have shown much improvement in terms of decreased use of ad hoc methods (e.g., LD and PD) and increased use of principled methods (e.g., FIML, EM, and MI). Yet several research practices still persisted including, not explicitly acknowledging the presence of missing data, not describing the approach used in dealing with missing data, not testing assumptions assumed. In this paper, we promote three principled missing data methods (i.e., MI, FIML, and EM) by discussing their theoretical framework, implementation, assumptions, and computing issues. All three methods were illustrated with an empirical *Adolescent* data set using SAS 9.3. Their performances were evaluated under three conditions. These three conditions were created from three missing rates (20%, 40%, and 60%). Each incomplete data set was subsequently analyzed by a regression model to predict adolescents’ behavioral risk score using one of the three principled methods or LD. The performance of the four missing data methods was contrasted with that of the complete data set in terms of bias and *SE*.

Results showed that the three principled methods yielded similar estimates at both missing data rates. In comparison, LD consistently resulted in larger *SE* s for regression coefficients estimates. These findings are consistent with those reported in the literature and thus confirm the recommendations of the three principled methods (Allison 2003;Horton and Lipsitz 2001;Kenward and Carpenter 2007;Peng et al. 2006;Peugh and Enders 2004;Schafer and Graham 2002). Under the three missing data conditions, MI, FIML, and EM yielded similar estimates and *SE* s. These results are consistent with missing data theory that argues that MI and ML-based methods (e.g., FIML and EM) are equivalent (Collins et al. 2001;Graham et al. 2007;Schafer and Graham 2002). In terms of *SE*, ML-based methods outperformed MI by providing slightly smaller *SE* s. This finding is to be expected because ML-based methods do not involve any randomness whereas MI does. Below we elaborate on features shared by MI and ML-based methods, choice between these two types of methods, and extension of these methods to multilevel research contexts.

### Features shared by MI and ML-based methods

First of all, these methods are based on the likelihood function of *P*(*Y*_{
obs
}, *θ*) = ∫ *P*(*Y*_{
complete
}, *θ*)*dY*_{
mis
}. Because this equation is valid under MAR (Rubin 1976), all three principled methods are valid under the MAR assumption. The two ML-based methods work directly with the likelihood function, whereas MI takes the Bayesian approach by imposing a prior distribution on the likelihood function. As the sample size increases, the impact of the specific prior distribution diminishes. It has been shown that,

If the user of the ML procedure and the imputer use the same set of input data (same set of variables and observational units), if their models apply equivalent distributional assumptions to the variables and the relationships among them, if the sample size is large, and if the number of imputations, *M*, is sufficiently large, then the results from the ML and MI procedures will be essentially identical. (Collins et al. 2001p. 336)

In fact, the computational details of EM and MCMC (i.e., data augmentation) are very similar (Schafer 1997).

Second, both the MI and the ML-based methods allow the estimation/imputation model to be different from the analysis model—the model of substantive interest. Although it is widely known that the imputation model can be different from the analysis model for MI, the fact that ML-based methods can incorporate auxiliary variables (such as, EMORISK) is rarely mentioned in the literature, except by Graham (2003). As previously discussed, Graham (2003) suggested using the saturated correlates model to incorporate auxiliary variables into SEM. However, this approach results in a rapidly expanding model with each additional auxiliary variable; consequently, the ML-based methods may not converge. In this case, MI is the preferred method, especially when one needs to incorporate a large number of auxiliary variables into the model of substantive interest.

Finally, most statistical packages that offer the EM, FIML and/or MI methods assume multivariate normality. Theory and experiments suggest that MI is more robust to violation of this distributional assumption than ML-based methods (Schafer 1997). As discussed previously, violation of the multivariate normality assumption may cause convergence problems for ML-based methods. Yet MI can still provide satisfactory results in the presence of non-normality (refer to the section titled **MI Related Issues**). This is so because the posterior distribution in MI is approximated by a finite mixture of the normal distributions. MI therefore is able to capture non-normal features, such as, skewness or multiple modes (Schafer 1999). At the present, the literature does not offer systematic comparisons of these two methods in terms of their sensitivity to the violation of the multivariate normality assumption.

### Choice between MI and ML-based methods

The choice between MI and ML-based methods is not easy. On the one hand, ML-based methods offer the advantage of likelihood ratio tests so that nested models can be compared. Even though Schafer (1997) provided a way to combine likelihood ratio test statistics in MI, no empirical studies have evaluated the performance of this pooled likelihood ratio test under various data conditions (e.g., missing mechanism, missing rate, number of imputations, model complexity). And this test has not been incorporated into popular statistical packages, such as, SAS, SPSS. ML-based methods, in general, produce slightly smaller *SE* s than MI (Collins et al. 2001;Schafer and Graham 2002). Finally, ML-based methods have greater power than MI (Graham et al. 2007), unless imputations were sufficiently large, such as 100 or more.

On the other hand, MI has a clear advantage over ML-based methods when dealing with categorical variables (Peng and Zhu 2008). Another advantage of MI over ML-based methods is its computational simplicity (Sinharay et al. 2001). Once missing data have been imputed, fitting multiple models to a single data set does not require the repeated application of MI. Yet it requires multiple applications of ML-based methods to fit different models to the same data. As stated earlier, it is easier to include auxiliary variable in MI than in ML-based methods. In this sense, MI is the preferred method, if one wants to employ an inclusive strategy to selecting auxiliary variables.

The choice also depends on the goal of the study. If the aim is exploratory, or if the data are prepared for a number of users who may analyze the data differently, MI is certainly better than a ML-based method. For these purposes, a data analyst needs to make sure that the imputation model is general enough to capture meaningful relationships in the data set. If, however, a researcher is clear about the parameters to be estimated, FIML or EM is a better choice because they do not introduce randomness due to imputation into the data, and are more efficient than MI.

An even better way to deal with missing data is to apply MI and EM jointly. In fact, the application of MI can be facilitated by utilizing EM estimates as starting values for the data augmentation algorithm (Enders 2010). Furthermore, the number of EM iterations needed for convergence is a conservative estimate for the number of burn-ins needed in data augmentation of MI, because EM converges slower than MI.

### Extension of MI and ML-based methods to multilevel research contexts

Many problems in education and psychology are multilevel in nature, such as students nested within classroom, teachers nested within school districts, etc. To adequately address these problems, multilevel model have been recommended by methodologists. For an imputation method to yield valid results, the imputation model must contain the same structure as the data. In other words, the imputation model should be multilevel in order to impute for missing data in a multilevel context (Carpenter and Goldstein 2004). There are several ways to extend MI to deal with missing data when there are two levels. If missing data occur only at level 1 and the number of level 2 units is low, standard MI can be used with minor adjustments. For example, for a random-intercept model, one can dummy-code the cluster membership variable and include the dummy variables into the imputation model. In the case of a random slope and random intercepts model, one needs to perform multiple imputation separately within each cluster (Graham 2009). When the number of level 2 units is high, the procedure just described is cumbersome. In this instance, one may turn to specialized MI programs, such as, the PAN library in the S-Plus program (Schafer 2001), the REALCON-IMPUTE software (Carpenter et al. 2011), and the R package mlmmm (Yucel 2007). Unfortunately, ML-based methods have been extended to multilevel models only when there are missing data on the dependent variable, but not on the covariates at any level, such as student’s age at level 1 or school’s SES at level 2 (Enders 2010).

In this paper, we discuss and demonstrate three principled missing data methods that are applicable for a variety of research contexts in educational psychology. Before applying any of the principled methods, one should make every effort to prevent missing data from occurring. Toward this end, the missing data rate should be kept at minimum by designing and implementing data collection carefully. When missing data are inevitable, one needs to closely examine the missing data mechanism, missing rate, missing pattern, and the data distribution before deciding on a suitable missing data method. When implementing a missing data method, a researcher should be mindful of issues related to its proper implementation, such as, statistical assumptions, the specification of the imputation/estimation model, a suitable number of imputations, and criteria of convergence.

Quality of research will be enhanced if (a) researchers explicitly acknowledge missing data problems and the conditions under which they occurred, (b) principled methods are employed to handle missing data, and (c) the appropriate treatment of missing data is incorporated into review standards of manuscripts submitted for publication.