Multiple Imputation When Variables Exceed Observations: An Overview of Challenges and Solutions

Abstract

Missing data are a prevalent problem in psychological research that can reduce statistical power and bias parameter estimates. These problems can be mostly resolved with multiple imputation, a modern missing data treatment that is increasingly used. Imputation, however, requires the number of variables to be smaller than the number of observations (i.e., non-missing values), and this limit is often exceeded because of large assessments, high missing data rates, the inclusion of variables predictive of missing values, and the inclusion of non-linear transformations, among other reasons. Even when the ratio of variables to observations meets the minimum requirement, convergence failure can occur in large, complex models. Specialized techniques have been developed to overcome the challenges related to having too many variables in an imputation model, but they are still relatively unknown to researchers in psychology. Accordingly, this paper presents an overview of four imputation techniques that can be used to reduce the number of predictors in an imputation model: item aggregation with scales and parcels, passive imputation, principal component analysis (PcAux), and two-fold fully conditional specification. The purpose, advantages, limitations, and applications of each method are discussed, along with recommendations and illustrative examples, with the aims of helping readers (1) understand different imputation methods and (2) identify methods that could be useful for their own imputation problem.

Description

© 2024 University of California Press. All rights reserved. cc-by

Keywords

auxiliary variable, broad imputation, inclusive imputation, joint modeling, MICE, missingness

Citation

Chaput-Langlois, S., Stickley, Z.L., Little, T.D., & Rioux, C. 2024. Multiple Imputation When Variables Exceed Observations: An Overview of Challenges and Solutions. Collabra: Psychology, 10(1). https://doi.org/10.1525/collabra.92993

Collections