Best Model Selection Method for the Best Latent Variables Determination When Solving Multicollinearity with Partial Least Squares
DOI: https://doi.org/10.62054/ijdm/0204.17
Abstract
Multicollinearity arises when the assumption of independence among the explanatory variables in a linear regression model is violated. Under multicollinearity, the Ordinary Least Squares (OLS) estimator yields inefficient parameter estimates, whereas Partial Least Squares (PLS) estimates are more robust. In PLS, a weight must be assigned to each explanatory variable before the latent variables (LVs) are extracted. Two significant challenges of the PLS method are therefore the choice of weighting scheme and the selection of the LVs needed to obtain efficient estimates of the model parameters. This study considers two weight allocation schemes, equal weights and weights based on the variance of the regressors, and two commonly used model selection criteria, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), each used to select the best model for determining the optimal LVs. The performance of PLS was compared across the two weighting schemes and the two selection criteria. Efficiency was assessed using the Total Mean Squared Error (TMSE) of all model parameters estimated by PLS under different scenarios: varying sample sizes, multicollinearity levels, variability values, weight assignments, and model selection methods. The study concluded that BIC is the better of the two criteria for determining the optimal LVs when PLS estimation is used to handle multicollinearity in a linear regression model.
License
Copyright (c) 2025 Olusegun O. Alabi, Gbemisola W. Ogunmefun, Rasaki Y. Akinbo, Toba T. Bamidele, Olusesan T. Akintola (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
The Journal articles are open access and are distributed under the terms of the Creative
Commons Attribution-NonCommercial-NoDerivs 4.0 IGO License, which permits use,
distribution, and reproduction in any medium, provided the original work is properly cited.
No modifications or commercial use of the articles are permitted.




