Full Length Research Paper
Abstract
The use of modern statistical methods to overcome the known pitfalls of classical regression models when analyzing large numbers of highly correlated variables has increased considerably in recent years. Statisticians working in chemometrics and omics research have developed a new method called orthogonal projections to latent structures (OPLS). Compared with regular partial least squares (PLS) regression, OPLS provides a simpler model, with the additional advantage that the orthogonal variation can be analyzed separately. Use of OPLS has spread beyond the fields in which it originated, but it has not yet been applied in epidemiology, which is a wide field of research. In public health and clinical research, there are situations in which large numbers of correlated variables need to be modeled. The authors successfully applied OPLS discriminant analysis (OPLS-DA) to model a large number of variables in a case-control study and compared it with discriminant analysis by partial least squares regression (PLS-DA). Before the models were fitted, the dataset was split into two parts: a training set and a prediction set. Models fitted on the training set were then tested for validity on the prediction set. OPLS-DA was compared with PLS-DA in terms of model fit, diagnostics and interpretability. Both models fitted the data, but OPLS-DA was preferable. The authors encourage the use of these methods to increase study power and statistical validity in epidemiology and in similar settings in which large numbers of correlated variables need to be modeled.
Key words: Partial least squares regression, orthogonal projections to latent structures, logistic regression, multicollinearity, injury epidemiology, burns.
Copyright © 2024. Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.