Liew, Bernard XW and Kovacs, Francisco M and Rügamer, David and Royuela, Ana (2023) Automatic Variable Selection Algorithms in Prognostic Factor Research in Neck Pain. Journal of Clinical Medicine, 12 (19). p. 6232. DOI https://doi.org/10.3390/jcm12196232
Abstract
This study aims to compare the variable selection strategies of different machine learning (ML) and statistical algorithms in the prognosis of neck pain (NP) recovery. A total of 3001 participants with NP were included. Three dichotomous outcomes were used: improvement in NP, arm pain (AP), and disability at the 3-month follow-up. Twenty-five variables (twenty-eight parameters) were included as predictors. There were more parameters than variables, as some categorical variables had >2 levels. Eight modelling techniques were compared: stepwise regression based on unadjusted p-values (stepP), on adjusted p-values (stepPAdj), and on the Akaike information criterion (stepAIC); best subset regression (BestSubset); least absolute shrinkage and selection operator (LASSO); minimax concave penalty (MCP); model-based boosting (mboost); and multivariate adaptive regression splines (MuARS). The algorithm that selected the fewest predictors was stepPAdj (number of predictors, p = 4 to 8). MuARS selected the second fewest predictors (p = 9 to 14). The predictor that was selected by all algorithms and had the largest coefficient magnitude was "having undergone a neuroreflexotherapy intervention" for NP (β = 1.987 to 2.296) and AP (β = 2.639 to 3.554), and "Imaging findings: spinal stenosis" (β = -1.331 to -1.763) for disability. Stepwise regression based on adjusted p-values resulted in the sparsest models, which enhanced clinical interpretability. MuARS appears to provide the optimal balance between model sparsity and high predictive performance across outcomes. The algorithms performed similarly but selected different numbers of variables. Rather than relying on any single algorithm, confidence in the variable selection may be increased by using multiple algorithms.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | neck pain; statistics; prognosis; machine learning; variable selection |
Subjects: | Z Bibliography. Library Science. Information Resources > ZZ OA Fund (articles) |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Sport, Rehabilitation and Exercise Sciences, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 18 Mar 2024 16:38 |
Last Modified: | 30 Oct 2024 16:38 |
URI: | http://repository.essex.ac.uk/id/eprint/37993 |
Available files
Filename: jcm-12-06232.pdf
Licence: Creative Commons: Attribution 4.0