Development of a Blood HbA1c Level Detection Model Based on Support Vector Regression (SVR) Using Microtest Data
Abstract
Glycated hemoglobin (HbA1c) is a key biomarker for long-term glycemic control and for the diagnosis of diabetes, yet standard assays remain relatively costly and infrastructure-dependent. This pilot study aimed to develop and rigorously evaluate a Support Vector Regression (SVR) model that predicts HbA1c from small-scale microtest data, as a potential basis for low-cost screening. Primary data were collected from 10 adults (≥40 years) who underwent three repeated oral glucose tolerance tests over 22 days, yielding 214 clinical, hemodynamic, and lifestyle-related variables per subject. Numerical features were cleaned and normalized, then modeled with SVR using a radial basis function (RBF) kernel. To limit overfitting and avoid data leakage, feature selection was performed inside a 5-fold cross-validation scheme: in each fold, Spearman correlation was used to select the 10 most informative features from the training set alone, followed by a grid search over C, epsilon, and gamma. Model performance was evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²), and compared with a simple mean baseline model. The final SVR configuration achieved an MAE of 0.705, an RMSE of 1.285, and an R² of −0.162. This performance is close to that of the baseline (a negative R² means the model explained no more variance than a constant mean prediction), but it was obtained under strict, transparent procedures. Features related to previous HbA1c measurements, postprandial glucose dynamics, heart rate, and impaired glucose tolerance (IGT) status were frequently selected, highlighting their potential relevance for early dysglycemia screening. Overall, the results demonstrate that SVR is technically applicable to highly constrained microtest datasets. However, improved performance will likely require larger and more diverse samples, careful variable reduction, and better exploitation of temporal patterns before the model can be translated into a robust, field-ready decision-support tool.
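The evaluation procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the study's actual code, and several details are assumptions made only for illustration: scikit-learn and SciPy as the tooling, `StandardScaler` as the normalization step, a nested (inner) cross-validation for the grid search, the grid values themselves, and the synthetic arrays `X` and `y` standing in for the microtest data. The sketch shows how placing the Spearman-based selection inside a pipeline causes it to be refit on the training portion of every fold, and how the same folds are used for the mean baseline.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.dummy import DummyRegressor
from sklearn.feature_selection import SelectKBest
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def spearman_score(X, y):
    """Absolute Spearman correlation of each column of X with y.

    Spearman's rho is the Pearson correlation of the ranks.
    Constant columns yield 0/0 and are scored as 0.
    """
    rx = rankdata(X, axis=0)
    ry = rankdata(y)
    rx = rx - rx.mean(axis=0)
    ry = ry - ry.mean()
    denom = np.sqrt((rx ** 2).sum(axis=0) * (ry ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        rho = (rx * ry[:, None]).sum(axis=0) / denom
    return np.nan_to_num(np.abs(rho))


def summarize(y_true, y_pred):
    """MAE, RMSE, and R^2 computed over the pooled out-of-fold predictions."""
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "rmse": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "r2": r2_score(y_true, y_pred),
    }


def evaluate(X, y, seed=0):
    # Scaling and feature selection live inside the pipeline, so both are
    # fitted on training data only in every fold (no leakage).
    pipe = Pipeline([
        ("scale", StandardScaler()),              # assumed normalization
        ("select", SelectKBest(spearman_score, k=10)),
        ("svr", SVR(kernel="rbf")),
    ])
    grid = {                                      # hypothetical grid values
        "svr__C": [0.1, 1.0, 10.0],
        "svr__epsilon": [0.1, 0.5],
        "svr__gamma": ["scale", 0.01],
    }
    inner = KFold(n_splits=3, shuffle=True, random_state=seed)
    outer = KFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(pipe, grid, cv=inner,
                          scoring="neg_mean_absolute_error")

    svr_pred = cross_val_predict(search, X, y, cv=outer)
    base_pred = cross_val_predict(DummyRegressor(strategy="mean"),
                                  X, y, cv=outer)
    return {"svr": summarize(y, svr_pred),
            "baseline": summarize(y, base_pred)}


# Synthetic placeholder data (NOT the study data): 30 rows, 214 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 214))
y = 5.5 + 0.5 * X[:, 0] + rng.normal(scale=0.3, size=30)

results = evaluate(X, y)
print(results)
```

In this setup the mean baseline predicts each held-out fold from the mean of the remaining folds, so its pooled R² is at most zero. A slightly negative R² for the SVR can therefore still correspond to performance close to the baseline. With repeated measurements per subject, a subject-wise split would be a natural variation, but whether the study used one is not stated here.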











