A Comparative Study of Machine Learning Models for Predictive Business Analytics in Digital Commerce (2018–2026)

Risha Alam

doi:10.63125/2hj6p796

Authors

Risha Alam, Business Analyst, Newgen It Solutions LLC, Remote, USA

Keywords:

Machine Learning, Predictive Business Analytics, Digital Commerce, Ensemble Learning, Deep Learning

Abstract

This study compared the performance of machine learning models for predictive business analytics in digital commerce, using a quantitative experimental design across datasets spanning 2018 to 2026. The analysis drew on a large-scale aggregated dataset of approximately 5.8 million observations, comprising structured transactional data and unstructured behavioral and textual data. The study evaluated traditional models (logistic regression and decision trees), ensemble models (random forest, gradient boosting, and XGBoost), and deep learning models (convolutional neural networks and long short-term memory networks, LSTM). Performance was assessed using accuracy, precision, recall, F1-score, and area under the curve (AUC). The findings indicated that ensemble and deep learning models significantly outperformed traditional approaches. XGBoost achieved the highest overall accuracy of 92.8% and an AUC of 95.1%, while LSTM performed best in unstructured data environments, with an accuracy of 93.2% and an F1-score of 92.1%. Random forest exhibited strong stability, with an accuracy of 89.6% across datasets. In contrast, logistic regression performed worse, with accuracy averaging 78.4%. Ensemble models reduced prediction error in demand forecasting to as low as 8.6%, compared with 18.5% for traditional models. In fraud detection, addressing class imbalance improved the recall of advanced models from 72.4% to 88.9%. Statistical analysis confirmed significant differences among models, with p-values below 0.05 and large effect sizes exceeding 1.0 in several comparisons. The study also found that dataset size and feature engineering significantly influenced model performance, with large datasets yielding accuracy above 93% and reduced variance.
Overall, the results demonstrated that ensemble and deep learning models provide substantial improvements in predictive accuracy, robustness, and scalability, highlighting their effectiveness for complex and high-dimensional digital commerce applications.
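
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal Python sketch of how accuracy, precision, recall, F1-score, and AUC are conventionally computed for a binary classifier. It is illustrative only: the function names and the toy labels and scores are assumptions introduced here, not the study's code or data, and the abstract does not state how the authors computed these metrics.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = fraudulent order, 0 = legitimate order.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold chosen arbitrarily

    print(classification_metrics(y_true, y_pred))
    print("auc:", roc_auc(y_true, scores))
```

The pairwise AUC loop is quadratic in the number of examples and is written for clarity; at the scale described in the abstract, a sort-based or library implementation would be the practical choice. Recall is the metric most sensitive to class imbalance in this set, which is consistent with the abstract reporting recall for the fraud detection task.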