Articles | Open Access | https://doi.org/10.55640/business/volume06issue11-02

Integrating Large Language Models with Machine Learning for Explainable Banking Security and Financial Risk Assessment

Abstract

This study proposes and empirically evaluates a hybrid banking security framework that integrates traditional machine learning models with a large language model (LLM) for risk assessment and decision support. Using two publicly available datasets from the UCI Machine Learning Repository (Default of Credit Card Clients and Bank Marketing), we engineer behavioral and temporal features, including payment-to-bill ratios, bill trend slopes, and volatility measures, to capture client financial and interaction patterns. Gradient Boosted Trees, Random Forest, and a feedforward Neural Network are trained on these structured features and evaluated using accuracy, precision, recall, F1-score, and area under the ROC curve (AUC-ROC). The Gradient Boosted Trees model performs best, with an accuracy of 0.87, an F1-score of 0.79, and an AUC-ROC of 0.91, outperforming both the Random Forest and Neural Network baselines. To add interpretability and contextual reasoning, we transform structured records into narrative client profiles and use a pre-trained LLM to generate risk classifications, textual explanations, and security recommendations. Alignment analysis shows that LLM-generated risk labels agree with ground-truth outcomes in approximately 81% of cases, indicating that the LLM can serve as a credible auxiliary assessor. The combined system provides both quantitative risk scores and human-readable narratives, thereby improving transparency, supporting regulatory and compliance needs, and enabling more targeted security interventions. Overall, the results indicate that LLM-augmented machine learning can strengthen banking security systems by combining strong predictive performance with operationally useful interpretability.
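The engineered features named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: the ratio is total payments over total billed amount, the trend slope is an ordinary least-squares slope over the month index, and volatility is the population standard deviation of bills divided by their mean. The function names `ols_slope` and `engineer_features` are hypothetical, and the caller is assumed to pass monthly amounts oldest month first.

```python
from statistics import mean, pstdev


def ols_slope(values):
    """Least-squares slope of values against the time index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0  # a slope is undefined for fewer than two points
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def engineer_features(bills, payments):
    """Build illustrative behavioral features for one client.

    bills, payments: monthly amounts, oldest month first (assumed ordering).
    The formulas are assumptions, not the paper's stated definitions.
    """
    total_bill = sum(bills)
    avg_bill = mean(bills)
    return {
        # Share of the billed amount that was actually paid.
        "pay_to_bill_ratio": sum(payments) / total_bill if total_bill > 0 else 0.0,
        # Average month-over-month change in the bill (currency units per month).
        "bill_trend_slope": ols_slope(bills),
        # Coefficient of variation of the bill amounts.
        "bill_volatility": pstdev(bills) / abs(avg_bill) if avg_bill else 0.0,
    }


features = engineer_features(
    bills=[1000, 1200, 1400, 1600, 1800, 2000],
    payments=[500, 500, 500, 500, 500, 500],
)
print(features)
```

For this example client, bills rise by 200 per month while payments stay flat, so the slope is positive and the payment-to-bill ratio is one third, a pattern such features are meant to surface for the downstream classifiers.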

Keywords

Banking security, large language models, machine learning, credit risk prediction, fraud detection, explainable AI, UCI datasets, Gradient Boosted Trees, financial risk assessment

How to Cite

Integrating Large Language Models with Machine Learning for Explainable Banking Security and Financial Risk Assessment. (2025). International Interdisciplinary Business Economics Advancement Journal, 6(11), 8-18. https://doi.org/10.55640/business/volume06issue11-02