Effective credit scoring has become a crucial factor in gaining competitive advantage in the credit market, for both customers and corporations. In this paper, we propose a credit scoring method that combines a non-kernel fuzzy 2-norm quadratic surface SVM model, a T-test feature weighting strategy, and fuzzy within-class scatter. The new method saves computational time by avoiding the choice of a kernel and its parameters, which classical SVM models require, and it also addresses the "curse of dimensionality" and improves robustness. In addition, we develop an efficient way to calculate the fuzzy membership of each training point by solving a linear programming problem. Finally, we conduct numerical tests on two benchmark data sets of personal credit and one real-world data set of corporate credit. The numerical results demonstrate that the proposed method outperforms eight state-of-the-art and commonly used credit scoring methods in terms of accuracy and robustness.
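The feature weighting idea mentioned above can be illustrated with a minimal sketch. It assumes that each feature's weight is the absolute two-sample (Welch) t-statistic between the two classes, normalized so the weights sum to one. The exact weighting formula used by the proposed method is not stated in this abstract, so the function name `t_test_feature_weights`, the normalization, and the toy data are all illustrative assumptions.

```python
from math import sqrt
from statistics import mean, variance


def t_test_feature_weights(X, y):
    """Return one non-negative weight per feature, summing to 1.

    Assumption: weight_j is proportional to |t_j|, the Welch t-statistic
    of feature j between class +1 and class -1.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        # Standard error of the difference of the two class means.
        se = sqrt(variance(a) / len(a) + variance(b) / len(b))
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    total = sum(scores)
    if total == 0:
        # No feature separates the classes: fall back to uniform weights.
        return [1.0 / n_features] * n_features
    return [s / total for s in scores]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]
y = [1, 1, 1, -1, -1, -1]
weights = t_test_feature_weights(X, y)
```

In this sketch the discriminative feature receives nearly all of the weight, so a downstream classifier that scales features by these weights would pay little attention to the uninformative one.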
Table 1. Credit Data Sets
data set | # of features | Class $C_1$ name | Class $C_1$ # of points | Class $C_2$ name | Class $C_2$ # of points |
German | 20 | Creditworthy | 700 | Non-creditworthy | 300 |
Australian | 14 | Non-default | 383 | Default | 307 |
Chinese | 7 | Good credit | 58 | Bad credit | 48 |
Table 2. German Credit Data Test
model | mean misclassification rate (%) | std of misclassification rate (%) | CPU time (s) |
LOG_REG | 23.04 | 0.35 | 0.14 |
FFBP_NN | 24.30 | 0.57 | 3.83 |
SVM_GausKer | 24.31 | 0.71 | 3.30 |
W2NSVM_GausKer | 23.85 | 0.56 | 5.72 |
W2NSVM_QuadKer | 23.92 | 0.81 | 5.36 |
FSVMWCS_GausKer | 23.42 | 1.84 | 6.87 |
Clu_SVM | 24.49 | 0.71 | 0.25 |
Dagher's QSVM | 24.26 | 0.62 | 4.63 |
SQSSVM | 23.86 | 0.59 | 2.82 |
FNKSVM-FWS | 21.36 | 0.51 | 4.23 |
Table 3. Australian Credit Data Test
model | mean misclassification rate (%) | std of misclassification rate (%) | CPU time (s) |
LOG_REG | 13.56 | 0.27 | 0.12 |
FFBP_NN | 14.42 | 1.16 | 2.72 |
SVM_GausKer | 15.00 | 1.06 | 1.30 |
W2NSVM_GausKer | 14.87 | 0.53 | 2.73 |
W2NSVM_QuadKer | 14.59 | 0.46 | 3.01 |
FSVMWCS_GausKer | 14.63 | 3.68 | 3.75 |
Clu_SVM | 14.34 | 0.53 | 0.16 |
Dagher's QSVM | 26.42 | 1.23 | 1.63 |
SQSSVM | 14.57 | 0.57 | 0.80 |
FNKSVM-FWS | 11.96 | 0.43 | 1.56 |
Table 4. Chinese Credit Data Test
model | mean misclassification rate (%) | std of misclassification rate (%) | CPU time (s) |
LOG_REG | 7.56 | 0.57 | 0.235 |
FFBP_NN | 24.01 | 2.25 | 4.412 |
SVM_GausKer | 13.75 | 0.90 | 0.034 |
W2NSVM_GausKer | 12.13 | 1.89 | 0.053 |
W2NSVM_QuadKer | 12.07 | 2.01 | 0.062 |
FSVMWCS_GausKer | 21.18 | 2.88 | 0.063 |
Clu_SVM | 10.96 | 0.55 | 0.048 |
Dagher's QSVM | 11.24 | 2.33 | 0.087 |
SQSSVM | 10.87 | 1.96 | 0.056 |
FNKSVM-FWS | 8.50 | 0.51 | 0.083 |
Table 5. Robustness of Models on Australian Credit Data
model | mean misclassification rate without outliers (%) | mean misclassification rate with outliers (%) |
LOG_REG | 13.56 | 17.87 |
FFBP_NN | 14.42 | 15.94 |
SVM_GausKer | 15.00 | 15.80 |
W2NSVM_GausKer | 14.87 | 15.65 |
W2NSVM_QuadKer | 14.59 | 15.36 |
FSVMWCS_GausKer | 14.63 | 18.43 |
Clu_SVM | 14.34 | 17.84 |
Dagher's QSVM | 26.42 | 53.21 |
SQSSVM | 14.57 | 15.58 |
FNKSVM-FWS | 11.96 | 12.61 |
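One way to read Table 5 is to compute, for each model, how many percentage points the mean misclassification rate rises once outliers are added. The sketch below does that arithmetic on the values copied from the table. Treating this difference as the robustness measure is an assumption made here for illustration, and the names `table5` and `degradation` are hypothetical.

```python
# (model, mean rate without outliers in %, mean rate with outliers in %), from Table 5
table5 = [
    ("LOG_REG", 13.56, 17.87),
    ("FFBP_NN", 14.42, 15.94),
    ("SVM_GausKer", 15.00, 15.80),
    ("W2NSVM_GausKer", 14.87, 15.65),
    ("W2NSVM_QuadKer", 14.59, 15.36),
    ("FSVMWCS_GausKer", 14.63, 18.43),
    ("Clu_SVM", 14.34, 17.84),
    ("Dagher's QSVM", 26.42, 53.21),
    ("SQSSVM", 14.57, 15.58),
    ("FNKSVM-FWS", 11.96, 12.61),
]


def degradation(rows):
    """Map each model to its increase in misclassification rate (percentage points)."""
    return {name: round(with_o - without_o, 2) for name, without_o, with_o in rows}


deltas = degradation(table5)
# Model whose error rate grows the least when outliers are present.
most_robust = min(deltas, key=deltas.get)
# Model whose error rate grows the most.
least_robust = max(deltas, key=deltas.get)

for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name:<16} +{delta:.2f}")
```

Under this reading, FNKSVM-FWS shows the smallest increase and Dagher's QSVM the largest, which is consistent with the robustness claim in the abstract.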