April  2017, 13(2): 609-622. doi: 10.3934/jimo.2016035

A new semi-supervised classifier based on maximum vector-angular margin

College of Science, China Agricultural University, No.17 Tsing Hua East Road, Hai Dian District, Beijing 100083, China

* Corresponding author: Liming Yang

Received December 2014; Revised December 2015; Published May 2016

Semi-supervised learning is an attractive approach to classification problems when the available training information is insufficient. In this investigation, a new semi-supervised classifier based on the concept of maximum vector-angular margin (called S$^3$MAMC) is proposed. Its main goal is to find an optimal vector $c$ as close as possible to the center of the dataset consisting of both labelled and unlabelled samples. This gives S$^3$MAMC better generalization with a smaller VC (Vapnik-Chervonenkis) dimension. However, the S$^3$MAMC formulation is a non-convex model and is therefore difficult to solve. We present two optimization algorithms for solving S$^3$MAMC: a mixed integer quadratic program (MIQP) algorithm and a DC (difference of convex functions) program algorithm. Numerical experiments on real and synthetic databases demonstrate that, compared with supervised learning methods, S$^3$MAMC can improve generalization when the labelled samples are relatively few. In addition, S$^3$MAMC achieves competitive generalization results compared with traditional semi-supervised classification methods.
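To illustrate the general idea behind a DC (difference of convex functions) algorithm, the following is a minimal sketch on a toy one-dimensional problem. It is not the S$^3$MAMC model, whose DC decomposition is not given in this abstract; the objective $f(x) = x^4 - 2x^2$, the split $g(x) = x^4$, $h(x) = 2x^2$, and the function names `f` and `dca` are all assumptions chosen for illustration. Each iteration linearizes the concave part $-h$ at the current point and solves the resulting convex subproblem, which here has a closed form.

```python
import math


def f(x):
    """Toy non-convex objective f(x) = x**4 - 2*x**2, with minima at x = -1 and x = 1."""
    return x**4 - 2 * x**2


def dca(x0, tol=1e-12, max_iter=200):
    """Generic DCA iteration for f = g - h with g(x) = x**4 and h(x) = 2*x**2.

    Both g and h are convex. This decomposition is an illustrative assumption,
    not the one used in the paper.
    """
    x = x0
    for _ in range(max_iter):
        # Step 1: take the gradient of h at the current iterate.
        y = 4.0 * x
        # Step 2: solve the convex subproblem min_x g(x) - y*x.
        # Its optimality condition is 4*x**3 = y, so x is the real cube root of y/4.
        x_new = math.copysign(abs(y / 4.0) ** (1.0 / 3.0), y)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


if __name__ == "__main__":
    for start in (0.1, -0.5):
        x_star = dca(start)
        print(f"start={start:+.2f} -> x*={x_star:+.6f}, f(x*)={f(x_star):.6f}")
```

Starting from a positive point the iterates move toward $x = 1$, and from a negative point toward $x = -1$. This reflects a general property of DCA: it decreases the objective monotonically but only reaches a critical point that depends on the starting point.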

Citation: Liming Yang, Yannan Chao. A new semi-supervised classifier based on maximum vector-angular margin. Journal of Industrial & Management Optimization, 2017, 13 (2) : 609-622. doi: 10.3934/jimo.2016035

Figure 1.  ACC versus µ=1 for ν=1 and 10 on Thyroid data
Figure 2.  Training samples of the synthetic data
Figure 3.  Comparison of DCA-S3MAMC and MIQP-S3MAMC in terms of CPU-time
Figure 4.  Comparison of DCA-S3MAMC and MIQP-S3MAMC in terms of ACC
Table 1.  Comparison of S$^3$MAMC, MAMC and $\nu$-SVM with the ratio of labelled to unlabelled samples being 2:8 in terms of generalization
| Data | Classification model | G-ACC (%) | ACC (%) | MCC (%) | $F_1$-measure (%) |
|---|---|---|---|---|---|
| Wine $(107 \times 13)$ | DCA-S$^3$MAMC | 100 | 100 | 100 | 100 |
| | MIQP-S$^3$MAMC | 100 | 100 | 100 | 100 |
| | MAMC | 94.31 | 94.39 | 89.06 | 94.16 |
| | $\nu$-SVM | 99.30 | 99.30 | 98.62 | 99.31 |
| Thyroid $(65 \times 5)$ | DCA-S$^3$MAMC | 99.81 | 99.81 | 99.63 | 99.81 |
| | MIQP-S$^3$MAMC | 95.55 | 95.65 | 91.64 | 95.45 |
| | MAMC | 94.87 | 95.00 | 90.45 | 95.24 |
| | $\nu$-SVM | 87.94 | 88.34 | 77.76 | 89.23 |
| Cancer $(569 \times 30)$ | DCA-S$^3$MAMC | 92.69 | 93.76 | 86.73 | 93.49 |
| | MIQP-S$^3$MAMC | 96.35 | 96.35 | 91.20 | 96.16 |
| | MAMC | 92.26 | 93.36 | 86.03 | 93.02 |
| | $\nu$-SVM | 91.27 | 91.46 | 85.30 | 90.93 |
| Sonar $(208 \times 60)$ | DCA-S$^3$MAMC | 62.40 | 62.40 | 24.81 | 62.65 |
| | MIQP-S$^3$MAMC | 62.42 | 63.02 | 26.44 | 65.98 |
| | MAMC | 60.11 | 60.41 | 20.97 | 62.67 |
| | $\nu$-SVM | 60.11 | 60.19 | 20.40 | 58.99 |
| Ionosphere $(350 \times 34)$ | DCA-S$^3$MAMC | 86.65 | 87.61 | 75.00 | 87.98 |
| | MIQP-S$^3$MAMC | 85.34 | 86.69 | 75.91 | 88.20 |
| | MAMC | 80.94 | 81.25 | 63.14 | 82.49 |
| | $\nu$-SVM | 86.34 | 86.74 | 74.51 | 87.76 |
| Hepatitis $(155 \times 19)$ | DCA-S$^3$MAMC | 72.82 | 73.66 | 48.51 | 76.28 |
| | MIQP-S$^3$MAMC | 78.26 | 78.65 | 58.00 | 80.19 |
| | MAMC | 70.60 | 71.76 | 45.04 | 74.98 |
| | $\nu$-SVM | 70.00 | 71.21 | 43.93 | 74.53 |
| Heart $(155 \times 19)$ | DCA-S$^3$MAMC | 86.39 | 86.40 | 72.80 | 86.54 |
| | MIQP-S$^3$MAMC | 84.72 | 84.79 | 69.76 | 85.31 |
| | MAMC | 81.82 | 81.86 | 63.83 | 82.35 |
| | $\nu$-SVM | 84.70 | 84.72 | 69.46 | 84.95 |
| Vote $(432 \times 16)$ | DCA-S$^3$MAMC | 94.78 | 94.81 | 89.69 | 94.91 |
| | MIQP-S$^3$MAMC | 95.61 | 95.62 | 91.29 | 95.69 |
| | MAMC | 90.11 | 90.29 | 81.07 | 90.80 |
| | $\nu$-SVM | 94.51 | 94.54 | 89.09 | 94.60 |
| Synthesis $(200 \times 2)$ | DCA-S$^3$MAMC | 92.50 | 92.50 | 85.00 | 92.54 |
| | MIQP-S$^3$MAMC | 93.25 | 93.25 | 86.50 | 93.23 |
| | MAMC | 86.45 | 86.49 | 73.07 | 86.82 |
| | $\nu$-SVM | 82.50 | 82.50 | 65.00 | 82.41 |
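The four measures reported in Table 1 can all be derived from a binary confusion matrix. The sketch below shows one common way to compute them. The page does not define G-ACC, so treating it as the geometric mean of sensitivity and specificity is an assumption, as are the function name `binary_metrics` and the example counts. The code also assumes both classes are present, so that no denominator for sensitivity or specificity is zero.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute ACC, G-ACC, MCC and F1 from binary confusion-matrix counts.

    G-ACC is assumed here to be sqrt(sensitivity * specificity);
    the source page does not state its definition.
    """
    total = tp + fn + fp + tn
    acc = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    g_acc = math.sqrt(sensitivity * specificity)
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report MCC as 0 when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {"ACC": acc, "G-ACC": g_acc, "MCC": mcc, "F1": f1}


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper's experiments.
    for name, value in binary_metrics(tp=40, fn=10, fp=5, tn=45).items():
        print(f"{name}: {100 * value:.2f}%")
```

With balanced classes and symmetric errors, MCC is approximately $2 \cdot \text{ACC} - 1$, which matches the pattern in several rows of the table (for example, ACC 92.50 and MCC 85.00 on the synthetic data).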
Table 2.  Comparison of S$^3$MAMC, MAMC and $\nu$-SVM with the ratio of labelled to unlabelled samples being 1:9 in terms of accuracy (ACC)
| Data | DCA-S$^3$MAMC (%) | MAMC (%) | $\nu$-SVM (%) |
|---|---|---|---|
| Thyroid | 92.59 | 85.19 | 86.29 |
| Ionosphere | 83.37 | 71.43 | 71.74 |
| Sonar | 60.45 | 55.56 | 53.89 |
| Cancer | 93.15 | 89.88 | 84.52 |
| Heart | 85.19 | 74.31 | 75.49 |
| Hepatitis | 73.33 | 64.44 | 71.11 |
| Vote | 93.65 | 86.95 | 89.68 |
| Synthesis | 91.56 | 70.39 | 63.66 |
Table 3.  Comparisons of the S$^3$MAMC with other semi-supervised learning methods by accuracy (ACC)
| Data | MIQP-S$^3$MAMC (%) | DCA-S$^3$MAMC (%) | MILP-S$^3$VM (%) | VS$^3$VM (%) |
|---|---|---|---|---|
| Ionosphere | 86.69 | 87.61 | 89.40 | 87.36 |
| Sonar | 63.02 | 62.40 | 78.10 | 66.12 |
| Cancer | 96.35 | 93.76 | 96.60 | 97.46 |
| Heart | 84.79 | 86.40 | 84.00 | 84.70 |
| Hepatitis | 78.65 | 73.66 | 70.36 | 65.13 |
| Synthesis | 93.25 | 92.50 | 81.11 | 85.67 |