February 2019, 2(1): 11-28. doi: 10.3934/mfc.2019002

Comparisons of different methods for balanced data classification under the discrete non-local total variational framework

1. College of Computer Science and Technology, Qingdao University, Qingdao 266071, China
2. College of Business, Qingdao University, Qingdao 266071, China

* Corresponding author: Zhenkuan Pan

Published March 2019

Balanced constraints can prevent the trivial solutions that arise when data are classified via the minimum cut method, so many techniques with different balancing strategies have been proposed to improve classification accuracy. However, their performances have not yet been compared comprehensively. In this paper, we investigate seven balanced classification methods under the discrete non-local total variational framework and compare their accuracy on graphs. We study the two-class classification problem with equality constraints, inequality constraints, and the Ratio Cut, Normalized Cut and Cheeger Cut models. For the equality-constraint case, we first compare the Penalty Function Method (PFM) and the Augmented Lagrangian Method (ALM), both of which transform the constrained problem into an unconstrained one, to show the advantages of ALM. To make the comparison fair, all the remaining models are also solved with ALM, using the same proportion of fidelity points and the same neighborhood size on the graph. Experimental results demonstrate that ALM with the equality balanced constraint achieves the best classification accuracy among the seven methods.
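The PFM-versus-ALM contrast described above can be illustrated with a small sketch (our own construction, not the paper's algorithm, with a quadratic graph energy standing in for the non-local total variation term). Both methods turn the equality balance constraint into unconstrained subproblems, but ALM's multiplier updates drive the constraint violation to essentially zero at a fixed penalty weight, while PFM leaves a residual that shrinks only as the weight grows:

```python
# Toy comparison of the Penalty Function Method (PFM) and the Augmented
# Lagrangian Method (ALM) for an equality balance constraint:
#   minimize  0.5*u'Lu + 0.5*||u - f||^2   subject to  sum(u) = 0,
# where L is a graph Laplacian (a quadratic stand-in for non-local TV).
import numpy as np

def graph_laplacian(W):
    """L = D - W for a symmetric, nonnegative weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def solve_pfm(L, f, r):
    """PFM: add the penalty (r/2)*(sum(u))^2 and solve the quadratic once."""
    n = len(f)
    A = L + np.eye(n) + r * np.ones((n, n))
    return np.linalg.solve(A, f)

def solve_alm(L, f, r, iters=50):
    """ALM: add lam*sum(u) + (r/2)*(sum(u))^2, alternating u-solves with
    the multiplier ascent update lam <- lam + r*sum(u)."""
    n = len(f)
    ones = np.ones(n)
    A = L + np.eye(n) + r * np.outer(ones, ones)
    lam = 0.0
    for _ in range(iters):
        u = np.linalg.solve(A, f - lam * ones)
        lam += r * u.sum()
    return u

n, r = 8, 10.0
W = np.ones((n, n)) - np.eye(n)          # complete graph with unit weights
L = graph_laplacian(W)
f = np.linspace(-1.0, 1.0, n) + 0.3      # nonzero mean makes the constraint bind

u_pfm = solve_pfm(L, f, r)
u_alm = solve_alm(L, f, r)
print(f"PFM violation |sum(u)| = {abs(u_pfm.sum()):.4f}")  # sum(f)/(1+r*n) = 0.0296
print(f"ALM violation |sum(u)| = {abs(u_alm.sum()):.2e}")  # near machine precision
```

At the same penalty weight r, PFM's constraint residual is sum(f)/(1 + r*n), which vanishes only as r grows (and large r makes the subproblems ill-conditioned), whereas ALM's multiplier iteration removes the residual geometrically. This is the generic mechanism behind the ALM advantage the paper reports for the equality-constrained model.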

Citation: Shixiu Zheng, Zhilei Xu, Huan Yang, Jintao Song, Zhenkuan Pan. Comparisons of different methods for balanced data classification under the discrete non-local total variational framework. Mathematical Foundations of Computing, 2019, 2 (1) : 11-28. doi: 10.3934/mfc.2019002
References:
[1]

E. Bae and E. Merkurjev, Convex variational methods on graphs for multiclass segmentation of high-dimensional data and point clouds, Journal of Mathematical Imaging & Vision, 58 (2017), 468-493.  doi: 10.1007/s10851-017-0713-9.

[2]

A. Ben-Dor, G. Lancia, J. Perone et al., Banishing bias from consensus sequences, Symposium on Combinatorial Pattern Matching, Springer-Verlag, 1264 (1997), 247-261.  doi: 10.1007/3-540-63220-4_63.

[3]

X. Bresson, X.-C. Tai, T. F. Chan and A. Szlam, Multi-class transductive learning based on L1 relaxations of Cheeger cut and Mumford-Shah-Potts model, Journal of Mathematical Imaging and Vision, 49 (2014), 191-201.  doi: 10.1007/s10851-013-0452-5.

[4]

T. Chan and L. Vese, An active contour model without edges, International Conference on Scale-Space Theories in Computer Vision. Springer-Verlag, (1999), 141-151. doi: 10.1007/3-540-48236-9_13.

[5]

J. Cheeger, A lower bound for the smallest eigenvalue of the Laplacian, Problems in Analysis, (1970), 195-199.

[6]

D. L. Donoho, De-noising by soft-thresholding, IEEE Transactions on Information Theory, 41 (1995), 613-627.  doi: 10.1109/18.382009.

[7]

A. Elmoataz, O. Lezoray and S. Bougleux, Nonlocal discrete regularization on weighted graphs: A framework for image and manifold processing, IEEE Trans. Image Processing, 17 (2008), 1047-1060.  doi: 10.1109/TIP.2008.924284.

[8]

D. R. Fulkerson, Flows in networks, Recent Advances in Mathematical Programming, (1963), 319-331.

[9]

G. Gilboa and S. Osher, Nonlocal operators with applications to image processing, Multiscale Modeling & Simulation, 7 (2008), 1005-1028.  doi: 10.1137/070698592.

[10]

G. Gilboa and S. Osher, Nonlocal linear image regularization and supervised segmentation, Multiscale Modeling & Simulation, 6 (2007), 595-630.  doi: 10.1137/060669358.

[11]

R. Glowinski, T. W. Pan and X. C. Tai, Some facts about operator splitting and alternating direction methods, Splitting Methods in Communication, Imaging, Science, and Engineering, 19-94, Sci. Comput., Springer, Cham, 2016.

[12]

L. Hagen and A. B. Kahng, New spectral methods for ratio cut partitioning and clustering, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 11 (1992), 1074-1085.  doi: 10.1109/43.159993.

[13]

M. Hein and S. Setzer, Beyond spectral clustering: Tight relaxations of balanced graph cuts, in Advances in Neural Information Processing Systems (NIPS), (2011).

[14]

M. Hein and S. Setzer, Beyond spectral clustering: Tight relaxations of balanced graph cuts, in J. Shawe-Taylor, R. S. Zemel, P. Bartlett, F. C. N. Pereira and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems, 24 (2011), 2366-2374.

[15]

U. von Luxburg, A tutorial on spectral clustering, Statistics and Computing, 17 (2007), 395-416.  doi: 10.1007/s11222-007-9033-z.

[16]

M. Meila and J. Shi, Learning segmentation by random walks, Advances in Neural Information Processing Systems, 13 (2000), 873-879.

[17]

E. Merkurjev, A. L. Bertozzi, X. Yan and K. Lerman, Modified Cheeger and ratio cut methods using the Ginzburg-Landau functional for classification of high-dimensional data, Inverse Problems, 33 (2017), 074003, 24pp. doi: 10.1088/1361-6420/33/7/074003.

[18]

E. Merkurjev, E. Bae, A. L. Bertozzi and X. C. Tai, Global binary optimization on graphs for classification of high-dimensional data, Journal of Mathematical Imaging and Vision, 52 (2015), 414-435.  doi: 10.1007/s10851-015-0567-y.

[19]

L. I. Rudin, S. Osher and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D: Nonlinear Phenomena, 60 (1992), 259-268.  doi: 10.1016/0167-2789(92)90242-F.

[20]

J. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (2000), 888-905. 

[21]

A. Szlam and X. Bresson, A total variation-based graph clustering algorithm for Cheeger ratio cuts, International Conference on Machine Learning (ICML), (2010), 1039-1046.

[22]

A. N. Tikhonov, Regularization of incorrectly posed problems, Soviet Math. Dokl., 4 (1963), 1624-1627.

[23]

D. Wagner and F. Wagner, Between min cut and graph bisection, Mathematical Foundations of Computer Science 1993 (Gdańsk, 1993), 711 (1993), 744-750.  doi: 10.1007/3-540-57182-5_65.

[24]

Y. Z. Zhang, Y. Jiang and Z. F. Pang, Cheeger cut model for the balanced data classification problem, Advanced Materials Research, 765-767 (2013), 730-734.

[25]

D. Zhou and B. Schölkopf, Regularization on discrete spaces, Joint Pattern Recognition Symposium, 3663 (2005), 361-368.  doi: 10.1007/11550518_45.

[26]

X. Zhu and A. B. Goldberg, Introduction to semi-supervised learning, Morgan & Claypool Publishers, (2009), 130pp. doi: 10.2200/S00196ED1V01Y200906AIM006.


Figure 1.  Reference basis of three data sets
Figure 2.  Reference basis of three data sets
Figure 3.  Different results for two-moon dataset classification
Figure 4.  Different results for handwritten digits classification of 3 & 8 dataset
Figure 5.  Different results for handwritten digit classification of 4 & 9 dataset
Table 0.  Different algorithms' parameters

| Parameter | EC-PFM | EC-ALM | SDIC | DDIC | RC | CC | NC |
|---|---|---|---|---|---|---|---|
| $ \mu_0 $ | 0.01 | 0.005 | $ \frac{3k}{n} $ | $ \frac{3k}{n}\cdot n $ | 0.01 | | |
| $ \mu_1 $ | 15 | 25 | 20 | 15 | 15 | 12 | 15 |
Table 1.  Accuracy comparisons of different algorithms for two-moon dataset classification

| Method | $ V_1 $ class | $ V_2 $ class | Error | Ranking |
|---|---|---|---|---|
| Solution | 1000 | 1000 | | |
| EC-PFM | 1014 | 986 | 1.70% | 5 |
| EC-ALM | 1005 | 995 | 1.25% | 1 |
| SDIC | 1016 | 984 | 1.50% | 2 |
| DDIC | 1015 | 985 | 1.55% | 3 |
| RC | 1030 | 970 | 1.70% | 5 |
| CC | 1021 | 979 | 1.65% | 4 |
| NC | 986 | 1014 | 1.90% | 7 |
Table 2.  Accuracy comparisons of different algorithms for handwritten digit classification of 3 & 8 dataset

| Method | $ V_1 $ class | $ V_2 $ class | Error | Ranking |
|---|---|---|---|---|
| Solution | 7141 | 6825 | | |
| EC-PFM | 7165 | 6801 | 1.2745% | 6 |
| EC-ALM | 7139 | 6827 | 1.1385% | 1 |
| SDIC | 7128 | 6838 | 1.1671% | 3 |
| DDIC | 7161 | 6805 | 1.2244% | 4 |
| RC | 7173 | 6793 | 1.2387% | 5 |
| CC | 7134 | 6832 | 1.1600% | 2 |
| NC | 7167 | 6799 | 1.5538% | 7 |
Table 3.  Accuracy comparisons of different algorithms for handwritten digit classification of 4 & 9 dataset

| Method | $ V_1 $ class | $ V_2 $ class | Error | Ranking |
|---|---|---|---|---|
| Solution | 6824 | 6958 | | |
| EC-PFM | 6814 | 6968 | 1.2988% | 4 |
| EC-ALM | 6818 | 6964 | 1.2335% | 1 |
| SDIC | 6799 | 6983 | 1.3133% | 5 |
| DDIC | 6811 | 6971 | 1.3206% | 6 |
| RC | 6836 | 6946 | 1.2698% | 3 |
| CC | 6787 | 6995 | 1.2407% | 2 |
| NC | 6807 | 6975 | 1.4729% | 7 |
Table 4.  The dependence of error rate on fidelity set size

| Two-moons: fidelity set size (per class) | Error rate | 3&8: fidelity set size (per class) | Error rate |
|---|---|---|---|
| 100 | 1.20% | 350 | 1.1287% |
| 50 | 1.25% | 300 | 1.1385% |
| 40 | 1.65% | 250 | 1.2888% |
| 30 | 1.80% | 200 | 1.3604% |
| 26 | 1.85% | 150 | 1.6397% |
| 20 | 2.45% | 100 | 3.8665% |
| 16 | 2.55% | 90 | 4.1386% |
| 10 | 3.25% | 70 | 9.8597% |
| 6 | 4.20% | 50 | 11.2201% |
Table 5.  Dependence of error rate on the DDIC range parameter

| $ \varepsilon $ value | $ V_1 $ class | $ V_2 $ class | Error rate |
|---|---|---|---|
| 5 | 7153 | 6813 | 1.2316% |
| 10 | 7160 | 6806 | 1.2387% |
| 15 | 7121 | 6845 | 1.2459% |
| 20 | 7159 | 6807 | 1.2602% |
| 30 | 7160 | 6806 | 1.2576% |
| 50 | 7161 | 6805 | 1.2724% |