American Institute of Mathematical Sciences

• Previous Article
Optimal control of viral infection model with saturated infection rate
• NACO Home
• This Issue
• Next Article
A robust optimization model for sustainable and resilient closed-loop supply chain network design considering conditional value at risk
doi: 10.3934/naco.2020024

Discriminant analysis of regularized multidimensional scaling

 1 Department of Mathematics, University of Dhaka, Bangladesh 2 School of Mathematics, University of Southampton, UK

Received  October 2019 Revised  January 2020 Published  March 2020

Fund Project: This research was supported by the Commonwealth Scholarship BDCS-2012-44

Regularized Multidimensional Scaling with Radial basis function (RMDS) is a nonlinear variant of classical Multi-Dimensional Scaling (cMDS). A key issue that has been addressed in RMDS is the effective selection of centers of the radial basis functions that plays a very important role in reducing the dimension preserving the structure of the data in higher dimensional space. RMDS uses data in unsupervised settings that means RMDS does not use any prior information of the dataset. This article is concerned on the supervised setting. Here we have incorporated the class information of some members of data to the RMDS model. The class separability term improved the method RMDS significantly and also outperforms other discriminant analysis methods such as Linear discriminant analysis (LDA) which is documented through numerical experiments.

Citation: Sohana Jahan. Discriminant analysis of regularized multidimensional scaling. Numerical Algebra, Control & Optimization, doi: 10.3934/naco.2020024
References:
 [1] A. Argyriou, T. Evgeniou and M. Pontil, Multi-task Feature Learning, in Advances in Neural Information Processing Systems (eds. B. Schoelkopf, J. Platt, and T. Hoffman), MIT Press, 2007. Google Scholar [2] A. Argyriou, T. Evgeniou and M. Pontil, Convex Multi-task Feature Learning, Machine Learning, Special Issue on Inductive Transfer Learning, 73 (2008), 243-272.   Google Scholar [3] J. Bénasséni, Partial additive constant, J. Statist. Comput. Simul., 49 (1994), 179-193.   Google Scholar [4] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling. Theory and Applications, 2$^{nd}$ edition, Springer Series in Statistics, Springer, 2005.  Google Scholar [5] F. Cailliez, The analytical solution of the additive constant problem, Psychometrika, 48 (1983), 305-308.  doi: 10.1007/BF02294026.  Google Scholar [6] H. G. Chew and C. C. Lim, On regularisation parameter transformation of support vector machines, Journal of Industrial and Management Optimization, 5 (2009), 403-415.  doi: 10.3934/jimo.2009.5.403.  Google Scholar [7] L. G. Cooper, A new solution to the additive constant problem in metric and multidimensional scaling, Psychometrika, 37 (1972), 311-321.   Google Scholar [8] T. F. Cox and M. A. Cox, Multidimensional Scaling, 2$^{nd}$ edition, Chapman and Hall/CRC, 2002.  Google Scholar [9] T. F. Cox and G. Ferry, Discriminant analysis using nonmetric multidimensional scaling, Pattern Recognition, 26 (1993), 145-153.   Google Scholar [10] J. de Leeuw, Applications of convex analysis to multidimensional scaling, in Recent Developments in Statistics (eds. J. Barra, F. Brodeau, G. Romier, and B. van Cutsen), North Holland Publishing Company, Amsterdem, The Netherlands, 133–145. Google Scholar [11] J. de Leeuw, Block relaxation algorithms in statistics, in Information Systems and Data Analysis (eds. Bock, H.H. et al.), Springer, Berlin, (1994), 308–325. Google Scholar [12] W. Glunt, T. L. Hayden, S. Hong and J. Wells, An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM J. Matrix Anal. Appl., 11 (1990), 589-600.  doi: 10.1137/0611042.  Google Scholar [13] W. Glunt, T. L. Hayden and R. Raydan, Molecular conformations from distance matrices, J. Computational Chemistry, 14 (1993), 114-120.   Google Scholar [14] J. C. Gower, Some distance properties of latent rootand vector methods in multivariate analysis, Biometrika, 53 (1966), 315-328.  doi: 10.1093/biomet/53.3-4.325.  Google Scholar [15] Y. Hao and F. Meng, A new method on gene selection for tissue classification, Journal of Industrial and Management Optimization, 3 (2007), 739-748.  doi: 10.3934/jimo.2007.3.739.  Google Scholar [16] W. L. G. Koontz and K. Fukunaga, A nonlinear feature extraction algorithm using distance information, IEEE Transactions on Computers, 21 (1972), 56-63.   Google Scholar [17] J. Kruskal, Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika, 29 (1964), 1-27.  doi: 10.1007/BF02289565.  Google Scholar [18] T. Li, S. Zhu and M. Ogihara, Using discriminant analysis for multi-class classification: an experimental investigation, Knowl Inf Syst., 10 (2006), 453-472.  doi: 10.1007/s10115-006-0013-y.  Google Scholar [19] D. Lowe, Novel topographic nonlinear feature extraction using radial basis functions for concentration coding in the artificial nose, IEEE International Conference on Artificial Neural Networks, (1993), 95–99. Google Scholar [20] K. V. Mardia, J. T. Kent and J. M. Bibby, Multivariate Analysis, $10{^th}$ printing, Academic Press, 1995.   Google Scholar [21] A. M. Martinez and A. C. Kak, PCA versus LDA, IEEE Transactions on Pattern Analysis and Machine Intelligence, 23 (2001), 228-233.  doi: 10.1109/34.908974.  Google Scholar [22] S. J. Messick and R. P Abelson, The additive constant problem in multidimensional scaling, Psychometrika, 21 (1956), 1-15.   Google Scholar [23] E. Pȩkalaska and R. P. W. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Application, Series in Machine Perception Artificial Intelligence 64, World Scientific, 2005. Google Scholar [24] H.-D. Qi, A semismooth Newton method for the nearest Euclidean distance matrix problem, SIAM Journal Matrix Analysis and Applications, 34 (2013), 67-93.  doi: 10.1137/110849523.  Google Scholar [25] H.-D. Qi and N. Xiu, A convex quadratic semidefinite programming approach to the partial additive constant problem in multidimensional scaling, Journal of Statistical Computation and Simulation, 82 (2012), 1317-1336.  doi: 10.1080/00949655.2011.579970.  Google Scholar [26] H.-D. Qi, N. H. Xiu and X. M. Yuan, A Lagrangian dual approach to the single source localization problem, IEEE Transactions on Signal Processing, 61 (2013), 3815-3826.  doi: 10.1109/TSP.2013.2264814.  Google Scholar [27] H.-D. Qi and X. M. Yuan, Computing the nearest Euclidean distance matrix with low embedding dimensions, Mathematical Programming, Ser. A, 147 (2014), 351-389.  doi: 10.1007/s10107-013-0726-0.  Google Scholar [28] K. Schittkowski, Optimal parameter selection in support vector machines, Journal of Industrial and Management Optimization, 1 (2005), 465-476.  doi: 10.3934/jimo.2005.1.465.  Google Scholar [29] I. J. Schoenberg, Remarks to Maurice Fréchet's article "Sur la définition axiomatque d'une classe d'espaces vectoriels distanciés applicbles vectoriellement sur l'espace de Hilbet, Ann. Math., 36 (1935), 724-732.  doi: 10.2307/1968654.  Google Scholar [30] S Jahanand and H. D. Qi, Regularized multidimensional scaling with radial basis functions, Journal of Industrial and Management Optimization, 12 (2016), 543-563.  doi: 10.3934/jimo.2016.12.543.  Google Scholar [31] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Elsevier Inc., 2009. Google Scholar [32] S. Theodoridis and K. Koutroumbas, An Introduction to Pattern Recognition, A MATLAB Approach, Elsevier Inc., 2010. Google Scholar [33] W. S. Torgerson, Theory and Methods for Scaling, Wiley, New York, 1958.  Google Scholar [34] A. R. Webb, Multidimensional Scaling by iterative majorization using radial basis functions, Pattern Recognition, 28 (1995), 753-759.   Google Scholar [35] A. R. Webb, Nonlinear feature extraction with radial basis functions using a weighted multidimensional scaling stress measure, Pattern Recognition, IEEE Conference Publications, 4 (1996), 635–639. Google Scholar [36] A. R. Webb, An approach to nonlinear principal component analysis using radially-symmetric kernel functions, Statistics and Computing, 6 (1996), 159-168.   Google Scholar [37] G. Young and A. S. Householder, Discussion of a set of points in terms of their mutual distances, Psychometrika, 3 (1938), 19-22.  doi: 10.1007/BF02288560.  Google Scholar [38] Y. Yuan, W. Fan and D. Pu, Spline function smooth support vector machine for classification, Journal of Industrial and Management Optimization, 3 (2007), 529-542.  doi: 10.3934/jimo.2007.3.529.  Google Scholar

show all references

References:
 [1] A. Argyriou, T. Evgeniou and M. Pontil, Multi-task Feature Learning, in Advances in Neural Information Processing Systems (eds. B. Schoelkopf, J. Platt, and T. Hoffman), MIT Press, 2007. Google Scholar [2] A. Argyriou, T. Evgeniou and M. Pontil, Convex Multi-task Feature Learning, Machine Learning, Special Issue on Inductive Transfer Learning, 73 (2008), 243-272.   Google Scholar [3] J. Bénasséni, Partial additive constant, J. Statist. Comput. Simul., 49 (1994), 179-193.   Google Scholar [4] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling. Theory and Applications, 2$^{nd}$ edition, Springer Series in Statistics, Springer, 2005.  Google Scholar [5] F. Cailliez, The analytical solution of the additive constant problem, Psychometrika, 48 (1983), 305-308.  doi: 10.1007/BF02294026.  Google Scholar [6] H. G. Chew and C. C. Lim, On regularisation parameter transformation of support vector machines, Journal of Industrial and Management Optimization, 5 (2009), 403-415.  doi: 10.3934/jimo.2009.5.403.  Google Scholar [7] L. G. Cooper, A new solution to the additive constant problem in metric and multidimensional scaling, Psychometrika, 37 (1972), 311-321.   Google Scholar [8] T. F. Cox and M. A. Cox, Multidimensional Scaling, 2$^{nd}$ edition, Chapman and Hall/CRC, 2002.  Google Scholar [9] T. F. Cox and G. Ferry, Discriminant analysis using nonmetric multidimensional scaling, Pattern Recognition, 26 (1993), 145-153.   Google Scholar [10] J. de Leeuw, Applications of convex analysis to multidimensional scaling, in Recent Developments in Statistics (eds. J. Barra, F. Brodeau, G. Romier, and B. van Cutsen), North Holland Publishing Company, Amsterdem, The Netherlands, 133–145. Google Scholar [11] J. de Leeuw, Block relaxation algorithms in statistics, in Information Systems and Data Analysis (eds. Bock, H.H. et al.), Springer, Berlin, (1994), 308–325. Google Scholar [12] W. Glunt, T. L. Hayden, S. Hong and J. Wells, An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM J. Matrix Anal. Appl., 11 (1990), 589-600.  doi: 10.1137/0611042.  Google Scholar [13] W. Glunt, T. L. Hayden and R. Raydan, Molecular conformations from distance matrices, J. Computational Chemistry, 14 (1993), 114-120.   Google Scholar [14] J. C. Gower, Some distance properties of latent rootand vector methods in multivariate analysis, Biometrika, 53 (1966), 315-328.  doi: 10.1093/biomet/53.3-4.325.  Google Scholar [15] Y. Hao and F. Meng, A new method on gene selection for tissue classification, Journal of Industrial and Management Optimization, 3 (2007), 739-748.  doi: 10.3934/jimo.2007.3.739.  Google Scholar [16] W. L. G. Koontz and K. Fukunaga, A nonlinear feature extraction algorithm using distance information, IEEE Transactions on Computers, 21 (1972), 56-63.   Google Scholar [17] J. Kruskal, Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika, 29 (1964), 1-27.  doi: 10.1007/BF02289565.  Google Scholar [18] T. Li, S. Zhu and M. Ogihara, Using discriminant analysis for multi-class classification: an experimental investigation, Knowl Inf Syst., 10 (2006), 453-472.  doi: 10.1007/s10115-006-0013-y.  Google Scholar [19] D. Lowe, Novel topographic nonlinear feature extraction using radial basis functions for concentration coding in the artificial nose, IEEE International Conference on Artificial Neural Networks, (1993), 95–99. Google Scholar [20] K. V. Mardia, J. T. Kent and J. M. Bibby, Multivariate Analysis, $10{^th}$ printing, Academic Press, 1995.   Google Scholar [21] A. M. Martinez and A. C. Kak, PCA versus LDA, IEEE Transactions on Pattern Analysis and Machine Intelligence, 23 (2001), 228-233.  doi: 10.1109/34.908974.  Google Scholar [22] S. J. Messick and R. P Abelson, The additive constant problem in multidimensional scaling, Psychometrika, 21 (1956), 1-15.   Google Scholar [23] E. Pȩkalaska and R. P. W. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Application, Series in Machine Perception Artificial Intelligence 64, World Scientific, 2005. Google Scholar [24] H.-D. Qi, A semismooth Newton method for the nearest Euclidean distance matrix problem, SIAM Journal Matrix Analysis and Applications, 34 (2013), 67-93.  doi: 10.1137/110849523.  Google Scholar [25] H.-D. Qi and N. Xiu, A convex quadratic semidefinite programming approach to the partial additive constant problem in multidimensional scaling, Journal of Statistical Computation and Simulation, 82 (2012), 1317-1336.  doi: 10.1080/00949655.2011.579970.  Google Scholar [26] H.-D. Qi, N. H. Xiu and X. M. Yuan, A Lagrangian dual approach to the single source localization problem, IEEE Transactions on Signal Processing, 61 (2013), 3815-3826.  doi: 10.1109/TSP.2013.2264814.  Google Scholar [27] H.-D. Qi and X. M. Yuan, Computing the nearest Euclidean distance matrix with low embedding dimensions, Mathematical Programming, Ser. A, 147 (2014), 351-389.  doi: 10.1007/s10107-013-0726-0.  Google Scholar [28] K. Schittkowski, Optimal parameter selection in support vector machines, Journal of Industrial and Management Optimization, 1 (2005), 465-476.  doi: 10.3934/jimo.2005.1.465.  Google Scholar [29] I. J. Schoenberg, Remarks to Maurice Fréchet's article "Sur la définition axiomatque d'une classe d'espaces vectoriels distanciés applicbles vectoriellement sur l'espace de Hilbet, Ann. Math., 36 (1935), 724-732.  doi: 10.2307/1968654.  Google Scholar [30] S Jahanand and H. D. Qi, Regularized multidimensional scaling with radial basis functions, Journal of Industrial and Management Optimization, 12 (2016), 543-563.  doi: 10.3934/jimo.2016.12.543.  Google Scholar [31] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Elsevier Inc., 2009. Google Scholar [32] S. Theodoridis and K. Koutroumbas, An Introduction to Pattern Recognition, A MATLAB Approach, Elsevier Inc., 2010. Google Scholar [33] W. S. Torgerson, Theory and Methods for Scaling, Wiley, New York, 1958.  Google Scholar [34] A. R. Webb, Multidimensional Scaling by iterative majorization using radial basis functions, Pattern Recognition, 28 (1995), 753-759.   Google Scholar [35] A. R. Webb, Nonlinear feature extraction with radial basis functions using a weighted multidimensional scaling stress measure, Pattern Recognition, IEEE Conference Publications, 4 (1996), 635–639. Google Scholar [36] A. R. Webb, An approach to nonlinear principal component analysis using radially-symmetric kernel functions, Statistics and Computing, 6 (1996), 159-168.   Google Scholar [37] G. Young and A. S. Householder, Discussion of a set of points in terms of their mutual distances, Psychometrika, 3 (1938), 19-22.  doi: 10.1007/BF02288560.  Google Scholar [38] Y. Yuan, W. Fan and D. Pu, Spline function smooth support vector machine for classification, Journal of Industrial and Management Optimization, 3 (2007), 529-542.  doi: 10.3934/jimo.2007.3.529.  Google Scholar
Fig. 1(a) shows the separation of the nonseparable two classes of iris data by support vector machine (SVM)algorithm. One class represented "+" and the other one is represented by "$\diamond$". Over $100$ runs, SRMDS model yielded at an average $1$ misclassified points, while the corresponding number for RMDS was Webb's model is $3$ to $5$. Fig. 1(b) shows SVM applied on 2-dimensional projection of Cancer data. Over $100$ runs, SRMDS model yielded $1$ to $3$ misclassified points. Here support vectors in each image is bounded by "O" and misclassified points (bounded by $\square$)
SVM on Seeds data projected in 2 dimensional space by $\text{RMDS}$ is shown in these figures. Where the separation of the classes are shown using multiclass classifier
Projected 2-dimensional Iris data, consisting of $3$ classes. One class represented by 'red +' is completely separated from the other two. Training points of nonseparable two classes are represented by 'green +' and 'blue +', whereas the testing points projected by SRMDS are represented by 'pink o' and 'red o'. Fig. 3(a) $\lambda = 0.1$. Fig. 3(b) $\lambda = 0.5$. Fig. 3(c)$\lambda = 0.9$, over $100$ random runs. The incresed value of $\lambda$ puts more weight on preservation of the structure of data
Projected 2-dimensional Cancer data, consisting of $2$ classes. Training points of the classes are represented by 'red +' and 'blue +', whereas the testing points of respective classes projected by SRMDS are represented by 'green o' and 'pink o' Fig. 4(a) $\lambda = 0.1$. Fig. 4(b) $\lambda = 0.5$. Fig. 4(c) $\lambda = 0.9$, over $100$ random runs. The incresed value of $\lambda$ puts more weight on preserving the structure of data
Average stress of dataset for different values of $\lambda$ over $100$ random runs. The incresed value of $\lambda$ reduces the stress
 Dataset Dim Class no. of ins. Source Iris 4 3 150 UCI Repository Cancer 9 2 683 UCI Repository Seeds 7 3 210 UCI Repository
 Dataset Dim Class no. of ins. Source Iris 4 3 150 UCI Repository Cancer 9 2 683 UCI Repository Seeds 7 3 210 UCI Repository
Numerical results obtained by applying SVM on three datasets projected using discriminant analysis
 Dataset MDS RMDS SRMDS Improvment over RMDS Iris Support vector 18 13 5 Missclassified points 6 3 0 66 % Cancer Support vector 64 54 47 Missclassified points 9 5 1 80 % Seeds C1 Support vector 42 35 23 Missclassified points 12 10 5 50 % Seeds C2 Support vector 20 16 10 Missclassified points 5 3 2 33 % Seeds C3 Support vector 24 20 9 Missclassified points 8 5 2 60 %
 Dataset MDS RMDS SRMDS Improvment over RMDS Iris Support vector 18 13 5 Missclassified points 6 3 0 66 % Cancer Support vector 64 54 47 Missclassified points 9 5 1 80 % Seeds C1 Support vector 42 35 23 Missclassified points 12 10 5 50 % Seeds C2 Support vector 20 16 10 Missclassified points 5 3 2 33 % Seeds C3 Support vector 24 20 9 Missclassified points 8 5 2 60 %
Misclassified points obtained by k-nn (3-nn) classifier on three datasets projected by SRMDS and LDA
 Dataset LDA SRMDS Iris 0 0 Cancer 4 2 Seeds 20 9
 Dataset LDA SRMDS Iris 0 0 Cancer 4 2 Seeds 20 9
 [1] Ying Lin, Qi Ye. Support vector machine classifiers by non-Euclidean margins. Mathematical Foundations of Computing, 2020, 3 (4) : 279-300. doi: 10.3934/mfc.2020018 [2] Hirokazu Ninomiya. Entire solutions of the Allen–Cahn–Nagumo equation in a multi-dimensional space. Discrete & Continuous Dynamical Systems - A, 2021, 41 (1) : 395-412. doi: 10.3934/dcds.2020364 [3] Divine Wanduku. Finite- and multi-dimensional state representations and some fundamental asymptotic properties of a family of nonlinear multi-population models for HIV/AIDS with ART treatment and distributed delays. Discrete & Continuous Dynamical Systems - S, 2021  doi: 10.3934/dcdss.2021005 [4] Liping Tang, Ying Gao. Some properties of nonconvex oriented distance function and applications to vector optimization problems. Journal of Industrial & Management Optimization, 2021, 17 (1) : 485-500. doi: 10.3934/jimo.2020117 [5] Caterina Balzotti, Simone Göttlich. A two-dimensional multi-class traffic flow model. Networks & Heterogeneous Media, 2020  doi: 10.3934/nhm.2020034 [6] Hongyu Cheng, Shimin Wang. Response solutions to harmonic oscillators beyond multi–dimensional brjuno frequency. Communications on Pure & Applied Analysis, 2021, 20 (2) : 467-494. doi: 10.3934/cpaa.2020222 [7] Gernot Holler, Karl Kunisch. Learning nonlocal regularization operators. Mathematical Control & Related Fields, 2021  doi: 10.3934/mcrf.2021003 [8] Jianhua Huang, Yanbin Tang, Ming Wang. Singular support of the global attractor for a damped BBM equation. Discrete & Continuous Dynamical Systems - B, 2020  doi: 10.3934/dcdsb.2020345 [9] Manuel Friedrich, Martin Kružík, Jan Valdman. Numerical approximation of von Kármán viscoelastic plates. Discrete & Continuous Dynamical Systems - S, 2021, 14 (1) : 299-319. doi: 10.3934/dcdss.2020322 [10] Illés Horváth, Kristóf Attila Horváth, Péter Kovács, Miklós Telek. Mean-field analysis of a scaling MAC radio protocol. Journal of Industrial & Management Optimization, 2021, 17 (1) : 279-297. doi: 10.3934/jimo.2019111 [11] Min Ji, Xinna Ye, Fangyao Qian, T.C.E. Cheng, Yiwei Jiang. Parallel-machine scheduling in shared manufacturing. Journal of Industrial & Management Optimization, 2020  doi: 10.3934/jimo.2020174 [12] Ole Løseth Elvetun, Bjørn Fredrik Nielsen. A regularization operator for source identification for elliptic PDEs. Inverse Problems & Imaging, , () : -. doi: 10.3934/ipi.2021006 [13] Manxue You, Shengjie Li. Perturbation of Image and conjugate duality for vector optimization. Journal of Industrial & Management Optimization, 2020  doi: 10.3934/jimo.2020176 [14] Yifan Chen, Thomas Y. Hou. Function approximation via the subsampled Poincaré inequality. Discrete & Continuous Dynamical Systems - A, 2021, 41 (1) : 169-199. doi: 10.3934/dcds.2020296 [15] Kha Van Huynh, Barbara Kaltenbacher. Some application examples of minimization based formulations of inverse problems and their regularization. Inverse Problems & Imaging, , () : -. doi: 10.3934/ipi.2020074 [16] Joel Kübler, Tobias Weth. Spectral asymptotics of radial solutions and nonradial bifurcation for the Hénon equation. Discrete & Continuous Dynamical Systems - A, 2020, 40 (6) : 3629-3656. doi: 10.3934/dcds.2020032 [17] Wen Li, Wei-Hui Liu, Seak Weng Vong. Perron vector analysis for irreducible nonnegative tensors and its applications. Journal of Industrial & Management Optimization, 2021, 17 (1) : 29-50. doi: 10.3934/jimo.2019097 [18] Kengo Nakai, Yoshitaka Saiki. Machine-learning construction of a model for a macroscopic fluid variable using the delay-coordinate of a scalar observable. Discrete & Continuous Dynamical Systems - S, 2021, 14 (3) : 1079-1092. doi: 10.3934/dcdss.2020352 [19] Lin Jiang, Song Wang. Robust multi-period and multi-objective portfolio selection. Journal of Industrial & Management Optimization, 2021, 17 (2) : 695-709. doi: 10.3934/jimo.2019130 [20] Bilel Elbetch, Tounsia Benzekri, Daniel Massart, Tewfik Sari. The multi-patch logistic equation. Discrete & Continuous Dynamical Systems - B, 2021  doi: 10.3934/dcdsb.2021025

Impact Factor:

Tools

Article outline

Figures and Tables