Regularized Multidimensional Scaling with Radial basis functions (RMDS) is a nonlinear variant of classical Multidimensional Scaling (cMDS). A key issue addressed in RMDS is the effective selection of the centers of the radial basis functions, which plays a very important role in reducing the dimension while preserving the structure of the data in the higher-dimensional space. RMDS operates in an unsupervised setting, meaning that it does not use any prior information about the dataset. This article is concerned with the supervised setting: we incorporate the class information of some members of the data into the RMDS model. The class separability term significantly improves RMDS, and the resulting method also outperforms other discriminant analysis methods such as Linear Discriminant Analysis (LDA), as documented through numerical experiments.
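For orientation, the classical MDS baseline named above can be sketched as follows. This is a minimal sketch of textbook cMDS (double centering of the squared distances followed by an eigendecomposition), not of RMDS or its supervised extension. The function name `classical_mds` and the random test data are illustrative assumptions and do not come from the article.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed n points in k dimensions from an n x n Euclidean distance matrix D.

    Hypothetical helper for illustration; it is not the article's RMDS model.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    lam = np.clip(eigvals[idx], 0.0, None)    # guard against tiny negative values
    return eigvecs[:, idx] * np.sqrt(lam)     # n x k embedding

# Assumed toy data: 10 random points in 4 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# Embedding in the full dimension reproduces the pairwise distances.
Y = classical_mds(D, k=4)
D_rec = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
```

With `k` equal to the original dimension, the distances are recovered up to rounding error. Choosing a smaller `k` gives the linear dimension reduction that RMDS replaces with a radial basis function mapping.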
Figure 1.
Fig. 1(a) shows the separation of two nonseparable classes of the Iris data by the support vector machine (SVM) algorithm. One class is represented by "+" and the other one is represented by "
Figure 3.
Projected 2-dimensional Iris data, consisting of
Figure 4.
Projected 2-dimensional Cancer data, consisting of
Table 1. List of datasets used in this article and their sources:
Dataset | Dimension | Classes | No. of instances | Source |
Iris | 4 | 3 | 150 | UCI Repository |
Cancer | 9 | 2 | 683 | UCI Repository |
Seeds | 7 | 3 | 210 | UCI Repository |
Table 2. Numerical results obtained by applying SVM on three datasets projected using discriminant analysis
Dataset | Measure | MDS | RMDS | SRMDS | Improvement over RMDS |
Iris | Support vectors | 18 | 13 | 5 | |
Iris | Misclassified points | 6 | 3 | 0 | 66 % |
Cancer | Support vectors | 64 | 54 | 47 | |
Cancer | Misclassified points | 9 | 5 | 1 | 80 % |
Seeds C1 | Support vectors | 42 | 35 | 23 | |
Seeds C1 | Misclassified points | 12 | 10 | 5 | 50 % |
Seeds C2 | Support vectors | 20 | 16 | 10 | |
Seeds C2 | Misclassified points | 5 | 3 | 2 | 33 % |
Seeds C3 | Support vectors | 24 | 20 | 9 | |
Seeds C3 | Misclassified points | 8 | 5 | 2 | 60 % |
Table 3. Misclassified points obtained by the k-NN (3-NN) classifier on three datasets projected by SRMDS and LDA
Dataset | LDA | SRMDS |
Iris | 0 | 0 |
Cancer | 4 | 2 |
Seeds | 20 | 9 |
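The measure reported in Table 3, the number of points misclassified by a 3-NN classifier on projected data, can be illustrated with a small sketch. The evaluation protocol is not stated here, so the leave-one-out scheme below is an assumption, as are the function name `knn_misclassified` and the toy 2-dimensional points. The sketch does not reproduce the SRMDS or LDA projections.

```python
import math
from collections import Counter

def knn_misclassified(points, labels, k=3):
    """Count leave-one-out k-NN errors on already-projected points.

    Hypothetical helper; leave-one-out evaluation is an assumption.
    """
    errors = 0
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        neighbours = sorted(
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points) if j != i
        )[:k]
        # Majority vote among the k nearest neighbours.
        predicted = Counter(lab for _, lab in neighbours).most_common(1)[0][0]
        errors += int(predicted != labels[i])
    return errors

# Assumed toy data: two tight clusters, plus one point labelled "b"
# that sits in the middle of the "a" cluster.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (0.05, 0.05)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

errors = knn_misclassified(points, labels, k=3)  # 1: only the stray "b" point
```

A projection that separates the classes more cleanly leaves fewer such stray points, which is what a lower count in the table indicates.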