
Discriminant analysis of regularized multidimensional scaling

This research was supported by the Commonwealth Scholarship BDCS-2012-44
  • Regularized multidimensional scaling with radial basis functions (RMDS) is a nonlinear variant of classical multidimensional scaling (cMDS). A key issue addressed in RMDS is the effective selection of the centers of the radial basis functions, which plays an important role in reducing the dimension while preserving the structure of the data in the higher-dimensional space. RMDS operates in an unsupervised setting, meaning it does not use any prior information about the dataset. This article is concerned with the supervised setting: we incorporate the class information of some members of the data into the RMDS model through a class separability term (sketched below). The class separability term improves RMDS significantly, and the resulting method also outperforms other discriminant analysis methods such as linear discriminant analysis (LDA), as documented through numerical experiments.

    Mathematics Subject Classification: Primary: 62H30, 68T10; Secondary: 90C25.

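    The figure captions below refer to a weighting parameter $ \lambda $. The exact objective of the paper is not reproduced here; as a hedged sketch (notation assumed, not taken from the article), the supervised model SRMDS can be pictured as trading off the RMDS stress against a class separability term computed on the labeled points:

    \begin{equation*} \min \;\; \lambda \sum_{i<j} \bigl( \| x_i - x_j \| - \delta_{ij} \bigr)^{2} \; - \; (1-\lambda)\, J_{\rm sep}\bigl( \{ x_i : i \in \mathcal{L} \} \bigr), \qquad 0 \le \lambda \le 1, \end{equation*}

    where $ x_i $ is the low-dimensional image of the $ i $-th point produced by the radial basis function mapping, $ \delta_{ij} $ are the given dissimilarities, $ \mathcal{L} $ is the set of labeled training points, and $ J_{\rm sep} $ is some class separability measure (e.g. a between-class scatter criterion). A large $ \lambda $ emphasizes structure preservation, which is consistent with Figure 5, where increasing $ \lambda $ reduces the average stress.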
  • Figure 1.  Fig. 1(a) shows the separation of the two nonseparable classes of the Iris data by the support vector machine (SVM) algorithm. One class is represented by "+" and the other by "$ \diamond $". Over $ 100 $ runs, the SRMDS model yielded on average $ 1 $ misclassified point, while the corresponding number for RMDS and Webb's model was $ 3 $ to $ 5 $. Fig. 1(b) shows SVM applied to the 2-dimensional projection of the Cancer data. Over $ 100 $ runs, the SRMDS model yielded $ 1 $ to $ 3 $ misclassified points. Support vectors in each image are bounded by "O" and misclassified points by "$ \square $"

    Figure 2.  SVM applied to the Seeds data projected into 2-dimensional space by $\text{RMDS}$ is shown in these figures, where the separation of the classes is shown using a multiclass classifier

    Figure 3.  Projected 2-dimensional Iris data, consisting of $ 3 $ classes. One class, represented by 'red +', is completely separated from the other two. Training points of the two nonseparable classes are represented by 'green +' and 'blue +', whereas the testing points projected by SRMDS are represented by 'pink o' and 'red o'. Fig. 3(a): $ \lambda = 0.1 $. Fig. 3(b): $ \lambda = 0.5 $. Fig. 3(c): $ \lambda = 0.9 $, over $ 100 $ random runs. An increased value of $ \lambda $ puts more weight on preserving the structure of the data

    Figure 4.  Projected 2-dimensional Cancer data, consisting of $ 2 $ classes. Training points of the classes are represented by 'red +' and 'blue +', whereas the testing points of the respective classes projected by SRMDS are represented by 'green o' and 'pink o'. Fig. 4(a): $ \lambda = 0.1 $. Fig. 4(b): $ \lambda = 0.5 $. Fig. 4(c): $ \lambda = 0.9 $, over $ 100 $ random runs. An increased value of $ \lambda $ puts more weight on preserving the structure of the data

    Figure 5.  Average stress of each dataset for different values of $ \lambda $ over $ 100 $ random runs. An increased value of $ \lambda $ reduces the stress
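    The experiments behind Figures 1-4 and Table 2 amount to fitting an SVM on the 2-dimensional projections and counting support vectors and misclassified points. The following is a minimal sketch, assuming a projection X2d (an n-by-2 array) and labels y have already been computed by an MDS/RMDS/SRMDS implementation, which is not shown; function and variable names are illustrative, not taken from the paper.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def svm_on_projection(X2d, y, seed=0):
        # Stratified split of the projected points into training and test parts.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X2d, y, test_size=0.3, random_state=seed, stratify=y)
        clf = SVC(kernel="linear")   # linear separation in the 2-D projected space
        clf.fit(X_tr, y_tr)
        n_sv = clf.support_vectors_.shape[0]              # support vectors, cf. Table 2
        n_miss = int(np.sum(clf.predict(X_te) != y_te))   # misclassified test points
        return n_sv, n_miss

    Averaging n_miss over repeated random seeds corresponds to the $ 100 $ runs reported in the captions; the multiclass cases (Iris, Seeds) are handled by the one-vs-one scheme built into SVC.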

    Table 1.  List of datasets used in this article and their sources:

    Dataset   Dim   Classes   No. of instances   Source
    Iris        4         3                150   UCI Repository
    Cancer      9         2                683   UCI Repository
    Seeds       7         3                210   UCI Repository

    Table 2.  Numerical results obtained by applying SVM to three datasets projected using discriminant analysis

    Dataset    Metric                   MDS   RMDS   SRMDS   Improvement over RMDS
    Iris       Support vectors           18     13       5
               Misclassified points       6      3       0    66%
    Cancer     Support vectors           64     54      47
               Misclassified points       9      5       1    80%
    Seeds C1   Support vectors           42     35      23
               Misclassified points      12     10       5    50%
    Seeds C2   Support vectors           20     16      10
               Misclassified points       5      3       2    33%
    Seeds C3   Support vectors           24     20       9
               Misclassified points       8      5       2    60%

    Table 3.  Misclassified points obtained by the $ k $-NN ($ k = 3 $) classifier on three datasets projected by SRMDS and LDA

    Dataset LDA SRMDS
    Iris 0 0
    Cancer 4 2
    Seeds 20 9
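    Table 3 compares a 3-nearest-neighbour classifier on the LDA and SRMDS projections. A minimal sketch of that comparison follows, again assuming the two projections are already available as arrays (X_lda, X_srmds) with labels y; these names are illustrative, not taken from the paper.

    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    def knn_misclassified(X_proj, y, k=3, seed=0):
        # Stratified split of the projected points, then a k-NN fit on the training part.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X_proj, y, test_size=0.3, random_state=seed, stratify=y)
        knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
        # Count of misclassified test points, cf. Table 3.
        return int((knn.predict(X_te) != y_te).sum())

    # e.g. knn_misclassified(X_lda, y) and knn_misclassified(X_srmds, y)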
  • [1] A. Argyriou, T. Evgeniou and M. Pontil, Multi-task Feature Learning, in Advances in Neural Information Processing Systems (eds. B. Schoelkopf, J. Platt, and T. Hoffman), MIT Press, 2007.
    [2] A. Argyriou, T. Evgeniou and M. Pontil, Convex Multi-task Feature Learning, Machine Learning, Special Issue on Inductive Transfer Learning, 73 (2008), 243-272. 
    [3] J. Bénasséni, Partial additive constant, J. Statist. Comput. Simul., 49 (1994), 179-193. 
    [4] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling. Theory and Applications, 2$^{nd}$ edition, Springer Series in Statistics, Springer, 2005.
    [5] F. Cailliez, The analytical solution of the additive constant problem, Psychometrika, 48 (1983), 305-308.  doi: 10.1007/BF02294026.
    [6] H. G. Chew and C. C. Lim, On regularisation parameter transformation of support vector machines, Journal of Industrial and Management Optimization, 5 (2009), 403-415.  doi: 10.3934/jimo.2009.5.403.
    [7] L. G. Cooper, A new solution to the additive constant problem in metric and multidimensional scaling, Psychometrika, 37 (1972), 311-321. 
    [8] T. F. Cox and M. A. Cox, Multidimensional Scaling, 2$^{nd}$ edition, Chapman and Hall/CRC, 2002.
    [9] T. F. Cox and G. Ferry, Discriminant analysis using nonmetric multidimensional scaling, Pattern Recognition, 26 (1993), 145-153. 
    [10] J. de Leeuw, Applications of convex analysis to multidimensional scaling, in Recent Developments in Statistics (eds. J. Barra, F. Brodeau, G. Romier, and B. van Cutsen), North Holland Publishing Company, Amsterdam, The Netherlands, 133–145.
    [11] J. de Leeuw, Block relaxation algorithms in statistics, in Information Systems and Data Analysis (eds. Bock, H.H. et al.), Springer, Berlin, (1994), 308–325.
    [12] W. Glunt, T. L. Hayden, S. Hong and J. Wells, An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM J. Matrix Anal. Appl., 11 (1990), 589-600.  doi: 10.1137/0611042.
    [13] W. Glunt, T. L. Hayden and R. Raydan, Molecular conformations from distance matrices, J. Computational Chemistry, 14 (1993), 114-120. 
    [14] J. C. Gower, Some distance properties of latent root and vector methods in multivariate analysis, Biometrika, 53 (1966), 315-328.  doi: 10.1093/biomet/53.3-4.325.
    [15] Y. Hao and F. Meng, A new method on gene selection for tissue classification, Journal of Industrial and Management Optimization, 3 (2007), 739-748.  doi: 10.3934/jimo.2007.3.739.
    [16] W. L. G. Koontz and K. Fukunaga, A nonlinear feature extraction algorithm using distance information, IEEE Transactions on Computers, 21 (1972), 56-63. 
    [17] J. Kruskal, Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika, 29 (1964), 1-27.  doi: 10.1007/BF02289565.
    [18] T. Li, S. Zhu and M. Ogihara, Using discriminant analysis for multi-class classification: an experimental investigation, Knowl. Inf. Syst., 10 (2006), 453-472.  doi: 10.1007/s10115-006-0013-y.
    [19] D. Lowe, Novel topographic nonlinear feature extraction using radial basis functions for concentration coding in the artificial nose, IEEE International Conference on Artificial Neural Networks, (1993), 95–99.
    [20] K. V. Mardia, J. T. Kent and J. M. Bibby, Multivariate Analysis, 10$^{th}$ printing, Academic Press, 1995. 
    [21] A. M. Martinez and A. C. Kak, PCA versus LDA, IEEE Transactions on Pattern Analysis and Machine Intelligence, 23 (2001), 228-233.  doi: 10.1109/34.908974.
    [22] S. J. Messick and R. P. Abelson, The additive constant problem in multidimensional scaling, Psychometrika, 21 (1956), 1-15. 
    [23] E. Pękalska and R. P. W. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Applications, Series in Machine Perception and Artificial Intelligence 64, World Scientific, 2005.
    [24] H.-D. Qi, A semismooth Newton method for the nearest Euclidean distance matrix problem, SIAM Journal Matrix Analysis and Applications, 34 (2013), 67-93.  doi: 10.1137/110849523.
    [25] H.-D. Qi and N. Xiu, A convex quadratic semidefinite programming approach to the partial additive constant problem in multidimensional scaling, Journal of Statistical Computation and Simulation, 82 (2012), 1317-1336.  doi: 10.1080/00949655.2011.579970.
    [26] H.-D. Qi, N. H. Xiu and X. M. Yuan, A Lagrangian dual approach to the single source localization problem, IEEE Transactions on Signal Processing, 61 (2013), 3815-3826.  doi: 10.1109/TSP.2013.2264814.
    [27] H.-D. Qi and X. M. Yuan, Computing the nearest Euclidean distance matrix with low embedding dimensions, Mathematical Programming, Ser. A, 147 (2014), 351-389.  doi: 10.1007/s10107-013-0726-0.
    [28] K. Schittkowski, Optimal parameter selection in support vector machines, Journal of Industrial and Management Optimization, 1 (2005), 465-476.  doi: 10.3934/jimo.2005.1.465.
    [29] I. J. Schoenberg, Remarks to Maurice Fréchet's article "Sur la définition axiomatique d'une classe d'espaces vectoriels distanciés applicables vectoriellement sur l'espace de Hilbert", Ann. Math., 36 (1935), 724-732.  doi: 10.2307/1968654.
    [30] S. Jahan and H.-D. Qi, Regularized multidimensional scaling with radial basis functions, Journal of Industrial and Management Optimization, 12 (2016), 543-563.  doi: 10.3934/jimo.2016.12.543.
    [31] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Elsevier Inc., 2009.
    [32] S. Theodoridis and K. Koutroumbas, An Introduction to Pattern Recognition, A MATLAB Approach, Elsevier Inc., 2010.
    [33] W. S. Torgerson, Theory and Methods for Scaling, Wiley, New York, 1958.
    [34] A. R. Webb, Multidimensional Scaling by iterative majorization using radial basis functions, Pattern Recognition, 28 (1995), 753-759. 
    [35] A. R. Webb, Nonlinear feature extraction with radial basis functions using a weighted multidimensional scaling stress measure, Pattern Recognition, IEEE Conference Publications, 4 (1996), 635–639.
    [36] A. R. Webb, An approach to nonlinear principal component analysis using radially-symmetric kernel functions, Statistics and Computing, 6 (1996), 159-168. 
    [37] G. Young and A. S. Householder, Discussion of a set of points in terms of their mutual distances, Psychometrika, 3 (1938), 19-22.  doi: 10.1007/BF02288560.
    [38] Y. Yuan, W. Fan and D. Pu, Spline function smooth support vector machine for classification, Journal of Industrial and Management Optimization, 3 (2007), 529-542.  doi: 10.3934/jimo.2007.3.529.