January 2017, 2(1): 69-75. doi: 10.3934/bdia.2017009

Multiple-instance learning for text categorization based on semantic representation

Jian-Bing Zhang, Yi-Xin Sun, De-Chuan Zhan

National Key Laboratory for Novel Software Technology, Nanjing University, China

 

Published September 2017

Text categorization is a fundamental building block of many other lines of research in NLP. Researchers have proposed many effective text categorization methods and achieved good performance. However, these methods are generally based on raw or low-level features, e.g., tf or tf-idf, and neglect the semantic structure between words, even though such semantic information can influence the precision of text categorization. In this paper, we propose a new method that handles the semantic correlations between words and text features at the level of both the representation and the learning scheme: each document is represented as a bag of multiple instances based on word2vec. Experiments validate the effectiveness of the proposed method compared with state-of-the-art text categorization methods.
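As a concrete illustration of the representation described above — each in-vocabulary word mapped to its word2vec vector, the document becoming the bag of those vectors — here is a minimal sketch. The embedding table and whitespace tokenizer are toy stand-ins for a trained word2vec model and a real tokenizer:

```python
import numpy as np

# Toy embedding table standing in for a trained word2vec model
# (in practice the vectors would come from e.g. gensim's Word2Vec).
EMBEDDINGS = {
    "goal":   np.array([0.9, 0.1, 0.0]),
    "match":  np.array([0.8, 0.2, 0.1]),
    "stock":  np.array([0.1, 0.9, 0.2]),
    "market": np.array([0.0, 0.8, 0.3]),
}

def document_to_bag(doc):
    """Map a document to a multiple-instance 'bag': one instance
    (word vector) per in-vocabulary token."""
    tokens = doc.lower().split()
    return [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]

bag = document_to_bag("The match ended with a late goal")
```

The resulting bag ("match" and "goal" here) is what a multiple-instance learner such as mi-SVM consumes, instead of a single tf-idf vector per document.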

Citation: Jian-Bing Zhang, Yi-Xin Sun, De-Chuan Zhan. Multiple-instance learning for text categorization based on semantic representation. Big Data & Information Analytics, 2017, 2 (1) : 69-75. doi: 10.3934/bdia.2017009
References:
[1]

J. Amores, Multiple instance classification: Review, taxonomy and comparative study, Artificial Intelligence, 201 (2013), 81-105. doi: 10.1016/j.artint.2013.06.003.

[2]

S. Andrews, I. Tsochantaridis and T. Hofmann, Support vector machines for multiple-instance learning, Advances in Neural Information Processing Systems, 15 (2002), 561-568.

[3]

W.B. Cavnar and J.M. Trenkle, N-gram-based text categorization, Ann Arbor MI, 48113 (1994), 161-175.

[4]

Y. Chevaleyre and J. D. Zucker, Solving multiple-instance and multiple-part learning problems with decision trees and rule sets. application to the mutagenesis problem, In Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence, (2001), 204–214. doi: 10.1007/3-540-45153-6_20.

[5]

T. G. Dietterich, R. H. Lathrop and T. Lozano-Pérez, Solving the multiple instance problem with axis-parallel rectangles, Artificial Intelligence, 89 (1997), 31-71. doi: 10.1016/S0004-3702(96)00034-3.

[6]

S. Dumais, Using SVMs for text categorization, IEEE Expert, 13 (1998), 21-23.

[7]

N. Ishii, T. Murai, T. Yamada and Y. Bao, Text classification by combining grouping, LSA and kNN, In IEEE/ACIS International Conference on Computer and Information Science and IEEE/ACIS International Workshop on Component-Based Software Engineering, Software Architecture and Reuse, (2006), 148–154. doi: 10.1109/ICIS-COMSAR.2006.81.

[8]

Q. Kuang and X. Xu, Improvement and application of TF-IDF method based on text classification, International Conference on Internet Technology and Applications, (2010), 1-4.

[9]

S. Lai, L. Xu, K. Liu and J. Zhao, Recurrent convolutional neural networks for text classification, AAAI, (2015), 2267-2273.

[10]

O. Maron and T. Lozano-Pérez, A framework for multiple-instance learning, Advances in Neural Information Processing Systems, 10 (1998), 570-576.

[11]

A. McCallum and K. Nigam, A comparison of event models for naive Bayes text classification, In AAAI-98 Workshop on Learning for Text Categorization, (1998), 41-48.

[12]

T. Mikolov, K. Chen, G. Corrado and J. Dean, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781, 2013.

[13]

T. Mikolov, I. Sutskever, K. Chen, G. Corrado and J. Dean, Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems, 26 (2013), 3111-3119.

[14]

J. Wang and J. D. Zucker, Solving multiple-instance problem: A lazy learning approach, In Proceedings of the International Conference on Machine Learning, (2000), 1119-1126.

[15]

M.L. Zhang and Z.H. Zhou, Improve multi-instance neural networks through feature selection, Neural Processing Letters, 19 (2004), 1-10. doi: 10.1023/B:NEPL.0000016836.03614.9f.

[16]

Z. H. Zhou and M. L. Zhang, Neural networks for multi-instance learning, In International Conference on Intelligent Information Technology, 2002.


Figure 1.  The structure of Bag-of-Words and Skip-Gram
Figure 2.  Pseudo-code for mi-SVM
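The pseudo-code of Figure 2 is not reproduced here. As a rough sketch of the mi-SVM alternating scheme of Andrews et al. [2], with a tiny perceptron standing in for the SVM solver (the perceptron, the data layout, and all names below are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

class LinearClf:
    """Tiny perceptron stand-in for the SVM trained at each
    mi-SVM step (hypothetical; the actual method trains an SVM)."""
    def fit(self, X, y, epochs=100, lr=0.1):
        X, y = np.asarray(X, float), np.asarray(y, float)
        self.w, self.b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                if yi * (xi @ self.w + self.b) <= 0:  # misclassified -> update
                    self.w += lr * yi * xi
                    self.b += lr * yi
        return self

    def decision(self, X):
        return np.asarray(X, float) @ self.w + self.b

def mi_svm(bags, bag_labels, max_iter=10):
    """mi-SVM alternating scheme: instance labels in positive bags
    are latent; iterate between training on the imputed labels
    and re-imputing them from the learned classifier."""
    # Initialize every instance with its bag's label (+1 / -1).
    y = [np.full(len(b), lab) for b, lab in zip(bags, bag_labels)]
    for _ in range(max_iter):
        clf = LinearClf().fit(np.vstack(bags), np.concatenate(y))
        new_y = []
        for b, lab in zip(bags, bag_labels):
            if lab < 0:
                new_y.append(np.full(len(b), -1))  # negative bags stay all-negative
            else:
                scores = clf.decision(b)
                yb = np.where(scores > 0, 1, -1)
                if yb.max() < 0:                   # enforce >= 1 positive instance
                    yb[np.argmax(scores)] = 1
                new_y.append(yb)
        if all((a == c).all() for a, c in zip(y, new_y)):
            break                                  # instance labels stabilized
        y = new_y
    return clf
```

A bag is then classified positive when its maximum instance score is positive, matching the MIL assumption that a positive bag contains at least one positive instance.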
Table 1.  Results of experiments on SogouC
Model car finance IT health sport
SVM + TF-IDF 0.8473 0.8420 0.8363 0.8326 0.8737
SVM + Word2vec 0.9303 0.8571 0.8755 0.9163 0.9828
mi-SVM + Word2vec 0.9599 0.8904 0.8943 0.9325 0.9842
Table 2.  Results of experiments on 20 Newsgroups
Model SVM+tf-idf SVM+Word2vec mi-SVM+Word2vec
Average 0.8508 0.8421 0.8619
