Mathematical Foundations of Computing, November 2022, 5(4): 351-362. doi: 10.3934/mfc.2022021

CNN models for readability of Chinese texts

1. Department of Mathematics, City University of Hong Kong, Hong Kong

2. The Hong Kong University of Science and Technology (Guangzhou), Nansha, Guangzhou 511400, Guangdong, China

3. Shaoxing University, Shaoxing 312000, Zhejiang, China

4. School of Data Science, Department of Mathematics, and Liu Bie Ju Centre for Mathematical Sciences, City University of Hong Kong, Hong Kong

*Corresponding author: Le-Yin Wei

Received: June 2022. Early access: July 2022. Published: November 2022.

Readability of Chinese texts considered in this paper is a multi-class classification problem with $ 12 $ grade classes, corresponding to $ 6 $ grades in primary schools, $ 3 $ grades in middle schools, and $ 3 $ grades in high schools. A special property of this problem is the strong ambiguity in determining the grades. To cope with this difficulty, readability assessment methods are measured empirically in practice by adjacent accuracy in addition to exact accuracy. In this paper we give mathematical definitions of these concepts in a learning theory framework and compare the two quantities in terms of the ambiguity level of texts. A deep learning algorithm is proposed for the readability of Chinese texts, based on convolutional neural networks and a pre-trained BERT model for vector representations of Chinese characters. The proposed CNN model extracts sentence and text features by convolving sentence representations with filters and is efficient for readability assessment, as demonstrated by numerical experiments.
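To make the two evaluation criteria concrete: writing $ \hat{{\mathcal A}} $ for the empirical exact accuracy and $ \hat{{\mathcal A}}_{\mathcal C} $ for the empirical adjacent accuracy (the quantities reported in Table 2 below), the following minimal sketch computes both from predicted and true grade labels in $ \{1, \ldots, 12\} $. The $ \pm 1 $ grade tolerance for adjacency and the function names are our assumptions for illustration, not the paper's formal learning-theory definitions.

    import numpy as np

    def exact_accuracy(y_true, y_pred):
        """Fraction of texts whose predicted grade equals the true grade."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        return float(np.mean(y_true == y_pred))

    def adjacent_accuracy(y_true, y_pred, tolerance=1):
        """Fraction of texts whose predicted grade is within `tolerance`
        grades of the true grade (tolerance=1 is the usual convention)."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        return float(np.mean(np.abs(y_true - y_pred) <= tolerance))

    # Grades run from 1 (Primary 1) to 12 (Senior 3).
    y_true = [3, 7, 10, 1]
    y_pred = [3, 8, 12, 2]
    print(exact_accuracy(y_true, y_pred))     # 0.25
    print(adjacent_accuracy(y_true, y_pred))  # 0.75

Because neighboring grades are strongly ambiguous, a prediction off by one grade is counted as correct under the adjacent criterion, which is why $ \hat{{\mathcal A}}_{\mathcal C} $ is always at least $ \hat{{\mathcal A}} $.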

Citation: Han Feng, Sizai Hou, Le-Yin Wei, Ding-Xuan Zhou. CNN models for readability of Chinese texts. Mathematical Foundations of Computing, 2022, 5 (4) : 351-362. doi: 10.3934/mfc.2022021
References:
[1]

D. Chen and C. D. Manning, A fast and accurate dependency parser using neural networks, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), (2014), 740–750. doi: 10.3115/v1/D14-1082.

[2]

J. Pennington, R. Socher and C. D. Manning, GloVe: Global vectors for word representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), (2014), 1532–1543.

[3]

E. Dale and J. S. Chall, The concept of readability, Elementary English, 26 (1949), 19-26. 

[4]

J. Devlin, M. W. Chang, K. Lee and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (2019), 4171–4186.

[5]

Z. Fang, H. Feng, S. Huang and D. X. Zhou, Theory of deep convolutional neural networks II: Spherical analysis, Neural Networks, 131 (2020), 154-162.

[6]

H. Feng, S. Huang and D. X. Zhou, Generalization analysis of CNNs for classification on spheres, IEEE Transactions on Neural Networks and Learning Systems, in press.

[7]

J. R. Firth, A synopsis of linguistic theory 1930-55, Studies in Linguistic Analysis (Special Volume of the Philological Society), The Philological Society, (1957), 1–32.

[8]

Y. Kim, Convolutional neural networks for sentence classification, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), (2014), 1746–1751.

[9]

A. Krizhevsky, I. Sutskever and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, 25 (2012), 1097-1105.

[10]

Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998), 2278-2324. doi: 10.1109/5.726791.

[11]

T. Mikolov, K. Chen, G. Corrado and J. Dean, Efficient estimation of word representations in vector space, ICLR, 2013.

[12]

J. Zeng, Y. Xie, J. Lee and D. X. Zhou, CR-BERT: Chinese text readability method based on BERT and multi-level attention, preprint, (2022).

[13]

D. X. Zhou, Universality of deep convolutional neural networks, Appl. Comput. Harmon. Anal., 48 (2020), 787-794.  doi: 10.1016/j.acha.2019.06.004.

[14]

D. X. Zhou, Deep distributed convolutional neural networks: universality, Analysis and Applications, 16 (2018), 895-919. doi: 10.1142/S0219530518500124.

[15]

D. X. Zhou, Theory of deep convolutional neural networks: Downsampling, Neural Networks, 124 (2020), 319-327. 


Figure 1.  One Filter Instance
Figure 2.  Two Inputs & Two Filters
Figure 3.  Accuracy Curve by Epoch Number
Figure 4.  Confusion Matrix
Figure 5.  Scatter Plot
Table 1.  Number of texts in each grade

Grade    1    2    3    4    5    6    7    8    9   10   11   12   Total
Texts  235  320  386  321  281  252  145   58  134   86   26  109    2353
Table 2.  Empirical accuracies (%) of various models

Model                                            Vec2Read [12]   Tseng et al. [12]   Basic   Multi-Channel   Top-$ k $   Fused
$ \hat{{\mathcal A}} $ (exact)                       29.18             29.00          43.9       44.8           45.6      48.6
$ \hat{{\mathcal A}}_{\mathcal C} $ (adjacent)       69.70             67.05          83.7       84.3           83.7      88.0
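For readers who want a concrete picture of the convolutional architecture described in the abstract, the following is a minimal Kim-style text CNN [8] over a text's sentence representations, roughly in the spirit of the Basic model above. The BERT embedding dimension of 768, the filter counts, and the window sizes are illustrative assumptions, not the authors' reported configuration.

    import torch
    import torch.nn as nn

    class TextCNN(nn.Module):
        """Convolves a matrix of sentence vectors with filters of several
        window sizes, max-pools each feature map over the sentence axis,
        and classifies the text into one of 12 grades."""

        def __init__(self, emb_dim=768, n_filters=100,
                     windows=(2, 3, 4), n_classes=12):
            super().__init__()
            # One 1-D convolution per window size, sliding over sentences.
            self.convs = nn.ModuleList(
                [nn.Conv1d(emb_dim, n_filters, kernel_size=w) for w in windows]
            )
            self.fc = nn.Linear(n_filters * len(windows), n_classes)

        def forward(self, x):
            # x: (batch, n_sentences, emb_dim) -> (batch, emb_dim, n_sentences)
            x = x.transpose(1, 2)
            # ReLU after each convolution, then max-pool over sentence positions.
            feats = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
            return self.fc(torch.cat(feats, dim=1))  # (batch, n_classes)

    # A text encoded as 30 sentence vectors by a (frozen) pre-trained BERT model:
    text = torch.randn(1, 30, 768)
    logits = TextCNN()(text)  # shape (1, 12), one score per grade

Each filter of window size $ w $ spans $ w $ consecutive sentence vectors, so max-pooling yields one feature per filter regardless of text length; the Multi-Channel, Top-$ k $ and Fused models in Table 2 are variants of this basic design.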