# American Institute of Mathematical Sciences

February  2020, 3(1): 51-64. doi: 10.3934/mfc.2020005

## An improved deep convolutional neural network model with kernel loss function in image classification

1 Key Laboratory of Education Informatization for Nationalities, Ministry of Education, Yunnan Normal University, Kunming 650500, China

2 School of Information Science and Technology, Yunnan Normal University, Kunming 650500, China

* Corresponding author: Tianwei Xu

Received  December 2019 Revised  December 2019 Published  February 2020

Fund Project: This work is supported by the National Natural Science Foundation of China (No. 61862068)

To further enhance the performance of current convolutional neural networks, this paper presents an improved deep convolutional neural network model. Unlike the traditional network structure, the proposed method replaces the pooling layer with two consecutive convolutional layers with $3 \times 3$ convolution kernels, adds a dropout layer between them to reduce overfitting, and uses a cross-entropy kernel as the loss function. Experimental results on the MNIST and CIFAR-10 data sets for image classification show that, compared with several classical neural networks such as AlexNet, VGGNet and GoogLeNet, the improved network achieves better learning efficiency and recognition accuracy at a relatively shallow network depth.
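The pooling replacement described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the ReLU activations, the padding of 1, the dropout rate of 0.5, and the choice of stride 2 in the second convolution (so that the block downsamples as a pooling layer would) are all assumptions made for illustration.

```python
import numpy as np

def conv2d(x, w, stride=1, pad=1):
    """Naive 2-D convolution. x: (H, W, C_in), w: (k, k, C_in, C_out)."""
    k = w.shape[0]
    x = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    out_h = (x.shape[0] - k) // stride + 1
    out_w = (x.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w, w.shape[3]))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + k, j * stride:j * stride + k, :]
            # Contract the patch with every filter at once -> (C_out,)
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return out

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def conv_dropout_conv(x, w1, w2, rate=0.5, rng=None, training=True):
    """Hypothetical block: 3x3 conv -> dropout -> 3x3 conv (stride 2, assumed)."""
    if rng is None:
        rng = np.random.default_rng(0)
    h = np.maximum(conv2d(x, w1, stride=1), 0.0)     # ReLU is an assumption
    h = dropout(h, rate, rng, training)
    return np.maximum(conv2d(h, w2, stride=2), 0.0)  # stride 2 halves H and W

# Usage: a CIFAR-10-sized input (32 x 32 x 3) with illustrative channel counts.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 32, 3))
w1 = rng.normal(size=(3, 3, 3, 8)) * 0.1
w2 = rng.normal(size=(3, 3, 8, 16)) * 0.1
y = conv_dropout_conv(x, w1, w2, training=False)
print(y.shape)
```

Under these assumptions the block halves the spatial resolution, as a stride-2 pooling layer would, while keeping learnable weights in the downsampling step.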

Citation: Yuantian Xia, Juxiang Zhou, Tianwei Xu, Wei Gao. An improved deep convolutional neural network model with kernel loss function in image classification. Mathematical Foundations of Computing, 2020, 3 (1) : 51-64. doi: 10.3934/mfc.2020005

Mini-network replacing the $3 \times 3$ convolutions
kernel size: $3 \times 3$, stride: 2
Max pooling operation, kernel size: $4\times 4$, stride: 2
Dropout workflow
The improved network structure
Recognition accuracy of AlexNet and the improved network over the course of training on CIFAR-10
Recognition accuracy of VGGNet and the improved network over the course of training on CIFAR-10
Recognition accuracy of GoogLeNet and the improved network over the course of training on CIFAR-10
Recognition accuracy of AlexNet and the improved network over the course of training on MNIST
Recognition accuracy of VGGNet and the improved network over the course of training on MNIST
Recognition accuracy of GoogLeNet and the improved network over the course of training on MNIST
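
The kernel sizes and strides quoted in the captions above determine how much each operation shrinks a feature map. The following sketch applies the standard output-size formula, $\lfloor (n + 2p - k)/s \rfloor + 1$; the helper name `output_size` and the padding values are assumptions for illustration, and the input widths are those of CIFAR-10 (32) and MNIST (28) images.

```python
def output_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling window along one axis."""
    return (n + 2 * padding - kernel) // stride + 1

# 3 x 3 convolution with stride 2 versus 4 x 4 max pooling with stride 2.
for name, n in (("CIFAR-10", 32), ("MNIST", 28)):
    conv = output_size(n, kernel=3, stride=2)
    conv_padded = output_size(n, kernel=3, stride=2, padding=1)
    pool = output_size(n, kernel=4, stride=2)
    print(name, conv, conv_padded, pool)
```

Without padding, both operations reduce a 32-pixel side to 15 and a 28-pixel side to 13, so a stride-2 convolution can stand in for the pooling step without changing the downstream shapes.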
| Parameters | Value |
| --- | --- |
| CPU | Intel Core i9-9900K |
| GPU | NVIDIA GeForce RTX 2080 Ti |
| RAM | 16.0 GB |
| OS | Windows 10 64-bit |
| Development software | Python 3.7 + TensorFlow framework (GPU mode) |

| Network | CIFAR-10 train acc | CIFAR-10 test acc | MNIST train acc | MNIST test acc |
| --- | --- | --- | --- | --- |
| AlexNet | 0.95 | 0.78 | 0.98 | 0.97 |
| VGGNet | 0.98 | 0.83 | 0.99 | 0.98 |
| GoogLeNet | 1.0 | 0.90 | 1.0 | 1.0 |
| Improved network | 1.0 | 0.94 | 1.0 | 1.0 |