February 2021, 4(1): 31-44. doi: 10.3934/mfc.2020024

Fixed-point algorithms for inverse of residual rectifier neural networks

Ruhua Wang, Senjian An, Wanquan Liu and Ling Li

School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, Bentley, WA, Australia

* Corresponding author: Ruhua Wang

Received: August 2020. Revised: September 2020. Early access: October 2020. Published: February 2021.

A deep neural network whose hidden layers are invertible has the appealing property of preserving all information through the feature-learning stage. In this paper, we analyse the hidden layers of residual rectifier neural networks and establish conditions under which these layers are invertible. A new fixed-point algorithm is developed to invert the hidden layers of residual networks. The proposed algorithm can invert some residual networks that existing inversion algorithms cannot. Furthermore, a special residual rectifier network is designed and trained on MNIST so that its hidden layers are invertible while its accuracy remains comparable to the state of the art.

Citation: Ruhua Wang, Senjian An, Wanquan Liu, Ling Li. Fixed-point algorithms for inverse of residual rectifier neural networks. Mathematical Foundations of Computing, 2021, 4 (1) : 31-44. doi: 10.3934/mfc.2020024
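To illustrate the general idea, here is a minimal sketch, not the paper's exact algorithm (which is given in the full text). It assumes a residual rectifier block of the form $ y = x + \gamma\,\mathrm{ReLU}(Wx + b) $; when the residual branch is a contraction, the fixed-point iteration $ x \leftarrow y - \gamma\,\mathrm{ReLU}(Wx + b) $ recovers $ x $ from $ y $, in the spirit of invertible residual networks (Behrmann et al., 2019). The block form, function names, and parameters below are all illustrative assumptions.

```python
import numpy as np

def residual_block(x, W, b, gamma):
    """Forward pass of an assumed residual rectifier block: y = x + gamma * ReLU(W x + b)."""
    return x + gamma * np.maximum(W @ x + b, 0.0)

def invert_residual_block(y, W, b, gamma, n_iter=100, tol=1e-10):
    """Recover x from y via the fixed-point iteration x <- y - gamma * ReLU(W x + b).

    Converges when the residual branch is a contraction, e.g. when
    gamma * ||W||_2 < 1 (a sufficient, not a necessary, condition).
    """
    x = y.copy()  # y is a natural starting point, since x = y - (residual term)
    for _ in range(n_iter):
        x_next = y - gamma * np.maximum(W @ x + b, 0.0)
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Usage: invert a random contractive block and check the reconstruction error.
rng = np.random.default_rng(0)
d = 10
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)   # rescale so the branch is contractive
b = rng.standard_normal(d)
x_true = rng.standard_normal(d)
y = residual_block(x_true, W, b, gamma=1.0)
x_rec = invert_residual_block(y, W, b, gamma=1.0)
print(np.linalg.norm(x_rec - x_true))  # small, roughly at the 1e-10 tolerance
```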
Figure 1.  The proposed residual network architecture
Figure 2.  Inverse of the rectifier linear transform: percentage of invertible cases (out of 500) as $ \gamma $ varies, with $ \mathbf{x} $ of dimension 10
Figure 3.  Inverse of the residual unit with a fully-connected layer: percentage of invertible cases (out of 500) as $ \gamma $ varies, with $ \mathbf{x} $ of dimension 10
Figure 4.  Comparison of recovered images with the original digit images. The first row shows the original images; the second and third rows show images recovered by the proposed fixed-point method and the existing fixed-point method, respectively
Figure 5.  Relative error rates (%) of the recovered images. One hundred samples per class (1000 in total) were chosen and recovered
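The $ \gamma $-sweeps in Figures 2 and 3 can be mimicked with a small Monte-Carlo experiment. The sketch below reuses the two functions defined above and assumes Gaussian random weights, 500 trials, and dimension 10; the paper's exact sampling protocol is not stated on this page, so this only approximates the reported experiment.

```python
import numpy as np

# Assumed setup: random Gaussian blocks, 500 trials per gamma, dimension 10.
# Reuses residual_block and invert_residual_block from the sketch above.
rng = np.random.default_rng(1)
d, trials = 10, 500
for gamma in [0.5, 0.9, 1.1, 1.5, 2.0]:
    ok = 0
    for _ in range(trials):
        W = rng.standard_normal((d, d))
        W /= np.linalg.norm(W, 2)  # unit spectral norm, so the branch Lipschitz constant is at most gamma
        b = rng.standard_normal(d)
        x = rng.standard_normal(d)
        y = residual_block(x, W, b, gamma)
        x_rec = invert_residual_block(y, W, b, gamma, n_iter=500)
        ok += np.linalg.norm(x_rec - x) < 1e-6  # count the trial as invertible if x is recovered
    print(f"gamma={gamma}: invertible in {100 * ok / trials:.1f}% of cases")
```

For $ \gamma < 1 $ every trial is invertible (the branch is a contraction); as $ \gamma $ grows past 1 the percentage declines, since only blocks whose ReLU deactivations make the iteration effectively contractive can still be inverted.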