# American Institute of Mathematical Sciences

doi: 10.3934/mfc.2020024

## Fixed-point algorithms for inverse of residual rectifier neural networks

 School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, Bentley, WA, Australia

* Corresponding author: Ruhua Wang

Received August 2020; Revised September 2020; Published October 2020

A deep neural network with invertible hidden layers preserves all the information in the feature learning stage. In this paper, we analyse the hidden layers of residual rectifier neural networks and investigate the conditions under which they are invertible. A new fixed-point algorithm is developed to invert the hidden layers of residual networks. The proposed inverse algorithms can invert some residual networks that cannot be inverted by existing inversion algorithms. Furthermore, a special residual rectifier network is designed and trained on MNIST so that it achieves performance comparable to the state of the art while its hidden layers remain invertible.
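For orientation, the following is a minimal sketch of the standard contraction-based fixed-point inversion commonly used for invertible residual networks. It is not the new algorithm proposed in the paper, whose details are not given on this page. It assumes a residual unit of the form $\mathbf{y} = \mathbf{x} + \mathrm{ReLU}(W\mathbf{x} + \mathbf{b})$ with the spectral norm of $W$ below 1, so that the iteration $\mathbf{x}_{k+1} = \mathbf{y} - \mathrm{ReLU}(W\mathbf{x}_k + \mathbf{b})$ is a contraction. The function names, the tolerance, and the random test data are illustrative assumptions.

```python
import numpy as np


def relu(z):
    """Rectifier activation, applied element-wise."""
    return np.maximum(z, 0.0)


def residual_unit(x, W, b):
    """Forward pass of the assumed residual unit: y = x + relu(W x + b)."""
    return x + relu(W @ x + b)


def invert_residual_unit(y, W, b, tol=1e-10, max_iter=1000):
    """Recover x from y by iterating x <- y - relu(W x + b).

    Convergence is guaranteed only when the residual branch is a
    contraction, e.g. when the spectral norm of W is below 1
    (ReLU is 1-Lipschitz). This is the baseline scheme, not the
    paper's proposed algorithm.
    """
    x = y.copy()  # start the iteration from the output itself
    for _ in range(max_iter):
        x_next = y - relu(W @ x + b)
        if np.linalg.norm(x_next - x) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


# Illustrative data (assumed): a 10-dimensional input and a weight
# matrix rescaled so that its spectral norm equals 0.5.
rng = np.random.default_rng(0)
n = 10
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)
b = rng.standard_normal(n)

x_true = rng.standard_normal(n)
y = residual_unit(x_true, W, b)
x_rec = invert_residual_unit(y, W, b)
rel_err = np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)
```

When the spectral norm of $W$ is 1 or larger, this baseline iteration may fail to converge even if the unit is invertible, which is the kind of case the abstract says the proposed algorithms are designed to handle.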

Citation: Ruhua Wang, Senjian An, Wanquan Liu, Ling Li. Fixed-point algorithms for inverse of residual rectifier neural networks. Mathematical Foundations of Computing, doi: 10.3934/mfc.2020024

The proposed residual network architecture
Inverse of the rectifier linear transform: percentage of invertible cases (out of 500) as $\gamma$ varies, when the dimension of $\mathbf{x}$ is 10
Inverse of the residual unit with the fully-connected layer: percentage of invertible cases (out of 500) as $\gamma$ varies, when the dimension of $\mathbf{x}$ is 10
Comparison of recovered images with the original digit images. The first row shows the original images; the second and third rows show the images recovered by the proposed fixed-point method and the existing fixed-point method, respectively
Relative error rates (%) of the recovered images. One hundred samples per class, 1000 samples in total, were chosen and recovered