# American Institute of Mathematical Sciences

February  2021, 4(1): 31-44. doi: 10.3934/mfc.2020024

## Fixed-point algorithms for inverse of residual rectifier neural networks

 School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, Bentley, WA, Australia

* Corresponding author: Ruhua Wang

Received August 2020; Revised September 2020; Published October 2020

A deep neural network with invertible hidden layers has the desirable property of preserving all information in the feature learning stage. In this paper, we analyse the hidden layers of residual rectifier neural networks and investigate the conditions under which they are invertible. A new fixed-point algorithm is developed to invert the hidden layers of residual networks. The proposed inverse algorithms can invert some residual networks that existing inversion algorithms cannot. Furthermore, a special residual rectifier network is designed and trained on MNIST so that it achieves performance comparable to the state of the art while its hidden layers remain invertible.
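To make the idea of fixed-point inversion concrete, the following is a minimal sketch of the standard contraction-based iteration known from invertible residual networks. It is not the new algorithm proposed in the paper, whose details are not given on this page. The residual unit form $\mathbf{y} = \mathbf{x} + \mathrm{ReLU}(W\mathbf{x} + \mathbf{b})$, the function names, the tolerance, and the rescaling of $W$ to spectral norm 0.5 are all assumptions chosen for illustration; with a spectral norm below 1 the map $\mathbf{x} \mapsto \mathbf{y} - \mathrm{ReLU}(W\mathbf{x} + \mathbf{b})$ is a contraction, so the iteration converges to the unique preimage.

```python
import numpy as np


def relu(z):
    """Rectifier activation, applied element-wise."""
    return np.maximum(z, 0.0)


def residual_unit(x, W, b):
    """Forward pass of an assumed residual unit: y = x + relu(W x + b)."""
    return x + relu(W @ x + b)


def invert_residual_unit(y, W, b, tol=1e-10, max_iter=1000):
    """Recover x from y by iterating x <- y - relu(W x + b).

    Convergence is only guaranteed when the residual branch is a
    contraction (here: spectral norm of W strictly below 1).
    """
    x = y.copy()  # start the iteration from the output itself
    for _ in range(max_iter):
        x_next = y - relu(W @ x + b)
        if np.linalg.norm(x_next - x) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


# Illustrative setup (hypothetical values): dimension 10, W rescaled so
# that its spectral norm is 0.5, which makes the residual branch contractive.
rng = np.random.default_rng(0)
dim = 10
W = rng.standard_normal((dim, dim))
W *= 0.5 / np.linalg.norm(W, 2)
b = rng.standard_normal(dim)

x_true = rng.standard_normal(dim)
y = residual_unit(x_true, W, b)
x_rec = invert_residual_unit(y, W, b)
```

When the spectral norm of `W` is 1 or larger, this simple iteration may fail to converge even if the unit is invertible; that is the kind of case the abstract says the proposed algorithms are designed to handle.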

Citation: Ruhua Wang, Senjian An, Wanquan Liu, Ling Li. Fixed-point algorithms for inverse of residual rectifier neural networks. Mathematical Foundations of Computing, 2021, 4 (1) : 31-44. doi: 10.3934/mfc.2020024

The proposed residual network architecture
Inverse of the rectifier linear transform: percentage of invertible cases (out of 500) as a function of $\gamma$ when the dimension of $\mathbf{x}$ is 10
Inverse of the residual unit with the fully-connected layer: percentage of invertible cases (out of 500) as a function of $\gamma$ when the dimension of $\mathbf{x}$ is 10
Comparison of recovered images to original digit images. The 1st row illustrates the original images, whereas the 2nd and 3rd rows show the recovered images from the proposed fixed-point method and the existing fixed-point method, respectively
Relative error rates (%) of the recovered images. One hundred samples per class (1000 in total) were chosen and recovered
