doi: 10.3934/mfc.2020024

Fixed-point algorithms for inverse of residual rectifier neural networks

School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, Bentley, WA, Australia

* Corresponding author: Ruhua Wang

Received: August 2020. Revised: September 2020. Published: October 2020.

A deep neural network with invertible hidden layers has the appealing property of preserving all information through the feature learning stage. In this paper, we analyse the hidden layers of residual rectifier neural networks and establish conditions under which these layers are invertible. A new fixed-point algorithm is developed to invert the hidden layers of residual networks. The proposed algorithm can invert some residual networks that existing inversion algorithms cannot. Furthermore, a special residual rectifier network is designed and trained on MNIST so that it achieves performance comparable with the state of the art while its hidden layers remain invertible.
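To illustrate the kind of inversion the abstract describes, below is a minimal sketch of the classical Banach fixed-point iteration for inverting a residual unit $\mathbf{y} = \mathbf{x} + g(\mathbf{x})$ with a rectifier block $g$, in the spirit of invertible residual networks [2]. This is not the paper's proposed algorithm; the specific form of $g$ (two weight matrices around a ReLU) and the contractive scaling of the weights are illustrative assumptions under which the iteration is guaranteed to converge.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10  # dimension of x, matching the experiments in Figures 2-3

# Hypothetical residual unit y = x + g(x), g(x) = W2 @ relu(W1 @ x + b).
# Scaling so that ||W1|| * ||W2|| < 1 makes g a contraction (ReLU is
# 1-Lipschitz), one sufficient condition for the iteration to converge.
W1 = rng.standard_normal((d, d))
W1 *= 0.6 / np.linalg.norm(W1, 2)   # spectral norm 0.6
W2 = rng.standard_normal((d, d))
W2 *= 0.6 / np.linalg.norm(W2, 2)   # so Lip(g) <= 0.36 < 1
b = rng.standard_normal(d)

def g(x):
    return W2 @ np.maximum(W1 @ x + b, 0.0)

def forward(x):
    return x + g(x)

def invert(y, n_iter=100):
    # Fixed-point iteration x_{k+1} = y - g(x_k); the error contracts
    # geometrically (by Lip(g) per step) when g is contractive.
    x = y.copy()
    for _ in range(n_iter):
        x = y - g(x)
    return x

x = rng.standard_normal(d)
y = forward(x)
x_rec = invert(y)
print(np.max(np.abs(x - x_rec)))  # recovery error, tiny after 100 steps
```

The contribution claimed in the abstract is precisely that some residual networks falling outside this contractive regime can still be inverted by the proposed method.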

Citation: Ruhua Wang, Senjian An, Wanquan Liu, Ling Li. Fixed-point algorithms for inverse of residual rectifier neural networks. Mathematical Foundations of Computing, doi: 10.3934/mfc.2020024
References:

[1] S. An, F. Boussaid and M. Bennamoun, How can deep rectifier networks achieve linear separability and preserve distances?, International Conference on Machine Learning, 2015, 514–523.

[2] J. Behrmann, W. Grathwohl, R. T. Q. Chen, D. Duvenaud and J.-H. Jacobsen, Invertible residual networks, International Conference on Machine Learning, 2019, 573–582.

[3] F. Chollet, Xception: Deep learning with depthwise separable convolutions, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, 1251–1258. doi: 10.1109/CVPR.2017.195.

[4] M. Cisse, P. Bojanowski, E. Grave, Y. Dauphin and N. Usunier, Parseval networks: Improving robustness to adversarial examples, in Proceedings of the 34th International Conference on Machine Learning - Volume 70, 2017, 854–863.

[5] L. Dinh, D. Krueger and Y. Bengio, NICE: Non-linear independent components estimation, preprint, arXiv: 1410.8516.

[6] L. Dinh, J. Sohl-Dickstein and S. Bengio, Density estimation using real NVP, preprint, arXiv: 1605.08803.

[7] A. Dosovitskiy and T. Brox, Inverting visual representations with convolutional networks, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 4829–4837. doi: 10.1109/CVPR.2016.522.

[8] A. N. Gomez, M. Ren, R. Urtasun and R. B. Grosse, The reversible residual network: Backpropagation without storing activations, in Advances in Neural Information Processing Systems, 2017, 2214–2224.

[9] I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, 2016.

[10] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto and H. Adam, MobileNets: Efficient convolutional neural networks for mobile vision applications, preprint, arXiv: 1704.04861.

[11] S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, preprint, arXiv: 1502.03167.

[12] J.-H. Jacobsen, A. Smeulders and E. Oyallon, $i$-RevNet: Deep invertible networks, preprint, arXiv: 1802.07088.

[13] D. P. Kingma and P. Dhariwal, Glow: Generative flow with invertible 1x1 convolutions, in Advances in Neural Information Processing Systems, 2018, 10215–10224.

[14] A. Mahendran and A. Vedaldi, Understanding deep image representations by inverting them, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, 5188–5196. doi: 10.1109/CVPR.2015.7299155.

[15] E. Oyallon, Building a regular decision boundary with deep networks, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, 5106–5114. doi: 10.1109/CVPR.2017.204.

[16] R. Prenger, R. Valle and B. Catanzaro, WaveGlow: A flow-based generative network for speech synthesis, in ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, UK, 2019, 3617–3621. doi: 10.1109/ICASSP.2019.8683143.

[17] A. Saberi, A. A. Stoorvogel and P. Sannuti, Inverse filtering and deconvolution, Internat. J. Robust Nonlinear Control, 11 (2001), 131–156. doi: 10.1002/rnc.553.

[18] R. Shwartz-Ziv and N. Tishby, Opening the black box of deep neural networks via information, preprint, arXiv: 1703.00810.

[19] T. F. van der Ouderaa and D. E. Worrall, Reversible GANs for memory-efficient image-to-image translation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, 4720–4728.

[20] J. Wang and L. Perez, The effectiveness of data augmentation in image classification using deep learning, preprint, arXiv: 1712.04621.

[21] M. D. Zeiler and R. Fergus, Visualizing and understanding convolutional networks, in Computer Vision – ECCV 2014, Lecture Notes in Computer Science, 8689, Springer, Cham, 2014, 818–833. doi: 10.1007/978-3-319-10590-1_53.


Figure 1.  The proposed residual network architecture
Figure 2.  Inverse of the rectifier linear transform: percentage of 500 random cases that are invertible as $ \gamma $ varies, with the dimension of $ \mathbf{x} $ fixed at 10
Figure 3.  Inverse of the residual unit with the fully-connected layer: percentage of 500 random cases that are invertible as $ \gamma $ varies, with the dimension of $ \mathbf{x} $ fixed at 10
Figure 4.  Comparison of recovered images with the original digit images. The first row shows the original images; the second and third rows show images recovered by the proposed fixed-point method and the existing fixed-point method, respectively
Figure 5.  Relative error rates (%) of the recovered images. One hundred samples per class (1000 samples in total) were chosen and recovered
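The relative error reported in Figure 5 can be computed with a short helper. The definition below (Euclidean norm of the residual over the norm of the original image) is an assumed, common choice; the paper's exact metric may differ.

```python
import numpy as np

def relative_error(x_orig, x_rec):
    # Assumed metric: ||x - x_hat||_2 / ||x||_2, expressed as a fraction.
    return np.linalg.norm(x_orig - x_rec) / np.linalg.norm(x_orig)

# Toy example: a flattened 28x28 MNIST-sized image, recovery off by 0.01
# per pixel, giving a 1% relative error.
x = np.ones(784)
x_hat = x + 0.01
print(f"{100 * relative_error(x, x_hat):.2f}%")  # 1.00%
```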
