[1]
|
Z. Allen-Zhu, Y. Li and Z. Song, A convergence theory for deep learning via over-parameterization, preprint, arXiv: 1811.03962.
|
[2]
|
A. Brutzkus and A. Globerson, Globally optimal gradient descent for a ConvNet with Gaussian inputs, preprint, arXiv: 1702.07966.
|
[3]
|
A. Brutzkus and A. Globerson, Over-parameterization improves generalization in the XOR detection problem, preprint.
|
[4]
|
A. Brutzkus, A. Globerson, E. Malach and S. Shalev-Shwartz, SGD learns over-parameterized networks that provably generalize on linearly separable data, in 6th International Conference on Learning Representations, Vancouver, BC, Canada, 2018.
|
[5]
|
R. T. des Combes, M. Pezeshki, S. Shabanian, A. Courville and Y. Bengio, Convergence properties of deep neural networks on separable data, 2019. Available from: https://openreview.net/forum?id=HJfQrs0qt7.
|
[6]
|
S. S. Du, X. Zhai, B. Póczos and A. Singh, Gradient descent provably optimizes over-parameterized neural networks, preprint, arXiv: 1810.02054.
|
[7]
|
C. Ho and S. Zimmerman, On the number of regions in an $m$-dimensional space cut by $n$ hyperplanes, Austral. Math. Soc. Gaz., 33 (2006), 260-264.
|
[8]
|
S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Comput., 9 (1997), 1735-1780.
doi: 10.1162/neco.1997.9.8.1735.
|
[9]
|
A. Krizhevsky, I. Sutskever and G. E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems, 2012, 1097-1105.
doi: 10.1145/3065386.
|
[10]
|
LeNet-5 - A Classic CNN Architecture. Available from: https://engmrk.com/lenet-5-a-classic-cnn-architecture/.
|
[11]
|
Y. Li and Y. Liang, Learning overparameterized neural networks via stochastic gradient descent on structured data, preprint, arXiv: 1808.01204.
|
[12]
|
S. Liang, R. Sun, Y. Li and R. Srikant, Understanding the loss surface of neural networks for binary classification, preprint, arXiv: 1803.00909.
|
[13]
|
B. Neyshabur, Z. Li, S. Bhojanapalli, Y. LeCun and N. Srebro, Towards understanding the role of over-parametrization in generalization of neural networks, preprint, arXiv: 1805.12076.
|
[14]
|
Q. Nguyen, M. C. Mukkamala and M. Hein, On the loss landscape of a class of deep neural networks with no bad local valleys, preprint, arXiv: 1809.10749.
|
[15]
|
S. Ren, K. He, R. Girshick and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, IEEE Trans. Pattern Anal. Machine Intell., 39 (2017), 1137-1149.
doi: 10.1109/TPAMI.2016.2577031.
|
[16]
|
A. Rosebrock, LeNet - Convolutional neural network in Python, 2016. Available from: https://www.pyimagesearch.com/2016/08/01/lenet-convolutional-neural-network-in-python/.
|
[17]
|
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, et al., Mastering the game of Go with deep neural networks and tree search, Nature, 529 (2016), 484-489.
doi: 10.1038/nature16961.
|
[18]
|
H. Wang, Y. Wang, Z. Zhou, X. Ji, D. Gong, et al., CosFace: Large margin cosine loss for deep face recognition, preprint, arXiv: 1801.09414.
doi: 10.1109/CVPR.2018.00552.
|
[19]
|
P. Yin, J. Xin and Y. Qi, Linear feature transform and enhancement of classification on deep neural network, J. Sci. Comput., 76 (2018), 1396-1406.
doi: 10.1007/s10915-018-0666-1.
|