
# Designing neural networks for modeling biological data: A statistical perspective

In this paper, we propose a strategy for selecting the hidden layer size in feedforward neural network models. The procedure is based on comparing different models in terms of their out-of-sample predictive ability, for a specified loss function. To overcome the problem of data snooping, we extend the reality-check scheme with modifications suited to comparing nested models. Applications of the proposed procedure to simulated and real data sets show that it selects parsimonious neural network models with the highest predictive accuracy.
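The core of the comparison step can be sketched as a reality-check bootstrap test on out-of-sample losses. The sketch below is a minimal illustration, not the paper's exact procedure: it assumes the per-observation out-of-sample losses have already been computed for a benchmark network and for candidate networks with larger hidden layers, and it uses a plain i.i.d. bootstrap rather than the modified scheme for nested models developed in the paper. The function name `reality_check_pvalue` is a hypothetical label for this sketch.

```python
import numpy as np

def reality_check_pvalue(bench_losses, cand_losses, n_boot=999, seed=0):
    """Reality-check-style bootstrap test (sketch).

    bench_losses : (T,) out-of-sample losses of the benchmark (smaller) model.
    cand_losses  : (M, T) out-of-sample losses of M candidate models
                   (e.g. networks with larger hidden layers).
    Returns a bootstrap p-value for H0: no candidate improves on the benchmark.
    """
    rng = np.random.default_rng(seed)
    # Loss differentials: positive mean => candidate beats the benchmark.
    d = bench_losses[None, :] - cand_losses
    T = d.shape[1]
    # Test statistic: max (over candidates) of the scaled mean differential.
    stat = np.sqrt(T) * d.mean(axis=1).max()
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, T, size=T)  # i.i.d. bootstrap resample of time points
        db = d[:, idx]
        # Recentred bootstrap statistic, as in reality-check procedures.
        boot[b] = np.sqrt(T) * (db.mean(axis=1) - d.mean(axis=1)).max()
    return float((boot >= stat).mean())
```

A small p-value indicates that at least one larger network genuinely improves out-of-sample predictive accuracy over the parsimonious benchmark; otherwise the smaller model is retained.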
Mathematics Subject Classification: Primary: 62G08, 62H15; Secondary: 62F40, 92B20.

Open Access under a Creative Commons license