January 2016, 1(1): 129-137. doi: 10.3934/bdia.2016.1.129

On balancing between optimal and proportional categorical predictions

1. Department of Mathematics, Guangzhou University, Guangzhou, Guangdong 510006, China

2. Kochava Inc, 414 Church Street, Suite 306, Sandpoint, Idaho 83864, United States

Received May 2015; Revised August 2015; Published September 2015

A bias-variance dilemma arises in categorical data mining and analysis: a prediction method can aim to maximize the overall point-hit accuracy either without constraint or under the constraint of minimizing the distribution bias, but it can hardly achieve both at the same time. This article proposes a scheme to balance these two prediction objectives. An experiment with a real data set demonstrates some of the scheme's characteristics, and some of its basic properties are also discussed.
Citation: Wenxue Huang, Yuanyi Pan. On balancing between optimal and proportional categorical predictions. Big Data & Information Analytics, 2016, 1 (1) : 129-137. doi: 10.3934/bdia.2016.1.129
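
The trade-off stated in the abstract can be illustrated with a small numerical sketch. It is not the authors' scheme, which this page does not specify. It assumes a hypothetical mixing weight `w`: with probability `w` the predictor returns the most frequent category (the unconstrained accuracy-maximizing choice), and otherwise it draws a category from the true distribution `p` (the proportional choice). Measuring distribution bias by total variation distance is likewise an assumption made only for this example.

```python
def accuracy_and_bias(p, w):
    """Expected accuracy and distribution bias of a mixed predictor.

    p: dict mapping category -> true probability (illustrative input).
    w: hypothetical weight in [0, 1] placed on always predicting the mode.
    """
    mode = max(p, key=p.get)
    # Distribution of predictions: mass w on the mode, the rest proportional to p.
    q = {c: (1 - w) * pc + (w if c == mode else 0.0) for c, pc in p.items()}
    # Expected point-hit accuracy when the prediction is drawn independently
    # of the true category.
    accuracy = sum(q[c] * p[c] for c in p)
    # Distribution bias, taken here as the total variation distance
    # between the predicted and the true distributions.
    bias = 0.5 * sum(abs(q[c] - p[c]) for c in p)
    return accuracy, bias


p = {"a": 0.6, "b": 0.3, "c": 0.1}
for w in (0.0, 0.5, 1.0):
    acc, bias = accuracy_and_bias(p, w)
    print(f"w={w:.1f}  accuracy={acc:.3f}  bias={bias:.3f}")
```

Under these assumptions, raising `w` increases the expected accuracy and the distribution bias together, which is the tension a balancing scheme has to resolve.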