April 2017, 2(2): 119-125. doi: 10.3934/bdia.2017004

Proportional association based RoI model

1. School of Mathematics and Information Sciences, Guangzhou University, Guangzhou, 510006, China

2. Clearpier Inc., 1300-121 Richmond St. W., Toronto, Ontario M5H 2K1, Canada

3. School of Mathematics and Information Sciences, Guangzhou University, Guangzhou, 510006, China

* Corresponding authors: Wenxue Huang and Lihong Zheng.

Published April 2017

Building on the local-to-global proportional association measure proposed by Huang, Shi and Wang [9], an association measure is proposed that maximizes the expected return on investment (RoI) when cost and revenue information is known. A descriptive experiment on a synthetic data set is presented.

Citation: Wenxue Huang, Yuanyi Pan, Lihong Zheng. Proportional association based RoI model. Big Data & Information Analytics, 2017, 2 (2): 119-125. doi: 10.3934/bdia.2017004
References:
[1] C. Cornforth, What makes boards effective? An examination of the relationships between board inputs, structures, processes and effectiveness in non-profit organisations, Corporate Governance: An International Review, 9 (2011), 217-227.

[2] L. L. Fong, M. S. Squillante and R. E. Hough, Computer resource proportional utilization and response time scheduling, US Patent, 6 (2001), 263-359.

[3] L. A. Goodman, A single general method for the analysis of cross-classified data: Reconciliation and synthesis of some methods of Pearson, Yule, and Fisher, and also some methods of correspondence analysis and association analysis, Journal of the American Statistical Association, 91 (1996), 408-428. doi: 10.1080/01621459.1996.10476702.

[4] L. A. Goodman and W. H. Kruskal, Measures of Association for Cross Classifications, Springer, 1979.

[5] M. F. Gregor, L. Yang, E. Fabbrini, B. S. Mohammed, J. C. Eagon, G. S. Hotamisligil and S. Klein, Endoplasmic reticulum stress is reduced in tissues of obese subjects after weight loss, Diabetes, 58 (2009), 693-700. doi: 10.2337/db08-1220.

[6] W. Huang and Y. Pan, On balancing between optimal and proportional categorical predictions, Big Data and Information Analytics, 1 (2016), 129-137. doi: 10.3934/bdia.2016.1.129.

[7] W. Huang, Y. Pan and J. Wu, Supervised discretization with GK-τ, Procedia Computer Science, 17 (2013), 114-120.

[8] W. Huang, Y. Pan and J. Wu, Performance measures of rare events targeting, International Journal of Data Analysis Techniques and Strategies, 6 (2014), 105-120. doi: 10.1504/IJDATS.2014.062450.

[9] W. Huang, Y. Shi and X. Wang, A nominal association matrix with feature selection for categorical data, Communications in Statistics - Theory and Methods, 46 (2017), 7798-7819. doi: 10.1080/03610926.2014.930911.

[10] H. Hwang, T. Jung and E. Suh, An LTV model and customer segmentation based on customer value: A case study on the wireless telecommunication industry, Expert Systems with Applications, 26 (2004), 181-188. doi: 10.1016/S0957-4174(03)00133-7.

[11] T. Lin, Y. Yang and H. T. Shiau, A work weighted state vector control method for geometrically nonlinear analysis, Computers and Structures, 46 (1993), 689-694. doi: 10.1016/0045-7949(93)90397-V.

[12] C. X. Ling and C. Li, Data mining for direct marketing: Problems and solutions, in Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), AAAI Press, (1998), 73-79.

[13] J. R. Quinlan, Induction of decision trees, Machine Learning, 1 (1986), 81-106. doi: 10.1007/BF00116251.


Table 1. Contingency tables: $X_1$ vs $Y$ and $X_2$ vs $Y$
$X_1|Y$    $y_1$  $y_2$  $y_3$  $y_4$    $X_2|Y$    $y_1$  $y_2$  $y_3$  $y_4$
$x_{1_1}$  1000   100    500    400      $x_{2_1}$  500    300    200    1500
$x_{1_2}$  200    1500   500    300      $x_{2_2}$  500    400    400    50
$x_{1_3}$  400    50     500    500      $x_{2_3}$  500    500    300    700
$x_{1_4}$  300    700    500    400      $x_{2_4}$  500    400    1000   100
$x_{1_5}$  200    500    400    200      $x_{2_5}$  200    400    500    200
Table 2. Association matrices: $X_1$ vs $Y$ and $X_2$ vs $Y$
$Y|\hat{Y}$  $\hat{y_1}|X_1$  $\hat{y_2}|X_1$  $\hat{y_3}|X_1$  $\hat{y_4}|X_1$    $Y|\hat{Y}$  $\hat{y_1}|X_2$  $\hat{y_2}|X_2$  $\hat{y_3}|X_2$  $\hat{y_4}|X_2$
$y_1$        0.34             0.18             0.27             0.22               $y_1$        0.26             0.22             0.27             0.25
$y_2$        0.13             0.48             0.24             0.15               $y_2$        0.25             0.24             0.29             0.23
$y_3$        0.24             0.28             0.27             0.21               $y_3$        0.25             0.24             0.36             0.15
$y_4$        0.25             0.25             0.28             0.22               $y_4$        0.22             0.18             0.14             0.46
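
The entries of Table 2 appear consistent with reading each association matrix as $\sum_x p(x|y_s)\,p(y_t|x)$, that is, the probability that a proportional prediction based on $X$ outputs $\hat{y}_t$ when the true class is $y_s$. This formula is an assumption inferred from the tabulated numbers, not a definition quoted from the paper, and the function name `association_matrix` is hypothetical. A minimal Python sketch using the counts of Table 1:

```python
# Counts copied from Table 1; rows are the values of X, columns are y_1..y_4.
X1_VS_Y = [
    [1000, 100, 500, 400],
    [200, 1500, 500, 300],
    [400, 50, 500, 500],
    [300, 700, 500, 400],
    [200, 500, 400, 200],
]
X2_VS_Y = [
    [500, 300, 200, 1500],
    [500, 400, 400, 50],
    [500, 500, 300, 700],
    [500, 400, 1000, 100],
    [200, 400, 500, 200],
]


def association_matrix(counts):
    """Return M with M[s][t] = sum_x p(x | y_s) * p(y_t | x).

    Assumed reading of Table 2 (not taken from the paper's text):
    the chance of predicting y_t proportionally from X when the truth is y_s.
    """
    n_classes = len(counts[0])
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(row[s] for row in counts) for s in range(n_classes)]
    matrix = []
    for s in range(n_classes):
        matrix.append([
            sum(row[s] * row[t] / total
                for row, total in zip(counts, row_totals)) / col_totals[s]
            for t in range(n_classes)
        ])
    return matrix


if __name__ == "__main__":
    for name, table in (("X_1", X1_VS_Y), ("X_2", X2_VS_Y)):
        print(name)
        for row in association_matrix(table):
            print([round(value, 2) for value in row])
```

Under this assumption the diagonal entries for $X_1$ round to 0.34, 0.48, 0.27 and 0.22, which agrees with Table 2, and every row sums to 1 as a conditional distribution should.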
Table 3. Contingency table for correct predictions: $W_1$ and $W_2$
$X_1|Y$    $y_1$  $y_2$  $y_3$  $y_4$    $X_2|Y$    $y_1$  $y_2$  $y_3$  $y_4$
$x_{1_1}$  471    6      121    83       $x_{2_1}$  98     34     19     926
$x_{1_2}$  101    746    159    107      $x_{2_2}$  177    114    113    1
$x_{1_3}$  130    1      167    157      $x_{2_3}$  114    124    42     256
$x_{1_4}$  44     243    145    85       $x_{2_4}$  109    81     489    6
$x_{1_5}$  21     210    114    32       $x_{2_5}$  36     119    206    28
Table 4. Association measures: $\omega^{Y|X}$ and $\widehat{\omega}^{Y|X}$
$X$    $\omega^{Y|X}$  $\widehat{\omega}^{Y|X}$  total revenue  average revenue
$X_1$  0.3406          0.456                     4313           0.4714
$X_2$  0.3391          0.564                     5178           0.5659
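
The $\omega^{Y|X}$ column can be reproduced from the counts of Table 1 if $\omega^{Y|X}$ is taken to be the expected accuracy of proportional prediction, $\sum_x \sum_y p(x,y)^2 / p(x)$. That reading is an assumption made here for illustration, and the function name `proportional_association` is hypothetical. The revenue-weighted $\widehat{\omega}^{Y|X}$ is not reproduced, because the revenue vector is not given in this section.

```python
# Counts copied from Table 1; rows are the values of X, columns are y_1..y_4.
X1_VS_Y = [
    [1000, 100, 500, 400],
    [200, 1500, 500, 300],
    [400, 50, 500, 500],
    [300, 700, 500, 400],
    [200, 500, 400, 200],
]
X2_VS_Y = [
    [500, 300, 200, 1500],
    [500, 400, 400, 50],
    [500, 500, 300, 700],
    [500, 400, 1000, 100],
    [200, 400, 500, 200],
]


def proportional_association(counts):
    """Assumed form: omega^{Y|X} = sum_x sum_y p(x, y)^2 / p(x).

    In terms of counts this is (1/N) * sum_x (sum_y n(x, y)^2 / n(x)).
    """
    grand_total = sum(sum(row) for row in counts)
    per_row = (sum(c * c for c in row) / sum(row) for row in counts)
    return sum(per_row) / grand_total


if __name__ == "__main__":
    print("omega for X_1:", round(proportional_association(X1_VS_Y), 4))
    print("omega for X_2:", round(proportional_association(X2_VS_Y), 4))
```

With these counts the sketch gives about 0.3406 for $X_1$ and 0.3391 for $X_2$, in line with the $\omega^{Y|X}$ column of Table 4.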
Table 5. Association with/without cost vectors: $X_1$ and $X_2$
$X$    $\omega^{Y|X}$  $\widehat{\omega}^{Y|X}$  $\bar{\omega}^{Y|X}$  total profit  average profit
$X_1$  0.3406          0.3406                    1.3057                12016.17      1.3132
$X_2$  0.3391          0.3391                    1.8546                17072.17      1.8658
Table 6. Association with/without new cost vectors: $X_1$ and $X_2$
$X$    $\omega^{Y|X}$  $\widehat{\omega}^{Y|X}$  $\bar{\omega}^{Y|X}$  total profit  average profit
$X_1$  0.3406          0.3406                    1.7420                15938.17      1.7419
$X_2$  0.3391          0.3391                    1.3424                12268.17      1.3408
Table 7. Simulated feature selection: one variable
$X$    $|Dmn(X)|$  $\omega^{Y|X}$  $\bar{\omega}^{Y|X}$  total profit  average profit
$V_1$  7           0.3906          3.5381                35390         3.5390
$V_2$  4           0.3882          3.8433                38771         3.8771
$V_3$  4           0.3250          4.8986                48678         4.8678
$V_4$  8           0.3274          3.7050                36889         3.6889
Table 8. Simulated feature selection: two variables
$X_1, X_2$  $|Dmn(X_1, X_2)|$  $\omega^{Y|(X_1, X_2)}$  $\bar{\omega}^{Y|(X_1, X_2)}$  total profit  average profit
$V_1, V_2$  28                 0.4367                   1.8682                         18971         1.8971
$V_1, V_3$  28                 0.4025                   2.1106                         20746         2.0746
$V_1, V_4$  56                 0.4055                   1.8055                         17915         1.7915
$V_3, V_2$  16                 0.4055                   2.3585                         24404         2.4404
$V_3, V_4$  32                 0.3385                   2.0145                         19903         1.9903