
A novel approach using incremental under sampling for data stream mining
1. Research Scholar, GITAM University, Telangana, Hyderabad, India
2. Sambalpur University Institute of Information Technology, Sambalpur, Orissa, India
Data stream mining has become very popular in recent years, as advanced electronic devices generate continuous data streams. The performance of standard learning algorithms is compromised by the class imbalance present in real-world data streams. In this paper, we propose an algorithm, Increment Under Sampling for Data Streams (IUSDS), which uses a unique under-sampling technique to nearly balance the data sets and so minimize the effect of imbalance on the stream mining process. The experimental analysis suggests that the proposed algorithm improves knowledge discovery over benchmark algorithms such as C4.5 and Hoeffding tree in terms of standard performance measures, namely accuracy, AUC, precision, recall, F-measure, TP rate, FP rate and TN rate.
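The abstract does not spell out the IUSDS procedure, so the following is only a minimal sketch of the general idea it names: randomly under-sampling the majority class each time a new chunk of the stream arrives. The function names, the `ratio` parameter and the `(features, label)` row layout are assumptions made for illustration, and plain random under-sampling stands in for the authors' own technique.

```python
import random
from collections import Counter


def undersample(rows, ratio=1.0, seed=0):
    """Randomly drop majority-class rows until majority <= ratio * minority.

    `rows` is a list of (features, label) pairs with two class labels.
    This is generic random under-sampling, not the IUSDS algorithm itself.
    """
    counts = Counter(label for _, label in rows)
    if len(counts) < 2:
        return list(rows)  # only one class seen so far, nothing to balance
    (maj_label, maj_n), (_, min_n) = counts.most_common(2)
    keep_n = min(maj_n, int(ratio * min_n))
    rng = random.Random(seed)
    majority = [row for row in rows if row[1] == maj_label]
    minority = [row for row in rows if row[1] != maj_label]
    return minority + rng.sample(majority, keep_n)


def incremental_undersample(chunks, ratio=1.0, seed=0):
    """Yield a (nearly) balanced training set after each newly arrived chunk."""
    seen = []
    for chunk in chunks:
        seen.extend(chunk)
        yield undersample(seen, ratio, seed)


# Hypothetical first chunk with 201 majority and 85 minority rows.
first = [((i,), "neg") for i in range(201)] + [((i,), "pos") for i in range(85)]
balanced = undersample(first)
print(Counter(label for _, label in balanced))
```

A `ratio` slightly above 1.0 keeps a few extra majority rows, which is one way to read the abstract's "nearly balance".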


S. no. | Dataset | Symbol | Instances | Majority | Minority | IR (majority/minority) |
1. | Breast-cancer | B1 | 286 | 201 | 85 | 2.36 |
2. | Breast-w | B2 | 699 | 458 | 241 | 1.90 |
3. | Colic | C1 | 368 | 232 | 136 | 1.71 |
4. | Credit-g | C2 | 1000 | 700 | 300 | 2.33 |
5. | Diabetes | D1 | 768 | 500 | 268 | 1.87 |
6. | Heart-c | H1 | 303 | 165 | 138 | 1.19 |
7. | Heart-h | H2 | 294 | 188 | 106 | 1.77 |
8. | Heart-stat | H3 | 270 | 150 | 120 | 1.25 |
9. | Hepatitis | H4 | 155 | 123 | 32 | 3.85 |
10. | Ionosphere | I1 | 351 | 225 | 126 | 1.79 |
11. | Kr-vs-kp | K1 | 3196 | 1669 | 1527 | 1.09 |
12. | Labor | L1 | 57 | 37 | 20 | 1.85 |
13. | Mushroom | M1 | 8124 | 4208 | 3916 | 1.08 |
14. | Sick | S1 | 3772 | 3541 | 231 | 15.32 |
15. | Sonar | S2 | 208 | 111 | 97 | 1.15 |
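The IR column in the table above is consistent with the usual imbalance ratio, the majority count divided by the minority count. A short sketch of that arithmetic follows; the function name is hypothetical and the counts are copied from the first five rows of the table.

```python
def imbalance_ratio(majority, minority):
    """Return majority / minority; larger values mean stronger imbalance."""
    if minority <= 0:
        raise ValueError("minority count must be positive")
    return majority / minority


# (majority, minority) counts copied from the dataset table.
counts = {
    "Breast-cancer": (201, 85),
    "Breast-w": (458, 241),
    "Colic": (232, 136),
    "Credit-g": (700, 300),
    "Diabetes": (500, 268),
}

for name, (maj, mino) in counts.items():
    print(f"{name}: IR = {imbalance_ratio(maj, mino):.2f}")
```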
Chunk: {datasets included} | Instances | Majority | Minority | IR |
Chunk 1:{B1} | 286 | 201 | 85 | 2.36 |
Chunk 2:{B1, B2} | 985 | 659 | 326 | 2.02 |
Chunk 3:{B1, B2, C1} | 1353 | 891 | 462 | 1.92 |
Chunk 4:{B1, B2, C1, C2} | 2353 | 1591 | 1062 | 1.49 |
Chunk 5:{B1, B2, C1, C2, D1} | 3121 | 2091 | 1325 | 1.57 |
Chunk 6:{B1, B2, C1, C2, D1, H1} | 3424 | 2256 | 1463 | 1.52 |
Chunk 7:{B1, B2, C1, C2, D1, H1, H2} | 3718 | 2444 | 1569 | 1.55 |
Chunk 8:{B1, B2, C1, C2, D1, H1, H2, H3} | 3988 | 2594 | 1689 | 1.53 |
Chunk 9:{B1, B2, C1, C2, D1, H1, H2, H3, H4} | 4143 | 2717 | 1721 | 1.57 |
Chunk 10:{B1, B2, C1, C2, D1, H1, H2, H3, H4, I1} | 4494 | 2942 | 1847 | 1.59 |
Chunk 11:{B1, B2, C1, C2, D1, H1, H2, H3, H4, I1, K1} | 7690 | 4611 | 3374 | 1.36 |
Chunk 12:{B1, B2, C1, C2, D1, H1, H2, H3, H4, I1, K1, L1} | 7747 | 4648 | 3394 | 1.36 |
Chunk 13:{B1, B2, C1, C2, D1, H1, H2, H3, H4, I1, K1, L1, M1} | 15871 | 8856 | 7310 | 1.21 |
Chunk 14:{B1, B2, C1, C2, D1, H1, H2, H3, H4, I1, K1, L1, M1, S1} | 19643 | 12397 | 7541 | 1.64 |
Chunk 15:{B1, B2, C1, C2, D1, H1, H2, H3, H4, I1, K1, L1, M1, S1, S2} | 19851 | 12508 | 7638 | 1.63 |
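Each chunk in the table above is the union of all datasets seen so far, so its counts are running totals. The sketch below reproduces the first three rows from the per-dataset majority and minority counts; the function name and tuple layout are assumptions made for illustration.

```python
def cumulative_chunks(datasets):
    """Yield (symbols, instances, majority, minority) for each growing chunk.

    `datasets` is a sequence of (symbol, majority, minority) tuples in
    arrival order; every chunk contains all datasets that arrived so far.
    """
    symbols, majority, minority = [], 0, 0
    for symbol, maj, mino in datasets:
        symbols.append(symbol)
        majority += maj
        minority += mino
        yield tuple(symbols), majority + minority, majority, minority


# Counts for B1, B2 and C1 copied from the dataset table.
stream = [("B1", 201, 85), ("B2", 458, 241), ("C1", 232, 136)]

for symbols, total, maj, mino in cumulative_chunks(stream):
    print(f"{{{', '.join(symbols)}}} | {total} | {maj} | {mino}")
```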
TN rate by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 0.260 | 0.395 | 0.325 |
Chunk 2 (maj=659; min=326) | 0.596 | 0.685 | 0.652 |
Chunk 3 (maj=891; min=462) | 0.636 | 0.693 | 0.689 |
Chunk 4 (maj=1591; min=762) | 0.577 | 0.642 | 0.624 |
Chunk 5 (maj=2091; min=1030) | 0.582 | 0.634 | 0.638 |
Chunk 6 (maj=2214; min=1062) | 0.635 | 0.674 | 0.685 |
Chunk 7 (maj=2439; min=1188) | 0.679 | 0.707 | 0.723 |
Chunk 8 (maj=2476; min=1208) | 0.702 | 0.738 | 0.740 |
Chunk 9 (maj=6017; min=1438) | 0.721 | 0.667 | 0.759 |
Chunk 10 (maj=6128; min=1536) | 0.724 | 0.657 | 0.757 |
Chunk 11 (maj=6395; min=1704) | 0.745 | 0.684 | 0.778 |
Accuracy (%) by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 74.28 | 72.18 | 71.73 |
Chunk 2 (maj=659; min=326) | 84.64 | 84.09 | 84.94 |
Chunk 3 (maj=891; min=462) | 84.81 | 81.90 | 84.03 |
Chunk 4 (maj=1591; min=762) | 81.42 | 80.19 | 79.10 |
Chunk 5 (maj=2091; min=1030) | 80.04 | 79.30 | 79.26 |
Chunk 6 (maj=2214; min=1062) | 79.90 | 79.86 | 79.59 |
Chunk 7 (maj=2439; min=1188) | 81.31 | 81.06 | 81.23 |
Chunk 8 (maj=2476; min=1208) | 80.97 | 82.15 | 81.01 |
Chunk 9 (maj=6017; min=1438) | 82.94 | 83.44 | 82.94 |
Chunk 10 (maj=6128; min=1536) | 82.01 | 81.92 | 82.17 |
Chunk 11 (maj=6395; min=1704) | 83.33 | 83.11 | 83.52 |
FP rate by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 0.740 | 0.605 | 0.675 |
Chunk 2 (maj=659; min=326) | 0.404 | 0.315 | 0.348 |
Chunk 3 (maj=891; min=462) | 0.364 | 0.307 | 0.311 |
Chunk 4 (maj=1591; min=762) | 0.423 | 0.358 | 0.376 |
Chunk 5 (maj=2091; min=1030) | 0.418 | 0.366 | 0.362 |
Chunk 6 (maj=2214; min=1062) | 0.365 | 0.326 | 0.315 |
Chunk 7 (maj=2439; min=1188) | 0.321 | 0.293 | 0.277 |
Chunk 8 (maj=2476; min=1208) | 0.298 | 0.262 | 0.260 |
Chunk 9 (maj=6017; min=1438) | 0.279 | 0.333 | 0.241 |
Chunk 10 (maj=6128; min=1536) | 0.276 | 0.343 | 0.243 |
Chunk 11 (maj=6395; min=1704) | 0.255 | 0.316 | 0.222 |
AUC by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 0.606 | 0.683 | 0.637 |
Chunk 2 (maj=659; min=326) | 0.782 | 0.836 | 0.812 |
Chunk 3 (maj=891; min=462) | 0.802 | 0.832 | 0.833 |
Chunk 4 (maj=1591; min=762) | 0.764 | 0.820 | 0.777 |
Chunk 5 (maj=2091; min=1030) | 0.761 | 0.818 | 0.787 |
Chunk 6 (maj=2214; min=1062) | 0.746 | 0.819 | 0.775 |
Chunk 7 (maj=2439; min=1188) | 0.766 | 0.836 | 0.795 |
Chunk 8 (maj=2476; min=1208) | 0.761 | 0.845 | 0.791 |
Chunk 9 (maj=6017; min=1438) | 0.782 | 0.813 | 0.810 |
Chunk 10 (maj=6128; min=1536) | 0.779 | 0.812 | 0.806 |
Chunk 11 (maj=6395; min=1704) | 0.798 | 0.826 | 0.821 |
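AUC is one of the measures the abstract reports. As a rough illustration of what that number means, the sketch below computes AUC as the fraction of (positive, negative) pairs that a classifier's scores rank correctly, counting ties as half. The function name and the toy scores are hypothetical; this is the standard rank-based definition, not necessarily how the authors' tooling computes it.

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative).

    `labels` uses 1 for the positive class and 0 for the negative class.
    Tied scores contribute 0.5, as in the Mann-Whitney U statistic.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Toy example: one positive (0.3) is ranked below one negative (0.8).
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in the number of rows, which is fine for a small illustration but not for a full stream.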
Precision by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 0.753 | 0.774 | 0.736 |
Chunk 2 (maj=659; min=326) | 0.859 | 0.881 | 0.861 |
Chunk 3 (maj=891; min=462) | 0.856 | 0.866 | 0.836 |
Chunk 4 (maj=1591; min=762) | 0.834 | 0.849 | 0.800 |
Chunk 5 (maj=2091; min=1030) | 0.827 | 0.839 | 0.802 |
Chunk 6 (maj=2214; min=1062) | 0.774 | 0.795 | 0.785 |
Chunk 7 (maj=2439; min=1188) | 0.791 | 0.802 | 0.804 |
Chunk 8 (maj=2476; min=1208) | 0.779 | 0.811 | 0.798 |
Chunk 9 (maj=6017; min=1438) | 0.803 | 0.826 | 0.819 |
Chunk 10 (maj=6128; min=1536) | 0.795 | 0.807 | 0.813 |
Chunk 11 (maj=6395; min=1704) | 0.811 | 0.822 | 0.828 |
Recall by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 0.947 | 0.860 | 0.909 |
Chunk 2 (maj=659; min=326) | 0.953 | 0.906 | 0.946 |
Chunk 3 (maj=891; min=462) | 0.946 | 0.875 | 0.925 |
Chunk 4 (maj=1591; min=762) | 0.921 | 0.872 | 0.888 |
Chunk 5 (maj=2091; min=1030) | 0.901 | 0.866 | 0.884 |
Chunk 6 (maj=2214; min=1062) | 0.813 | 0.828 | 0.820 |
Chunk 7 (maj=2439; min=1188) | 0.814 | 0.830 | 0.824 |
Chunk 8 (maj=2476; min=1208) | 0.793 | 0.826 | 0.808 |
Chunk 9 (maj=6017; min=1438) | 0.815 | 0.844 | 0.828 |
Chunk 10 (maj=6128; min=1536) | 0.806 | 0.841 | 0.822 |
Chunk 11 (maj=6395; min=1704) | 0.821 | 0.851 | 0.833 |
F-measure by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 0.838 | 0.812 | 0.812 |
Chunk 2 (maj=659; min=326) | 0.900 | 0.890 | 0.898 |
Chunk 3 (maj=891; min=462) | 0.896 | 0.867 | 0.874 |
Chunk 4 (maj=1591; min=762) | 0.873 | 0.857 | 0.838 |
Chunk 5 (maj=2091; min=1030) | 0.860 | 0.849 | 0.838 |
Chunk 6 (maj=2214; min=1062) | 0.785 | 0.803 | 0.791 |
Chunk 7 (maj=2439; min=1188) | 0.794 | 0.808 | 0.804 |
Chunk 8 (maj=2476; min=1208) | 0.774 | 0.810 | 0.790 |
Chunk 9 (maj=6017; min=1438) | 0.798 | 0.827 | 0.813 |
Chunk 10 (maj=6128; min=1536) | 0.790 | 0.815 | 0.807 |
Chunk 11 (maj=6395; min=1704) | 0.807 | 0.828 | 0.821 |
FN rate by chunk
Chunk no | C4.5 | HoeffdingTree | IUSDS |
Chunk 1 (maj=201; min=85) | 0.053 | 0.140 | 0.091 |
Chunk 2 (maj=659; min=326) | 0.047 | 0.094 | 0.054 |
Chunk 3 (maj=891; min=462) | 0.054 | 0.125 | 0.075 |
Chunk 4 (maj=1591; min=762) | 0.079 | 0.128 | 0.112 |
Chunk 5 (maj=2091; min=1030) | 0.099 | 0.134 | 0.116 |
Chunk 6 (maj=2214; min=1062) | 0.187 | 0.172 | 0.180 |
Chunk 7 (maj=2439; min=1188) | 0.186 | 0.170 | 0.176 |
Chunk 8 (maj=2476; min=1208) | 0.207 | 0.174 | 0.192 |
Chunk 9 (maj=6017; min=1438) | 0.185 | 0.156 | 0.172 |
Chunk 10 (maj=6128; min=1536) | 0.194 | 0.159 | 0.178 |
Chunk 11 (maj=6395; min=1704) | 0.179 | 0.149 | 0.167 |
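Most of the measures named in the abstract (accuracy, precision, recall, F-measure, TP rate, FP rate, TN rate) derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard binary-classification measures from confusion-matrix counts."""
    tp_rate = tp / (tp + fn)      # recall / sensitivity
    fn_rate = fn / (tp + fn)      # equals 1 - TP rate
    fp_rate = fp / (fp + tn)      # equals 1 - TN rate
    tn_rate = tn / (fp + tn)      # specificity
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": tp_rate,
        "f_measure": f_measure,
        "tp_rate": tp_rate,
        "fn_rate": fn_rate,
        "fp_rate": fp_rate,
        "tn_rate": tn_rate,
    }


# Hypothetical counts: 100 positives (90 found), 100 negatives (70 rejected).
for name, value in confusion_metrics(tp=90, fn=10, fp=30, tn=70).items():
    print(f"{name}: {value:.3f}")
```

Because TP rate and FN rate sum to 1, as do TN rate and FP rate, a method that improves one member of each pair necessarily improves the other.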
Measure | Systems | Wins | Ties | Losses |
TN Rate | IUSDS vs. C4.5 | 11 | 0 | 0 |
TN Rate | IUSDS vs. HoeffdingTree | 11 | 0 | 0 |
Accuracy | IUSDS vs. C4.5 | 4 | 1 | 6 |
Accuracy | IUSDS vs. HoeffdingTree | 5 | 0 | 6 |
FP Rate | IUSDS vs. C4.5 | 11 | 0 | 0 |
FP Rate | IUSDS vs. HoeffdingTree | 11 | 0 | 0 |
AUC | IUSDS vs. C4.5 | 11 | 0 | 0 |
AUC | IUSDS vs. HoeffdingTree | 1 | 0 | 10 |
Precision | IUSDS vs. C4.5 | 10 | 0 | 1 |
Precision | IUSDS vs. HoeffdingTree | 8 | 1 | 2 |
Recall | IUSDS vs. C4.5 | 5 | 3 | 3 |
Recall | IUSDS vs. HoeffdingTree | 11 | 0 | 0 |
F-measure | IUSDS vs. C4.5 | 9 | 1 | 1 |
F-measure | IUSDS vs. HoeffdingTree | 10 | 0 | 1 |
FN Rate | IUSDS vs. C4.5 | 5 | 3 | 3 |
FN Rate | IUSDS vs. HoeffdingTree | 11 | 0 | 0 |
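A wins/ties/losses summary like the one above can be obtained by comparing two methods chunk by chunk. The sketch below is one plausible way to do that tally; the function name and the `higher_is_better` flag are assumptions. The sample values are the C4.5 and IUSDS columns copied from the table of percentage values earlier on this page, which is assumed here to be the accuracy table.

```python
def win_tie_loss(ours, baseline, higher_is_better=True):
    """Count chunks where `ours` beats, equals, or trails `baseline`.

    Set higher_is_better=False for error-type measures such as FP or FN rate.
    """
    wins = ties = losses = 0
    for a, b in zip(ours, baseline):
        if a == b:
            ties += 1
        elif (a > b) == higher_is_better:
            wins += 1
        else:
            losses += 1
    return wins, ties, losses


# Per-chunk percentages for chunks 1 to 11, copied from the table.
c45 = [74.28, 84.64, 84.81, 81.42, 80.04, 79.90, 81.31, 80.97, 82.94, 82.01, 83.33]
iusds = [71.73, 84.94, 84.03, 79.10, 79.26, 79.59, 81.23, 81.01, 82.94, 82.17, 83.52]

print(win_tie_loss(iusds, c45))  # (4, 1, 6)
```

For these values the tally agrees with the "Accuracy, IUSDS vs. C4.5" row of the summary.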