August 2020, 3(3): 165-183. doi: 10.3934/mfc.2020016

Modal additive models with data-driven structure identification

1. Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, K1N 6N5, Canada

2. College of Science, Huazhong Agricultural University, Wuhan 430070, China

* Corresponding author: Hong Chen

Received: January 2020. Revised: May 2020. Published: June 2020.

Additive models, due to their high flexibility, have received a great deal of attention in high-dimensional regression analysis, and many efforts have been made to capture interactions between predictive variables within them. However, typical approaches are designed under conditional-mean assumptions, which may fail to reveal the underlying structure when the data are contaminated by heavy-tailed noise. In this paper, we propose a penalized modal regression method, Modal Additive Models (MAM), built on a conditional-mode assumption, for simultaneous function estimation and structure identification. MAM approximates the nonparametric function through feed-forward neural networks and maximizes the modal risk subject to constraints on the function space and the group structure. The proposed approach can be implemented with the half-quadratic (HQ) optimization technique, and its asymptotic estimation and selection consistency are established: MAM achieves a satisfactory learning rate and identifies the target group structure with high probability. The effectiveness of MAM is further supported by simulated examples.
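For orientation, the mode-based objective underlying MAM can be sketched as follows. This is the standard kernel-smoothed modal risk used in the modal regression literature (see, e.g., references [12] and [29] below); the paper's exact objective may differ in normalization and in the additive, group-structured form of $ f $:
$$ \widehat{\mathscr{R}}_{\sigma}(f) \;=\; \frac{1}{n\sigma}\sum_{i=1}^{n}\phi\Big(\frac{y_i - f({{\bf x}}_i)}{\sigma}\Big), $$
where $ \phi $ is a kernel density (e.g., Gaussian) and $ \sigma > 0 $ a bandwidth. Maximizing $ \widehat{\mathscr{R}}_{\sigma}(f) $ pushes $ f({{\bf x}}) $ toward the conditional mode of $ y $ given $ {{\bf x}} $, which is what makes the estimator robust to heavy-tailed noise.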

Citation: Tieliang Gong, Chen Xu, Hong Chen. Modal additive models with data-driven structure identification. Mathematical Foundations of Computing, 2020, 3 (3) : 165-183. doi: 10.3934/mfc.2020016
References:
[1] L. Breiman and J. Friedman, Estimating optimal transformations for multiple regression and correlation, Journal of the American Statistical Association, 80 (1985), 580-598. doi: 10.1080/01621459.1985.10478157.
[2] C. Pan and M. Zhu, Group additive structure identification for kernel nonparametric regression, Advances in Neural Information Processing Systems, 2017.
[3] H. Chen, X. Wang, C. Deng and H. Huang, Group sparse additive machine, Advances in Neural Information Processing Systems, 2017.
[4] H. Chen and Y. L. Wang, Kernel-based sparse regression with the correntropy-induced loss, Applied and Computational Harmonic Analysis, 44 (2018), 144-164. doi: 10.1016/j.acha.2016.04.004.
[5] Y.-C. Chen, C. R. Genovese, R. Tibshirani and L. Wasserman, Nonparametric modal regression, Annals of Statistics, 44 (2016), 489-514. doi: 10.1214/15-AOS1373.
[6] G. Collomb, W. Härdle and S. Hassani, A note on prediction via estimation of the conditional mode function, Journal of Statistical Planning and Inference, 15 (1986), 227-236. doi: 10.1016/0378-3758(86)90099-6.
[7] F. Cucker and S. Smale, Best choices for regularization parameters in learning theory: On the bias-variance problem, Foundations of Computational Mathematics, 2 (2002), 413-428. doi: 10.1007/s102080010030.
[8] F. Cucker and S. Smale, On the mathematical foundations of learning, Bulletin of the American Mathematical Society, 39 (2002), 1-49. doi: 10.1090/S0273-0979-01-00923-5.
[9] J. Q. Fan, Y. Feng and R. Song, Nonparametric independence screening in sparse ultra-high-dimensional additive models, Journal of the American Statistical Association, 106 (2011), 544-557. doi: 10.1198/jasa.2011.tm09779.
[10] J. Q. Fan and R. Z. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96 (2001), 1348-1360. doi: 10.1198/016214501753382273.
[11] J. Q. Fan and J. C. Lv, Sure independence screening for ultrahigh dimensional feature space, Journal of the Royal Statistical Society, Series B, 70 (2008), 849-911. doi: 10.1111/j.1467-9868.2008.00674.x.
[12] Y. Feng, J. Fan and J. A. K. Suykens, A statistical learning approach to modal regression, Journal of Machine Learning Research, 21 (2020), 1-35.
[13] D. Geman and C. Yang, Nonlinear image recovery with half-quadratic regularization, IEEE Transactions on Image Processing, 4 (1995), 932-946. doi: 10.1109/83.392335.
[14] T. L. Gong, Z. B. Xu and H. Chen, Generalization analysis of Fredholm kernel regularized classifiers, Neural Computation, 29 (2017), 1879-1901. doi: 10.1162/NECO_a_00967.
[15] C. Gu, Smoothing Spline ANOVA Models, 2nd edition, Springer Series in Statistics, 297, Springer, New York, 2013. doi: 10.1007/978-1-4614-5369-7.
[16] X. He, J. Wang and S. Lv, Scalable kernel-based variable selection with sparsistency, preprint, arXiv:1802.09246.
[17] J. Huang, J. Horowitz and F. R. Wei, Variable selection in nonparametric additive models, Annals of Statistics, 38 (2010), 2282-2313. doi: 10.1214/09-AOS781.
[18] J. Huang and L. J. Yang, Identification of non-linear additive autoregressive models, Journal of the Royal Statistical Society, Series B, 66 (2004), 463-477. doi: 10.1111/j.1369-7412.2004.05500.x.
[19] J. Huang, S. G. Ma and C.-H. Zhang, Adaptive lasso for sparse high-dimensional regression models, Statistica Sinica, 18 (2008), 1603-1618.
[20] K. Kandasamy and Y. Yu, Additive approximations in high-dimensional nonparametric regression via the SALSA, International Conference on Machine Learning, 2016.
[21] T. Kühn, Covering numbers of Gaussian reproducing kernel Hilbert spaces, Journal of Complexity, 27 (2011), 489-499. doi: 10.1016/j.jco.2011.01.005.
[22] F. Kuo, I. Sloan, G. Wasilkowski and H. Woźniakowski, On decompositions of multivariate functions, Mathematics of Computation, 79 (2010), 953-966. doi: 10.1090/S0025-5718-09-02319-9.
[23] Y. Lin and H. Zhang, Component selection and smoothing in multivariate nonparametric regression, Annals of Statistics, 34 (2006), 2272-2297. doi: 10.1214/009053606000000722.
[24] T. Sager and R. Thisted, Maximum likelihood estimation of isotonic modal regression, Annals of Statistics, 10 (1982), 690-707. doi: 10.1214/aos/1176345865.
[25] X. T. Shen, W. Pan and Y. Z. Zhu, Likelihood-based selection and sharp parameter estimation, Journal of the American Statistical Association, 107 (2012), 223-232. doi: 10.1080/01621459.2011.645783.
[26] L. Shi, Y.-L. Feng and D.-X. Zhou, Concentration estimates for learning with $\ell^{1}$-regularizer and data dependent hypothesis space, Applied and Computational Harmonic Analysis, 31 (2011), 286-302. doi: 10.1016/j.acha.2011.01.001.
[27] T. Shively, R. Kohn and S. Wood, Variable selection and function estimation in additive nonparametric regression using a data-based prior, Journal of the American Statistical Association, 94 (1999), 777-794. doi: 10.1080/01621459.1999.10474180.
[28] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B, 58 (1996), 267-288. doi: 10.1111/j.2517-6161.1996.tb02080.x.
[29] X. Wang, H. Chen, W. Cai, D. Shen and H. Huang, Regularized modal regression with applications in cognitive impairment prediction, Advances in Neural Information Processing Systems, 2017.
[30] Q. Wu, Y. M. Ying and D.-X. Zhou, Multi-kernel regularized classifiers, Journal of Complexity, 23 (2007), 108-134. doi: 10.1016/j.jco.2006.06.007.
[31] Q. Wu and D.-X. Zhou, Learning with sample dependent hypothesis spaces, Computers and Mathematics with Applications, 56 (2008), 2896-2907. doi: 10.1016/j.camwa.2008.09.014.
[32] W. Yao, B. G. Lindsay and R. Li, Local modal regression, Journal of Nonparametric Statistics, 24 (2012), 647-663. doi: 10.1080/10485252.2012.678848.
[33] J. Yin, X. Chen and E. Xing, Group sparse additive models, International Conference on Machine Learning, 2012.
[34] M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society, Series B, 68 (2006), 49-67. doi: 10.1111/j.1467-9868.2005.00532.x.
[35] T. Zhang, Covering number bounds of certain regularized linear function classes, Journal of Machine Learning Research, 2 (2002), 527-550.
[36] D.-X. Zhou, The covering number in learning theory, Journal of Complexity, 18 (2002), 739-767. doi: 10.1006/jcom.2002.0635.
[37] D.-X. Zhou, Capacity of reproducing kernel space in learning theory, IEEE Transactions on Information Theory, 49 (2003), 1743-1752. doi: 10.1109/TIT.2003.813564.
[38] D.-X. Zhou and K. Jetter, Approximation with polynomial kernels and SVM classifiers, Advances in Computational Mathematics, 25 (2006), 323-344. doi: 10.1007/s10444-004-7206-2.
[39] H. Zou, The adaptive lasso and its oracle properties, Journal of the American Statistical Association, 101 (2006), 1418-1429. doi: 10.1198/016214506000000735.


Figure 1.  Estimated transformation function for selected groups. Top-left: group $ (1, 6) $; top-right: group $ (8, 12) $; bottom-left: group $ (3, 7) $; bottom-right: group $ (10, 13) $.
Algorithm 1.  Half-quadratic Optimization for MAM
1: Require: Input data $ ({{\bf x}}_i, y_i)_{i=1}^n $, kernel-induced representing function $ \phi $, activation function $ \psi $, weight parameter $ {{\bf w}} $ and bias term $ {{\bf b}} $.
2: Ensure: $ {{\bf a}}_{{\bf z}} $;
3: Define function $ f $ such that $ f({{\bf x}}^2) = \phi({{\bf x}}) $;
4: Initialize $ \sigma $, coefficient $ {{\bf a}} $;
5: while not converged do
6:    Update $ e_i $ by $ e_i = f^\prime \Big( \big(\frac{y_i - f({{\bf x}}_i)}{\sigma} \big)^2 \Big) $;
7:    Update $ {{\bf a}} $ by $ {{\bf a}} = \arg \max_{{{\bf a}} \in \mathbb{R}^h} \frac{1}{n \sigma}\sum_{i=1}^{n} \Big( e_i \big(\frac{y_i - f({{\bf x}}_i)}{\sigma} \big)^2 - g(e_i) \Big) - \lambda \|{{\bf a}}\|_2^2 $;
8:    Update $ \sigma $;
9: end while
10: Output: $ {{\bf a}}_{{\bf z}} = {{\bf a}} $.
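To make the iteration in Algorithm 1 concrete, below is a minimal NumPy sketch written as the equivalent iteratively reweighted ridge update for a linear-in-parameters expansion $ f({{\bf x}}) = \Phi({{\bf x}}){{\bf a}} $ with a Gaussian kernel $ \phi $. The design matrix `Phi`, the fixed bandwidth `sigma`, and the scaling of the ridge term are illustrative assumptions rather than the paper's exact choices.

```python
import numpy as np

def hq_modal_fit(Phi, y, sigma=1.0, lam=1e-3, max_iter=100, tol=1e-6):
    """Half-quadratic sketch of Algorithm 1 (assumed linear parameterization f = Phi @ a)."""
    n, h = Phi.shape
    a = np.zeros(h)
    for _ in range(max_iter):
        r = y - Phi @ a                          # residuals y_i - f(x_i)
        w = np.exp(-0.5 * (r / sigma) ** 2)      # reweighting factor, proportional to |e_i| for a Gaussian kernel
        # Maximizing the HQ surrogate in a amounts to a weighted ridge update;
        # the effective penalty lam absorbs the 1/(n*sigma) factor of the modal risk.
        A = Phi.T @ (w[:, None] * Phi) + lam * np.eye(h)
        a_new = np.linalg.solve(A, Phi.T @ (w * y))
        if np.linalg.norm(a_new - a) < tol:
            return a_new
        a = a_new
    return a
```

With a fitted coefficient vector, predictions are simply `Phi_new @ a`; in Algorithm 1 the bandwidth $ \sigma $ is also updated inside the loop, which the sketch keeps fixed.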
Algorithm 2.  Backward Stepwise Selection for MAM
1: Start with the variable pool $ G = \{(1,2,\cdots, d)\} $;
2: Solve (13) to obtain the maximum value $ \mathscr{R}_{\lambda, G} $;
3: for each variable $ j $ in $ G $ do
4:    $ \hat{G} \longleftarrow $ the structure obtained by splitting $ j $ into its own group or by moving it to an existing group;
5:    Solve (13) to obtain the maximum value $ \mathscr{R}_{\lambda, \hat{G}} $;
6:    if $ \mathscr{R}_{\lambda, \hat{G}} > \mathscr{R}_{\lambda, G} $ then
7:        Preserve $ \hat{G} $ as the new group structure;
8:    end if
9: end for
10: Return $ \hat{G} $.
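The greedy search in Algorithm 2 can likewise be sketched in a few lines. The helper `fit_and_score` below stands in for solving problem (13) and returning $ \mathscr{R}_{\lambda, G} $ for a candidate structure; it is an assumed hook, not a function from the paper.

```python
def backward_stepwise(d, fit_and_score):
    """Greedy group-structure search following Algorithm 2.

    d             : number of predictors
    fit_and_score : callable mapping a group structure (list of tuples of
                    variable indices) to the maximal penalized modal risk
                    R_{lambda, G} of problem (13) -- an assumed hook.
    """
    G = [tuple(range(1, d + 1))]              # start with all variables in one group
    best = fit_and_score(G)
    for j in range(1, d + 1):
        src = next(g for g in G if j in g)    # group currently containing variable j
        rest = [g for g in G if g is not src]
        reduced = tuple(v for v in src if v != j)
        base = rest + ([reduced] if reduced else [])
        # candidate structures: j in its own group, or j moved into another group
        candidates = [base + [(j,)]]
        for k, g in enumerate(base):
            candidates.append(base[:k] + [tuple(sorted(g + (j,)))] + base[k + 1:])
        for G_hat in candidates:
            score = fit_and_score(G_hat)
            if score > best:                  # keep structures that increase the modal risk
                G, best = G_hat, score
    return G
```

A fuller implementation would revisit variables until no move improves $ \mathscr{R}_{\lambda, G} $, but the single pass above mirrors the loop as written.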
Table 1.  Selected models for simulation study and the corresponding intrinsic group structures
ID Model Intrinsic group structure
M1 $ y = x_1 + x_2^2 + \frac{1}{1+ x_3^2} + \sin(\pi x_4) +\log(x_5+5) + \sqrt{|x_6|} + \epsilon $ $ \{(1),(2),(3),(4),(5),(6)\} $
M2 $ y = \frac{\sin(x_1)}{x_1 } + \cos((x_2 +x_3)\cdot \pi ) + \arctan((x_4 + x_5 + x_6)^2)+ \epsilon $ $ \{(1),(2, 3),(4, 5, 6)\} $
M3 $ y = \sin(x_1 + x_2) + 2\log(x_3 + 5) +x_4 + x_5\cdot x_6 + \epsilon $ $ \{(1, 2), (3), (4), (5, 6)\} $
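For readers who want to reproduce the simulation setting, the sketch below draws data from models M1-M3 of Table 1. The covariate distribution and the exact Gaussian/Gamma noise parameters are not specified in this excerpt, so the uniform covariates, unit-variance Gaussian noise, and centered Gamma(2,1) noise used here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gen_data(model, n=500, d=6, noise="gaussian"):
    """Draw (X, y) from model M1, M2 or M3 of Table 1 (covariate/noise laws assumed)."""
    X = rng.uniform(-1.0, 1.0, size=(n, d))             # assumed covariate distribution
    x1, x2, x3, x4, x5, x6 = X[:, :6].T
    if model == "M1":
        f = x1 + x2**2 + 1.0 / (1.0 + x3**2) + np.sin(np.pi * x4) \
            + np.log(x5 + 5.0) + np.sqrt(np.abs(x6))
    elif model == "M2":
        # np.sinc(t) = sin(pi*t)/(pi*t), so np.sinc(x1/pi) equals sin(x1)/x1
        f = np.sinc(x1 / np.pi) + np.cos((x2 + x3) * np.pi) + np.arctan((x4 + x5 + x6) ** 2)
    elif model == "M3":
        f = np.sin(x1 + x2) + 2.0 * np.log(x3 + 5.0) + x4 + x5 * x6
    else:
        raise ValueError("model must be 'M1', 'M2' or 'M3'")
    if noise == "gaussian":
        eps = rng.normal(0.0, 1.0, n)                    # assumed unit-variance noise
    else:
        eps = rng.gamma(2.0, 1.0, n) - 2.0               # assumed centered Gamma(2,1) noise
    return X, f + eps
```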
Table 3.  Average performance of intrinsic group structure identification for each $ (\mu, \beta) $ pair (Gaussian noise)
Parameters M1 M2 M3
$ \mu $ $ \beta $ MF Size TP U O MF Size TP U O MF Size TP U O
$ 1 \rm{e} - 6 $ $ 1 $ 0 2 1 1 0 0 2 0.66 1 0 0 2 1 0 1
$ 1 \rm{e} - 5 $ $ 1 $ 0 2 1 1 0 0 2 0.84 1 0 0 2 1 0 1
$ 1 \rm{e} - 4 $ $ 1 $ 0 2 1 1 0 0 2 0.68 1 0 0 2 0.1 1 0
$ 1 \rm{e} - 3 $ $ 1 $ 0 2 1 1 0 0 2 0.46 0.46 0 0 2 1 1 0
$ 1 \rm{e} - 2 $ $ 1 $ 0 2 1 1 0 0 2 0.62 0.62 0 0 2 1 1 0
$ 1 \rm{e} - 1 $ $ 1 $ 0 2 1 1 0 0 2 0.78 0.78 0 0 2 1 0 0
$ 1 \rm{e} - 6 $ $ 3 $ 0 3 2 1 0 0 2 0.42 0.42 0 0 2 0.66 0.66 0
$ 1 \rm{e} - 5 $ $ 3 $ 0 2.84 1.78 0.94 0 0 2 0.54 0.54 0 0 2 0 1 0
$ 1 \rm{e} - 4 $ $ 3 $ 0 3.36 2.32 1 0 0 2 0.58 0.58 0 0 2.2 1.6 1 0
$ 1 \rm{e} - 3 $ $ 3 $ 0 4.9 3.9 1 0 0 2 0.78 0.78 0 50 4 4 0 0
$ 1 \rm{e} - 2 $ $ 3 $ 50 6 6 0 0 29 3.62 1.9 0 0.22 50 4 4 0 0
$ 1 \rm{e} - 1 $ $ 3 $ 50 6 6 0 0 0 5.38 1.62 0 1 0 6 2 0 1
$ 1 \rm{e} - 6 $ $ 5 $ 0 2.72 1.64 0.92 0 0 2 0.5 0.5 0 0 2.3 0.6 1 0
$ 1 \rm{e} - 5 $ $ 5 $ 0 3.4 1.6 0.8 0 0 2 0.58 0.58 0 0 3 2 1 0
$ 1 \rm{e} - 4 $ $ 5 $ 0 4.82 3.82 1 0 0 2.01 0.38 0.38 0 50 4 4 0 0
$ 1 \rm{e} - 3 $ $ 5 $ 27 5.54 5.08 0.46 0 28 3.44 1.76 0 0 50 4 4 0 0
$ 1 \rm{e} - 2 $ $ 5 $ 50 6 6 0 0 0 5 2 0 1 0 6 2 0 1
$ 1 \rm{e} - 1 $ $ 5 $ 50 6 6 0 0 0 6 1 0 1 0 6 2 0 1
Table 4.  Average performance of intrinsic group structure identification for each $ (\mu, \beta) $ pair (Gamma noise)
Parameters M1 M2 M3
$ \mu $ $ \beta $ MF Size TP U O MF Size TP U O MF Size TP U O
$ 1 \rm{e} - 6 $ $ 1 $ 0 2 1 1 0 0 2 0.6 0.6 0 0 2 1 1 0
$ 1 \rm{e} - 5 $ $ 1 $ 0 2 1 1 0 0 2 0.7 0.7 0 0 2 1 1 0
$ 1 \rm{e} - 4 $ $ 1 $ 0 2 1 1 0 0 2 1 1 0 0 2 1 1 0
$ 1 \rm{e} - 3 $ $ 1 $ 0 2 1 1 0 0 2 0.92 0.92 0 0 2 1 1 0
$ 1 \rm{e} - 2 $ $ 1 $ 0 2 1 1 0 0 2 0.58 0.58 0 0 2 1 1 0
$ 1 \rm{e} - 1 $ $ 1 $ 0 2 1 1 0 0 2 0.76 0.76 0 0 2 1 1 0
$ 1 \rm{e} - 6 $ $ 3 $ 0 2 1 1 0 0 2 0.52 0.52 0 0 2 1 1 0
$ 1 \rm{e} - 5 $ $ 3 $ 0 2 1 1 0 0 2 1 1 0 0 2.42 0.66 1 0
$ 1 \rm{e} - 4 $ $ 3 $ 0 3.8 2.6 1 0 0 2 0.8 0.8 0 0 2 1 1 0
$ 1 \rm{e} - 3 $ $ 3 $ 0 4 3 1 0 5 2.26 0.92 0.62 0 50 4 4 0 0
$ 1 \rm{e} - 2 $ $ 3 $ 42 5.84 5.88 0.16 0 27 3.66 1.82 0 0.2 50 4 4 0 0
$ 1 \rm{e} - 1 $ $ 3 $ 50 6 6 0 0 0 6 1 0 1 0 6 2 0 1
$ 1 \rm{e} - 6 $ $ 5 $ 0 2.56 1.48 1 0 0 2 0.62 0.62 0 0 2 0.92 0.92 0
$ 1 \rm{e} - 5 $ $ 5 $ 0 3.5 2.5 1 0 0 2 0.66 0.66 0 0 3 2 1 0
$ 1 \rm{e} - 4 $ $ 5 $ 7 4.88 3.76 0.86 0 24 3.08 1.8 0 0.08 0 2.2 0.52 1 0
$ 1 \rm{e} - 3 $ $ 5 $ 8 4.94 3.84 0.84 0 27 3.4 1.6 0 0 50 4 4 0 0
$ 1 \rm{e} - 2 $ $ 5 $ 50 6 6 0 0 0 5 2 0 1 0 5.14 2.86 0 1
$ 1 \rm{e} - 1 $ $ 5 $ 50 6 6 0 0 0 6 1 0 1 0 6 2 0 1
Table 2.  Mean absolute error comparisons (Mean$ \pm $std.) for Gaussian and Gamma noise
GASI MAM
Model Gaussian Gamma Gaussian Gamma
M1 $ 186.3.92. \pm 437.8 $ $ 458.8 \pm 988.8 $ $ \mathbf{109.92 \pm 257.2} $ $ \mathbf{272.8 \pm 536.2} $
M2 $ 1.088 \pm 0.025 $ $ 0.774 \pm 0.032 $ $ \mathbf{0.839 \pm 0.023} $ $ \mathbf{0.751 \pm 0.028} $
M3 $ \mathbf{0.857 \pm 0.025} $ $ \mathbf{ 0.873 \pm 0.019} $ $ 0.901 \pm 0.028 $ $ 0.917 \pm 0.021 $