October 2016, 1(4): 299-308. doi: 10.3934/bdia.2016012

Modeling daily guest count prediction

Department of Computer Science and Engineering, York University, 4700 Keele Street, Toronto, Ontario M3J 1P3, Canada

* Corresponding author: Ricky Fok

Published April 2017

We present a novel method for analyzing data with temporal variations. In particular, we address the problem of forecasting daily guest counts for a restaurant chain with more than 60 stores. We study the transaction data collected from each store and perform data preprocessing and feature construction. We then discuss forecasting techniques based on data mining and machine learning, and propose a new modeling algorithm, SW-LAR-LASSO. We compare a multiple regression model, a Poisson regression model, and the proposed SW-LAR-LASSO model for prediction. Experimental results show that combining sliding windows with LAR-LASSO produces the best results, with the highest precision. This approach can also be applied to other areas where temporal variations exist in the data.
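The sliding-window idea can be illustrated with a minimal sketch: refit a sparse linear model on a recent window of days, forecast the days that follow, then slide the window forward. This is not the authors' implementation. The eight-week training window follows the caption of Figure 2; the one-week forecast horizon, the weekday one-hot features, the synthetic guest counts, the penalty value, and the use of mean absolute percentage error are all assumptions made for illustration, and every function name is hypothetical. To stay dependency-free, the lasso fit uses plain coordinate descent rather than the LAR path algorithm; both target the same L1-penalized least-squares objective.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; the closed-form lasso update for one coefficient."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_lasso(X, y, lam, n_iter=100):
    """Minimise (1/2n) * sum((y - b0 - X.b)^2) + lam * sum(|b|) by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]  # residuals of the current fit
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # feature is all zeros in this window
            # Average product of feature j with the residual, with j's own contribution added back.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
        shift = sum(resid) / n  # the unpenalised intercept absorbs the mean residual
        intercept += shift
        resid = [r - shift for r in resid]
    return intercept, beta


def predict(intercept, beta, x):
    """Linear prediction for one feature row."""
    return intercept + sum(b * v for b, v in zip(beta, x))


def sliding_window_forecast(X, y, train_days=56, horizon=7, lam=0.5):
    """Fit on `train_days` days, predict the next `horizon` days, then slide forward."""
    pairs = []  # (actual, predicted)
    start = 0
    while start + train_days + horizon <= len(y):
        split = start + train_days
        b0, beta = fit_lasso(X[start:split], y[start:split], lam)
        for x, actual in zip(X[split:split + horizon], y[split:split + horizon]):
            pairs.append((actual, predict(b0, beta, x)))
        start += horizon
    return pairs


def mean_abs_pct_error(pairs):
    """Mean absolute percentage error over (actual, predicted) pairs."""
    return 100.0 * sum(abs(a - f) / a for a, f in pairs) / len(pairs)


def make_synthetic_days(n_weeks, seed=0):
    """Hypothetical data: weekday one-hot features and noisy guest counts."""
    rng = random.Random(seed)
    weekday_lift = [0, 5, 10, 20, 60, 90, 40]  # made-up weekday effects
    X, y = [], []
    for day in range(7 * n_weeks):
        dow = day % 7
        X.append([1.0 if k == dow else 0.0 for k in range(7)])
        y.append(200.0 + weekday_lift[dow] + 0.1 * day + rng.gauss(0.0, 10.0))
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_days(n_weeks=20)
    pairs = sliding_window_forecast(X, y)
    print(f"{len(pairs)} daily forecasts, MAPE = {mean_abs_pct_error(pairs):.2f}%")
```

Because each fit sees only the most recent window, the coefficients can track slow drifts in the data, while the L1 penalty keeps the model sparse when many candidate features are supplied.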

Citation: Ricky Fok, Agnieszka Lasek, Jiye Li, Aijun An. Modeling daily guest count prediction. Big Data & Information Analytics, 2016, 1(4): 299-308. doi: 10.3934/bdia.2016012
Figure 1.  Examples of boxplots for some of the stores from the chain of restaurants
Figure 2.  Three iterations of the sliding window are shown. Each line interval denotes a week. The shaded boxes denote the sliding windows for the training data over eight weeks and the empty boxes denote the weeks where the guest counts are predicted
Figure 3.  Experimental process for guest count predictions
Table 1.  Results for the chosen stores. An asterisk (*) marks the lowest predictive error among the three algorithms tested

Benchmark store   Multiple regression   Poisson regression   SW-LAR-LASSO   Localization
Store_1           7.88*                 8.28                 8.40           Canada stores
Store_2           15.56                 16.71                15.00*
Store_3           10.20*                10.86                10.25
Store_4           13.15                 14.51                12.86*
Store_5           10.50                 11.44                10.25*
Store_6           16.04                 17.66                14.19*         US stores
Store_7           18.62                 24.37                15.60*
Store_8           16.02                 15.69                12.89*
Store_9           --                    --                   22.57
Store_10          --                    --                   14.68