## A novel quality prediction method based on feature selection considering high dimensional product quality data

1 School of Management, Hefei University of Technology, Hefei 230009, China
2 Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32611, USA
3 Key Laboratory of Process Optimization and Intelligent Decision-making of the Ministry of Education, Hefei 230009, China
4 School of Economics, Hefei University of Technology, Hefei 230009, China
5 Ministry of Education Engineering Research Center for Intelligent Decision-Making & Information System Technologies, Hefei 230009, China

* Corresponding author: Xiaofei Qian, Xinbao Liu

Received December 2019; Revised January 2021; Early access May 2021

Product quality is the lifeline of enterprise survival and development. With the rapid development of information technology, semiconductor manufacturing processes produce a multitude of quality features. As the number of quality features grows, quality prediction methods face increasingly strict requirements on training time and classification accuracy. To predict quality in the semiconductor manufacturing process, this paper proposes a modified support vector machine (SVM) model based on feature selection that accounts for the high-dimensional and nonlinear characteristics of the data. The model first improves the radial basis function (RBF) kernel in the SVM, and then combines the Duelist algorithm (DA) with the variable neighborhood search (VNS) algorithm for feature selection and parameter optimization. Compared with other SVM models based on DA, the genetic algorithm (GA), and the information gain (IG) algorithm, the experimental results show that our DA-VNS-SVM achieves a higher classification accuracy with a smaller feature subset. In addition, we compare DA-VNS-SVM with common machine learning algorithms: logistic regression, naive Bayes, decision tree, random forest, and artificial neural network. The results indicate that our model outperforms these algorithms for quality prediction in semiconductor manufacturing.

Citation: Junying Hu, Xiaofei Qian, Jun Pei, Changchun Tan, Panos M. Pardalos, Xinbao Liu. A novel quality prediction method based on feature selection considering high dimensional product quality data. Journal of Industrial & Management Optimization, doi: 10.3934/jimo.2021099
Stages of semiconductor manufacturing
Flowchart of proposed method
Flowchart of DA-VNS algorithm
Encoding of DA-VNS algorithm
Running results after 500 iterations of GA, DA and DA-VNS algorithms respectively (The blue points indicate the projection of the solutions on $(q(\theta),R(\theta))$.)
The evolution of the best $q(\theta)$ and $R(\theta)$ for GA, DA and DA-VNS respectively over 500 iterations
Performance improvement between DA-VNS-SVM and other algorithms
Quality prediction problems in semiconductor manufacturing processes in recent years

| Publications | Problems | Methods | Data driven |
|---|---|---|---|
| Fridgeirsdottir [73] | Fault diagnosis | Data mining | √ |
| Kim [37] | Prediction of plasma etch processes | PNN | √ |
| Bae [9] | Modeling and rule extraction of the ingot fabrication | DPNN | √ |
| Su [59] | Quality prognostics for plasma sputtering | NN | √ |
| Chou [16] | Prediction of dynamic wafer quality | SVM | √ |
| Purwins [55] | Prediction of silicon nitride layer thickness | Collinearity regression | √ |
| Melhem [49] | Prediction of batch scrap | Regularized regression | √ |
| Alagic [5] | Prediction of the damage intensity | Image processing and statistical modeling | √ |
| Al-Kharaz [4] | Prediction of quality state | ANN | √ |
| Kim [36] | Prediction of wafer errors | Ordinary least squares regression and ridge regression | √ |

Common kernel functions

| Kernel function name | Kernel function representation |
|---|---|
| Radial basis function | $\kappa(x_i,x_j)=\exp(-\gamma \lVert x_i-x_j\rVert)$ |
| Linear kernel function | $\kappa(x_i,x_j)=x_i\cdot x_j$ |
| Polynomial kernel function | $\kappa(x_i,x_j)=(x_i\cdot x_j+1)^d$ |
| Sigmoid kernel function | $\kappa(x_i,x_j)=\tanh[\eta (x_i\cdot x_j)+\theta]$ |

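The four kernels listed above can be written down directly. The following is a minimal sketch in plain Python, not the authors' implementation. The function names are hypothetical. The RBF follows the table as written (unsquared norm), with an optional `squared` flag for the squared-norm Gaussian form that many SVM libraries use. The sigmoid kernel is assumed to have its usual form $\tanh(\eta\, x_i\cdot x_j+\theta)$.

```python
import math


def dot(x, y):
    """Inner product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma, squared=False):
    """exp(-gamma * ||x - y||), as written in the table.

    With squared=True the norm is squared (Gaussian form); that variant
    is an assumption added here, not taken from the table.
    """
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    dist = dist_sq if squared else math.sqrt(dist_sq)
    return math.exp(-gamma * dist)


def linear_kernel(x, y):
    """x . y"""
    return dot(x, y)


def polynomial_kernel(x, y, d):
    """(x . y + 1)^d"""
    return (dot(x, y) + 1) ** d


def sigmoid_kernel(x, y, eta, theta):
    """tanh(eta * (x . y) + theta); the eta scaling is assumed."""
    return math.tanh(eta * dot(x, y) + theta)
```

For any kernel of the RBF type, identical inputs give a kernel value of 1 and the value decays toward 0 as the points move apart, which is why the kernel width $\gamma$ is one of the parameters tuned later.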
List of preset parameters in DA-VNS

| Parameters | DA-VNS$_{RBF}$ | DA-VNS$_{\kappa_1}$ | DA-VNS$_{\kappa_2}$ |
|---|---|---|---|
| Population size | 100 | 100 | 100 |
| Iteration times | 500 | 500 | 500 |
| Nearest neighbor number | / | 5 | 5 |
| Learning probability | 0.8 | 0.8 | 0.8 |
| Innovation probability | 0.1 | 0.1 | 0.1 |
| Mutation probability | 0.1 | 0.1 | 0.1 |
| Search range of penalty parameter $C$ | $[10^{-3},10^{3}]$ | $[10^{-3},10^{3}]$ | $[10^{-3},10^{3}]$ |
| Search range of kernel width $\gamma$ | $[2^{-6},2^{6}]$ | $[2^{-6},2^{6}]$ | $[2^{-6},2^{6}]$ |
| Search range of amplitude regulating parameter $t_1$ | / | / | $[-10,10]$ |
| Search range of displacement regulating parameter $t_2$ | / | / | $[-10,10]$ |
| Luck coefficient | {0, 0.01, 0.1, 0.2, 0.5} | {0, 0.01, 0.1, 0.2, 0.5} | {0, 0.01, 0.1, 0.2, 0.5} |
| $w_c$, $w_f$, $c_{f1}$, $c_{f2}$ | 0.8, 0.2, 0.8, 0.2 | 0.8, 0.2, 0.8, 0.2 | 0.8, 0.2, 0.8, 0.2 |

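To make the preset parameters concrete, here is a minimal sketch of how the DA-VNS$_{\kappa_2}$ settings might be held in code and how one random candidate could be drawn inside the listed search ranges. Everything structural here is an assumption for illustration: the `DAVNSConfig` class, the `sample_candidate` helper, the binary feature mask, and the log-uniform sampling of $C$ and $\gamma$ are hypothetical and are not taken from the paper's encoding figure. Only the numeric values come from the table.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DAVNSConfig:
    """Preset values from the parameter table (DA-VNS with kernel kappa_2)."""
    population_size: int = 100
    iterations: int = 500
    nearest_neighbors: int = 5
    learning_prob: float = 0.8
    innovation_prob: float = 0.1
    mutation_prob: float = 0.1
    c_range: tuple = (1e-3, 1e3)            # penalty parameter C
    gamma_range: tuple = (2 ** -6, 2 ** 6)  # kernel width gamma
    t_range: tuple = (-10.0, 10.0)          # t1 and t2 (kappa_2 only)


def log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)].

    Log-uniform sampling is an assumption; the table only gives the range.
    """
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    return min(max(value, low), high)  # guard against rounding at the ends


def sample_candidate(cfg, n_features, rng):
    """Hypothetical candidate: a 0/1 feature mask plus kernel parameters."""
    return {
        "mask": [rng.randint(0, 1) for _ in range(n_features)],
        "C": log_uniform(rng, *cfg.c_range),
        "gamma": log_uniform(rng, *cfg.gamma_range),
        "t1": rng.uniform(*cfg.t_range),
        "t2": rng.uniform(*cfg.t_range),
    }


if __name__ == "__main__":
    cfg = DAVNSConfig()
    # n_features=10 is an arbitrary illustrative value.
    print(sample_candidate(cfg, n_features=10, rng=random.Random(0)))
```

Sampling $C$ and $\gamma$ on a log scale is a common choice when a range spans several orders of magnitude, as $[10^{-3},10^{3}]$ does, but a uniform draw would satisfy the table equally well.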
Comparison of performance between DA-VNS and other algorithms

| Algorithm | Optimal $C$ | Optimal $\gamma$ | Optimal $t_1$ | Optimal $t_2$ | $q(\theta)$ | $R(\theta)$ | Selected features |
|---|---|---|---|---|---|---|---|
| GA$_{RBF}$ | 78.54 | 3.36 | / | / | 0.6136 | 0.725 | 60 |
| GA$_{\kappa_1}$ | 73.11 | 0.80 | / | / | 0.6217 | 0.7333 | 47 |
| GA$_{\kappa_2}$ | 11.21 | 0.94 | 1.53 | 4.41 | 0.6262 | 0.7416 | 56 |
| DA$_{RBF}$ | 96.10 | 0.52 | / | / | 0.6329 | 0.75 | 64 |
| DA$_{\kappa_1}$ | 62.34 | 0.49 | / | / | 0.6559 | 0.775 | 43 |
| DA$_{\kappa_2}$ | 42.04 | 0.49 | 9.81 | 9.96 | 0.6695 | 0.7917 | 41 |
| IG | / | / | / | / | 0.6957 | 0.8105 | 24 |
| DA-VNS$_{RBF}$ | 21.32 | 1.31 | / | / | 0.6882 | 0.8033 | 48 |
| DA-VNS$_{\kappa_1}$ | 60.82 | 1.60 | / | / | 0.7038 | 0.8333 | 36 |
| DA-VNS$_{\kappa_2}$ | 96 | 0.91 | -2.69 | 8.77 | 0.7221 | 0.8583 | 49 |

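Performance improvement of $q(\theta)$ between DA-VNS and other algorithms, in percent. $\text{Improvement} = \frac{q(\theta)-q^{\prime}(\theta)}{q(\theta)}\times 100\%$, where $q(\theta)$ is the value of the DA-VNS variant in the row and $q^{\prime}(\theta)$ is the value of the algorithm in the column.

| Improvement | GA$_{RBF}$ | GA$_{\kappa_1}$ | GA$_{\kappa_2}$ | DA$_{RBF}$ | DA$_{\kappa_1}$ | DA$_{\kappa_2}$ | IG | DA-VNS$_{RBF}$ | DA-VNS$_{\kappa_1}$ |
|---|---|---|---|---|---|---|---|---|---|
| DA-VNS$_{\kappa_1}$ | 12.82 | 11.66 | 11.02 | 10.07 | 6.81 | 4.87 | 1.15 | 2.22 | / |
| DA-VNS$_{\kappa_2}$ | 15.03 | 13.90 | 13.28 | 12.35 | 9.17 | 7.28 | 3.66 | 4.69 | 2.53 |
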
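Performance improvement of $R(\theta)$ between DA-VNS and other algorithms, in percent. $\text{Improvement} = \frac{R(\theta)-R^{\prime}(\theta)}{R(\theta)}\times 100\%$, where $R(\theta)$ is the value of the DA-VNS variant in the row and $R^{\prime}(\theta)$ is the value of the algorithm in the column.

| Improvement | GA$_{RBF}$ | GA$_{\kappa_1}$ | GA$_{\kappa_2}$ | DA$_{RBF}$ | DA$_{\kappa_1}$ | DA$_{\kappa_2}$ | IG | DA-VNS$_{RBF}$ | DA-VNS$_{\kappa_1}$ |
|---|---|---|---|---|---|---|---|---|---|
| DA-VNS$_{\kappa_1}$ | 13.00 | 12.00 | 11.00 | 10.00 | 7.00 | 4.99 | 2.74 | 3.60 | / |
| DA-VNS$_{\kappa_2}$ | 15.53 | 14.56 | 13.60 | 12.62 | 9.70 | 7.76 | 5.57 | 6.41 | 2.91 |
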
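The entries of the two improvement tables follow from the stated formula and the $q(\theta)$ and $R(\theta)$ values in the comparison table. The short sketch below recomputes a few of them. The `improvement` helper and the dictionary names are hypothetical, and only a subset of algorithms is included for brevity.

```python
def improvement(best, other):
    """Relative improvement in percent: (best - other) / best * 100."""
    return (best - other) / best * 100.0


# q(theta) and R(theta) values copied from the comparison table.
q = {"GA_RBF": 0.6136, "IG": 0.6957, "DA-VNS_k1": 0.7038, "DA-VNS_k2": 0.7221}
R = {"GA_RBF": 0.725, "IG": 0.8105, "DA-VNS_k1": 0.8333, "DA-VNS_k2": 0.8583}

if __name__ == "__main__":
    for ours in ("DA-VNS_k1", "DA-VNS_k2"):
        for baseline in ("GA_RBF", "IG"):
            dq = improvement(q[ours], q[baseline])
            dr = improvement(R[ours], R[baseline])
            print(f"{ours} vs {baseline}: q {dq:.2f}%, R {dr:.2f}%")
```

Because the denominator is the DA-VNS value rather than the baseline value, each entry reads as the share of the DA-VNS score that the baseline fails to reach.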
Comparison of performance between DA-VNS-SVM and common machine learning algorithms

| Algorithm | Accuracy |
|---|---|
| Logistic Regression (LR) | 0.4917 |
| Naive Bayes (NB) | 0.6167 |
| Artificial Neural Network (ANN) | 0.6417 |
| Decision Tree (DT) | 0.658 |
| Random Forest (RF) | 0.667 |
| DA-VNS$_{\kappa_1}$-SVM | 0.7038 |
| DA-VNS$_{\kappa_2}$-SVM | 0.7221 |
