Intelligent recognition algorithm for social network sensitive information based on classification technology

Abstract. In social networks, the accuracy of sensitive information recognition is low. To effectively improve the accuracy of intelligent identification of sensitive information, this paper proposes an intelligent recognition algorithm for sensitive information based on an improved fuzzy support vector machine. First, the information is collected: the optimal movement trajectory of the information node is found within a low-energy cache area, and the mobility of the information nodes is exploited to improve acquisition performance within a limited time. Features are then added to, or eliminated from, the feature subset according to the DFS criterion, and a multi-label feature selection algorithm is applied to the collected information so that the information gain between a feature and the label set measures the feature's importance. An improved support vector machine classification algorithm classifies the selected information: effective candidate support vectors are preselected to reduce the number of training samples and improve the training speed, and a new membership function is defined to strengthen the effect of support vectors on the construction of the fuzzy support vector machine. Finally, the nearest-neighbor sample density is applied in the design of the membership function to reduce noise, achieving intelligent recognition of sensitive information in the social network. Experimental results show that the proposed algorithm effectively improves the accuracy of sensitive information recognition.

A vast amount of information is generated every day. To obtain and use it conveniently and effectively, it needs to be classified [7,12], and artificial intelligence technology has developed to serve this purpose. Target recognition is widely applied in the field of computer vision and plays an important role in industrial, aerospace, and military applications [20]. Most human perception of the world comes from vision, and the description and recognition of object features is the key to cognition. Current computer visual perception still falls short of human needs. Simulating how the human brain learns a target is therefore a major challenge in the field of recognition, as well as an active research topic [4,21].
In the social network, the target recognition of sensitive information is also called visual pattern recognition. The purpose is to process the information using the theory and methods of information processing and pattern recognition, in order to determine whether it contains sensitive information, extract useful information, determine the location of the target, and realize the description, analysis, judgment, and recognition of sensitive information. Pattern recognition is the discipline of classifying and describing information or physical processes. However, research results show that classification algorithms built on different theories yield different results on the same data. In practice, researchers must experiment continuously to achieve the best effect [3,9].
At present, a target recognition algorithm for convolutional neural networks based on unsupervised pre-training and multi-scale partitioning has been proposed. A sparse autoencoder is trained on unlabeled images to obtain a filter set that matches the characteristics of the dataset with a good initial value. The features are finally used for information classification, and feeding the features into a classifier achieves information target recognition. However, the classification performance of this method is poor.
2. Intelligent recognition algorithm for social network sensitive information based on classification technology.
2.1. Social network sensitive information collection. In the social network, when data is collected in a cache area at the distance l = √(2(R² − 2r²))/2 from the network center, the total energy consumption of the whole network is low [11].
Randomly select a small region on a circular ring centered at O, given by ρ dρ dθ in polar coordinates. The number of nodes in this region is nρ dρ dθ/(πR²). When ρ < l − r, each information node needs at least (l − ρ)/r − 1 hops to reach the cache area; when l + r < ρ ≤ R, at least (ρ − l)/r − 1 hops are needed. Let e be the energy consumed to transmit or receive a unit of data, and q the amount of data collected by each node. The total energy consumption of the information nodes in the cache area is related to the amount of data they hold [14,16]. Assume the data of all information nodes are transmitted to the sensitive information node, and let p_t and p_r be the total energy consumed to transmit and to receive data, respectively; the total energy consumption in the cache area follows from these quantities. Outside the cache area, the total energy consumption of the information nodes along the shortest data-transmission paths can be approximated by a function f(l). When f(l) = 6l² + 6r² − 3R² = 0, i.e. l = √(2(R² − 2r²))/2, f(l) attains its minimum; that is, the total energy consumption of the network is lowest when the information node moves in a cache area at distance l from the network center.
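As a quick numeric check, the closed-form cache distance above can be evaluated directly (a minimal sketch; the radii below are illustrative values, not from the paper):

```python
import math

def optimal_cache_distance(R, r):
    """Distance from the network center at which the information node's
    cache ring minimizes total network energy: l = sqrt(2(R^2 - 2r^2))/2.
    Requires R^2 > 2r^2 so the square root is real."""
    if R**2 <= 2 * r**2:
        raise ValueError("network radius too small relative to hop radius")
    return math.sqrt(2 * (R**2 - 2 * r**2)) / 2

# Illustrative values: network radius R = 100, hop radius r = 10.
l = optimal_cache_distance(100.0, 10.0)  # -> 70.0
```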
Constrained by the mobility of information nodes, traversing all nodes in the cache area sequentially takes a long time and a long moving distance, and the more frequently the information node moves, the worse the network stability. To further shorten the moving distance, different access probabilities are set for the nodes in the cache area so as to reduce the number of nodes the information node visits directly [5,8]. When the information node accesses one of them, all of its neighbor nodes can communicate with the information node directly, so the information node need not move to the actual locations of the neighbor nodes to collect their data. To ensure that all nodes have the opportunity to communicate directly with the information node while it moves in the cache area, the width of the cache area is set to r.
When the width of the cache area is r and the moving step of the social network information node is less than √3 r, all nodes in the network are guaranteed the opportunity to communicate directly with the sensitive information node.
As shown in Fig. 1, the current location of the sensitive information node is O. The communication range of the next node to be visited must cover the two points A and B to ensure that the sensitive information node can communicate directly with all nodes in the network. Taking A and B as centers and r as radius, the two circles intersect at point C. Only when OC is less than √3 r are both A and B guaranteed to lie within the communication range, as shown by the shaded part in Fig. 1.
Let SS = {SS(1), SS(2), ..., SS(n_s)} be the set of all nodes and NS(i) the number of neighbor nodes of SS(i). Assume the first node visited by the sensitive information node is SS(i). After collecting the data of this node, the node with the maximum transfer probability is selected as the next access object [18,23].

Figure 1. Information node access probability

The transfer probability between nodes depends on α and β, constants between 0 and 1; on d(i, j), the distance between SS(i) and SS(j); and on NS(j)/n_s, the ratio of the number of neighbor nodes of SS(j) to the total number of nodes in the cache area. A node with more neighbor nodes is more likely to be accessed by the sensitive information node. When a node has been accessed by the sensitive information node, the access probability of all its neighbor nodes is set to 0. The bound on d(i, j) ensures that every node has the opportunity to communicate directly with the sensitive information node, preventing data loss, and constraining d(i, j) by √3 r makes the nodes in the shaded area far from the current location of the sensitive information node more likely to be the next access object. The moving step of the sensitive information node is thereby increased, reducing the number of times it moves [22]. These transfer probabilities form the probability matrix of the sensitive information node moving between nodes. The matrix VS(i) denotes the optimal access point set of the sensitive information node starting from SS(i); pID denotes the ID of the node currently accessed by the sensitive information node, nID the ID of the next node to be accessed, and D(i) the distance moved by the sensitive information node in a sampling cycle of the tour starting from SS(i).
To minimize the moving distance of the sensitive information node, the path with the smallest distance among the n_s candidate paths is selected as the optimal movement strategy [6,13]. The optimal access node set of the sensitive information node is then VS(arg min_i D(i)).
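Since the exact transfer probability formula is not reproduced above, the greedy next-node choice can only be sketched. The weighting below, which favors long feasible steps (bounded by √3 r) and nodes with many neighbors, is an illustrative assumption, as are the node encoding and the weights alpha and beta:

```python
import math

def next_node(current, candidates, r, alpha=0.5, beta=0.5):
    """Greedy choice of the next node to visit. Only nodes within the
    sqrt(3)*r step bound are feasible; among them, longer steps and
    higher neighbor ratios NS(j)/n_s score higher, so the information
    node moves fewer times. The scoring is an illustrative stand-in
    for the paper's transfer probability, not its exact formula."""
    n_s = len(candidates)
    step_limit = math.sqrt(3) * r
    best, best_score = None, -1.0
    for node in candidates:
        d = math.dist(current["pos"], node["pos"])
        if 0 < d <= step_limit:
            score = alpha * (d / step_limit) + beta * node["neighbors"] / n_s
            if score > best_score:
                best, best_score = node, score
    return best  # None if no node is reachable in one step
```

With this scoring, a node that is both far (but still reachable) and well connected is preferred, matching the text's goal of fewer, longer moves.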
Within a given network delay time, the pause time can be adjusted according to the sum of the cached data of each node within the communication range of the sensitive information node, improving data collection performance [17]. Assume the time limit is T and the maximum moving speed of the sensitive information node is v_m; the total moving time is then D(x)/v_m. The total pause time and the pause time at each accessed point follow from these quantities, where a_k is the total number of nodes communicating with the sensitive information node at the kth accessed point and q_kj is the number of member nodes of the jth node at the kth accessed point [1]. If multiple nodes transmit data to the sensitive information node at the same time, collisions occur and information data are lost. To avoid conflicts, the TDMA mechanism is used in this paper: the sensitive information node generates TDMA rules based on the pause time and the number of members of each node, and sends them to the nodes in its communication range. After receiving the rules, a node stays in the sleep state and transmits data only in its assigned time slot. This not only reduces the data loss rate and improves acquisition efficiency, but also reduces the energy consumption of individual nodes and improves energy utilization.
Define the data collection rate of the ith node as the ratio of the amount of data q_i collected from it to its buffered data amount q_buffer, computed by Eq. (7), where 0 < p_i ≤ 1. p_i = 1 means all the buffered data of the node are transmitted to the sensitive information node; the greater p_i, the stronger the data collection ability.
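The collection rate of Eq. (7) is straightforward to express (a minimal sketch; the variable names are ours):

```python
def collection_rate(q_collected, q_buffered):
    """Data collection rate p_i = q_i / q_buffer from Eq. (7).
    Valid range is 0 < p_i <= 1; p_i = 1 means every buffered unit of
    data reached the sensitive information node."""
    if q_buffered <= 0:
        raise ValueError("buffered data amount must be positive")
    p = q_collected / q_buffered
    if not 0 < p <= 1:
        raise ValueError("collected amount must lie in (0, q_buffered]")
    return p

# A node buffering 100 units of which 80 were collected: p_i = 0.8.
```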
The sensitive information collection on the social network is achieved as the above process.

2.2. Social network sensitive information feature selection. Consider a two-class classification problem in the m-dimensional real space R^m. The size of the sensitive information training sample set is n, and the numbers of positive and negative samples are n+ and n−. The discernibility of feature subsets (DFS) of the sensitive information with the first i (i = 1, 2, ..., m) features is given by Eq. (8), where x̄_j, x̄_j^(+), and x̄_j^(−) are the means of the jth feature over the whole dataset, the positive-class dataset, and the negative-class dataset, respectively; x_{k,j}^(+) is the value of the jth feature of the kth positive sample, and x_{k,j}^(−) is the value of the jth feature of the kth negative sample.
In Eq. (8), the numerator is the sum of squared distances between the mean vectors of the feature subset with i features for the positive and negative classes and the mean vector of the subset for the whole sample set; the denominator is the sum of the variances of the sensitive information feature subset with i features for the positive and negative classes. A larger numerator means the classes of the feature subset are farther apart, and a smaller denominator means each class is more tightly clustered [2,15]. A larger DFS value therefore indicates stronger discernibility.
For the l-class (l ≥ 2) classification problem, assume the size of the sensitive information training sample set is n and the dimension of the sample space is m. The training sample set is {(x_k, y_k) | x_k ∈ R^m, m > 0, y_k ∈ {1, ..., l}, l ≥ 2}, where n_j is the number of samples of the jth class, i.e. the samples with y_k = j. The DFS_i of the first i (i = 1, ..., m) features is defined by Eq. (9), where x̄ and x̄^(j) are the mean vectors of the feature subset over the whole dataset and the jth class dataset, respectively, and x_k^(j) is the feature vector of the current i features of the kth sample in the jth class.
When i = 1, DFS_i becomes a criterion for the inter-class discernibility of a single feature, i.e. the improved F-score criterion of Eq. (10), where x̄_i and x̄_i^(j) are the means of the ith feature over the whole dataset and the jth class dataset, respectively, and x_{k,i}^(j) is the value of the ith feature of the kth sample in the jth class. In Eq. (10), the numerator is the sum of squared distances between the per-class centers of the ith feature and the center of the whole sample set, and the denominator is the sum of the within-class variances of the ith feature over the classes [19]. F_i thus represents the ratio of between-class distance to within-class variance for the ith feature; a larger value indicates stronger discernibility.
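The improved F-score of Eq. (10) can be sketched per feature as the between-class squared distances over the summed within-class sample variances (a pure-Python sketch; the sample data in the test is illustrative):

```python
from statistics import mean, variance

def f_score(X, y, feature):
    """Improved F-score (Eq. (10)) of one feature: the sum of squared
    distances from each class mean to the global mean, divided by the
    sum of within-class sample variances. Larger values indicate
    stronger single-feature discernibility."""
    col = [row[feature] for row in X]
    g = mean(col)
    num = den = 0.0
    for cls in set(y):
        vals = [row[feature] for row, label in zip(X, y) if label == cls]
        num += (mean(vals) - g) ** 2
        den += variance(vals)  # sample variance, divisor n_j - 1
    return num / den
```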
Given a sensitive information feature x and a label set L = {l_1, l_2, ..., l_m}, let IG(l_i|x) be the information gain between the feature x and the label l_i. The information gain between the feature x and the label set L is then IGS(L|x) = Σ_{i=1}^{m} IG(l_i|x). Since IG(l_i|x) ≥ 0, IGS(L|x) ≥ 0. From information theory, if the sensitive information feature x and each label in the label set L are independent, the information gain takes its minimum value.
By the nonnegativity of the information gain, if every label is independent of the feature x, then IG(l_i|x) = 0 for each i and IGS(L|x) = 0; since IGS(L|x) ≥ 0, this is the minimum value.
If each label l_i is fully determined by the feature x, then IG(l_i|x) = H(l_i), so IGS(L|x) = Σ_{i=1}^{m} H(l_i), which is the maximum value of the information gain.
From IGS(L|x), the features that have no effect on the label set L can be identified. A reasonable threshold can thus be set to remove the features with little correlation to the label set.
To compute the threshold, the information gains between the different features x and the label set L are transformed. Assuming the sensitive information gains follow a normal distribution, they can be transformed to the standard normal distribution by IGZ(L|x) = (IGS(L|x) − µ)/σ, where µ is the mean of the sensitive information gains and σ is their standard deviation [10]. Given a threshold δ > 0, if the absolute value of the transformed information gain satisfies |IGZ(L|x)| ≥ δ, the feature x is related to the label set; otherwise, they are unrelated.
To give the information gains of features and labels the same measurement range, the information gain is normalized. The information gain is related to the number of labels, so a single fixed threshold is unsuitable across different datasets; a method is needed that automatically calculates the threshold of the sensitive information gain for each application [24]. For example, assume the number of candidate labels is m = 20 and, for two different features x_1 and x_2, IGS(L|x_1) = 0.5 and IGS(L|x_2) = 1.6. The importance of x_2 is greater than that of x_1, so x_2 should be preferentially retained. Based on the standard normal transformation, the threshold is set by Eq. (14) as the mathematical expectation of the absolute values |IGZ(L|x_i)| of the transformed sensitive information gains, which increases the adaptive ability of the algorithm. The selection of the sensitive information features of the social network is achieved as above.
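The standardization and adaptive threshold described above can be sketched as follows (assuming, as the text does, that the raw gains IGS(L|x) have already been computed; the feature names are illustrative):

```python
from statistics import mean, pstdev

def select_features(igs):
    """Adaptive-threshold multi-label feature selection. igs maps a
    feature name to its gain IGS(L|x). Gains are standardized,
    z = (IGS - mu) / sigma (Eq. (13)), the threshold delta is the mean
    absolute standardized gain (Eq. (14)), and features with
    |z| >= delta are kept. A sketch under the paper's normality
    assumption."""
    names = list(igs)
    vals = [igs[n] for n in names]
    mu, sigma = mean(vals), pstdev(vals)
    if sigma == 0:
        return names  # all gains equal: no feature stands out
    z = {n: (igs[n] - mu) / sigma for n in names}
    delta = mean(abs(v) for v in z.values())
    return [n for n in names if abs(z[n]) >= delta]
```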
2.3. Classification and recognition of social network sensitive information. Combined with the feature selection, a two-class problem with positive-class and negative-class training samples is considered in this paper. Because the training speed of the support vector machine depends on the size of the training sample set, and the support vectors distributed on the class boundary play the decisive role in the decision, the number of sensitive information training samples is reduced by preselecting effective candidate support vectors to improve the training speed. A sample is selected as an effective candidate support vector if its distance to the disparate (opposite) class center is less than the distance between the two class centers.
(1) Linearly separable case. Given the sample set of one class of sensitive information, the mean of its sample feature vectors is called the class center m, i.e. m = (1/n) Σ_{k=1}^{n} x_k (Eq. (15)). (2) Nonlinearly separable case. Given two vectors x and y mapped into the feature space H by a nonlinear function φ, the Euclidean distance between them in the feature space is √(K(x, x) − 2K(x, y) + K(y, y)) (Eq. (16)), where K(·,·) is the kernel function. The center vector of the samples in the feature space is m_φ = (1/n) Σ_{k=1}^{n} φ(x_k) (Eq. (17)). According to Eq. (15) or Eq. (17), the centers of the two classes are obtained: the positive class center m+ and the negative class center m−. The distance between the two class centers is D = |m+ − m−| (Eq. (18)). By Eq. (19), the distance from every sample to the disparate class center is calculated, and the samples whose distance is less than D are selected as effective candidate support vectors.

Figure 2. Preselected effective support vector
A sensitive information sample whose distance d to the disparate class center satisfies d < D is retained as an effective candidate support vector, as shown in the arc part of Fig. 2. By preprocessing the sensitive information samples in this way, the number of training samples is reduced and the training speed is improved.
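The preselection rule above can be sketched with the kernel-trick distances (a sketch; a linear kernel is used for clarity, for which the feature map is the identity and the rule reduces to the linearly separable case):

```python
import math

def linear(x, y):
    """Linear kernel K(x, y) = <x, y>; swap in an RBF kernel for the
    nonlinearly separable case."""
    return sum(a * b for a, b in zip(x, y))

def preselect(pos, neg, kernel=linear):
    """Keep samples whose feature-space distance to the *opposite*
    class center is less than the distance D between the two class
    centers, using only kernel evaluations:
    ||phi(x) - m||^2 = K(x,x) - (2/n) sum_k K(x, x_k) + ||m||^2."""
    def center_sq(samples):  # ||m_phi||^2
        n = len(samples)
        return sum(kernel(a, b) for a in samples for b in samples) / n**2
    def dist_to_center(x, samples):
        n = len(samples)
        cross = sum(kernel(x, s) for s in samples) / n
        return math.sqrt(max(kernel(x, x) - 2 * cross + center_sq(samples), 0.0))
    inner = sum(kernel(a, b) for a in pos for b in neg) / (len(pos) * len(neg))
    D = math.sqrt(max(center_sq(pos) + center_sq(neg) - 2 * inner, 0.0))
    keep_pos = [x for x in pos if dist_to_center(x, neg) < D]
    keep_neg = [x for x in neg if dist_to_center(x, pos) < D]
    return keep_pos, keep_neg
```

Samples deep inside their own class are discarded, while samples near the boundary, the likely support vectors, survive.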
To reduce the influence of noise points on the construction of the support vector machine, a fuzzy membership function for sensitive information based on the class center is considered. Such a membership function exploits the fact that noise points and outliers lie far from the class center, so their influence can be reduced by assigning smaller memberships to samples far from the center. However, this design ignores that the support vectors are also far from the class center; suppressing noise points and outliers therefore also suppresses the support vectors, as shown in Fig. 3. A new membership function is therefore designed so that the membership of a sample increases with its distance from the class center, giving the support vectors larger memberships.
Assume that after preselecting support vectors the positive-class sensitive information sample set is X+ and the negative-class sample set is X−. The distance of each positive sample to the positive class center is d_i^+ = |x_i^+ − m+|, and the distance of each negative sample to the negative class center is d_i^− = |x_i^− − m−|. The designed membership function is then given by Eq. (20), where δ is a small positive number that avoids s(x_i) = 0. The membership function is based on the farthest distance from the class center, so the sample farthest from the class center obtains the greatest membership, ensuring the role of the support vectors in constructing the optimal classification surface. However, since noise points also lie in the farthest region, noise points and outliers are enhanced as well.
The membership function is therefore weighted by the nearest-neighbor sample density to distinguish noise points from normal samples. A normal sample x_1 has many samples within its nearest-neighbor radius e_1 and hence a large nearest-neighbor sample density, while a noise point x_2 has few and hence a small density, so weighting the membership function suppresses noise points. The nearest-neighbor sample density function of each sample is calculated to quantify the density of nearby sensitive information samples: for each sample x_i, the samples x_j whose distance to x_i satisfies d_ij ≤ e_1 form the nearest-neighbor sample subset X_i of the ith sample.
Here d_ij is the distance between the samples x_j and x_i, e_1 is the nearest-neighbor radius with min(d_ij) ≤ e_1 < max(d_ij), and numX is the number of samples in the sample set. The sample density is defined by the distances among the nearest neighbors. Assume there are k nearest-neighbor samples in the subset X_i; the nearest-neighbor sample density function is then given by Eq. (22), where a is a small penalty constant, and its normalization by Eq. (23), where z_i is the nearest-neighbor sample density of the ith sample x_i. The more samples in the nearest neighborhood, the larger w_i. Because nearest-neighbor samples of different classes influence the category of a sample differently, the nearest-neighbor sample density is adjusted as follows: if the nearest-neighbor subset contains only same-class samples, the sample is not confused with the disparate class and the density is kept unchanged; if the subset includes disparate-class samples, the sample may be confused with the disparate class and its membership is decreased; if the subset consists entirely of disparate-class samples, the membership is set to 0 to remove its effect on the construction of the support vector machine.
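The density weighting can be sketched as follows (the paper's exact density and normalization formulas, Eqs. (22)-(23), are not reproduced above; counting neighbors within e_1 and normalizing by the maximum is an illustrative stand-in):

```python
import math

def density_weights(samples, e1, a=1e-3):
    """Nearest-neighbor density weights in [0, 1]. For each sample we
    count the neighbors within radius e1, add a small penalty constant
    a (so isolated noise points get a tiny nonzero density), and
    normalize by the maximum. Multiplying the class-center membership
    s(x_i) by w_i suppresses noise points and outliers."""
    def dist(p, q):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))
    z = []
    for i, x in enumerate(samples):
        k = sum(1 for j, other in enumerate(samples) if j != i and dist(x, other) <= e1)
        z.append(k + a)
    zmax = max(z)
    return [v / zmax for v in z]
```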
The final membership function, given by Eq. (24), combines the class-center-based membership function with the nearest-neighbor sample density, which enhances the support vectors and reduces the noise. The membership obtained with Eq. (24) is used to train the fuzzy support vector machine for classification, and the intelligent recognition of social network sensitive information is achieved from the classification result.
3. Experimental results and analysis. To verify the proposed algorithm, simulation experiments are carried out and compared with the current algorithm. In the experiments, 4 artificial data sets and 5 UCI data sets are used for classification. The experimental environment is an Intel i5 CPU at 2.60 GHz with 4 GB RAM, 64-bit Windows 8, and MATLAB 2017. The sensitive information training sample set consists of two classes of randomly generated two-dimensional samples with normal distributions, a positive class and a negative class, with 2% random noise added; a random two-dimensional sample set with 1% added noise is used for testing. The parameters of the two algorithms are set identically (C = 100). The choice of e_1 depends on the sample and is set to 3-5 times min(d_ij). As the data set grows, the classification results of the two algorithms for the case of 200 positive and 200 negative samples are shown in Fig. 4 and Fig. 5, where triangles represent positive samples, circles represent negative samples, and squares represent the samples deleted after preselecting candidate support vectors.
From Fig. 4 and Fig. 5 it can be seen that, as the number of sensitive information training samples increases, the proposed algorithm improves both training time and classification accuracy over the current algorithm. Support vector machine training performs and stores a large number of kernel matrix operations, which slows training, and the size of this matrix depends on the number of training samples. The current method does not reduce the number of samples, so its training time is longer, and its use of class centripetal degree to set memberships slows training further. The proposed algorithm, by contrast, preselects candidate support vectors and deletes some samples, reducing the number of training samples; the experiments show that the time spent setting memberships is less than the training time saved. In addition, the proposed algorithm gives support vectors greater membership to enhance their role and weights the memberships by the nearest-neighbor sample density, which is small for noise points and outliers; this effectively suppresses noise points and outliers and improves the classification accuracy.
Let S be the set of all sensitive information related to information identification in the social network and R the set of recognized sensitive information. Let s be the number of related sensitive information items recognized in one detection, m the number of unrelated items recognized, and n the number of related items not recognized. The recall ratio is then s/(s + n) and the precision ratio is s/(s + m). High recall and precision both indicate good performance, but they are generally contradictory: when the precision ratio is high, the recall ratio is low, and vice versa. The comparison of recall and precision between the proposed algorithm and the original algorithm is shown in Fig. 6. As the amount of test sensitive information increases, the recall ratio trends upward and the precision ratio trends downward. Because the support vector machine in the proposed algorithm eliminates some erroneous samples that would affect the result, both the precision and recall ratios are improved compared with the original algorithm without added location information. The results show that the information recognition rate of the proposed algorithm is high and that it can effectively improve the security of network operation.
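The two ratios can be computed directly from the counts defined above (a minimal sketch; the example counts are illustrative):

```python
def recall_precision(s, m, n):
    """Recall and precision from one detection round:
    s = related items recognized, m = unrelated items wrongly
    recognized, n = related items missed.
    recall = s / (s + n); precision = s / (s + m)."""
    return s / (s + n), s / (s + m)

# 80 related items found, 20 false alarms, 20 related items missed:
recall, precision = recall_precision(80, 20, 20)  # -> (0.8, 0.8)
```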