Relay selection based on social relationship prediction and information leakage reduction for mobile social networks

Despite the extensive study of relay selection in mobile social networks (MSNs), few works have taken both transmission latency (i.e. efficiency) and information leakage probability (i.e. security) into consideration. We therefore aim to design an efficient and secure relay selection algorithm that enables communication among legitimate users while reducing the probability of information leakage to other users. In this paper, we propose a novel mobility model for MSN users that considers both the randomness and the sociality of their movements, based on which the social relationship among users, i.e. the meeting probabilities among the users, is predicted. Taking both efficiency and security into consideration, we design a network formation game based relay selection algorithm by defining the payoff functions of the users, designing the game evolving rules, and proving the stability of the formed network structure. Extensive simulation is conducted to validate the performance of the relay selection algorithm using both a synthetic trace and a real-world trace. The results show that our algorithm outperforms other algorithms by striking a balance between efficiency and security.

1. Introduction. Nowadays, people carry various mobile devices such as smartphones, PADs, and laptops to leverage WiFi, Internet, and 3G/4G cellular networks for pervasive communication services. When such infrastructures are unavailable, these devices can form a self-organizing network and enable communication among users by exploiting the opportunistic encounters between them. Such a network is termed a mobile social network (MSN) [8]. Finding appropriate relays and forming suitable routing paths play critical roles in information dissemination in MSNs.
Extensive effort has been devoted to designing efficient relay selection algorithms and routing protocols for MSNs. Moreover, taking into account the fact that human social behavior heavily impacts the operation of MSNs, researchers have designed a number of relay selection algorithms that consider human social features such as the frequencies with which people visit certain locations and the encounter frequencies among people [24,25]. However, little of the literature has considered the fundamental problem of preventing information leakage to non-destination users. Despite the employment of encryption mechanisms [3,26], reducing the information leakage probability is still critical, as encryption mechanisms are not "perfectly secure". Therefore, we aim to design an efficient and secure relay selection algorithm that enables communication among legitimate users while reducing the probability of information leakage to other users in MSNs.
The main contributions of this paper are as follows. First, both the social relationship among the users and the overhearing probabilities of the users are considered in our design, through which the efficiency and the security of the relay selection algorithm can be ensured. Second, we propose a novel mobility model for MSN users that considers both the randomness and the sociality of their movements, based on which the social relationship among users is predicted. Third, we propose a network formation game based relay selection algorithm. The payoff functions of the information source, the candidate relays, and the information destinations are defined, the game evolving rules are designed, and the stability of the formed network structure is proved. Finally, extensive simulation is conducted to validate the performance of the proposed relay selection algorithm using both a synthetic trace and a real-world trace. The results show that our algorithm outperforms other algorithms by striking a balance between the transmission latency (i.e. efficiency) and the information leakage probability (i.e. security).
In the rest of the paper, we briefly summarize the related work on relay selection in MSNs in Section 2. We describe the problem to be solved in Section 3. In Section 4, we propose our mobility model by considering both the randomness and the sociality of the movements and predict the social relationship among users. An efficient and secure relay selection algorithm is proposed based on the network formation game in Section 5, where we define the payoff functions in Section 5.1, design the game evolving rules in Section 5.2, and prove the stability of the formed network structure in Section 5.3. A simulation study to validate the performance of the relay selection algorithm is reported in Section 6. Finally, we conclude the paper in Section 7.
2. Related work. The MSN routing schemes can be classified into four main categories: flooding based schemes, single-copy based schemes, utility based schemes, and quota based schemes. Flooding based schemes maintain low transmission latency at the expense of high storage cost and communication overhead [19]. On the contrary, single-copy based schemes reduce the storage cost and the communication overhead by allowing only one copy of an information packet to be transmitted in the network [22], but they provide no performance guarantee on the transmission latency. To trade off transmission latency against storage/communication cost, utility based schemes and quota based schemes have been proposed. In utility based schemes, a user forwards the information only if it encounters another user with higher utility [6], whereas in quota based schemes, the number of copies of each information packet is upper bounded by a fixed number [18].
Considering the fact that network operations are heavily impacted by human social behaviors, variants of the aforementioned routing schemes have been proposed that exploit different human social characteristics. [5] takes the rate of encountering other users, or simply the encounter rate, as the main metric for relay selection. A user will forward the information to an encountered user that has a higher encounter rate with the destination. [15] shares a similar idea with [5], but takes the encounter duration as the relay selection metric. [21], [23], and [11] exploit the mobility properties of the users for relay selection. They aim to predict or assess the encounters among the users through mobility analysis, and thus can be considered variants of [5] and [15].
Despite these efforts to solve the relay selection problem in MSNs, none of them considers the fundamental problem of preventing information leakage to non-destination users [24,2]. Although encryption mechanisms [20,13] can be employed to protect the information transmission, reducing the information leakage probability is still critical, as encryption mechanisms are not "perfectly secure". To tackle this challenge, a routing scheme is designed in [4] to reduce information leakage in social networks. However, the social relationship among users is not considered, so the transmission latency (i.e. routing efficiency) cannot be guaranteed. To the best of our knowledge, we are the first to take both efficiency and security into consideration for relay selection in MSNs.
3. Problem description. Consider a mobile social network consisting of a set of users, denoted by $N = \{1, 2, \cdots, n\}$. For each user $i \in N$, a social relationship vector $\Omega_i = \{\omega_{i1}, \omega_{i2}, \cdots, \omega_{in}\}$ is defined to describe its social relationship with others. A higher social relationship $0 \le \omega_{ij} \le 1$ indicates a higher probability that user $i$ meets user $j$. Moreover, an overhearing probability $p_i$ is defined for each user $i$ to indicate the probability that $i$ intercepts information not intended for it. A higher $0 \le p_i \le 1$ indicates a higher probability that $i$ overhears others' information.
We study the relay selection problem in the considered mobile social network. Assume that there is an information source $I_s \in N$ trying to share its information with a set of destinations $V_D \subseteq N \setminus \{I_s\}$. The information source and the destinations are assumed to be static. For the destinations within the source's transmission range, the information source transmits the information directly to them. For those destinations outside its transmission range, the information source exploits the movements of the MSN users for information dissemination. This assumption is reasonable in many practical scenarios. For example, a road side unit (RSU) in a vehicular network may want to share traffic or road information with other RSUs. If the destination RSU is within the source's transmission range, direct transmission can be conducted. Otherwise, the source RSU must employ moving vehicles to conduct relay transmission. In our work, we aim to find a relay path $L_i$ for each destination $i$ outside the transmission range of the information source. The relay path can be denoted by $L_i = \{(i_l, i_{l+1}) \mid l \in \{1, 2, \cdots, l_n^i - 1\},\ i_1 = I_s,\ i_{l_n^i} = i\}$, with $(i_l, i_{l+1})$ being a relay link from user $i_l$ to user $i_{l+1}$ and $l_n^i$ being the number of users on path $L_i$. Our objective is to find a relay path $L_i$ with small information leakage probability and low transmission latency.
4. Prediction of the social relationship $\omega_{ij}$ in MSNs. In order to find the relay path with low transmission latency, we need to know the meeting probabilities among the users, i.e. the social relationship $\omega_{ij}$ among users. However, this information is hard to obtain due to the huge number of users in MSNs. To tackle this challenge, we propose a social relationship prediction method based on analysing the mobility model of the MSN users.
4.1. Mobility model of MSN users. As mentioned in Section 1, the users in MSNs are the smartphones, PADs, and laptops carried by human beings. As a result, the mobility pattern of the MSN users corresponds to that of their holders. Empirical observation reveals that human mobility exhibits both randomness and sociality. Randomness means that people can change their moving speed and direction arbitrarily, while sociality signifies that there exist certain regularities in their movements. In this subsection, we consider both the randomness and the sociality, and focus on modeling the moving direction and the moving speed of the MSN users. We assume that a user's movement can be divided into a sequence of time intervals of fixed length, and that users do not change their moving direction or speed within one interval [10]. Based on the basic idea of the virtual grid proposed in [1], we partition the whole area $A$ into multiple square zones $A = \{A_1, A_2, \cdots\}$, with each square $A_\tau \in A$ being identified by the coordinates of its center $(x_\tau, y_\tau)$. Each zone $A_\tau = (x_\tau, y_\tau)$ is characterized by its function, such as home, school, working place, gymnasium, shopping mall, restaurant, and so on. For different users, the probability of moving towards each zone is different. For example, the probability of user $i$ moving towards $A_\tau$ will be high if its home or working place is within $A_\tau$. Let $\zeta_{i\tau}$ be the probability of user $i$ moving towards $A_\tau$. The value of $\zeta_{i\tau}$ can be estimated by analysing user $i$'s hobbies and daily life traces. In this way, we obtain the probability function of choosing different moving directions for the users.
In normal life, human beings usually move at an average speed, while the probability of moving at an extremely low or high speed is quite small. For example, the average speed for walking is about 5 km/hour and the average speed for driving cars is about 100 km/hour on highways. According to this observation, it is reasonable to assume that the moving speed of the users follows a normal distribution.
Let $v_i$ denote the moving speed of user $i$; then $v_i \sim N(\mu, \sigma)$. Here $N(\mu, \sigma)$ is a normal distribution with mean $\mu$ and standard deviation $\sigma$. In our model, we define the average moving speed of the users as $\mu$. According to the $3\sigma$ rule of the normal distribution, $(\mu - 3\sigma, \mu + 3\sigma)$ can be considered as the value space for $v_i$. Since the minimum speed of a user is 0, we let $\mu - 3\sigma = 0$ and get $\sigma = \frac{\mu}{3}$. Therefore, we model the moving speed of user $i$ as $v_i \sim N(\mu, \frac{\mu}{3})$.
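As a rough illustration of the mobility model in this subsection, the following Python sketch draws a moving direction and a speed for one user and advances its position by one interval. The function names, the `interval_len` parameter, and the clamping of negative speed draws to zero are assumptions made for illustration; they are not part of the model as stated. The sketch also treats the chosen zone center purely as a heading, so the user may travel past it within the interval.

```python
import math
import random


def sample_speed(mu, rng):
    """Draw a speed from N(mu, mu/3); negative draws are clamped to 0 (assumption)."""
    return max(0.0, rng.gauss(mu, mu / 3.0))


def sample_direction(zone_centers, zeta_i, rng):
    """Pick the zone center the user heads towards, using the probabilities zeta_i."""
    return rng.choices(zone_centers, weights=zeta_i, k=1)[0]


def move_one_interval(pos, target, speed, interval_len):
    """Move from pos along the heading towards target at constant speed for one interval."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return pos  # already at the zone center: no heading, so stay put
    step = speed * interval_len
    return (pos[0] + dx / dist * step, pos[1] + dy / dist * step)


if __name__ == "__main__":
    rng = random.Random(0)
    centers = [(5.0, 5.0), (15.0, 5.0), (25.0, 5.0)]
    zeta = [0.7, 0.2, 0.1]  # hypothetical direction probabilities for one user
    heading = sample_direction(centers, zeta, rng)
    speed = sample_speed(1.4, rng)
    print(move_one_interval((0.0, 0.0), heading, speed, 30.0))
```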

4.2. Prediction of $\omega_{ij}$. Due to the fading property of wireless signals, each user has a limited communication range. As shown in Fig. 1, the communication range of user $j$ is indicated by a dashed circle centered on $j$. Assuming that the radius of the communication range is the same for every user, we draw the following conclusion.
Only if user $i$ moves into the communication range of user $j$ can these two users communicate with each other. Such a case is defined as a meeting between user $i$ and user $j$. As mentioned in Section 3, the social relationship $\omega_{ij}$ indicates the probability that user $i$ meets user $j$. In this subsection, we predict $\omega_{ij}$ by deriving the meeting probability between $i$ and $j$ based on the mobility model described in the last subsection. Referring to Fig. 1, user $i$ is a mobile user and user $j$ is one of the destinations outside the communication range of the information source. Let us assume that the current time interval is $Int_0$ and that the current location of user $i$ is $(x_i^0, y_i^0)$, which is obtained via GPS or other techniques. Denote the current moving direction and speed of user $i$ by $(x_a, y_a)$ and $v_a$ respectively, with $(x_a, y_a)$ being the center of the zone that user $i$ is currently moving towards. Since the users do not change their moving direction or speed within one time interval [10], we can get the location of user $i$ at the end of $Int_0$, denoted by $(x_i^1, y_i^1)$. For the next time interval $Int_1$, user $i$ will change its moving direction and speed according to the probability function of $\zeta_{i\tau}$ and the probability density function of $v_i$ respectively. As shown in Fig. 1, it is possible for user $i$ to meet user $j$ (i.e., user $i$ may move into the communication range of user $j$) if its moving direction in $Int_1$ is $(x_b, y_b)$ or $(x_c, y_c)$, while the meeting between user $i$ and user $j$ will never happen during $Int_1$ if user $i$'s moving direction is $(x_d, y_d)$. More specifically, when the moving direction of user $i$ is $(x_c, y_c)$, it will meet user $j$ if its location at the end of $Int_1$ is no nearer than $(x_{min}, y_{min})$ and no further than $(x_{max}, y_{max})$.
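Therefore, the minimum speed of user $i$ should be

$$v_{min} = \frac{\sqrt{(x_{min} - x_i^1)^2 + (y_{min} - y_i^1)^2}}{\text{interval length}}, \tag{1}$$

while the maximum speed of user $i$ should be

$$v_{max} = \frac{\sqrt{(x_{max} - x_i^1)^2 + (y_{max} - y_i^1)^2}}{\text{interval length}}. \tag{2}$$

Thus, when the moving direction of user $i$ is $(x_c, y_c)$, the probability that user $i$ meets user $j$ during $Int_1$ can be derived as

$$\phi_{ij}^c = P(v_{min} \le v_i \le v_{max}) = \frac{1}{2}\left[\mathrm{erf}\left(\frac{v_{max} - \mu}{\sqrt{2}\,\sigma}\right) - \mathrm{erf}\left(\frac{v_{min} - \mu}{\sqrt{2}\,\sigma}\right)\right]. \tag{3}$$

Here, $\sigma = \frac{\mu}{3}$ and $\mathrm{erf}(\cdot)$ is the Gauss error function. Let $C$ be the set of moving directions that make a meeting between users $i$ and $j$ possible. For example, $C = \{(x_b, y_b), (x_c, y_c)\}$ in Fig. 1. The social relationship between $i$ and $j$ can be predicted as

$$\omega_{ij} = \sum_{(x_c, y_c) \in C} \zeta_{ic}\, \phi_{ij}^c. \tag{4}$$

Here, $\zeta_{ic}$ is the probability that user $i$ chooses direction $(x_c, y_c)$ and $\phi_{ij}^c$ is the probability that user $i$ meets user $j$ when its moving direction is $(x_c, y_c)$.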
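The prediction can be sketched numerically as follows. The sketch assumes that, for each direction in $C$, the distances from the user's position at the end of $Int_0$ to $(x_{min}, y_{min})$ and $(x_{max}, y_{max})$ have already been computed; the names `d_min`, `d_max`, and `interval_len` are hypothetical and chosen only for this illustration.

```python
import math


def meeting_prob_for_direction(d_min, d_max, mu, interval_len):
    """P(v_min <= v <= v_max) for v ~ N(mu, mu/3).

    d_min, d_max: distances to the nearest and farthest end points at which
    the user would still be inside the other user's communication range.
    """
    sigma = mu / 3.0
    v_min = d_min / interval_len
    v_max = d_max / interval_len

    def z(v):
        # Standardised argument of the error function.
        return (v - mu) / (sigma * math.sqrt(2.0))

    return 0.5 * (math.erf(z(v_max)) - math.erf(z(v_min)))


def predict_relationship(directions, mu, interval_len):
    """Sum zeta_ic * phi_ij^c over the directions in C.

    directions: iterable of (zeta_ic, d_min, d_max) tuples, one per direction.
    """
    return sum(
        zeta * meeting_prob_for_direction(d_min, d_max, mu, interval_len)
        for zeta, d_min, d_max in directions
    )


if __name__ == "__main__":
    # Two hypothetical feasible directions for a walking user (mu = 1.4, 30 s interval).
    C = [(0.4, 30.0, 42.0), (0.1, 36.0, 48.0)]
    print(predict_relationship(C, mu=1.4, interval_len=30.0))
```

As a sanity check, a direction whose feasible speed range is $[0, 2\mu]$, i.e. $\mu \pm 3\sigma$, yields a meeting probability of about 0.9973, in line with the $3\sigma$ rule used to choose $\sigma$.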

5. Design of the relay selection algorithm. In this section, we model the relay selection problem as a network formation game, where the users, also referred to as the players in the game, have discretion in forming the relay paths. Each user calculates its payoff as the difference between the benefit and the cost of joining a specific relay path and chooses the strategy that maximizes its own payoff. More specifically, we define the payoff functions for the information source, the relays, and the destinations in Section 5.1, design the game evolving rules (that is, the procedure of the relay selection algorithm) in Section 5.2, and prove the stability of the network structure formed by the proposed algorithm in Section 5.3.

5.1. Payoff functions. For any game, one basic problem is to motivate the users to participate. In this subsection, we design the payoff functions to capture the incentives for the information source, the relays, and the destinations.
First, we design the payoff function for the information source $I_s$. Let $N(I_s)$ denote the set of users that are within the transmission range of $I_s$. Then the destination set $V_D$ can be partitioned into two subsets: $V_D^{in} \subseteq N(I_s)$ and $V_D^{out} = V_D \setminus V_D^{in}$. For any destination $i \in V_D^{in}$, $I_s$ can build a direct transmission link $(I_s, i)$ and gain a unit of payoff by finishing the transmission to destination $i$. If $I_s$ selects a relay $j \ne i$ for destination $i$ instead of performing the direct transmission, $i$ will not be reached in this time interval. Thus, we assume that $I_s$ gains a payoff of $-\infty$ if it selects a relay $j \ne i$ for any $i \in V_D^{in}$. For any destination $i \in V_D^{out}$, $I_s$ can either hold the transmission until it moves into the destination's transmission range or employ other users within its own transmission range as relays. The probability that $I_s$ will meet $i$ is denoted by $\omega_{I_s i}$, and the probability that destination $i$ will be reached in the next time interval if user $j \in N(I_s)$ is selected as the relay is denoted by $\omega_{ji}$. Thus, when $\omega_{ji} > \omega_{I_s i}$, employing user $j$ as the relay to destination $i$ brings $I_s$ a benefit, denoted by $B_i^s(j)$ and defined in equation (5). However, relaying also has a cost. Compared with direct transmission, relay transmission incurs a certain probability of information leakage on the relay paths. At a given relay, the information may be intercepted by the relay itself or overheard by the users within its transmission range. The probability that the information for destination $i$ is leaked at relay $j \in L_i$ is denoted by $C_i^s(j)$ and defined in equation (6), where $p_j$ is the overhearing probability of user $j$.
Calculating the difference between $B_i^s(j)$ and $C_i^s(j)$, we get the payoff earned by $I_s$ for employing user $j$ as a relay to destination $i$. Taking into account all the discussed cases, we define the payoff function of the information source as follows:

$$U_i^s(j) = \begin{cases} 1, & i \in V_D^{in},\ j = i, \\ -\infty, & i \in V_D^{in},\ j \ne i, \\ B_i^s(j) - C_i^s(j), & i \in V_D^{out},\ j \in N(I_s). \end{cases} \tag{7}$$

Here, the first row indicates the case where destination $i \in V_D^{in}$ and $I_s$ performs the direct transmission; the second row indicates the case where destination $i \in V_D^{in}$ and $I_s$ selects a relay $j \ne i$ for $i$; and the third row indicates the case where $i \in V_D^{out}$ and $I_s$ selects a relay $j \in N(I_s)$ for $i$.
When considering the network formation from the viewpoint of the relays, the wages for acting as a relay and the cost of maintaining the relay links are two major concerns. In this work, we assume that each source-destination pair $(I_s, i)$, $i \in V_D$, provides a wage budget of $\alpha$. Each relay $j$ on path $L_i$ earns a wage $B_i^r(j)$ by sharing the total budget according to a predefined rule, which will be detailed in the next subsection. As to the cost, we assume that a cost $\beta$ is needed for user $j$ to maintain each link. Then, the cost of user $j$ for acting as a relay to destination $i$ can be denoted by

$$C_i^r(j) = \beta k_{ij}, \tag{8}$$

where $k_{ij}$ is the number of links that should be added by $j$ to perform the relay function for $i$. Therefore, the payoff function of user $j$ for acting as a relay to destination $i$ can be defined as:

$$U_i^r(j) = B_i^r(j) - C_i^r(j). \tag{9}$$

Finally, we define the payoff function for the destinations. For any destination $i \in V_D$, the payoff function $U_i^d$ is defined as:

$$U_i^d = \begin{cases} 1, & \text{if } i \text{ is reached}, \\ 0, & \text{otherwise}. \end{cases} \tag{10}$$

That is, destination $i \in V_D$ earns a unit of payoff if $i$ is reached and earns nothing otherwise.
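The three payoff functions of this subsection can be sketched in Python as below. So that the sketch does not commit to particular expressions for the source's benefit and leakage cost, these are passed in as caller-supplied functions `benefit` and `cost` (hypothetical names); `wage`, `beta`, and `num_links` stand for the relay's wage, the per-link cost β, and the number of links the relay has to add.

```python
import math


def source_payoff(dest, relay, in_range_dests, benefit, cost):
    """Payoff of the information source for serving `dest` through `relay`.

    in_range_dests: destinations inside the source's transmission range.
    benefit(relay, dest), cost(relay, dest): caller-supplied stand-ins for
    the source's benefit and leakage cost (assumptions of this sketch).
    """
    if dest in in_range_dests:
        # Direct transmission earns one unit; relaying to a reachable
        # destination is ruled out by an infinitely negative payoff.
        return 1.0 if relay == dest else -math.inf
    return benefit(relay, dest) - cost(relay, dest)


def relay_payoff(wage, beta, num_links):
    """Wage earned minus the cost of maintaining `num_links` links at `beta` each."""
    return wage - beta * num_links


def destination_payoff(reached):
    """One unit of payoff if the destination is reached, nothing otherwise."""
    return 1.0 if reached else 0.0


if __name__ == "__main__":
    b = lambda j, i: 0.7  # hypothetical benefit
    c = lambda j, i: 0.2  # hypothetical leakage cost
    print(source_payoff("d2", "r1", {"d1"}, b, c))
    print(relay_payoff(wage=1.0, beta=0.25, num_links=2))
```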

5.2. Relay selection based on the network formation game. In the network formation game, the objective of each player (user) is to maximize its payoff [12]. In this subsection, we propose the relay selection algorithm based on the network formation game described above. Relay selection consists of $M$ rounds, with $M$ being the maximum number of hops on the relay paths. In the $m$-th round, $m \in \{1, 2, \cdots, M\}$, the $m$-th hop relay links are formed for those destinations that have not yet been reached. In the first round, i.e., $m = 1$, the information source searches within its transmission range for the destinations.
Step 1: $I_s$ broadcasts an employment message, including the set $V_D^{out}$ and the value of $\alpha$, to the users belonging to $N(I_s)$.
Step 2: Each user $j \in N(I_s)$ feeds back a relationship list $\Omega_j = \{\omega_{ji} \mid i \in V_D^{out}\}$ to $I_s$.
Step 3: After receiving the replies, $I_s$ selects a relay $j_i^* \in N(I_s)$ for each destination $i \in V_D^{out}$ according to the following rule:

$$j_i^* = \arg\max_{j \in N(I_s)} U_i^s(j), \tag{11}$$

where $U_i^s(j) = B_i^s(j) - C_i^s(j)$ can be calculated according to (5), (6), and (7). Then, $I_s$ constructs an employment list $E_j$ for each $j \in N(I_s)$:

$$E_j = \{i \in V_D^{out} \mid j_i^* = j\}. \tag{12}$$

Note that $E_j = \emptyset$ if user $j \in N(I_s)$ is not selected as a relay for any $i \in V_D^{out}$. For any user $j \in N(I_s)$ with nonempty $E_j$, $I_s$ sends the employment list to it.
Step 4: If user $j$ receives a nonempty employment list $E_j$ from $I_s$, it agrees to act as a relay for a subset of destinations $V_j^* \subseteq E_j$, and feeds $V_j^*$ back to $I_s$. $V_j^*$ is determined according to the following rule:

$$V_j^* = \{i \in E_j \mid U_i^r(j) > 0\}, \tag{13}$$

where $U_i^r(j) = B_i^r(j) - C_i^r(j)$ is the payoff of user $j$ for acting as a relay to destination $i$. $C_i^r(j)$ can be calculated according to equation (8), and $B_i^r(j) = \alpha$ in the first round since $j$ is currently the only relay for the source-destination pair $(I_s, i)$.
Step 5: After receiving the feedback, $I_s$ constructs the relay links for any $i \in V_j^*$ as $L_i = \{(I_s, j), (j, i)\}$, and updates the set of destinations that still need to be relayed as $V_D^{out} = V_D^{out} \setminus V_j^*$. It then repeats Step 1 to Step 5 until $V_D^{out} = \emptyset$. Note that relays which have rejected the employment of $I_s$ will not be considered again. After the first round is finished, the first hop relays of all the destinations have been employed (if necessary). Denote the set of all the first hop relays by $R_1$. The information source $I_s$ sends the information to the relays in $R_1$. Once the relays in $R_1$ have accepted the employment, they are responsible for transmitting the information to the destinations. Otherwise, they will be penalized by a strict punishment mechanism.
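Steps 1 to 5 of the first round can be sketched as follows. This is a simplified, centralized rendering under stated assumptions: the payoffs are supplied as functions `source_payoff(j, i)` and `relay_payoff(j, i)` (hypothetical names), message exchange is not modeled, ties are broken by list order, and a rejection is remembered per (relay, destination) pair, which is only one possible reading of the rule that rejecting relays are not considered again.

```python
def first_round(out_dests, neighbours, source_payoff, relay_payoff):
    """Return {destination: first-hop relay} for the accepted employments.

    out_dests: destinations outside the source's transmission range.
    neighbours: candidate relays inside the source's transmission range.
    source_payoff(j, i): payoff of the source for employing j towards i.
    relay_payoff(j, i): payoff of j for relaying towards i.
    """
    remaining = set(out_dests)
    rejected = set()  # (relay, destination) pairs that were turned down
    assigned = {}
    while remaining:
        # Step 3: the source picks its best not-yet-rejected relay per
        # destination and groups the picks into employment lists.
        employment = {}
        for i in sorted(remaining):
            options = [j for j in neighbours if (j, i) not in rejected]
            if options:
                best = max(options, key=lambda j: source_payoff(j, i))
                employment.setdefault(best, []).append(i)
        if not employment:
            break  # no candidate relay is left for any remaining destination
        # Step 4: each relay accepts only destinations with positive payoff.
        for j, dests in employment.items():
            for i in dests:
                if relay_payoff(j, i) > 0:
                    assigned[i] = j          # Step 5: link (source, j) is formed
                    remaining.discard(i)
                else:
                    rejected.add((j, i))
    return assigned


if __name__ == "__main__":
    us = {(1, 10): 0.5, (2, 10): 0.3, (1, 11): 0.1, (2, 11): 0.4}
    result = first_round(
        [10, 11], [1, 2],
        source_payoff=lambda j, i: us[(j, i)],
        relay_payoff=lambda j, i: -1.0 if (j, i) == (1, 10) else 1.0,
    )
    print(result)
```

In the usage example, relay 1 is the source's first choice for destination 10 but declines, so the source falls back to relay 2 in the next pass.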
Then, our relay selection algorithm selects the second hop relays in the second round. In this round, each first hop relay $j \in R_1$, which is responsible for transmitting the information to the destinations in $V_j^*$, can be considered as a virtual information source $I_{sj} = j$ with a set of destinations $V_j^*$. Each virtual information source $I_{sj}$ just needs to perform Step 1 to Step 5 as in the first round. The only difference is that the declared value of $\alpha$ in Step 1 changes in the second round. Since relay $I_{sj}$ has accepted the employment of $I_s$, it is responsible for relaying the information to destination $i \in V_j^*$. When $i$ is out of the transmission range of $j$, $j$ is motivated to employ a second hop relay for help. However, $j$ is a selfish user who has to consider its own benefit. Therefore, $j$ would not give all the wages it earns from $I_s$ to the second hop relay. In our work, we assume that $j$ declares a wage budget $\alpha_2 = \frac{\alpha}{2}$ in Step 1. Therefore, the benefit earned by a second hop relay $j_2$ will be $B_i^r(j_2) = \alpha_2 = \frac{\alpha}{2}$ in Step 4. Similarly, the $m$-th hop relays can be selected independently of the selection of the $(m-1)$-th hop relays. Here, the $(m-1)$-th hop relays are considered as the virtual information sources, the wage budget $\alpha_m$ is declared to be $\alpha_m = \frac{\alpha}{2^{(m-1)}}$ in Step 1, and the benefit earned by an $m$-th hop relay $j_m$ is $B_i^r(j_m) = \frac{\alpha}{2^{(m-1)}}$ in Step 4. Finally, the relay selection algorithm ends when all the destinations of the information source $I_s$ are reached.
5.3. Stability analysis. By performing our relay selection algorithm, a relay path is formed for each destination. The $|V_D|$ relay paths in total construct a network structure, denoted by $g$. In this subsection, we prove the stability of the formed network structure.
According to [9], there are two kinds of stability in the network formation game, namely pairwise stability and strong stability. The definitions of these two kinds of stability are given as follows.
Definition 5.1. A network structure $g$ is pairwise stable with respect to the payoff function $U$ if
1. for any edge $(i, j) \in g$, $U_i(g) \ge U_i(g - (i, j))$ and $U_j(g) \ge U_j(g - (i, j))$, and
2. for any edge $(i, j) \notin g$, if $U_i(g + (i, j)) > U_i(g)$ then $U_j(g + (i, j)) < U_j(g)$.
Here, $U_i(g)$ denotes the payoff earned by user $i$ in network structure $g$.
Definition 5.2. A network structure $g$ is strongly stable with respect to the payoff function $U$ if for any other network structure $g'$, which is obtained via deviation from $g$, there exists at least one player (user) $i$ such that $U_i(g') < U_i(g)$.
It can be seen from the definitions that pairwise stability only considers deviations on a single link at a time, while strong stability allows for more complicated deviations on groups of links. In this paper, we ignore collusion among users and only consider deviations on a single link at a time. Based on this assumption, we prove that the network structure formed by the proposed relay selection algorithm is pairwise stable as follows.
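For small examples, the two conditions of Definition 5.1 can be checked by brute force. The sketch below treats edges as unordered pairs and takes an arbitrary payoff function `payoff(player, edges)`; both choices are assumptions made for illustration, and the toy payoffs in the usage example are not the payoff functions defined in Section 5.1.

```python
from itertools import combinations


def is_pairwise_stable(nodes, edges, payoff):
    """Brute-force check of pairwise stability.

    nodes: iterable of players.
    edges: collection of frozenset({i, j}) describing the network.
    payoff(player, edges): payoff of `player` in the network `edges`.
    """
    edges = frozenset(edges)
    # Condition 1: neither endpoint gains by deleting an existing edge.
    for e in edges:
        without = edges - {e}
        if any(payoff(p, edges) < payoff(p, without) for p in e):
            return False
    # Condition 2: if adding a missing edge strictly benefits one endpoint,
    # it must strictly hurt the other endpoint.
    for i, j in combinations(nodes, 2):
        e = frozenset((i, j))
        if e in edges:
            continue
        with_e = edges | {e}
        for a, b in ((i, j), (j, i)):
            gains = payoff(a, with_e) > payoff(a, edges)
            hurts_other = payoff(b, with_e) < payoff(b, edges)
            if gains and not hurts_other:
                return False
    return True


if __name__ == "__main__":
    players = [1, 2, 3]
    complete = {frozenset(p) for p in combinations(players, 2)}
    # Toy payoff: every incident link is worth +0.5 to a player.
    link_is_good = lambda p, E: 0.5 * sum(1 for e in E if p in e)
    print(is_pairwise_stable(players, complete, link_is_good))
    print(is_pairwise_stable(players, set(), link_is_good))
```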
Theorem 5.3. The network structure g formed by the relay selection algorithm is pairwise stable.
Proof. First, we prove the first condition of pairwise stability. According to the description in Section 3, we use edge $(i_{m-1}, i_m) \in g$ to denote the relay link between the $(m-1)$-th hop relay and the $m$-th hop relay on relay path $L_i$. Then, we consider four cases.
(1) Case 1. $i_{m-1} = I_s$, $i_m = i$. Edge $(i_{m-1}, i_m)$ indicates that destination $i_m = i$ is within the transmission range of the information source $i_{m-1} = I_s$. Obviously, it is against common sense to break the direct transmission link when $i_m$ is reachable by $i_{m-1}$. Formally, user $i_{m-1}$ earns a payoff of 1 when edge $(i_{m-1}, i_m)$ exists (refer to the first row of equation (7)). If edge $(i_{m-1}, i_m)$ is removed, $i_{m-1}$ has to find another user $j \ne i_m$ to relay the message. According to the second row of (7), $i_{m-1}$ will earn a payoff of $-\infty$ in such a case. Therefore, $U_{i_{m-1}}(g) > U_{i_{m-1}}(g - (i_{m-1}, i_m))$. As for the destination $i_m$, it earns a payoff of 1, i.e., it gets the information from $i_{m-1}$, if edge $(i_{m-1}, i_m)$ exists, and earns nothing if edge $(i_{m-1}, i_m)$ is removed. Thus, $U_{i_m}(g) > U_{i_m}(g - (i_{m-1}, i_m))$.
(2) Case 2. $i_{m-1} = I_s$, $i_m \ne i$. Edge $(i_{m-1}, i_m) = (I_s, i_m)$ indicates that user $i_m$ is a relay selected by the information source. According to (7) and (9), $i_{m-1}$ and $i_m$ earn payoffs of $U_{i_{m-1}}(g) = U_i^s(i_m) = B_i^s(i_m) - C_i^s(i_m)$ and $U_{i_m}(g) = U_i^r(i_m) = B_i^r(i_m) - C_i^r(i_m)$, respectively. It can be seen from (13) that $U_{i_m}(g) = U_i^r(i_m) > 0$. If edge $(i_{m-1}, i_m)$ is removed, user $i_m$ can no longer earn the payoff $U_i^r(i_m)$; in other words, user $i_m$ then earns a zero payoff. Therefore, $U_{i_m}(g) > U_{i_m}(g - (i_{m-1}, i_m))$. As for the information source $i_{m-1} = I_s$, a new relay $i_m'$ should be selected to replace $i_m$ if edge $(i_{m-1}, i_m)$ is deleted. According to (11), $U_i^s(i_m) \ge U_i^s(i_m')$, and hence $U_{i_{m-1}}(g) \ge U_{i_{m-1}}(g - (i_{m-1}, i_m))$.
(3) Case 3. $i_{m-1} \ne I_s$, $i_m \ne i$. As described in Section 5.2, $i_{m-1}$ can be considered as a virtual information source in this case. Similar to the analysis in Case 2, we find that $U_{i_{m-1}}(g) \ge U_{i_{m-1}}(g - (i_{m-1}, i_m))$ and $U_{i_m}(g) \ge U_{i_m}(g - (i_{m-1}, i_m))$.
(4) Case 4. $i_{m-1} \ne I_s$, $i_m = i$. In this case, $i_{m-1}$ can also be considered as a virtual information source. Similar to the analysis in Case 1, we have $U_{i_{m-1}}(g) \ge U_{i_{m-1}}(g - (i_{m-1}, i_m))$ and $U_{i_m}(g) \ge U_{i_m}(g - (i_{m-1}, i_m))$.
Summarizing these four cases, we conclude that $g$ satisfies the first condition of pairwise stability.
Then, we prove the second condition of pairwise stability. For any edge $(p, q) \notin g$, there are two cases.
(1) Case 1: $\forall i \in V_D$, $p \notin L_i$, $q \notin L_i$. This case covers the situation where neither $p$ nor $q$ belongs to any relay path. Forming such an edge is meaningless in the relay selection process. Thus, we do not consider this case in our work.
(2) Case 2: There is a pair of $i$ and $m \in \{1, 2, \cdots, M\}$ such that $p = i_{m-1}$. If $q = i_m$, edge $(i_{m-1}, i_m)$ already exists in the network structure $g$. Otherwise, adding edge $(p, q)$ means replacing relay $i_m$ with another user $q$. In other words, edge $(i_{m-1}, i_m)$ should be deleted before adding edge $(p, q)$. According to the proof of the first condition, deleting edge $(i_{m-1}, i_m)$ will lead to a decreased payoff for user $i_{m-1}$. Similarly, the payoff of user $i_m$ will be decreased when there is a pair of $i$ and $m \in \{1, 2, \cdots, M\}$ such that $q = i_m$ and $p \ne i_{m-1}$. Therefore, the network structure $g$ formed by our relay selection algorithm also satisfies the second condition of pairwise stability.
Based on all the above analysis, we draw the conclusion that the network structure $g$ formed by our relay selection algorithm is pairwise stable.
6. Simulation. In this section, we compare the performance of the proposed relay selection algorithm with the random relay selection algorithm (Rand), the relationship based relay selection algorithm (Relation), and the leakage probability based relay selection algorithm (Leakage), in MATLAB, using both a synthetic trace and a real-world trace. In our simulation, we consider a time-slotted system with a set of users, denoted by $N = \{1, 2, \cdots, n\}$. For each formed relay path, we define the transmission latency as the number of slots taken to reach the destination, and define the maximum leakage probability as the maximum leakage probability of the users on the relay path. Each relay selection algorithm is run 3000 times; the average latency (A-Latency) and the average maximum leakage probability (A-MLP) are used to evaluate the performance of the relay selection algorithms. Our relay selection algorithm is denoted by ESRS for short in the simulation study. The three schemes for comparison are briefly introduced as follows:
• Rand: the next hop relay is randomly selected.
• Relation: the next hop relay is selected as the user having the maximum social relationship with the destination.
• Leakage: the next hop relay is selected as the user with the minimum leakage probability.
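The three comparison schemes and the per-path leakage metric can be sketched in a few lines of Python. The data layout (a nested mapping `omega[j][dest]` of meeting probabilities and a mapping `p[j]` of leakage probabilities) and all function names are assumptions made for illustration; the original simulation was carried out in MATLAB.

```python
import random


def select_rand(candidates, rng):
    """Rand: pick the next hop relay uniformly at random."""
    return rng.choice(candidates)


def select_relation(candidates, omega, dest):
    """Relation: pick the candidate with the highest relationship to dest."""
    return max(candidates, key=lambda j: omega[j][dest])


def select_leakage(candidates, p):
    """Leakage: pick the candidate with the lowest leakage probability."""
    return min(candidates, key=lambda j: p[j])


def path_max_leakage(relays, p):
    """Maximum leakage probability among the users on a relay path."""
    return max(p[j] for j in relays)


if __name__ == "__main__":
    omega = {1: {9: 0.2}, 2: {9: 0.8}, 3: {9: 0.5}}  # hypothetical values
    p = {1: 0.6, 2: 0.3, 3: 0.05}                    # hypothetical values
    rng = random.Random(1)
    print(select_rand([1, 2, 3], rng))
    print(select_relation([1, 2, 3], omega, dest=9))
    print(select_leakage([1, 2, 3], p))
```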
6.1. Simulation study using synthetic $\zeta_{i\tau}$. As defined in Section 4.1, $\zeta_{i\tau}$ is the probability of user $i$ moving towards zone $A_\tau$. Different users have different probabilities for moving towards each zone. As we know, many human activities in social networks can be modeled by the power law distribution [16,14,28,7,27]. Therefore, for each user, we model the probability that it moves towards different zones as a power law distribution. That is, a user moves to several zones (such as the home, the working place, and the favorite restaurants) with high probability and rarely moves to other zones. We assume there are 100 zones $\{A_1, A_2, \cdots, A_{100}\}$ in a 100m × 100m area $A$, with each zone being a 10m × 10m square. 100 users move within the whole area. For each user $i$, the probability function of choosing different moving directions is modeled as a power law distribution: $\zeta_{i\tau} = \frac{\tau^{-k_r}}{\sum_{\tau} \tau^{-k_r}}$, with $\tau \in \{1, 2, \cdots, 100\}$ being the index of a zone, $k_r$ being the exponent of the distribution, and $\sum_{\tau} \tau^{-k_r}$ being the normalization factor.

Table 1. Simulation parameters.

| Symbol | Description | Value |
|---|---|---|
| $k_r$ | The exponent of the power law distribution for $\zeta_{i\tau}$ | 1.7 |
| $k_l$ | The exponent of the power law distribution for $p_i$ | 3 |
| $C_l$ | The maximum value of $p_i$ | 0.6 |
| $R_d$ | The radius of the communication range | 6m |
| | The length of the time interval within which users keep their moving direction and speed unchanged | 30s |
| $\mu$ | The mean of the normal distribution for users' speed | 1.4 |
| $\sigma$ | The standard deviation of the normal distribution for users' speed | $\frac{\mu}{3}$ |
The probability that a user launches a malicious attack is also modeled by a power law distribution. Specifically, we define $p_i = C_l\, i^{-k_l}$, with $i \in N$ being the user index, $N$ being the set of users, $k_l$ being the exponent, and $C_l$ being a constant indicating the maximum value of $p_i$. The effect of $k_r$ on $\zeta_{i\tau}$ and the effects of $C_l$ and $k_l$ on $p_i$ are shown in Fig. 2.
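The two power law models used for the synthetic setting can be written down directly. The sketch below is in Python rather than MATLAB, and the function names are hypothetical; it evaluates both formulas for a given number of zones or users.

```python
def zone_probabilities(num_zones, k_r):
    """Normalised power law over zone indices 1..num_zones with exponent k_r."""
    weights = [tau ** (-k_r) for tau in range(1, num_zones + 1)]
    total = sum(weights)  # normalisation factor
    return [w / total for w in weights]


def overhearing_probabilities(num_users, c_l, k_l):
    """p_i = c_l * i^(-k_l) for user indices 1..num_users."""
    return [c_l * i ** (-k_l) for i in range(1, num_users + 1)]


if __name__ == "__main__":
    zeta = zone_probabilities(100, 1.7)
    p = overhearing_probabilities(100, 0.6, 3)
    print(zeta[:3])
    print(p[:3])
```

With a larger exponent, the probability mass concentrates on the first few indices, which matches the intuition that a user visits a handful of zones often and that only a few users are likely to overhear.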
In our simulation, we set the parameters according to Table 1 and we get the simulation results shown in Table 2.
6.2. Simulation study using real-world meeting probability. We take the meeting probability obtained from a real trace [17], which was collected at the University of St Andrews, as the social relationship between users. To the best of our knowledge, there is no available trace indicating the probability that a user launches a malicious attack. In the real trace simulation, we vary the exponent $k_l$ of the power law distribution $p_i = C_l\, i^{-k_l}$ from 1 to 3 to investigate the performance of the schemes under different attack probabilities. The simulation results using the real-world trace are given in Fig. 3. As shown in Fig. 3(a), Relation always gets the best A-Latency performance, as it selects the relays based solely on the relationship (i.e. the encounter rate) among the users. The A-Latency performance of Rand and Leakage is random, since the encounter rate is not considered for relay selection. As for our relay selection algorithm, its A-Latency performance approaches that of Relation as $k_l$ increases. The reason is as follows. According to Fig. 2, the probabilities of users launching malicious attacks decrease as $k_l$ increases. In this case, the relationship among the users plays a vital role in relay selection. Therefore, our relay selection algorithm will find the relays with higher encounter rates with the destination, and its A-Latency performance will approach that of Relation. The comparison of the A-MLP performance is given in Fig. 3(b). It can be seen that our relay selection algorithm maintains an A-MLP performance similar to that of Leakage, which selects the relays based solely on the leakage probabilities of the users.

7. Conclusion. This paper tackles the challenge of designing an efficient and secure relay selection algorithm for MSNs. We propose a novel mobility model for MSN users that considers both the randomness and the sociality of their movements. By analysing the mobility model, the social relationship among users is predicted to help find relay paths with low transmission latency. We propose a network formation game based relay selection algorithm by defining the payoff functions of the users and designing the game evolving rules. The stability of the network structure formed by our relay selection algorithm is proved, and the performance of the algorithm is compared with other algorithms through an extensive simulation study. The simulation results using both a synthetic trace and a real-world trace show that our algorithm outperforms other algorithms by enabling communication in MSNs with lower latency and lower information leakage probability.