Research Article
Allocating Limited Resources to Protect a Massive Number of Targets Using a Game Theoretic Model
Xu Liu ,
1,2Xiaoqiang Di ,
1,2Jinqing Li,
1,2Huan Wang,
1,2Jianping Zhao,
1,2Huamin Yang ,
1,2Ligang Cong,
1,2and Yuming Jiang
31School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
2Jilin Province Key Laboratory of Network and Information Security, Changchun, China
3Department of Information Security and Communication Technology, Norwegian University of Science and Technology, Trondheim, Norway
Correspondence should be addressed to Xiaoqiang Di; [email protected] Received 7 September 2018; Accepted 20 February 2019; Published 13 March 2019
Academic Editor: Neale R. Smith
Copyright © 2019 Xu Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Resource allocation is the process of optimizing the rare resources. In the area of security, how to allocate limited resources to protect a massive number of targets is especially challenging. This paper addresses this resource allocation issue by constructing a game theoretic model. A defender and an attacker are players and the interaction is formulated as a trade-off between protecting targets and consuming resources. The action cost which is a necessary role of consuming resource is considered in the proposed model.
Additionally, a bounded rational behavior model (quantal response: QR), which simulates a human attacker of the adversarial nature, is introduced to improve the proposed model. To validate the proposed model, we compare the different utility functions and resource allocation strategies. The comparison results suggest that the proposed resource allocation strategy performs better than others in the perspective of utility and resource effectiveness.
1. Introduction
Resource allocation has always been a complex problem, especially when driven by security requirements. How to devise a mechanism to control the trade-off between the cost of protection and the achieved security utility is an open challenge [1]. In the AWS re:Invent 2014, the AWS engineer claimed that Amazon had nearly 28 total sets across the world, each of which has one or more data centers with a typical facility containing 50,000 to 80,000 servers [2]. To protect these servers against attack and maintain their consistent operation, cloud providers will implement security strategy.
For example, they can protect targets (e.g., virtual machines, VMs) by setting up resource reservations to analyze the operation of targets and then respond the attack quickly, which is followed by a lot of resource consumption [3].
Therefore, a trade-off problem could be abstracted between consuming resources and protecting targets. Especially, when the number of available resources or resource budget is fixed and limited for all the targets, how to allocate limited
resources to protect a massive number of targets is a vital issue in the security area.
The extreme approach may be to allocate security resources to cover all the targets [4], for instance, setting up the full resource reservations for all the VMs, which will lead to almost double resource consumption. The common approach may be to protect those targets with the most value [5], for instance, setting up the resource reservation for the VMs that store the most data or the sensitive data (e.g., financial data). The former approach fails to consider resource constraints and effectiveness; however, the available resources may not be sufficient to protect all the targets on the one hand; on the other hand, resources allocated to some empty targets may be inefficient. The latter approach does not account for the adversarial nature and perspective-taking of the attacker. An attacker who can learn about a defender’s possible target protection strategy can exploit this knowledge to launch an attack on the targets that the defender does not protect.
Volume 2019, Article ID 5475341, 16 pages https://doi.org/10.1155/2019/5475341
This paper focuses on developing a general resource allo- cation method to address the trade-off between security gain and resource consumption. The goal is to resolve the problem of how to utilize limited resources to efficiently protect massive targets against attack. How to build a mathematical model to describe this problem is the key, for example,(1) how to maintain security while allocating resources and(2) how to simulate an attacker of the adversarial nature.
In the previous studies [5–10], the number of allocated resources is measured by defense probability. But the impor- tance and emphasis of resource allocation weaken in such scheme. In general, performing different actions on a target will result in different outcomes. If an action is successful, the actor will obtain some benefit as a reward; otherwise, the actor will lose some assets as a penalty. No matter whether an action is successful, the actor will incur some cost by taking the action. Recent studies about the effort of deterrence and risk preferences in the security games [11, 12] have analyzed the impact of risk preference on the defense effort and deterrence level and the impact of defender’s cost on the investment strategy. Meanwhile, statistics show that a large data center costs between $10 million and $25 million per year and the corresponding maintenance costs account for nearly 80% of its total cost [13]. So it is clear that action cost is an important factor which cannot be ignored. By combining the rewards, penalties, costs, and probabilities of actions in some manner, it may be possible to describe our problem.
Game theory, an important tool for analyzing real world resource allocation problems, such as the assignment of cyber analysts [5] and patrolling strategies [14, 15], provides an alternative solution. However, in most of the previous studies [6–9] on game theoretic resource allocation, only the reward and penalty associated with an action have been included in the game utility function, but the action cost has been ignored. In the real world, no matter what one wants to do, an action cost is often necessary. This cost might be measured in monetary units, physical resources, abstract resources, and so on. Whatever it is, it can be abstracted as a mathematical expression. Hence, we include cost additionally in the Stackelberg game [16] utility function and analyze the impact of different parameter value configurations on the defender’s utility.
Since both the defender and the attacker are intelligent and have the perspective-taking ability, we consider an inter- action in which the defender designs a resource allocation strategy first and the attacker subsequently develops an attack strategy. Although the attacker has the ability to consider the situation from the perspective of the defender, the attacker might also take abrupt actions that lie outside the defender’s expectations. This type of attacker, who is of the adversarial nature, can be simulated by the quantal response (QR) model, which has received widespread support in the literature on modeling human behavior in games [17]. In this paper, we introduce it into the proposed Game Theoretic Resource Allocation (GTRA) model to simulate adversarial reasoning.
The efficient resource allocation strategy for the defender is obtained from an optimization algorithm. Three indica- tors, namely, vulnerability, coverage, and effectiveness, are designed to evaluate the effectiveness of our strategy. We
compare the equilibrium strategy based on the proposed GTRA with the one based on a game utility function without considering the action cost, and also compare with four extreme resource allocation strategies, namely average allo- cation strategy, partial allocation strategy, random allocation strategy, and full-coverage strategy. The experimental results demonstrate the effectiveness of our proposed GTRA model.
The contributions of this paper can be summarized as follows:
(1) To emphasize the action cost in resource consump- tion. The players’ action costs are included in the game utility function as an independent item. The numerical analyses prove that this type of resource measurement can improve the utility and effective- ness.
(2) To better balance target security and resource con- sumption. The obtained Nash equilibrium strategy is selected as the defender’s resource allocation strategy because it outperforms the other extreme resource allocation strategies in terms of both security and effectiveness.
(3) The constructed GTRA model provides advice based on the target parameters to assist in determining the appropriate quantity of resources to protect a massive number of targets.
The remainder of this paper is organized as follows.
Sections 2 and 3 describe the related work on resource allocation and our problem, respectively. Game theoretic model, QR model, and the proposed algorithm are presented in Section 4. The numerical analyses are discussed in Sec- tion 5. The final section summarizes the paper and outlines directions for future work.
2. Related Work
Resource allocation is defined as the economical distribution of resources among competing groups of people or programs [1]. Game theory has been applied in resource allocation to better capture the interaction between resource provider and user and shows the economic nature of resource allo- cation. The previous studies can be roughly classified into two categories based on the different participants consid- ered: security-driven resource allocation between a resource provider and an attacker; and demand-driven resource allo- cation between a resource provider and a legitimate user.
Demand-driven resource allocation can be further subdi- vided into cost-scheme-based, performance-scheme-based, and mixed-scheme-based resource allocation. The original pricing scheme is used for the allocation of resources of a single type, such as bandwidth [18–20], offload [21, 22], or cache [23]. With the development of the Internet, resource provider could provide nearly all the resources that users need, such as cloud computing provider provides on-demand resources including storage, memory, and bandwidth. Mul- tiresource pricing schemes [24], such as the cost-optimized scheme considering multiple resources [25], have emerged.
Meanwhile, since user requests are becoming necessary
while providing service, some research has focused on user- demand-driven resource allocation [26–28]. Later, the cost- optimized and performance-based schemes are combined to allow a resource provider to achieve a win-win objective in which resource provider obtains the maximum profit while the user receives the best experience [13, 29–31].
However, during the pursuit of the best experience and the maximum benefit, security issues increase and security- driven resource allocation become a research hotspot, espe- cially when resources are limited and cannot cover all the targets that require protection. The American Institute Teamcore conducted a project with the theme of “AI and game theory for public safety and security”, and their achieve- ments have been applied in various areas. ARMOR [32] was deployed to develop randomized checkpoints and a patrol route strategy at Los Angeles International Airport. GUARDS [33] was developed to assist airports in allocating limited air police resources to protect more than 400 United States airports. Federal Air Marshals used IRIS [34] to provide scheduling coverage for potential attacks. PROTECT [35] was deployed to generate randomized patrolling schedules for the US Coast Guard. These cases are typical instances of limited security resource allocation using game theoretic model.
Other game theoretic studies have also produced good results. One study [5] investigated an intelligent allocation method for assigning limited cyberanalysts to analyze a massive number of security alerts in a network. Another work [10] developed new models and algorithms that could scale to highly complex instances of limited security resource allocation games. Their new methods performed faster than known algorithms when solving massive security games. In further research [6] based on a previous work [7], efficient algorithms were developed to compute the best responses of security forces to different adversary models when resources are limited, and it was proven that the proposed response strategy was superior because it relaxed the assumption of perfect rationality. An additional study [8] proposed a game theoretic scheme for developing dynamic and randomized security strategies that consider an adversary’s surveillance capabilities. The experimental results showed that the pro- posed algorithm outperformed the existing approaches.
Although these works have utilized the nature and principles of game theory to determine optimal resource allocation strategies, most of them considered only rewards and penalties in their allocation strategies. Recent works [11, 12] specially examined the effect of risk preferences on deterrence and analyzed the impact of the defender’s cost on its investment, which demonstrated that the cost of actions cannot be ignored. Nonetheless, in the previous works [5–10], the action cost was measured by defense probability simply, which inclines to analyze the impact of defense instead of action cost. Therefore, the game theoretic approach that includes the action cost independently is required to perform the resource allocation in the security area.
3. Problem Description
This paper considers a common scenario of a defender and an attacker. The defender’s responsibility is to protect the
Table 1: Parameter descriptions.
Parameter Description
𝑇 set of targets
𝑁 number of targets in𝑇
𝐴 attacker
𝐷 defender
𝑝𝑖 attack probability for target𝑖 𝑞𝑖 defense probability for target𝑖 𝐶𝑚𝑖 resources allocated to protect target𝑖
𝑀 maximum quantity of available resources
security of 𝑁 targets using 𝑀 resources, so it allocates resources to targets as its action. By contrast, the attacker’s intention is to attack the targets, and such attack also costs resources. For both sides, the benefit of consuming resources can be measured in terms of the security gain. The resources can be computing, storage, energy, or even monetary units, and the security gain indicates the return of protecting the targets by consuming resource. Although the units of resources and returns are different, they can be abstracted into the numerical value by mathematical methods. In this paper, we put emphasis on analyzing the relationship between them by setting various parameter configurations to simulate the different scenarios. For example, if the defender allocates resources to a target𝑖, this target will be relatively more secure than a target without being covered by resources, which can be configured with a bigger security gain.
Therefore, the defender obtains a security gain by expend- ing resources, which can be abstracted as a limited resource allocation problem; that is, the problem of how the defender should allocate𝑀resources to protect𝑁when𝑀is far less than𝑁. The defender wants to achieve the greatest security gain while minimizing resource consumption. Therefore, this is a trade-off problem between protecting targets and consuming resources. Table 1 lists the parameters used in this paper.𝑇 = {1, . . . , 𝑖, . . . , 𝑁}is the set of active targets;
𝑖denotes one target; 𝑅𝑚𝑖 and 𝑃𝑖𝑚 are the defender’s reward and penalty, respectively, for an attack on this target; and 𝐶𝑎𝑖 and 𝐶𝑚𝑖 are the resources required to be expended by the attacker and the defender, respectively, to best protect target 𝑖. 𝐴is the attacker, who commits to a strategyp = {𝑝1, 𝑝2, . . . , 𝑝𝑁}, where𝑝𝑖is the probability of an attack on target𝑖. 𝐷is the defender, who commits to a strategyq = {𝑞1, 𝑞2, . . . , 𝑞𝑁}, where𝑞𝑖is the probability of protecting target 𝑖. We take ∑𝑖∈𝑇𝑞𝑖𝐶𝑚𝑖 ≤ 𝑀 to represent the constraint of the defender’s available resources, where𝑞𝑖𝐶𝑚𝑖 represents the resources allocated to target𝑖and𝑀represents the maximum quantity of available resources.
In this way, our problem is transformed into computing a reasonable defense probability distribution subject to the defender’s resource constraints based on known parameters, including the resource constraints, the number of targets, the reward for protection, the penalty of protection, the cost of protection, and the cost of attack for the set of targets.
Figure 1: The game theoretic resource allocation interface.
4. Model Formulation
To solve the given problem, we construct a Game Theoretic Resource Allocation (GTRA) model, as shown in Figure 1.
The input parameters are discussed in the above section. After the parameters are input, the proposed GTRA model com- putes the defender’s possible defense probability distribution and the attacker’s possible attack probability distribution.
For the computation process, a Stackelberg game is used to model the interaction between the defender and the attacker. Then, the game payoff functions are built from the input parameters and are designed as strategic rules. Next, the QR model is used to simulate an attacker of the adversarial nature. In addition, an iterative genetic algorithm is utilized to obtain the equilibrium game strategy.
4.1. Stackelberg Game. Game theory [16] is widely used to analyze problems in which all players who are in a conflict with a payoff attempt to win or to maximize their payoffs via changing their strategies based on the reactions of their adversaries. A Stackelberg game is a common game instance in which players select strategies sequentially: the leader moves first, and the follower responds accordingly.
In this paper, defender and attacker are the two rival roles.
They are in conflict over the targets’ security, and both attempt to maximize their own payoffs by allocating the fewest resources to the targets. Through game theoretical deduction, the defender first decides how to allocate resources to cover the targets; then, the attacker selects the targets to attack after
observing the defender’s strategy. The rivalry, the pursuit of the maximum payoffs, and the sequence of actions make our problem fit perfectly into the framework of a Stackelberg game; thus, the GTRA model is built based on a Stackelberg game.
In a Stackelberg game, each player selects the action with the greatest payoff, which is defined as the player’s return after taking the selected action. This payoff usually consists of reward, penalty and cost. In the proposed GTRA model, both the defender and the attacker can take two actions, so their payoffs for a target 𝑖 can be represented by a 2 × 2 payoff matrix, as shown in Table 2. Clearly, there are four cases corresponding to the attacker’s two actions (Attack or Not) and the defender’s two actions (Protect or Not), which are represented by the four cells. Each cell contains two values separated by a comma: the first is the attacker’s payoff, and the second is the defender’s. In contrast to previous payoff matrices, we include action cost to measure the resource allocation metric directly.
Case 1(attack, protect). The attacker launches an attack on target𝑖, and the defender protects it simultaneously. In this case, the attacker’s benefit is −𝛼𝑃𝑖𝑎 + (1 − 𝛼)𝑅𝑎𝑖, and the defender’s benefit is𝛼𝑃𝑖𝑎−(1−𝛼)𝑅𝑎𝑖, where𝛼is the accuracy of attack prediction,𝑃𝑖𝑎is the attack penalty, and𝑅𝑎𝑖 is the attack reward.
Case 2(attack, not protect). The attacker launches an attack on target 𝑖, and the defender does not protect it. In this
Table 2: Payoffs of the two players for target𝑖.
Protect(𝑞𝑖) Not Protect(1 − 𝑞𝑖) Attack
(𝑝𝑖) −𝛼𝑃𝑖𝑎+ (1 − 𝛼)𝑅𝑎𝑖 − 𝐶𝑎𝑖,
𝛼𝑃𝑖𝑎− (1 − 𝛼)𝑅𝑎𝑖− 𝐶𝑚𝑖 𝑅𝑎𝑖− 𝐶𝑎𝑖,
−𝑅𝑎𝑖 Not Attack
(1 − 𝑝𝑖) 0, −𝐶𝑚𝑖 0, 0
case, the attacker will not be punished, and its payoff is the difference between𝑅𝑎𝑖 and cost. The defender’s payoff is−𝑅𝑎𝑖 alone, without a cost, because no protection is attempted.
Case 3(not attack, protect). The attacker does not launch an attack on target𝑖, but the defender protects it. In this case, the attacker’s payoff is zero due to the absence of an attack, and the defender’s payoff is the negative value corresponding to the cost of the consumed security resources,−𝐶𝑚𝑖 , with no benefit.
Case 4(not attack, not protect). The attacker does not launch an attack on target𝑖, and the defender does not protect it. In this case, each player’s payoff is zero because neither performs an action.
To distinguish different targets, one work [36] considered targets with different noncorrelated security assets. Motivated by that study, we label targets with different security assets in the form of distinct rewards, penalties, and action costs for the defender and attacker. A player’s total payoff is the combination of the four separate cases. In combination with the attack probability, defense probability, and payoff items shown in Table 2, the total payoff functions of the defender and the attacker are given in (1) and (2), respectively. These two utility functions are different from the utility functions used in many previous studies because the players’ action costs are directly included in our utility functions.
𝑈𝑀= ∑
𝑖∈T
𝑝𝑖𝑞𝑖[𝛼𝑃𝑖𝑎− (1 − 𝛼) 𝑅𝑎𝑖 − 𝐶𝑚𝑖 ] − 𝑝𝑖(1 − 𝑞𝑖) 𝑅𝑎𝑖
− (1 − 𝑝𝑖) 𝑞𝑖𝐶𝑚𝑖
= ∑
𝑖∈T
𝑞𝑖[𝛼𝑝𝑖(𝑃𝑖𝑎+ 𝑅𝑎𝑖) − 𝐶𝑚𝑖 ] − 𝑝𝑖𝑅𝑎𝑖
(1)
𝑈𝐴= ∑
𝑖∈T
𝑝𝑖𝑞𝑖[−𝛼𝑃𝑖𝑎+ (1 − 𝛼) 𝑅𝑎𝑖 − 𝐶𝑎𝑖] + 𝑝𝑖(1 − 𝑞𝑖)
∗ (𝑅𝑎𝑖 − 𝐶𝑎𝑖)
= ∑
𝑖∈T
𝑝𝑖[−𝛼𝑞𝑖(𝑃𝑖𝑎+ 𝑅𝑎𝑖) + (𝑅𝑎𝑖 − 𝐶𝑎𝑖)]
(2)
For both the defender and the attacker, the objective of each player is to maximize that player’s own payoff by designing an optimal strategy. When both players achieve their maximum payoffs, the corresponding solution to the problem is called the Nash equilibrium [16].
Definition.Consider a game𝐺 = {𝑠1, .., 𝑠𝑛; 𝑢1, . . . , 𝑢𝑛}with𝑛 players. If, for a strategy profile{𝑠∗1, . . . , 𝑠∗𝑛}, the strategy𝑠∗𝑖 for
every player𝑖is either the optimal strategy for that player or a strategy that is no worse than any of the other(𝑛−1)strategies, then that strategy profile is called a Nash equilibrium (NE) strategy profile.
The NE of a Stackelberg game can be derived by applying backward induction [37], which involves reasoning from the end of a situation to determine the sequence of optimal strategies. In this context, we deduce the defender’s protection strategy in a forward manner from the attacker’s situation in each round, as follows.
Follower: Attacker Side.The attacker observes the defender’s strategy and designs a greedy strategy to maximize its payoff.
Formally, for any givenq∈ 𝑆𝑀, the attacker’s task is to solve the optimization problem in
p(q) =argmax 𝑈𝐴(p,q(p))
𝑆𝑢𝑏𝑗𝑒𝑐𝑡 𝑡𝑜 0 ≤ 𝑝𝑖≤ 1 (3) Leader: Defender Side.The defender knows that the attacker will respond greedily. Therefore, the defender designs a protection strategy based on the attacker’s potentially best response. Formally, the defender needs to solve the optimiza- tion problem in (4). The first constraint suggests that the total quantity of resources available to the defender is no more than 𝑀.
q(p) =argmax 𝑈𝑀(p(q) ,q) 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 𝑡𝑜 ∑
𝑖∈𝑇
𝑞𝑖𝐶𝑚𝑖 ≤ 𝑀
𝑎𝑛𝑑 0 ≤ 𝑞𝑖≤ 1
(4)
We derive the NE from the above two sequential steps.
We deriveq∗by solving (4); then,p∗is derived asp(q∗)by solving (3). Finally, the strategy combination(p∗,q∗)is the equilibrium game strategy for the Stackelberg game, and the final resources allocated to target𝑖will be𝑞∗∗ 𝐶𝑚𝑖 .
4.2. Quantal Response. The previous analysis is performed under the assumption that the attacker is perfectly rational and develops its strategy with complete knowledge of the defender’s strategy. However, in the real world, the attacker will not always be perfectly rational since the attacker cannot always know the defender’s strategy. Consequently, the defender is unsure whether the attacker will operate according to the predictive strategyp(q). If the attacker is not perfectly rational and chooses a strategy that deviates slightly from the rational strategy, the defender’s payoff may decrease.
Clearly, the defender is unwilling to accept a lower payoff while doing nothing.
To simulate the bounded rational adversary, many behav- ior models have been proposed, including quantal response (QR), SUQR, and prospect theory (PT), which are all com- monly used. The defender’s response to them in the game theoretic model has been done in our previous work [38] and we found the defender’s response to the QR model is the most careful where the defense probability is relatively bigger than
the other two bounded rational models. Hence, the QR model is introduced into the GTRA model to simulate the attacker’s adversarial nature and to improve the proposed model.
When the QR model is applied, the noise in a bounded rational attacker’s strategy is controlled by𝜆.𝜆 = 0represents a uniform random probability distribution over the attacker’s possible strategies, while 𝜆 → ∞ represents a perfectly rational attacker. Thus, the attacker’s probability of attacking target𝑖is changed to
𝑝𝑖= 𝑒𝜆𝑈𝐴(𝑞𝑖)
∑𝑛𝑗=1𝑒𝜆𝑈𝐴(𝑞𝑗) (5) Furthermore, the defender’s utility function becomes 𝑈𝑀= ∑
𝑖∈𝑇
𝑞𝑖∗ [𝛼𝑝𝑖(𝑃𝑖𝑎+ 𝑅𝑎𝑖) − 𝐶𝑚𝑖 ] − 𝑝𝑖𝑅𝑎𝑖
= ∑
𝑖∈𝑇
[ [
[𝛼𝑞𝑖(𝑃𝑖𝑎+ 𝑅𝑎𝑖) − 𝑅𝑎𝑖] ∗ 𝑒−𝜆[𝛼𝑞𝑖(𝑃𝑖𝑎+𝑅𝑎𝑖)−(𝑅𝑎𝑖−𝐶𝑎𝑖)]
∑𝑛𝑗=1𝑒−𝜆[𝛼𝑞𝑗(𝑃𝑗𝑎+𝑅𝑎𝑗)−(𝑅𝑎𝑗−𝐶𝑎𝑗)]
− 𝑞𝑖𝐶𝑚𝑖 ] ]
(6)
In summary, the proposed GTRA model is used to solve the defender’s problem of how to allocate𝑀units of resources to maximize the defender’s utility function, as illustrated in
maxq 𝑈𝑀 s.t. ∑
𝑖∈𝑇
𝑞𝑖𝐶𝑚𝑖 ≤ 𝑀
0 ≤ 𝑞𝑖≤ 1, ∀𝑖
(7)
4.3. Algorithm. Since the defender’s objective utility func- tion expressed in (7) corresponds to a nonlinear constraint problem, the optimal solution is extremely difficult to find.
As a classic algorithm for searching for an approximately optimal solution [39], the genetic algorithm (GA) provides an alternative approach. GA is a stochastic global search and optimization method that mimics natural biological evolution. However, the typical GA attempts to find a globally near-optimal solution instead of a globally optimal one.
Therefore, in this paper, we utilize Algorithm 1 to compute the defender’s equilibrium strategy in the proposed GTRA model.
In addition to the parameters of the utility function discussed in the previous section, the number of targets and the number of iterations and the resource constraint are initialized before the iteration process. In each iteration, we find the locally optimal strategy 𝑞𝑖 and the corresponding utility𝑈𝑑𝑖using the 𝐺𝐴()function in MATLAB. Then, we record the current maximum after each iteration. When the iteration number𝑖 reaches the given maximum 𝑡𝑖𝑚𝑒𝑠, the globally optimal strategy 𝑞∗𝑖 and the corresponding utility 𝑈𝑑∗𝑖 are obtained. In general, the probability of reaching the global optimum increases as the number of iterations𝑡𝑖𝑚𝑒𝑠 increases.
(1)Initialization:
number of targets→ 𝑁;
number of iterations→ 𝑡𝑖𝑚𝑒𝑠 = 10;
resource constraint→ 𝑀;
𝑈𝑑∗← −∞
(2)Iteration:
(3)while𝑖 < 𝑡𝑖𝑚𝑒𝑠do
(4) (𝑞𝑖, 𝑈𝑑𝑖) ← 𝐺𝐴(𝑀𝑢𝑙𝑡𝑖𝑂𝑏𝑗,𝑁, 𝑀) (5) if 𝑈𝑑𝑖> 𝑈𝑑∗ then
(6) 𝑈𝑑∗= 𝑈𝑑𝑖; (7) 𝑞∗= 𝑞𝑖; (8) end if (9) end while (10)return(𝑞∗, 𝑈𝑑∗)
Algorithm 1: Iterative Genetic Algorithm (IGA).
Table 3: Simplified payoffs for target𝑖.
Protect(𝑞) Not Protect(1 − 𝑞)
Attack (𝑝) 𝑎, 𝑏 𝑐, 𝑑
Not Attack (1 − 𝑝) 0, 𝑓 0, 0
To better understand the equilibrium game strategy, we illustrate the evolutionary behavior of the defender by adopting the phase plane of replicator dynamics [40]. First, tersely describe the payoff of the defender and attacker in every case; Table 2 is changed into Table 3 where𝑎 = −𝛼𝑃𝑖𝑎+ (1−𝛼)𝑅𝑎𝑖−𝐶𝑎𝑖,𝑏 = 𝛼𝑃𝑖𝑎−(1−𝛼)𝑅𝑎𝑖−𝐶𝑚𝑖 ,𝑐 = 𝑅𝑎𝑖−𝐶𝑎𝑖,𝑑 = −𝑅𝑎𝑖, and𝑓 = −𝐶𝑚𝑖 . Then, the replicator dynamics equations of the attacker and the defender are expressed as (8).𝑈𝐴and𝑈𝑀 represent the average payoffs. The evolutionary equilibrium can then be obtained by solving the following equations ̇𝑝 = 0 and ̇𝑞 = 0.
̇𝑝 = 𝑝 (𝑈𝐴− 𝑈𝐴) = 𝑝 (1 − 𝑝) [𝑞𝑎 + (1 − 𝑞) 𝑐 ]
̇𝑞 = 𝑞 (𝑈𝑀− 𝑈𝑀) = 𝑞 (1 − 𝑞) [𝑝 (𝑏 − 𝑑) + (1 − 𝑝) 𝑓] (8) Figure 2 presents the evolutionary equilibrium of the defender’s strategy, which can be seen as the process of adapting the initial strategy to the NE strategy. The smallest circle around the NE point (0.012, 0.2851) is the entire feasible region of the solution.
5. Numerical Study
Since the focus of this paper is to explore the impact of different parameter configurations, we perform the numer- ical analysis directly to validate the proposed method. We first compare the proposed utility function with the utility function that does not consider the action cost. Then, we compare the NE strategy that is computed based on the proposed utility function with four other resource allocation strategies. In each group of experiments, 100 game instances under the same conditions are considered, and the average value is taken as the result. In each game instance, the number
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 p
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
q
X: 0.01235 Y: 0.2851
Figure 2: Evolution process of the defender’s behavior.
Table 4: Four utility function scenarios.
No. Relation 𝐶𝑚 𝐶𝑎
1 𝐶𝑚𝑖 > 𝐶𝑎𝑖 𝛾 < 𝐶𝑚𝑖 < 2 ∗ 𝛾 0 < 𝐶𝑎𝑖 < 𝛾 2 𝐶𝑚𝑖 < 𝐶𝑎𝑖 0 < 𝐶𝑚𝑖 < 𝛾 𝛾 < 𝐶𝑎𝑖 < 2 ∗ 𝛾 3 𝐶𝑚𝑖 = 𝐶𝑎𝑖 0 < 𝐶𝑚𝑖 < 𝛾 𝐶𝑎𝑖 = 𝐶𝑚𝑖
4 𝑁𝑜𝐶𝑜𝑠𝑡 𝛾 = 0 𝛾 = 0
of iterations in Algorithm 1 is set to 10 (We conducted an experiment with 100 iterations and found that the maximum value was usually found within the first ten iterations.), and the maximum value is taken.
5.1. Comparison of Utility Functions. To assess the impact of the action cost on the strategy, we compare the utility func- tions with and without action cost, respectively. Specifically, to explore the influence of the relationship between the two players’ action costs on each player’s strategy, we design three groups of experiments, as shown in Table 4.
The first scenario corresponds to a utility function in which the cost of defense is greater than the cost of attack.
On the contrary, the second scenario corresponds to a utility function in which the cost of attack is greater than the cost of defense. The third scenario corresponds to a utility function in which the costs of attack and the cost of defense are equal and are greater than 0. The fourth scenario corresponds to the utility function used in previous works [5–10], in which the action costs𝐶𝑎𝑖 and𝐶𝑚𝑖 are 0 and the resource consumption is simply the sum of the defense probabilities.
We measure the solution quality in each scenario in terms of the defender’s average utility and the average effectiveness over all 100 game instances, where theeffectivenessis defined as the average number of protected targets per resource, as shown in (9). The growth rate, defined in (10), is used to measure the difference in solution quality between different utility function scenarios, where 𝑎 denotes the solution quality for a utility function without action cost and𝑏denotes the solution quality for a utility function that includes the action cost.
𝑒ff𝑒𝑐𝑡𝑖V𝑒𝑛𝑒𝑠𝑠 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑐𝑜V𝑒𝑟𝑒𝑑 𝑡𝑎𝑟𝑔𝑒𝑡𝑠 𝑐𝑜𝑛𝑠𝑢𝑚𝑒𝑑 𝑟𝑒𝑠𝑜𝑢𝑟𝑐𝑒𝑠 (9)
𝐺𝑟𝑜𝑤𝑡ℎ 𝑅𝑎𝑡𝑒 = 𝑎 − 𝑏𝑏 (10)
The parameters used in this paper are the same as those used in the previous study [9], where reward𝑅𝑎𝑖 is chosen randomly from a uniform distribution from 1 to 10, penalty𝑃𝑖𝑎 is chosen randomly from a uniform distribution from -10 to -1, and the resource constraint is proportional to the number of targets, 𝑀 = 𝛾 ∗ 𝑁. We assume that the total available resources, including the total cost, are insufficient to protect all targets. Therefore, the value of parameter𝛾is set to less than 1.
Figure 3 shows the solution quality results for the var- ious utility function scenarios introduced in Table 4 with different parameter configurations. The defender’s average utility, and the defender’s resource consumption along with the effectiveness are displayed on the y-axes. On the x-axes, Figures 3(a), 3(c), and 3(e) show the results of varying the number of targets (𝑁) while keeping the ratio (𝛾) of resources (𝑀) to 𝑁 fixed to 0.1. Figures 3(b), 3(d), and 3(f) show the results of varying the ratio of resources to targets while keeping the number of targets fixed at 200. The corresponding solution qualities in the various utility function scenarios are presented as groups of bars.
Figures 3(a), 3(c), and 3(e) show the following.(1)The defender’s utility (𝑈𝑚) increases as the number of targets (𝑁) increases. The utility is larger in the first scenario than other scenarios under the same conditions, and it is nearly stable in the fourth scenario, regardless of𝑁, which indicates that the greater cost of defense has a better effect on obtaining payoff under the same conditions. (2) The defender’s resource consumption increases as the number of targets (𝑁) increases. The resource consumption is larger in the first scenario than in the second and third scenarios, and it is strongly proportional to𝑁in the fourth scenario.
It illustrates the cumulative impact of action costs on a massive number of targets.(3)The effectiveness does not vary regularly with the number of targets (𝑁); it varies inversely with the resource consumption in the first scenario, in which the action cost is considered in the utility function, while nearly constant effectiveness is maintained in the fourth scenario. It suggests that when expending the same number of resources, the number of protected targets of the fourth scenario where the action costs are not considered in the utility function is the least.
Figures 3(b), 3(d), and 3(f) show the following.(1)The defender’s utility (𝑈𝑚) increases as the ratio of resources to the number of targets increases. The defender’s utility is larger than the second and third scenarios under the same conditions when 𝐶𝑚𝑖 > 𝐶𝑎𝑖, and it is nearly stable when there is no action cost in the utility function, regardless of the resource-to-target ratio. It reveals that the utility functions including action cost provide more utility than those without action cost. Moreover, the number of resources has a positive effect on the defender’s utility.(2)The defender’s resource consumption increases as the resource-to-target ratio increases; it is larger in the first scenario than in the second and third scenarios and it is strongly proportional to the resource-to-target ratio in the fourth scenario. It also
Cm-b igger-Cost
Equal-C ost
Ca-b igger-Cost
NoCost
Utility Functions 0
10 20 30
Defender's Utility (Um)
N = 50 N = 100 N = 200
N = 300 N = 400
(a) Defender’s utility comparison (ratio = 0.1)
Cm-b igger-Cost
Equal-Cost Ca-b
igger-C ost
NoCost
Utility Functions 0
10 20 30 40 50
Defender's Utility (Um)
ratio = 0.1 ratio = 0.2
ratio = 0.3 ratio = 0.4
(b) Defender’s utility comparison (N = 200)
N = 50 N = 100 N = 200
N = 300 N = 400 Cm-b
igger-Cost Equal-C
ost Ca-b
igger-Cost
NoCost
Utility Functions 0
10 20 30 40
Defender's Resource
(c) Resource comparison (ratio = 0.1)
ratio = 0.1 ratio = 0.2
ratio = 0.3 ratio = 0.4 Cm-b
igger-Cost
Equal-Cost
Ca-bigger-Cost
NoCost
Utility Functions 0
20 40 60 80
Defender's Resource
(d) Resource comparison (N = 200)
N = 50 N = 100 N = 200
N = 300 N = 400 Cm-b
igger-C ost
Equal-C ost
Ca-b igger-Cost
NoCost
Utility Functions 0
20 40 60
Effectiveness
(e) Effectiveness comparison (ratio = 0.1)
ratio = 0.1 ratio = 0.2
ratio = 0.3 ratio = 0.4 Cm-b
igger-Cost Equal-C
ost Ca-b
igger-C ost
NoCost
Utility Functions 0
20 40 60
Effectiveness
(f) Effectiveness comparison (N = 200) Figure 3: Solution quality comparison of different utility functions.
shows the cumulative impact of action costs on a massive number of targets. (3) The effectiveness decreases with an increasing resource-to-target ratio. The effectiveness varies inversely with resource consumption in all four scenarios, and it is smaller in the first scenario than in the second and third scenarios. It suggests that the effectiveness decreases with the increasing number of resources under the same conditions, and the effectiveness is the least when the utility does not include the action cost.
Furthermore, we illustrate the difference in solution quality when the utility function does not consider the action cost in Figure 4. Equation (10) shows that if the growth rate is less than zero, then 𝑎 is less than 𝑏. An overall comparative analysis of the results shown in Figure 4 indicates that when the utility function does not include the action cost, the defender’s utility and the effectiveness are lower and the defender’s resource consumption is larger.
The difference becomes especially evident as the number of resources increases (the resource-to-target ratio increases) or the number of targets increases (𝑁increases).
Overall, a utility function that considers the action cost, regardless of the relationship between defense and attack costs, provides the defender with a larger payoff and higher effectiveness. Additionally, although it is possible to represent the resources allocated to targets using the defense probabil- ities, as done in the utility functions implemented in many previous studies, sometimes the resource metric is not the same as the defense probability. For example, when 10 GB of storage is required to run intrusion prevention servers to protect a base station, this requirement cannot be represented as a probability. However, we can set 𝐶𝑚𝑖 = 10 directly, and the defense probability then represents the probability that this target may be covered by this server. Hence, adding the action cost into the utility function is beneficial. In the following sections, we present a series of comparative analyses of various strategies based on our proposed utility function that considers the action cost.
5.2. Comparison of Allocation Strategies. A system with high security requirements is considered; e.g., government systems usually require a high level of consistency and need to be able to resist various attacks. The defender is usually equipped with high-performance defense modules with powerful processing capabilities, so a relatively large protection reward and a small protection penalty, which can be represented as 𝑃𝑖𝑎 > 𝑅𝑎𝑖 [9], are chosen in our study.
Since all three scenarios regarding the relationship between the defense cost and the attack cost have a similar impact on the solution quality, we perform our further study based on the case in which the defense cost is less than the attack cost represented by𝐶𝑚𝑖 < 𝐶𝑎𝑖.
We varied the reward and penalty from 1 to 10, the action cost from 0.1 to 0.4, and the numerical gap between the reward (or penalty) and the cost was considered large. Here, we limit the gap between the reward (or penalty) and the cost by randomly choosing values from𝐶𝑚𝑖 ∈ [0.01, 0.02], 𝐶𝑎𝑖 ∈ [0.02, 0.03],𝑃𝑖𝑎 ∈ [1.4, 1.6], and𝑃𝑖𝑚 ∈ [0.4, 0.6]. These digits can be projected to the scenario that a unit of defense resource can protect at most 100 targets, and a unit of attack
resource can attack at most 50 targets. If an attack fails, the attacker will get a penalty about 1.4. And if the protection fails, the defender will get a penalty about 0.4. In this case, the attacker can be seen as the type of risk-averse player who aims to minimize the risk loss [12].
To further evaluate the utility function which includes the action cost, we simulate four extreme resource allocation strategies in which the defender does not follow the NE strategy.
5.2.1. PartOneS Strategy. The defender cannot protect all the targets due to resource limitations, so it must select at most 𝑘targets to protect. The remaining 𝑁 − 𝑘 targets are not protected. The defense probability distribution is obtained from (11). In this strategy,𝑀available units of resources are consumed.
𝑞𝑖= {{ {{ {{ {{ {{ {
1, 𝑖 = 1, . . . , 𝑘 − 1;
(𝑀 − ∑𝑘−1𝑗=1𝑞𝑗∗ 𝐶𝑚𝑗 )
𝐶𝑚𝑖 , 𝑞𝑗= 1, 𝑖 = 𝑘;
0, 𝑖 = 𝑘 + 1, . . . , 𝑛.
(11)
5.2.2. Rand Strategy. The defender protects targets following a random probability distribution according to (12). In this strategy, the quantity of resources consumed is no more than 𝑀.
𝑞𝑖= 𝑅𝑎𝑛𝑑 (𝑞𝑖) ∗ 𝑀
∑𝑛𝑗=1𝑅𝑎𝑛𝑑 (𝑞𝑗) ∗ 𝐶𝑚𝑗 , 𝑖 = 1, 2, . . . , 𝑛 (12) 5.2.3. Average Strategy. Resources are allocated to each target equally; the defense probability distribution obeys (13). In this strategy, all𝑀available units of resources are consumed.
𝑞𝑖∗ 𝐶𝑚𝑖 = 𝑀
𝑛, 𝑖 = 1, 2, . . . , 𝑛 (13) 5.2.4. AllOneS Strategy. The resource limitation is relaxed, and the defender protects all targets, as expressed in (14), which is approximately abstracted asAllOneS. In this strategy, the quantity of resources consumed is greater than𝑀.
𝑞𝑖= 1, 𝑖 = 1, 2, . . . , 𝑛 (14) Our strategy obtained based on the proposed GTRA model is similar to (15). It is computed using the IGA given in Algorithm 1. In our strategy, the quantity of resources consumed is no more than𝑀.
q= 𝐼𝐺𝐴 (𝑈𝑀, 𝑛, 𝑀) ,
∑ 𝑞𝑖∗ 𝐶𝑚𝑖 ≤ 𝑀 (15)
We start by comparing the utility of the defender with that of the attacker. The defender’s resources are considered to be limited, whereas the attacker’s resources are unlimited. One hundred game instances, with the number of targets ranging from 10 to 1000 in increments of 10, are considered.
400 300 200 100 50
The number of targets (N) 0
2 4 6 8
Growth rate of Defender's Utility
Cm-bigger-Cost Equal-Cost Ca-bigger-Cost
(a) Defender’s utility comparison (ratio = 0.1)
0.1 0.2 0.3 0.4
The ratio of resources (M) to targets (N)
−1.2
−1
−0.8
−0.6
−0.4
−0.2 0
Growth rate of Defender's Utility
Cm-bigger-Cost Equal-Cost Ca-bigger-Cost
(b) Defender’s utility comparison (N = 200)
50 100 200 300 400
The number of targets (N) 0
1 2 3 4 5 6 7
Growth rate of Defense Resource
Cm-bigger-Cost Equal-Cost Ca-bigger-Cost
(c) Resource comparison (ratio = 0.1)
0.1 0.2 0.3 0.4
The ratio of resources (M) to targets (N) 0
2 4 6 8 10
Growth rate of Defense Resource
Cm-bigger-Cost Equal-Cost Ca-bigger-Cost
(d) Resource comparison (N = 200)
50 100 200 300 400
The number of targets (N) Cm-bigger-Cost
Equal-Cost Ca-bigger-Cost
−1.2
−1
−0.8
−0.6
−0.4
−0.2 0
Growth rate of Resource Effectiveness
(e) Effectiveness comparison (ratio = 0.1)
0.1 0.2 0.3 0.4
The ratio of resources (M) to targets (N)
−1.2
−1
−0.8
−0.6
−0.4
−0.2 0
Growth rate of Resource Effectiveness
Cm-bigger-Cost Equal-Cost Ca-bigger-Cost
(f) Effectiveness comparison (N = 200) Figure 4: Difference in solution quality when the utility function does not consider the action cost.
−14
−12
−10
−8
−6
−4
−2 0 2
defender's utility
NE PartOnes Rand
Average AllOnes
X: 120 Y: -0.4879
X: 312 Y: -2.871
100 200 300 400 500 600 700 800 900 1000 0
the number of targets -- N
Figure 5: Comparison of the defender’s utility.
The defender’s utility results for the different strategies are displayed in Figure 5. The vertical axis represents the defender’s utility 𝑈𝑀, and the horizontal axis shows the number of targets𝑁.
When the number of targets is below 120,AllOneSpro- vides the defender with the greatest utility. However, when the number of targets exceeds 312,AllOneSprovides the defender with the smallest utility because of the higher resource consumption. The NE strategy based on our GTRA model outperforms the other four strategies when the number of targets is greater than 120.
Interestingly, the defender’s utility decreases with an increasing number of targets in Figure 5, while the opposite trend is seen in Figure 3. The difference between these two configurations is the range of parameters. When the reward or penalty is much larger than the cost, the defender’s utility is larger and the impact of the number of targets is directly proportional to the utility. By contrast, when the reward or penalty is only slightly larger than the cost, the defender’s utility is smaller and potentially even negative. In this case, the impact of the number of targets is directly proportional to the utility. Hence, the parameter configuration, such as the gap between the reward (or penalty) and cost, influences the defender’s utility. Regardless of the parameter configuration, the NE strategy based on our GTRA model is better than the other strategies in terms of the defender’s utility.
5.3. Comparison in Terms of Various Evaluation Criteria. The NE strategy is obtained by finding the maximum utility for both players. In this subsection, we evaluate the vulnerability, coverage, and effectiveness of our equilibrium strategy and the other four strategies.
5.3.1. Vulnerability. We first evaluate the vulnerability of the defender’s various strategies. TheV𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦is defined as a risk indicator for the targets as shown in [41]
V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦 = 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 −f𝑎𝑖𝑙𝑢𝑟𝑒
𝑠𝑢𝑐𝑐𝑒𝑠𝑠 +f𝑎𝑖𝑙𝑢𝑟𝑒 (16)
the number of targets -- N
−1
−0.8
−0.6
−0.4
−0.2 0 0.2 0.4 0.6 0.8 1
Vulnerability
NE PartOnes AllOnes
Rand Average
200 400 600 800 1000
0
Figure 6: Vulnerability of 100 groups of instances.
where 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 and 𝑓𝑎𝑖𝑙𝑢𝑟𝑒 denote the numbers of targets in which an attack that is launched is not detected or is detected, respectively. The assumption is made that if the defender allocates resources to protect a target, then that target will be successfully protected against attack by the continuously operating defense system; otherwise, the attack will be successful. Clearly, a greater 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 value and a lower 𝑓𝑎𝑖𝑙𝑢𝑟𝑒 value indicate a more vulnerable strategy. Hence, the lower the V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦, the better the strategy.
Figure 6 shows that the V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦ofAllOneSis −1, which implies that the number of successes is 0. In this situation, the targets are the most secure. The defender’s protections cover all the targets, resulting in the most secure environment. When the number of targets (N) is small, the NEstrategy achieves the most secure state; asN increases, the V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦 increases because the available resources become insufficient to protect all targets. Additionally, once N is greater than 400, theV𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦ofNE varies only slightly. These analyses reveal that NEcan be scaled up to protect a large number of targets. Furthermore, compared withPartOneS,Rand, andAverage, as the number of targets to protect increases such that there are insufficient resources to protect all of them,NEperforms better. It enables control of the trade-off between the security benefit and the resource consumption and focuses on protecting targets that are more likely to be attacked.
As a result, NE performs better than all the other strategies except forAllOneSin terms of theV𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦of the targets.
5.3.2. Coverage. We evaluate the allocated resources’ cov- erage of the targets next. The coverage is defined as the proportion of protected targets among the total targets, as shown in (17), where the protected targets are defined as those that are attacked by attacker and also protected by defender, those that are not attacked but protected, and those that are not attacked and not protected, either, as denoted byAP,NP, andNF, respectively, in Table 5.
the number of targets -- N 0.3
0.4 0.5 0.6 0.7 0.8 0.9 1
Coverage
NE PartOnes AllOnes
Rand Average
200 400 600 800 1000
0
Figure 7: Coverage in 100 groups of instances.
Table 5: Protection type of target𝑖.
Protect Fail to Protect
Attack AP AF
Not Attack NP NF
𝑐𝑜V𝑒𝑟𝑎𝑔𝑒 =𝐴𝑃 + 𝑁𝑃 + 𝑁𝐹
𝑁 (17)
Figure 7 shows the coverage results for the five strategies for 100 groups of experimental instances.AllOneScovers all targets by protecting all of them, whereasNEcovers almost the fewest targets because NE protects only risky targets attractive to the attacker. Thus, the number of protected targets is smaller than the other strategies. The disadvantage of this strategy is that it cannot guarantee the absolute security of the targets, in contrast to AllOneS. However, it may be useful for saving resources or improving the effectiveness with which those resources are used, especially when resources are valuable or limited.
5.3.3. Effectiveness. We now evaluate the effectiveness of the five strategies. The greater the effectiveness is, the better the strategy is. For 100 groups of experimental instances, the quantities of resources consumed by each strategy are plotted in Figure 8(b). It is noteworthy thatAllOneSconsumes the most resources and NEconsumes the least resources. The resources consumed byPartOneSandAverageare equivalent since these two strategies use all the available resources.
When the number of targets is increased to 1000, the quantity of resources consumed byAllOneSis close to four times the resources consumed byNE. When these values are applied to the real world, they represent a large amount of material or financial resources that must be expended by the defender.
Consequently, our strategy aims to provide high effectiveness.
In Figure 8(a), there is an evident upward trend in the effectiveness ofNEwhen the number of targets is less than 200, which then gradually drops to a stable value with an increasing number of targets. Moreover,NEhas the highest effectiveness among all five strategies. These results suggest
that increasing the number of targets does not affect the effectiveness. In addition, althoughAllOneSprotects the most targets, its effectiveness is lower than that ofNEbecause it consumes more resources.AllOneSmay protect some targets that are not likely to be attacked, which may cause resources to be consumed without gaining benefits, thus decreasing the defender’s effectiveness.
Now, we combine the number of targets, coverage, and vulnerability in Figure 9(a) and combine the number of targets, effectiveness, and vulnerability in Figure 9(b). From Figure 9, it can be concluded that more targets must be protected to maintain a low vulnerability or to decrease the vulnerability. However, if the defender increases the number of protected targets, more resources will be required. Take AllOneSas an example. The vulnerability of the targets is near zero, and the number of protected targets is the largest, but the number of covered targets per resource is low because of the high resource consumption. In this situation, to improve the security of the targets, the NE strategy obtained based on the proposedNEmodel, which balances the security utility and the resource consumption, is the best choice for allowing the defender to utilize limited resources effectively.
5.4. Parameter Analysis. The security utilities of the defender and the attacker are related not only to their strategies but also to certain specific parameters: the resource constraint𝑀, the prediction accuracy𝛼, and the noise𝜆in the attacker’s rationality.
5.4.1. Resource Constraint𝑀. We generate 100 random game instances with 1000 targets and consider different quantities of resources to assess the impact of the resource constraint 𝑀on the players’ utilities. In Figure 10, the x-axis shows the proportion of available resources relative to the maximum resources required, and the y-axis represents the players’
utilities.
Figure 10 shows that when the resource proportion is zero, the defender’s utility is the lowest and the attacker’s utility is the highest. As the proportion of available resources increases, the defender’s utility increases and the attacker’s utility decreases. When the proportion reaches 40%, the utilities of both players become stable. Hence, we conclude that, for the case in which 𝑅𝑚𝑖 > 𝑃𝑖𝑚 and 𝑃𝑖𝑚 > 𝐶𝑚𝑖 , 40% of the maximum resources are an efficient rate of utilization for the defender. When the proportion is greater than 40%, both players’ utilities remain approximately stable.
The jitter in the raw data is due to the aggregated analysis of resource consumption, which demonstrates that spending more resources to protect targets may be less risky, but the cost of the resources consumed will exceed the benefit.
The proposedNE model can compute the correspond- ing best resource proportions for different combinations of reward, penalty, and cost. Therefore, the proposed model offers the defender an alternative means of gaining greater utility while saving resources, thereby improving the defender’s outcome from the perspective of economics.
When the defender needs to estimate the overall quantity of resources required to protect a massive number of targets, the proposed GTRA model can be used to compute the