Allocating Limited Resources to Protect a Massive Number of Targets Using a Game Theoretic Model

(1)

Research Article

Allocating Limited Resources to Protect a Massive Number of Targets Using a Game Theoretic Model

Xu Liu ,

^1,2

Xiaoqiang Di ,

^1,2

Jinqing Li,

^1,2

Huan Wang,

^1,2

Jianping Zhao,

^1,2

Huamin Yang ,

^1,2

Ligang Cong,

^1,2

and Yuming Jiang

³

1School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China

2Jilin Province Key Laboratory of Network and Information Security, Changchun, China

3Department of Information Security and Communication Technology, Norwegian University of Science and Technology, Trondheim, Norway

Correspondence should be addressed to Xiaoqiang Di; [email protected] Received 7 September 2018; Accepted 20 February 2019; Published 13 March 2019

Academic Editor: Neale R. Smith

Copyright © 2019 Xu Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Resource allocation is the process of optimizing the rare resources. In the area of security, how to allocate limited resources to protect a massive number of targets is especially challenging. This paper addresses this resource allocation issue by constructing a game theoretic model. A defender and an attacker are players and the interaction is formulated as a trade-off between protecting targets and consuming resources. The action cost which is a necessary role of consuming resource is considered in the proposed model.

Additionally, a bounded rational behavior model (quantal response: QR), which simulates a human attacker of the adversarial nature, is introduced to improve the proposed model. To validate the proposed model, we compare the different utility functions and resource allocation strategies. The comparison results suggest that the proposed resource allocation strategy performs better than others in the perspective of utility and resource effectiveness.

1. Introduction

Resource allocation has always been a complex problem, especially when driven by security requirements. How to devise a mechanism to control the trade-off between the cost of protection and the achieved security utility is an open challenge [1]. In the AWS re:Invent 2014, the AWS engineer claimed that Amazon had nearly 28 total sets across the world, each of which has one or more data centers with a typical facility containing 50,000 to 80,000 servers [2]. To protect these servers against attack and maintain their consistent operation, cloud providers will implement security strategy.

For example, they can protect targets (e.g., virtual machines, VMs) by setting up resource reservations to analyze the operation of targets and then respond the attack quickly, which is followed by a lot of resource consumption [3].

Therefore, a trade-off problem could be abstracted between consuming resources and protecting targets. Especially, when the number of available resources or resource budget is fixed and limited for all the targets, how to allocate limited

resources to protect a massive number of targets is a vital issue in the security area.

The extreme approach may be to allocate security resources to cover all the targets [4], for instance, setting up the full resource reservations for all the VMs, which will lead to almost double resource consumption. The common approach may be to protect those targets with the most value [5], for instance, setting up the resource reservation for the VMs that store the most data or the sensitive data (e.g., financial data). The former approach fails to consider resource constraints and effectiveness; however, the available resources may not be sufficient to protect all the targets on the one hand; on the other hand, resources allocated to some empty targets may be inefficient. The latter approach does not account for the adversarial nature and perspective-taking of the attacker. An attacker who can learn about a defender’s possible target protection strategy can exploit this knowledge to launch an attack on the targets that the defender does not protect.

Volume 2019, Article ID 5475341, 16 pages https://doi.org/10.1155/2019/5475341

(2)

This paper focuses on developing a general resource allocation method to address the trade-off between security gain and resource consumption. The goal is to resolve the problem of how to utilize limited resources to efficiently protect massive targets against attack. How to build a mathematical model to describe this problem is the key, for example,(1) how to maintain security while allocating resources and(2) how to simulate an attacker of the adversarial nature.

In the previous studies [5–10], the number of allocated resources is measured by defense probability. But the impor- tance and emphasis of resource allocation weaken in such scheme. In general, performing different actions on a target will result in different outcomes. If an action is successful, the actor will obtain some benefit as a reward; otherwise, the actor will lose some assets as a penalty. No matter whether an action is successful, the actor will incur some cost by taking the action. Recent studies about the effort of deterrence and risk preferences in the security games [11, 12] have analyzed the impact of risk preference on the defense effort and deterrence level and the impact of defender’s cost on the investment strategy. Meanwhile, statistics show that a large data center costs between $10 million and $25 million per year and the corresponding maintenance costs account for nearly 80% of its total cost [13]. So it is clear that action cost is an important factor which cannot be ignored. By combining the rewards, penalties, costs, and probabilities of actions in some manner, it may be possible to describe our problem.

Game theory, an important tool for analyzing real world resource allocation problems, such as the assignment of cyber analysts [5] and patrolling strategies [14, 15], provides an alternative solution. However, in most of the previous studies [6–9] on game theoretic resource allocation, only the reward and penalty associated with an action have been included in the game utility function, but the action cost has been ignored. In the real world, no matter what one wants to do, an action cost is often necessary. This cost might be measured in monetary units, physical resources, abstract resources, and so on. Whatever it is, it can be abstracted as a mathematical expression. Hence, we include cost additionally in the Stackelberg game [16] utility function and analyze the impact of different parameter value configurations on the defender’s utility.

Since both the defender and the attacker are intelligent and have the perspective-taking ability, we consider an interaction in which the defender designs a resource allocation strategy first and the attacker subsequently develops an attack strategy. Although the attacker has the ability to consider the situation from the perspective of the defender, the attacker might also take abrupt actions that lie outside the defender’s expectations. This type of attacker, who is of the adversarial nature, can be simulated by the quantal response (QR) model, which has received widespread support in the literature on modeling human behavior in games [17]. In this paper, we introduce it into the proposed Game Theoretic Resource Allocation (GTRA) model to simulate adversarial reasoning.

The efficient resource allocation strategy for the defender is obtained from an optimization algorithm. Three indica- tors, namely, vulnerability, coverage, and effectiveness, are designed to evaluate the effectiveness of our strategy. We

compare the equilibrium strategy based on the proposed GTRA with the one based on a game utility function without considering the action cost, and also compare with four extreme resource allocation strategies, namely average allocation strategy, partial allocation strategy, random allocation strategy, and full-coverage strategy. The experimental results demonstrate the effectiveness of our proposed GTRA model.

The contributions of this paper can be summarized as follows:

(1) To emphasize the action cost in resource consumption. The players’ action costs are included in the game utility function as an independent item. The numerical analyses prove that this type of resource measurement can improve the utility and effectiveness.

(2) To better balance target security and resource consumption. The obtained Nash equilibrium strategy is selected as the defender’s resource allocation strategy because it outperforms the other extreme resource allocation strategies in terms of both security and effectiveness.

(3) The constructed GTRA model provides advice based on the target parameters to assist in determining the appropriate quantity of resources to protect a massive number of targets.

The remainder of this paper is organized as follows.

Sections 2 and 3 describe the related work on resource allocation and our problem, respectively. Game theoretic model, QR model, and the proposed algorithm are presented in Section 4. The numerical analyses are discussed in Sec- tion 5. The final section summarizes the paper and outlines directions for future work.

2. Related Work

Resource allocation is defined as the economical distribution of resources among competing groups of people or programs [1]. Game theory has been applied in resource allocation to better capture the interaction between resource provider and user and shows the economic nature of resource allocation. The previous studies can be roughly classified into two categories based on the different participants considered: security-driven resource allocation between a resource provider and an attacker; and demand-driven resource allocation between a resource provider and a legitimate user.

Demand-driven resource allocation can be further subdi- vided into cost-scheme-based, performance-scheme-based, and mixed-scheme-based resource allocation. The original pricing scheme is used for the allocation of resources of a single type, such as bandwidth [18–20], offload [21, 22], or cache [23]. With the development of the Internet, resource provider could provide nearly all the resources that users need, such as cloud computing provider provides on-demand resources including storage, memory, and bandwidth. Mul- tiresource pricing schemes [24], such as the cost-optimized scheme considering multiple resources [25], have emerged.

Meanwhile, since user requests are becoming necessary

(3)

while providing service, some research has focused on user- demand-driven resource allocation [26–28]. Later, the cost- optimized and performance-based schemes are combined to allow a resource provider to achieve a win-win objective in which resource provider obtains the maximum profit while the user receives the best experience [13, 29–31].

However, during the pursuit of the best experience and the maximum benefit, security issues increase and security- driven resource allocation become a research hotspot, especially when resources are limited and cannot cover all the targets that require protection. The American Institute Teamcore conducted a project with the theme of “AI and game theory for public safety and security”, and their achieve- ments have been applied in various areas. ARMOR [32] was deployed to develop randomized checkpoints and a patrol route strategy at Los Angeles International Airport. GUARDS [33] was developed to assist airports in allocating limited air police resources to protect more than 400 United States airports. Federal Air Marshals used IRIS [34] to provide scheduling coverage for potential attacks. PROTECT [35] was deployed to generate randomized patrolling schedules for the US Coast Guard. These cases are typical instances of limited security resource allocation using game theoretic model.

Other game theoretic studies have also produced good results. One study [5] investigated an intelligent allocation method for assigning limited cyberanalysts to analyze a massive number of security alerts in a network. Another work [10] developed new models and algorithms that could scale to highly complex instances of limited security resource allocation games. Their new methods performed faster than known algorithms when solving massive security games. In further research [6] based on a previous work [7], efficient algorithms were developed to compute the best responses of security forces to different adversary models when resources are limited, and it was proven that the proposed response strategy was superior because it relaxed the assumption of perfect rationality. An additional study [8] proposed a game theoretic scheme for developing dynamic and randomized security strategies that consider an adversary’s surveillance capabilities. The experimental results showed that the proposed algorithm outperformed the existing approaches.

Although these works have utilized the nature and principles of game theory to determine optimal resource allocation strategies, most of them considered only rewards and penalties in their allocation strategies. Recent works [11, 12] specially examined the effect of risk preferences on deterrence and analyzed the impact of the defender’s cost on its investment, which demonstrated that the cost of actions cannot be ignored. Nonetheless, in the previous works [5–10], the action cost was measured by defense probability simply, which inclines to analyze the impact of defense instead of action cost. Therefore, the game theoretic approach that includes the action cost independently is required to perform the resource allocation in the security area.

3. Problem Description

This paper considers a common scenario of a defender and an attacker. The defender’s responsibility is to protect the

Table 1: Parameter descriptions.

Parameter Description

𝑇 set of targets

𝑁 number of targets in𝑇

𝐴 attacker

𝐷 defender

𝑝_𝑖 attack probability for target𝑖 𝑞_𝑖 defense probability for target𝑖 𝐶^𝑚_𝑖 resources allocated to protect target𝑖

𝑀 maximum quantity of available resources

security of 𝑁 targets using 𝑀 resources, so it allocates resources to targets as its action. By contrast, the attacker’s intention is to attack the targets, and such attack also costs resources. For both sides, the benefit of consuming resources can be measured in terms of the security gain. The resources can be computing, storage, energy, or even monetary units, and the security gain indicates the return of protecting the targets by consuming resource. Although the units of resources and returns are different, they can be abstracted into the numerical value by mathematical methods. In this paper, we put emphasis on analyzing the relationship between them by setting various parameter configurations to simulate the different scenarios. For example, if the defender allocates resources to a target𝑖, this target will be relatively more secure than a target without being covered by resources, which can be configured with a bigger security gain.

Therefore, the defender obtains a security gain by expending resources, which can be abstracted as a limited resource allocation problem; that is, the problem of how the defender should allocate𝑀resources to protect𝑁when𝑀is far less than𝑁. The defender wants to achieve the greatest security gain while minimizing resource consumption. Therefore, this is a trade-off problem between protecting targets and consuming resources. Table 1 lists the parameters used in this paper.𝑇 = {1, . . . , 𝑖, . . . , 𝑁}is the set of active targets;

𝑖denotes one target; 𝑅^𝑚_𝑖 and 𝑃_𝑖^𝑚 are the defender’s reward and penalty, respectively, for an attack on this target; and 𝐶^𝑎_𝑖 and 𝐶^𝑚_𝑖 are the resources required to be expended by the attacker and the defender, respectively, to best protect target 𝑖. 𝐴is the attacker, who commits to a strategyp = {𝑝₁, 𝑝₂, . . . , 𝑝_𝑁}, where𝑝_𝑖is the probability of an attack on target𝑖. 𝐷is the defender, who commits to a strategyq = {𝑞₁, 𝑞₂, . . . , 𝑞_𝑁}, where𝑞_𝑖is the probability of protecting target 𝑖. We take ∑_𝑖∈𝑇𝑞_𝑖𝐶^𝑚_𝑖 ≤ 𝑀 to represent the constraint of the defender’s available resources, where𝑞_𝑖𝐶^𝑚_𝑖 represents the resources allocated to target𝑖and𝑀represents the maximum quantity of available resources.

In this way, our problem is transformed into computing a reasonable defense probability distribution subject to the defender’s resource constraints based on known parameters, including the resource constraints, the number of targets, the reward for protection, the penalty of protection, the cost of protection, and the cost of attack for the set of targets.

(4)

Figure 1: The game theoretic resource allocation interface.

4. Model Formulation

To solve the given problem, we construct a Game Theoretic Resource Allocation (GTRA) model, as shown in Figure 1.

The input parameters are discussed in the above section. After the parameters are input, the proposed GTRA model com- putes the defender’s possible defense probability distribution and the attacker’s possible attack probability distribution.

For the computation process, a Stackelberg game is used to model the interaction between the defender and the attacker. Then, the game payoff functions are built from the input parameters and are designed as strategic rules. Next, the QR model is used to simulate an attacker of the adversarial nature. In addition, an iterative genetic algorithm is utilized to obtain the equilibrium game strategy.

4.1. Stackelberg Game. Game theory [16] is widely used to analyze problems in which all players who are in a conflict with a payoff attempt to win or to maximize their payoffs via changing their strategies based on the reactions of their adversaries. A Stackelberg game is a common game instance in which players select strategies sequentially: the leader moves first, and the follower responds accordingly.

In this paper, defender and attacker are the two rival roles.

They are in conflict over the targets’ security, and both attempt to maximize their own payoffs by allocating the fewest resources to the targets. Through game theoretical deduction, the defender first decides how to allocate resources to cover the targets; then, the attacker selects the targets to attack after

observing the defender’s strategy. The rivalry, the pursuit of the maximum payoffs, and the sequence of actions make our problem fit perfectly into the framework of a Stackelberg game; thus, the GTRA model is built based on a Stackelberg game.

In a Stackelberg game, each player selects the action with the greatest payoff, which is defined as the player’s return after taking the selected action. This payoff usually consists of reward, penalty and cost. In the proposed GTRA model, both the defender and the attacker can take two actions, so their payoffs for a target 𝑖 can be represented by a 2 × 2 payoff matrix, as shown in Table 2. Clearly, there are four cases corresponding to the attacker’s two actions (Attack or Not) and the defender’s two actions (Protect or Not), which are represented by the four cells. Each cell contains two values separated by a comma: the first is the attacker’s payoff, and the second is the defender’s. In contrast to previous payoff matrices, we include action cost to measure the resource allocation metric directly.

Case 1(attack, protect). The attacker launches an attack on target𝑖, and the defender protects it simultaneously. In this case, the attacker’s benefit is −𝛼𝑃_𝑖^𝑎 + (1 − 𝛼)𝑅^𝑎_𝑖, and the defender’s benefit is𝛼𝑃_𝑖^𝑎−(1−𝛼)𝑅^𝑎_𝑖, where𝛼is the accuracy of attack prediction,𝑃_𝑖^𝑎is the attack penalty, and𝑅^𝑎_𝑖 is the attack reward.

Case 2(attack, not protect). The attacker launches an attack on target 𝑖, and the defender does not protect it. In this

(5)

Table 2: Payoffs of the two players for target𝑖.

Protect(𝑞_𝑖) Not Protect(1 − 𝑞_𝑖) Attack

(𝑝_𝑖) −𝛼𝑃_𝑖^𝑎+ (1 − 𝛼)𝑅^𝑎_𝑖 − 𝐶^𝑎_𝑖,

𝛼𝑃_𝑖^𝑎− (1 − 𝛼)𝑅^𝑎_𝑖− 𝐶^𝑚_𝑖 𝑅^𝑎_𝑖− 𝐶^𝑎_𝑖,

−𝑅^𝑎_𝑖 Not Attack

(1 − 𝑝_𝑖) 0, −𝐶^𝑚_𝑖 0, 0

case, the attacker will not be punished, and its payoff is the difference between𝑅^𝑎_𝑖 and cost. The defender’s payoff is−𝑅^𝑎_𝑖 alone, without a cost, because no protection is attempted.

Case 3(not attack, protect). The attacker does not launch an attack on target𝑖, but the defender protects it. In this case, the attacker’s payoff is zero due to the absence of an attack, and the defender’s payoff is the negative value corresponding to the cost of the consumed security resources,−𝐶^𝑚_𝑖 , with no benefit.

Case 4(not attack, not protect). The attacker does not launch an attack on target𝑖, and the defender does not protect it. In this case, each player’s payoff is zero because neither performs an action.

To distinguish different targets, one work [36] considered targets with different noncorrelated security assets. Motivated by that study, we label targets with different security assets in the form of distinct rewards, penalties, and action costs for the defender and attacker. A player’s total payoff is the combination of the four separate cases. In combination with the attack probability, defense probability, and payoff items shown in Table 2, the total payoff functions of the defender and the attacker are given in (1) and (2), respectively. These two utility functions are different from the utility functions used in many previous studies because the players’ action costs are directly included in our utility functions.

𝑈_𝑀= ∑

𝑖∈T

𝑝_𝑖𝑞_𝑖[𝛼𝑃_𝑖^𝑎− (1 − 𝛼) 𝑅^𝑎_𝑖 − 𝐶^𝑚_𝑖 ] − 𝑝_𝑖(1 − 𝑞_𝑖) 𝑅^𝑎_𝑖

− (1 − 𝑝_𝑖) 𝑞_𝑖𝐶^𝑚_𝑖

= ∑

𝑖∈T

𝑞_𝑖[𝛼𝑝_𝑖(𝑃_𝑖^𝑎+ 𝑅^𝑎_𝑖) − 𝐶^𝑚_𝑖 ] − 𝑝_𝑖𝑅^𝑎_𝑖

(1)

𝑈_𝐴= ∑

𝑖∈T

𝑝_𝑖𝑞_𝑖[−𝛼𝑃_𝑖^𝑎+ (1 − 𝛼) 𝑅^𝑎_𝑖 − 𝐶^𝑎_𝑖] + 𝑝_𝑖(1 − 𝑞_𝑖)

∗ (𝑅^𝑎_𝑖 − 𝐶^𝑎_𝑖)

= ∑

𝑖∈T

𝑝_𝑖[−𝛼𝑞_𝑖(𝑃_𝑖^𝑎+ 𝑅^𝑎_𝑖) + (𝑅^𝑎_𝑖 − 𝐶^𝑎_𝑖)]

(2)

For both the defender and the attacker, the objective of each player is to maximize that player’s own payoff by designing an optimal strategy. When both players achieve their maximum payoffs, the corresponding solution to the problem is called the Nash equilibrium [16].

Definition.Consider a game𝐺 = {𝑠₁, .., 𝑠_𝑛; 𝑢₁, . . . , 𝑢_𝑛}with𝑛 players. If, for a strategy profile{𝑠^∗₁, . . . , 𝑠^∗_𝑛}, the strategy𝑠^∗_𝑖 for

every player𝑖is either the optimal strategy for that player or a strategy that is no worse than any of the other(𝑛−1)strategies, then that strategy profile is called a Nash equilibrium (NE) strategy profile.

The NE of a Stackelberg game can be derived by applying backward induction [37], which involves reasoning from the end of a situation to determine the sequence of optimal strategies. In this context, we deduce the defender’s protection strategy in a forward manner from the attacker’s situation in each round, as follows.

Follower: Attacker Side.The attacker observes the defender’s strategy and designs a greedy strategy to maximize its payoff.

Formally, for any givenq∈ 𝑆_𝑀, the attacker’s task is to solve the optimization problem in

p(q) =argmax 𝑈_𝐴(p,q(p))

𝑆𝑢𝑏𝑗𝑒𝑐𝑡 𝑡𝑜 0 ≤ 𝑝_𝑖≤ 1 (3) Leader: Defender Side.The defender knows that the attacker will respond greedily. Therefore, the defender designs a protection strategy based on the attacker’s potentially best response. Formally, the defender needs to solve the optimization problem in (4). The first constraint suggests that the total quantity of resources available to the defender is no more than 𝑀.

q(p) =argmax 𝑈_𝑀(p(q) ,q) 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 𝑡𝑜 ∑

𝑖∈𝑇

𝑞_𝑖𝐶^𝑚_𝑖 ≤ 𝑀

𝑎𝑛𝑑 0 ≤ 𝑞_𝑖≤ 1

(4)

We derive the NE from the above two sequential steps.

We deriveq^∗by solving (4); then,p^∗is derived asp(q^∗)by solving (3). Finally, the strategy combination(p^∗,q^∗)is the equilibrium game strategy for the Stackelberg game, and the final resources allocated to target𝑖will be𝑞^∗∗ 𝐶^𝑚_𝑖 .

4.2. Quantal Response. The previous analysis is performed under the assumption that the attacker is perfectly rational and develops its strategy with complete knowledge of the defender’s strategy. However, in the real world, the attacker will not always be perfectly rational since the attacker cannot always know the defender’s strategy. Consequently, the defender is unsure whether the attacker will operate according to the predictive strategyp(q). If the attacker is not perfectly rational and chooses a strategy that deviates slightly from the rational strategy, the defender’s payoff may decrease.

Clearly, the defender is unwilling to accept a lower payoff while doing nothing.

To simulate the bounded rational adversary, many behavior models have been proposed, including quantal response (QR), SUQR, and prospect theory (PT), which are all com- monly used. The defender’s response to them in the game theoretic model has been done in our previous work [38] and we found the defender’s response to the QR model is the most careful where the defense probability is relatively bigger than

(6)

the other two bounded rational models. Hence, the QR model is introduced into the GTRA model to simulate the attacker’s adversarial nature and to improve the proposed model.

When the QR model is applied, the noise in a bounded rational attacker’s strategy is controlled by𝜆.𝜆 = 0represents a uniform random probability distribution over the attacker’s possible strategies, while 𝜆 󳨀→ ∞ represents a perfectly rational attacker. Thus, the attacker’s probability of attacking target𝑖is changed to

𝑝_𝑖= 𝑒^𝜆𝑈^𝐴^(𝑞^𝑖⁾

∑^𝑛_𝑗=1𝑒^𝜆𝑈^𝐴^(𝑞^𝑗⁾ (5) Furthermore, the defender’s utility function becomes 𝑈_𝑀= ∑

𝑖∈𝑇

𝑞_𝑖∗ [𝛼𝑝_𝑖(𝑃_𝑖^𝑎+ 𝑅^𝑎_𝑖) − 𝐶^𝑚_𝑖 ] − 𝑝_𝑖𝑅^𝑎_𝑖

= ∑

𝑖∈𝑇

[ [

[𝛼𝑞_𝑖(𝑃_𝑖^𝑎+ 𝑅^𝑎_𝑖) − 𝑅^𝑎_𝑖] ∗ 𝑒^{−𝜆[𝛼𝑞}^𝑖^(𝑃^𝑖^𝑎^+𝑅^𝑎^𝑖^)−(𝑅^𝑎^𝑖^−𝐶^𝑎^𝑖^)]

∑^𝑛_𝑗=1𝑒^{−𝜆[𝛼𝑞}^𝑗^(𝑃^𝑗^𝑎^+𝑅^𝑎^𝑗^)−(𝑅^𝑎^𝑗^−𝐶^𝑎^𝑗^)]

− 𝑞𝑖𝐶^𝑚_𝑖 ] ]

(6)

In summary, the proposed GTRA model is used to solve the defender’s problem of how to allocate𝑀units of resources to maximize the defender’s utility function, as illustrated in

maxq 𝑈_𝑀 s.t. ∑

𝑖∈𝑇

𝑞_𝑖𝐶^𝑚_𝑖 ≤ 𝑀

0 ≤ 𝑞_𝑖≤ 1, ∀𝑖

(7)

4.3. Algorithm. Since the defender’s objective utility function expressed in (7) corresponds to a nonlinear constraint problem, the optimal solution is extremely difficult to find.

As a classic algorithm for searching for an approximately optimal solution [39], the genetic algorithm (GA) provides an alternative approach. GA is a stochastic global search and optimization method that mimics natural biological evolution. However, the typical GA attempts to find a globally near-optimal solution instead of a globally optimal one.

Therefore, in this paper, we utilize Algorithm 1 to compute the defender’s equilibrium strategy in the proposed GTRA model.

In addition to the parameters of the utility function discussed in the previous section, the number of targets and the number of iterations and the resource constraint are initialized before the iteration process. In each iteration, we find the locally optimal strategy 𝑞_𝑖 and the corresponding utility𝑈𝑑_𝑖using the 𝐺𝐴()function in MATLAB. Then, we record the current maximum after each iteration. When the iteration number𝑖 reaches the given maximum 𝑡𝑖𝑚𝑒𝑠, the globally optimal strategy 𝑞^∗_𝑖 and the corresponding utility 𝑈𝑑^∗_𝑖 are obtained. In general, the probability of reaching the global optimum increases as the number of iterations𝑡𝑖𝑚𝑒𝑠 increases.

(1)Initialization:

number of targets󳨀→ 𝑁;

number of iterations󳨀→ 𝑡𝑖𝑚𝑒𝑠 = 10;

resource constraint󳨀→ 𝑀;

𝑈𝑑^∗←󳨀 −∞

(2)Iteration:

(3)while𝑖 < 𝑡𝑖𝑚𝑒𝑠do

(4) (𝑞_𝑖, 𝑈𝑑_𝑖) ←󳨀 𝐺𝐴(𝑀𝑢𝑙𝑡𝑖𝑂𝑏𝑗,𝑁, 𝑀) (5) if 𝑈𝑑_𝑖> 𝑈𝑑^∗ then

(6) 𝑈𝑑^∗= 𝑈𝑑_𝑖; (7) 𝑞^∗= 𝑞_𝑖; (8) end if (9) end while (10)return(𝑞^∗, 𝑈𝑑^∗)

Algorithm 1: Iterative Genetic Algorithm (IGA).

Table 3: Simplified payoffs for target𝑖.

Protect(𝑞) Not Protect(1 − 𝑞)

Attack (𝑝) 𝑎, 𝑏 𝑐, 𝑑

Not Attack (1 − 𝑝) 0, 𝑓 0, 0

To better understand the equilibrium game strategy, we illustrate the evolutionary behavior of the defender by adopting the phase plane of replicator dynamics [40]. First, tersely describe the payoff of the defender and attacker in every case; Table 2 is changed into Table 3 where𝑎 = −𝛼𝑃_𝑖^𝑎+ (1−𝛼)𝑅^𝑎_𝑖−𝐶^𝑎_𝑖,𝑏 = 𝛼𝑃_𝑖^𝑎−(1−𝛼)𝑅^𝑎_𝑖−𝐶^𝑚_𝑖 ,𝑐 = 𝑅^𝑎_𝑖−𝐶^𝑎_𝑖,𝑑 = −𝑅^𝑎_𝑖, and𝑓 = −𝐶^𝑚_𝑖 . Then, the replicator dynamics equations of the attacker and the defender are expressed as (8).𝑈_𝐴and𝑈_𝑀 represent the average payoffs. The evolutionary equilibrium can then be obtained by solving the following equations ̇𝑝 = 0 and ̇𝑞 = 0.

̇𝑝 = 𝑝 (𝑈_𝐴− 𝑈_𝐴) = 𝑝 (1 − 𝑝) [𝑞𝑎 + (1 − 𝑞) 𝑐 ]

̇𝑞 = 𝑞 (𝑈_𝑀− 𝑈_𝑀) = 𝑞 (1 − 𝑞) [𝑝 (𝑏 − 𝑑) + (1 − 𝑝) 𝑓] (8) Figure 2 presents the evolutionary equilibrium of the defender’s strategy, which can be seen as the process of adapting the initial strategy to the NE strategy. The smallest circle around the NE point (0.012, 0.2851) is the entire feasible region of the solution.

5. Numerical Study

Since the focus of this paper is to explore the impact of different parameter configurations, we perform the numerical analysis directly to validate the proposed method. We first compare the proposed utility function with the utility function that does not consider the action cost. Then, we compare the NE strategy that is computed based on the proposed utility function with four other resource allocation strategies. In each group of experiments, 100 game instances under the same conditions are considered, and the average value is taken as the result. In each game instance, the number

(7)

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 p

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

q

X: 0.01235 Y: 0.2851

Figure 2: Evolution process of the defender’s behavior.

Table 4: Four utility function scenarios.

No. Relation 𝐶^𝑚 𝐶^𝑎

1 𝐶^𝑚_𝑖 > 𝐶^𝑎_𝑖 𝛾 < 𝐶^𝑚_𝑖 < 2 ∗ 𝛾 0 < 𝐶^𝑎_𝑖 < 𝛾 2 𝐶^𝑚_𝑖 < 𝐶^𝑎_𝑖 0 < 𝐶^𝑚_𝑖 < 𝛾 𝛾 < 𝐶^𝑎_𝑖 < 2 ∗ 𝛾 3 𝐶^𝑚_𝑖 = 𝐶^𝑎_𝑖 0 < 𝐶^𝑚_𝑖 < 𝛾 𝐶^𝑎_𝑖 = 𝐶^𝑚_𝑖

4 𝑁𝑜𝐶𝑜𝑠𝑡 𝛾 = 0 𝛾 = 0

of iterations in Algorithm 1 is set to 10 (We conducted an experiment with 100 iterations and found that the maximum value was usually found within the first ten iterations.), and the maximum value is taken.

5.1. Comparison of Utility Functions. To assess the impact of the action cost on the strategy, we compare the utility functions with and without action cost, respectively. Specifically, to explore the influence of the relationship between the two players’ action costs on each player’s strategy, we design three groups of experiments, as shown in Table 4.

The first scenario corresponds to a utility function in which the cost of defense is greater than the cost of attack.

On the contrary, the second scenario corresponds to a utility function in which the cost of attack is greater than the cost of defense. The third scenario corresponds to a utility function in which the costs of attack and the cost of defense are equal and are greater than 0. The fourth scenario corresponds to the utility function used in previous works [5–10], in which the action costs𝐶^𝑎_𝑖 and𝐶^𝑚_𝑖 are 0 and the resource consumption is simply the sum of the defense probabilities.

We measure the solution quality in each scenario in terms of the defender’s average utility and the average effectiveness over all 100 game instances, where theeffectivenessis defined as the average number of protected targets per resource, as shown in (9). The growth rate, defined in (10), is used to measure the difference in solution quality between different utility function scenarios, where 𝑎 denotes the solution quality for a utility function without action cost and𝑏denotes the solution quality for a utility function that includes the action cost.

𝑒ff𝑒𝑐𝑡𝑖V𝑒𝑛𝑒𝑠𝑠 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑐𝑜V𝑒𝑟𝑒𝑑 𝑡𝑎𝑟𝑔𝑒𝑡𝑠 𝑐𝑜𝑛𝑠𝑢𝑚𝑒𝑑 𝑟𝑒𝑠𝑜𝑢𝑟𝑐𝑒𝑠 (9)

𝐺𝑟𝑜𝑤𝑡ℎ 𝑅𝑎𝑡𝑒 = 𝑎 − 𝑏𝑏 (10)

The parameters used in this paper are the same as those used in the previous study [9], where reward𝑅^𝑎_𝑖 is chosen randomly from a uniform distribution from 1 to 10, penalty𝑃_𝑖^𝑎 is chosen randomly from a uniform distribution from -10 to -1, and the resource constraint is proportional to the number of targets, 𝑀 = 𝛾 ∗ 𝑁. We assume that the total available resources, including the total cost, are insufficient to protect all targets. Therefore, the value of parameter𝛾is set to less than 1.

Figure 3 shows the solution quality results for the various utility function scenarios introduced in Table 4 with different parameter configurations. The defender’s average utility, and the defender’s resource consumption along with the effectiveness are displayed on the y-axes. On the x-axes, Figures 3(a), 3(c), and 3(e) show the results of varying the number of targets (𝑁) while keeping the ratio (𝛾) of resources (𝑀) to 𝑁 fixed to 0.1. Figures 3(b), 3(d), and 3(f) show the results of varying the ratio of resources to targets while keeping the number of targets fixed at 200. The corresponding solution qualities in the various utility function scenarios are presented as groups of bars.

Figures 3(a), 3(c), and 3(e) show the following.(1)The defender’s utility (𝑈_𝑚) increases as the number of targets (𝑁) increases. The utility is larger in the first scenario than other scenarios under the same conditions, and it is nearly stable in the fourth scenario, regardless of𝑁, which indicates that the greater cost of defense has a better effect on obtaining payoff under the same conditions. (2) The defender’s resource consumption increases as the number of targets (𝑁) increases. The resource consumption is larger in the first scenario than in the second and third scenarios, and it is strongly proportional to𝑁in the fourth scenario.

It illustrates the cumulative impact of action costs on a massive number of targets.(3)The effectiveness does not vary regularly with the number of targets (𝑁); it varies inversely with the resource consumption in the first scenario, in which the action cost is considered in the utility function, while nearly constant effectiveness is maintained in the fourth scenario. It suggests that when expending the same number of resources, the number of protected targets of the fourth scenario where the action costs are not considered in the utility function is the least.

Figures 3(b), 3(d), and 3(f) show the following.(1)The defender’s utility (𝑈_𝑚) increases as the ratio of resources to the number of targets increases. The defender’s utility is larger than the second and third scenarios under the same conditions when 𝐶^𝑚_𝑖 > 𝐶^𝑎_𝑖, and it is nearly stable when there is no action cost in the utility function, regardless of the resource-to-target ratio. It reveals that the utility functions including action cost provide more utility than those without action cost. Moreover, the number of resources has a positive effect on the defender’s utility.(2)The defender’s resource consumption increases as the resource-to-target ratio increases; it is larger in the first scenario than in the second and third scenarios and it is strongly proportional to the resource-to-target ratio in the fourth scenario. It also

(8)

Cm-b igger-Cost

Equal-C ost

Ca-b igger-Cost

NoCost

Utility Functions 0

10 20 30

Defender's Utility (Um)

N = 50 N = 100 N = 200

N = 300 N = 400

(a) Defender’s utility comparison (ratio = 0.1)

Cm-b igger-Cost

Equal-Cost Ca-b

igger-C ost

NoCost

Utility Functions 0

10 20 30 40 50

Defender's Utility (Um)

ratio = 0.1 ratio = 0.2

(b) Defender’s utility comparison (N = 200)

N = 50 N = 100 N = 200

N = 300 N = 400 Cm-b

igger-Cost Equal-C

ost Ca-b

igger-Cost

NoCost

Utility Functions 0

10 20 30 40

Defender's Resource

(c) Resource comparison (ratio = 0.1)

ratio = 0.3 ratio = 0.4 Cm-b

igger-Cost

Equal-Cost

Ca-bigger-Cost

NoCost

Utility Functions 0

20 40 60 80

Defender's Resource

(d) Resource comparison (N = 200)

N = 50 N = 100 N = 200

N = 300 N = 400 Cm-b

igger-C ost

Equal-C ost

Ca-b igger-Cost

NoCost

Utility Functions 0

20 40 60

Effectiveness

(e) Effectiveness comparison (ratio = 0.1)

ratio = 0.3 ratio = 0.4 Cm-b

igger-Cost Equal-C

ost Ca-b

igger-C ost

NoCost

Utility Functions 0

20 40 60

Effectiveness

(f) Effectiveness comparison (N = 200) Figure 3: Solution quality comparison of different utility functions.

(9)

shows the cumulative impact of action costs on a massive number of targets. (3) The effectiveness decreases with an increasing resource-to-target ratio. The effectiveness varies inversely with resource consumption in all four scenarios, and it is smaller in the first scenario than in the second and third scenarios. It suggests that the effectiveness decreases with the increasing number of resources under the same conditions, and the effectiveness is the least when the utility does not include the action cost.

Furthermore, we illustrate the difference in solution quality when the utility function does not consider the action cost in Figure 4. Equation (10) shows that if the growth rate is less than zero, then 𝑎 is less than 𝑏. An overall comparative analysis of the results shown in Figure 4 indicates that when the utility function does not include the action cost, the defender’s utility and the effectiveness are lower and the defender’s resource consumption is larger.

The difference becomes especially evident as the number of resources increases (the resource-to-target ratio increases) or the number of targets increases (𝑁increases).

Overall, a utility function that considers the action cost, regardless of the relationship between defense and attack costs, provides the defender with a larger payoff and higher effectiveness. Additionally, although it is possible to represent the resources allocated to targets using the defense probabilities, as done in the utility functions implemented in many previous studies, sometimes the resource metric is not the same as the defense probability. For example, when 10 GB of storage is required to run intrusion prevention servers to protect a base station, this requirement cannot be represented as a probability. However, we can set 𝐶^𝑚_𝑖 = 10 directly, and the defense probability then represents the probability that this target may be covered by this server. Hence, adding the action cost into the utility function is beneficial. In the following sections, we present a series of comparative analyses of various strategies based on our proposed utility function that considers the action cost.

5.2. Comparison of Allocation Strategies. A system with high security requirements is considered; e.g., government systems usually require a high level of consistency and need to be able to resist various attacks. The defender is usually equipped with high-performance defense modules with powerful processing capabilities, so a relatively large protection reward and a small protection penalty, which can be represented as 𝑃_𝑖^𝑎 > 𝑅^𝑎_𝑖 [9], are chosen in our study.

Since all three scenarios regarding the relationship between the defense cost and the attack cost have a similar impact on the solution quality, we perform our further study based on the case in which the defense cost is less than the attack cost represented by𝐶^𝑚_𝑖 < 𝐶^𝑎_𝑖.

We varied the reward and penalty from 1 to 10, the action cost from 0.1 to 0.4, and the numerical gap between the reward (or penalty) and the cost was considered large. Here, we limit the gap between the reward (or penalty) and the cost by randomly choosing values from𝐶^𝑚_𝑖 ∈ [0.01, 0.02], 𝐶^𝑎_𝑖 ∈ [0.02, 0.03],𝑃_𝑖^𝑎 ∈ [1.4, 1.6], and𝑃_𝑖^𝑚 ∈ [0.4, 0.6]. These digits can be projected to the scenario that a unit of defense resource can protect at most 100 targets, and a unit of attack

resource can attack at most 50 targets. If an attack fails, the attacker will get a penalty about 1.4. And if the protection fails, the defender will get a penalty about 0.4. In this case, the attacker can be seen as the type of risk-averse player who aims to minimize the risk loss [12].

To further evaluate the utility function which includes the action cost, we simulate four extreme resource allocation strategies in which the defender does not follow the NE strategy.

5.2.1. PartOneS Strategy. The defender cannot protect all the targets due to resource limitations, so it must select at most 𝑘targets to protect. The remaining 𝑁 − 𝑘 targets are not protected. The defense probability distribution is obtained from (11). In this strategy,𝑀available units of resources are consumed.

𝑞_𝑖= {{ {{ {{ {{ {{ {

1, 𝑖 = 1, . . . , 𝑘 − 1;

(𝑀 − ∑^𝑘−1_𝑗=1𝑞_𝑗∗ 𝐶^𝑚_𝑗 )

𝐶^𝑚_𝑖 , 𝑞_𝑗= 1, 𝑖 = 𝑘;

0, 𝑖 = 𝑘 + 1, . . . , 𝑛.

(11)

5.2.2. Rand Strategy. The defender protects targets following a random probability distribution according to (12). In this strategy, the quantity of resources consumed is no more than 𝑀.

𝑞_𝑖= 𝑅𝑎𝑛𝑑 (𝑞_𝑖) ∗ 𝑀

∑^𝑛_𝑗=1𝑅𝑎𝑛𝑑 (𝑞_𝑗) ∗ 𝐶^𝑚_𝑗 , 𝑖 = 1, 2, . . . , 𝑛 (12) 5.2.3. Average Strategy. Resources are allocated to each target equally; the defense probability distribution obeys (13). In this strategy, all𝑀available units of resources are consumed.

𝑞_𝑖∗ 𝐶^𝑚_𝑖 = 𝑀

𝑛, 𝑖 = 1, 2, . . . , 𝑛 (13) 5.2.4. AllOneS Strategy. The resource limitation is relaxed, and the defender protects all targets, as expressed in (14), which is approximately abstracted asAllOneS. In this strategy, the quantity of resources consumed is greater than𝑀.

𝑞_𝑖= 1, 𝑖 = 1, 2, . . . , 𝑛 (14) Our strategy obtained based on the proposed GTRA model is similar to (15). It is computed using the IGA given in Algorithm 1. In our strategy, the quantity of resources consumed is no more than𝑀.

q= 𝐼𝐺𝐴 (𝑈_𝑀, 𝑛, 𝑀) ,

∑ 𝑞_𝑖∗ 𝐶^𝑚_𝑖 ≤ 𝑀 (15)

We start by comparing the utility of the defender with that of the attacker. The defender’s resources are considered to be limited, whereas the attacker’s resources are unlimited. One hundred game instances, with the number of targets ranging from 10 to 1000 in increments of 10, are considered.

(10)

400 300 200 100 50

The number of targets (N) 0

2 4 6 8

Growth rate of Defender's Utility

Cm-bigger-Cost Equal-Cost Ca-bigger-Cost

(a) Defender’s utility comparison (ratio = 0.1)

0.1 0.2 0.3 0.4

The ratio of resources (M) to targets (N)

−1.2

−1

−0.8

−0.6

−0.4

−0.2 0

Growth rate of Defender's Utility

(b) Defender’s utility comparison (N = 200)

50 100 200 300 400

The number of targets (N) 0

1 2 3 4 5 6 7

Growth rate of Defense Resource

(c) Resource comparison (ratio = 0.1)

0.1 0.2 0.3 0.4

The ratio of resources (M) to targets (N) 0

2 4 6 8 10

Growth rate of Defense Resource

(d) Resource comparison (N = 200)

50 100 200 300 400

The number of targets (N) Cm-bigger-Cost

Equal-Cost Ca-bigger-Cost

−1.2

−1

−0.8

−0.6

−0.4

−0.2 0

Growth rate of Resource Effectiveness

(e) Effectiveness comparison (ratio = 0.1)

0.1 0.2 0.3 0.4

The ratio of resources (M) to targets (N)

−1.2

−1

−0.8

−0.6

−0.4

−0.2 0

Growth rate of Resource Effectiveness

(f) Effectiveness comparison (N = 200) Figure 4: Difference in solution quality when the utility function does not consider the action cost.

(11)

−14

−12

−10

−8

−6

−4

−2 0 2

defender's utility

NE PartOnes Rand

Average AllOnes

X: 120 Y: -0.4879

X: 312 Y: -2.871

100 200 300 400 500 600 700 800 900 1000 0

the number of targets -- N

Figure 5: Comparison of the defender’s utility.

The defender’s utility results for the different strategies are displayed in Figure 5. The vertical axis represents the defender’s utility 𝑈_𝑀, and the horizontal axis shows the number of targets𝑁.

When the number of targets is below 120,AllOneSprovides the defender with the greatest utility. However, when the number of targets exceeds 312,AllOneSprovides the defender with the smallest utility because of the higher resource consumption. The NE strategy based on our GTRA model outperforms the other four strategies when the number of targets is greater than 120.

Interestingly, the defender’s utility decreases with an increasing number of targets in Figure 5, while the opposite trend is seen in Figure 3. The difference between these two configurations is the range of parameters. When the reward or penalty is much larger than the cost, the defender’s utility is larger and the impact of the number of targets is directly proportional to the utility. By contrast, when the reward or penalty is only slightly larger than the cost, the defender’s utility is smaller and potentially even negative. In this case, the impact of the number of targets is directly proportional to the utility. Hence, the parameter configuration, such as the gap between the reward (or penalty) and cost, influences the defender’s utility. Regardless of the parameter configuration, the NE strategy based on our GTRA model is better than the other strategies in terms of the defender’s utility.

5.3. Comparison in Terms of Various Evaluation Criteria. The NE strategy is obtained by finding the maximum utility for both players. In this subsection, we evaluate the vulnerability, coverage, and effectiveness of our equilibrium strategy and the other four strategies.

5.3.1. Vulnerability. We first evaluate the vulnerability of the defender’s various strategies. TheV𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦is defined as a risk indicator for the targets as shown in [41]

V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦 = 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 −f𝑎𝑖𝑙𝑢𝑟𝑒

𝑠𝑢𝑐𝑐𝑒𝑠𝑠 +f𝑎𝑖𝑙𝑢𝑟𝑒 (16)

the number of targets -- N

−1

−0.8

−0.6

−0.4

−0.2 0 0.2 0.4 0.6 0.8 1

Vulnerability

NE PartOnes AllOnes

Rand Average

200 400 600 800 1000

0

Figure 6: Vulnerability of 100 groups of instances.

where 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 and 𝑓𝑎𝑖𝑙𝑢𝑟𝑒 denote the numbers of targets in which an attack that is launched is not detected or is detected, respectively. The assumption is made that if the defender allocates resources to protect a target, then that target will be successfully protected against attack by the continuously operating defense system; otherwise, the attack will be successful. Clearly, a greater 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 value and a lower 𝑓𝑎𝑖𝑙𝑢𝑟𝑒 value indicate a more vulnerable strategy. Hence, the lower the V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦, the better the strategy.

Figure 6 shows that the V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦ofAllOneSis −1, which implies that the number of successes is 0. In this situation, the targets are the most secure. The defender’s protections cover all the targets, resulting in the most secure environment. When the number of targets (N) is small, the NEstrategy achieves the most secure state; asN increases, the V𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦 increases because the available resources become insufficient to protect all targets. Additionally, once N is greater than 400, theV𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦ofNE varies only slightly. These analyses reveal that NEcan be scaled up to protect a large number of targets. Furthermore, compared withPartOneS,Rand, andAverage, as the number of targets to protect increases such that there are insufficient resources to protect all of them,NEperforms better. It enables control of the trade-off between the security benefit and the resource consumption and focuses on protecting targets that are more likely to be attacked.

As a result, NE performs better than all the other strategies except forAllOneSin terms of theV𝑢𝑙𝑛𝑒𝑟𝑎𝑏𝑖𝑙𝑖𝑡𝑦of the targets.

5.3.2. Coverage. We evaluate the allocated resources’ coverage of the targets next. The coverage is defined as the proportion of protected targets among the total targets, as shown in (17), where the protected targets are defined as those that are attacked by attacker and also protected by defender, those that are not attacked but protected, and those that are not attacked and not protected, either, as denoted byAP,NP, andNF, respectively, in Table 5.

(12)

the number of targets -- N 0.3

0.4 0.5 0.6 0.7 0.8 0.9 1

Coverage

NE PartOnes AllOnes

Rand Average

200 400 600 800 1000

0

Figure 7: Coverage in 100 groups of instances.

Table 5: Protection type of target𝑖.

Protect Fail to Protect

Attack AP AF

Not Attack NP NF

𝑐𝑜V𝑒𝑟𝑎𝑔𝑒 =𝐴𝑃 + 𝑁𝑃 + 𝑁𝐹

𝑁 (17)

Figure 7 shows the coverage results for the five strategies for 100 groups of experimental instances.AllOneScovers all targets by protecting all of them, whereasNEcovers almost the fewest targets because NE protects only risky targets attractive to the attacker. Thus, the number of protected targets is smaller than the other strategies. The disadvantage of this strategy is that it cannot guarantee the absolute security of the targets, in contrast to AllOneS. However, it may be useful for saving resources or improving the effectiveness with which those resources are used, especially when resources are valuable or limited.

5.3.3. Effectiveness. We now evaluate the effectiveness of the five strategies. The greater the effectiveness is, the better the strategy is. For 100 groups of experimental instances, the quantities of resources consumed by each strategy are plotted in Figure 8(b). It is noteworthy thatAllOneSconsumes the most resources and NEconsumes the least resources. The resources consumed byPartOneSandAverageare equivalent since these two strategies use all the available resources.

When the number of targets is increased to 1000, the quantity of resources consumed byAllOneSis close to four times the resources consumed byNE. When these values are applied to the real world, they represent a large amount of material or financial resources that must be expended by the defender.

Consequently, our strategy aims to provide high effectiveness.

In Figure 8(a), there is an evident upward trend in the effectiveness ofNEwhen the number of targets is less than 200, which then gradually drops to a stable value with an increasing number of targets. Moreover,NEhas the highest effectiveness among all five strategies. These results suggest

that increasing the number of targets does not affect the effectiveness. In addition, althoughAllOneSprotects the most targets, its effectiveness is lower than that ofNEbecause it consumes more resources.AllOneSmay protect some targets that are not likely to be attacked, which may cause resources to be consumed without gaining benefits, thus decreasing the defender’s effectiveness.

Now, we combine the number of targets, coverage, and vulnerability in Figure 9(a) and combine the number of targets, effectiveness, and vulnerability in Figure 9(b). From Figure 9, it can be concluded that more targets must be protected to maintain a low vulnerability or to decrease the vulnerability. However, if the defender increases the number of protected targets, more resources will be required. Take AllOneSas an example. The vulnerability of the targets is near zero, and the number of protected targets is the largest, but the number of covered targets per resource is low because of the high resource consumption. In this situation, to improve the security of the targets, the NE strategy obtained based on the proposedNEmodel, which balances the security utility and the resource consumption, is the best choice for allowing the defender to utilize limited resources effectively.

5.4. Parameter Analysis. The security utilities of the defender and the attacker are related not only to their strategies but also to certain specific parameters: the resource constraint𝑀, the prediction accuracy𝛼, and the noise𝜆in the attacker’s rationality.

5.4.1. Resource Constraint𝑀. We generate 100 random game instances with 1000 targets and consider different quantities of resources to assess the impact of the resource constraint 𝑀on the players’ utilities. In Figure 10, the x-axis shows the proportion of available resources relative to the maximum resources required, and the y-axis represents the players’

utilities.

Figure 10 shows that when the resource proportion is zero, the defender’s utility is the lowest and the attacker’s utility is the highest. As the proportion of available resources increases, the defender’s utility increases and the attacker’s utility decreases. When the proportion reaches 40%, the utilities of both players become stable. Hence, we conclude that, for the case in which 𝑅^𝑚_𝑖 > 𝑃_𝑖^𝑚 and 𝑃_𝑖^𝑚 > 𝐶^𝑚_𝑖 , 40% of the maximum resources are an efficient rate of utilization for the defender. When the proportion is greater than 40%, both players’ utilities remain approximately stable.

The jitter in the raw data is due to the aggregated analysis of resource consumption, which demonstrates that spending more resources to protect targets may be less risky, but the cost of the resources consumed will exceed the benefit.

The proposedNE model can compute the corresponding best resource proportions for different combinations of reward, penalty, and cost. Therefore, the proposed model offers the defender an alternative means of gaining greater utility while saving resources, thereby improving the defender’s outcome from the perspective of economics.

When the defender needs to estimate the overall quantity of resources required to protect a massive number of targets, the proposed GTRA model can be used to compute the