Evaluating Loss and Latency Mitigation Techniques in a Tick-Based Game
Server
Eirik Rodvang Tveterås
Thesis submitted for the degree of
Master in Informatics: Programming and Networks 60 credits
Department of Informatics
Faculty of mathematics and natural sciences
UNIVERSITY OF OSLO
© 2018 Eirik Rodvang Tveterås
http://www.duo.uio.no/
Printed: Reprosentralen, University of Oslo
Abstract
Interactive applications are often tick-driven, meaning that they produce data at regular time intervals, so the sending rate is controlled by the periodicity of the application's data production. The nature of these applications may lead to periodic behavior that causes some flows to experience consistently higher loss rates, degrading the user's experience of the application. This phenomenon has so far only been observed in a closed test environment, and we will strive to see if this effect can be reproduced in a real production system.
The data flow stemming from a tick-based application is characterized as a thin stream of data. Even though there exists an increasing amount of research on thin streams, particularly in relation to congestion control mechanisms in TCP, there has been little research on how the synchronization effects of tick-based applications affect thin streams. For instance, the periodic transmission of data allows for specialized techniques of handling issues related to packet loss and latency.
This thesis uses Counter-Strike: Global Offensive, a tick-based FPS game, as a use case for exploring the effectiveness of client-side and server-side latency mitigation techniques. In particular, we will be experimenting on the effectiveness of interpolation, lag compensation, and input prediction.
Our results suggest that input prediction masks latencies below 100 ms even in high-intensity situations. Client-side prediction becomes more unstable at higher latencies, degrading the experience of the game for the user. Interpolation causes an inherent client-side latency, known as the interpolation amount. Our experiments show that an interpolation amount between 30 ms and 120 ms works best for gameplay. However, the exact trade-off point between client-side latency due to interpolation and smooth rendering in the client is a topic for further research.
Acknowledgments
I would like to thank Dr. Andreas Petlund for being an excellent adviser for my master’s degree. I would also like to thank my wonderful cat, Marlon, who lay next to me during the majority of my writing sessions.
Last but not least I would like to thank my girlfriend, Ylva, for always supporting me in my endeavors and who kept me motivated throughout this journey.
Contents
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
1.1 Thesis Context
1.2 Scope and Limitations
1.3 Problem Statement
1.4 Outline
2 Background
2.1 Internetworking and Transport Protocols
2.1.1 The Internet Architecture
2.1.2 The Internet Protocol Suite
2.1.3 The Open Systems Interconnection Model
2.1.4 User Datagram Protocol
2.1.4.1 UDP Header Structure
2.1.5 Transmission Control Protocol
2.1.5.1 TCP Header Structure
2.1.5.2 Data Communication
2.1.5.3 Flow Control
2.1.5.4 Three-way Handshake
2.2 Thin Streams
2.2.1 Thin Streams and TCP
2.2.2 Related Work
2.2.2.1 TCP Variations
2.2.2.2 TCP Modifications
2.2.2.3 Aggressiveness in Thin Streams
2.2.3 Clustering Effect
3 Tick-Based Applications
3.1 Application Types
3.1.1 Sensor Networks
3.1.2 Online Games
3.2 Ticks
3.3 Tick rate
3.3.1 Tick rate in CS:GO
3.4 Latency Masking
3.4.1 Entity interpolation and extrapolation
3.4.2 Input Prediction
3.4.2.1 Implementation
3.4.3 Lag Compensation
3.4.3.1 Drawbacks of Lag Compensation
3.4.4 Cheat Prevention
4 Experiment Design
4.1 Metrics
4.1.1 Latency and Packet Loss
4.1.2 Interpolation Ratio
4.1.3 Tickrate, Updaterate and Cmdrate
4.2 Test Environment
4.2.1 Operating System
4.2.2 Dedicated Server
4.2.3 The Application
4.2.4 The Steam Console Client
4.2.5 Linux Game Server Managers
4.2.6 Logging
4.2.7 Server Configuration
4.2.7.1 Console Variables
4.3 Analysis Tools
4.3.1 TCPDUMP
4.3.2 Wireshark
4.3.3 Capinfos
4.3.4 Open Broadcast Software Studio
4.3.5 Application Console
4.4 Experiments
4.4.1 Experiment One: Input Prediction
4.4.2 Experiment Two: Interpolation and Lag Compensation
5 Analysis
5.1 Baseline Data
5.2 Experiment one: Input Prediction
5.2.1 Data Collection
5.2.2 Configuration
5.2.3 Execution
5.3 Experiment Two: Interpolation and Lag Compensation
5.3.1 Configuration
5.3.1.1 Interpolation Amount
5.3.1.2 Setup
5.3.2 Execution
6 Results
6.1 First Hypothesis
6.1.1 Input Prediction
6.1.2 Interpolation and Lag Compensation
6.1.2.1 Limitations
6.2 Thin-stream Clustering
7 Conclusion
7.1 Future Work
Bibliography
List of Figures
2.1 How data flows through the layers when two applications communicate.
2.2 At each layer, the data is repackaged with that layer’s header [7].
2.3 The seven-layer OSI model [7].
2.4 The UDP segment structure [25].
2.5 TCP header control bits [16].
2.6 TCP header structure [24].
2.7 Delayed Acknowledgments.
2.8 Three-way handshake [26].
2.9 Packet statistics for a selection of thin stream and greedy stream applications [16].
2.10 Fast retransmit [23].
2.11 RTT versus maximum application delay [13].
3.1 Wireless Sensor Network [8].
3.2 Example of arrival times of incoming world snapshots in a game [36].
3.3 Communication between client and server in a Valve game [36].
3.4 Pseudo-code for input prediction as it is implemented in the Source SDK [37].
3.5 Lag Compensation in Counter-Strike: Source [36].
5.1 Visualization of the packet rate in the server-side trace.
5.2 Visualization of the packet rate in the client-side trace.
5.3 NetGraph output showing a client connection with different magnitudes of packet loss.
5.4 NetGraph output showing a client connection with 15% packet loss.
5.5 Output of cl_showerror (prediction error) in the client.
6.1 Screenshot of NetGraph during a session with interp_ratio of 10, and 10% packet loss, on a 66 tick server.
List of Tables
5.1 Metadata of initial server and client side traces.
5.2 Server and client CVARs for experiment one.
5.3 Server and client CVARs for experiment two.
5.4 Interpolation ratio and packet loss for each session in experiment two.
6.1 The amount of prediction errors occurring at different latencies.
1
Introduction
With the development of more complex networks and applications, we have come to see a change in traffic patterns. The advances in technology have enabled us to realize large-scale ideas such as the Internet of Things (IoT) and Massively Multiplayer Online Games (MMOG). Event-driven and interactive applications are now found everywhere, which implies a shift in how applications communicate. These are trends that are not fully accounted for, and the nature of the Internet is now more complex than it has ever been. The fundamental structures that are currently used to handle data communication have to adapt to the new unconventional traffic patterns that have emerged with the new trends. In particular, the data stemming from interactive applications is not handled well by technology that was originally designed to support other kinds of traffic, hence the mechanisms inherent to these technologies can cause harmful side effects with new applications. An example of this is high application latency, which in essence is a measurement of how long it takes to perform an operation and get a response. The need to adapt standards to the inevitable changes in technological trends has always been a challenge in the field of computer science, and perhaps now more than ever. The production of data has exploded, and we are moving towards greater interactivity and real-time requirements. If the latency is higher than a certain threshold, the experience of the application is significantly degraded. This is perhaps especially prominent in the video game industry, which has been growing immensely in recent years. A glance at recent statistics shows that 53% of the most frequent gamers play multiplayer online games, and women now represent a significantly greater portion of the game-playing population than before [3]. Our thesis will strive to tackle some of the challenges associated with these newly emerged traffic patterns, and to examine their influence on the basis of current research, in the hope of improving their coexistence with current solutions.
Data traffic manifests itself in different ways, and follows a certain pattern characterized by the source application. There are many different patterns, but some of the common ones are greedy streams (downloads), bursty flows (DASH video streaming), and short flows (web page subflows). However, many interactive applications have intermittent sending patterns combined with small packet payloads. This kind of traffic conforms to what we call thin streams of data [13]. Transmissions occur when the user performs an action in the application, such as clicking an item in a game. The client for such a game is an example of an application which produces thin streams of data. Another technology that has become more prominent is sensor networks, which consist of spatially dispersed sensors that can be used to monitor and record physical conditions and report back to a networked device. Sensor networks can consist of a thousand devices that each have one or more connections that send data whenever an event occurs, such as a fire alarm detecting smoke, or at clock-based intervals. These produce thin streams of data, or short connections at random or regular intervals.
As with most new technology, existing and new protocols need to advance in order to support the needs of the new trends. Even though studies have been done to improve the support for thin stream applications, they are still reliant on old technology that is fundamental to the Internet. For instance, time-dependent applications such as MMOGs and sensor networks often need reliable delivery of packets. The natural choice often falls on the Transmission Control Protocol (TCP), whose traditional implementation is not optimized for such traffic [21]. Its development has been focused on maximizing throughput, that is, transferring large amounts of data in a short period of time. The congestion control and retransmission mechanisms in TCP reflect the purpose for which the protocol was originally made: to support bulk data transfers and networks that are otherwise subject to heavy congestion. Thin stream applications such as computer games, high-frequency trading, screen sharing applications and sensor networks have different characteristics than greedy streams, and studies have shown that interactive applications using thin streams can suffer severe delays when using TCP-based retransmission schemes [13].
Interactive applications can also be tick-driven, meaning that they produce data at regular time intervals. The behavior of such applications in relation to thin streams is still a topic of research, although we do know that when tick-based applications communicate over TCP they are subject to the same problems as other thin stream applications. The sending rate of tick-based applications is not limited by congestion control, but by the periodicity of the data simulations (ticks) produced by the application. An example of this would be constant bit-rate (CBR) encoding applications such as a Voice over IP (VoIP) audio protocol, where segments of a fixed size are clocked out by the codec at fixed intervals. This in turn creates a traffic pattern that conforms to thin streams. There are also tick-driven games, such as Quake and the Counter-Strike franchise [36]. In this thesis, we will be using a dedicated Counter-Strike: Global Offensive server as a means of investigating the synchronization effects of tick-based applications.
Understanding how these synchronization effects work could be an important step in developing mitigation options for tick-based thin stream applications. Such mitigation can be made in the application, in the network stack, in the network itself or in any combination of these. A recent master's thesis, written in collaboration with Simula Research Laboratory, has found that in some tick-based applications, specific streams seem to have consistently higher loss rates than others [16]. However, this is a phenomenon that so far has only been seen in an experimental test bed. In this thesis, we will look at application layer mitigation for packet loss in tick-based applications in general. If we manage to replicate the issues with clustered loss, we will also see whether certain mitigation techniques for packet loss, namely interpolation, lag compensation and input prediction, counter the issues with clustered loss.
1.1 Thesis Context
This thesis is a master’s thesis done in collaboration with Simula Research Laboratory as part of the master’s program Informatics: Programming and Networks at the University of Oslo.
1.2 Scope and Limitations
The aim of this thesis is to look closer at mitigation for packet loss and latency in tick-based applications that produce thin streams of data, as well as attempting to replicate a situation where particular streams experience higher loss rates than others in the form of clustered loss. If we do not manage to replicate the clustered loss, we will have a theoretical discussion on whether the mitigation techniques are powerful enough to counter latencies of the magnitude seen in previous experiments.
As part of our experiments, we wanted to have real people playing on our servers. However, this was not always possible. As a backup plan, we used the configuration options of the server to have non-player characters (NPCs) imitate real players. For instance, we had them move in a more non-deterministic way, and engage in combat much like what you would see in a real game of Counter-Strike: Global Offensive. This should give valid data when it comes to the effectiveness of interpolation, client-side prediction and lag compensation, as NPCs are subject to the same techniques. The reason is that the player’s client does not differentiate between an NPC and a real player: it still uses interpolation to display enemy movement, and hit registration is lag-compensated on the server. However, the NPC is not a client that is connected to the server, so it has no latency.
1.3 Problem Statement
A recent study found, in their search for the right aggressiveness for retransmissions in thin TCP streams, an abnormality in the thin data streams stemming from tick-based applications. Their findings were based on an experimental approach in a closed test environment [16]. This is an example of latency issues that can potentially occur in tick-based applications.
There is a need for research on mitigation options for packet loss, and the consequent latency, in tick-based thin stream applications. In this thesis, we will look at existing mitigation options implemented by Valve, namely Interpolation, Lag Compensation and Input Prediction, and tie these to current research on thin streams. We will also strive to see whether or not these techniques are effective means of mitigation for clustered loss in tick-based games. Following are our hypotheses:
• Hypothesis 1: Interpolation, Lag Compensation and Input prediction are effective mitigation options for tick-based games.
We want to experiment on the effectiveness of these techniques as means of mitigation in thin stream, tick-based applications.
• Hypothesis 2: The clustering effect can be reproduced in a real production system.
The clustering effect was discovered in a closed test environment. We will strive to see if this effect can be replicated in a real application that is used by millions of people today. The nature of the clustered loss is discussed in subsection 2.2.3.
The application we chose as the use-case for our thesis is Counter-Strike: Global Offensive. The Valve Corporation has long been developing FPS games with implementations of interpolation, client-side prediction and lag compensation, and offers a lot of useful tools for debugging and testing, as well as the possibility to run dedicated servers. These are highly configurable servers which allow us to specify configuration variables related to latency, interpolation periods, update intervals and tick rate (among other things). We will collect data from both the client-side streams and server-side streams, and thoroughly analyze them under different configurations to see the effectiveness of the mitigation options.
1.4 Outline
Below is the outline of this thesis.
• Chapter 2 touches upon the different technologies and practices relevant to understanding the topic of thin streams, especially related to transportation of data. It presents a high level overview of the Internet architecture, and how the theoretical OSI stack relates to the Internet stack. Additionally, and for the most part, it describes UDP and the inner workings of the TCP protocol. The chapter also defines thin streams and describes the problems regarding thin stream applications using TCP. Lastly, it mentions previous research done on the topic of thin streams.
• Chapter 3 is dedicated to tick-based applications. The chapter introduces the concept of interpolation, lag compensation and input prediction, which were used as metrics in our experiments.
• Chapter 4 explains the setup of our experiment along with the related tools and technologies used to test our hypothesis.
• Chapter 5 describes the implementation and execution of our tests, in line with the setup that was presented in Chapter 4.
• Chapter 6 presents the findings of our experiment in relation to our hypothesis.
• Chapter 7 concludes our thesis and discusses what we can draw from the findings of our experiment.
2
Background
2.1 Internetworking and Transport Protocols
This section is dedicated to internetworking and data communication. In order to get insight into the subject of thin streams, it is important to have an understanding of how entities in a network communicate. The goal of this section is to give a high level overview of the network stack, as well as a low level overview of the transport protocols. The application on which we run our experiments inherently uses UDP. However, understanding the inner workings of TCP is still relevant, as it has been a focus of previous thin stream research, and hence helps to set the nature of thin streams into context.
2.1.1 The Internet Architecture
“Fortunately, nobody owns the Internet, there is no centralized control, and nobody can turn it off. Its evolution depends on rough consensus about technical proposals, and on running code. Engineering feed-back from real implementations is more important than any architectural principles.”
B. Carpenter, Architectural Principles of the Internet [6]
The Internet is a complex network comprising billions of interconnected computers, and is often described as a network of networks. These networks are interconnected through the use of gateways or IP routers, referred to as Intermediate Systems, which are packet-switching computers. These are stateless by design in order to improve the robustness of the communication system, which is achieved by forwarding packets independently of other packets. Because of this, redundant paths can be used in order to provide a robust service in case of failures of intervening gateways and networks. All routing decisions are made by the intermediate systems, as this is a complex and daunting task that should not be done by the end systems [5].
The computers connected to the Internet, hosts (end-systems), are the consumers of communication services that generally execute programs on behalf of users. All the information required for end-to-end communication is implemented in the hosts, which means that the connection will be lost if one of the hosts is subject to failure. They communicate with one another using a set of standard procedures called a protocol suite.
In particular, the so called Internet Protocol (IP) is responsible for providing addresses for hosts and for routing packets of data from a source host to a destination host across the network. The Internet supports any communications protocol built on top of the IP. The most common one is the Transmission Control Protocol (TCP), which gives reliable and error-free transmission of data between hosts. Together they are known as the Standard Internet Protocol Suite [5].
2.1.2 The Internet Protocol Suite
In order for a host to communicate using the Internet it needs to implement the layered set of protocols included in the Internet protocol suite, which is a computer networking model that consists of communication protocols used in the Internet. This implementation is referred to as the protocol stack, often known as TCP/IP after its most prominent protocols.
The implementation must account for the four protocol layers of the stack, namely the application layer, the transport layer, the Internet layer and the link layer (often referred to as the Network Access Layer) [4]. Together they provide a range of communication services and are responsible for different tasks in order to achieve the common goal of host-to-host communication in heterogeneous networks. The TCP/IP model has a modular design where the different components (layers) function independently from one another, but interface with the adjacent layers in the stack [7].
Figure 2.1: How data flows through the layers when two applications communicate.
• Application Layer: The topmost layer of the stack that provides applications the ability to access the services of the other layers by defining how host programs interface with the transport layer in order to use the network. Includes higher-level protocols such as HTTP (Hypertext Transfer Protocol), DHCP (Dynamic Host Configuration Protocol) and SMTP (Simple Mail Transfer Protocol).
• Transport Layer: Provides the application layer with communication services such as flow control, error control and information about the connection between two communicating hosts. Furthermore, it provides multiplexing over IP addresses, namely ports. Finally, but not exclusively, it provides congestion control and recovery in networks prone to congestion (see subsection 2.1.5). The core protocols of this layer are the TCP (Transmission Control Protocol) and the UDP (User Datagram Protocol).
• Internet Layer: The layer that is responsible for addressing, packaging and routing functions. It supports delivery across the internetwork, as well as relating physical addresses to logical addresses in the network. In essence, it consists of specifications that provide functionality to transport datagrams across network boundaries. Some core protocols in the Internet layer are the IP (Internet Protocol), ARP (Address Resolution Protocol), ICMP (Internet Control Message Protocol) and IGMP (Internet Group Management Protocol).
• Network Access Layer: Provides details of how data is physically sent through the network and defines an interface with the physical network. It is responsible for placing TCP/IP packets on the network medium and addresses data for the subnet based on physical hardware addresses.
A host typically implements at least one protocol from each layer. Each component layer invokes the services it needs to perform its role in the stack. In a typical outgoing transmission from a source host, the packet travels down through the layers, where each layer includes relevant information in the packet header along with the data. Figure 2.1 shows how the data flows through the layers when two applications communicate. At each layer the data packet gets encapsulated with a new header until it is sent as bits on the wire by the physical layer. Figure 2.2 shows how the process works.
On the receiving end, the data packet moves up through the layers, and each layer unpacks the part of the packet relevant to its role. For instance, the Internet layer uses the information in the Internet layer header. Because of this modular design, the packets created at each layer have their own names. The packet at the application layer is called a message, on the transport layer it is called a segment, on the Internet layer it is called a datagram and on the network access layer it is called a frame [5][7].
Figure 2.2: At each layer, the data is repackaged with that layer’s header [7].
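The encapsulation and decapsulation steps described above can be illustrated with a minimal sketch. The header tags below are hypothetical placeholders chosen for readability; real headers are binary structures, and the function names are our own and not part of any networking API.

```python
# Placeholder header tags (assumption): real headers are binary, not text labels.
LAYERS = [
    ("transport", b"[TRANSPORT-HDR]"),    # message  -> segment
    ("internet", b"[INTERNET-HDR]"),      # segment  -> datagram
    ("network access", b"[LINK-HDR]"),    # datagram -> frame
]


def encapsulate(message: bytes) -> bytes:
    """Walk down the stack: each layer prepends its own header."""
    unit = message
    for _name, header in LAYERS:
        unit = header + unit
    return unit


def decapsulate(frame: bytes) -> bytes:
    """Walk up the stack: each layer strips the header addressed to it."""
    unit = frame
    for name, header in reversed(LAYERS):
        if not unit.startswith(header):
            raise ValueError(f"missing {name} header")
        unit = unit[len(header):]
    return unit
```

Calling `encapsulate(b"hello")` yields the message wrapped in three headers, outermost first, and `decapsulate` returns the original message on the receiving side.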
2.1.3 The Open Systems Interconnection Model
The Internet protocol stack is often seen in relation to the Open Systems Interconnection Model (OSI model), which is a seven-layer conceptual model that serves as a framework for the standardization of communication between computer systems and protocols. Although not directly used as part of our thesis, it serves as a great context when talking about data communication in general. It was designed by the International Organization for Standardization (ISO), and was an effort to promote interconnectivity and open access to protocol standards for software developers. The OSI model consists of the following layers: The application layer, presentation layer, session layer, transport layer, network layer, data link layer and the physical layer.
The OSI model predates the modern Internet by several years, but the published reference model was influential in the development of the protocols used in the TCP/IP stack. This has led to similar terminology being used between the models, and even though the models do not conform directly we can still see OSI concepts being incorporated into the TCP/IP layers, with some of the OSI layers merging into individual layers in the traditional stack. One could argue that the subdivision of layers in the OSI model adds complexity, but the more precise distinction of services may make the development of software more approachable for developers [7]. It was believed that the OSI model was going to be the standard for data communications, but it never happened. The TCP/IP protocol suite became the commercial architecture due to its extensive testing in the Internet. Hence, the OSI model was never fully implemented [12]. However, it is still a great tool of reference when discussing data communication.
Figure 2.3: The seven-layer OSI model [7].
Figure 2.3 shows how the layers of the TCP/IP standard typically translate into the OSI layers. It is worth noting that the TCP/IP model and the OSI model are standards, not implementations. The actual implementations of these might differ slightly from the reference models, and the exact relationship depicted in Figure 2.3 is also a subject of discussion. However, the Internet (network) layer and the transport layer, in which the IP and TCP are relevant, remain the same.
2.1.4 User Datagram Protocol
The principal protocols used for end-to-end transportation of data in the Internet are the User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP). Although most of the research on thin streams concerns TCP and how its mechanisms hinder the full potential of thin stream applications (see subsection 2.2.1), thin streams can be found in applications using UDP as well. These are typically tick-based games, such as Overwatch, Open Arena or the Counter-Strike franchise. In other words, UDP is still relevant to the topic of thin streams, as it is used by many latency-sensitive and interactive applications that produce thin streams of data. As part of our thesis, we will be using an interactive tick-based application which inherently uses UDP.
Following is an examination of the services and intricacies provided by the two major transport protocols, starting with the User Datagram Protocol.
The User Datagram Protocol is defined in RFC-768, and is a minimal protocol which uses a simple connectionless transmission model in order to communicate between hosts in the Internet [25]. It is stateless and message-oriented, and does not offer in-order delivery or reliability. Furthermore, it does not offer congestion control, which means that UDP streams can contribute to congestion when used for data-intensive applications. It is not unusual for Internet Service Providers (ISPs) to block UDP because of its lack of congestion control and because of its ability to broadcast messages, which can sway developers toward other protocols. However, it is not unusual for applications that use UDP to have a reliable transport protocol as a fallback if it is blocked by a firewall [23]. UDP is more of a general-use protocol: it does not retransmit lost packets, nor does it do any connection setup, which means no setup overhead. The fact that there is no handshaking between the sending and receiving transport-layer entities is what defines UDP as a connectionless protocol. The packets are sent as they are delivered from the application layer, giving the sender control of the transmission timing [25, 23].
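The connectionless model can be seen in a minimal sketch using Python's standard socket module. The loopback address and the function name are our own choices for illustration; binding to port 0 asks the operating system to pick any free port. Note that the sender never performs a handshake: the first packet on the wire already carries application data.

```python
import socket


def udp_roundtrip(payload: bytes, timeout_s: float = 2.0) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
        receiver.settimeout(timeout_s)    # UDP gives no delivery guarantee
        # No connect() or handshake: the datagram is sent immediately.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

If the datagram were lost, `recvfrom` would raise `socket.timeout` after the timeout, and any retransmission would be the responsibility of the application.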
The applications that profit from UDP are often those that wish to use communication services that are not available from TCP (e.g., multicast or broadcasting), those that do not require the level of service provided by TCP, or those that need to be in control of the sending rate. These are typically multimedia and interactive applications where data loss is considered acceptable, such as streaming services or fast-paced online games that require high refresh rates in order for the game to feel immersive and responsive. For instance, the First Person Shooter (FPS) game Counter-Strike, created by Valve and running on their Source engine, uses the UDP protocol. In order to preserve the fluidity of the game, such games would rather drop packets than rely on built-in reliability and in-order delivery, which could increase latency. However, these services may be implemented on the application layer if needed [35, 16, 24].
Another example is the Domain Name System (DNS), whose application-level protocol typically uses UDP [14]. The DNS application constructs a query message and passes it to UDP. UDP adds its header and forwards the resulting segment to the network layer, where it is encapsulated in a datagram and sent to the name server. The time saved by small packets (compared to TCP), the absence of a handshake, and the fact that no state information needs to be maintained on either side of the connection make the operation fast. If the request does not get a response, the application tries another name server or informs the user that the query failed. The use of UDP allows a name server to handle a much larger number of requests efficiently. More generally, UDP suits applications that need control over their own sending rate, which TCP's congestion control does not allow, and that can tolerate a degree of data loss.
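To give a sense of how small such a query is, the following sketch builds a standard DNS question for an A record by hand. The hostname and the query ID are arbitrary example values, and the function name is our own; sending the message and handling the retry against another name server are left out.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message asking for the A record of hostname."""
    # Header: ID, flags (only "recursion desired" set), one question,
    # and zero answer, authority and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.strip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE = A, QCLASS = IN
    return header + question
```

For `example.com` the message is 29 bytes long, so the complete UDP segment, including the 8-byte UDP header, is 37 bytes.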
2.1.4.1 UDP Header Structure
The UDP header, shown in Figure 2.4, consists of four fields, each of which is 16 bits.
They are respectively Source port, Destination port, Length, and Checksum.
The source port and the checksum are optional fields in IPv4, although only the source port is optional when using IPv6.
• Source and destination port (16 bits each): The communication endpoints for sending and receiving devices. More specifically, it is the transport layer’s way of knowing which application needs the data. However, if the source port is omitted, a zero will be inserted in this field.
• Length (16 bits): The length field specifies the length of the UDP segment, including the header, in bytes. The minimum value of this field is 8, as this is the size of the header. An explicit length field is necessary because different segments may have data fields of different sizes.
• Checksum (16 bits): The checksum is used for error detection. In essence it makes it possible to determine whether or not the bits within the UDP segment were altered as it moved from the source host to its destination. This could for instance happen due to noise in the links or while it was stored in a router [14]. At the receiver, the one’s-complement sum of the segment, including the checksum field, should consist of all ones; any other result indicates an error. The field carries all zeros if the checksum is unused.
Figure 2.4: The UDP segment structure [25].
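The header layout and the checksum rule above can be illustrated with a small sketch. It is not taken from any particular implementation; the function names, addresses and port numbers are chosen for illustration only. The sketch assumes IPv4, where the checksum also covers a pseudo-header consisting of the source address, the destination address, a zero byte, the protocol number (17 for UDP) and the UDP length.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Complement of the one's-complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_udp_segment(src_ip: bytes, dst_ip: bytes,
                      src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Build header + payload; the IP addresses are 4-byte IPv4 addresses."""
    length = 8 + len(payload)                # 8-byte header plus data
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    # A computed value of zero is sent as all ones, since zero means "unused".
    checksum = internet_checksum(pseudo + header + payload) or 0xFFFF
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload
```

A receiver can validate a segment by running `internet_checksum` over the pseudo-header followed by the received segment; a result of zero means that the one's-complement sum was all ones and no error was detected.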
2.1.5 Transmission Control Protocol
The Transmission Control Protocol (TCP) was first defined by RFC-793 in 1981 and later elaborated upon in RFC-1122. It is a reliable and stream-oriented transport protocol able to deliver full-duplex (i.e., bidirectional) streams of octets across nodes in an Internet Protocol (IP) network, and is a fundamental part of the Internet Protocol Suite. TCP is widely deployed and is normally the choice of protocol for services that require reliable connections with guaranteed, error-free and in-order delivery of data. Its reliability is assured, in short, by an acknowledgment scheme in which the receiver sends an acknowledgment back to the sender for the data it has received.
Several improvements have been made to the protocol and its congestion control algorithms since its inception, and there are currently many viable versions in use (e.g., Reno, New Reno, Vegas, Cubic), with New Reno being somewhat of a baseline for the new generation of TCP algorithms [26, 5].
TCP is also widely supported due to its popularity. For instance, HTTP and HTTPS both use TCP, and TCP is nearly always allowed through Internet Service Provider (ISP) firewalls, which makes it the natural choice of protocol in many environments [13]. The developer is left with few transport protocol options besides the established UDP and TCP, as experimental protocols either lack support, which makes them undesirable to businesses, or are commercially unavailable altogether. Furthermore, other protocols, such as the Stream Control Transmission Protocol (SCTP), rely on congestion control frameworks that were developed for TCP [23].
TCP is generally the choice of protocol for applications that require services outside the scope of UDP, and especially for applications that require full reliability.
However, the full spectrum of services provided by TCP stretches beyond that of reliability, and addresses several needs in areas such as security and data integrity.
The services provided by TCP include [23]:
• Reliability: Guarantees that data is delivered by retransmitting lost data until successful delivery.
• In-order delivery: Data, represented as a stream of bytes, is delivered to the receiver in the same order as it was sent. Additionally, data segments are delivered in the sending order.
• Congestion control: TCP implements back-off algorithms to enable the sharing of network capacity when congestion is detected. It limits the sending rate based on the perceived network congestion.
• Flow Control: The sender does not transmit more data than the receiver has capacity to receive.
• Error control: Detects transmission errors by checking the checksum field of the packet header for corrupted segments, and then either discarding or allowing the packet based on the result. The protocol also implements error checking in the form of acknowledgments for confirmation of data, as well as the retransmission schemes for lost packets.
Figure 2.5: TCP header control bits [16].
2.1.5.1 TCP Header Structure
The IP protocol contains information about the source and destination addresses for the communicating hosts, and is responsible for bringing the packet from host A to host B. A TCP segment encompasses a chunk of bytes from an active data stream together with a TCP header containing information specific to the TCP protocol, and is encapsulated in an IP datagram before it is sent across the wire. Figure 2.6 shows the structure of the TCP header, which is as follows [26, 5, 16]:
• Source and destination port (16 bits each): The communication endpoints for sending and receiving devices. More specifically, it is the transport layer’s way of knowing which application needs the data.
• Sequence number (32 bits): The index of the first byte in this segment. In cases where the SYN control bit is set (e.g., when a host wants to initiate a TCP connection), the sequence number is the initial sequence number (ISN) and the first data octet is ISN + 1.
• Acknowledgement number (32 bits): The sequence number of the next expected byte.
• Window size (16 bits): The number of bytes still free in the receive buffer.
• Checksum (16 bits): Field used for error-checking of the header and data, which can be used to detect errors in transmission. At the receiver, the one’s-complement sum of the segment, including the checksum field, should consist of all ones.
Figure 2.6: TCP header structure [24].
• Data Offset (4 bits): The number of 32 bit words in the TCP header, which indicates where the data begins.
• Reserved (6 bits): Reserved for future use and should be zero. Its purpose is to align the total header size as a multiple of four bytes.
• Flags (6 bits): Flags, also known as control bits, that convey information about the stream. There are six standard bits, each representing on or off, and later extensions have taken additional bits from the reserved field. Explicit Congestion Notification (ECN), a congestion notification technique, is one such extension. Figure 2.5 lists the control bits and their meaning.
• Urgent pointer (16 bits): The (positive) byte offset from the current sequence number where the urgent data of the segment begins.
• Options (variable): Options are variable in size and included in the checksum. They occupy the space at the end of the TCP header and are a multiple of 8 bits in length. Examples of options are Selective Acknowledgments and TCP time stamp [16].
• Data: The packet payload which carries the data for the application. It is not part of the header.
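As an illustration of the fields listed above, the following sketch unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The function name and the dictionary keys are our own choices for the example, and only the six standard control bits are decoded.

```python
import struct

# Bit masks for the six standard control bits.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options and payload."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = off_flags >> 12            # header length in 32-bit words
    header_len = data_offset * 4             # header length in bytes
    bits = off_flags & 0x3F                  # low six bits hold the flags
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "data_offset": data_offset,
        "flags": [name for name, bit in FLAG_BITS.items() if bits & bit],
        "window": window, "checksum": checksum, "urgent": urgent,
        "options": segment[20:header_len],   # empty when data_offset is 5
        "payload": segment[header_len:],
    }
```

The data offset is what lets the receiver find the start of the payload when options are present, since the header then grows beyond 20 bytes.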
2.1.5.2 Data Communication
As mentioned in subsection 2.1.2, a packet on the transport layer is often referred to as a segment. A TCP segment consists of a (fixed) 20 byte header (additional bytes can be included as options in the TCP header, subsubsection 2.1.5.1) followed by the data payload from the host process. In a TCP session, the host process sends a stream of bytes, where each byte has a sequence number associated with it [23, 16]. The function of the sequence number is to keep track of the number of bytes the stream has transferred thus far, and it is incremented for each byte that is sent. The stream is divided into chunks of bytes, each of which forms the payload of a TCP segment. Every segment is acknowledged by the receiver by returning a segment with the ACK flag set and the acknowledgment number set to the next expected byte (i.e., the expected sequence number of the next segment). This allows the stream to be full-duplex, as every segment header contains a sequence number and the acknowledgment number corresponding to the previously received byte + 1.
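The relationship between the byte stream, the sequence numbers and the acknowledgment numbers can be sketched as follows. The helper names are hypothetical, and the sketch assumes that the first data byte is numbered ISN + 1 and that sequence numbers wrap around at 2^32, since the field is 32 bits wide.

```python
SEQ_SPACE = 2 ** 32   # sequence numbers are 32 bits wide and wrap around

def segment_stream(data: bytes, isn: int, mss: int) -> list:
    """Split a byte stream into (sequence number, payload) pairs."""
    seq = (isn + 1) % SEQ_SPACE              # first data byte is ISN + 1
    segments = []
    for start in range(0, len(data), mss):
        chunk = data[start:start + mss]
        segments.append((seq, chunk))
        seq = (seq + len(chunk)) % SEQ_SPACE # advance by the bytes sent
    return segments

def ack_number(seq: int, payload: bytes) -> int:
    """Acknowledgment number for a received segment: the next expected byte."""
    return (seq + len(payload)) % SEQ_SPACE
```

For example, a 2500-byte stream with an initial sequence number of 100 and a maximum payload of 1000 bytes yields segments starting at 101, 1101 and 2101, which are acknowledged with 1101, 2101 and 2601 respectively.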
Segments can be lost in transmission due to factors such as congestion, problems with network devices or software issues. When this happens, the segment has to be retransmitted [26]. The exact way in which these scenarios are handled differs slightly across different TCP variations. When a segment is sent on a TCP connection, there is a timer set for that segment. This timer is referred to as the retransmission time-out period (RTO period), and it acts as a time limit for when the segment needs to be retransmitted.
If the sender has received an ACK for a specific segment before the RTO period has expired, then the timer is canceled and the next segment is sent. However, if the timer expires without the sender receiving an ACK then the segment is retransmitted. For instance, if the sender wants to send 4 segments with sequence numbers 1 to 4 and the segment with sequence number 2 is dropped, then the receiver would keep requesting the second segment by returning ACK segments (duplicate acknowledgments) with the acknowledgment number set to 2. Once the RTO timer expires for that packet it will be retransmitted along with all subsequent packets [23]. However, newer TCP versions (e.g., New Reno) implement fast retransmission, an algorithm that makes retransmissions more effective in certain scenarios. How RTOs and retransmissions work in relation to thin streams is especially important in understanding the problem area of thin streams and TCP, which will be elaborated upon in subsection 2.2.1.
The basic way in which TCP acknowledges packets is by using cumulative acknowledgments. The receiver sends an ACK for the highest in-order data segment that has been delivered, which allows the sender to assume that every byte up until the acknowledged byte has been delivered successfully. This functionality makes it possible for the receiving host to delay its ACK segments if it deems it fit [5]. This is referred to as delayed acknowledgments. Its purpose is to allow the receiving host to acknowledge several segments with one ACK, hence reducing the number of packets traversing the network as well as the processing load associated with generating packets. However, the ACK delay must not exceed 500 ms as this would interfere with packet loss mechanisms in TCP, and the ACK delay timer is set to 200 ms in Linux to be on the safe side [23]. Additionally, in cases where the stream carries full-sized segments, there should be an ACK for at least every second segment [5]. Figure 2.7 shows how delayed ACKs work with cumulative packets and single packets breaching the 200 ms window [23].
Note that in reality the ACK segment would acknowledge with the next expected byte, but for simplicity, the diagram shows which segments are acknowledged.
Figure 2.7: Delayed Acknowledgments.
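The behaviour described above can be approximated with a small simulation. This is a simplified sketch and not how any particular TCP stack implements the mechanism: it assumes that all segments arrive in order, that the receiver acknowledges every second segment immediately, and that a lone segment is acknowledged when the 200 ms delay timer expires. The function name and the representation of an ACK as a (time, segment count) pair are our own.

```python
def delayed_acks(arrivals_ms: list, delay_ms: int = 200) -> list:
    """Return (time, segments acknowledged so far) for each ACK sent."""
    acks = []
    pending_since = None                     # arrival time of an unacked segment
    for count, t in enumerate(arrivals_ms, start=1):
        if pending_since is not None and t - pending_since >= delay_ms:
            # The delay timer expired before this segment arrived.
            acks.append((pending_since + delay_ms, count - 1))
            pending_since = None
        if pending_since is None:
            pending_since = t                # start the delay timer
        else:
            acks.append((t, count))          # second segment: ACK at once
            pending_since = None
    if pending_since is not None:            # timer for the last segment
        acks.append((pending_since + delay_ms, len(arrivals_ms)))
    return acks
```

With arrivals at 0, 50 and 400 ms, the first two segments are acknowledged together at 50 ms, while the third waits for the timer and is acknowledged at 600 ms. For a thin stream, where segments are spaced far apart, nearly every ACK is delayed in this way.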
Selective Acknowledgments
A problem with cumulative ACKs is that the sender does not know which segments have been successfully delivered or lost. A TCP sender can only learn about one lost segment per round-trip time (RTT), and even if the sender chooses to aggressively retransmit packets early, there is a chance that those segments have already been received. In other words, TCP streams can, by only using cumulative ACKs, experience a reduction in throughput when there are multiple dropped segments.
This problem was addressed by RFC2018, in which the Selective Acknowledgment (SACK) was introduced [17]. SACK can be included in the TCP header options and tells the sender which specific segments have been received successfully, which allows the sender to only retransmit the segments that were lost in transmission. SACKs must be supported by both endpoints and the use of SACKs is negotiated in the initial setup of the TCP connection (i.e., the three-way handshake, subsubsection 2.1.5.4).
In 2000, the Network Working Group proposed an extension to SACK: Duplicate Selective Acknowledgments (DSACK) [11]. By using the first block of the SACK option field to inform the sender about the sequence numbers of the segment that triggered a specific ACK, it allows the sender to infer the order of packets received at the receiver. The information can be used by the TCP sender to infer when it has unnecessarily retransmitted a packet, which it can use to assess its behavior in relation to loss, packet replication or timeouts.
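To make the benefit of SACK concrete, the sketch below derives the holes a sender can infer from one acknowledgment. It assumes that each SACK block is given as a pair of a left edge (the first sequence number of the block) and a right edge (the sequence number immediately following the block); the function name is hypothetical.

```python
def sack_holes(cumulative_ack: int, sack_blocks: list) -> list:
    """Return the [start, end) byte ranges missing at the receiver."""
    holes = []
    cursor = cumulative_ack                  # everything below this is delivered
    for left, right in sorted(sack_blocks):
        if left > cursor:
            holes.append((cursor, left))     # gap before this received block
        cursor = max(cursor, right)
    return holes
```

With a cumulative acknowledgment of 1000 and SACK blocks covering 2000 to 3000 and 4000 to 5000, the sender learns about both missing ranges, 1000 to 2000 and 3000 to 4000, from a single ACK, whereas cumulative ACKs alone would only reveal the first.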
2.1.5.3 Flow Control
Flow control is not to be confused with congestion control, although the services overlap to some extent. Whereas congestion control is concerned with network congestion, flow control is associated with mechanisms controlling the sending rate of data on a TCP connection so that the receiver is not overwhelmed. As mentioned in subsubsection 2.1.5.1, the window size field in the TCP header contains information about the number of bytes still free in the receive buffer [26, 2]. This is called the receive window (rwnd), and is conveyed to the sender by the receiving host. TCP does not send one individual segment and wait for an ACK before sending the next one; rather, it sends a window of segments.
The information from the rwnd field tells the sender how many segments it can send before expecting an ACK (i.e., the amount of outstanding data on the receiver side).
For instance, if the receiver advertises a buffer of 3 x MSS (Maximum Segment Size), then the sender will transmit 3 segments worth of bytes before awaiting the ACK [23]. By adjusting the rwnd field in its ACK segments, the receiving host can notify the sender if it gets overwhelmed, which in turn would prompt the sender to reduce its sending rate. The dynamic of this communication is called a sliding window protocol. When packets arrive successfully and are consumed by the receiving application, the receiving host can advertise additional space in the receive buffer, hence "sliding" the window, which speaks to the origin of the name.
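The 3 x MSS example can be expressed as a short calculation. The sketch below is a simplification that ignores congestion control and only considers the receive window; the function and parameter names are our own, and the MSS of 1460 bytes in the usage note is an assumed, typical value.

```python
def usable_window(rwnd: int, last_byte_sent: int, last_byte_acked: int) -> int:
    """Bytes the sender may still transmit without overrunning the receiver."""
    in_flight = last_byte_sent - last_byte_acked   # sent but not yet acked
    return max(0, rwnd - in_flight)
```

With an MSS of 1460 bytes and an advertised window of 3 x MSS = 4380 bytes, the sender may transmit three full segments. After that, `usable_window(4380, 4380, 0)` is 0 and the sender must wait. When the first segment is acknowledged, `usable_window(4380, 4380, 1460)` is 1460, so the window has slid forward by one segment.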
2.1.5.4 Three-way Handshake
TCP is connection-oriented, meaning that two applications using TCP must first establish a connection before exchanging data. A three-way handshake (also known as SYN, SYN-ACK, ACK) is initiated to establish a TCP connection between two hosts [26]. This process involves the synchronization of the hosts’ initial sequence numbers, and is designed to allow both endpoints to negotiate the parameters of the TCP sockets before transmitting data. The sequence number is included in the TCP header for each segment that will be sent during the session and is relevant in realizing services such as in-order delivery of data. One of the reasons why a three-way handshake is necessary is that the sequence numbers are not tied to a global clock in the network, hence the receiver has no way of knowing whether or not the sequence number received from the initiating host was old (e.g., from a disconnected session) or delayed unless it has the previous sequence number used on the connection.
This is solved by having both endpoints acknowledge the synchronization packets before establishing the connection. Figure 2.8 illustrates the process, and it can be described as follows:
The sender (host A) sends a packet to the receiver (host B) with the SYN flag (see Figure 2.6) enabled, informing it that host A will use sequence numbers starting with, in this example, sequence number 100 [26]. Note that the sequence number sent with the SYN packet is chosen randomly for security reasons. Upon receiving the packet, host B returns a packet with the SYN and ACK flags enabled, effectively acknowledging the sequence number from host A and indicating that it expects to hear sequence number 101. This packet also carries host B’s own initial sequence number.
Upon receiving the synchronize-acknowledgment, host A sends an acknowledgment back to host B, which establishes the connection.
Figure 2.8: Three-way handshake [26].
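The exchange can be written out as three messages. The sketch below only models the flags and the sequence and acknowledgment numbers, not real sockets; the dictionary representation is our own, and the initial sequence number of host B (300 in the usage note) is an assumed value, since the example above only specifies host A's.

```python
SEQ_SPACE = 2 ** 32   # sequence numbers wrap around at 2^32

def three_way_handshake(isn_a: int, isn_b: int) -> list:
    """Return the SYN, SYN-ACK and ACK segments exchanged by hosts A and B."""
    syn = {"from": "A", "flags": {"SYN"}, "seq": isn_a}
    syn_ack = {"from": "B", "flags": {"SYN", "ACK"}, "seq": isn_b,
               "ack": (isn_a + 1) % SEQ_SPACE}     # B expects A's next byte
    ack = {"from": "A", "flags": {"ACK"},
           "seq": (isn_a + 1) % SEQ_SPACE,
           "ack": (isn_b + 1) % SEQ_SPACE}         # A expects B's next byte
    return [syn, syn_ack, ack]
```

Calling `three_way_handshake(100, 300)` gives a SYN with sequence number 100, a SYN-ACK with sequence number 300 and acknowledgment number 101, and a final ACK with sequence number 101 and acknowledgment number 301.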
2.2 Thin Streams
In the modern Internet there is an abundance of interactive and time-dependent applications, ranging from online games to chat services and VoIP applications. The requirements for data delivery have changed from what they were in the early days of the Internet, and the data flows stemming from these services are different from what was common before. These applications produce thin streams of data, which are data streams identified by small packet payloads and high Inter-Transmission Times (ITT) [23]. Thin streams are application limited, meaning that the network traffic is decided by the application. For instance, the sending rate of packets can be determined by events in typical IoT structures such as sensor networks, or it can be decided by the periodicity of tick simulations in a tick-based application.
Streams are normally divided into two kinds: thin streams and greedy streams.
Greedy streams aim to consume as much bandwidth as possible, and instead of being limited by the application, they are limited by congestion control [24, 16].
Typical examples of greedy streams include downloads (HTTP, FTP) or streaming services such as YouTube or Twitch TV. Figure 2.9 shows packet statistics for a selection of thin stream and greedy stream applications. It illustrates how the network traffic generated by time-dependent and interactive applications has distinct characteristics. The table shows the distribution of packet payload sizes, packet interarrival time (IAT) and average bandwidth consumption in packets per second and kbps.
Figure 2.9: Packet statistics for a selection of thin stream and greedy stream applications [16].
2.2.1 Thin Streams and TCP
Real-time systems and interactive applications are subject to strict requirements when it comes to quality of service (QoS), especially in regard to latency. The users expect a certain degree of session continuity when using applications like Skype or when playing online games. The latency (round trip time) refers to the time between a client sending a user command, the server responding to it, and the client receiving the server’s response. The latency of a connection is decided by many factors, and there are a great number of things that can cause high latency. Some common causes are limits to the capacity of the ISP’s infrastructure during peak times, online application load in a typical client-server network, hardware problems or wireless interference [18]. For individuals at home, there could be router or modem problems, or malware on the client computer that is causing high latency. However, as we will discuss in this chapter, high latency can also be caused by incompatibility between technologies such as TCP and interactive applications. Lastly, as will be discussed in subsection 2.2.3, latency can be caused by packet loss due to full queues in tick-based applications.
Studies done by the ITU-T (International Telecommunication Union’s Telecommunication Standardization Sector) have shown that users of telephony and audio conferencing expect the latency to be less than 200 ms. Furthermore, experiments show that, depending on the game, the threshold is generally between 100 and 1000 ms for players of online games, and recent studies suggest that some players can perceive motor-visual delays as low as 26 ms [27, 23]. Consequently, providers of services such as online games should hold the importance of network latency in high regard when designing their applications and choosing their software. However, due to thin streams’ incompatibility with TCP, some users experience application delay that far exceeds the thresholds presented above in current systems.
The problem stems from how TCP handles network congestion. The protocol normally deploys one of two mechanisms when a packet drop occurs, depending on the traffic and the TCP variation. The first method is fast retransmit, where the sender tracks the number of duplicate ACKs (dupACKs) it gets for the same segment. If it receives three dupACKs it assumes that the following segment is lost and retransmits it.
Fast retransmit is a good way to recover from packet losses in greedy streams, but falls short when used with thin streams, as thin streams are not subject to congestion in the same way. With this said, thin streams are still able to cause congestion. [13] found that for thin streams caused by games, TCP does not back off, as the cwnd never grows beyond 1. Even though each client has a thin stream connection to the game server, there may be hundreds of concurrent streams which together can congest the network without reducing the sending rate. In other words, the segments are injected into the network randomly, or based on the tick rate of the application, without reacting to congestion. However, if there is repeated loss in the network, an RTO will occur and the streams will exponentially back off.
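The dupACK counting behind fast retransmit can be sketched as follows. This is an illustration of the rule described above rather than the logic of an actual TCP stack; the function name is hypothetical, and the threshold is a parameter so that the conventional value of three can be compared with other values.

```python
def fast_retransmit_triggers(acks: list, dupack_threshold: int = 3) -> list:
    """Return the ACK numbers for which a fast retransmit would fire."""
    triggers = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1                   # same ACK number seen again
            if dup_count == dupack_threshold:
                triggers.append(ack)         # retransmit the segment at `ack`
        else:
            last_ack, dup_count = ack, 0     # new data was acknowledged
    return triggers
```

A greedy stream with many segments in flight easily produces the sequence `[2, 2, 2, 2]` (one original ACK and three duplicates), which triggers a retransmission. A thin stream that only has one or two segments in flight after the loss produces at most `[2, 2, 2]`, so the threshold is never reached and the sender has to wait for a timeout instead.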
The second mechanism is the RTO, which is triggered when the sender does not receive three dupACKs in the time-window needed for fast retransmit to work.
The protocol interprets this as a sign of heavy congestion. Most TCP variations implement back-off algorithms for RTOs. More specifically, the algorithms decide the value of the RTO timer once the previous timer has expired. Conventionally, exponential back-off is enabled, which doubles the timer each time the RTO limit is breached for the same packet [19].
Figure 2.10: Fast retransmit [23].
Thin streams are normally subject to the latter, simply because the application does not send enough packets to trigger fast retransmissions. Because of this, thin streams are subject to all the penalties associated with timeout retransmission, such as head-of-line blocking (HOLB) in the receiver-side stack, congestion window collapse (not necessarily a problem if the rate is low enough) and exponential back-off after repeated packet loss [21]. Figure 2.10 shows the RTO timer as minRTO, and a typical thin stream does not have inter-transmission times (ITT, i.e., packet rate) low enough to be within this time frame. Since the protocol doubles the transmission timer when the same packet is lost several times in a row, we see an exponential increase in latency with repeated losses of the same segment. Because of this, thin streams can often be defined as a data stream where the ITTs are too high for fast retransmit to trigger.
Carsten Griwodz and Pål Halvorsen analyzed a one-hour packet trace from the servers of Funcom’s MMORPG Anarchy Online. Figure 2.11 illustrates the RTT versus maximum application delay, and shows that a user experienced a maximum delay of 67 seconds due to six consecutive retransmissions [13]. The analysis is based on a server side trace and does not distinguish between lost packets and lost ACKs, but is a good example of how thin streams can experience high delays. In order to keep acceptable and consistent latency thresholds in thin stream applications we need to find a way around TCP’s congestion control mechanisms and their dependency on continuous data to provide effective retransmission solutions.
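The effect of exponential back-off on latency can be shown with a short calculation. The sketch assumes that the timer is simply doubled after every expiry for the same segment and that no upper bound applies; the function name is our own, and the initial RTO of 200 ms in the usage note is an assumed example value.

```python
def backoff_retransmission_times(initial_rto_ms: int, attempts: int) -> list:
    """Time after the original send (ms) at which each retransmission fires."""
    times = []
    rto, elapsed = initial_rto_ms, 0
    for _ in range(attempts):
        elapsed += rto                       # wait for the current timer
        times.append(elapsed)
        rto *= 2                             # exponential back-off
    return times
```

With an initial RTO of 200 ms, six consecutive losses of the same segment give retransmissions at 200, 600, 1400, 3000, 6200 and 12600 ms. The sixth attempt alone thus adds more delay than all the previous attempts combined.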
2.2.2 Related Work
This section shows some of the related work on the topic of thin-streams. There is still a lot to be learned about the nature of thin-streams, and previous research studies have provided tools, theories and findings that will be directly or indirectly used in this thesis.
2.2.2.1 TCP Variations
In his master’s thesis, Espen Søgård Paaby tested different TCP variations in order to find how different TCP mechanisms influence thin-stream latency. He found no profound difference in latency between the TCP variations, but all variations showed the same tendency of high delay when retransmissions occurred. However, TCP New Reno had a better overall latency performance in several scenarios compared to the other variations [22].
Figure 2.11: RTT versus maximum application delay [13].
2.2.2.2 TCP Modifications
Linux has recently added alternatives to some of the TCP mechanisms that cause problems for thin streams. Andreas Petlund specifies two TCP modifications that can be made to reduce latency in cases where retransmissions are needed to recover lost data: thin fast retransmits and linear timeouts [24]. The former triggers the conventional fast retransmission when the first duplicate ACK arrives.
This may cause some spurious retransmissions (i.e., retransmissions that happen when there was no need to recover from loss), but leads to better overall latency.
Linear timeouts, a modification to the exponential back-off algorithm deployed in TCP, are enabled when a thin stream is detected (when there are fewer than 4 packets in flight). The retransmission multiplier does not double until six linear timeouts have occurred, at which point a packet that is not recovered is most likely dropped due to heavy congestion. The exponential back-off is then resumed, as the modified version does not help in this scenario. Tests done on servers running Age of Conan show greatly improved latencies when second or third retransmissions occur [24].
However, as pointed out by Opstad et al., these reactive mechanisms could still potentially provide high maximum latencies [21].
Proactive and predictive techniques that hide packet loss from the TCP flow and congestion control algorithms, for instance forward error correction (FEC), have also been attempted. Although a good idea, they still rely on both receiver and sender side modifications, which have proved hard to deploy in today’s networks [21].
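The linear timeout modification can be sketched in the same way as ordinary back-off. This follows the description above, where the timer is kept constant for the first six timeouts and doubled afterwards, and is not derived from the Linux source code; the function name and the initial RTO of 200 ms in the usage note are assumptions.

```python
def linear_timeout_times(initial_rto_ms: int, attempts: int,
                         linear_limit: int = 6) -> list:
    """Retransmission times (ms) with linear timeouts for a thin stream."""
    times = []
    rto, elapsed = initial_rto_ms, 0
    for attempt in range(1, attempts + 1):
        elapsed += rto
        times.append(elapsed)
        if attempt >= linear_limit:
            rto *= 2                         # resume exponential back-off
    return times
```

With an initial RTO of 200 ms, the third retransmission is sent after 600 ms, compared with 200 + 400 + 800 = 1400 ms under pure exponential back-off. From the seventh attempt onward the timer doubles again (1600 ms, 2400 ms, and so on), which is why high maximum latencies are still possible.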
2.2.2.3 Aggressiveness in Thin Streams
Jonas Sæther Markussen implemented a controlled test environment in order to study the trade-off between aggressiveness and fairness in thin streams. The goal was to find the best metrics for aggressiveness in order to improve thin stream performance while still being fair towards other traffic. The findings suggested that an increase in the aggressiveness for retransmissions in thin TCP streams is justifiable in competition with greedy streams [16].
The discovery of the clustering effect caused by thin streams in tick-based applications, described in subsection 2.2.3, is also a motivation for this thesis.
2.2.3 Clustering Effect
A 2014 study unearthed a possible clustering effect in thin streams produced by tick-based applications, which causes the classic high-latency issues with thin streams in cases where the application runs small network queues [16]. They implemented a kernel module called aqmprobe, which allowed them to inspect router queues in real time. Using a kernel probe, it attaches itself to a pfifo (packet-based first in, first out) entry point. From here it can collect metadata about packets with information about the stream, packet size and whether or not the packet was dropped. They experienced a clustering of drop events on roughly every ITT for the thin streams (in a testbed with 40 thin stream TCP connections).
This could be explained by the fact that the experiment had used an ITT variance that was too low, consequently causing the thin streams to cluster together [16].
However, changing the ramp-up time did not solve the problem. The study reasons that this was because the ITT is periodic and smaller than an RTT, and a change in ramp-up time does not necessarily spread out over the entire ITT interval.
When reducing the queue size to half of its original size, the loss rates for approximately half of the TCP streams increased [16]. The reason for this is that they transmit periodically, which causes each thin stream to end up in roughly the same spot in the clustering for every ITT. In conclusion, some streams consistently deliver packets due to the fact that they are enqueued properly, whilst other streams consistently hit a full queue, which results in dropped packets.
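The mechanism can be reproduced in a toy model. The sketch below is not the testbed from the study; it assumes a single drop-tail queue served at a fixed rate, streams that send one packet per ITT at a fixed offset, no retransmissions, and a queue capacity that counts the packet in service. All names and numbers are illustrative, and times are integers in microseconds.

```python
def simulate_drops(offsets_us: list, itt_us: int, queue_capacity: int,
                   service_us: int, ticks: int) -> list:
    """Count dropped packets per stream for periodic senders sharing a queue."""
    drops = [0] * len(offsets_us)
    arrivals = sorted((tick * itt_us + offset, stream)
                      for tick in range(ticks)
                      for stream, offset in enumerate(offsets_us))
    departures = []                          # departure times of queued packets
    for t, stream in arrivals:
        departures = [d for d in departures if d > t]   # remove sent packets
        if len(departures) >= queue_capacity:
            drops[stream] += 1               # queue is full: tail drop
            continue
        start = departures[-1] if departures else t     # wait behind the queue
        departures.append(start + service_us)
    return drops
```

Six streams with offsets of 0 to 500 microseconds, an ITT of 30 ms and a service time of 1 ms lose nothing with a queue of six packets. When the queue is halved to three packets, the result over ten ticks is `[0, 0, 0, 10, 10, 10]`: the same three streams are dropped on every tick, while the other three never experience loss.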
3 Tick-Based Applications
Interactive applications can be tick-driven, meaning that they produce data at regular time intervals. Tick-based applications manifest themselves in a variety of systems, including time-sliced virtualization applications, sensor networks and games. When tick-based applications communicate over TCP, they are not limited by congestion control, but the sending rate is controlled by the periodicity of the data simulations produced by the application. The rate of these data simulations is called the tick rate and can be defined as the rate at which the application server runs simulation steps.
The data stemming from tick-based applications are characterized as thin due to the distinct sending pattern. For instance, in tick-based online games, both the client and the server send packets of information at fixed intervals in order to accurately portray the game state for all connected clients. As a matter of fact, we will be using a tick-based multiplayer FPS game as the use-case for this thesis.
Some flows can experience consistently higher loss rates due to the nature of how tick-based applications send data, which could degrade the experience of the application for the user (see subsection 2.2.3) [16]. The responsiveness of the application is determined by several factors, including server load, tick rate and data rate. However, the most decisive factor is the latency. The gaming industry has developed techniques to compensate for differences in latency between clients by implementing algorithms for prediction and lag compensation [36]. However, these techniques primarily target differences in client capacities, and little is known of whether these implementations can work as mitigation for underlying thin stream problems.
3.1 Application Types
There are a variety of applications that are tick-based, or rather, they depend on a clock of some sort for transmission.
3.1.1 Sensor Networks
Wireless Sensor Networks are often brought up when talking about tick-driven applications and systems. As the name suggests, sensing refers to the technique of gathering information about a physical object or process, such as different occurring events [8].
The objects that collect these data are called sensors, which are small devices with a narrow area of use. They often mainly consist of a sensing component and a communication component. When many sensors collectively monitor large environments, it is called a sensor network. The different devices are spatially dispersed and communicate with each other and/or with a networked access point (AC) [28]. The AC is often referred to as a base station, and works as a portal that accesses processing, visualization, analysis and storage systems through the Internet, as shown in Figure 3.1.
Sensor networks can consist of a thousand devices that each have one or more connections that send data whenever an event occurs, such as a fire alarm detecting smoke, or at clock-based intervals [8]. These produce thin streams of data, or short connections at random or regular intervals. For example, a sensor network can be deployed as a smart house solution, where the different devices send data at regular intervals (in the form of ticks) to the centralized hub, which in turn calculates adjustments based on the data that is collected. By way of example, a sensor could be collecting data about temperatures or other environmental data. Other examples of sensor networks include health care monitoring, such as body position measurement or heart-rate monitors, or sensors monitoring the water quality of a river.
Figure 3.1: Wireless Sensor Network [8].
3.1.2 Online Games
Online games are often tick-driven. Examples of tick-driven games are World of Warcraft, the Counter-Strike franchise, and Overwatch. The reason why they are designed this way is to be able to have a consistent view of the game world for all the clients. Decision making and world changes are done at the server, and often these changes are based on actions that the player has done in their client (such as starting an event, or shooting someone).
The client talks to the server, informing it about the actions of the user. The server can then calculate the consequences of those actions, and the changes are propagated to all the affected players that are connected to the server. These updates are sent out in ticks, at a rate which is specified on the server. This process is outlined in section 3.2 and section 3.3.
3.2 Ticks
A tick, often referred to as a program loop, is a networking term which refers to the number associated with each simulation step that is broadcast to clients to help them synchronize with the server. The term is often associated with games, and is typically deployed in a server-client architecture. A high tick rate gives a more accurate view of the game world for each client that is connected to the dedicated host that runs the game [10, 36]. The concept of tick based applications is nothing new, and is seen in applications other than games as well. Given that the server is forced to transmit on the tick schedule to maintain a proper service, it is important that developers understand how varying network conditions might affect the transmissions, and whether negative effects caused by the periodicity and order of flows that transmit will arise.
Conventionally, during each simulation step, the server processes incoming user commands, each of which is a sample of a player’s current keyboard and mouse state. It then runs a physical simulation step, checks the game rules, and updates all object states. After simulating a tick, the server checks the connected clients and decides whether they need a world update, in which case the server takes a snapshot of the current world state [36]. A higher tick rate increases the accuracy of the game representation for the user, meaning that the player experiences changes closer to when they happen rather than being several frames behind. However, increasing the server tick rate does not come without hardware and networking costs, as a high tick rate requires more available bandwidth and CPU power on both the client and the server [36].
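The steps of a simulation step described above can be sketched as a small fixed-rate loop. This is a minimal illustration under our own assumptions, not the Source engine implementation: the `World` class, the one-dimensional positions, the use of a velocity as the "user command", and the `get_commands` and `send_snapshot` callbacks are all hypothetical names introduced for the example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class World:
    """Hypothetical world state: one velocity and one x position per player."""
    tick: int = 0
    velocities: dict = field(default_factory=dict)
    positions: dict = field(default_factory=dict)


def simulate_tick(world, commands, dt):
    """Advance the world by one tick and return a snapshot of its state."""
    # 1. Process incoming user commands. In this sketch a command is simply
    #    the velocity sampled from the player's input (an assumption).
    world.velocities.update(commands)
    # 2. Physical simulation step: integrate positions over the tick interval.
    for player, velocity in world.velocities.items():
        world.positions[player] = world.positions.get(player, 0.0) + velocity * dt
    # 3. Game rules and further object-state updates are omitted in this sketch.
    world.tick += 1
    # 4. Take a snapshot of the current world state, tagged with the tick number.
    return {"tick": world.tick, "positions": dict(world.positions)}


def run_server(tick_rate, num_ticks, get_commands, send_snapshot):
    """Run num_ticks simulation steps, paced at tick_rate ticks per second."""
    dt = 1.0 / tick_rate
    world = World()
    next_deadline = time.monotonic()
    for _ in range(num_ticks):
        send_snapshot(simulate_tick(world, get_commands(), dt))
        # Sleep until the next tick boundary so that the rate stays fixed
        # even if individual ticks take a varying amount of time to simulate.
        next_deadline += dt
        time.sleep(max(0.0, next_deadline - time.monotonic()))
    return world
```

For instance, `run_server(128, 3, lambda: {"p1": 128.0}, print)` simulates three ticks at 128Hz and prints one snapshot per tick. A real server would additionally decide per client whether a snapshot is due, as described above.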
Depending on the client specifications, increasing the tick rate would not necessarily improve the user experience, as the player can only receive a set amount of data based on their bandwidth. A server sending more data than the client and/or the client path can handle would result in packet loss. As Valve developers specify in their article about multiplayer networking, the most important networking variables for clients, at least in games running on the Source engine (the game engine developed by Valve), are their incoming bandwidth capacity and specified snapshot rate (the refresh rate they would like to play at) [36].
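The relationship between snapshot rate and incoming bandwidth can be made concrete with a rough calculation. The sketch below assumes a fixed snapshot size and ignores packet header overhead and compression; the function names and the numbers in the usage note are hypothetical and are not taken from the Source engine.

```python
def required_bandwidth_bps(snapshot_bytes, snapshot_rate_hz):
    """Downstream bandwidth (bits per second) needed for a given snapshot rate."""
    return snapshot_bytes * 8 * snapshot_rate_hz


def sustainable_rate_hz(bandwidth_bps, snapshot_bytes, requested_rate_hz):
    """Highest whole snapshot rate that fits in the client's bandwidth,
    capped at the rate the client asked for."""
    max_rate = bandwidth_bps // (snapshot_bytes * 8)
    return min(requested_rate_hz, max_rate)
```

Under these assumptions, 200-byte snapshots at 64Hz need `required_bandwidth_bps(200, 64)`, which is 102400 bits per second. A client with only 64000 bits per second of incoming capacity could sustain at most `sustainable_rate_hz(64000, 200, 64)`, which is 40 snapshots per second; sending at the full 64Hz would exceed the path capacity and lead to loss.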
3.3 Tick rate
The tick rate (used interchangeably with the term update rate), which is the number of server updates each second, is measured in Hertz [10]. A game that simulates sixty ticks every second has a tick rate of 60Hz. There are some conventions in the gaming industry as to what the tick rate of an application should be, and in most FPS games you will find a tick rate of approximately 20Hz, 30Hz, 60Hz or 120Hz, with some games providing clients with the possibility to adjust the tick rate themselves, or to have adaptive tick rates based on the capacity of the clients.
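Since the tick rate is a frequency, the time between two consecutive ticks is its reciprocal. The short sketch below converts the rates mentioned in this chapter to tick intervals; the function name is our own.

```python
def tick_interval_ms(tick_rate_hz):
    """Return the time between two consecutive ticks, in milliseconds."""
    if tick_rate_hz <= 0:
        raise ValueError("tick rate must be positive")
    return 1000.0 / tick_rate_hz


if __name__ == "__main__":
    for rate in (20, 30, 60, 120, 128):
        print(f"{rate:>3} Hz -> {tick_interval_ms(rate):.2f} ms per tick")
```

A 20Hz server thus produces an update every 50ms, a 60Hz server roughly every 16.7ms, and a 128Hz server roughly every 7.8ms.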
An example of the latter is the game Overwatch, a multiplayer first-person shooter (FPS) game which was released by Blizzard Entertainment on May 24, 2016. Blizzard initially ran servers with a tick rate of 21Hz, and scaled it up to 63Hz in what they called the "High Bandwidth Update" [31]. They implemented it so that the tick rate adapts dynamically to the capacity of the client’s bandwidth.
Their investment in high capacity servers may have made it possible to increase the tick rate, but it also meant that the bandwidth consumption on the client side would increase. As not all Internet connections are equal, they would adaptively scale down the update rate for specific connections whose clients could not keep up.
They also included tick rate limitation as an option for players, although this was introduced in a patch later on.
NBN Co Limited, a government-owned Australian corporation which builds and operates the National Broadband Network, mentions that some games have a reported tick rate of up to 250Hz (Gears of War 4). However, there might be diminishing returns with regard to their responsiveness due to the limitations of human perception [10]. Another example of a game with a high tick rate is Counter-Strike: Global Offensive (CS:GO), which was released by Valve in August 2012. The game offers servers with a tick rate of 128Hz, which is what is used in competitive events such as ESEA and DreamHack.
Valve also offers CS:GO dedicated servers, which individuals can install and host at 128-tick (the tick rate is adjustable as part of the configuration of the server).
These are highly configurable gameplay servers which are made accessible to other players in the community through a server-lookup that is integrated with the client [38].
However, the absolute majority of CS:GO players queue through the in-game