Intrusion Detection in High-Speed Networks
Martin Riegel
Claes Lyth Walsø
Master of Science in Computer Science
Submission date: June 2007
Supervisor: Svein Johan Knapskog, ITEM
Co-supervisor: André Årnes, Kripos
Norwegian University of Science and Technology
Department of Telematics
Problem Description
In this assignment, methods for high-speed intrusion detection using programmable network monitoring cards will be studied. A distributed high-speed intrusion detection system based on an IDS and DAG monitoring cards is to be deployed in the Uninett backbone network, and experiments and testing will be performed. In addition, strategies for handling high-speed network data, such as the use of programmable network cards, parallel processing and sampling will be considered. Possible applications for a prototype may be real-time security monitoring, as well as the collection and statistical analysis of security relevant data.
Assignment given: 01. February 2007
Supervisor: Svein Johan Knapskog, ITEM
ABSTRACT
This thesis investigates methods for implementing an intrusion detection system (IDS) in a high-speed backbone network. The work presented in this report was carried out in cooperation with Kripos and Uninett.
The popular IDS software Snort is deployed and tested in Uninett’s backbone network. In addition, the monitoring API (MAPI) is considered as a possible IDS implementation in the same environment. The experiments conducted in this report make use of the programmable DAG card, which is a passive monitoring card deployed on several monitoring sensors in Uninett’s backbone.
To limit the workload, this report focuses only on the detection of botnets. Botnets are networks of infected computers and are considered a significant threat on the Internet today.
A total of seven experiments using Snort are presented. These experiments test 1) the impact the number of rules has on Snort, 2) the importance of good configuration, 3) the importance of using well-written rules, 4) Snort’s ability to run in an environment with minimum external traffic, 5) the impact of the size of the processed packets, 6) the impact the TCP protocol has on packet processing and 7) Snort’s ability to run as a botnet detection system for a longer period of time.
Based on the results from these experiments, it is concluded that Snort is able to run as a botnet detection system in a high-speed network.
This report also discusses some strategies for handling high-speed network data and some future prospects. In addition, ideas for further work and research are given at the end of the report.
PREFACE
This thesis serves as a Master of Science thesis pursued in the 10th semester of the master’s programme in telematics at the Norwegian University of Science and Technology, NTNU.
The motivation for the research described in this report was given by Uninett and André Årnes from the Computer Crime Division at Kripos.
We would like to thank our main supervisor, André Årnes, for continuous feedback and help throughout the semester, and Professor Svein J. Knapskog for his helpful advice and guidance. Also thanks to Erik Hjelmås at Gjøvik University College for providing the IRC server used in the experiments.
A very special thanks goes to Arne Øslebø at Uninett. Without you, this thesis would never have been finished.
Trondheim, June 2007
Martin Riegel Claes Lyth Walsø
ABBREVIATIONS
In this section, abbreviations commonly used in the report are listed in alphabetical order and explained:
AIM  AOL Instant Messenger
    A freeware application for instant messaging
API  Application Programming Interface
    A set of routines, protocols and tools for creating software applications
C&C  Command and Control
    A C&C server is a server used to control and communicate with bots
CPU  Central Processing Unit
    The processing component in a computer
CSV  Comma Separated Value
    A file format in which values are separated by commas
DAG  Data Acquisition and Generation
    A Gigabit Ethernet monitoring card
DiMAPI  Distributed MAPI
    A distributed functionality for MAPI (see below)
DoS  Denial of Service
    A computer attack that attempts to make resources unavailable
FTP  File Transfer Protocol
    Protocol used on the Internet for transferring files
GB  Gigabyte
    One billion bytes
Gbps  Gigabit per second
    One billion bits per second. Used in network traffic context
GUI  Graphical User Interface
    A user interface which lets people interact with a computer
ICMP  Internet Control Message Protocol
    A protocol that is part of the Internet protocol suite
IDS  Intrusion Detection System
    A system which detects intrusion attempts in a computer network
IGRP  Interior Gateway Routing Protocol
    A protocol for exchanging routing data
IP  Internet Protocol
    A data-oriented protocol used for communicating data across a packet-switched network
IPS  Intrusion Prevention System
    A system which prevents intrusion attempts in a computer network. Often used with an IDS (see above)
IRC  Internet Relay Chat
    A communication protocol
ISP  Internet Service Provider
    A company which sells Internet access to customers
LBNL  Lawrence Berkeley National Laboratory
    Laboratory in the USA which is managed by the University of California
LOBSTER  Large-scale Monitoring of Broadband Internet Infrastructures
    A pilot European infrastructure for accurate Internet traffic monitoring
MAPI  Monitoring API
    An API (see above) for creating functionalities for network monitoring
MB  Megabyte
    One million bytes
NIC  Network Interface Controller
    An adapter circuit board installed in a computer to provide a physical connection to a network
OS  Operating System
    The operating system installed on a computer/server
OSI  Open Systems Interconnection
    A network standard which provides a layered reference model
Pcap  Packet Capture
    API (see above) for capturing packets on a computer network
PIDS  Privacy IDS
    An IDS (see above) that is specifically tailored to intrusion attempts related to privacy issues
PoS  Packet over SONET
    A protocol for sending packets over SONET (see below)
SCAMPI  Scaleable Monitoring Platform for the Internet
    A European project to develop a scaleable monitoring platform for the Internet
SONET  Synchronous Optical Networking
    A method for computer communication over optical fiber
SSH  Secure Shell
    A network protocol for securing a communication channel
TB  Terabyte
    One trillion bytes
TCP  Transmission Control Protocol
    A transport protocol that guarantees reliable and in-order delivery of data from sender to receiver
ToS  Type of Service
    A field in the IP header used for specifying a service
UDP  User Datagram Protocol
    A connectionless network protocol
VM  Virtual Machine
    A virtual machine achieved by using VMware
CONTENTS
Abstract
Preface
Abbreviations
1 Introduction
1.1 Motivation
1.2 Objective
1.3 Context
1.3.1 Threats on the Internet
1.3.2 Botnets
1.4 Research methodology
1.5 Contributions
1.6 Structure
2 Background
2.1 Introduction
2.2 IDS
2.2.1 Signature- and Anomaly-Based IDS
2.2.2 Network-Based IDS
2.2.3 Host-Based IDS
2.2.4 Stack-Based IDS
2.2.5 High-Speed IDS
2.3 Related Work
2.3.1 A Splitter architecture
2.3.2 HotBots 07
2.4 Snort
2.4.1 The Ruleset
2.5 MAPI
2.5.1 The Ruleset
2.6 Additional Software
2.6.1 VMware
2.6.2 Wireshark
2.6.3 Cerebus
2.7 UNINETT
2.7.1 Passive sensor
2.8 DAG cards
2.8.1 Filters
2.8.2 Making filters
2.8.2.1 Filter syntax
2.9 Botnet
2.9.1 Botnet Protocols
2.9.2 The Botnet Lifecycle
2.9.3 Consequences
2.9.4 Botnet Families
2.10 The IRC protocol
3 Approach
3.1 Introduction
3.2 Architecture
3.2.1 Security Aspects
3.2.2 Botmaster
3.2.3 Bot
3.2.4 IRC Server / C&C Server
3.2.5 IDS
3.2.6 Pros and Cons
3.3 Implementation Preparations
3.3.1 Compiling the Bots
3.3.2 IDS Rules
3.3.3 Malicious Traffic Script
3.4 Implementing Snort
3.4.1 Experiment 1 - Testing The Number of Rules
3.4.2 Experiment 2 - Optimising Snort Configuration
3.4.3 Experiment 3 - Testing Impact of Common Strings
3.4.4 Experiment 4 - Testing With Minimum External Traffic
3.4.5 Experiment 5 - Testing Packet Length Input
3.4.6 Experiment 6 - Testing Impact of the TCP Protocol
3.4.7 Experiment 7 - IDS Stand-Alone Test
3.5 Implementing MAPI
4 Results
4.1 Introduction
4.2 Results from the Experiments
4.2.1 Experiment 1 - Testing Number of Rules
4.2.2 Experiment 2 - Optimising Snort Configuration
4.2.3 Experiment 3 - Testing Impact of Common Strings
4.2.4 Experiment 4 - Testing With Minimum External Traffic
4.2.5 Experiment 5 - Testing Packet Length Input
4.2.6 Experiment 6 - Testing Impact of the TCP Protocol
4.2.7 Experiment 7 - IDS Stand-Alone Test
5 Analysis
5.1 Introduction
5.2 Analysing the Experiments
5.2.1 Experiment 1 - Testing Number of Rules
5.2.2 Experiment 2 - Optimising Snort Configuration
5.2.3 Experiment 3 - Testing Impact of Common Strings
5.2.4 Experiment 4 - Testing With Minimum External Traffic
5.2.5 Experiment 5 - Testing Packet Length Input
5.2.6 Experiment 6 - Testing Impact of the TCP Protocol
5.2.7 Experiment 7 - IDS Stand-Alone Test
5.3 Sources of Error
5.4 Conclusion
6 Discussion
6.1 Introduction
6.2 Snort vs. MAPI
6.2.1 The Rules
6.2.2 Performance
6.2.3 Distributed approach
6.3 DAG cards and Other Hardware
6.4 Circumventing the IDS
6.5 Botnet Similarities
6.6 Alternative Detection Algorithms
6.7 Handling the Detection Data
6.8 Future Prospects
6.8.1 A distributed Approach
6.8.2 IPv4 vs. IPv6
7 Further work
7.1 Introduction
7.2 Implementing MAPI as an IDS
7.3 Testing Maximum Traffic Load
7.4 Botnets Based on Other Protocols
7.5 Increased experiment length and frequency
7.6 Dynamic Rules
7.7 Notification Upon Detection
7.8 Detection of Other Types of Intrusion
7.9 Sampling
References
A Modified Snort configuration file
A.1 snort.noflow.conf
B MAPI IDS program
B.1 MAPI IDS program
LIST OF FIGURES
1.1 Active bot-infected computers per day
2.1 Overview of the splitter architecture
2.2 A Snort rule
2.3 A MAPI rule
2.4 Illustration of a virtual machine
2.5 Overview over UNINETT’s backbone
2.6 DAG 4.3s Networking monitoring card
2.7 Options for making filters
2.8 DAG Filter syntax
2.9 DAG Filter syntax example
2.10 How hosts become part of botnets
2.11 A small IRC network
3.1 Overview of the experimental architecture
3.2 UNINETT sensor
3.3 Rbot configuration file
3.4 An example of a content matching Snort rule
3.5 mIRC bot command script
3.6 Snort.noflow.conf
3.7 Snort.tweak.conf
3.8 Mapi cooking function
5.1 Detection Ratio in experiment 1
5.2 Snort drop ratio in experiment 1
5.3 Processed packets per second in experiment 1
5.4 Detection Ratio in experiment 2
5.5 Snort dropping ratio in experiment 2
5.6 Packet drop ratio in experiment 3
5.7 The effect of the slen value
5.8 The impact of the TCP protocol
A.1 snort.noflow.conf part 1
A.2 snort.noflow.conf part 2
A.3 snort.noflow.conf part 3
A.4 snort.noflow.conf part 4
A.5 snort.noflow.conf part 5
B.1 MAPI IDS program part 1
B.2 MAPI IDS program part 2
LIST OF TABLES
4.1 Results when using 5 rules
4.2 Results when using 30 rules
4.3 Results when using 50 rules
4.4 Results when using 100 rules
4.5 Results when using 200 rules
4.6 Results when using 1000 rules
4.7 Results when using 200 rules with optimised configuration file
4.8 Results when using 1000 rules with optimised configuration file
4.9 Results when common words are included in the ruleset
4.10 Results when using 5 rules and a minimum of external traffic
4.11 The effect of the slen value when using the modified configuration file
4.12 The effect of the slen value when using the optimised configuration file
4.13 The impact the TCP protocol has on the ratio of dropped packets
4.14 The IDS stand-alone test
CHAPTER 1
INTRODUCTION
In this report, methods for high-speed intrusion detection systems using programmable network monitoring cards will be studied. A high-speed intrusion detection system (IDS) based on Snort, MAPI and DAG monitoring cards will be deployed in the Uninett backbone network, and experimental testing will be performed. In addition, strategies for handling high-speed network data will be considered.
1.1 MOTIVATION
With the increasing number of threats on the Internet, there is also an increasing need for systems to detect intrusion attempts. These IDSs need to be able to detect known attacks and to handle new ones as they are discovered.
Modern high-speed networks place requirements on the scalability of the IDS, as a high-speed IDS needs to monitor every IP packet and flow and correlate these with its database of known attacks.
The research presented in this report is inspired by the LOBSTER¹ and SCAMPI² projects.
LOBSTER is a pilot European Infrastructure for large-scale monitoring of broadband Internet infrastructure. LOBSTER uses passive monitoring sensors connected to the network at speeds from 2.5 to 10 Gbps. SCAMPI is a predecessor to LOBSTER and is an acronym for a Scalable Monitoring Platform for the Internet. The research presented in this report is conducted in cooperation with Uninett. Uninett is responsible for the national research and educational network, and they are also part of the SCAMPI and LOBSTER projects.
1.2 OBJECTIVE
The primary objective of this project is to implement an IDS in Uninett’s backbone. Strengths and weaknesses of the different approaches will be studied using Snort and MAPI. Snort is an open source, de facto standard for intrusion detection and prevention. MAPI is a network monitoring API designed for high-speed networks.
The IDSs presented in this report will, due to workload limitations, only focus on botnets.
The overall goal of this report is however to end up with a scalable IDS implemented on a 2.5 Gbps link. Due to privacy issues this report does not include a distributed approach, as access was only granted for one sensor. This is further discussed in Section 6.8.1.
¹ http://www.ist-lobster.org/
² http://www.ist-scampi.org/
1.3 CONTEXT
This section will explain some of the current threats on the Internet relevant for this report.
1.3.1 Threats on the Internet
According to the Symantec³ Internet Security Threat Report [Cor06], malicious attacks have gone from being network-based to targeting client-side applications. [Cor06] is a report on the security threat level on the Internet based on observations made in the first six months of 2006, while [Cor07] covers the last six months of 2006. Symantec’s observations are based on traffic collected on several honeypots and on data from over 40,000 sensors in over 180 countries.
The following points sum up some highlights mentioned in [Cor06] and [Cor07]:
• An average of 6110 Denial of Service (DoS) attacks daily in the first six months of 2006.
• In the first six months of 2006, 86% of all targeted attacks were against the home user sector. This increased to 93% during the last six months.
• During the first half of 2006, 18% of all distinct malicious code samples were new.
• Worms made up 38 of the top 50 malicious code samples during the same period.
• During 2006 there was an 81% increase in phishing messages, which again increased another 6% the last six months.
• 84% of the phishing activity had financial gain as intention.
As can be seen from the last point, and also mentioned throughout [Cor06], [Cor07] and [BY07], there has been an increase in attacks with the purpose of financial gain.
Also worth mentioning in this connection is that 38% of DoS attacks were directed against Internet Service Providers (ISP) during the first half of 2006.
1.3.2 Botnets
According to [RZMT06], a botnet is a network consisting of infected computers. These computers, or bots, are under human control.
Symantec mentions in [Cor06] that they detected an average of 57,717 active bot-infected computers per day during the first six months of 2006. As mentioned in [Cor07], this increased to 63,912 during the last six months. Figure 1.1 shows the number of active bot-infected computers per day during 2006. The moving average line is a calculated mean value of the number of active bots in the given period.
Approximately 6,000 of these were command and control (C&C) servers. C&C servers are servers which provide communication channels for the botmaster, allowing the botmaster to communicate with the bots in the network [RZMT06].
³ Symantec is, according to their website, "a global leader in infrastructure software, enabling businesses and consumers to have confidence in a connected world." They sell and manufacture security software.
Figure 1.1: Active bot-infected computers per day. The figure is extracted from [Cor07].
Botnets will be further elaborated in Section 2.9 as they are the main focus of the research conducted in this report.
1.4 RESEARCH METHODOLOGY
The research presented in this report follows these steps:
1. Perform a background and theoretical study.
2. Design the experimental architecture and the intrusion detection system.
3. Prepare the experiments, i.e., locate and install bots in a safe environment. In addition, set up an IRC server in a remote location.
4. Conduct the actual experiments.
5. Analyse and discuss the results from the experiments.
1.5 CONTRIBUTIONS
This report demonstrates that it is possible to use the popular intrusion detection system Snort in a high-speed environment when it is combined with specialized network hardware in the form of DAG cards.
The experiments conducted in this report were run in cooperation with Uninett. On the basis of these experiments several bugs in MAPI were discovered. These bugs have been reported, but they are not included in this report, as they are outside the topic of this research.
1.6 STRUCTURE
This section gives a brief description of the remaining chapters in the report.
Chapter 2
Chapter 2 provides some background knowledge needed to understand the research presented further in this report.
Chapter 3
Chapter 3 presents the experimental approach.
Chapter 4
Chapter 4 presents the results from the experiments. The results are given in tables.
Chapter 5
Chapter 5 analyses and discusses the results given in Chapter 4.
Chapter 6
Chapter 6 discusses some of the different elements related to the experiments presented in this report.
Chapter 7
Chapter 7 suggests some further work related to the research presented here.
The configuration files, bot source code and IDS rules are included as a digital attachment to this report.
CHAPTER 2
BACKGROUND
This chapter provides background material needed for further reading of this report. The term IDS will be explained and different kinds of IDSs will be discussed. Some related work will be presented next. An overview of the hardware and software used in the research will then be given (i.e., Snort, MAPI and some other software applications, and also the DAG cards). In addition, information about Uninett and some background information about botnets and the IRC protocol will be presented.
2.1 INTRODUCTION
As mentioned in Chapter 1, this report focuses on the process of implementing an IDS in a high-speed environment and the problems and challenges which may arise in that connection.
2.2 IDS
An Intrusion Detection System (IDS) continuously inspects network activity. The IDS looks for suspicious traffic traversing the network, mainly in the inbound direction. However, some IDSs also analyse the traffic in the outbound direction.
As opposed to other network monitoring systems, an IDS looks for specific traffic patterns which are believed to be malicious or suspicious. When suspicious traffic is detected an IDS may take a number of actions. The most common is to alert the administrators who can take necessary actions.
As mentioned, there are several kinds of IDSs, all of which work in different ways. Some of the most important ones are explained in the following sections in order to give an understanding of how an IDS works. Section 2.2.5 describes IDSs working in high-speed environments and some related work. Most of the explanations below are extracted from NIST’s¹ Guide to Intrusion Detection and Prevention Systems [SM07]. As the title suggests, [SM07] also covers the field of Intrusion Prevention Systems (IPS). According to [SM07], an IPS is software that has all the capabilities of an IDS, but can also attempt to stop possible incidents.
This report will, however, focus on IDS.
¹ NIST - National Institute of Standards and Technology
2.2.1 Signature- and Anomaly-Based IDS
Signature-based detection is the process of looking for known malicious network traffic patterns. This process works very well with known threats, although any new suspicious activity will not be detected.
Anomaly-based detection is the process of comparing the network traffic pattern to what is considered to be known normal traffic. The IDS monitors the traffic over a period of time, known as the training period, and creates a profile of the normal traffic. Statistical methods are then applied to detect deviations. These profiles may be quite extensive, and anomaly-based IDS has been shown to work well when it comes to detecting unknown threats.
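The anomaly-based approach can be illustrated with a minimal sketch in C. It assumes that the profile consists of only the mean and variance of a single traffic metric (for example packets per second) and that an observation is flagged when it lies more than a chosen number of standard deviations from the mean. The names traffic_profile, train_profile and is_anomalous are hypothetical, and real anomaly-based IDSs use far richer profiles than this.

```c
#include <stddef.h>

/* Profile of "normal" traffic learned during the training period
 * (hypothetical: one metric only, e.g. packets per second). */
typedef struct {
    double mean;      /* mean of the training samples */
    double variance;  /* population variance of the training samples */
} traffic_profile;

/* Build a profile from n training samples. */
traffic_profile train_profile(const double *samples, size_t n)
{
    traffic_profile p = {0.0, 0.0};
    if (n == 0)
        return p;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    p.mean = sum / (double)n;

    double sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = samples[i] - p.mean;
        sq += d * d;
    }
    p.variance = sq / (double)n;
    return p;
}

/* Return 1 if `observed` deviates from the mean by more than `threshold`
 * standard deviations. Squared values are compared, so no square root
 * (and no math library) is needed. */
int is_anomalous(const traffic_profile *p, double observed, double threshold)
{
    double d = observed - p->mean;
    return d * d > threshold * threshold * p->variance;
}
```

With training samples 100, 102, 98 and 100 and a threshold of 3, an observation of 101 is treated as normal while an observation of 150 is flagged.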
2.2.2 Network-Based IDS
A network-based IDS simply analyses the network traffic. According to [SM07], it is most commonly deployed at the network boundaries. The traffic is analysed in real-time as it traverses the network. Usually a filter is used to decide which traffic is to be let through and which is to be further analysed by the IDS.
Some of the strengths of a network-based IDS are explained in [LA00]. It is worth mentioning that this type of IDS does not rely on any software or operating system specifications on the hosts, even though it may be configured to do so if necessary. Because it provides real-time packet analysis, a fast and efficient response may be achieved. Using a network-based IDS makes it harder for an attacker to hide her tracks.
2.2.3 Host-Based IDS
A host-based IDS analyses the activity of a single host for suspicious activity (i.e., network traffic specific for the host, system logs, running processes, application activity, file access and modification, and system and application configuration changes) [SM07].
[LA00] lists some of the strengths of host-based IDS. This type of detection is less prone to false positives, and may make it possible to see whether an attack was successful. A host-based IDS may also detect system specific activity at the host. Depending on the implementation, host-based IDSs may provide close to real-time analysis. In addition, no additional hardware is needed when using a host-based IDS.
2.2.4 Stack-Based IDS
[LA00] refers to stack-based IDSs as the newest IDS technology. These IDSs analyse the packets as they traverse the layers of the protocol stack. If the IDS detects malicious traffic, the packets may be rejected before they are processed at the application layer. It is claimed in [LA00] that a complete stack-based IDS checks both inbound and outbound traffic.
This kind of IDS will not be discussed any further, as it is considered outside the scope of this report.
2.2.5 High-Speed IDS
A high-speed IDS is, in this report, an IDS working in an environment with very high bandwidth, from 2.5 Gbps to 10 Gbps. [SYL03] mentions that the main problem with running an IDS in such environments is the huge amount of data that needs to be processed.
The next section describes some work related to the high-speed IDS research presented in this report.
2.3 RELATED WORK
In this report, related work is considered to be research on high-speed IDSs and also IDSs specifically designed to detect botnets based on IRC. More specifically, this also includes research related to the LOBSTER and SCAMPI projects, mentioned in Section 1.1.
2.3.1 A Splitter architecture
Several solutions for handling the problems concerning high-speed IDS have been published.
One solution is described in [XCA+06] by some of the LOBSTER partners. The article presents a traffic splitter architecture, which was also mentioned in [CAM04]. The purpose of this architecture is to split the traffic among several intrusion detector sensors. One important principle is that packets that belong to the same attack will be processed by the same sensor.
Figure 2.1 presents an overview of the splitter architecture.
Figure 2.1: Overview of the splitter architecture. Copied from [CAM04].
The splitter architecture consists of the following components:
• The early filtering process only checks the IP packet headers. If the header is not known to be malicious and the packet contains no payload, the packet is filtered out. Otherwise, the packet is forwarded to the load distributor.
• The purpose of the load distributor is to divide the network traffic among the end sensors, and to keep them as evenly loaded as possible. In addition, the packets of the same network flow should be examined by the same sensor. This is achieved by computing a hash function on selected fields in the packet headers, and then forwarding the packets with the same hash value to the same sensor.
• The locality buffers attempt to rearrange the interleaving of packets in the traffic so that subsequent packets trigger the same ruleset as often as possible, thus decreasing the processing time.
This architecture has been shown to work well with high-speed networks below 1 Gbps. However, the architecture is based on specially designed hardware, which will not be used in the research in this report. See [XCA+06] and [CAM04] for further elaboration.
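The early filter and the load distributor described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the implementation from [XCA+06] or [CAM04]: the flow_key struct, the choice of the FNV-1a hash and the function names are hypothetical, and the ordering of the two endpoints (so that both directions of a connection reach the same sensor) is a design choice made here.

```c
#include <stddef.h>
#include <stdint.h>

/* Header fields that identify a flow (hypothetical layout, host byte order). */
typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t protocol;
} flow_key;

/* Early filtering: a packet may be dropped before load distribution when its
 * header is not known to be malicious and it carries no payload. */
int early_filter_drop(int header_suspicious, size_t payload_len)
{
    return !header_suspicious && payload_len == 0;
}

/* FNV-1a hash over the flow key. The two endpoints are put in a fixed order
 * first, so both directions of a connection yield the same value. */
uint32_t flow_hash(const flow_key *k)
{
    uint32_t a_ip = k->src_ip, b_ip = k->dst_ip;
    uint16_t a_port = k->src_port, b_port = k->dst_port;

    if (a_ip > b_ip || (a_ip == b_ip && a_port > b_port)) {
        uint32_t t_ip = a_ip;
        uint16_t t_port = a_port;
        a_ip = b_ip;
        b_ip = t_ip;
        a_port = b_port;
        b_port = t_port;
    }

    /* Serialise the fields explicitly to avoid hashing struct padding. */
    const uint8_t bytes[13] = {
        (uint8_t)(a_ip >> 24), (uint8_t)(a_ip >> 16),
        (uint8_t)(a_ip >> 8),  (uint8_t)a_ip,
        (uint8_t)(b_ip >> 24), (uint8_t)(b_ip >> 16),
        (uint8_t)(b_ip >> 8),  (uint8_t)b_ip,
        (uint8_t)(a_port >> 8), (uint8_t)a_port,
        (uint8_t)(b_port >> 8), (uint8_t)b_port,
        k->protocol
    };

    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i < sizeof bytes; i++) {
        h ^= bytes[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Map a flow to one of n_sensors end sensors; equal hashes go to the same sensor. */
size_t select_sensor(const flow_key *k, size_t n_sensors)
{
    return n_sensors == 0 ? 0 : flow_hash(k) % n_sensors;
}
```

A plain modulo keeps the sensors evenly loaded only if the flows themselves are evenly sized, which is why the architecture treats load balancing as a separate concern.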
2.3.2 HotBots 07
In April 2007, a conference (HotBots 07) about botnet research was arranged in Cambridge, USA. The article [KRH07] describes a method to detect, track, and characterize botnets on high-speed networks. It is claimed that their method:
1. is entirely passive and therefore invisible to operators,
2. scales to the largest of networks,
3. is based on flow data analysis, which limits privacy issues,
4. has a false positive rate of less than 2%,
5. helps identify botnets that are most affecting real users and customers,
6. can detect botnets that use encrypted communication, and
7. helps quantify the size of botnets, and identify and characterize their activities without joining the botnet.
The main focus of this thesis is, however, to implement a high-speed IDS using MAPI and Snort, elaborated in Sections 2.5 and 2.4, respectively. The approach in [KRH07] will therefore not be considered any further.
[GH07] is another article presented at HotBots 07. This article is more relevant to the research conducted in connection with this thesis. [GH07] presents a method which relies on the detection of the communication channel between the bot and the C&C server. This method is based on the special nicknames that are assigned to the bots in a botnet. By looking at the IRC traffic and inspecting special "unhuman" nicknames, they are able to detect IRC-based botnets.
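To make the idea of "unhuman" nicknames concrete, the following C sketch scores a nickname by two simple properties: a long run of digits, and a combination of separator characters and digits. This is a hypothetical heuristic chosen for illustration only. It is not the scoring function used in [GH07], and the thresholds and the function names nickname_score and nickname_is_suspicious are assumptions.

```c
#include <ctype.h>

/* Score an IRC nickname; higher means more bot-like.
 * Hypothetical heuristic, NOT the method of [GH07]. */
int nickname_score(const char *nick)
{
    int digits = 0, run = 0, longest_run = 0, separators = 0;

    for (const char *p = nick; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isdigit(c)) {
            digits++;
            run++;
            if (run > longest_run)
                longest_run = run;
        } else {
            run = 0;
            if (c == '|' || c == '[' || c == ']' || c == '-' || c == '_')
                separators++;
        }
    }

    int score = 0;
    if (longest_run >= 4)
        score += 2;   /* long digit run, e.g. a generated identifier */
    if (separators > 0 && digits > 0)
        score += 1;   /* tag-plus-number pattern such as "USA|0168" */
    return score;
}

/* Flag the nickname when the score reaches an (assumed) threshold of 2. */
int nickname_is_suspicious(const char *nick)
{
    return nickname_score(nick) >= 2;
}
```

Under these assumptions a nickname such as "USA|016887436" is flagged, while "martin" and "claes_87" are not. A heuristic this simple would produce false positives on real traffic, so it should be read as an illustration of the principle only.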
2.4 SNORT
The research presented in this report uses the open source IDS called Snort. The following text is cited from Snort’s official website (www.snort.org):
Snort is an open source network intrusion prevention and detection system utilizing a rule-driven language, which combines the benefits of signature, protocol and anomaly based inspection methods. With millions of downloads to date, Snort is the most widely deployed intrusion detection and prevention technology worldwide and has become the de facto standard for the industry.
Snort may be used as a packet sniffer like tcpdump, as a packet logger, or as an IDS. When using Snort as an IDS, it uses a set of rules to detect intrusions. These rules are explained in the next section.
2.4.1 The Ruleset
According to the Snort website, rules are based on detecting the actual vulnerability as opposed to a specific signature. A signature is based on an exploit of a unique piece of data.
Figure 2.2 shows an example Snort rule².
A Snort rule
alert tcp any any -> 192.168.1.150 111 \
    (content:"|00 01 86 a5|"; msg:"mountd access";)
Figure 2.2: A Snort rule, as presented in [Pro06].
In short, the rule in Figure 2.2 makes Snort alert with the message mountd access when a TCP packet with destination address 192.168.1.150 and destination port 111 is received and its payload matches the pattern specified in content:"|00 01 86 a5|".
The rule header is the text up to the first parenthesis in Figure 2.2. As explained in [Pro06], the rule header contains the rule’s action, protocol, source and destination IP addresses and netmasks, and the source and destination port information. Following the header is the rule option section.
The rule option section is, according to [Pro06], the heart of Snort’s intrusion detection engine.
As can be seen in Figure 2.2, this is where the content matching rule is expressed.
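What the rule in Figure 2.2 checks can be expressed as a small C function. The sketch below is not how Snort evaluates rules internally (Snort uses optimised multi-pattern matching); it only spells out the header test and the content test for this one rule. The packet struct and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal view of a decoded packet (hypothetical struct, host byte order). */
typedef struct {
    uint8_t protocol;        /* IP protocol number, 6 = TCP */
    uint32_t dst_ip;
    uint16_t dst_port;
    const uint8_t *payload;
    size_t payload_len;
} packet;

/* Return 1 if `needle` occurs anywhere in `hay`, 0 otherwise. */
static int contains_bytes(const uint8_t *hay, size_t hay_len,
                          const uint8_t *needle, size_t needle_len)
{
    if (needle_len == 0)
        return 1;
    if (hay_len < needle_len)
        return 0;
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return 1;
    }
    return 0;
}

/* Rule header: tcp any any -> 192.168.1.150 111
 * Rule option: content:"|00 01 86 a5|" */
int matches_mountd_rule(const packet *p)
{
    static const uint8_t content[] = {0x00, 0x01, 0x86, 0xa5};
    const uint32_t dst = (192u << 24) | (168u << 16) | (1u << 8) | 150u;

    if (p->protocol != 6)
        return 0;                        /* not TCP */
    if (p->dst_ip != dst || p->dst_port != 111)
        return 0;                        /* header does not match */
    return contains_bytes(p->payload, p->payload_len,
                          content, sizeof content);
}
```

The header test is cheap and is done first, so the byte search only runs for packets that already match the address and port, which mirrors the split between rule header and rule options.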
2.5 MAPI
The research presented in this report also uses the Monitoring API called MAPI. The follow- ing text is cited from the MAPI webpage (mapi.uninett.no):
MAPI, or Monitoring API, is a multi-user programming interface designed to simplify the development of network monitoring software and allows users to express their monitoring needs in a device-independent way. The main abstraction provided by MAPI is the network flow. Although flows have been used before in network monitoring systems, MAPI gives flows a first-class status.
The experiments described in this report make use of MAPI 2.0Beta1, which was released in September 2006. MAPI supports normal NICs (network interface controllers), SCAMPI adapters, and also DAG cards without a co-processor. MAPI is implemented in C and provides a wide set of monitoring functionalities. MAPI runs as a background daemon. A user typically writes his own programs, which use functions available in MAPI. These programs connect to the MAPI daemon when they are executed. MAPI supports both libpcap and its own DAG API.
² Because of the available page width in the report, the rule is split over two lines.
The paper [TPP+06] presents a distributed extension to MAPI (DiMAPI). By using DiMAPI, it is possible to create an IDS working with several sensors. [TPP+06] also presents a network IDS. The usage of DiMAPI is discussed in Section 6.8.1.
2.5.1 The Ruleset
It is easy to create rules corresponding to the Snort rules in MAPI. Figure 2.3 presents the code lines representing the same rule as in Figure 2.2.
A MAPI rule
 1  example = mapi_create_flow("/dev/dag0");
 2  mapi_apply_function(example, "BPF_FILTER", "tcp AND dst host 192.168.1.150 AND dst port 111");
 3  mapi_apply_function(example, "STR_SEARCH", "|00 01 86 a5|", 0, 1500);
 4  mapi_apply_function(example, "TO_FILE", MFF_PCAP, "example.pcap", 0);
 5
 6  if (mapi_connect(example) < 0) {
 7      printf("Could not connect to flow %d\n", example);
 8      exit(EXIT_FAILURE);
 9  }
10
11  mapi_close_flow(example);
Figure 2.3: A MAPI rule. Only the relevant lines are included.
First, as shown in Figure 2.3, the flow is created for the network interface of interest, dag0. Second, the filter, which corresponds to the rule header in the Snort rule, is created.
Line 3 applies the string search function. This corresponds to the content: "|00 01 86 a5|" part of the Snort rule in Figure 2.2. Line 4 simply logs the detection in a pcap file.
The if test simply checks whether the flow was set up correctly. Finally, the flow is closed. Note that some of the lines needed to make this a complete C program are omitted.
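To illustrate what the content match in the rule amounts to, the following is a minimal C sketch of how a Snort-style content string could be converted to bytes and searched for in a packet payload. The function names parse_content and payload_contains are hypothetical and are not part of Snort or MAPI, and the offset and depth parameters are an assumed reading of the two trailing arguments (0 and 1500) given to STR_SEARCH in Figure 2.3. Real engines use considerably faster search algorithms than this naive scan.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a Snort-style content string such as
 * "|00 01 86 a5|" into raw bytes. Text between '|' characters is read as
 * two-digit hexadecimal byte values; everything else is copied literally.
 * Returns the number of bytes written, or -1 on malformed input. */
int parse_content(const char *spec, unsigned char *out, size_t cap)
{
    size_t n = 0;
    int in_hex = 0;

    while (*spec != '\0') {
        if (*spec == '|') {                 /* '|' toggles hex mode */
            in_hex = !in_hex;
            spec++;
        } else if (in_hex) {
            if (*spec == ' ') {             /* spaces separate hex bytes */
                spec++;
                continue;
            }
            if (!isxdigit((unsigned char)spec[0]) ||
                !isxdigit((unsigned char)spec[1]) || n >= cap)
                return -1;
            char hex[3] = { spec[0], spec[1], '\0' };
            out[n++] = (unsigned char)strtoul(hex, NULL, 16);
            spec += 2;
        } else {
            if (n >= cap)
                return -1;
            out[n++] = (unsigned char)*spec++;
        }
    }
    return in_hex ? -1 : (int)n;            /* unterminated '|' is an error */
}

/* Hypothetical helper: naive search for pat within the window
 * payload[offset .. offset + depth). Returns 1 on a match, 0 otherwise. */
int payload_contains(const unsigned char *payload, size_t len,
                     const unsigned char *pat, size_t pat_len,
                     size_t offset, size_t depth)
{
    size_t end = len;

    if (offset > len || pat_len == 0)
        return 0;
    if (depth < len - offset)               /* clamp the window to the payload */
        end = offset + depth;
    for (size_t i = offset; i + pat_len <= end; i++)
        if (memcmp(payload + i, pat, pat_len) == 0)
            return 1;
    return 0;
}
```

A caller would first convert the rule's content string with parse_content and then call payload_contains once per packet that has already passed the header filter.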
2.6 ADDITIONAL SOFTWARE
In order to set up the experiments described in this report, some additional software is needed.
A short explanation of this additional software is given in the following sections.
2.6.1 VMware
VMware is basically a program that emulates a computer. This makes it possible for multiple operating systems to co-exist as virtual machines (VMs) inside the program, as illustrated in Figure 2.4. The main advantage of using VMware for virtualization is that it is very easy to control the environment, especially when it comes to quickly pausing or shutting down an OS if needed. These qualities are favourable for running the experiments throughout the report in a secure manner, as will be further elaborated in Chapter 3.
Figure 2.4: Illustration of a virtual machine
Another positive aspect of how VMware operates is that it is easy to install monitoring software on the host OS outside of the VMs. This makes it very easy to see how the VM interacts with the external environment and to take action on unwanted communication.
2.6.2 Wireshark
According to wireshark.org, Wireshark is a popular network protocol analyser, previously known as Ethereal. A popular way of using Wireshark is to install it as a packet sniffer. This means that the computer administrator can get a complete picture of the traffic going from or to a computer.
Wireshark makes it possible to inspect traffic patterns, and it is a helpful tool when conducting the experiments described in Chapter 3. As will be shown, Wireshark makes it easy to distinguish between the traffic originating from the VM operating in normal mode and when the different bots are started.
2.6.3 Cerebus
According to dragos.com/cerebus, Cerebus is a "full screen, GUI and text-based unified IDS alert file browser and data correlator".
Cerebus makes it possible to read Snort alert files saved in the unified format. The unified format is binary and is among the fastest (if not the fastest)3 logging formats available in Snort. Cerebus is a very useful tool when conducting the experiments throughout the report.
3 According to [Pro06]: configuring Snort, output modules, unified.
2.7 UNINETT
"The UNINETT group supplies network and network services for universities, university colleges and research institutions and handles other national ICT4 tasks. The Group is owned by the Norwegian Ministry of Education and Research and consists of a parent company and four subsidiaries"5.
Uninett’s network structure is very comprehensive and the network bandwidth is continuously increasing, as the network equipment is replaced with high-speed lines and new hardware. The fastest links currently operate at speeds as high as 10 Gbps, with the majority of the network operating at speeds of around 150-1000 Mbps and some links at 2.5 Gbps.
Figure 2.5: Overview of UNINETT’s backbone6
The GigaCampus programme7 is a four-year initiative put forward by UNINETT. The overall objective is to address key networking challenges on the campus networks of Norwegian universities and university colleges towards 2009. One goal of this initiative is to increase the overall speed of the network. This means that gigabit networks should be a reality for most of the connected users, and all the big universities should be connected with 10 Gbps links.
4 ICT - Information and Communication Technology
5 Quoted from http://www.uninett.no/om.en.html
7 http://www.gigacampus.no
2.7.1 Passive sensor
As mentioned above, UNINETT has an extensive network which requires monitoring on different levels. UNINETT has placed passive sensors throughout its network. These are fast computers with one or two DAG cards installed. DAG cards and how they operate are described further in Section 2.8. The experiments in this report are carried out on a sensor connected to one of the busiest lines on the UNINETT backbone.
As a reference, the average traffic load8 on the sensor on a typical day is approximately 160,000 packets per second. These packets sum up to 125 MB each second, giving a total of 10.8 TB of data per day.
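As a quick cross-check of these figures, the sketch below recomputes the daily volume, the average packet size and the bit rate from the two measured rates. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a constant load over the whole day; the function names are illustrative only.

```c
#include <stdint.h>

#define SECONDS_PER_DAY 86400ULL

/* Bytes transferred in one day at a constant rate in bytes per second. */
uint64_t bytes_per_day(uint64_t bytes_per_second)
{
    return bytes_per_second * SECONDS_PER_DAY;
}

/* Average packet size in bytes, given the byte rate and the packet rate. */
double average_packet_size(uint64_t bytes_per_second,
                           uint64_t packets_per_second)
{
    return (double)bytes_per_second / (double)packets_per_second;
}

/* Link load in bits per second. */
uint64_t bits_per_second(uint64_t bytes_per_second)
{
    return bytes_per_second * 8ULL;
}
```

With 125 MB per second and 160,000 packets per second, this gives 10.8 TB per day, an average packet size of about 781 bytes, and a load of 1 Gbps on the monitored link.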
8http://drift.uninett.no/
2.8 DAG CARDS
DAG cards are basically advanced network monitoring cards with support for bandwidths up to 10 Gbps. They are, however, considerably more sophisticated than a normal network interface card, in the sense that it is possible to program these cards to do packet processing directly on-board.
As mentioned in Section 2.7.1, UNINETT has placed passive sensors with DAG cards at different places in its network. The DAG model 4.3S, shown in Figure 2.6, was used in the experiments described in this thesis. This card supports 2.5 Gbps speeds on a Packet over SONET (PoS) link.
Figure 2.6: DAG 4.3S network monitoring card. Copyright Endace [Ltd05] 2005
DAG 4.3S can be extended with an extra co-processor, which allows for even more packet processing directly on the card. The main advantage of moving the processing to the card is that hardware processing can be much faster than software processing. Another advantage is that it is possible to use the card as a filter, where packets which are not of interest are dropped instead of being forwarded to the software.
2.8.1 Filters
The DAG 4.3S card is, as mentioned, capable of storing filters on the card. When the card operates in default mode with one filter set, the card can store up to 16,384 filters. The filters are basically simple, one-line specifications that describe the characteristics of packets considered to be a match. In addition, an action is specified for each filter. The two possible actions are defined as:
• Accept
Accepted packets are passed on to the host computer.
• Reject
Rejected packets are dropped and not delivered to the host.
DAG 4.3S supports filtering on the different categories listed below. It should be underlined that if the categories are not present in the filter, they are just ignored when the filters are checked in the packet processing.
• Ingress interface
This filtering option is not relevant for the research presented in this report, because the tested DAG 4.3S card only has one interface.
• Protocol
It is possible to filter on the different protocols ICMP, IGRP, TCP/RawIP or UDP.
• Source and destination IP addresses
Filtering can be done on the source IP address the packet is coming from and on the destination IP address it is going to. In addition, filtering can be done on the whole or parts of a subnet.
• TCP and UDP source and destination port numbers
It is possible to filter the packets on one or many given port numbers, on the source side, the destination side, or both.
• TCP flags
The TCP flags are typically SYN, ACK, FIN, PSH, URG or RST and can be used in the filtering.
2.8.2 Making filters
The filters on the DAG card can be written in different "languages". As can be seen in Figure 2.7 below, there are basically three different ways to make a filter.
Figure 2.7: Options for making filters.
Copyright Endace [Ltd05] 2005
First, there are the rules written in the native DAG card format. As seen on the bottom left side in Figure 2.7, these rules are basically filters which can be loaded directly into the co-processor by the filter loader.
Second, there are the Snort rules. These rules are written in the Snort format and can be copied directly from the rules used by Snort. Before it is possible to use the rules as filters on the DAG card, they have to be translated into DAG format. This is done using the Snort rule compiler, a program shipped with the DAG software. After compilation, the transformed Snort rules can be read by the filter loader and copied into the co-processor.
The third and last option is to use rules written for the Tcpdump program. Just like the Snort rules, these rules need to be compiled before they can be used. This is done with the Tcpdump rule compiler, which generates DAG filters that can be read into the co-processor by the filter loader.
2.8.2.1 Filter syntax
Figure 2.8 shows the filter syntax for the filters used in the co-processor.
DAG Filter syntax
<number> <action> <protocol> <src-ip> <src-port> <dst-ip> <dst-port> <tcp-flags>
Figure 2.8: DAG Filter syntax
Some syntax fields are already mentioned in Section 2.8.1, but are elaborated further in the list below.
• <number> is the filter number, which is an integer between 0 and 16384. The highest numbers are processed first. Filter number 0 is often a "reject all" filter. This means that the packets that reach this filter are packets that did not match any of the other filters and can therefore be discarded.
• <action> is either reject or accept, and is applied to the packet if the filter is matched.
• <protocol> is the protocol which is to be filtered on, and can be either ALL or one of ICMP, IGRP, TCP/RawIP and UDP.
• <src-ip> is a source IP address or subnet. The field is a 32 bit binary string, with the most significant bit at position 0. The following symbols can be used: 0, 1 or -, where 0 and 1 represent binary bits while - represents a "do not care" bit. This means that a complete 32 bit string of 1s and 0s is a complete IPv4 address. If the last eight bits are "do not care" bits, the filter will match all IP addresses in a class C subnet instead of one single IP address.
• <src-port> is the source port(s). The field is a 16 bit binary string, with the most significant bit at position 0. The following symbols can be used: 0, 1 or -, where 0 and 1 represent binary bits while - represents a "do not care" bit. This means that a complete 16 bit string of 1s and 0s refers to one specific port. 0000000000010101 equals port 21, while 0000000000010--- means that the filter will match port numbers between 16 and 23.
• <dst-ip> is a destination IP address or subnet. See <src-ip> for more details.
• <dst-port> is the destination port(s). See <src-port> for more details.
• <tcp-flags> is the TCP flag that is set. This is an eight bit field which, e.g., represents the values SYN, ACK, FIN, PSH, URG or RST.
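The matching of such a binary string against a packet field can be sketched in a few lines of C. This is only an illustration of the 0, 1 and - semantics described above, not the implementation used on the DAG card; the function name ternary_match is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the 'bits' least significant bits of
 * value match pattern, and 0 otherwise. pattern[0] corresponds to the most
 * significant bit. Each pattern character is '0', '1' or '-' ("do not care").
 * A pattern of the wrong length or with other characters never matches. */
int ternary_match(const char *pattern, uint32_t value, unsigned bits)
{
    if (bits == 0 || bits > 32 || strlen(pattern) != bits)
        return 0;

    for (unsigned i = 0; i < bits; i++) {
        unsigned bit = (value >> (bits - 1 - i)) & 1u;

        if (pattern[i] == '-')
            continue;                       /* wildcard position */
        if (pattern[i] == '0' && bit == 0)
            continue;
        if (pattern[i] == '1' && bit == 1)
            continue;
        return 0;                           /* mismatch or invalid symbol */
    }
    return 1;
}
```

For example, ternary_match("0000000000010---", port, 16) accepts the ports 16 to 23 and rejects ports 15 and 24, since only the three lowest bits are left as "do not care" bits.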
Figure 2.9 below shows an example of a filter. Note that there are two entries, namely line 1 and 0.9
DAG Filter syntax example
#1
1 accept tcp
src-ip {--------------------------------} src-port {----------------}
dst-ip {1100000010101000----------------} dst-port {0000000000010101} tcp-flags {------1-}
#0
0 reject all
src-ip {--------------------------------} src-port {----------------}
dst-ip {--------------------------------} dst-port {----------------} tcp-flags {--------}
Figure 2.9: DAG Filter syntax example
The first entry matches packets where the traffic is TCP, the destination subnet is 192.168.0.0/16 and the destination port is 21. In addition, the TCP flag SYN has to be set in the packet header.
The second entry, line 0, rejects all packets which are not matched by line 1, regardless of source or destination IP address, traffic type or port numbers.
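The binary strings used in such a filter can be derived mechanically from an address and prefix length, or from a port number. The following C sketch shows one way to do this; the function name make_pattern is hypothetical and the code is not part of the DAG software.

```c
#include <stdint.h>

/* Hypothetical helper: writes a filter pattern of 'bits' characters plus a
 * terminating NUL into out. The first prefix_len positions hold the most
 * significant bits of value as '0' or '1'; the remaining positions are
 * filled with '-' ("do not care").
 * Assumes 1 <= bits <= 32 and that out has room for bits + 1 characters. */
void make_pattern(uint32_t value, unsigned bits, unsigned prefix_len, char *out)
{
    for (unsigned i = 0; i < bits; i++) {
        if (i < prefix_len)
            out[i] = ((value >> (bits - 1 - i)) & 1u) ? '1' : '0';
        else
            out[i] = '-';
    }
    out[bits] = '\0';
}
```

Calling make_pattern(0xC0A80000, 32, 16, buf) for 192.168.0.0/16 produces the dst-ip string of the first entry, and make_pattern(21, 16, 16, buf) produces its dst-port string.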
9Because of available page width in the report, the filters are split over four lines, while they originally are written on one line each.
2.9 BOTNET
A botnet is a network of bots, i.e., a collection of computers which are infected with some kind of malicious distributed software. This means that a bot is actually a program running on the infected host. As previously mentioned in Section 1.3.1, botnets are considered a significant threat on the Internet. The following sections elaborate on the concept of botnets and the consequences they have for the end-users and the network.
2.9.1 Botnet Protocols
There are several ways to implement a botnet, and several communication protocols which may be used between the bots and the Command & Control (C&C) servers. The following points list some of the protocols used, as mentioned in [SBH+07].
• Internet Relay Chat
According to [RZMT06], [GH07] and [KRH07], to name a few, most botnets are based on the IRC protocol. According to [GSN+07], IRC is attractive to the botmaster due to its redundancy, scalability and versatility. Because of its importance, the IRC protocol is further explained in Section 2.10.
• Web-based
According to [SBH+07], web-based C&Cs are the second most common type. [SBH+07] continues by mentioning two types of web-based botnets. The first one, echo-based, means that the bot simply announces its existence to the C&C server. This may for example be achieved by sending the information as a URL to the web server.
The second type, command-based, uses GUI web interfaces to issue instructions to the bots.
• Peer-to-peer
As mentioned in [GSN+07], a peer-to-peer network is a network in which any node in the network can act as both client and server. Botnets based on peer-to-peer architecture are more resilient, as there is no centralized C&C server.
• Instant messaging
Computers infected with a bot would in this case communicate using known instant messaging protocols, such as AIM, Yahoo!, ICQ and MSN.
• FTP-based
According to [SBH+07], there are cases in which the FTP protocol is used. The bot may collect information from the host, and then upload the captured information to an FTP server.
[SBH+07] also claims that IRC is going to remain the leading protocol and application for C&C servers for a long time to come, even though the use of other protocols is increasing. In the following, the term botnet refers exclusively to IRC-based botnets.
2.9.2 The Botnet Lifecycle
Any vulnerable computer could become part of a botnet, and an example of how a computer is compromised and turned into a bot is given below. This example is an excerpt from [RZMT06] and is illustrated in Figure 2.10 below.
1. As shown in the figure, the first step is to exploit a vulnerability in the software running on the victim’s computer. This can be achieved by using self-replicating worms, e-mail viruses, etc.
2. After the attack, the exploited host will execute a script which will download a bot binary. The bot binary will install itself on the computer. Once this is installed, the bot software will start every time the computer is rebooted.
3. Most commonly, the bot running on the host will connect to a remote server. The IRC protocol is, in this example, used for communicating and controlling the bots. At this point, a communication channel between the infected host and the IRC server is set up.
4. The vulnerable computer is now part of the botnet and is under control of the botmaster via the IRC server. The botmaster has full control over the IRC server and thus all bots in her network. The botmaster may now send commands to her bots and prepare and execute a full-blown attack. In addition, it is possible to modify the bots if necessary.
Figure 2.10: Stepwise procedure showing how a host becomes part of a botnet.
It is worth mentioning that this is only one example of how a host is turned into a bot. As mentioned, several other methods and communication protocols may be used, but this will not be elaborated any further in this report.
2.9.3 Consequences
As mentioned in [KRH07], a botnet may be used to launch distributed DoS attacks, send spam, trojan and phishing e-mails, and so on, all of which may have a devastating effect on the networks.
As mentioned in [IH05], botnets may also be used to perform click fraud and steal personal information. Software programmed to capture sensitive information such as bank account numbers, PIN codes, passwords, etc., may have financial implications for the end-users.
In addition to losing private and personal data, the users may be employees of companies whose business strategies and other sensitive information may be compromised.
2.9.4 Botnet Families
Today there exist several botnet families. [BY07] describes four of the biggest ones: Agobot, SDBot, SpyBot, and GT Bot. It is, however, difficult for a researcher with non-criminal intentions to get hold of the source code of the different bots. Some are, however, available and are used in the research presented in this report. The following is a list of these IRC-based bots, with a short explanation:
• SDBot
This bot appeared in October 2002. According to [BY07], SDBot is a fairly simple bot, consisting of about 2,000 lines of C code. It is, however, claimed to be easily extendable.
[SBH+07] states that SDBot’s success is due to poor security on the compromised systems, and also to the fact that the author released it as open source. SDBot spreads itself by exploiting software that uses blank or common passwords on the host. The bot also includes a backdoor that allows an attacker to gain access to the infected host.
• RBot
As mentioned in [SBH+07], the RBot family is one of the most pervasive and complex of the ones existing today. Filenames and techniques used are different from one variant to another. This bot was, according to [SBH+07], the first one to use compression and encryption.
This bot includes functionality that makes it possible for the botmaster to download and execute files, create a SOCKS proxy, participate in DDoS attacks, and log keystrokes on the hosts, to name a few. In addition to exploiting weak passwords as SDBot does, RBot also exploits vulnerabilities in the Windows operating system and common software applications. [SBH+07] also claims that RBot can terminate many antivirus and security products to ensure that it can continue to run undetected.
• DBot
Unfortunately, there is not much documentation available about this bot, which makes it all the more interesting. This bot also provides a wide set of functionality. Some of the features are multicommand topic and chat parsing, IRC connection timeout and the ability to use an unlimited number of IRC servers, to name a few. Perhaps the most threatening feature is that this bot also includes a bypass of the Windows XP Service Pack 2 firewall. At least this is claimed in features.txt, which is included with the source code.
2.10 THE IRC PROTOCOL
As the focus of this thesis is on IRC-based botnets, this section provides some background knowledge of the IRC protocol.
According to [OR93], "IRC itself is a teleconferencing system, which (through the use of the client-server model) is well-suited to running on many machines in a distributed fashion."
The backbone of IRC is the connection of the IRC servers. Clients connect to these servers and may then chat with other clients connected to the same network. An example of a small IRC network is given in Figure 2.11 below.
Figure 2.11:Example of a small IRC network, as presented in [OR93]. A, B, C, D and E are all servers, while 1, 2, 3 and 4 represent clients connected to the servers.
When clients are communicating with each other, the messages traverse the relevant servers.
If for example clients 1 and 3 are communicating, the messages are sent through both server A and server B.
The clients have chosen unique nicknames as identifications, and may join or form their own channel. When connected to a channel, clients may send messages to each other, either publicly to all clients in the channel, or to specific clients privately. If, for example, clients 1, 2 and 3 are in the same channel, all messages to that channel are sent to all clients and the relevant servers.
In connection with botnets, the bots will sign on to one or more specific IRC channels which are maintained by the botmaster. The botmaster can now control the bots by sending commands through the channel.
See [OR93] for further elaboration of the IRC protocol.
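As an illustration of what a command sent through a channel looks like on the wire, the following C sketch extracts the sender, target and text from a single IRC PRIVMSG line. It only handles the common form ":prefix PRIVMSG target :text" with single spaces and an upper-case command, which is a simplification of the full message grammar. The structure and function names are hypothetical, and the command string in the usage example is made up.

```c
#include <stddef.h>
#include <string.h>

struct irc_privmsg {
    char sender[64];    /* prefix without the leading ':' (may be empty) */
    char target[64];    /* channel or nickname the message is sent to */
    char text[256];     /* trailing parameter, i.e. the message itself */
};

/* Hypothetical helper: parses one line of the form
 *     [:prefix ]PRIVMSG target :text
 * Returns 1 on success and 0 if the line is not a PRIVMSG of this form. */
int parse_privmsg(const char *line, struct irc_privmsg *out)
{
    const char *sp;
    size_t n;

    out->sender[0] = '\0';
    if (*line == ':') {                     /* optional prefix */
        sp = strchr(line, ' ');
        if (sp == NULL)
            return 0;
        n = (size_t)(sp - line - 1);
        if (n >= sizeof out->sender)
            return 0;
        memcpy(out->sender, line + 1, n);
        out->sender[n] = '\0';
        line = sp + 1;
    }

    if (strncmp(line, "PRIVMSG ", 8) != 0)
        return 0;
    line += 8;

    sp = strchr(line, ' ');                 /* end of the target */
    if (sp == NULL || sp[1] != ':')
        return 0;
    n = (size_t)(sp - line);
    if (n == 0 || n >= sizeof out->target)
        return 0;
    memcpy(out->target, line, n);
    out->target[n] = '\0';

    n = strcspn(sp + 2, "\r\n");            /* text up to the line ending */
    if (n >= sizeof out->text)
        return 0;
    memcpy(out->text, sp + 2, n);
    out->text[n] = '\0';
    return 1;
}
```

Given the line ":master!user@host PRIVMSG #chan :.version", the function fills in the sender master!user@host, the target #chan and the text .version, which is the part an IDS rule would typically match on.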
CHAPTER 3
APPROACH
This chapter describes how the different implementations of the IDSs on the high-speed network are carried out. First, an introduction is given, followed by the experimental approach.
The different implementations are described in Section 3.4 and Section 3.5. The results from these implementations are given in Chapter 4, and these results are analysed and discussed in Chapter 5.
3.1 INTRODUCTION
As mentioned in Chapter 2, installing an IDS on a high-speed network is not trivial. The main issue is the high data rate, which makes it hard to compare the packets and rules in real time.
This probably limits the number of rules that can be checked simultaneously.
In addition, there are some challenges when it comes to choosing which rules to use. Because of the comparison problem, it is preferable to use as few rules as possible to find the desired traffic. That means that the rules should not only be able to identify the traffic, but also do so without generating too many false positives1.
The following list is a point-by-point overview of the experimental approach described further in Section 3.3.
1. Set up a bot, an IRC server and botmaster on three different locations.
2. Set up an IDS; either as a MAPI implementation or by using Snort.
3. Run a mIRC script on the botmaster computer. The script generates malicious traffic between the botmaster and IRC server.
4. Detect the malicious traffic using the IDS.
5. Analyse and compare the detected traffic.
3.2 ARCHITECTURE
Figure 3.1 presents an overview of the experimental setup. The botmaster, bot, and IRC server are installed on computers located at three different places; the botmaster and bot are located on one side of the IDS while the IRC server is located on the other side. This is done in such a way that commands sent from the botmaster to the IRC server will also be detected by the IDS, in addition to the traffic between the IRC server and bot.
1 According to [SBH+07], a false positive is an event that the system reports as bad but that is in fact benign. A false negative, in contrast, is when the system reports that something is okay (or does not report anything) while the event is in fact bad.
The communication between the different computers will mainly take place on the Uninett network, described in Section 2.7. Even though two of the computers are located on other networks, the traffic is routed through the sensor where the IDS is installed.
Figure 3.1: Overview of the experimental architecture.
3.2.1 Security Aspects
An important point when it comes to the architecture is to make sure that security is attended to. As can be seen from Figure 3.1, there are basically two components which are especially vulnerable: the bot and the IRC server. The computer running the botmaster only uses well-known programs, and is not more exposed than any other computer connected to the Internet. The IDS is only available through the SSH protocol and is therefore not considered to be a threat to the experiments.
The computer running the bot is probably the most vulnerable part of the architecture. Bots can be harmful, and if precautions are not taken, one could end up giving away control over the computer to another botmaster. To prevent this from happening, the bot executables used in the experiments were all compiled by the authors. This is further elaborated in Section 3.3.1.
Because of this, it was guaranteed that the bots at least used a configuration file configured by the authors.
To deal with any unwanted behaviour, the bot programs were all run in a test environment where the traffic generated by the bots was monitored. Only after the traffic pattern and behaviour had been thoroughly checked was the bot allowed to run on the Internet as described in Figure 3.1. In addition, only safe commands were tested. These are basically commands which make the bots reply with, e.g., their version and uptime information. However, some scanner and vulnerability commands are included in the detection rules, which are further explained in Section 3.3.2.
The IRC server is the other vulnerable component besides the bot. The main reason for this is that the bot is connected to the IRC server and a channel on the server. If someone were to compromise the computer and take control over the IRC server, they would in theory control the connected bot as well.
To prevent this from happening, some measures were taken. First of all, the IRC server was only running while the experiments were conducted, and was shut down afterwards. In addition, the server was configured to only allow traffic from the IP addresses belonging to the botmaster and bot computers. Last, the IRC server was protected with a server password and a channel password.
The security of the different components is to some extent described further in the subsections below.
3.2.2 Botmaster
The botmaster is a standard Windows XP computer with an IRC client program installed. As mentioned, the botmaster is located on the opposite side of the IDS from the IRC server.
The botmaster is only responsible for sending commands to the IRC server. This is done from mIRC, an IRC client program. As explained in Section 3.3.3, these commands are sent from a mIRC script.
Since the tasks performed on the botmaster are so limited, the hardware requirements are minimal.
3.2.3 Bot
The computer where the bots are executed is a standard Windows XP installation. As with the botmaster, the bot computer is located on the opposite side of the IDS from the IRC server in Figure 3.1.
The bot executables require very few system resources, so the hardware requirements are minimal. However, the test bots used in this report were designed for the Windows platform, and therefore a Windows installation is required when choosing an operating system.
As mentioned in Section 2.9, bots can be very harmful and extra caution is required when working on them. In that connection, VMware was used to control the environment when running the different bots. This means that the bots were installed and run inside of a VM.
More details on VMware are found in Section 2.6.1.
To get an overview of the traffic originating from the bot, Wireshark was used. As explained in Section 2.6.2, this program makes it possible to see every packet going to and from a computer. By installing Wireshark on the host OS outside the VM, a complete picture of the traffic pattern generated by the bot was acquired.
In addition to the traffic monitoring outside the VM, Process Explorer from Sysinternals2 was used to keep an eye on the behaviour of the bot executables inside the virtual machine. This made it easy to control the environment and to check if other processes were spawned from the bot executables.
The precautions and traffic sniffing mentioned above were mainly conducted to check the behaviour of the bot. After the initial testing was finished and no malicious traffic was detected, a connection to the real IRC server (in Figure 3.1) was allowed and established.
3.2.4 IRC Server / C&C Server
It is important for the reader to understand that, in this connection, the IRC server and the Command & Control server are two names for the same thing. As previously mentioned in Section 2.9, the IRC Server is the place the bots connect to. Since the botmaster can control and command the bots from this server, the name C&C server is commonly used.
The computer hosting the IRC server is, like the botmaster and bot computers, a plain Windows XP installation. Running an IRC server, at least at the scale needed for the experiments in this report, requires no significant hardware.
In contrast to the botmaster and bot, the IRC server is located on a subnet on the opposite side of the IDS. This is due to the fact that the IDS needs to be able to capture traffic going both to and from the server.
The IRC server software used throughout the experiments is a program called UnrealIRCd3. The configuration of this software is pretty much a default installation and is therefore not elaborated any further.
There are, however, a couple of modifications which have been made. Even though the server was completely controlled by the authors throughout the experiments, the security still had to be attended to. As mentioned in Section 3.2.1, only the IP addresses of the botmaster and bot computers were allowed to connect to the server. In addition, UnrealIRCd was shut down between the experiments.
2Available at http://www.microsoft.com/technet/sysinternals/utilities/ProcessExplorer.mspx
3Available at http://www.unrealircd.com/
3.2.5 IDS
The IDS is probably the most important part of the architecture seen in Figure 3.1. Because of the high speed of the link where the IDS is installed, this computer requires fast hardware.
See also Section 2.7.
The computer runs Debian Linux (sarge) on a Pentium Xeon 3.2 GHz CPU with 3 GB memory and 1 GB swap. The PCI bus is a 64-bit PCI-X bus. This component is responsible for delivering the packets from the DAG cards, see below, to the software where the processing is done.
As discussed in Section 2.8, there are two DAG cards installed on the IDS; one for each traffic direction on the link. This means that dag0 is responsible for the traffic from A to B, while dag1 is responsible for the traffic from B to A.
Figure 3.2 below is a sketch of the computer running the IDS and illustrates how the packets arriving via the fiber tap are processed in the computer via the DAG card.
Figure 3.2: Uninett sensor
Both Snort and MAPI need to be able to pick up the packets from the bus for processing. Snort uses a library called libpcap4 modified for the DAG card, while MAPI talks directly to the DAG card via the DAG API.
As described in Section 2.8.1, it is possible to load filters on the DAG card and co-processor.
The DAG card processes the packets according to the filters and either rejects or accepts them, as Figure 3.2 illustrates. Rejected packets are discarded and accepted packets are delivered to
4Available from http://sourceforge.net/projects/libpcap/