Multisensor fusion is a relatively new discipline that combines data from multiple and diverse sensors and sources [39]. The goal is to infer events, activities, and situations from several observations. This section presents how data from multiple and diverse sensors and sources can be combined to increase reliability, and thus situational awareness. The concept of voting is introduced, and several voting schemes are discussed. Finally, a soft computing approach to voting is introduced.

3.3.1 Voting schemes

Von Neumann [56] suggested in 1956 the use of voting to combine unreliable data into a reliable version. In a general voting algorithm, the four main components are input data, output data, input votes, and output votes [57]. Parhami [57] proposes a taxonomy for voting algorithms, which we use to present the different types, or classes, of voting methods. Below, an overview of the possible combinations is presented, as proposed in [57].

Input data Can either be exact, where the input is viewed as inflexible, i.e. the output $y$ must be equal to some input $x_i$, or inexact, where the input is viewed as flexible and input objects represent neighbourhoods.

Output data Can either be a consensus, where the output data is a subset of the inputs with votes $w$ supporting $y$, or a mediation, where the output data $y$ is the result of an objective function minimised or maximised over all inputs.

Input votes Can either be oblivious, where input votes are fixed by being built into the voting algorithm, or adaptive, where input votes can be provided as inputs.

Output votes Can either be a threshold, where the output vote exceeds a given threshold, or a plurality, where the output is the sum of votes for the object with the most votes.

For simplicity, we have decided to focus on the output votes only, that is, on how a winning object is decided. As seen above, this can either be by threshold or by plurality.

Threshold voting

As threshold voting selects the objects whose votes exceed a given threshold, common majority voters in fact fall within the threshold category [57]. Generally, threshold voting is fundamentally simpler than plurality voting [58].
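The relationship between threshold voting and majority voting can be illustrated with a minimal sketch. The function names, the per-input weights, and the sample readings are hypothetical and chosen only for illustration; they are not taken from [57].

```python
from collections import defaultdict


def threshold_vote(inputs, weights, threshold):
    """Return every object whose total vote weight exceeds `threshold`."""
    tally = defaultdict(float)
    for obj, weight in zip(inputs, weights):
        tally[obj] += weight
    return [obj for obj, total in tally.items() if total > threshold]


def majority_vote(inputs):
    """Majority voting expressed as a threshold voter.

    Every input casts a unit vote and the threshold is half the number
    of inputs, so at most one object can exceed it.
    """
    winners = threshold_vote(inputs, [1] * len(inputs), len(inputs) / 2)
    return winners[0] if winners else None


# Hypothetical sensor readings.
readings = ["alarm", "alarm", "normal", "alarm", "normal"]
print(majority_vote(readings))  # alarm
```

With a threshold below half of the total vote weight, more than one object may pass, which is why the sketch returns a list from `threshold_vote`.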

Plurality voting

Plurality voting, on the other hand, counts the votes for each object and selects one of the objects with the highest vote. By combining plurality voting with a simple comparison of the output vote against a threshold, we can implement many threshold voters; however, the result may be much less efficient than a direct threshold voter [57].
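A minimal sketch of plurality voting, and of the combination with a threshold comparison described above, could look as follows. The function names and the sample inputs are hypothetical; ties are broken by first occurrence, which is only one of several possible choices.

```python
from collections import Counter


def plurality_vote(inputs):
    """Return (object, votes) for one of the objects with the most votes."""
    if not inputs:
        raise ValueError("at least one input is required")
    # most_common(1) yields the first-seen object among equally voted ones.
    winner, votes = Counter(inputs).most_common(1)[0]
    return winner, votes


def threshold_via_plurality(inputs, threshold):
    """Emulate a threshold voter: find the plurality winner first,
    then compare its output vote with the threshold."""
    winner, votes = plurality_vote(inputs)
    return winner if votes > threshold else None


# Hypothetical observations from six sources.
observations = ["tcp", "udp", "tcp", "icmp", "udp", "tcp"]
print(plurality_vote(observations))              # ('tcp', 3)
print(threshold_via_plurality(observations, 2))  # tcp
print(threshold_via_plurality(observations, 3))  # None
```

The emulation counts the votes for every object before the comparison, even though a direct threshold voter only needs to know whether some object passes the threshold.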

Ordered Weighted Averaging (OWA)

So far, we have only discussed voting where all votes are assumed equal. In many cases, some of the voters may be more reliable than others, and their votes should therefore weigh more. The OWA operator for aggregation was introduced in 1988 by Ronald R. Yager [59]. OWA operators can allow a positive compensation between ratings, i.e. they can realise trade-offs between objectives [60]: a higher degree of satisfaction of one criterion can compensate for a low degree of satisfaction of another criterion. The extreme cases of OWA operators are full compensation, $\max(a_1, \ldots, a_n)$, and no compensation, $\min(a_1, \ldots, a_n)$. The weights $w$ are then $w = (1, 0, \ldots, 0)^T$ and $w = (0, 0, \ldots, 1)^T$, respectively.
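The two extreme cases can be checked with a small sketch of an OWA operator. The function name `owa` and the sample scores are hypothetical; the sketch assumes the usual convention that the weights are non-negative, sum to one, and are applied to the values sorted in descending order.

```python
def owa(values, weights):
    """Ordered weighted average of `values`.

    The i-th weight is applied to the i-th largest value, not to a
    particular criterion.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * a for w, a in zip(weights, ordered))


# Hypothetical ratings of one object on three criteria.
scores = [0.3, 0.9, 0.6]
print(owa(scores, [1, 0, 0]))  # 0.9, full compensation (max)
print(owa(scores, [0, 0, 1]))  # 0.3, no compensation (min)
```

Any weight vector between these two extremes yields a value between the minimum and the maximum of the ratings; equal weights give the arithmetic mean.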

It is important to note that the weights are not connected to specific criteria, but to positions in the sorted ordering of the criteria values.

A linguistic quantifier [61] $Q_\alpha(r) = r^\alpha$, $\alpha \geq 0$, is defined, and a value of $\alpha$ is sought such that the linguistic quantifier $Q_\alpha$ approximates the criteria as closely as possible, be it expert preference or otherwise [60]. The selected $Q_\alpha$ is then applied to an OWA operator $F_Q(a_1, \ldots, a_n)$, and an aggregated score is calculated.
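The text does not state how the weights are obtained from $Q_\alpha$. A minimal sketch, assuming the commonly used construction $w_i = Q_\alpha(i/n) - Q_\alpha((i-1)/n)$ and the boundary condition $Q_\alpha(0) = 0$, is given below. The function names are hypothetical, and searching for a suitable $\alpha$ is left out.

```python
def quantifier_weights(n, alpha):
    """OWA weights w_i = Q(i/n) - Q((i-1)/n) for Q(r) = r**alpha.

    Assumption: Q(0) is fixed to 0, so that alpha = 0 gives
    w = (1, 0, ..., 0) instead of an all-zero vector.
    """
    if n < 1 or alpha < 0:
        raise ValueError("need n >= 1 and alpha >= 0")

    def q(r):
        return 0.0 if r == 0 else r ** alpha

    return [q(i / n) - q((i - 1) / n) for i in range(1, n + 1)]


def quantifier_owa(values, alpha):
    """Aggregate `values` with the OWA weights derived from Q_alpha."""
    weights = quantifier_weights(len(values), alpha)
    ordered = sorted(values, reverse=True)
    return sum(w * a for w, a in zip(weights, ordered))


print(quantifier_weights(3, 0))  # [1.0, 0.0, 0.0]
print(quantifier_weights(2, 2))  # [0.25, 0.75]
```

Under this assumption, $\alpha = 0$ reproduces the maximum, $\alpha = 1$ the arithmetic mean, and increasing $\alpha$ shifts weight towards the smallest values, approaching the minimum.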

3.3.2 Fuzzy voting

The application of fuzzy logic has proven successful in many scenarios, such as the combination of neural networks [62], malware detection [63], and general expert systems [64]. Fuzzy logic is based on the concept of variables belonging to a set to a certain degree, calculated using a membership function $\mu(\cdot)$, and is part of Soft Computing (SC), a collection of data-driven computational models [65].

What separates fuzzy logic operations from traditional logical operations is that there are no crisp lines or sets. Let $A$ and $B$ be two intersecting subsets of a set $X$. The membership of $x$ in the subsets $A$ and $B$ can then be calculated using $\mu(\cdot)$, e.g. $\mu_A(x) = 0.4$ and $\mu_B(x) = 0.6$.
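As an illustration, the sketch below reproduces these membership degrees with two hypothetical linear membership functions on the interval [0, 10]. The shapes of the functions, the interval, and the value x = 6 are assumptions made for the example only. The intersection and union use the standard minimum and maximum operators.

```python
def mu_a(x):
    """Hypothetical membership in A: falls linearly from 1 at x=0 to 0 at x=10."""
    return min(1.0, max(0.0, 1.0 - x / 10.0))


def mu_b(x):
    """Hypothetical membership in B: rises linearly from 0 at x=0 to 1 at x=10."""
    return min(1.0, max(0.0, x / 10.0))


x = 6.0
a, b = mu_a(x), mu_b(x)
print(round(a, 2), round(b, 2))  # 0.4 0.6
print(round(min(a, b), 2))       # 0.4, membership in the intersection of A and B
print(round(max(a, b), 2))       # 0.6, membership in the union of A and B
```

The same x belongs to both sets at once, only to different degrees, which is what distinguishes the fuzzy sets from crisp ones.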

When applying fuzzy logic to voting, a fuzzy integral is calculated for each object. The fuzzy integral is defined by [62]:

$$h(x) \circ g(\cdot) = \max_{F \subseteq X} \left[ \min\left( \min_{x \in F} h(x),\; g(F) \right) \right] = \max_{\alpha \in [0,1]} \left[ \min\left( \alpha,\; g(h_\alpha) \right) \right] \tag{3.1}$$

where $g$ is a fuzzy measure, $h$ is a density measure, and $h_\alpha = \{x \mid h(x) \geq \alpha\}$ is the $\alpha$-level set of $h$. By calculating the fuzzy integral for each object based on all voters, an aggregated score is generated; thus, a winner is decided based on the number of votes as well as how certain each voter is.
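A minimal sketch of fuzzy voting based on (3.1) is given below. For a monotone fuzzy measure, the maximum is attained on one of the level sets of $h$, so it suffices to sort the voters by their support. The voter names, the support values, and the additive measure built from per-voter importance values are hypothetical; the additive measure is only the simplest valid choice and is not necessarily the measure used in [62].

```python
def sugeno_integral(h, g):
    """Fuzzy integral of the support `h` with respect to the measure `g`.

    h: dict mapping each voter to its support in [0, 1] for one object.
    g: callable taking a frozenset of voters and returning a value in
       [0, 1]; assumed monotone, with g(empty set) = 0 and g(all) = 1.
    """
    ranked = sorted(h, key=h.get, reverse=True)
    best = 0.0
    for i, voter in enumerate(ranked, start=1):
        level_set = frozenset(ranked[:i])  # voters with support >= h[voter]
        best = max(best, min(h[voter], g(level_set)))
    return best


# Hypothetical importance of each voter (sums to 1).
importance = {"v1": 0.5, "v2": 0.3, "v3": 0.2}


def additive_measure(voters):
    """Assumed fuzzy measure: sum of the importance of the given voters."""
    return sum(importance[v] for v in voters)


# Hypothetical support of each voter for two candidate objects.
support = {
    "malware": {"v1": 0.9, "v2": 0.4, "v3": 0.7},
    "benign": {"v1": 0.1, "v2": 0.6, "v3": 0.3},
}
scores = {obj: sugeno_integral(h, additive_measure) for obj, h in support.items()}
winner = max(scores, key=scores.get)
print(winner)  # malware
```

In this example, "malware" wins because the two voters that support it most strongly also carry most of the importance, whereas the strongest supporter of "benign" carries little.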

3.4 Summary

In summary, this chapter discussed the field of ML and provided an overview of common processes. Preprocessing and its commonly used methods were presented, followed by an introduction to feature selection and commonly used measures. Further, we presented the concept of learning and discussed two common approaches, supervised learning and unsupervised learning. Evaluation of performance was discussed, and common challenges such as the ugly duckling theorem, the curse of dimensionality, the no free lunch theorem, and overfitting were presented.

Further, an introduction to the field of data fusion was given. Previous work in terms of definitions was presented, and widely used data fusion models were discussed. Models such as the JDL Fusion Model, the Intelligence Cycle, and the Boyd Control Loop were presented, providing an overview of the different types of models in terms of granularity and coverage. Where applicable, models were compared either stage by stage or by product.

Finally, multisensor fusion was presented. An overview was given of how multisensor fusion can be applied to combine data from several unreliable data sources into reliable data. Further, we briefly introduced fuzzy voting, which draws on fuzzy logic to take into account how certain each voter is.

4 Related Work

In the previous chapters, an introduction to the thesis was given, and theory on several central topics was presented. The following chapter presents previous work related to the expected contributions of this thesis. An overview of the state-of-the-art in data fusion in security operations is presented. Further, the state-of-the-art in reliable feature selection is discussed. An overview of the newly proposed reliable feature selection method is given, discussing the results from previous work. Current work in anonymisation is briefly presented. Further, the current use of information sharing in practice is presented.