
This section explores the K-Fold Cross-validation for patient 2, the predicted confusion matrices, and all statistical results for the model analyzed with the unmodified dataset (“Dataset 1”) and the augmented dataset (“Dataset 2”). The results for all the other patients are presented in Appendix A. The following list presents the renamed classes adopted in the architectures described next, for better readability:

I: Background class;

II: Brain class;

III: Penumbra class;

IV: Core class.

All results are based on training for 50 epochs; at the end of every epoch, a validation step is performed to evaluate the loss of the model. The loss function implemented for these architectures was categorical cross-entropy, while the optimizers used were stochastic gradient descent (SGD) and adaptive moment estimation (Adam).
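As a hedged illustration of this training setup, the sketch below compiles a model with categorical cross-entropy and either optimizer, then trains for 50 epochs with per-epoch validation. Keras is assumed as the framework; the placeholder model and random data stand in for the tile volumes and labels described in this chapter and are not the author's actual implementation.

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import SGD, Adam

# Placeholder model and data, only to make the example runnable; the real
# inputs are 16x16x30 tile volumes with one-hot labels for classes I-IV.
model = models.Sequential([
    layers.Input(shape=(16, 16, 30, 1)),
    layers.Flatten(),
    layers.Dense(4, activation="softmax"),
])
x_train = np.random.rand(8, 16, 16, 30, 1)
y_train = np.eye(4)[np.random.randint(0, 4, 8)]
x_val = np.random.rand(2, 16, 16, 30, 1)
y_val = np.eye(4)[np.random.randint(0, 4, 2)]

# Categorical cross-entropy with either SGD or Adam, as described above.
model.compile(optimizer=SGD(),  # alternatively: optimizer=Adam()
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# 50 epochs; the validation loss is evaluated at the end of every epoch.
history = model.fit(x_train, y_train, epochs=50,
                    validation_data=(x_val, y_val))
```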

4.3.1 Architecture 1

Figure 4.4: General structure for the Tile Classification architecture 1.


Fig. 4.4 displays a general overview of the first architecture of the Tile Classification approach. The network contains nine hidden layers plus the output layer. A summary of the architecture and the parameters involved is presented in Table 4.1. The first layer, a convolutional layer, takes as input a volume of 30 images of dimension 16x16 pixels; the input is convolved with a kernel of size (3,3,3), followed by a ReLU activation function and batch normalization. Subsequently, the second layer is another convolutional layer with a ReLU activation function plus a batch normalization operation.

The third layer performs a max-pooling operation of size (2,2,2) over the time-series images, followed by a dropout operation.

Layers four and five contain convolutional layers similar to the first two but with an input of size 8x8x15. The next layer executes a max-pooling operation plus a dropout operation to reduce the probability of overfitting. Layer seven is a convolutional layer with a 4x4x7 input and the same components as the other convolutional layers. Layer eight contains the last max-pooling operation before the fully-connected layers. A fully-connected layer then compresses the input from three dimensions to a one-dimensional vector of length 100. The output layer contains four artificial neurons, which yield the probabilities of the four brain classes for the time-series input. The total number of parameters is 203,032; 202,680 are trainable while 352 are non-trainable.
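A minimal Keras sketch of this layer sequence follows, assuming the framework and "same" padding; the filter counts (16/32/64) and dropout rates are not specified in the text and are chosen only for illustration, so the resulting parameter count will not match the 203,032 reported above.

```python
from tensorflow.keras import layers, models

# Sketch of Architecture 1 under stated assumptions (padding, filters,
# dropout rates are illustrative, not the author's exact configuration).
model = models.Sequential([
    layers.Input(shape=(16, 16, 30, 1)),  # 30 time-series images of 16x16

    layers.Conv3D(16, (3, 3, 3), padding="same", activation="relu"),  # layer 1
    layers.BatchNormalization(),
    layers.Conv3D(16, (3, 3, 3), padding="same", activation="relu"),  # layer 2
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),  # layer 3: 16x16x30 -> 8x8x15
    layers.Dropout(0.25),

    layers.Conv3D(32, (3, 3, 3), padding="same", activation="relu"),  # layer 4
    layers.BatchNormalization(),
    layers.Conv3D(32, (3, 3, 3), padding="same", activation="relu"),  # layer 5
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),  # layer 6: 8x8x15 -> 4x4x7
    layers.Dropout(0.25),

    layers.Conv3D(64, (3, 3, 3), padding="same", activation="relu"),  # layer 7
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),  # layer 8: 4x4x7 -> 2x2x3

    layers.Flatten(),
    layers.Dense(100, activation="relu"),   # layer 9: length-100 vector
    layers.Dense(4, activation="softmax"),  # output: probabilities of I-IV
])
```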

4.3.2 Architecture 2

Figure 4.5: General structure for the Tile Classification architecture 2.


The second architecture has a more compressed structure than the first one (Sec. 4.3.1): it has only seven hidden layers plus the final output layer, as shown in Fig. 4.5. The main difference from the first architecture, besides the number of layers, lies in the kernel size used in the convolutional layers: instead of the default kernel size of (3,3,3), a kernel size of (3,3,N) was implemented, where N is the number of images in the depth of the volume. Hence, the kernel evaluates the entire time-series volume of the brain section simultaneously.

A summary of the layers of this architecture is displayed in Table 4.2. The first layer is a convolutional operation with a kernel size of (3,3,30), a ReLU activation function, and a batch normalization at the end. Layer two applies a max-pooling operation with a window size of (2,2,2) to the input, plus a dropout function. The next four layers are a reiteration of the first two layers with different input sizes.

The last hidden layer flattens the output into a one-dimensional vector of length 100, using a dropout function to prevent overfitting, and passes the output to the final layer, which yields the four probabilities for the respective classes. The total number of parameters is 773,384; 773,160 are trainable while 224 are non-trainable. The number of parameters is almost four times larger than in the first architecture, which makes each training epoch noticeably slower.
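The (3,3,N) kernel idea can be sketched as follows, again assuming Keras with "same" padding (so the volume keeps its depth between convolutions) and illustrative filter counts and dropout rates; the parameter count will therefore differ from the 773,384 reported above.

```python
from tensorflow.keras import layers, models

# Sketch of Architecture 2: each Conv3D kernel spans the full remaining
# time dimension (N = 30, 15, 7), evaluating the whole volume at once.
model = models.Sequential([
    layers.Input(shape=(16, 16, 30, 1)),

    layers.Conv3D(32, (3, 3, 30), padding="same", activation="relu"),  # layer 1: N = 30
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),  # layer 2: 16x16x30 -> 8x8x15
    layers.Dropout(0.25),

    layers.Conv3D(64, (3, 3, 15), padding="same", activation="relu"),  # layer 3: N = 15
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),  # layer 4: 8x8x15 -> 4x4x7
    layers.Dropout(0.25),

    layers.Conv3D(128, (3, 3, 7), padding="same", activation="relu"),  # layer 5: N = 7
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),  # layer 6: 4x4x7 -> 2x2x3
    layers.Dropout(0.25),

    layers.Flatten(),
    layers.Dense(100, activation="relu"),   # layer 7: length-100 vector
    layers.Dropout(0.25),
    layers.Dense(4, activation="softmax"),  # output: probabilities of I-IV
])
```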

4.3.3 Architecture 3

Figure 4.6: General structure for the Tile Classification architecture 3.


The last architecture presents a structure similar to the second one (Sec. 4.3.2) but with two fundamental differences: the window size of the first max-pooling layer, which is (2,2,N), and the number of layers. Fig. 4.6 exhibits an overview of the architecture, which contains eight hidden layers plus the output. Layer one is a convolutional layer with a kernel size of (3,3,N), a ReLU activation function, and a batch normalization operation. The second layer plays an essential role in the architecture: it is the max-pooling layer with a window size of (2,2,N), where N is equal to the number of time series in the input volume. After this operation, the depth of the time-series input volume is shrunk to one. Layers three, four, and six execute convolutional operations with a ReLU function, a kernel size of (3,3,1), and a final batch normalization at the end of each layer. The fifth layer performs a max-pooling operation with a window size of (2,2,1), in the same way as layer seven. The last two layers implement a flatten and a dense procedure to create an output of 4 values, one probability for each class. The total number of parameters is 63,800; 63,312 are trainable while 488 are non-trainable.

This architecture contains the smallest number of parameters among the three architectures; hence, each epoch takes approximately 400 seconds to execute.
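A hedged Keras sketch of this architecture is given below; note how the (2,2,30) max-pooling window collapses the time dimension to one immediately after the first convolution, which is what keeps the parameter count low. Padding and filter counts are again illustrative assumptions, so the count will differ from the 63,800 reported above.

```python
from tensorflow.keras import layers, models

# Sketch of Architecture 3: one (3,3,N) convolution over the full time
# series, then a (2,2,N) pooling that reduces the depth to 1, after which
# all convolutions are effectively 2D with kernel (3,3,1).
model = models.Sequential([
    layers.Input(shape=(16, 16, 30, 1)),  # N = 30 time-series images

    layers.Conv3D(16, (3, 3, 30), padding="same", activation="relu"),  # layer 1
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 30)),  # layer 2: depth 30 -> 1

    layers.Conv3D(32, (3, 3, 1), padding="same", activation="relu"),  # layer 3
    layers.BatchNormalization(),
    layers.Conv3D(32, (3, 3, 1), padding="same", activation="relu"),  # layer 4
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 1)),  # layer 5: 8x8 -> 4x4

    layers.Conv3D(64, (3, 3, 1), padding="same", activation="relu"),  # layer 6
    layers.BatchNormalization(),
    layers.MaxPooling3D(pool_size=(2, 2, 1)),  # layer 7: 4x4 -> 2x2

    layers.Flatten(),                       # layer 8
    layers.Dense(4, activation="softmax"),  # output: probabilities of I-IV
])
```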