Data Collection from Experiment - Cyber Grooming Detection: Human or Machine? Or Hybrid?

The data analysed in this thesis was collected through an experiment. The experiment required volunteer participants to manually evaluate conversations from the 2000-conversation dataset.

The goal was to get as many evaluations as possible of the 2000 conversations, preferably several evaluations of each conversation. This proved much more challenging than initially thought: it was not easy to recruit volunteers in the first place, and it was even harder to get those who had volunteered to actually complete the evaluations.

3.2.1 Participants

Selecting participants for an experiment is essential for the collected data to be as good and useful as possible. Since cyber grooming can be performed in many different ways, it is essential to cover as much ground as possible with regard to what triggers people to evaluate a conversation as potentially predatory. Older people view conversations in light of their life experience and their understanding of today's society. Younger people, on the other hand, view conversations in light of their experience so far, and are also more used to chat as a communication platform.

Also, it is possible that women will react differently than men. Based on this, the experiment aimed for a wide variety of participants, which hopefully would give valuable data for further use.

In order to avoid unnecessary feedback without substance, some limitations were set for the participation:

• The lower age limit for participation was set to 18 years. The reason was that 18 is the age of majority in Norway, so parental approval was not necessary. Further, at the age of 18 people start to become more reflective due to life experience, but still have youthful opinions and understanding. This is valuable for potentially getting a better understanding of the meaning between the lines.

• The upper age limit was set to 65 years. Older people grew up in an age without this technology around, and are in general assumed not to possess the knowledge and understanding needed for this study.

Gathering volunteers for participation proved to be considerably harder than initially thought. In total, 36 people signed up to participate. First of all, it was challenging to convince people to participate after explaining the experiment.


Figure 3.2: Structure of XML file.

Many people thought it sounded too comprehensive and like too much work for them to want to participate. Further, several of the volunteers ended up doing very little or nothing, resulting in fewer evaluations of conversations than we had initially hoped for. Of the 36 who initially signed up, only 20 actually participated. The number of evaluations contributed by each participant ranged from a few to dozens.

3.2.2 The Experiment

The experiment used to collect human evaluations of conversations for this thesis was conducted through an online web application. The web application was provided by the supervisor of this thesis, and was created for this specific purpose at NTNU Gjøvik. This allowed the participants to take part in the comfort of their own surroundings and at a time that suited their schedule best. It also limited the potential spread of COVID-19, as it was not necessary to gather people in one location.

For the web application, a user account was created for each user in order to keep track of gender and age. This also allowed each user to log in and out as many times as they wanted, in the hope that they would do more evaluations over a period of time by doing some now and some then.

Figure 3.3 is a screenshot of the web application used for the experiment. At the top, it greets the user and shows how many evaluations the person has submitted in total. On the left-hand side is a menu with the action buttons "Start", "Predatory", "Non-Predatory", "Quit", "Pause" and "Resume".

The "Start" button starts a new conversation for evaluation. A conversation corresponds to one XML file in the dataset. The conversation is then displayed message by message in chronological order, with a few seconds between messages. Each party in the conversation is represented by its own color in the main field of the screen, to the right of the menu. The first party to write a message is shown in green on the left-hand side, and the other party is shown in red on the right-hand side. When a user has read enough messages to evaluate the conversation as potentially predatory or non-predatory, the "Predatory" or "Non-Predatory" button is used, respectively. When the "Non-Predatory" button is clicked, the conversation stops and a dialog box pops up at the bottom right of the screen, as shown in figure 3.4. In this dialog box the user selects with radio buttons whether the conversation is sexual or normal, and writes a few words explaining the decision before hitting the "Submit" button. After the submission, the user is presented with the rest of the conversation, as shown in figure 3.5. The user can then read through the remainder of the conversation and decide whether to stand by the decision by hitting the "Continue" button on the left side, or to change it by hitting the "Change Decision" button on the left side. When a user wants to change the decision, a new dialog box is shown at the top of the screen, prompting the user for the reason for the change.

When the reason is given, the user is then given the option to choose predatory or non-predatory again.

Figure 3.3: The graphical user interface (GUI) of the experiment.

For cases where the user thinks the conversation is potentially predatory, the "Predatory" button in the menu is used. The conversation is once again stopped, and a dialog box pops up at the bottom right, similar to the dialog box for non-predatory conversations. The difference is that the user has to choose which side he/she thinks is the predator by selecting one of two radio buttons labeled "The left one (green)" and "The right one (red)". Below the radio buttons, the user then describes in a few words or sentences how he/she came to that conclusion, and hits the "Submit" button. The remainder of the conversation is then displayed in full, and the user can read through it and decide whether to stand by the decision or to change it, in the same way as with non-predatory conversations. Clicking "Continue" in the menu on the left-hand side finishes and submits the evaluation, while clicking "Change Decision" opens a dialog box at the top of the browser, prompting for the reason why the decision is to be changed. The user then returns to the menu, where predatory or non-predatory can be chosen again.

3.2.3 Data Result from Experiment

Figure 3.4: Experiment GUI: A conversation is marked as non-predatory and sexual. A few words are to be written before the submission.

From the experiment, a lot of valuable data was collected from human evaluations. From the database of the web application, a CSV file was exported containing the data collected for each evaluation by each user. The exported CSV file consists of 10 columns, which contain the following:

1. Index counter.

2. File ending number, part of XML file name for the specific conversation.

3. How many messages back and forth the participant needed before a decision was made.

4. Participant ID

5. Analysis result; predatory or non-predatory

6. Subresult of analysis result; left or right for predatory conversations and normal or sexual for non-predatory conversations.

7. Date and time for when the decision was made.

8. Final (1) or changed (0) decision. In case of a changed decision (0), the following row will give the changed decision.

9. Text field with the reason for the participant's decision.

10. Dataset name, part of XML file name for the specific conversation.

By combining columns 10 and 2, we get the exact file name of the XML file for the conversation in question.
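The column layout above can be illustrated with a minimal Python sketch. It is not the code used in the thesis. The column names, the value spellings (such as "predatory" and "left"), the timestamp format, the sample rows and the file name template are all assumptions made for illustration; the text only specifies the order and meaning of the ten columns, and does not state how columns 10 and 2 are joined into a file name.

```python
import csv
import io

# Hypothetical column names, in the order the ten columns are described above.
COLUMNS = [
    "index", "file_number", "messages_read", "participant_id", "result",
    "subresult", "timestamp", "is_final", "reason", "dataset_name",
]

# Sub-results allowed for each analysis result (value spellings are assumed).
VALID_SUBRESULTS = {
    "predatory": {"left", "right"},
    "non-predatory": {"normal", "sexual"},
}


def load_evaluations(text):
    """Parse header-less CSV text into one dict per row, keyed by COLUMNS."""
    rows = []
    for raw in csv.reader(io.StringIO(text)):
        if not raw:
            continue  # skip blank lines
        if len(raw) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(raw)}")
        row = dict(zip(COLUMNS, raw))
        row["messages_read"] = int(row["messages_read"])
        row["is_final"] = row["is_final"] == "1"  # 1 = final, 0 = changed
        if row["subresult"] not in VALID_SUBRESULTS.get(row["result"], set()):
            raise ValueError(f"unexpected result/sub-result pair in row {row['index']}")
        rows.append(row)
    return rows


def xml_file_name(row, template="{dataset}{number}.xml"):
    """Combine column 10 and column 2. The template is an assumption."""
    return template.format(dataset=row["dataset_name"], number=row["file_number"])


def decision_chains(rows):
    """Group each changed decision (0) with the rows that follow it, up to
    and including the final decision (1). Assumes rows are in export order.
    A trailing group without a final row is dropped."""
    chains, current = [], []
    for row in rows:
        current.append(row)
        if row["is_final"]:
            chains.append(current)
            current = []
    return chains


# Invented sample rows, for illustration only.
SAMPLE = """\
0,0012,14,7,non-predatory,normal,2021-03-01 18:02:11,0,Seemed like small talk,convo_
1,0012,14,7,predatory,left,2021-03-01 18:05:40,1,Asked for photos later on,convo_
2,0440,9,3,predatory,right,2021-03-02 09:15:00,1,Pushes to meet alone,convo_
"""

rows = load_evaluations(SAMPLE)
chains = decision_chains(rows)
```

Each chain ends with the decision the participant stood by, so `chains[0][-1]` is the final verdict for the first sample conversation and `xml_file_name(chains[0][-1])` points back to its XML file under the assumed naming template.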
