
Architecture and Implementation

5.2 Usability Testing

5.2.2 Usability Survey

Usability testing is important for getting valuable feedback about an artefact and making improvements. Instead of considering a single aspect of usability when creating an artefact, ISO 9241-11 suggests that the measures of usability should cover the following factors³:

• Effectiveness

• Efficiency

• Satisfaction

In addition to testing the prototype hands-on, the participants of the prototype evaluation were asked to provide feedback about the data transformation recommender system. We therefore asked them a series of questions, to be answered in conjunction with the testing of the prototype.

The questionnaire used for usability testing is the System Usability Scale (SUS), a reliable, low-cost usability scale that can be used for global assessments of system usability. The scale was created by John Brooke in 1986 to measure the usability of electronic office systems, but is now used to test many different kinds of systems. The SUS is a questionnaire of ten items that are answered on a Likert scale⁴ and that gives an overview of the ease of use (or lack thereof) of different applications. Figure 5.1 is an illustration of a five-level Likert scale.

³ https://www.iso.org/obp/ui/#iso:std:iso:9241:-11:ed-2:v1:en

⁴ https://en.wikipedia.org/wiki/Likert_scale

[Figure: scale points from left to right: Strongly disagree, Disagree, Neutral, Agree, Strongly agree]

Figure 5.1: A five-level Likert scale.

System Usability Scale Questions

The System Usability Scale questionnaire used to test the prototype created for this thesis consists of the following 10 questions:

1. I think that I would like to use this application frequently.

2. I found this application unnecessarily complex.

3. I thought this application was easy to use.

4. I think that I would need assistance to be able to use this website.

5. I found the various functions in this application to be well integrated.

6. I thought there was too much inconsistency in this application.

7. I would imagine that most people would learn to use this application very quickly.

8. I found this application very cumbersome to use.

9. I felt very confident using this application.

10. I needed to learn a lot of things before I could get going with this system.

5.2.3 Results

Each participant in the usability testing rated each of the 10 questions listed in the previous section from 1 to 5, based on their level of agreement. A SUS score is calculated individually from each participant's responses. Below is an overview of the method used for calculating the SUS score:

1. For each of the odd-numbered questions, 1 is subtracted from the response.

2. For each of the even-numbered questions, the response is subtracted from 5.

The values calculated above are added up and the sum is multiplied by 2.5 to obtain the overall value of SUS in a range of 0 to 100.

Even though the score is on a scale of 0 to 100, it is not a percentage.
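The calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not code from the prototype; the function name sus_score is an assumption made for this example.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for number, response in enumerate(responses, start=1):
        if number % 2 == 1:
            total += response - 1   # odd-numbered question: response minus 1
        else:
            total += 5 - response   # even-numbered question: 5 minus response
    return total * 2.5              # scale the 0-40 sum to 0-100


# Responses of participant 1 as listed in Table 5.2
print(sus_score([4, 2, 4, 2, 4, 2, 5, 2, 3, 2]))  # 75.0
```

Every question contributes between 0 and 4 points after the adjustment, so the sum lies between 0 and 40 and the factor 2.5 maps it onto the range 0 to 100.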

Across scores from 500 products, for example consumer and business software, websites, and cell phones⁵, the average SUS score was found to be 68, which corresponds to the 50th percentile⁶. Raw SUS scores above 68 are therefore considered above average, and scores below 68 below average. Table 5.2 below shows the complete list of SUS questionnaire responses from all participants and the scores calculated from them.

⁵ http://uxpamagazine.org/sustified/

⁶ https://measuringu.com/10-things-sus

Participant  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10  SUS Score
1            4   2   4   2   4   2   5   2   3   2    75
2            3   2   3   1   3   2   4   2   2   2    65
3            4   1   4   1   4   2   4   2   3   2    77.5
4            3   2   3   2   3   3   4   2   3   2    62.5
5            2   2   4   2   4   1   4   2   4   1    75
6            5   2   4   2   5   2   5   1   4   1    87.5
7            4   2   4   1   4   1   5   2   3   2    80
8            4   2   4   2   4   1   5   2   3   2    77.5
9            3   2   3   1   4   2   3   2   3   1    70
10           2   3   4   2   4   2   2   1   1   1    60
11           2   2   3   1   2   1   4   2   3   2    65
12           4   2   4   2   3   3   5   3   3   3    65
13           4   2   4   2   3   2   5   1   4   1    80
14           3   3   3   1   4   1   5   2   2   1    72.5
15           2   1   4   2   4   2   4   2   3   2    70
Mean                                                  72

Legend: Very good, Good, Average, Bad, Very bad

Table 5.2: SUS survey scores per question and total SUS score

The average SUS score calculated from the responses of all participants is 72, which is slightly higher than the benchmark average of 68. We also found similar patterns in how the same transformations were performed across different datasets; for instance, users preferred to apply the transformation Delete column to similar data objects in all datasets tested.
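As a quick check of the reported mean, the 15 SUS scores from Table 5.2 can be averaged and compared with the benchmark of 68. The sketch below is only an illustration; the names SUS_SCORES and BENCHMARK are assumptions made for this example.

```python
from statistics import mean

# SUS scores of participants 1-15, copied from Table 5.2
SUS_SCORES = [75, 65, 77.5, 62.5, 75, 87.5, 80, 77.5, 70, 60, 65, 65, 80, 72.5, 70]
BENCHMARK = 68  # average SUS score taken from the cited sources

average = mean(SUS_SCORES)
above = sum(1 for score in SUS_SCORES if score > BENCHMARK)

print(f"mean SUS score: {average:.1f}")  # 72.2
print(f"participants above {BENCHMARK}: {above} of {len(SUS_SCORES)}")  # 10 of 15
```

The unrounded mean is about 72.2, which is reported as 72 in Table 5.2.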

5.3 Discussion

In this thesis we created a prototype of a data transformation recommender system based on user interactions, using a Random Forest classifier. We defined requirements earlier in this thesis and validated them in the evaluation process. We found that the prototype successfully fulfills all the requirements.

In addition to validating the requirements, we tested the usability of the prototype with 15 users, who evaluated it by first using it to transform a given dataset and then providing feedback by answering the System Usability Scale (SUS) questionnaire. The responses were mixed and yielded an average score of 72. This score is above the 50th percentile, i.e. just above average. Based on the SUS scores calculated from the evaluation survey, we can draw the following conclusions:

• The scores of Questions 2, 3, 4, 7 and 8 in Table 5.2 lie above average and indicate that the users found the application easy to use.

• The users found the application consistent and its functions well integrated, as can be seen from the scores of Questions 5 and 6.

• The scores of Questions 1, 9 and 10 show that the users are familiar with the data transformation process but perceive the current version of the prototype as average or below average in the efficiency of transforming data.
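One way to inspect the per-question results behind these conclusions is to average the adjusted contribution of each question over all participants. The sketch below is an assumption about how such a breakdown could be computed and not necessarily the procedure used in this thesis; the names RESPONSES, adjusted and question_means are hypothetical.

```python
# Raw responses (Q1..Q10) of participants 1-15, copied from Table 5.2
RESPONSES = [
    [4, 2, 4, 2, 4, 2, 5, 2, 3, 2],
    [3, 2, 3, 1, 3, 2, 4, 2, 2, 2],
    [4, 1, 4, 1, 4, 2, 4, 2, 3, 2],
    [3, 2, 3, 2, 3, 3, 4, 2, 3, 2],
    [2, 2, 4, 2, 4, 1, 4, 2, 4, 1],
    [5, 2, 4, 2, 5, 2, 5, 1, 4, 1],
    [4, 2, 4, 1, 4, 1, 5, 2, 3, 2],
    [4, 2, 4, 2, 4, 1, 5, 2, 3, 2],
    [3, 2, 3, 1, 4, 2, 3, 2, 3, 1],
    [2, 3, 4, 2, 4, 2, 2, 1, 1, 1],
    [2, 2, 3, 1, 2, 1, 4, 2, 3, 2],
    [4, 2, 4, 2, 3, 3, 5, 3, 3, 3],
    [4, 2, 4, 2, 3, 2, 5, 1, 4, 1],
    [3, 3, 3, 1, 4, 1, 5, 2, 2, 1],
    [2, 1, 4, 2, 4, 2, 4, 2, 3, 2],
]


def adjusted(number, response):
    """Map a raw 1-5 response to a 0-4 contribution (higher is better)."""
    return response - 1 if number % 2 == 1 else 5 - response


def question_means(rows):
    """Mean adjusted contribution of each of the ten questions."""
    count = len(rows)
    return [
        sum(adjusted(q, row[q - 1]) for row in rows) / count
        for q in range(1, 11)
    ]


for q, value in enumerate(question_means(RESPONSES), start=1):
    print(f"Q{q}: {value:.2f}")
```

On this 0 to 4 scale a value of 2 corresponds to a neutral answer, and the ten means multiplied by 2.5 add up to the overall mean SUS score.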

The users who did hands-on testing of the prototype also provided feedback in response to open-ended questions about features that are missing in the prototype, as well as suggestions for improvement.

Below is a summary of the feedback received:

• Users prefer a wide variety of data transformations to choose from.

• Editable transformations should be implemented to provide flexibility while transforming data.

• More intelligent detection of data types such as zip codes and URLs would ease the data transformation process.

In conclusion, the predictive data transformation prototype is perceived as useful by the users, but it provides a limited range of data transformations, which makes it less satisfactory in terms of the efficiency of the data transformation process.

Chapter 6 Conclusion

This chapter will briefly summarize the thesis, discuss the contribution of this thesis and areas for further research and future work.

With the availability of vast amounts of data, data analysis aids in the discovery of meaningful insights. Most datasets benefit from preprocessing, including data cleaning and transformation, before they are used for analysis. Data preprocessing is done to, for instance, detect and correct outliers, convert data into a suitable format, and remove inconsistencies. However, data preprocessing is a time-consuming process, and with a wide array of possible data transformations, it can be difficult for data scientists to choose the most relevant one.

Machine learning, because of its ability to automate the creation of analytical models, is being used to optimize different processes, including pattern recognition and predicting the outcome for unseen data without human intervention. To transform data efficiently, a machine learning algorithm can help create a self-learning model that identifies user preferences through user interactions and provides the most relevant transformations. Thus, for this thesis, we posed three research questions:

• How can relevant data transformation suggestions be provided based on user interactions?

• How can the Random Forest algorithm be used to suggest data transformations?

• How useful are data transformation suggestions generated using machine learning?

The prototype created as the contribution of this thesis aimed to find out whether machine learning could be used to recommend data transformations based on user interactions, and how useful such suggestions are for the users of data cleaning and transformation tools. Some basic transformations were implemented in the prototype and tested with company data as the domain. By tracking the user interactions performed through the selection of data objects in the tabular dataset, the application prototype recommends relevant data transformations from the ones implemented.

The prototype consists of four components: the tabular dataset, the statistical information about the dataset and the selected data object, the recommended transformations based on the selected data object, and the history of the transformations performed. In addition, limitations have been identified, and improvements that address them are proposed as future work.

In document Predictive Data Transformation Suggestions Using Machine Learning: Generating Transformation Suggestions in Grafterizer Using the Random Forest Algorithm (pages 67-74)