Research Project: Understanding the Query Formulation Challenge

According to Venters et al. (2001), there is little evidence to support the usability of visual query formulation tools, and QBD CBIR interfaces remain one of the least researched and developed elements of CBIR systems. The literature generally acknowledges that the main drawback of this approach is that it depends on the user’s ability to create good example images (see, for example, Jaimes and Chang (2002)).

Although CBIR and QBD have been active research fields for almost two decades, several challenges remain unsolved, particularly related to these systems’ ability to provide users with results that are semantically relevant to the visual queries. As a result, only a few CBIR systems are currently available to end users.

Consequently, a main focus of this work was to study the needs, expectations, experiences and challenges of users expressing image needs to a CBIR system by drawing visual queries, with a particular focus on the query formulation challenge. This was done by gathering empirical data about these issues, and identifying how this material can be used to improve current systems’ ability to process visual queries expressed by drawing. The challenges of query interpretation, query mismatch and media mismatch have not been directly evaluated, but the results are important factors in understanding and solving some of these other challenges.

Based on this, five major research goals were defined for this project:

1. Understand how users behave when expressing image requests by drawing visual image queries

2. Determine the type of drawings users create when expressing image requests by drawing visual image queries

3. Understand how users experience expressing image requests by drawing visual image queries

4. Determine if QBD CBIR systems can be useful tools for end users, despite the current challenges facing these systems

5. Identify potential improvements that can be made to QBD CBIR systems

An important aspect guiding this work is the notion of expressive convenience. Users will usually approach an image retrieval system with one or more image information needs, and have to translate these needs into a query in the language provided by the system. While the process of drawing visual queries as used in this work might not qualify as a formal language, it might nevertheless be relevant to discuss this process in terms normally used for such languages. One important aspect of formal languages is that a language has a certain expressive power, i.e. the potential for what might be expressed using the language, regardless of how easy or hard it is to use the language.

The expressive power of an image query interface is defined as the type of image information requests that can be expressed using the interface (Definition 5).

The expressive power represents capabilities of a given language or interface: what can be expressed. A complementary notion to this is expressive convenience: how a language or interface can be used to express a query (Trovåg 2004; Moe 2006).

The expressive convenience of a visual query interface is defined as the ease a user experiences when expressing a given image information request using the interface (Definition 6).

While the expressive power and expressive convenience of visual queries have not been formally used as evaluation criteria in this work, they represent a foundation for the work and have guided the direction of the research.

The research goals are expressed in the following set of research questions:

• RQ1: How do users utilize the visual query interface when they draw visual queries?

• RQ2: How realistic are the query images drawn by QBD CBIR users?

• RQ3: What are the major challenges encountered when users draw visual queries?

• RQ4: How do users feel about expressing image requests by drawing visual queries?

• RQ5: What improvements can be made to CBIR systems in order to better support users when drawing visual query images?

The first research question focuses on understanding how the users make use of the tools available for drawing visual queries. Understanding the users’ use of, and actions in, the user interface may provide important insights into how these interfaces can be improved, as well as clues on how these interactions might be used to assist the system in interpreting the queries. This research question is operationalized and evaluated in chapter 5.

The second research question focuses on the degree of realism in the query images the users create. Current CBIR systems are primarily based on low-level similarity functions. Successful retrieval is dependent on similarities between the query image and the relevant images in a collection. This is particularly important for the challenges of query interpretation and query mismatch. Accordingly, query images created by users need to be analyzed. This research question is operationalized and evaluated in chapter 6.
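
To illustrate what a low-level similarity function can look like, the following minimal sketch compares two images by the intersection of their normalized intensity histograms. It is an illustrative assumption only, not the similarity function of any system discussed in this work; the function names, the bin count and the tiny pixel lists are hypothetical.

```python
from typing import List, Sequence


def intensity_histogram(pixels: Sequence[int], bins: int = 4) -> List[float]:
    """Return a normalized histogram of 8-bit grayscale intensities (0-255)."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = [0] * bins
    for value in pixels:
        if not 0 <= value <= 255:
            raise ValueError("intensities must lie in 0..255")
        # Map the intensity to one of `bins` equally wide buckets.
        counts[value * bins // 256] += 1
    total = len(pixels)
    return [count / total for count in counts]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(x, y) for x, y in zip(a, b))


# Hypothetical data: a sparse, mostly white line drawing and a darker photograph.
drawn_query = [255, 255, 255, 0, 255, 255, 0, 255]
photograph = [40, 90, 120, 60, 200, 30, 80, 110]

similarity = histogram_intersection(
    intensity_histogram(drawn_query), intensity_histogram(photograph)
)
print(f"similarity = {similarity:.3f}")
```

In this toy example the score stays low even if a user would consider the drawing a fair depiction of the photograph, which is one way to picture why the degree of realism in drawn queries matters for functions of this kind.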

The third research question focuses on gaining an understanding of the query formulation problem and identifying what the users find to be the most challenging aspects of the visual query formulation process. This concerns issues such as what the users find challenging, why it is challenging and what can be done to improve this process. Understanding these challenges is a fundamental step towards creating systems that can best support users when expressing these queries, and towards increasing the likelihood that users will find visual queries a viable alternative to text-based queries. This research question is operationalized and evaluated in chapter 7.

The fourth research question covers one of the least evaluated fields within CBIR: how users feel about expressing image requests through visual queries. Reading through the existing literature, one might get the impression that visual queries might not be a preferred tool for users, as illustrated by the following quote from a peer-review process:

I am not surprised at all when the study indicates that users tend to draw simple iconic pictures for simple retrieval tasks. My argument is that users may not want to draw at all for simple retrieval tasks!

Based on this, it was felt that a thorough evaluation of the opinions and feelings of a set of users using visual queries might be both relevant and interesting for researchers of image retrieval. This research question is operationalized and evaluated in chapter 8.

The final research question focuses on identifying which, if any, improvements actual users of visual query interfaces suggest. Having users try different interfaces might identify shortcomings in these interfaces, making it possible to identify improvements based on feedback from these users. This research question is evaluated based on the overall results and data produced during the project. Chapter 9 presents an overview and discussion of the suggestions made by the respondents in the project, while chapter 0 presents four steps that must be followed in order to advance CBIR systems from their current position as experimental prototypes to powerful tools that may be useful for users expressing specific image requests to an image retrieval system.

An overview of the research questions and their corresponding research hypotheses can be found in Appendix 3 - Research Questions and Hypotheses. While the operationalization and evaluation of these research questions are presented in chapters 5 through 10, the questions themselves are answered in section 10.1.

In document Drawing visual query images : use, users and usability of query by drawing interfaces for content based image retrieval systems (pages 27-30)