3. Evaluation purposes, processes and practices

3.3 Evaluation designs and models

The varying definitions and perceptions of evaluation have spawned a broad spectrum of designs and models of evaluative investigation. In their meta-review of this theme, Madaus and Kellaghan emphasise how the "conduct and nature of any evaluation is affected by how one defines the process of evaluation" (2000: 19). They go on to recognise that the plurality noticeable within the evaluation field is underpinned by "deep epistemological differences" and diverse opinions about process. As such, evaluation models tend to characterise a particular author's view of process and subsequent suggestions for practical implementation rather than describe any widely accepted theoretical position. Numerous evaluation models have developed over time, often to match prevalent theories of organisation and management.

55 (Links to question 1 and section 1 more generally in the interview guide).

Without any wide acceptance of what models and designs should entail, the variety of perspectives has increased over time. As evaluation models developed within the education field, Stufflebeam (Stufflebeam & Webster, 1983) recognised potentially loose coupling between the effects and uses of results in the varied types of models that had been applied. He noted that many studies failed to match up to the purpose of designing and conducting an evaluation to assist in judging and improving the worth of an educational object (1983: 24). Of these studies, some were politically oriented "pseudo-evaluations", focused on presenting positive or negative images of a programme "irrespective of its actual worth"; others were questions-oriented "quasi-evaluation" studies, which apply a methodology thought appropriate for the particular questions to be addressed56, regardless of whether these are relevant for "developing and supporting value claims"; and still others were values-oriented evaluations designed "primarily" to meet the basic evaluative purpose outlined above57.

Stufflebeam considered that such loose coupling between purposes, designs and utilisation appeared to be exhibited in the varied perceptions of the clients, practitioners and audiences involved. In his findings, clients tended to be driven towards the political models and evaluators towards the questions models, while audiences were keen to know the value of the object under investigation. Stufflebeam's conclusion was that evaluators should be "sensitive" to their own agendas as well as those of clients and audiences, including possible conflicts. He suggested that evaluators should assess the relative merits of each model they intend to implement, in collaboration with the client and users. As will be observed, the evaluation field has generally focused somewhere between improving methods and increasing participation in the process. I will return to these points in the summary at the end of the chapter and suggest that this attention should also be supported by a greater understanding of the internal decision processes that would appear to underlie such choices.

56 For example, accountability, testing and management information gathering.

57 Exemplified by accreditation, policy and decision-oriented studies.

Researchers have attempted to understand the underlying approaches of evaluation models. House (1978) noted a division in the field between models based on subjectivist ethics, observable in both utilitarian and pluralist ideology, and those based on a more liberal objectivist epistemology, in which management-focused models frame accountability, efficiency and quality control. According to House, these 'elite' models generally emphasise empiricism over theory, drawing more heavily from principles of scientific management and systems analysis and assuming that a consensus on goals can be reached, which will define the focus of evaluation58. Much public sector evaluation still appears to follow this pattern or to demand information that responds to such a view. Utilitarian-based evaluations build on a subjectivist ethic with an objectivist epistemology, determining what should be maximised.

Pluralistic evaluation has both a subjectivist ethic and methodology, and as such is not generalisable, focusing rather on experience and socialisation, where precedents become judgements. House's reflections appear to inform the basic competing arguments surrounding the evaluation of programmes for school leadership: while demands for a more managerial model are perceived to come from the mandator, the education field appears to consistently apply derivatives of the pluralist models (Guskey, 2000). In these models it is "particular experience" that is in focus rather than judgement of quality per se (House, 1978).

The reflections outlined above move discussion about evaluation beyond that of a purely rational exercise, for example ascertaining the input, process and effects of a specific programme as the basic operative model. Research needs to be focused further upon how processes take place within a specific context, against particular traditions and in relation to expectations and experiences of evaluative activity. Mark et al. note that the different traditions these models are drawn from have "influenced some evaluators' decisions about evaluation designs, each providing a way of defining success" (2000: 11). In this way, understanding the views evaluators hold of the basic premise of evaluation, and the purposes for which particular models are thought useful, is considered another important factor when attempting to investigate and illuminate their role in the decision-making process.

As a result, focus has now been placed further upon the design process of evaluation, especially in terms of how models are formed. Hansen (2001, 2005a) has briefly outlined distinctions between models that attempt to account for diversity within the design process of evaluations: negotiation models (what we can argue for); appropriateness models (what fits the problem)59; routine models (what we usually do or have done before); and competence models (what we can do).

Hansen's categories appear to be useful heuristic aggregates but require further study. She suggests that there will, to a greater or lesser extent, be overlap from situation to situation and context to context. It does seem difficult to reduce everything to a 'design' when it comes to enacting and implementing an evaluation. The point that both routine and competence may overlap heavily should require us to take a further step back and decipher how decisions are made in such organisations. Particular points of interest are how size matters, whether the organisation is public or private, and whether those funding the evaluation are internal or external commissioners.

58 Systems analysis will also assume an agreement about cause and effect relationships (House, 1978: 4-6).

59 Adaptation.

A common reflection in the wider evaluation research field is that no model is inherently better than another; the appropriate choice depends rather on the actual evaluation question at hand (Krogstrup, 2006). Krogstrup considers the decision over choice of evaluation model to be more commonly normative or political than technical or rational (2006: 167). This may appear to be an oversimplification, but it does open up possibilities not fully explored by a field that has traditionally focused more upon describing and improving the technicalities of the process.

Krogstrup adopts an evaluation definition drawn from the work of Evert Vedung, noting that "[e]valuation is a systematic and retrospective (and a prospective) assessment of processes, outputs and effects of public policy"60 (Vedung, 1998, in Krogstrup, 2006: 17). As Krogstrup reflects, this raises a different question: what criteria will form the basis for such an assessment? As capacity to undertake evaluation is built up, evaluation becomes built into and integrated within public organisations, and evaluation tasks become institutionalised (Krogstrup, 2006). At the same time, Krogstrup believes that there is a decreasing amount of evaluation and an increasing amount of performance measurement, or monitoring (Krogstrup, 2006: 21, 181ff). Greater focus upon performance measurement and monitoring favours a particular evaluation approach and type of information gathered. Such demands are considered "external control" and accountability measures and are seen to focus too narrowly on particular types of information gathering at the expense of others. This will require new and particular evaluation capacity building in organisations in order to release the knowledge left untapped by such processes and to reveal the social side of the organisation (ibid.: 195ff). The significance of this statement is seen against the understanding that evaluation is too often poorly performed and utilised. If, in addition, it is poorly designed and limited in focus, then it would seem important to build capacity that will progressively become institutionalised. Krogstrup agrees with Stockdill et al. (2002) that such capacity must become part of routine but at the same time remain flexible within a collective, incremental development.

To analyse an approach to evaluation, it would therefore seem helpful to attempt to denote evaluation perspectives, that is, epistemological and ontological reflections on evaluation, and the penchant for particular evaluation models, that is, how to assess outcomes, outputs and processes, as well as the view of the evaluand under investigation. It is also considered important to understand the basic purpose or desired knowledge, guiding values, and the intention for and attitude to utilisation61. Focus should additionally be placed upon applied evaluation designs/concepts, that is, the choice of methods, those involved and the organisation of the evaluation itself. I will return to these reflections in Chapter 5. I turn first, however, to a consideration of how different fields, as well as different countries, can be perceived to have particular evaluative traditions that might be thought to impact the choice of design and model of evaluation.

60 My translation.

61 This is also drawn from Dahler-Larsen (2002) and Shadish et al. (1991), in Krogstrup (2006: 69).