Qualities of classroom observation systems

Introduction

Observation systems are used around the world for a variety of purposes. Two critical purposes are to understand and improve teaching. Scholars often seek to understand teaching by identifying dimensions of teaching and investigating how those dimensions contribute to valued outcomes such as student learning, the classroom environment, or students’ beliefs. They also seek to use observation systems to improve teaching. Both researchers and practitioners use observation systems to provide feedback and coaching to teachers as well as to evaluate interventions hypothesized to affect teaching. But when individuals set out to understand and improve teaching, they face many choices. For example, should they use a system that can be used across school subjects, a so-called “generic” system, or one that is subject-specific? Should they select a system that produces more narrow and detailed information or one that produces more global, summary information? To what degree do existing systems serve the specific purposes the individual has in mind?

Renewed interest in observation systems around the world, spurred in part by seminal large-scale research such as the Trends in International Mathematics and Science Study (TIMSS) video study (International Association for the Evaluation of Educational Achievement, 2016) and the Measures of Effective Teaching project (Bill & Melinda Gates Foundation (BMGF), 2018), has generated significant research and development work on observation systems. This research and development has not yet been synthesized in ways that allow the field to take stock of what we have learned about how to measure teaching through observation systems.

As a contributing step toward understanding recent work on observation systems, this article first describes what we mean by the term ‘observation system’. After clarifying this, we present one framework that can be used to understand how observation systems vary. In this framework we develop eight observation system aspects (see Table 1) that we hope will be useful for better understanding different observation systems. To illustrate the eight aspects of the framework, we apply the framework to four rather well-known observation systems. Finally, the article concludes with a discussion of the results of the application of the framework and what they imply for using observation systems for a specific purpose.

What is an Observation System?

Observation protocols are often thought of as a sheet of paper with categories or rubrics which a rater uses to judge the quality of teaching in a lesson. The dimensions of teaching judged are rated, and the ratings are aggregated into a score (e.g., by averaging ratings or through IRT). In schools, these scores are often used to provide teachers with improvement feedback or to evaluate individual teachers. In research contexts, scores are often analysed to determine how they relate to valued outcomes such as student learning, professional development effects, and many other outcomes. Although the sheets of paper with scales are very important, there is much more to observation protocol validity than constructs and scales.

When measuring teaching through observations, one must measure selected aspects of teaching by sampling lessons or parts of lessons and ensuring that the ratings of those lessons are of reasonable quality. To accomplish these tasks in valid and reliable ways, observation systems can be conceptualized as being comprised of scoring tools, rating quality procedures, and sampling specifications.

The scoring tools in an observation system specify which dimensions of teaching will be measured. These tools include the scales themselves – both the teaching practices being assessed as well as the number and definition of the score points (e.g., present/not present, a three-point criterion-referenced scale or rubric). Because observation scales are designed to measure complex human interactions, raters come to understand the scales through videos (or perhaps text-based descriptions) of teaching that have been rated by someone who understands the scales and score point distinctions. These video and text-based descriptions show raters how the words of the scoring scales are embodied in teachers’ and students’ words and actions.

As has been documented in some observation systems, human rating of teaching is prone to being unreliable and inaccurate, especially when coding certain aspects of teaching practice such as intellectual challenge or cognitive activation (e.g., BMGF, 2012; Decristan, Klieme, Kunter et al., 2015). Therefore, it is very important for observation systems to have rating quality procedures. These procedures are used to ensure that raters are well trained and are able to use the rating scales accurately and reliably over time. A common quality procedure is the formal training and certification of raters. Certification tests often mimic the work raters will do in studies or in practice. For example, a rater might be required to take and pass a certification test in which [s]he rates a lesson and the ratings must agree exactly with master ratings on 80% of the rating scales. Another common procedure is double scoring, the practice of having two raters independently assign ratings to the same lesson in order to compute inter-rater agreement metrics.
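To make these quality procedures concrete, the sketch below shows one way such checks might be computed. The rating data and the 80% cut-off are hypothetical and simplified; real systems may also use weighted agreement, kappa statistics, or other inter-rater metrics.

```python
# Hypothetical sketch of two common rating quality checks: certification against
# master ratings and exact agreement for a double-scored lesson. The data and
# the 0.80 threshold are illustrative, not taken from any particular system.

def exact_agreement(ratings_a, ratings_b):
    """Proportion of rating scales on which two sets of ratings agree exactly."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both rating vectors must cover the same scales.")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

# Certification: a candidate's ratings of one lesson compared with master ratings.
candidate_ratings = [3, 2, 4, 3, 1, 2, 3, 4, 2, 3]
master_ratings    = [3, 2, 4, 2, 1, 2, 3, 4, 2, 3]
agreement = exact_agreement(candidate_ratings, master_ratings)
print(f"Certification agreement: {agreement:.0%} -> certified: {agreement >= 0.80}")

# Double scoring: two certified raters independently score the same lesson.
rater_one = [2, 3, 3, 4, 2]
rater_two = [2, 3, 2, 4, 2]
print(f"Inter-rater exact agreement: {exact_agreement(rater_one, rater_two):.0%}")
```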

Finally, sampling specifications are the details around how the observations sample from the larger domain to which the ratings are intended to generalize. These specifications include, but are not limited to, the number of observations conducted for a reliable estimate of teaching quality, the length of those observations, the frequency with which raters assign ratings (e.g., every 10 minutes, every 30 minutes), and how lessons are sampled from the unit of analysis. For example, for a primary teacher, how does a four-lesson sample used by researchers vary across the subjects a primary teacher might teach? Are there only language and mathematics lessons? Are all lessons from April and May or are they sampled from the entire school year? These and other similar questions are addressed in the sampling specifications of an observation system.

Given this description of an observation system, in what follows, we propose a framework to guide considerations of existing observation protocols, hereafter referred to as observation systems. Our framework hypothesizes eight aspects of observation systems, which might be used to better understand the affordances and constraints of any such observation system (see Table 1). We then use these eight aspects of observation systems to consider four different observation systems. In doing so, we hope to show how observation systems can be considered side by side, thereby contributing to the field’s meta-knowledge of observation systems.

Framework for Analysing Observation Systems

A framework for evaluating and describing the nature of observation systems can include many different aspects of such systems. We do not presume to cover all possible aspects, and subsequent scholarship may productively expand or revise this initial set. Given our collective international experience of developing and using observation systems, we selected aspects that we believe are essential for categorizing classroom observation systems and which vary between systems. The eight aspects included in our framework relate to the following broader categories: the content of classroom observation systems (e.g., which aspects of teaching are evaluated; is the system for general use, or is it subject specific?), whether the system includes guidelines for proper use of the system, whether there is empirical evidence for the content and the use of the system, and the scale of the implementation of the system (only used by its developers, or also by others). Below, we elaborate these categories into eight more fine-grained aspects.

[Table 1 near here]

1. Dimensions of teaching

Observation systems include dimensions of teaching that are considered to be indicators of teaching quality. The assumption generally is that the better a teacher scores on these indicators, the better the teaching, and therefore, the more his/her students will learn. Some frequently used indicators in observation systems originate in the process-product studies of teaching (e.g., classroom management; clear explanation of subject matter; Brophy & Good, 1986). Others come from other strands of research, e.g., the TIMSS studies (e.g., cognitive activation; Baumert et al., 2010; Hiebert & Grouws, 2007), research on assessment for learning (Black & Wiliam, 1998), self-regulation (Zimmerman, 1990), and instructional differentiation (e.g., Tomlinson, Brimijoin, & Narvaez, 2008).

Based on a review of 28 classroom observation systems (Author, 2018a; Author, 2018b), we present dimensions of teaching quality frequently included in classroom observation systems. Teaching can be conceptualized in many different ways; therefore, these dimensions are just one way to delineate teaching. Dimensions of teaching include:

• Safe and stimulating classroom climate

This dimension refers to the degree to which teachers and students respect one another, communicate with each other in a supportive way, and together create a safe and positive classroom climate in which student learning is promoted (Pianta & Hamre, 2009; Van de Grift, 2007).

• Classroom management

Classroom management reflects the degree to which teachers and students manage their behaviour and time in such a way that learning can be productive. In a well-managed class, little time and energy are lost on activities that are not learning-oriented (Marzano, Marzano, & Pickering, 2003; Wang, Haertel, & Walberg, 1993).

• Involvement and motivation of students

This dimension is about the extent to which teachers involve all students actively in classroom learning activities, and how much students participate in those activities (Rosenshine, 1980; Schacter & Thum, 2004).

• Explanation of subject matter

How clearly teachers explain the subject matter to be learned to their students is crucial for how much students learn. Clear explanations include clear specification of lesson objectives to students, reviewing previous learning, the use of clear language, presenting information in an orderly manner, presenting vivid and appealing examples, checking for understanding, and the frequent restatement of essential principles (Van de Grift, 2007).

• Quality of subject matter representation

Quality here is influenced by the richness (e.g., multiple representations of subject matter), precision, and accuracy of the subject matter. Strong representations provide opportunities to learn the subject matter practices (e.g., problem solving, argumentation) as well as the significant organizing ideas and procedures of that subject matter (Hill et al., 2008).

• Cognitive activation

A deep understanding of how the various parts of subject matter are related to and connected with each other requires that teachers activate students’ deep thinking by means of questions, appropriate assignments, classroom discussions, and other pedagogical strategies (Baumert et al., 2010; Osborne, 2015).

• Assessment for learning

Assessment for learning is characterized by a cycle of communicating explicit assessment criteria, collecting evidence of student understanding of subject matter, and providing feedback to students that moves their learning forward (Black & Wiliam, 1998; 2010).

• Differentiated instruction

Teachers differentiate their teaching to the degree that they adapt the subject matter, the explanation of subject matter, students’ learning time, and the assignments to the differences between students (Keuning et al., 2017).

• Teaching learning strategies and student self-regulation

This dimension is about teachers a) explicitly modelling, scaffolding and explaining learning strategies to students, which students can use to perform higher-level operations (e.g., teaching heuristics, thinking aloud when solving problems, using checklists) (Carnine, Dixon, & Silbert, 1998; Slavin, 1996), and b) encouraging students to self-regulate and monitor their own learning process in light of the learning goals (Boekaerts, Pintrich, & Zeidner, 2000; Muijs et al., 2014; Zimmerman, 1990). Teachers who explicitly model, scaffold, explain strategies, give corrective feedback and ensure that children master the material taught contribute greatly to the academic success of their pupils.

While all of these dimensions of teaching are fundamental to students’ learning and development, each dimension can be operationalized differently across observation systems.

Further, observation systems vary in the degree to which they capture all dimensions or target specific dimensions.

2. View of teaching and learning.

Observation systems also embody a community of practice’s view of high quality teaching and learning. A community of practice could be a country, with the view embodied in that country’s national teaching standards and then operationalized in a national teacher evaluation system. It could be a group of reform-minded mathematics and science educators (Sawada et al., 2002), a group of school district administrators, or a group of researchers who study teaching and educational effectiveness.

Of course, communities’ views vary, emphasizing different aspects of teaching and learning. Perhaps all communities value cognitive activation, for example, but the degree to which teachers facilitate classroom discourse and student participation might vary depending on the country’s cultural views of teaching and learning (Clarke, Emanuelsson, Jablonka, & Mok, 2006). Communities’ views necessarily reflect cultural differences in valued practices around the world. In Japan, for example, an observation system might privilege how effectively a collaboratively developed lesson plan was implemented, while a United States system might privilege how well a lesson plan supported differentiated instruction.

Communities’ perspectives of teaching quality can be located along a continuum that moves from a behaviourist view of teaching and learning, to more of a cognitive view, to more of a sociocultural or situated view. Communities’ perspectives often blur the boundaries across this continuum and depending on how thoroughly a system is documented, it can be difficult to determine what view(s) underlie a specific system. Further, it is not helpful to dichotomize or oversimplify views of instruction (Oser & Baeriswyl, 2001; Grossman and McDonald, 2008) as it can lead to a focus on differences in how communities define and label teaching rather than a focus on how teaching and learning are related.

3. Subject specificity.

There is widespread agreement about the importance of the subject matter specificity of teaching quality (Seidel & Shavelson, 2007); however, there is less agreement about how to measure this aspect of teaching practice. Several observation systems have been designed to evaluate teachers’ subject-specific practices, such as the Mathematical Quality of Instruction (MQI), the Protocol for Language Arts Teaching Observation (PLATO), the Quality of Science Teaching (QST) and PISA+ in science education. The MQI system (Hill et al., 2008), for example, focuses on elements such as the richness of the mathematics, student participation in mathematical reasoning and meaning-making, and the clarity and correctness of the mathematics covered in class.

On the other hand, there are several systems that are generic, designed to capture key elements of classroom teaching that are critical for students’ learning across subjects and classes (e.g., instructional support, representation of subject matter, classroom climate and classroom management). Examples of such generic systems are the Classroom Assessment Scoring System (CLASS) (Pianta, La Paro & Hamre, 2008), the Framework for Teaching (FFT) (Danielson, 2013) and ICALT (Van de Grift, 2007).

Scholars further make a distinction between surface structures and deep structures of classroom activities when trying to measure classroom teaching. Surface structures refer to features of classroom interactions easily visible in behaviour, such as classroom organization and management, and teachers’ feedback (Fischer & Neumann, 2012). Deep structures refer to teachers’ activities meant to encourage students’ internal mental processes, including cognitive activation, intellectual challenge, problem solving, etc. (Seidel & Prenzel, 2006). There is a growing body of research that suggests raters are able to fairly reliably score surface structures, but have more difficulty reliably scoring deep structures (Praetorius, Pauli, Reusser, Rakoczy, & Klieme, 2014). This may be because surface structures require lower-level inferential judgments while deep structures require higher-level inferences.

4. Grain size.

Related to a subject-specific or more generic focus, there is also the issue of grain size: how discrete/targeted the practices to be coded are (Hill & Grossman, 2013). This issue has been addressed in observation studies for decades (Brophy & Good, 1986; Flanders, 1970). In some newer systems (CLASS and PLATO, for example), consensus has been reached on a set of core activities (12 for both CLASS and PLATO). This stands in contrast to earlier systems that included a long list of activities to score (Scheerens, 2014). Thus, the number of domains and elements to be scored is a feature that varies across systems.

Whether to score the whole lesson or segments of the lesson is a related aspect of grain size. One might imagine that observation systems seeking to code smaller grain sizes, i.e., narrower teaching practices, would segment the lesson many times so that narrow behaviours can be accurately documented throughout a lesson (e.g., MQI). Alternatively, observation systems using more holistic codes requiring the rater to judge multiple interrelated practices might segment at larger intervals (e.g., 20 minutes or a whole lesson) so that the ratings reflect all of the interrelated practices (e.g., ICALT).

The decisions about what grain size to capture are further shaped by the rhythm and pace of instruction. Activities are not always equally probable in every segment of a lesson.

For example, while instructional purpose may be central to the beginning of a lesson, it may be less central towards the end of the lesson. The degree of lesson segmentation necessary for a specific grain size of practice being scored is a decision made by system designers (Author, 2018c) and is often undocumented.

5. Focus on students’ actions.

Observation systems can vary in the degree to which they focus on students’ actions and the nature of that focus. Depending on the focus of the scoring scales, procedures, and exemplars, observation systems might require raters to pay attention to teachers’ or to students’ words and actions, or some combination thereof. In some observation systems there was an almost exclusive focus on the teachers’ actions, e.g., was the objective of the lesson clearly specified (Brophy & Good, 1986)? Other systems required the rater to scan the room, focusing only on the students’ actions (Abadzi, 2009; Stallings, 1973).

In systems that focus on student actions, the particular actions that are privileged range from behavioural, to cognitive, to affective. For example, in the domain measuring the classroom environment, the FFT asks raters to judge the degree to which students take pride in their work and show caring and warmth. These are all more affective aspects of the learning environment. In the CLASS system for secondary education (Pianta, Hamre, & Mintz, 2012), raters attend to students’ risk taking and sharing of personal information – both behaviourally oriented markers. These behaviours are used to infer the teachers’ sensitivity to students’ needs.

6. Scoring procedures.

Classroom observation systems differ in their sampling procedures, scoring procedures and preparation of raters. The choices made by developers influence the reliability and validity of the observation scores. We describe each in turn.

Sampling procedures.

Classroom observation systems are developed for one or more of the following purposes: promoting teacher learning, teacher evaluation, or developing research insights. Given these purposes, lessons are sampled in different ways. The lesson’s subject matter and type (e.g., an introductory or a practice lesson) may be specified by the system. The observations can be conducted live or on video, be announced or unannounced, and they can vary in length. Sampling of the lesson can be specified even further: e.g., whether the observer should walk around or talk with students during an observation, which part of the lesson should be observed, how many observation cycles should be conducted, and when (day, week, or year) the observation should be conducted.

Scoring procedures.

Observation systems differ in how rating procedures and scoring rules are carried out. The number of observations, the number of segments, the degree to which lessons are double rated, and whether ratings are checked systematically by master raters for accuracy are just some of the rating procedures that are relevant to the validity of the system. Scoring rules concern how scores are aggregated across units (e.g., segments, lessons, teachers) and across raters (e.g., averaging discrepant ratings, taking the highest rating), as well as rounding rules and rules regarding dropping ratings.
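As a simplified illustration of such scoring rules, the sketch below aggregates hypothetical segment ratings to a lesson score, resolves a double-scored lesson by averaging across raters, and averages lesson scores to a teacher-level score with a rounding rule. The specific rules and numbers are invented for illustration and are not those of any system discussed here.

```python
# Hypothetical aggregation rules: average segments within a lesson per rater,
# average discrepant raters for a double-scored lesson, then average lessons
# to a teacher-level score and round. Real systems specify their own rules.
from statistics import mean

def lesson_score(segment_ratings_by_rater):
    """Average segment ratings within a lesson, then average across raters."""
    per_rater = [mean(segments) for segments in segment_ratings_by_rater.values()]
    return mean(per_rater)

def teacher_score(lesson_scores, ndigits=2):
    """Average lesson scores to the teacher level, with a simple rounding rule."""
    return round(mean(lesson_scores), ndigits)

# One double-scored lesson with four rated segments per rater, plus two
# single-scored lessons for the same teacher.
double_scored_lesson = {"rater_1": [3, 4, 3, 3], "rater_2": [3, 3, 3, 4]}
lessons = [lesson_score(double_scored_lesson), 3.0, 3.5]
print(teacher_score(lessons))  # 3.25
```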

Preparation of observers.

Raters are usually trained using manuals that provide insight into the theoretical basis of the system, the meaning of the items and scales, and the scoring rules. Training can also provide raters opportunities to practice by observing videos and scoring them during the training. Certification of raters could be required, as well as recertification after a specific time period. It is also critical that raters are able to create accurate and unbiased scores across teachers so teachers can improve.

7. Empirical evidence.

The validity of the content of observation systems will vary. As was stated in the dimensions of teaching section, the assumption is that the dimensions of teaching included in observation systems reflect teaching quality. A critical criterion for teaching quality is how much students learn. Thus, it is important to understand the extent to which the assumed relation between the teaching quality indicators and student learning has been confirmed empirically. In other words, what is the nature and quality of the research upon which the indicators are based? This is often considered empirically by testing the degree to which scores from a particular observation system, which includes specific dimensions of teaching, are associated with student outcomes (e.g., Decristan et al., 2015) or statistically-derived measures of teaching quality such as value-added models (e.g., Bell et al., 2012; BMGF, 2014). Despite the desire to use predictive validation studies as the gold standard of empirical evidence, such studies face many problems such as confounds to causal mechanisms, inadequate accounting for prior learning and other school factors that shape teaching and learning (e.g., curriculum), and inappropriate outcome measures, just to name a few.

While predictive evidence is important, Kane (2006) argues that we must consider the validity of any system in the form of a clear validity argument. Such an argument specifies the inferences necessary to move from observation ratings to inferences about the sample studied (often the quality of teaching in a given timeframe with a specific group of students), all the way to the inferences at the domain level (all of a teacher’s teaching in a given year with all the students [s]he taught). In one application of Kane, U.S. researchers specify empirical evidence that ranges from the quality of scoring inference to predictive validity (Bell et al., 2012). Evidence might include details regarding the training and monitoring of raters, inter-rater reliability, specification of sources of variance, factor analyses, convergent validity evidence, and correlations to measures of student learning (VAM). Certainly all of these sources of empirical evidence contribute to the quality of any observation system’s validity argument.
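One simple form of such predictive evidence is a correlation between teacher-level observation scores and value-added estimates. The sketch below shows only the form such an analysis might take; all numbers are invented for illustration.

```python
# Hypothetical predictive-validity check: correlate teacher-level observation
# scores with value-added estimates. All values are invented for illustration.
from statistics import correlation  # available in Python 3.10+

observation_scores = [2.1, 2.8, 3.0, 3.4, 2.5, 3.8, 3.1, 2.2]
value_added_scores = [-0.20, 0.05, 0.10, 0.25, -0.05, 0.30, 0.12, -0.15]

r = correlation(observation_scores, value_added_scores)
print(f"Correlation between observation and value-added scores: r = {r:.2f}")
```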

8. Developmental continuum.

Related to the quality of the empirical evidence available for an observation system, observation systems can be placed on a developmental continuum. It takes time to develop a strong system and gather information about valid and reliable use of the system. Interesting indicators of the stage of development of the system are the year of development, whether the system was pilot tested, the number of versions, the last published version, whether research was done into the valid and reliable use of the system by the developers, and whether people outside the development team have used or researched the system.

Reviewing Observation Systems – Four Illustrative Examples

In order to understand how this framework could be used to understand observation systems, we have selected two general and two subject-specific systems. CLASS and ICALT, the two general systems, are popular in the United States and Europe, respectively. PLATO and TIMSS were developed for different subject matters. Both have been used internationally, and the latter was developed specifically to apply across countries. We consider subject-specific systems as well as generic systems because of the important role subject matter plays in the improvement of teaching. These systems were also selected, in part, because they vary across the framework’s aspects. We describe each system using the aspects of the framework to demonstrate how the framework can be used to investigate any observation system. We then summarize the systems briefly in Table 2. It is important to note that we do not provide a full-length treatment of the empirical evidence for the four systems. Any fair treatment of a system’s evidence is necessarily lengthy and detailed, and therefore beyond the present scope. Instead, we point the reader to representative articles for each system.

International Comparative Analysis of Learning and Teaching (ICALT)

ICALT was developed by European inspectorates with the purpose of inspecting primary schools (Van de Grift, 2007). The University of Groningen (RUG) in The Netherlands continued its development to capture teaching quality. It was developed from a more behaviourist view of teaching and learning and captures a teacher-centered classroom in which knowledge is transmitted through direct instruction. ICALT is used for research purposes and as a system for teacher professional development in K-12, across a variety of curricula, subjects and instructional approaches.

Of the nine teaching dimensions presented earlier, only the dimension about subject matter representation is not covered in ICALT. The 32 items focused on teacher behaviour are divided across six scales: safe and stimulating learning climate, efficient classroom management, quality of instruction, teaching learning strategies, stimulating learning environment, and adaptation of teaching to diverse student needs. One additional scale, student engagement, contains three items that focus on student behaviour. The indicators were derived from reviews of research on the relationship between teaching characteristics and the academic achievements of pupils.

ICALT is a high inference system, and scores are based on a whole lesson. All quality indicators are scored on a 4-point scale ranging from ‘predominantly weak’ to ‘predominantly strong’. In the system, examples of good practices are provided for each quality indicator to assist observers in making the judgments. Observers can indicate whether these good practices were present or not during the lesson and, based on this information, they make a quality judgment about the indicator.

Observers can become certified if they are able to rate a lesson in a way similar to master observers. There is no general manual available for the use of ICALT and the RUG does not provide training opportunities on a regular basis. However, the RUG trains observers for RUG projects, and training can be requested from the RUG by others.

The RUG conducts research into ICALT, mainly in secondary education. Confirmatory factor analysis supported the six scales (Van de Grift, Van der Wal & Torenbeek, 2011). Rasch analyses have been conducted to place all quality indicators on a Rasch scale such that teachers can be trained in their zone of proximal development (e.g., Van der Lans, Van de Grift & Van Veen, 2017). Multi-level analyses showed a relation between ICALT and students’ academic engagement (Maulana, Helms-Lorenz & Van de Grift, 2016). The RUG continues to conduct research into reliability aspects of ICALT (e.g., Van der Lans, Van de Grift, Van Veen, & Fokkens-Bruinsma, 2016) and has recently started a new project on teaching quality from an international perspective, ICALT3.

Classroom Assessment Scoring System (CLASS, K-3 and UE)

The first version of the CLASS was developed in the US for a study into the quality of pre-school programs by the National Center for Early Development and Learning and was an adaptation of the Classroom Observation System (COS) (Teachstone, 2015). Today, Teachstone offers different versions of the CLASS for different age groups: infants, toddlers, pre-K, K-3, upper elementary (UE) and secondary education. We focus on the K-3 (Pianta, La Paro & Hamre, 2008) and UE (Pianta, Hamre & Mintz, 2012) versions. The CLASS captures interactions between teachers and students, which are seen as the primary mechanism of student development and learning (Pianta, La Paro & Hamre, 2008). Both observation systems have been used for research and for teacher development and evaluation. They are used across a variety of curricula, subjects and instructional approaches.

At the broadest level, CLASS decomposes classroom interactions into three major domains: emotional support, classroom organization, and instructional support. Within each of the domains there are multiple dimensions (11 in the UE version and 9 in the K-3 version). Of the nine teaching dimensions presented earlier, the CLASS UE dimensions cover them all while the CLASS K-3 dimensions do not (see Table 1). The dimensions focus on both the students and the teachers. The UE version also provides a global measure for student engagement. CLASS was based on a review of constructs assessed in classroom observation systems used in child care and in elementary school research, literature on effective teaching practices, focus groups and extensive piloting (Pianta, La Paro & Hamre, 2008).

The CLASS is a high-inference observation system. Raters observe in cycles: 15-20 minutes of observation and 10 minutes for rating the dimensions. Each cycle is independent of the others. The number of observation cycles should depend on the goals for which CLASS is used. All CLASS dimensions are scored on a 7-point scale. For each dimension, several indicators are described that include descriptions of low (1, 2), mid (3, 4, 5) and high (6, 7) range behaviour.

Raters must obtain CLASS certification before they can conduct observations. To become certified, observers attend a two-day CLASS training and have to take a reliability test. Every subsequent year raters must recertify. Both CLASS K-3 and CLASS UE have been used by many researchers in the USA and abroad. In the manual of the K-3 version, the results of six studies are presented. However, the research sample in these studies is often broader than K-3, and it is not always clear whether CLASS K-3 or an earlier version was used. More recent studies used CLASS K-3 in a K-3 setting and provide much evidence about the observation system: an evaluation of the factor-analytic validity (Sandilos, Shervey, DiPerna, Lei & Cheng, 2016), measures of internal consistency and evidence for the reliability of the individual domains (e.g., Abry, Rimm-Kaufman, Larsen & Brewer, 2013), and stability of most scores across the day of the week, month and year (Henry, 2010). CLASS UE was used in the MET study (Kane, Kerr & Pianta, 2014), which provided information about the UE measure, including the scales, the relation with other measures, and achievement gains.

Third International Mathematics and Science Study (TIMSS) Video Mathematics System

The “TIMSS Video Study”, which was linked to the Third International Mathematics and Science Study (TIMSS), produced a mathematics and science observation system whose goal was “to describe and investigate teaching practices in eighth-grade mathematics in a variety of countries” (Jacobs et al., 2003, p. 1). The development was led by U.S. researchers in collaboration with experts from seven countries. We focus on the mathematics system.

TIMSS Video was originally designed to address the U.S. National Council of Teachers of Mathematics student mathematics standards, which privilege socio-constructivist approaches to learning. However, the system can be used to analyse instruction from more behavioural and cognitive viewpoints as well. TIMSS follows both students’ and teachers’ actions and discourse and tracks the degree to which these are public (i.e., shared with the entire classroom) or private (i.e., between a small number of students). The system has been used for research and the improvement of teaching in secondary mathematics classrooms.

The TIMSS codes describe the subject matter of mathematics lessons by documenting the lesson’s specific mathematical subject matter, the organization of the lesson, and the instructional processes. Each lesson is segmented into problem-based interactions of variable length, which are coded into mutually exclusive categories called coverage codes. Twenty-one coverage codes define the organization of the lesson, including whether mathematics is being taught, in what problem format, and whether or not problems overlap. There are also occurrence codes that describe the types of activities engaged in by students as well as how those activities unfold, the resources being used, and the nature of the mathematical practices and interactions emphasized.

Because of TIMSS’s fine grain size, it is scored using both a video and a standardized transcript of the lesson. Using transcripts and the 110-page system, general and specialized raters make a total of seven passes through a video and its associated transcript in order to assign categorical codes to the entire lesson.

TIMSS parses teaching into very small pieces, e.g., whether a mathematical generalization was present, how many there were, or how many graphs were drawn publicly. Yet, on their own, the codes do not make judgements about teaching quality. Analysts bring a teaching quality analytic framework to the codes in order to aggregate the codes in ways that allow judgements about teaching quality to be made (e.g., Leung, 2005). The teaching learning strategies, cognitive activation, structured and clear explanation of subject matter, and quality of subject matter representation dimensions in our framework are all addressed directly by the various occurrence codes. ‘Involvement and motivation of students’ and ‘differentiated instruction’ are somewhat addressed, and the TIMSS codes are modestly aligned with the classroom management, safe and stimulating learning environment, assessment for learning, and self-regulated learning dimensions of our framework.

All raters are required to pass a certification test and lessons are double scored. As previously mentioned, all codes are aggregated to the lesson level and, to our knowledge, no one has attempted to make systematic claims about teachers, focusing instead on descriptions of teaching within and across countries. There is no training offered by the developers.

The original reports of the coding schemes detail the lesson-level reliability of coding as well as the standard errors for each code; additional reports describe the development and application of the codes (e.g., Givvin, Hiebert, Jacobs, Hollingsworth, & Gallimore, 2005). Our review did not identify a published factor analysis. Experts from participating countries have gone on to analyse the data from their country and made empirical links to expert reviews of the videos as well as student achievement, among others (e.g., Kunter & Baumert, 2006; Leung, 2005). To our knowledge, there are no additional studies that modify the codes and report on those modifications, which would indicate progression along a developmental continuum.

Protocol for Language Arts Teaching Observation (PLATO)

The PLATO classroom observation system was developed by Grossman and colleagues at Stanford University (Grossman, Loeb, Cohen, & Wyckoff, 2013) to capture features of English/Language Arts (ELA) instruction. PLATO builds on research identifying practices critical for high-quality ELA education and was designed to work across a variety of curricula and instructional approaches within language arts middle grade classrooms. PLATO is used both for research purposes and as a system for teacher professional development, and the current version is the fifth version of the system (PLATO 5.0).

PLATO was initially designed to capture ELA instruction in US classrooms, covering a wide range of curricular activities known as ‘key elements’ of ELA education: reading, writing, grammar, literature and oral presentations. Since its early stages, PLATO has also been used to capture instructional qualities in disciplines such as mathematics (Cohen, 2015) and science education (Kloser, 2014). PLATO was used as one of five observation systems in the Measures of Effective Teaching (MET) study (BMGF, 2017), and currently PLATO is used to capture teaching qualities in Nordic classrooms across subject areas as varied as mathematics (Author, 2018b), language arts (Author, 2018d), science, and foreign language education. PLATO privileges socio-constructivist approaches to learning but combines this with cognitive and behavioural approaches.

PLATO covers six of our nine teaching dimensions, but with a slightly different framing and indexing. The system is organized around four key instructional domains: Instructional Scaffolding, Disciplinary Demand, Representation and Use of Content, and Classroom Environment. Each domain is divided into between two and four elements, for a total of 12 elements. While mainly following teachers’ actions, PLATO also pays attention to student engagement. It is, for example, impossible to receive a high score on PLATO if students are not actively engaged in the task/activity at hand.

PLATO is a high inference system designed for interval coding, using 15-minute intervals for coding all 12 elements, and it can be used for real-time observations as well as for observing classroom videos. Each of the 12 elements is scored on a scale from 1 to 4 based on the evidence for a given element during a 15-minute cycle. At the low end there is almost no evidence, or little evidence, of instructional practice related to the element in question, while the higher end is characterized by evidence with some weaknesses, or strong and consistent evidence. Confirmatory factor analysis showed empirical evidence for the scales (Grossman et al., 2013). Kor (2011) performed a generalizability study to analyse the measurement properties of the PLATO rubric and showed how the number and succession of segments are critical for the overall reliability. At least five segments per teacher are required to achieve an overall reliability greater than .80. In addition to the 12 elements, PLATO captures the subject matter of instruction (for instance writing, literature and/or grammar) as well as the overall activity structures (whole group, small group, independent work, etc.) for each 15-minute segment. Multi-level analyses indicate a relation between PLATO dimensions and students’ academic engagement (BMGF, 2017), but further analyses are needed. PLATO requires rater certification, which is supported with a 3-4 day training course, originally designed as face-to-face training but currently available as online training.
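A rough way to see why more segments improve reliability is the Spearman-Brown projection shown below. This is a simplified sketch under a parallel-segments assumption, not Kor’s actual generalizability analysis, and the single-segment reliability of 0.45 is hypothetical; with that value, five segments already project to about .80.

```python
# Simplified illustration: Spearman-Brown projection of the reliability of an
# average of k parallel segments. The single-segment reliability (0.45) is a
# hypothetical value; a generalizability study estimates it from variance components.

def spearman_brown(single_segment_reliability: float, k: int) -> float:
    r = single_segment_reliability
    return (k * r) / (1 + (k - 1) * r)

for k in range(1, 8):
    print(f"{k} segment(s): projected reliability = {spearman_brown(0.45, k):.2f}")
```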

Table 2 summarizes the aspects of the four observation systems evaluated.

[Table 2 near here]

Discussion

After defining the observation system concept, we presented a framework for analysing observation systems and then applied the framework to four well-known systems. The framework’s aspects seem to have value as they point to relevant differences between the four observation systems. If practitioners or researchers plan to use an observation system, it is important that they are aware of how observation systems can differ and make informed choices regarding the system that will best suit their purposes.

Applying the framework reveals that all but one of the teaching dimensions (i.e., teaching learning strategies and student self-regulation) are addressed by at least three observation systems. Each of the four observation systems addresses only a core group of dimensions, and they do not all measure the same dimensions of teaching. Only the dimensions involvement/motivation and cognitive activation were measured by all four instruments. The other dimensions are also fundamental aspects of teaching quality; however, there may be defensible reasons for not including these dimensions in an observation system, depending on one’s purpose. The framework’s contribution is not to endorse a particular system; rather, its application can support more deliberate selection and use of observation systems.

This also applies to the view of teaching and learning category that forms the basis for a specific observation system, because there is no “one best” observation system. Definitions of teaching quality are informed by empirical matters, but they are also influenced by preferences and values regarding good teaching. So, if one plans to use an observation system, then those involved should make explicit how they want to define quality teaching, be it teacher-led direct instruction or (also) manifestations of teaching that reflect more interactive, socio-cultural and/or other perspectives of teaching.

Users of observation systems must also decide if they will use a general system that can be applied across subjects, or a subject-specific one that is used for a single subject such as mathematics or science. This decision will influence the scalability of the observation system and the potential for guiding teachers’ improvement efforts. Subject-specific systems are often developed for particular grade ranges, e.g., the TIMSS system focused on secondary mathematics classrooms and PLATO focuses on secondary English/Language Arts classrooms, in order to capture activities relevant to that specific subject and/or grade. Although this limitation impedes the scalability of the observation system across grades and subjects in practice-based settings, there is a body of evidence that theorizes the importance of subject matter in teaching and improving teaching (e.g., Grossman & Stodolsky, 1994; Hill & Grossman, 2013). System narrowness might be particularly useful when researchers are studying the impact of a professional development program or a new reform or curricular intervention, providing ratings and raters’ notes that can be used to give teachers subject-specific feedback on the quality of their teaching in that subject matter.

Our analysis showed that systems also vary on the grain size and focus on students’ actions aspects: refined 7-point scoring scales versus more restricted scales, varying numbers of teaching quality indicators (10-35), and a focus on teachers only in the evaluation of what happens in the classroom versus approaches in which the behaviour and input of students is also measured. When selecting an observation system from this perspective, one makes choices regarding the extent to which one aims to measure the full complexity of teaching quality: how many aspects of teaching quality, how many perspectives, which scoring distinctions?

More inclusive and extensive definitions of teaching quality may increase the cognitive demand raters bear and/or require specific rater background knowledge and training, when raters have to take account of many quality aspects in their measurements. Even well-trained raters are an important source of variation in teachers’ teaching quality scores (BMGF, 2012), and the more complex the quality definitions, the more likely raters will be a large source of variation in scores. BMGF (2017) argues that when an observation system includes many teacher competencies, the feedback may be very fine-grained and improvement efforts could be tuned well. However, the quality of the feedback may suffer as teachers will be overtaxed. A balance between the two is needed.

An application of the sixth framework aspect, scoring procedures, suggests there is wide variation in how developers support valid scores. The choices developers must make also have trade-offs. For example, working with external expert raters can have the advantage of a tightly controlled and monitored scoring setting where staff are focused narrowly on providing accurate and reliable scores. This might be helpful for accurately identifying the specific dimension of teaching that needs remediation; however, if the scores will be discussed with the teacher who is trying to improve, a conversation with an expert rater with whom the teacher does not share a trusting relationship may not maximize what a teacher can learn from the scores or feedback. Conversely, if the observation system uses administrators or peers to create the ratings, the existing relationship these professionals share with the teacher may lead to inaccurate ratings or ratings the teacher perceives as less than objective. There are no right or wrong choices of raters; observation systems must specify the rater (e.g., principal or expert) and then adjust the procedures and processes pertaining to scoring quality to account for whatever decision is made.

A final trade-off concerns the empirical evidence necessary when selecting an observation system. As even the four systems we review here demonstrate, the amount and type of validity evidence for an observation system varies. Certainly for any purpose, one should be concerned that there is evidence raters can be trained to create accurate and reliable scores. Irrespective of the system’s purpose, it is unethical, for example, to tell a teacher [s]he has low levels of formative assessment when [s]he does not. But there may be trade-offs around the validity of a system that should be accounted for, depending on how scores will be used. We think we can tolerate somewhat less accuracy and reliability for improvement purposes than for research and accountability purposes (e.g., precise rankings of teachers from the best to the poorest teaching). If a researcher needs a precise estimate of the impact of a new curriculum, the highest levels of accuracy and reliability are likely necessary for detecting such impacts. However, also for improvement purposes, one should care about the relative differences between score points, which suggests reasonable levels of attention to raters and rating. Researchers and practitioners should think carefully about the purposes they have for the scores and consider what validity evidence can be fashioned into an argument for an appropriate level of score quality for that purpose. It bears noting, however, that we should not ignore the importance of practitioners and researchers developing a common language and body of evidence about how to measure teaching and how teaching is related to other valuable outcomes. These outcomes can be achieved with varied levels of validity evidence.

In addition to these trade-offs that should be considered when selecting an observation system, the framework points to important issues around the research knowledge base. Writing this article underscored the fact that observation system developers make different choices across framework aspects, and it is clear that these choices shape the ultimate nature and character of the system. But it is not clear (or likely) that there is one best set of decisions. Further, developers generally do not share the reasoning behind those decisions. In some cases, it is challenging to locate all of the framework details in published documentation, especially issues of rater training, certification, calibration, and monitoring accuracy and reliability. This limits the field’s understanding of how particular observation system aspects shape empirical evidence. To further develop the field’s knowledge of how to measure and improve teaching, researchers would do well to make these types of decisions more transparent and more a part of the research enterprise. There are examples of this (e.g., Seidel, Prenzel, & Kobarg, 2005) but they are rare, and they do not yet constitute a body of scholarship that guides the development of new systems and uses of protocols within observation systems. Such knowledge would be valuable for the efficiency of observation systems and for the improvement of teaching.

References

Author (2016)

Author (2018a)

Author (2018b)

Author (2018c)

Author (2018d)

Abadzi, H. (2009). Instructional time loss in developing countries: Concepts, measurement, and implications. The World Bank Research Observer, 24(2), 267-290.

Abry, T., Rimm-Kaufman, S. E., Larsen, R. A., & Brewer, A. J. (2013). The influence of fidelity of implementation on teacher–student interaction quality in the context of a randomized controlled trial of the Responsive Classroom approach. Journal of School Psychology, 51(4), 437-453.

Baumert, J., Kunter, M., Blum, W., Brunner, M., Voss, T., Jordan, A., Klusmann, U., Krauss, S., & Tsai, Y. M. (2010). Teachers’ mathematical knowledge, cognitive activation in the classroom, and student progress. American Educational Research Journal, 47(1), 133-180.

Bell, C.A., Gitomer, D.H., McCaffrey, D., Hamre, B., Pianta, R., Qi, Y. (2012). An argument approach to observation protocol validity. Educational Assessment, 17(2–3), 62-87.

Bill and Melinda Gates Foundation (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Retrieved on March, 31, 2017 from https://files.eric.ed.gov/fulltext/ED540962.pdf

Bill and Melinda Gates Foundation (2017). Better feedback for better teaching: A practical guide to improving classroom observations. Retrieved on March 31, 2017 from http://k12education.gatesfoundation.org/teacher-supports/teacher-development/measuring-effective-teaching/

Bill and Melinda Gates Foundation (2018). Measures of effective teaching project: Frequently asked questions. Retrieved on January 26, 2018 from http://k12education.gatesfoundation.org/blog/measures-of-effective-teaching-project-faqs/

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: principles, policy & practice, 5(1), 7-74.

Black, P., & Wiliam, D. (2010). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 92(1), 81-90.

Bloom, B. S. (1984). The 2-Sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6), 4-16.

Boekaerts, M., Pintrich, P. R., & Zeidner, M. (Eds.) (2000). Handbook of self-regulation. San Diego: Academic Press.

Brophy, J., & Good, T. L. (1986). Teacher behavior and student achievement. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 328-375). New York: Macmillan.

Carnine, D. W., Dixon, R. C., & Silbert, J. (1998). Effective strategies for teaching mathematics. In E. J. Kameenui & D. W. Carnine (Eds.), Effective teaching strategies that accommodate diverse learners. Englewood Cliffs, NJ: Prentice-Hall.

Clarke, D. J., Emanuelsson, J., Jablonka, E., & Mok, I. A. C. (Eds.). (2006). Making connections: Comparing mathematics classrooms around the world. Rotterdam: Sense Publishers.

Cohen, J. (2015). Challenges in identifying high leverage practices. Teachers College Record, 117(7), 1-41.

Danielson, C. (2013). The Framework for Teaching Evaluation Instrument. Princeton, NJ: Danielson Group.

Decristan, J., Klieme, E., Kunter, M. et al. (2015). Embedded formative assessment and classroom process quality: How do they interact in promoting science understanding? American Educational Research Journal, 52(6), 1133-1159.

Fischer, H., & Neumann, K. (2012). Video analysis as a tool for understanding science instruction. In D. Jorde & J. Dillon (Eds.), The World of Science Education (pp. 115-140). Rotterdam, Netherlands: Sense Publishers.

Flanders, N. A. (1970). Analyzing teaching behavior. Boston: Addison Wesley.

Givvin, K. B., Hiebert, J., Jacobs, J. K., Hollingsworth, H., & Gallimore, R. (2005). Are there national patterns of teaching? Evidence from the TIMSS 1999 video study. Comparative Education Review, 49(3), 311-343.

Grossman, P., & McDonald, M. (2008). Back to the future: Directions for research in teaching and teacher education. American Educational Research Journal, 45(1), 184-205.

Grossman, P. L., & Stodolsky, S. S. (1994). Considerations of Content and the Circumstances of Secondary School Teaching. Review of Research in Education, 20(1), 179-221.

Grossman, P., Loeb, S., Cohen, J., & Wyckoff, J. (2013). Measure for measure: The relationship between measures of instructional practice in middle school English language arts and teachers’ value-added scores. American Journal of Education, 119(3), 445-470.

Henry, A. E. (2010). Advantages to and challenges of using ratings of observed teacher-child interactions (Doctoral dissertation, University of Virginia).

Hiebert, J., & Grouws, D. A. (2007). The effects of classroom mathematics teaching on students’ learning. In F. K. Lester (Ed.), Second handbook of research on mathematics teaching and learning (pp. 371-404). Charlotte, NC: Information Age Publishing.

Hill, H. C., Blunk, M. L., Charalambous, C. Y., Lewis, J. M., Phelps, G. C., Sleep, L., & Ball, D. L. (2008). Mathematical knowledge for teaching and the mathematical quality of instruction: An exploratory study. Cognition & Instruction, 26(4), 430-511.

Hill, H. C., & Grossman, P. (2013). Learning from teacher observations: Challenges and opportunities posed by new teacher evaluation systems. Harvard Educational Review, 83(2), 371-384.

International Association for the Evaluation of Educational Achievement (IEA) (2016). Retrieved from http://www.iea.nl/timss_1999_video_study.html

Jacobs, J. K., Garnier, H., Gallimore, R., Hollingsworth, H., Givvin, K. B., Rust, K., & Stigler, J. W. (2003). Third International Mathematics and Science Study 1999 Video Study Technical Report, Volume 1: Mathematics (NCES 2003012). Washington, D.C.: National Center for Education Statistics.

Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement 4th edition (pp. 17–64). Westport: American Council on Education and Praeger Publishers.

Kane, T., Kerr, K., & Pianta, R. (2014). Designing teacher evaluation systems: New guidance from the measures of effective teaching project. John Wiley & Sons.

Kloser, M. (2014). Identifying a core set of science teaching practices: A Delphi expert panel approach. Journal of Research in Science Teaching, 51(9), 1185 – 1217.

Keuning, T., van Geel, M., Frèrejean, J., van Merriënboer, J., Dolmans, D., & Visscher, A. J. (2017). Differentiëren bij rekenen: Een cognitieve taakanalyse van het denken en handelen van basisschoolleerkrachten [Differentiated instruction for mathematics: A cognitive task analysis into primary school teachers’ reasoning and acting]. Pedagogische Studiën, 94(3), 160-181.

(30)

30 Kor. K. (2011). The measurement properties of the PLATO rubric. Paper presented at the

annual meeting of the American Educational Research Association, New Orleans.

Kunter, M., & Baumert, J. (2006). Linking TIMSS to research on learning and instruction: A re-analysis of the German TIMSS and TIMSS video data. In S. J. Howie & T. Plomp (Eds.), Contexts of learning mathematics and science: Lessons learned from TIMSS (pp. 335-351). London: Routledge.

Leung, F. K. S. (2005). Some characteristics of East Asian mathematics classrooms based on data from the TIMSS 1999 video study. Educational Studies in Mathematics, 60(2), 199-215.

Marzano, R. J., Marzano, J. S., & Pickering, D. (2003). Classroom management that works: Research-based strategies for every teacher. Alexandria, VA: ASCD.

Maulana, R., Helms-Lorenz, M., & Van de Grift, W. (2016). Validating a model of effective teaching behaviour of pre-service teachers. Teachers and Teaching, 1-23.

Muijs, D., Kyriakides, L., van der Werf, G., Creemers, B., Timperley, H., & Earl, L. (2014). State of the art – teacher effectiveness and professional learning. School Effectiveness and School Improvement, 25(2), 231-256.

Oser, F. K., & Baeriswyl, F. J. (2001). Choreographies of teaching: Bridging instruction to learning. In V. Richardson (Ed.), Handbook of research on teaching (4th ed., pp. 1031-1065). Washington, DC: American Educational Research Association.

Osborne, J., Berson, E., Borko, H., Busch, K. C., Zaccarelli, F. G., Million, S., & Tseng, A. (2015). Assessing the quality of classroom discourse in science classrooms. In 16th biennial conference of the European Association for Research in Learning and Instruction, Limassol, Cyprus (751).

Pianta, R. C., & Hamre, B. K. (2009). Conceptualization, measurement, and improvement of classroom processes: Standardized observation can leverage capacity. Educational Researcher, 38(2), 109-119.

Pianta, R. C., Hamre, B. K., & Mintz, S. (2012). Classroom Assessment Scoring System: Secondary manual. Baltimore, MD: Paul H. Brookes Publishing.

Pianta, R. C., Hamre, B. K., & Mintz, S. (2012). Classroom Assessment Scoring System: Upper elementary manual. Baltimore, MD: Paul H. Brookes Publishing.

Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2008). Classroom Assessment Scoring System: Manual K-3. Baltimore: Paul H. Brookes Publishing.

Praetorius, A. K., Pauli, C., Reusser, K., Rakoczy, K., & Klieme, E. (2014). One lesson is all you need? Stability of instructional quality across lessons. Learning and Instruction, 31, 2-12.

Rosenshine, B. (1980). How time is spent in elementary classrooms. In C. Denham & A. Lieberman (Eds.), Time to learn (pp. 107-126). Washington, DC: National Institute of Education.

Sandilos, L. E., Shervey, S. W., DiPerna, J. C., Lei, P., & Cheng, W. (2016). Structural validity of CLASS K-3 in primary grades: Testing alternative models.

Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform practices in science and mathematics classrooms: The Reformed Teaching Observation Protocol. School Science and Mathematics, 102(6), 245-253.

Schacter, J., & Thum, Y. M. (2004). Paying for high- and low-quality teaching. Economics of Education Review, 23(4), 411-430.

Scheerens, J. (2014). School, teaching, and system effectiveness: Some comments on three state-of-the-art reviews. School Effectiveness and School Improvement, 25(2), 282-290.

Seidel, T., Prenzel, M., & Kobarg, M. (Eds.) (2005). How to run a video study: Technical report of the IPN video study. Münster/New York: Waxmann.

Seidel, T., & Prenzel, M. (2006). Stability of teaching patterns in physics instruction: Findings from a video study. Learning and Instruction, 16(3), 228-240.

Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454-499.

Slavin, R. E. (1996). Education for all. Lisse: Swets & Zeitlinger Publishers.

Stallings, J. A. (1973). Follow Through program classroom observation evaluation, 1971-72 (Report No. SRI-URU-7370). Menlo Park, CA: Stanford Research Institute.

Teachstone (2017). Why CLASS? Exploring the promise of the Classroom Assessment Scoring System (CLASS). Retrieved March 31 from http://cdn2.hubspot.net/hubfs/336169/What_Is_CLASS_ebook_Final.pdf?t=1446

Tomlinson, C. A., Brimijoin, K., & Narvaez, L. (2008). The differentiated school: Making revolutionary changes in teaching and learning. Alexandria, VA: ASCD.

Van de Grift, W. (2007). Quality of teaching in four European countries: A review of the literature and application of an assessment instrument. Educational Research, 49(2), 127-152.

Van de Grift, W., Van der Wal, M., & Torenbeek, M. (2011). Ontwikkeling in de pedagogisch didactische vaardigheid van leraren in het basisonderwijs [The development of primary school teachers' pedagogical and didactical skill]. Pedagogische Studiën, 88, 416-432.

Van der Lans, R. M., van de Grift, W. J., & van Veen, K. (2017). Developing an instrument for teacher feedback: Using the Rasch model to explore teachers' development of effective teaching strategies and behaviors. The Journal of Experimental Education, 1-18.

Van der Lans, R. M., van de Grift, W. J., van Veen, K., & Fokkens-Bruinsma, M. (2016). Once is not enough: Establishing reliability criteria for feedback and evaluation decisions based on classroom observations. Studies in Educational Evaluation, 50, 88-95.

Wang, M. C., Haertel, G. D., & Walberg, H. J. (1993). Toward a knowledge base for school learning. Review of Educational Research, 63(3), 249-294.

Zimmerman, B. J. (1990). Self-regulated learning and academic achievement: An overview. Educational Psychologist, 25(1), 3-17.
