
5. DISCUSSION

5.2. LABELLING

Q2a: Do novice data modellers benefit from using natural language terminology when labelling entities/classes?

Q2b: What characterizes novices’ concept building processes related to labelling elements of a conceptual data model?

Concept building, in the sense of becoming familiar with scientific terms and their established and generally accepted meaning, is probably the most obvious link between language and learning. I will now discuss a different type of concept building activity related to the labelling of data model elements.

Data modelling (as well as programming) introduces a number of technical concepts that need labelling (papers 1 and 4). Contrary to the process of scientific concept building, which is frequently described and common to most subject areas, this particular aspect of the relationship between language and learning is specific to computer science (i.e. data modelling and programming). This implies revisiting the distinction between scientific (top-down) and spontaneous (bottom-up) concept building (Vygotsky, 1986). The analysis in the present thesis shows that the concept building (paper 3) and labelling (paper 1) activities of data modelling comprise features of both these types of concept building activities simultaneously.

The spontaneous concepts in programming languages and data modelling methodologies are mainly concerned with the use of intelligible terms for denoting meaningful features of a program or a data model. This has been discussed previously in relation to the naming of variables in programming (e.g. Shneiderman, 1980). The process is to a certain extent related to the learning of foreign languages (Vygotsky, 1986), in that it involves reconstructing and/or altering the relationships between terms and meanings from vernacular languages. Known signs are attributed new or altered meanings, and known entities or meanings are labelled with terms or phrases other than the more familiar ones (paper 1).

The other type of concept building (i.e. scientific concept building) takes place when new or abstract “gadgets” are introduced in the data model; for instance, when labelling relational phenomena (i.e. classes or entities that arise from objectification of a relationship between classes or entities). These are phenomena that do not have a close mapping to any everyday concepts. Hence, there is no term or expression from vernacular discourse (paper 4) that lends itself to use as a label for the phenomenon (paper 1). A “new” term must be invented or introduced, and the meaning of this term then needs to be explicitly defined. Through repeated use in the scientific discourse of the data modelling activity, the new term and its related meaning develop into a concept in the modeller’s understanding.
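The objectification of a relationship can be illustrated with a minimal sketch. The names Employee, Project, Allocation, hours_per_week and hours_for are hypothetical choices made for this illustration only; they are not taken from the papers. The sketch assumes a many-to-many relationship between two entities that is turned into a class of its own, whose label is an introduced term with an explicitly stated meaning.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Employee:
    name: str


@dataclass(frozen=True)
class Project:
    title: str


@dataclass(frozen=True)
class Allocation:
    """Relational class (hypothetical label).

    Explicit definition: one employee's commitment to one project,
    measured in planned hours per week. The meaning is fixed by this
    definition, not by everyday use of the word "allocation".
    """
    employee: Employee
    project: Project
    hours_per_week: int


def hours_for(employee: Employee, allocations: Iterable[Allocation]) -> int:
    """Sum the planned weekly hours over all of an employee's allocations."""
    return sum(a.hours_per_week for a in allocations if a.employee == employee)


if __name__ == "__main__":
    ada = Employee("Ada")
    allocations = [
        Allocation(ada, Project("Payroll"), 10),
        Allocation(ada, Project("Billing"), 5),
    ]
    print(hours_for(ada, allocations))
```

The attribute hours_per_week belongs to neither Employee nor Project alone, which is one reason the relationship is modelled as a class and needs a label of its own.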

In the first two papers, I demonstrate the importance for novices of metalinguistic awareness¹¹, and the related need for explicitness in the choice and use of terms for labelling entities as well as attributes and relationships. Paper 1 illustrates this from a largely empirical point of view, whereas paper 2 takes the more theoretical perspective of linguistic philosophy. The common conclusion, which is also evident from the results in paper 4, is that it is necessary to help the students realise, and become aware of, the differences between natural language use and specialised languages like data modelling or programming. One main difference lies in the necessary levels of precision or accuracy. The meanings of natural language propositions are defined through their use in social practices (Wittgenstein, 1958).

The technical language expressions introduced as labels in a data model or program, on the other hand, need to have their meanings explicitly defined in order to prevent ambiguity. Détienne (2002) describes this duality of computer programming: on the one hand it is represented by an unambiguous technical syntax, while on the other it allows terms from vernacular lexis to be incorporated as labels for variables, classes and operations. This latter aspect has been shown to help the understanding of computer programs (Shneiderman, 1980). But, as has been demonstrated in paper 1, it also introduces problems because the students tend to confuse the artificial and the natural language domains as contextual frames when determining the meaning of a term used in a data model. It appears that the distinction between artificial and natural languages is not as clear-cut as one would like to believe, but rather that the two are intertwined. Programming language understanding is, as explained in sections 2.4.1 and 5.1, dependent upon natural language knowledge, but at the same time easily confused by it (Bonar & Soloway, 1985).

¹¹ In paper 1 this is called metalinguistic consciousness (see also discussion in section 3.1.5).

The students in paper 1 appeared to have problems with this distinction. Erroneous modelling was sometimes the result of letting an entity adopt the vernacular meaning of the term chosen as label, without the necessary transformation by grammatical metaphor (paper 4). This could have been avoided if they had been aware that the data model represents a different language game. My point here is that the students probably have the metalinguistic knowledge of this distinction, but they are not aware of it; they lack the metalinguistic awareness.

When making a data model for some problem domain, it is essential to maintain a closeness of mapping between the stakeholders’ conceptual models of the constructs to be modelled and the representations established (Peckham & Maryanski, 1988). In the study by Bürkle et al. (1995), this was achieved by maintaining a close collaboration with user groups as experts in the domain-specific language of banking, so that the concepts deployed in the data model were based on a sound understanding of how these concepts were generally used by the users of the system. Both the students in paper 1 and the students in paper 4 displayed problems due to lack of detailed domain familiarity, which forced them to invent meanings of concepts and their interrelationships. In addition to jeopardising the quality of their system, such inventions put an even greater demand on the students to be explicit about the intended meanings of the components of the system, as their understanding cannot rest on shared cultural-historical background knowledge.

In paper 4, I distinguish between technological and scientific lexis (White, 1998). Being unknown terms introduced as labels for new phenomena, technological concepts are clearly scientific according to Vygotsky (1986). White’s scientific expression also conforms to the scientific concepts of Vygotsky, but paper 4 shows that the learning of these concepts does not necessarily follow the top-down patterns described by Vygotsky. It seems that they are transformed generalizations from vernacular concepts (e.g. Blocking as a nominalized version of the activity of blocking one’s account). These concepts get their meaning through grammatical metaphor, rather than through deduction from a formal definition to specific cases. Based on the analysis of paper 4, it would therefore be appropriate to claim that the attribution of meaning to scientific concepts, in White’s sense, resembles spontaneous concept building in Vygotsky’s terms, rather than scientific concept building.
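The Blocking example can be sketched in code to show what the nominalization amounts to in a model. This is an illustrative sketch only: the attributes reason, blocked_on and lifted_on, and the Account class with its methods, are assumptions introduced here and do not reproduce the model discussed in paper 4. The activity of blocking an account appears as a method (a verb), whereas Blocking appears as a class (a noun) whose instances carry attributes of their own.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Blocking:
    """Nominalized form of the activity "to block an account".

    Hypothetical definition: a record of one occasion on which an
    account was blocked, with its reason and period of validity.
    """
    reason: str
    blocked_on: date
    lifted_on: Optional[date] = None

    def is_active(self) -> bool:
        # A blocking stays in force until a lifting date is recorded.
        return self.lifted_on is None


@dataclass
class Account:
    number: str
    blockings: List[Blocking] = field(default_factory=list)

    def block(self, reason: str, on: date) -> Blocking:
        """The activity (verb): creates and stores a Blocking (noun)."""
        blocking = Blocking(reason=reason, blocked_on=on)
        self.blockings.append(blocking)
        return blocking

    def is_blocked(self) -> bool:
        return any(b.is_active() for b in self.blockings)


if __name__ == "__main__":
    account = Account("1234")
    blocking = account.block("card reported lost", date(2020, 1, 1))
    print(account.is_blocked())
    blocking.lifted_on = date(2020, 1, 2)
    print(account.is_blocked())
```

In the vernacular reading, blocking is something done to an account; in the model it is a thing that can be counted, dated and queried, which is the shift in meaning that the label has to carry.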

The complex relationships between the different semiotic systems related to the activity of data modelling are illustrated by the framework presented in paper 4. The navigation between the different metalevels, contexts and signs constitutes a bridging of the gap between artificial and natural languages. To be able to handle this bridging of the gap successfully, the data modeller needs a certain level of metalinguistic awareness. This awareness seems to be particularly important for novices. Note, however, that with increasing levels of expertise (Dreyfus & Dreyfus, 1986), the difference becomes less obvious or important, and the metalinguistic knowledge is gradually less explicitly attended to in the discourse. Paper 4 furthermore introduces the notions of technical and vernacular language realms as contextual frames. It appears that proficiency in data modelling is characterised by the ability to shift seamlessly between these different contextual frames in discourse. By seamless shifts I mean that the differences between the language games involved are not attended to explicitly, but are still recognised in the way the meaning of a term is determined by the contextual frame in which it is used. This finding corroborates Dreyfus and Dreyfus (1986).