Oslo Studies in Language 9 (3) / 2017
Ruth Vatvedt Fjeld, Kristin Hagen, Birgit Henriksen, Sofie Johansson, Sussi Olsen & Julia Prentice (eds.)
Academic Language in a Nordic Setting –
Linguistic and Educational Perspectives
Oslo Studies in Language
General editors: Atle Grønn and Dag Haug
Editorial board
International:
Henning Andersen, Los Angeles (historical linguistics)
Östen Dahl, Stockholm (typology)
Torgrim Solstad, Berlin (German, semantics and pragmatics)
Arnim von Stechow, Tübingen (semantics and syntax)
National:
Johanna Barðdal, Bergen (construction grammar)
Laura Janda, Tromsø (Slavic linguistics, cognitive linguistics)
Terje Lohndal, Trondheim (English, syntax and semantics)
Øystein Vangsnes, Tromsø (Norwegian, dialect syntax)
Local:
Hans Olav Enger, ILN (Norwegian, cognitive linguistics)
Ruth E. Vatvedt Fjeld, ILN (Norwegian, lexicography)
Jan Terje Faarlund, CSMN, ILN (Norwegian, syntax)
Cathrine Fabricius-Hansen, ILOS (German, contrastive linguistics)
Hilde Hasselgård, ILOS (English, corpus linguistics)
Hans Petter Helland, ILOS (French, syntax)
Janne Bondi Johannessen, ILN, Text Laboratory (Norwegian, language technology)
Helge Lødrup, ILN (syntax)
Gunvor Mejdell, IKOS (Arabic, sociolinguistics)
Christine Meklenborg Salvesen, ILOS (French linguistics, historical linguistics)
Diana Santos, ILOS (Portuguese linguistics, computational linguistics)
Ljiljana Saric, ILOS (Slavic linguistics)
Bente Ailin Svendsen, ILN (second language acquisition)
Ruprecht von Waldenfels, ILOS (Slavic linguistics, corpus linguistics)
Oslo Studies in Language, 9(3), 2017.
Ruth Vatvedt Fjeld, Kristin Hagen, Birgit Henriksen, Sofie Johansson, Sussi Olsen & Julia Prentice (eds.):
Academic Language in a Nordic Setting – Linguistic and Educational Perspectives
Oslo, University of Oslo
ISSN 1890-9639
© 2017 the authors
Set in MS Word, fonts Gentium Book Basic and Linux Libertine by Kristin Hagen.
Cover design by Akademika forlag & Atle Grønn.
Printed by Print House AS from camera-ready copy supplied by the editors.
Contents

Introduction
Ruth Vatvedt Fjeld, Kristin Hagen, Birgit Henriksen, Sofie Johansson, Sussi Olsen and Julia Prentice

Contributions by Plenary Speakers

Academic Phraseology: A Key Ingredient in Successful L2 Academic Literacy
Sylviane Granger

Academic Vocabulary in Teacher Talk: Challenges and Opportunities for Pedagogy
Averil Coxhead

Textography as a Strategy for Investigation: Writing in Higher Education and in the Professions
Brian Paltridge and Marie Stevenson

Pedagogical Challenges and Educational Perspectives

From Implicit Norms to Explicit Skills - Focusing on Danish Academic Vocabulary
Anne Sofie Jakobsen

Preparing EFL Students for University EMI Programs: The Hidden Challenge
Birna Arnbjörnsdóttir

Specific Linguistic Perspectives on Academic Language Use in the L1 and L2

Pronoun Use in Novice L1 and L2 Academic Writing
Marte Monsen and Sylvi Rørvik

Linguistic Complexity in Academic Writing: Comparing Tasks in L2 English
Päivi Pietilä

Validity in High- and Low-Stakes Tests: A Comparison of Academic Vocabulary and some Lexical Features in CLIL and non-CLIL Students’ Written Texts
Eva Olsson and Liss Kerstin Sylvén

A Bilingual Academic Word List: The Merging of a Norwegian and a Swedish List
Sofie Johansson, Kristin Hagen and Janne Bondi Johannessen

Other Research Perspectives on Academic Language

Linguistic Deviations in the Written Academic Register of Danish University Students
Jonas Nygaard Blom, Marianne Rathje, Bjarne le Fevre Jakobsen, Alexandra Holsting, Kenneth Reinecke Hansen, Jesper Tinggaard Svendsen, Thit Wedel Vildhøj and Anna Vibeke Lindø

English as an Interactional Resource for Doing Being Academically Competent: Student Practices in Group Meetings
Elisabeth Dalby Kristiansen
Fjeld, Hagen, Henriksen, Johansson, Olsen & Prentice (eds.) Academic Language in a Nordic Setting – Linguistic and Educational Perspectives, Oslo Studies in Language 9(3), 2017. 1–7.
(ISSN 1890-9639) http://www.journals.uio.no/osla
introduction
RUTH VATVEDT FJELD, KRISTIN HAGEN, BIRGIT HENRIKSEN, SOFIE JOHANSSON, SUSSI OLSEN AND JULIA PRENTICE
[1] the network language use in nordic academic settings
In 2013, a Scandinavian research network, Language Use in Nordic Academic Settings (LUNAS), was formed. The network, which was funded by Nordplus, unites researchers from Denmark, Norway and Sweden working in the fields of lexicography, computational linguistics, second language acquisition, and didactics in second language teaching. The main purpose of the network has been to share information and experiences on academic language use in the Scandinavian countries in educational settings and in research; equally important aims are to identify common issues and interests within the field and to benefit from joint efforts in language resource development and research. In May 2016, the LUNAS network organized a large international conference at the University of Copenhagen on academic language use and academic literacies from a multilingual perspective in Nordic educational contexts. The articles presented in this volume show some of the results of the cooperation within the LUNAS network since 2013, but also present research from invited guests and other researchers interested in these topics.
As a result of the Scandinavian collaboration, the need for carrying out research on large collections of academic texts has been outlined. All three countries have now compiled large corpora of academic texts, although the corpora differ somewhat in size and balance. For instance, while the Norwegian and the Danish corpora are composed of texts from a wide range of disciplines, the Swedish corpus has a preponderance of texts from the humanities and social sciences. Apart from corpora, monolingual Norwegian and Swedish academic word lists have been extracted using diverse state-of-the-art, statistically based methods in language technology and corpus linguistics. In addition, a monolingual Danish academic word list and a bilingual Norwegian-Swedish academic word list are in the process of being established by researchers from the LUNAS group.
The preconditions for studying academic language have varied across the Scandinavian countries, and this has influenced the research carried out in the project groups. Research in Norway has focused on the development of a corpus and lexical resources. A similar focus has guided the research in Sweden, as well as several ongoing research projects in pedagogical and educational topics relevant to academic language use. In Denmark, no academic corpus was available at the onset of the LUNAS project, and the main research focus in the field of academic language had previously been on education in academic English as a second language. These different perspectives are reflected in this volume of collected papers, which, as a result, consists of three different parts: one with a focus on the pedagogical challenges and educational perspectives, one with a focus on the linguistic perspectives and one more generally related to describing academic language with varying approaches.
[2] this volume
This volume gives a sample of ongoing research in the field of language use in Nordic academic settings. The selection of papers is based on contributions to the conference at the University of Copenhagen in May 2016.
A selection of conference participants was invited to submit a contribution to this volume. Each paper was reviewed by two expert researchers in the field. Papers with minor or moderate revision suggestions were accepted for publication.
The reviewers, listed in alphabetical order, are: Dorte Albrechtsen, Anne Golden, Hana Gustafsson, Glenn Ole Hellekjær, Anne Holmen, Håkan Jansson, Sabine Kirchmeier, Anne Kjærgaard, Lars Anders Kulbrandstad, Maria Kuteeva, Robert Lew, Monika Mondor, Sanni Nimb, Andreas Nord, Claes Ohlsson, Karl-Heinz Pogner, Ida Seljeseth, Philip Shaw, Emma Sköldberg, Sofia Tingsell, Ole Togeby, Urd Vindenes and Johannes Wagner.
[3] contributions by the plenary speakers at the LUNAS conference
Before providing a brief summary of the content of all the chapters, we want to highlight some of the issues raised in the three invited contributions to this volume by the plenary speakers at the LUNAS conference. These are Sylviane Granger, Averil Coxhead and Brian Paltridge (article written in collaboration with Marie Stevenson). These chapters give some important insights concerning theory, research methodology and pedagogical practice for the area of academic language use, from both an L1 and an L2 perspective.
Sylviane Granger’s contribution on academic phraseology incorporates all three of the different perspectives mentioned above. Nowadays, the necessity of giving phraseological units the same amount of attention as individual words is a widely shared view among researchers investigating L2 acquisition and L2 vocabulary learning. When it comes to research on academic language use (both L1 and L2), however, phraseology has only recently started to come in for serious consideration. The discussion concerning the different methodologies that have been used to compile different lists of academic phrasemes raises important questions regarding the criteria to be applied to determine which multiword units should be extracted from academic texts and which should be included (or excluded) in a final list to serve as a resource for academic writing. These are questions that are ultimately concerned with the nature of academic language, i.e. which phrases are characteristic of academic language use and which of those are worth teaching to student groups with increasingly heterogeneous backgrounds. Granger argues that, besides linguists’ and practitioners’ perspectives, the L2 learners’ perspective needs to be included in the equation. One way to achieve this is to ensure that research on academic language, as well as the resources that result from such research, is also informed by learner corpus research, which can pinpoint the particular challenges that students experience when writing academic texts in their L2. Another important issue raised in the chapter is the comparison of L2 users’ academic writing to the writings of novice native academic writers. As is often pointed out, academic language is a new language even for the latter group; this, however, does not mean that the needs and challenges for the two groups are identical, and this needs to be taken into account, e.g. when developing resources like academic word and phrase lists for the different target groups.
The differences between academic vocabulary and the vocabulary of everyday language and the challenges that they pose for students (especially those studying in their L2) are central issues in Averil Coxhead’s contribution. By focusing on teachers’ language use and the importance of this input for the development of students’ academic vocabulary, Coxhead highlights a problem that teachers in higher education, also in the Nordic countries, are increasingly concerned about, i.e. the transition between upper secondary school and university education. This transition presents immense difficulties to many students when it comes to the language that they are expected to use, since the gap between general language use and academic language is simply too big. New university students are, in other words, expected to use a style and vocabulary that is new to them and that they have not encountered before. To raise teachers’ awareness of their own use (or lack of use) of academic language in their teaching is therefore an important goal for research on academic language use from a didactic perspective.
Brian Paltridge and Marie Stevenson’s contribution to this volume discusses academic writing from a slightly different perspective by focusing on its relationship to the type of writing assignments students have to perform in their future professions (rather than its function as a means of communication in higher education and the research community). The question as to how far and in what way academic writing in higher education prepares students for writing tasks in their future workplace is by no means an obvious one, and the contribution therefore highlights an important issue related to student competence development. The chapter also provides important insights into the advantage of triangulating different methodologies when researching the relationship between academic writing and writing in the workplace.
[4] the other contributions in this volume
In the following, we summarize the content of the remaining contributions to this volume, divided into three sections, namely:
Pedagogical Challenges and Educational Perspectives
Specific Linguistic Perspectives on Academic Language Use in the L1 and L2
Other Research Perspectives on Academic Language.
[4.1] Pedagogical Challenges and Educational Perspectives
Anne Sofie Jakobsen: From Implicit Norms to Explicit Skills - Focusing on Danish Academic Vocabulary starts with a discussion of the function of academic language and the challenges L2 language users have in developing skills in academic language use. The main focus is on the discrepancy between presentations of academic vocabulary and academic language in guidance literature on academic writing and the requirements learners face in real academic writing. Such requirements are treated as implicit norms in the students’ education, and the author argues for a more explicit focus on academic language skills in the Danish educational system.
Birna Arnbjörnsdóttir: Preparing EFL Students for University EMI Programs: The Hidden Challenge discusses the challenges that Icelandic students meet when they study English at university level in Iceland. These challenges are rarely acknowledged, even though most students struggle to express themselves in adequate academic English. This is mainly due to the fact that, throughout the Icelandic educational system, English is taught more with an emphasis on literature and general conversational skills rather than on the mastery of academic
language use of students at various proficiency levels of English in different learning contexts and outlines an established course in English as an L2 as a way to meet the different needs of the students.
[4.2] Specific Linguistic Perspectives on Academic Language Use in the L1 and L2
Marte Monsen and Sylvi Rørvik: Pronoun Use in Novice L1 and L2 Academic Writing describes the use of pronouns in novice academic writing in L1 Norwegian and L2 English by L1 Norwegian writers, with a focus on first person pronouns. The study provides quantitative data that shed new light on pronoun use in student writing, and on differences between English and Norwegian academic writing. The main findings reveal differences in the frequency of use of pronouns in the two languages, with a tendency for students to use the first person pronouns in a similar way in both languages, despite the fact that the use of these pronouns is discouraged in English academic writing, whereas it is encouraged in academic writing in Norwegian.
Päivi Pietilä: Linguistic Complexity in Academic Writing: Comparing Tasks in L2 English addresses the important and tricky issue of operationalizing syntactic and lexical complexity in student writing and reports on a small-scale study based on the analysis of three types of texts written by students of English in academic contexts. The study measures syntactic and lexical complexity in the student texts with different levels of formality and personal involvement. The results are in line with previous international research on L2 writing. The context in which the study was carried out is relevant for the present thematic focus on academic language use in a Nordic setting, as the data were collected from undergraduate and Master’s students at a Finnish university.
Eva Olsson and Liss Kerstin Sylvén: Validity in High- and Low-Stakes Tests: A Comparison of Academic Vocabulary and some Lexical Features in CLIL and non-CLIL Students’ Written Texts is a validation study involving high- and low-stakes essays that are part of a large longitudinal Swedish research project in CLIL. Their study focuses on upper secondary students’ productive English academic vocabulary and other linguistic features, such as text length, word length and variation of vocabulary. They argue for the need to establish validity in relation to writing assignments in high- and low-stakes contexts in a more general sense, for instance with regard to the role of effort and motivation.
Sofie Johansson, Kristin Hagen and Janne Bondi Johannessen: A Bilingual Academic Word List: The Merging of a Norwegian and a Swedish List demonstrates how two monolingual academic word lists in Norwegian and Swedish compiled by different methods are merged into a bilingual list. Methodological issues on compiling academic word lists are discussed as well as crucial differences and similarities in Norwegian and Swedish academic language use. The conclusions of the article are that, irrespective of method, there are unexpected differences. In addition, the article lists examples of cognates and false friends in the two languages.
[4.3] Other Research Perspectives on Academic Language
Jonas Nygaard Blom, Marianne Rathje, Bjarne le Fevre Jakobsen, Kenneth Reinecke Hansen, Jesper Tinggaard Svendsen, Alexandra Holsting, Thit Wedel Vildhøj and Anna Vibeke Lindø: Linguistic Deviations in the Written Academic Register of Danish University Students shows that Danish university students studying Journalism and Danish respectively make many, often serious, mistakes in their academic writing. The study concludes that the students are proficient in neither Danish orthography nor grammar. In addition, the article discusses the methodological problems involved in quantifying and describing linguistic deviations (defined as orthography and punctuation).
Elisabeth Dalby Kristiansen: English as an Interactional Resource for Doing Being Academically Competent: Student Practices in Group Meetings interprets the interactions in a supposedly English-only environment in terms of the language ideologies and practices. Using an ethnomethodological approach combined with conversation analysis, she investigates how the students adapt to and adopt English as part of their day-to-day work as students, and how they negotiate their position in the student group through their academic language use. Her conclusion is that this activity does not necessarily lead to better competence in what the author terms ‘institutionalized English’, which is an important pedagogical question that needs further discussion and research.
[5] concluding remarks
We would like to thank all the contributors to this volume, the reviewers, and the LUNAS conference participants.
We would also like to thank Nordplus for funding the research network and the Carlsberg Foundation for supporting the LUNAS conference financially.
Finally, we would like to thank our local universities and institutions for encouraging and supporting our work on academic language use in Danish, Norwegian and Swedish.
We hope that readers of the papers of this volume will appreciate the ongoing research in the field. We intend to keep on participating and contributing to research on language use in Nordic academic settings in the future.
contacts
Ruth Vatvedt Fjeld, University of Oslo
Kristin Hagen, University of Oslo
Birgit Henriksen, University of Copenhagen
Sofie Johansson, University of Gothenburg
Sussi Olsen, University of Copenhagen
Julia Prentice, University of Gothenburg
Fjeld, Hagen, Henriksen, Johansson, Olsen & Prentice (eds.) Academic Language in a Nordic Setting – Linguistic and Educational Perspectives, Oslo Studies in Language 9(3), 2017. 9–27.
(ISSN 1890-9639) http://www.journals.uio.no/osla
academic phraseology: a key ingredient in successful L2 academic literacy
SYLVIANE GRANGER
abstract
One of the weaknesses of most current academic word lists is that they fail to do justice to the large stock of multiword units that are typical of academic language. The objective of this chapter is to raise awareness of the importance of phrasal academic vocabulary. After a brief critical survey of three recently compiled phrasal academic lists, the chapter highlights the potential contribution of learner corpus data to identifying the most useful units for teaching purposes. The approach is illustrated with a case study of phrasal metadiscourse based on corpora of novice and expert native writing, and subcorpora from the International Corpus of Learner English representing L2 writers from six different mother tongue backgrounds.
[1] introduction
Academic vocabulary can be roughly defined as the words and phrases that are typically used in academic contexts. While acquiring these lexical items is essential to the academic achievement of both native and non-native (L2) students, they represent a particularly significant hurdle for L2 users, who have to understand and produce academic language in a language that is not their own.
Academic vocabulary is commonly subdivided into discipline-specific academic words, i.e. technical words that describe content knowledge (e.g. malignant, biopsy or incubate in medicine), and general academic words that occur across content areas and are used to refer to activities typical of academic work and to structure discourse (e.g. despite, conclusion or categorise). Both types of word are challenging for L2 learners, but it is the cross-disciplinary ones that pose the greatest difficulties. One of the reasons is that these words ‘are supportive of but not central to the topics of the texts in which they occur’ (Coxhead 2000: 214). As a result, they are not particularly salient and tend to pass unnoticed. To help teachers give these words the pedagogical attention they require, several lists of cross-disciplinary academic words have been compiled. The first to appear was Coxhead’s (2000) Academic Word List (AWL), which quickly met with great success and is still the most widely used today. However, as acknowledged by Coxhead (2008) herself, ‘[o]ne of the challenges of the AWL is that it was released solely as a list of individual words and their families, with no indication of the context and patterning in which these words occurred’. The same weakness is pointed out by Gardner & Davies (2013), the compilers of the Academic Vocabulary List, who conclude that ‘more needs to be done in the future to identify core multiword academic vocabulary’ (p. 325). A first step in that direction was made by Paquot (2010), whose Academic Keyword List contains a number of multiword adverbs, prepositions and conjunctions (e.g. as well as, according to, as opposed to, as to, contrary to, in favour of, rather than, for example, whether or not), though these represent but a small proportion (c. 3%) of the whole list. Another initiative designed to ‘phrase up’ single-word vocabulary lists is that of Martinez & Schmitt (2012), who, using a mixture of automatic extraction and manual vetting by judges with language testing and teaching experience, put together a Phrasal Expressions List, which contains 505 frequent non-transparent multiword expressions in English, specially intended for receptive use.
In recent years, there have been several initiatives to produce lists of word combinations typical of academic discourse (Durrant 2009, Simpson-Vlach & Ellis 2010, Ackermann & Chen 2013). The main objective of this chapter is to take a critical look at these lists and the methods used to generate them (Section 2) and to describe the potential contribution of learner corpora to identifying the most useful units (Section 3). In Section 4, the corpus-based approach to academic phraseology is illustrated by means of a case study of phrasal metadiscourse. Section 5 offers some conclusions and avenues for future research.
[2] phrasal academic lists
Phrasal academic lists are lists of phrasemes (Mel’čuk 1998), i.e. word combinations displaying some degree of selectional restriction that are deemed to offer high learning and teaching potential. Those that have been compiled to date target two different types of word combination: collocations, i.e. pairs of words that co-occur in a short span of text more often than predicted by chance (Sinclair 1991) and are identified by means of statistical tests such as Mutual Information (MI), and lexical bundles, i.e. sequences of contiguous words that recur in a particular register (Biber et al. 1999) and are extracted using the n-gram technique. One characteristic shared by the two types of unit is that they are usually not very difficult to understand, but are very difficult to get right in productive tasks.
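The MI-based identification of collocations mentioned above can be sketched as follows. This is a minimal illustration, not the procedure of any of the lists discussed: it assumes the common pointwise formulation MI = log2(O × N / (f1 × f2)), where O is the number of times the two words co-occur within the span, N is the corpus size in tokens, and f1 and f2 are the frequencies of the two words. The function name `mi_collocations` and its parameters `span`, `min_mi` and `min_freq` are hypothetical, and real tools differ in how they normalise for span size.

```python
import math
from collections import Counter


def mi_collocations(tokens, span=4, min_mi=3.0, min_freq=2):
    """Return (left, right, MI) for word pairs co-occurring within `span` tokens.

    Illustrative assumption: MI = log2(observed * N / (f_left * f_right)),
    with no correction for span size.
    """
    n = len(tokens)
    word_freq = Counter(tokens)
    pair_freq = Counter()
    # Count every ordered pair whose second member falls inside the span.
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + span]:
            pair_freq[(left, right)] += 1

    scored = []
    for (left, right), observed in pair_freq.items():
        if observed < min_freq:
            continue  # rare pairs give unreliable MI scores
        mi = math.log2(observed * n / (word_freq[left] * word_freq[right]))
        if mi >= min_mi:
            scored.append((left, right, mi))
    # Strongest associations first.
    return sorted(scored, key=lambda item: item[2], reverse=True)
```

Raising `min_mi` or `min_freq` shortens the resulting list, which is one reason why lists built with different thresholds are hard to compare.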
The lists assembled by Durrant (2009) and Ackermann & Chen (A&C) (2013) contain typical English for Academic Purposes (EAP) collocations extracted from large corpora of native or expert academic texts. Although they were compiled with the same objective, that of providing a pedagogically useful resource, the two lists are quite different, not only in size (1,000 pairs for Durrant vs. 2,468 for A&C) but also in the quality of the units. Durrant’s list contains a majority (76%) of grammatical collocations, i.e. containing at least one grammatical word, such as consistent with, as shown, suggest that, while A&C’s Academic Collocation List is limited to lexical collocations made up of two lexical words (see examples in Table 1). Several other factors account for discrepancies between the two lists, among them the following four: (1) the corpora used differ in size and academic genres covered; (2) A&C exclude collocation pairs that contain words from West’s (1953) General Service List (top 2,000 words), while Durrant includes them; (3) the collocation status of the word pairs relies on different quantitative criteria (e.g. MI of minimum 4 for Durrant and minimum 3 for A&C); (4) Durrant relies solely on automatic extraction, while A&C’s approach makes use of both automatic extraction and manual screening by linguists (e.g. to exclude highly fixed units[1] such as collective bargaining) and language practitioners (to select the pedagogically most relevant units).
Verb      Adverb         Adjective    Noun
differ    significantly  causal       link
expand    rapidly        conflicting  interests
explore   further        crucial      factor
increase  dramatically   final        stage
vary      widely         further      information
Table 1: Excerpt from Ackermann & Chen’s (2013) Academic Collocation List
[1] The degree of fixedness was determined by consulting several online dictionaries to see whether the word combinations were listed as independent entries.
Rather than collocations, the Academic Formulas List (AFL) put together by Simpson-Vlach & Ellis (2010) comprises lexical bundles, i.e. uninterrupted sequences of 3–5 words extracted from academic corpora. The final list contains 600 formulaic sequences, subdivided into three 200-formula lists (core, spoken and written). Table 2 shows the top 15 spoken and written sequences in the AFL. The list was obtained using a mixture of automatic and manual extraction procedures. The first step consisted in the automatic extraction of sequences that were more frequent in academic than non-academic texts. Conscious that ‘long lists of highly frequent expressions are of minimal use to instructors who must make decisions about what content to draw students’ attention to for maximum benefit within limited classroom time’, Simpson-Vlach & Ellis (2010: 490) submitted a sample of the data to experienced teachers, who were asked to assess their usefulness for teaching purposes. On that basis they created a metric, called the Formula Teaching Worth, which reflects the statistical correlation between teacher intuition, on the one hand, and phrase frequency and MI score, on the other, and used it to draw up the final list. To enhance the usefulness of the list for teachers, the authors then grouped the sequences into discourse-pragmatic categories. For example, bundles such as a high degree, a large number of, a wide range of were classified in the category of ‘quantity specification’, while it might be, is likely to, in a sense were classified as ‘hedges’.
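The first, automatic step described above (extracting 3–5-word sequences that are more frequent in academic than in non-academic texts) can be sketched along the following lines. This is an illustrative approximation only, not the AFL authors' actual procedure: the function names `ngram_counts` and `academic_bundles` and the `min_per_million` threshold are assumptions, and the Formula Teaching Worth metric is deliberately not reproduced.

```python
from collections import Counter


def ngram_counts(tokens, n_min=3, n_max=5):
    """Count all contiguous word sequences of length n_min..n_max."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i : i + n])] += 1
    return counts


def academic_bundles(academic, general, min_per_million=10.0, n_min=3, n_max=5):
    """Keep bundles that are relatively more frequent in the academic corpus.

    Frequencies are normalised per million tokens so that corpora of
    different sizes can be compared (the threshold value is an assumption).
    """
    acad = ngram_counts(academic, n_min, n_max)
    gen = ngram_counts(general, n_min, n_max)
    acad_scale = 1_000_000 / max(len(academic), 1)
    gen_scale = 1_000_000 / max(len(general), 1)

    kept = {}
    for bundle, freq in acad.items():
        acad_rate = freq * acad_scale
        gen_rate = gen.get(bundle, 0) * gen_scale
        if acad_rate >= min_per_million and acad_rate > gen_rate:
            kept[bundle] = (acad_rate, gen_rate)
    return kept
```

The output of such a step is typically long, which is why the authors then filtered it with teacher judgements.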
Spoken AFL                    Written AFL
be able to                    on the other hand
blah blah blah                due to the fact that
this is the                   on the other hand the
you know what I mean          it should be noted
you can see                   it is not possible to
trying to figure out          a wide range of
a little bit about            there are a number of
does that make sense          in such a way that
you know what                 take into account the
the University of Michigan    as can be seen
for those of you who          it is clear that
do you want me to             take into account
thank you very much           can be used to
look at the                   in this paper we
we're gonna talk about        are likely to
Table 2: Top 15 lexical bundles in Simpson-Vlach & Ellis’s (2010) Spoken and Written Academic Formulas List
These initiatives are promising, but they are only a beginning. The discrepancy between the lists shows that there are still a considerable number of issues to be addressed in future research, including the following:
(i) What quantitative criteria should best be used to extract the units? On the basis of what statistical tests and with what thresholds?
(ii) Should some types of unit be excluded from the lists and, if so, which ones and on the basis of what criteria?
(iii) Should the automatically derived lists be filtered by teachers and, if so, on the basis of what criteria?
Whatever the method used, the lists generated tend to be quite long, and it is not realistic to expect teachers to have the time or inclination to give all of them the same degree of pedagogical attention. As argued in the following section, one particularly helpful way of identifying the most useful combinations is to resort to learner corpus data.
[3] insights from learner corpora
To extract academic phrases, researchers have so far relied solely on native or expert corpora. This method is clearly essential to identifying the most typical native-like units. However, it provides no information on the basis of which to select the most useful units for specific learning/teaching contexts. One essen- tial feature, namely the degree of difficulty for a particular learner population, is completely disregarded. To bring this variable into the picture, it is necessary to complement corpora of native or expert language with learner corpus data, i.e. electronic collections of academic writing and speech by L2 learners.
Learner corpora are a relatively new resource that is enjoying growing in- terest from the language education community at large, and specialists in aca- demic writing in particular. One of the advantages such corpora offer is that they tend to be quite large and therefore provide numerous authentic exam- ples of learners’ difficulties in context. In addition, as the data are in electronic format, they can be explored automatically with the help of powerful software tools, thereby offering insights that would be inaccessible to manual analysis.
Using the methodology referred to as ‘contrastive interlanguage analysis’ (Granger 2015), it is possible to extract entirely automatically multiword units that are used significantly more frequently (overuse) or less frequently (underuse) in learner corpora than in comparable native or expert texts, as well as to compare frequency of use across learner populations (e.g. Swedish vs. Spanish learners). Misuse can also be detected, such as learners’ use of ‘in the other hand’ or ‘on the other side’ instead of, or in addition to, the nativelike connector ‘on the other hand’.
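The comparison of frequencies described above can be made concrete with a short sketch in Python. It is an illustration under stated assumptions rather than the procedure used in the studies cited: the log-likelihood statistic and the 3.84 threshold (p < 0.05, one degree of freedom) are common choices in corpus linguistics but are not specified in the text, the function names are hypothetical, and the raw counts in the usage lines are invented. Only the two corpus sizes are taken from Table 3.

```python
import math

def log_likelihood(freq_learner, size_learner, freq_ref, size_ref):
    """Log-likelihood (G2) for one unit observed in two corpora."""
    total_freq = freq_learner + freq_ref
    total_size = size_learner + size_ref
    # Expected frequencies if the unit were equally likely in both corpora
    exp_learner = size_learner * total_freq / total_size
    exp_ref = size_ref * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_learner, exp_learner), (freq_ref, exp_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

def compare(freq_learner, size_learner, freq_ref, size_ref, threshold=3.84):
    """Label a unit as overuse, underuse or no significant difference."""
    g2 = log_likelihood(freq_learner, size_learner, freq_ref, size_ref)
    if g2 < threshold:
        return "no significant difference", g2
    learner_rate = freq_learner / size_learner
    ref_rate = freq_ref / size_ref
    return ("overuse" if learner_rate > ref_rate else "underuse"), g2

# Invented raw counts; corpus sizes are those of ICLE-FR and LOCNESS (Table 3).
label, g2 = compare(67, 160_245, 38, 327_807)
```

With these invented counts the unit would be labelled as overused by the learners. In practice the threshold, and any correction for multiple comparisons, would have to be chosen by the analyst.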
This powerful methodology has been used in a large number of studies, some focused on collocations, others on lexical bundles, rarely on both types of unit. For reasons of space it is not possible here to sum up the main trends emerging from these studies (for recent surveys, see Paquot & Granger 2012 and Ebeling & Hasselgård 2015). However, one trend deserves special mention because it is consistent across the studies, viz. the high degree of transfer from the learner’s mother tongue (L1). For example, in her study of the use of verb-noun combinations by advanced German-speaking learners, Nesselhauf (2005) observes that some 50% of the inappropriate collocations are probably transfer-related. An even higher percentage (67%) is given in Wanner et al.’s (2013) study of miscollocations by Spanish learners. L1 transfer is also at play in the use of lexical bundles. In her study of lexical bundles in English texts written by French-speaking learners, Paquot (2013) finds different types of idiosyncratic use which can be traced back to French. She attributes the ease with which lexical bundles can be transferred to the fact that they are semantically and syntactically compositional and therefore not sufficiently salient to attract learners’ attention. This finding has important pedagogical implications: it means that in order to ensure maximum efficiency, the pedagogical attention given to certain phraseological units should not be the same in the case of all learners but adapted to the attested needs of particular learner populations.
The flurry of publications focused on phraseology in learner corpus research demonstrates the benefits of an approach that extends the corpus base to incorporate learner corpus data in addition to the traditionally used native/expert data. In the next section this approach is illustrated by means of a case study on phrasal metadiscourse.
[4] case study: phrasal metadiscourse
Metadiscourse, i.e. the use of language to ‘organise texts, engage readers and signal attitudes to the material and the audience’ (Hyland 2015: 1), is often expressed by sequences of words rather than single words and is therefore a particularly suitable area of language in which to investigate the phrasemes used by learners when they write academic texts. This section illustrates how the combined use of native and learner corpus data can help uncover differences in the preferred phraseological patterning of one particular metadiscursive word – the noun conclusion (Section 4.1) – and in the use of four-word metadiscursive sequences (Section 4.2). The first analysis is corpus-based, i.e. it starts from a word that is assumed to be worth investigating, and corpus tools and methods are used to retrieve all its occurrences and explore its phraseology. The second is corpus-driven, i.e. no assumption is made initially as to the linguistic items to be investigated; rather, it is the corpus tools and methods that automatically generate lists of potentially interesting phraseological sequences.
The methodology used for the analysis involves the two branches of contrastive interlanguage analysis, i.e. comparison of learner varieties with one or more reference varieties, and comparison between learner varieties. The learner corpus data come from the International Corpus of Learner English (ICLE) (Granger et al. 2009), which contains essays written by higher intermediate to advanced learners from sixteen mother tongue backgrounds. The subcorpus used contains argumentative essays[2] from learner populations representing six mother tongue backgrounds, i.e. French (FR), German (GE), Italian (IT), Norwegian (NO), Spanish (SP) and Swedish (SW). The composition of the learner corpus allows comparisons not only between individual L2 populations, e.g. French-speaking vs. German-speaking learners, but also between two groups of L2 populations, i.e. Romance vs. Germanic.
Two reference corpora were used for the comparison with the L2 data. Both are native English corpora, but they represent different degrees of academic maturity. The Louvain Corpus of Native English Essays (LOCNESS) is a corpus of argumentative writing by university students and therefore represents novice native writing. Its main advantage is that it is fully comparable with the ICLE in terms of genre. The other native English corpus is the academic section of the British National Corpus (BNC), Baby.[3] Made up of academic texts from periodicals and books, it represents professional academic writing. Having two reference points will make it possible to assess whether, as stated by Römer (2009), L2 learners tend to behave similarly to novice native writers or whether, as argued by Gilquin et al. (2007), they display features that set them distinctly apart from native writers, whether novice or professional. The size of the eight subcorpora is shown in Table 3.
Corpus      No. of words
ICLE-FR          160,245
ICLE-GE          228,180
ICLE-IT          199,001
ICLE-NO          207,230
ICLE-SP          127,051
ICLE-SW          162,216
LOCNESS          327,807
BNC-Acad       1,027,550

table 3: Size of the eight subcorpora
[2] The literary essays in the ICLE were excluded from the data set.
[3] http://www.natcorp.ox.ac.uk/archive/oldBabyDocs/baby-des.html
4.1 Corpus-based approach: conclusion
The simple extraction of all the occurrences of conclusion from the eight subcorpora already reveals some interesting results. As can be seen in Figure 1, there are quite striking differences between the Romance learner populations (SP, FR and IT), which are characterised by heavy use of conclusion, and the Germanic learner populations (NO, SW and GE), whose frequency of use is much closer to that displayed by novice and professional native speakers (in black in the Figure).
figure 1: Relative frequency of the word conclusion in the eight subcorpora (/200,000 words)
Over and above this purely quantitative difference, the analysis also reveals some interesting qualitative findings. The noun conclusion can be used in two different ways: as part of a noun phrase functioning as subject or object (e.g. a conclusion that can be drawn) and as part of an adverbial connector, usually used at the beginning of the sentence, followed by a comma (e.g. In conclusion, this study shows). A close scan of the concordance lines shows that the proportion of connector vs. non-connector use varies widely across the L2 subcorpora. As shown in Figure 2, while some learner populations mainly use conclusion as an adverbial connector (91% in IT), others rarely use it in that way (only 15% in GE).
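The proportions in Figure 2 rest on a close reading of the concordance lines. A first rough pass over the data could nevertheless be automated, as in the following Python sketch. It is not the procedure used in this study: the pattern `CONNECTOR` and the function `connector_share` are hypothetical, and the heuristic (a sentence-initial ‘in conclusion’, ‘as a conclusion’ or ‘as conclusion’) is an assumption that would miss connectors placed elsewhere in the sentence. The sample sentences are examples (1) to (4) of this section.

```python
import re

# Hypothetical heuristic: a sentence that opens with one of these phrases
# is counted as one connector use of "conclusion".
CONNECTOR = re.compile(r"^(in|as a|as) conclusion\b", re.IGNORECASE)
OCCURRENCE = re.compile(r"\bconclusion\b", re.IGNORECASE)

def connector_share(sentences):
    """Percentage of occurrences of 'conclusion' used as a connector."""
    connector = total = 0
    for sentence in sentences:
        hits = len(OCCURRENCE.findall(sentence))
        if hits == 0:
            continue
        total += hits
        if CONNECTOR.match(sentence.strip()):
            connector += 1
    return 100 * connector / total if total else 0.0

examples = [
    "As a conclusion, we can say that the political and cultural unity…",
    "With this idea we reach to the conclusion that a chaos is continually dominating our world",
    "I let it be entirely up to you to make a conclusion",
    "In conclusion, the root of all evil is the choice of the individual",
]
share = connector_share(examples)
```

Applied to these four sentences, the sketch counts two of the four occurrences as connector uses. Any such automatic pass would still have to be checked manually against the concordance lines.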
figure 2: Connector vs. non-connector use of conclusion (percentage of use)

In addition, close scrutiny of the occurrences reveals three types of difficulty experienced by learners when using the word conclusion. First, learners regularly make use of atypical adverbial connectors such as ‘as a conclusion’ or ‘as conclusion’ instead of the usual ‘in conclusion’ (see example 1). In some cases the learner-idiosyncratic connectors are even more frequent than the standard connector. For example, in the FR subcorpus ‘as a conclusion’ accounts for 73% of all the connector uses, as against only 27% for ‘in conclusion’. In a more extended study involving 11 learner populations of the ICLE, Paquot (2010) established that ‘as a conclusion’ represents approximately 40% of the concluding phrasemes involving the noun conclusion. Findings of this type would have been missed by studies focused exclusively on the standard connector. The second difficulty concerns the use of conclusion as a verb argument. Instead of using the typical verbs (come to/reach a conclusion, draw a conclusion, offer a conclusion, etc.), learners often opt for atypical verbal collocates (‘reach to’ in example 2 and ‘make’ in example 3).
(1) As a conclusion, we can say that the political and cultural unity… (FR)
(2) With this idea we reach to the conclusion that a chaos is continually dominating our world (SP)
(3) I let it be entirely up to you to make a conclusion (NO)
(4) In conclusion, the root of all evil is the choice of the individual (LOCNESS)
(5) In conclusion, I want to point out that in my own view, capital punishment… (GE)
(6) As a conclusion I would only like to say that I think that we are overreacting wildly… (SW)
figure 3: Extended metadiscursive sequences in ICLE-FR
Finally, the learners produce extended metadiscursive sequences made up of an adverbial connector containing the word conclusion (as a conclusion or in conclusion) and a stance bundle mostly with the verb to say, which add redundancy and verbosity to their texts. In ICLE-FR, 54% of the total number of occurrences of conclusion occur in this type of sequence and, as shown in Figure 3, the diversity of the sequences is very high.[4] In native writing, the concluding statement usually follows immediately after the connector (see example 4).
Admittedly, extended metadiscursive sequences can also be found in native writing, but they occur in much smaller numbers and never display the kind of metadiscursive overkill to be found in some of the learner sequences (see examples 5 and 6).
4.2. Corpus-driven approach: four-word metadiscursive bundles
In a corpus-driven approach, no a priori assumptions are made about the specific linguistic forms to be investigated. The first stage of the analysis is fully automatic: it involves extraction of all the four-word sequences from the eight subcorpora. This stage is followed by manual selection of metadiscursive sequences, which either have an organisational function, i.e. reflect relationships between prior and coming discourse (e.g. on the other hand), or express stance, i.e. attitude or assessment of certainty (e.g. it is true that) (Biber et al. 2004: 384).

[4] See Paquot (2010: 160-161) for a discussion of this phenomenon based on Gledhill’s (2000) notion of ‘phraseological cascade’.
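The fully automatic first stage, the extraction of all four-word sequences, can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the study: the function name `four_word_sequences`, the simple tokenisation and the decision not to let sequences cross sentence boundaries are all assumptions, and the sample text is invented.

```python
import re
from collections import Counter

def four_word_sequences(text, n=4):
    """Count every contiguous n-word sequence in a text (lower-cased)."""
    counts = Counter()
    # Assumption: sequences do not span sentence-final punctuation.
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z']+", sentence)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Invented sample text containing two of the bundles mentioned above.
sample = ("On the other hand, it is true that prices rise. "
          "On the other hand, wages rise too.")
counts = four_word_sequences(sample)
```

The resulting counter maps each sequence to its raw frequency. The second stage, the selection of the sequences that are metadiscursive, remains a manual task.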
As the aim was to compare the sequences preferred by each population (those which, based on Hasselgren’s (1994) metaphor, are referred to by Ellis (2012) as ‘phrasal teddy bears’), the top 20 sequences were selected from each corpus, resulting in a general list of 81 metadiscursive bundle types (see the appendix for the full list). Before turning to some of the results, some caveats are in order. First, this study is exploratory; its main purpose is to illustrate a methodological approach rather than to establish hard facts. The sequences identified as metadiscursive are in fact potentially metadiscursive, as they have not been subjected to manual disambiguation in context. Second, the corpora are not very large, which may reduce the representativeness of the results. Third, dispersion within the different subcorpora has not been measured. However, the study suggests some interesting – sometimes intriguing – differences between learner populations and native writers which can be used as starting points for more extended studies.
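Because many bundles rank among the top 20 in several corpora, merging eight top-20 lists yields far fewer than 160 types (81 in this study). The following Python sketch shows the merging step on a toy scale. The function name `general_list` is hypothetical, the cut-off is set to two rather than twenty, and the figures are the relative frequencies reported in Table 5 for two subcorpora, used here purely as sample input.

```python
from collections import Counter

def general_list(counts_by_corpus, k=20):
    """Union of the k most frequent sequences of each corpus."""
    merged = set()
    for counts in counts_by_corpus.values():
        merged.update(seq for seq, _ in counts.most_common(k))
    return merged

# Sample input: Table 5 frequencies (per 200,000 words) for FR and NO.
toy = {
    "FR": Counter({"on the other hand": 84, "we can say that": 21,
                   "as a conclusion i": 16}),
    "NO": Counter({"on the other hand": 57, "when it comes to": 40,
                   "as a conclusion i": 3}),
}
merged = general_list(toy, k=2)
```

With a cut-off of two, the two lists share one bundle, so the merged list contains three types rather than four.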
A comparison of the top 20 sequences in each corpus reveals a much higher degree of recurrence in the learner data, the only exception being the German learner subcorpus (see Figure 4). Interestingly, the novice and professional native corpora have a very similar degree of recurrence, which is roughly half that displayed by most of the learner data.
figure 4: Frequency of the top 20 bundles (tokens/200,000 words)
This quantitative difference comes with a range of qualitative differences. As can be seen in Table 4, which lists the top 20 bundles in each corpus, the bundles in the learner data tend to be more clausal (we can say that, it is important to) and more involved, i.e. “reflecting interpersonal interaction and the involved expression of personal feelings and concerns” (Biber et al. 1998: 150). The sequences containing the first person pronouns (I and we) and the possessive determiner my (in bold in the table) are a distinctive feature of learner corpora. Both the novice and the professional native texts tend to contain more phrasal bundles, i.e. noun- rather than verb-based bundles (as a result of, on the basis of) and fewer involved bundles, and generally display a clear tendency towards impersonal stance (it is obvious that, it is important to).
table 4: Top 20 metadiscursive bundles in the learner and native subcorpora
As illustrated in Table 5, there are also marked differences in the frequency of use of individual bundles across learner and native speaker populations. The bundle ‘as a conclusion I’ is a hallmark of French-speaking learners, while ‘we can say that’ is a typical Romance use, hardly found in any of the other corpora. ‘On the other hand’ is a favourite with all learners, but the French- and Spanish-speaking learners make particularly high use of this connector. The sequence ‘when it comes to’ is a Nordic phrasal teddy bear, much used by Norwegian and Swedish learners. In a study based on novice L1 and L2 linguistics research papers, Hasselgård (forthcoming) notes the same overuse in Norwegian learners and attributes it to the phrase ‘når det gjelder’, which is formally and functionally similar to ‘when it comes to’ and is frequently used in Norwegian academic texts.[5]
Differences in preferential use of metadiscursive bundles should come as no surprise, as languages, not only genetically unrelated ones such as English and Arabic (Sultan 2011), but also closely related ones such as English and French (Granger 2014), differ markedly in their use of metadiscourse. Since sequences such as those investigated in this study are relatively inconspicuous, they tend to be transferred by learners into the target language in their entirety.
Metadiscursive bundle   FR  IT  SP  NO  SW  GE  LOC  BNC
on the other hand       84  54  94  57  49  50   23   25
when it comes to         1   1   3  40  53   6   10    0
as a conclusion I       16   0   2   3   2   0    0    0
we can say that         21  10  16   0   1   2    0    2

table 5: Relative frequency (tokens/200,000 words) of four bundles across learner and native populations
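As the subcorpora differ considerably in size, the frequencies in this study are expressed per 200,000 words. The normalisation amounts to a single line of arithmetic, shown below in Python. The corpus sizes are those of Table 3; the function name `per_200k` is hypothetical and the raw count in the usage line is invented for illustration.

```python
# Subcorpus sizes in words, as reported in Table 3.
SIZES = {
    "ICLE-FR": 160_245, "ICLE-GE": 228_180, "ICLE-IT": 199_001,
    "ICLE-NO": 207_230, "ICLE-SP": 127_051, "ICLE-SW": 162_216,
    "LOCNESS": 327_807, "BNC-Acad": 1_027_550,
}

def per_200k(raw_count, corpus, base=200_000):
    """Normalise a raw count to occurrences per `base` words."""
    return raw_count * base / SIZES[corpus]

# An invented raw count of 50 in the smallest and the largest subcorpus.
small = per_200k(50, "ICLE-SP")
large = per_200k(50, "BNC-Acad")
```

The same raw count of 50 corresponds to roughly 79 occurrences per 200,000 words in ICLE-SP but only about 10 in BNC-Acad, which is why raw counts cannot be compared directly across the eight subcorpora.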
A hierarchical cluster analysis (HCA) was also performed in order to obtain an overview of how the various populations compare in terms of their choice of metadiscursive bundles. HCA is an exploratory technique that computes a measure of similarity/dissimilarity between groups and provides a general overview of the dataset by means of a dendrogram. More specifically, the input data for the present HCA is based on the distribution (as a percentage) of the 81 metadiscursive bundles within each corpus. The aim was to establish whether (1) the novice native writers would tend to cluster with the novice non-native writers or with the professional native writers; and (2) whether the Romance and Germanic groups would tend to form distinct clusters. As the dendrogram in Figure 5 illustrates, native writers form their own cluster whatever their degree of academic maturity, which suggests that, in the case of this particular linguistic phenomenon at least, the notion of ‘nativeness’ is still a valid variable, over and above that of ‘noviceness’. The results for the learner writers are more mixed: while the Romance learners cluster together, the Germanic learners are split: the German-speaking learners cluster with the Romance learners, while the Nordic group – Norwegian and Swedish – form a separate cluster closer to the native speakers.

[5] The same explanation holds for Swedish, which has a similar phrase (när det gäller) that is even more frequent than its Norwegian equivalent (information gratefully received from Hilde Hasselgård).
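The principle of the HCA can be illustrated with a small Python sketch. It should not be read as a reconstruction of the analysis reported here: the distance measure (Euclidean) and the linkage method (average) are assumptions, since neither is specified above, the function names are hypothetical, and the input is limited to the four bundles of Table 5 rather than the full set of 81. The toy result therefore need not match Figure 5.

```python
import math
from itertools import combinations

# Relative frequencies of the four bundles in Table 5, per corpus.
TABLE5 = {
    "FR": [84, 1, 16, 21], "IT": [54, 1, 0, 10], "SP": [94, 3, 2, 16],
    "NO": [57, 40, 3, 0], "SW": [49, 53, 2, 1], "GE": [50, 6, 0, 2],
    "LOC": [23, 10, 0, 0], "BNC": [25, 0, 0, 2],
}

def to_percentages(values):
    """Distribution of the bundles within one corpus, as percentages."""
    total = sum(values)
    return [100 * v / total for v in values]

def average_linkage(profiles):
    """Agglomerative clustering; returns the merges in the order made."""
    clusters = [(name,) for name in profiles]

    def distance(a, b):
        # Mean Euclidean distance over all cross-cluster pairs
        pairs = [(x, y) for x in a for y in b]
        return sum(math.dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        # Merge the two clusters that are currently closest
        a, b = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b, distance(a, b)))
    return merges

profiles = {name: to_percentages(values) for name, values in TABLE5.items()}
merges = average_linkage(profiles)
```

Each entry of `merges` records which two clusters were joined and at what distance, which is the information a dendrogram displays graphically. On this four-bundle input the first populations to be joined are the Italian and Spanish learners.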
Figure 5: Hierarchical cluster analysis of the 81 metadiscursive bundles

[5] conclusion
Phraseology is now recognised as a major component in general L2 learning and teaching. In the specialised field of academic literacy, however, the phraseological dimension has yet to establish itself as a core facet. The compilation of phrasal academic lists is a first sign that researchers are becoming aware of this shortcoming and are keen to provide resources to help both learners and teachers. As our analysis of three recently compiled lists of academic phrasemes has shown, researchers have experimented with all kinds of methods to generate the most useful lists. This is a necessary and indeed healthy step, but the results can be quite disconcerting for language practitioners who […] fact, as each of the lists is based on its own set of selection criteria, the units they contain can be quite different and, as a result, should be seen as complementary rather than conflicting. Having lists is not enough, however. I fully agree with Simpson-Vlach & Ellis (2010: 510) that it is essential to organise the lists so as to turn them ‘into something that might usefully inform curriculum or language testing materials’. These two authors pave the way by subdividing the bundles into the three categories of spoken, written and core bundles and, more importantly, by placing them in useful functional categories.
One voice that is rarely heard, however, is that of the learners themselves. As I hope to have demonstrated, learner corpus data is an invaluable resource for identifying the units that are likely to pose problems – whether in terms of misuse, overuse or underuse – for learners in general or for specific learner populations. Learner corpora can be explored in two ways: by searching for a given word that is known or suspected to be problematic, or by letting the corpus speak, i.e. employing computational methods to extract multiword units typically used by learners. Though small-scale and largely exploratory, the two studies I have carried out to illustrate these two methods have revealed interesting aspects of the learner phrasicon. In particular, the results highlight marked differences between L2 learners and native writers, be they novice or professional. The study also shows that L2 learners should not be considered a homogeneous group: while some of the learner-idiosyncratic features are generic, i.e. shared by all the learner populations, most are L1-specific and require differentiated pedagogical attention.
The key issue that is left unaddressed by the present study is how to incorporate all this information into teaching practice. Like Coxhead & Byrd (2012: 19), I am convinced that ‘[t]eaching applications using data sets such as we present here must be mediated. Taking raw data and linguistic techniques into the classroom requires a great deal of care’. I also agree that ‘corpus-based dictionaries and grammars are a wise approach at this time’, but I would go one step further and specify the format in which these dictionaries and grammars should be presented. In my view, the only way to guarantee the kind of flexibility that is needed to adapt the resource to different learner groups is to design customisable web-based environments which learners can turn to when they write academic texts. The Louvain English for Academic Purposes Dictionary (Granger & Paquot 2015), a web-based dictionary-cum-writing aid, which provides a wealth of information on the phraseology of academic words (collocations and lexical bundles), as well as generic and L1-specific warnings against recurrent errors, is a first step in that direction and will hopefully inspire further research in this field.
references
Ackermann, Kirsten & Yu-Hua Chen. 2013. Developing the Academic Collocation List (ACL) – A corpus-driven and expert-judged approach. Journal of English for Academic Purposes 12(4), 235-247.
Biber, Doug, Susan Conrad & Randi Reppen. 1998. Corpus Linguistics. Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Biber, Doug, Susan Conrad & Viviana Cortes. 2004. If you Look at … Lexical Bundles in University Lectures and Textbooks. Applied Linguistics 25, 371-405.
Biber, Doug, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Coxhead, Averil. 2000. A new academic word list. TESOL Quarterly 34(2), 213-238.
Coxhead, Averil. 2008. Phraseology and English for academic purposes. In Fanny Meunier & Sylviane Granger (eds.), Phraseology in Foreign Language Learning and Teaching, 149-161. Amsterdam: Benjamins.
Coxhead, Averil & Patricia Byrd. 2012. Collocations and Academic Word List: The strong, the weak and the lonely. In Isabel Moskowich & Begoña Crespo (eds.), Encoding the Past, Decoding the Future: Corpora in the 21st Century, 1-20. Cambridge: Cambridge Scholars Publishing.
Durrant, Philip. 2009. Investigating the viability of a collocation list for students of English for academic purposes. English for Specific Purposes 28, 157-169.
Ebeling, Signe Oksefjell & Hilde Hasselgård. 2015. Learner corpora and phraseology. In Sylviane Granger, Gaëtanelle Gilquin & Fanny Meunier (eds.), The Cambridge Handbook of Learner Corpus Research, 207-229. Cambridge: Cambridge University Press.
Ellis, Nick. 2012. Formulaic language and second language acquisition: Zipf and the phrasal teddy bear. Annual Review of Applied Linguistics 32, 17-44.
Gardner, Dee & Mark Davies. 2013. A New Academic Vocabulary List. Applied Linguistics 35, 305-327.
Gilquin, Gaëtanelle, Sylviane Granger & Magali Paquot. 2007. Learner corpora: the missing link in EAP pedagogy. Journal of English for Academic Purposes 6(4), 319-335.
Gledhill, Chris. 2000. Collocations in Science Writing. Language in Performance 22. Tuebingen: Gunter Narr Verlag.
Granger, Sylviane. 2014. A lexical bundle approach to comparing languages: Stems in English and French. Languages in Contrast 14(1), 58-72.
Granger, Sylviane. 2015. Contrastive interlanguage analysis: A reappraisal. International Journal of Learner Corpus Research 1(1), 7-24.
Granger, Sylviane, Estelle Dagneaux, Fanny Meunier & Magali Paquot. 2009. The International Corpus of Learner English. Handbook and CD-ROM. Version 2. Louvain-la-Neuve: Presses universitaires de Louvain.
Granger, Sylviane & Magali Paquot. 2015. Electronic lexicography goes local: Design and structures of a needs-driven online academic writing aid. Lexicographica - International Annual for Lexicography 31(1), 118-141.
Hasselgård, Hilde. Forthcoming. Phraseological teddy bears: frequent lexical bundles in academic writing by Norwegian learners and native speakers of English. To appear in Michaela Mahlberg and Viola Wiegand (eds), Corpus Linguistics, Context and Culture. Berlin: De Gruyter Mouton.
Hasselgren, Angela. 1994. Lexical teddy bears and advanced learners: a study into the ways Norwegian students cope with English vocabulary. International Journal of Applied Linguistics 4(2), 237-258.
Hyland, Ken. 2005. Metadiscourse. London: Continuum.
Martinez, Ron & Norbert Schmitt. 2012. A phrasal expressions list. Applied Linguistics 33(3), 299-320.
Mel’čuk, Igor. 1998. Collocations and lexical functions. In Anthony Cowie (ed.), Phraseology. Theory, Analysis, and Applications, 23-53. Oxford: Oxford University Press.
Nesselhauf, Nadja. 2005. Collocations in a Learner Corpus. Amsterdam: John Benjamins.
Paquot, Magali. 2010. Academic Vocabulary in Learner Writing. From Extraction to Analysis. London & New York: Continuum.
Paquot, Magali. 2013. Lexical bundles and L1 transfer effects. International Journal of Corpus Linguistics 18(3), 391-417.
Paquot, Magali & Sylviane Granger. 2012. Formulaic language in learner corpora. Annual Review of Applied Linguistics 32, 130-149.
Römer, Ute. 2009. English in Academia: Does nativeness matter? Anglistik: International Journal of English Studies 20(2), 89-100.
Simpson-Vlach, Rita & Nick Ellis. 2010. An academic formulas list: New methods in phraseology research. Applied Linguistics 31(4), 487-512.
Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Sultan, Abbas. 2011. A Contrastive Study of Metadiscourse in English and Arabic Linguistics Research Articles. Acta Linguistica 5(1), 28-41.
Wanner, Leo, Margarita Alonso Ramos, Orsolya Vincze, Rogelio Nazar, Gabriela Ferraro, Estela Mosqueira & Sabela Prieto. 2013. Annotation of collocations in a learner corpus for building a learning environment. In Sylviane Granger, Gaëtanelle Gilquin & Fanny Meunier (eds.), Twenty Years of Learner Corpus Research. Looking Back, Moving Ahead, 493-503. Louvain-la-Neuve: Presses universitaires de Louvain.
West, Michael. 1953. A General Service List of English Words. London: Longman.
appendix: list of 81 metadiscursive sequences

contact
Sylviane Granger
Université catholique de Louvain
Fjeld, Hagen, Henriksen, Johansson, Olsen & Prentice (eds.) Academic Language in a Nordic Setting – Linguistic and Educational Perspectives, Oslo Studies in Language 9(3), 2017. 29–44.
(ISSN 1890-9639) http://www.journals.uio.no/osla
academic vocabulary in teacher talk: challenges and opportunities for pedagogy
AVERIL COXHEAD

abstract
This article focuses on the opportunities and challenges afforded by teacher talk in Grade Six (10- and 11-year-old students) English as an Additional Language, Maths, and Science classes in an international school context in Germany. Teachers recorded their classroom discourse for one week of classes three times in one academic year in each subject. The data shows that high frequency vocabulary prevails in all three subject areas, and Science has a higher vocabulary load than the other two subjects overall. The amounts of academic vocabulary, measured by Coxhead’s (2000) Academic Word List, and of science vocabulary, measured by Coxhead & Hirsh’s (2007) EAP Science List, were lower in the teacher talk than in secondary school textbooks. This means that teacher talk is lexically easier than textbooks. Over the course of the year, the vocabulary load of the teacher talk increases in all three subjects. This article looks at opportunities and challenges presented by the lexis of teacher talk in these subjects for second and foreign language students in these classes and their teachers. Suggestions for further research are presented by way of a conclusion.
[1] introduction
In an international school context, learners are expected to be able to cope with the English language, the subject content and the educational context at the same time. For some students this will be a relatively easy task, for others it will be a significant challenge. A class of Grade Six students in their first year of high school in an international school in Germany may well contain native speakers of English as well as non-native speakers. Any of these students might well have been educated in several countries that may or may not be part of Anglo-European schooling over their lifetimes and speak one or several languages at home. Their teachers could be non-native or native speakers of English.
A key element in classroom learning contexts is the vocabulary needed to understand written texts, including textbooks, as well as the spoken texts of the classroom, such as teacher talk. These written and spoken sources of input are vital elements of education for content and language learning. Studies into vocabulary in secondary school contexts tend to have focused on written texts, through corpus-based studies of textbooks such as Greene & Coxhead’s (2015) study of Middle School vocabulary in textbooks in the USA and Coxhead, Stevens & Tinkle’s (2010) analysis of a series of Science textbooks in Aotearoa/New Zealand. Very few studies have focused on the vocabulary of teacher talk. Gibbons (2006: 1) draws attention to the importance of talk in language learning contexts when she writes,
… [t]he talk of teachers and students draws together – or bridges – the ‘everyday’ language of students learning through English as a second language, and the language associated with the academic registers of school which they must learn to control.
If learners in a secondary school classroom have a large vocabulary in English, then dealing with the vocabulary load of teacher talk is presumably less problematic than for those with a smaller vocabulary.
Unfortunately, studies have shown low vocabulary scores for high school learners of English as a foreign language in a range of countries, including Indonesia (Nurweni & Read, 1999), Denmark (Henriksen & Danelund, 2015), Taiwan (Webb & Chang, 2012), and Spain (Olmos, 2009). The Danish, Taiwanese and Spanish studies used the Vocabulary Levels Test (VLT) (Nation, 1983; Schmitt, Schmitt & Clapham, 2001). The VLT is a receptive, frequency-based test with five sections: high frequency vocabulary in the first 2000 and 3000 words, through to lower frequency levels at 5000 and 10000, plus an academic section based on Coxhead’s Academic Word List (AWL). Out of the 30 items at each level of the test, test takers need to score 26 to indicate mastery of that level. In the international school in Germany reported on in this study, over 100 non-native speaking Grade 6 students scored below mastery of the first 2000 words of English (under review). These results suggest that the students would not be able to cope with the vocabulary load of the textbooks which were written for first language readers. It is important to investigate teacher talk to find out whether this form of input is a bridge between everyday English and lexically difficult textbooks.
[1.1] Teacher talk and vocabulary
Several studies which have focused on teacher talk have concluded that it is lexically poor (Horst, 2010; Tang, 2011). Horst (2010) used a corpus of 121,000