Learners' and native speakers' use of recurrent word-combinations across disciplines


  • Signe Oksefjell Ebeling, University of Bergen
  • Hilde Hasselgård




n-grams, recurrent word combinations, academic writing, disciplinary variation, functional analysis, Norwegian learners of English


This paper compares the use of recurrent word-combinations (n-grams) in texts produced by Norwegian learners of English and native speakers of English in two academic disciplines, namely linguistics and business. The study explores the extent to which the same n-grams are used by learners and native speakers in the two disciplines. Using an adapted version of Moon's (1998) functional framework, we map the functions of the n-grams, distinguishing between three major functions: ideational/informational, interpersonal and textual. The n-grams are extracted from the VESPA and BAWE corpora, representing learner and native language, respectively. The data reveal a complex picture. Informational n-grams are by far the most frequent type and they seem to be not only discipline-specific, but also topic-specific. There are more n-grams with an interpersonal function (evaluative and modalizing) in the linguistics than in the business discipline. Frequencies of n-grams with a textual/organizational function are more similar across the material. However, there is relatively little overlap in the use of individual n-grams with interpersonal and textual functions across the L1 groups. There is a higher degree of similarity between learners and native speakers in the linguistics discipline than in the business discipline. On the other hand, there is some similarity across disciplines within L1 groups as regards interpersonal and textual n-grams.
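As a rough illustration of the kind of comparison the abstract describes, the sketch below extracts recurrent n-grams from two small sets of texts and measures how many of them are shared. It is not the authors' procedure: the whitespace tokenization, the frequency and dispersion thresholds (`min_freq`, `min_texts`), the function names, and the toy sentences are all assumptions made for this example, and the real VESPA and BAWE data are not used.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def recurrent_ngrams(texts, n=3, min_freq=2, min_texts=2):
    """Keep n-grams that occur at least min_freq times overall and in at
    least min_texts different texts (hypothetical thresholds)."""
    freq = Counter()    # total occurrences across all texts
    spread = Counter()  # number of texts each n-gram occurs in
    for text in texts:
        grams = ngrams(text.lower().split(), n)  # naive whitespace tokens
        freq.update(grams)
        spread.update(set(grams))
    return {g: c for g, c in freq.items()
            if c >= min_freq and spread[g] >= min_texts}


def overlap(a, b):
    """Return the shared n-grams and their share of the combined inventory."""
    shared = set(a) & set(b)
    union = set(a) | set(b)
    return shared, (len(shared) / len(union) if union else 0.0)


# Toy stand-ins for a learner sample and a native-speaker sample.
learner_texts = ["on the other hand the results show that",
                 "the results show that on the other hand"]
native_texts = ["on the other hand it can be seen",
                "it can be seen on the other hand"]

learner = recurrent_ngrams(learner_texts)
native = recurrent_ngrams(native_texts)
shared, ratio = overlap(learner, native)
print(sorted(shared), round(ratio, 2))
```

The functional classification (informational, interpersonal, textual) is a manual analytical step in the study and is deliberately left out of the sketch.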


Ädel, A. and B. Erman. 2012. Recurrent word combinations in academic writing by native speakers and non-native speakers of English: A lexical bundles approach. English for Specific Purposes 31 (2): 81–92.

Ädel, A. and U. Römer. 2012. Research on advanced student writing across disciplines and levels. Introducing the Michigan Corpus of Upper-level Student Papers. International Journal of Corpus Linguistics 17 (1): 3–34.

Alsop, S. and H. Nesi. 2009. Issues in the development of the British Academic Written English (BAWE) corpus. Corpora 4 (1): 71–83.

Altenberg, B. 1998. On the phraseology of spoken English: The evidence of recurrent word-combinations. In Phraseology. Theory, analysis, and applications, ed. A.P. Cowie, 101–122. Oxford: Oxford University Press.

Biber, D. 2006. University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.

Biber, D. and F. Barbieri. 2007. Lexical bundles in university spoken and written registers. English for Specific Purposes 26 (3): 263–286.

Biber, D. and S. Conrad. 1999. Lexical bundles in conversation and academic prose. In Out of corpora: Studies in honour of Stig Johansson, eds. H. Hasselgård and S. Oksefjell, 181–190. Amsterdam: Rodopi.

Biber, D., S. Conrad, and V. Cortes. 2004. 'If you look at…': Lexical bundles in university teaching and textbooks. Applied Linguistics 25: 371–405.

Biber, D. and B. Gray. 2011. The historical shift of scientific academic prose in English towards less explicit styles of expression. In Researching specialized languages, eds. V. Bhatia, P. Sánchez Hernández, and P. Pérez-Paredes, 11–24. Amsterdam: John Benjamins Publishing Company.

Biber, D., S. Johansson, G. Leech, S. Conrad, and E. Finegan. 1999. Longman grammar of spoken and written English. London: Longman.

Chen, Y.-H. and P. Baker. 2010. Lexical bundles in L1 and L2 academic writing. Language Learning and Technology 14 (2): 30–49.

Cortes, V. 2004. Lexical bundles in published and student disciplinary writing: Examples from history and biology. English for Specific Purposes 23 (4): 397–423.

Cortes, V. 2006. Teaching lexical bundles in the disciplines: An example from a writing intensive history class. Linguistics and Education 17 (4): 391–406.

Coxhead, A. 2000. A new academic word list. TESOL Quarterly 34 (2): 213–238.

Culpeper, J. and M. Kytö. 2002. Lexical bundles in Early Modern English dialogues: a window into the speech-related language of the past. In Sounds, words, texts and change: selected papers from 11 ICEHL, Santiago de Compostela, 7-11 September 2000, eds. T. Fanego, B. Méndez-Naya, and E. Seoane, 45–64. Amsterdam: John Benjamins.

Ebeling, S.O. 2011. Recurrent word-combinations in English student essays. Nordic Journal of English Studies 10 (1): 49–76.

Ebeling, S.O. and A. Heuboeck. 2007. Encoding document information in a corpus of student writing: The British Academic Written English corpus. Corpora 2 (2): 241–256.

Ebeling, S.O. and P. Wickens. 2012. Interpersonal themes and author stance in student writing. In English corpus linguistics: Looking back, moving forward. Papers from the 30th international conference on English language research on computerized corpora (ICAME 30), eds. S. Hoffmann, P. Rayson, and G. Leech, 23–40. Amsterdam: Rodopi.

Gilquin, G. and M. Paquot. 2008. Too chatty: Learner academic writing and register variation. English Text Construction 1 (1): 41–61.

Granger, S. 1996. From CA to CIA and back: An integrated approach to computerized bilingual and learner corpora. In Languages in contrast. Papers from a symposium on text-based cross-linguistic studies, Lund 4-5 March 1994, eds. K. Aijmer, B. Altenberg, and M. Johansson, 37–51. Lund Studies in English 88. Lund: Lund University Press.

Granger, S. 1998. Prefabricated patterns in advanced EFL writing: Collocations and formulae. In Phraseology. Theory, analysis, and applications, ed. A.P. Cowie, 145–160. Oxford: Oxford University Press.

Groom, N. 2005. Pattern and meaning across genres and disciplines: An exploratory study. Journal of English for Academic Purposes 4 (3): 257–277.

Halliday, M.A.K. 2004. An introduction to functional grammar. 3rd ed., revised by C.M.I.M. Matthiessen. London: Arnold.

Hasselgård, H. 2012. Facts, ideas, questions, problems, and issues in advanced learners' English. Nordic Journal of English Studies 11 (1): 22–54.

Hasselgård, H. In press. Discourse-organizing metadiscourse in novice academic English. To appear in Corpus linguistics on the move: Exploring and understanding English through corpora, eds. M.J. López-Couso, B. Méndez-Naya, P. Núñez-Pertejo, and I. Palacios. Amsterdam: Brill / Rodopi.

Heuboeck, A., J. Holmes, and H. Nesi. 2008. The BAWE Corpus Manual. University of Warwick, University of Reading, Oxford Brookes University.

Hyland, K. 2008. As can be seen. Lexical bundles and disciplinary variation. English for Specific Purposes 27 (1): 4–21.

Lie, J. 2013. 'The fact that the majority seems to be...': A corpus-driven investigation of lexical bundles in native and non-native academic English. Master’s thesis, University of Oslo. [available at www.duo.uio.no]

Meunier, F. and S. Granger (Eds.). 2008. Phraseology in foreign language learning and teaching. Amsterdam: John Benjamins Publishing Company.

Moon, R. 1998. Fixed expressions and idioms in English. A corpus-based approach. Oxford: Clarendon Press.

O'Donnell, M.B., U. Römer, and N.C. Ellis. 2013. The development of formulaic sequences in first and second language writing. Investigating effects of frequency, association, and native norm. International Journal of Corpus Linguistics 18 (1): 83–108.

Paquot, M. 2013. Lexical bundles and L1 transfer effects. International Journal of Corpus Linguistics 18 (3): 391–417.

Paquot, M., S.O. Ebeling, A. Heuboeck, and L. Valentin. 2010. The VESPA tagging manual. CECL, Université catholique de Louvain.

Paquot, M., H. Hasselgård, and S.O. Ebeling. 2013. Writer/reader visibility in learner writing across genres. A comparison of the French and Norwegian components of the ICLE and VESPA learner corpora. In Twenty years of learner corpus research: Looking back, moving ahead, eds. S. Granger, G. Gilquin, and F. Meunier, 377–387. Louvain-la-Neuve: Presses universitaires de Louvain.

Pawley, A. and F. H. Syder. 1983. Two puzzles for linguistic theory: Nativelike selection and nativelike fluency. In Language and communication, eds. J. C. Richards and R. W. Schmidt, 191–226. London: Longman.

Petch-Tyson, S. 1998. Writer/reader visibility in EFL written discourse. In Learner English on Computer, ed. S. Granger, 107–118. London: Longman.

R Development Core Team. 2013. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. ISBN 3-900051-07-0.


Ringbom, H. 1998. Vocabulary frequencies in advanced learner English: A cross-linguistic approach. In Learner English on Computer, ed. S. Granger, 41–52. London: Longman.

Scott, M. 2012. WordSmith Tools version 6. Liverpool: Lexical Analysis Software.

Scott, M. and C. Tribble. 2006. Textual patterns: Key words and corpus analysis in language education. Amsterdam / Philadelphia: John Benjamins.

Stubbs, M. and I. Barth. 2003. Using recurrent phrases as text-type discriminators: A quantitative method and some findings. Functions of Language 10 (1): 61–104.

Wang, Y. 2013. Delexical verb + noun collocations in Swedish and Chinese learner English. Doctoral dissertation, Uppsala University.




How to Cite

Ebeling, Signe Oksefjell, and Hilde Hasselgård. 2015. “Learners’ and Native speakers’ Use of Recurrent Word-Combinations across Disciplines”. Bergen Language and Linguistics Studies 6 (May). https://doi.org/10.15845/bells.v6i0.810.