Investigation of EFL learners' discourse functions of lexical bundles: A case of written register

The purpose of the study is to examine the discourse functions of lexical bundles in written university registers. Lexical bundles which are frequently found in written academic discourse are examined in order to analyze their functions in this register, comparing the frequency of each function. As the written register, theses of 72 EFL students were randomly chosen. The categorization was constructed based on the functional framework of lexical bundles in discourse offered by Biber et al. Every 4-word sequence in the theses was identified. The most frequent 4-word bundles in the theses were categorized and discussed accordingly.


INTRODUCTION
The term 'lexical bundle' comes from the field of corpus linguistics. It was defined as "the most frequent recurring lexical sequences which can be regarded as extended collocations: sequences of three or more words that show a statistical tendency to co-occur (e.g., in the case of the)" (Biber and Conrad, 1999). The concept of lexical bundles goes back to Salem (1987) and the research he carried out on a corpus of French government texts. Butler (1997) and Altenberg (1998) subsequently employed the notion in their investigations based on Spanish and English corpora. However, the term 'lexical bundle' first explicitly appeared in the Longman Grammar of Spoken and Written English (Biber and Conrad, 1999), a monumental work entirely based on the British National Corpus of 100 million words.
In addition to Biber and Conrad's term, these multi-word items have been studied under different names, such as fixed expressions (Moon, 1998), lexical phrases (Nattinger and DeCarrico, 1992), pre-fabs, ready-made units, routines, and formulas, with different criteria used to define and identify them. Although these sequences were given different names and were investigated in terms of their structure, their use in discourse came to the fore much later. They started to be seen as an especially productive approach to describing discourse in different contexts, such as university registers, academic registers (Biber and Conrad, 1999; Biber and Barbieri, 2007; Hyland, 2008) or the language of journalistic prose (Cowie, 1992). The concept of lexical bundles has been used in several later studies (Biber et al., 2004; Cortes, 2004) to investigate common multi-word items in discourse. Different approaches have been developed, using different criteria for the identification of multi-word sequences. For example, some studies describe multi-word sequences that are idiomatic (e.g. expressions like in a nutshell), while other studies focus on sequences that are non-idiomatic but perceptually salient (e.g. you're never going to believe this). Moreover, different modes have also been used to examine these multi-word sequences. Considerable attention has been given to the ways in which formulaic language can be studied in spoken and written discourse (Biber et al., 2004; Cowie, 1992; Moon, 1998). These findings contrast with previous analyses, which regarded the use of prefabs as a characteristic of spoken registers (Pawley and Syder, 1983). Lexical bundles were noted to be present in written and spoken registers alike, and they were considered "basic building blocks for constructing spoken and written discourse" (Biber and Conrad, 1999, p. 188). Moreover, further research found that lexical bundles are surprisingly common in certain written registers (Biber and Barbieri, 2007).
E-mail: universed2@gmail.com. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.
As shown in the studies mentioned above, lexical bundles are important building blocks in discourse. By developing a detailed taxonomy, Biber et al. (2004) found that the three main functions lexical bundles serve in discourse include expressing stance, organizing discourse and referential expressions. They provide a kind of pragmatic 'head' for larger phrases and clauses, where they function as discourse frames for the expression of new information. In this way, lexical bundles provide interpretive frames for the developing discourse (Biber and Barbieri, 2007).
Stance bundles express attitude or assessment, discourse organizers reflect the relationships between different parts of texts, and referential expressions refer to physical or abstract entities, or to other textual parts. Biber et al. (2004) noted that each of these main categories has several sub-categories which are associated with more specific discourse functions. The present study focuses on each of these main functions to examine their use in written university registers. Lexical bundles which are frequently found in written academic language are examined in order to analyze their functions in this register, comparing the frequency of each function.

Discourse functions of lexical bundles
Lexical bundles serve various discourse functions and differ from one another in their functional characteristics. These functions are provided in Table 1.
As can be seen from Table 1, the main functional categories of lexical bundles are stance bundles, discourse organizers, and referential bundles. Each category is discussed further in the following sections.

Stance bundles
Stance bundles, the first major category, express attitudes or assessments of certainty that frame some other proposition. There are five functional sub-categories of stance bundles: epistemic bundles, desire bundles, obligation (directive) bundles, intention/prediction bundles, and ability bundles:

Epistemic stance bundles
These bundles express some degree of certainty.

Desire bundles
These are the bundles including certain words expressing hope or wish such as want, expect, etc.
I don't want to deliver bad news to her.
I want you to take out a piece of paper and jot some notes down . . .

Obligation (directive) bundles
These are the bundles including modals of obligation and necessity such as have to, must, etc.

Intention/prediction bundles
These are the bundles including modals of intention and prediction such as be going to, will, etc.
right now what we're going to take a look at are ones that are [. . .] positive and beneficial.

Ability bundles
These are the bundles including modals of ability such as be able to, can, etc.
I want you to be able to name and define those four curriculum category.

Discourse organizers
Discourse organizing bundles, the second major category, are used to indicate the overall discourse structure and to signal the informational status of statements. They fall into three functional sub-categories: topic introduction, topic elaboration/clarification, and identification/focus:

Topic introduction bundles
Bundles of this kind introduce a new topic.
What I want to do is quickly run through the exercise . . .

Topic elaboration/clarification bundles
These bundles help one exemplify or clarify a topic.

Identification/focus bundles
Such bundles help one identify or focus on a certain topic.
For those of you who came late I have the, uh, the quiz.

Referential bundles
The third major category is referential bundles that generally identify an entity or single out some particular attribute of an entity as especially important. There are three functional sub-categories included under referential bundles: imprecision bundles, bundles specifying attributes, and time/place/text-deixis bundles:

Imprecision bundles
These bundles refer to something in a way which is not clear or exact.
I think really we now have what about, six weeks left in class or something like that.

Bundles specifying attributes
These bundles are in attributive position, usually including quantifiers.

It creates a little bit of wealth.
These figures give an idea of the size of the ethnological community in Russia.
. . . students must define and constantly refine the nature of the problem . . .

Time/place/text-deixis bundles
Such bundles refer to the concept of time, place or text, usually including prepositions.

Lexical bundles in literature
Results of corpus-driven research motivated a growing number of studies that explored different structural types of recurrent multi-word chunks in various kinds of written corpora: recurrent word combinations (Altenberg, 1993), prefabricated patterns (Granger, 1998), phrasal lexemes (Moon, 1998), highly recurrent word combinations (De Cock, 2000), and lexical bundles (Biber and Conrad, 1999; Biber et al., 1999), all of which are identified in a language on the basis of their frequency of co-occurrence. Biber (1999) compared lexical bundles in conversation and academic prose, while Cortes et al. (2005) investigated lexical bundles in student disciplinary writing (history and biology). They all concluded that lexical bundles function as basic building blocks of discourse. Cortes (2002) compared the use of lexical bundles in corpora of freshman writing, academic prose and conversation in her study in Freshman Composition. She inferred that the main function of lexical bundles is to establish discourse relationships, which helps students speak and write with greater fluency and accelerates the language acquisition process. Hyland (2008) explored the forms, structures and functions of 4-word bundles in a 3.5 million word corpus of research articles, doctoral dissertations and Master's theses in four disciplines. He observed that the common use of lexical bundles such as 'it has been noted that' in academic written genres helps to signal the text register to readers and reduces processing time by using familiar patterns to link elements of new information. Thus, text receivers are able to sort out what is natural from what is merely grammatical and judge whether a particular bundle 'sounds right' in that context. Therefore, 'as can be seen' is a frequent and unremarkable bundle in academic writing, while the equally possible 'as you can see' or 'as can be observed' are rarely encountered. Analyzing a range of written corpora, other researchers suggested that, in addition to already established classes of formulaic sequences (sayings, proverbs, speech formulae, and idioms), a large number of language units co-occur in a preferred order without being governed by specific grammar rules (Altenberg, 1993; Biber and Conrad, 1999; Granger, 1998; Moon, 1998; Sinclair, 1999; Wray, 2002).
In their study, Biber and Barbieri (2007) investigated the use of lexical bundles in written registers and found that lexical bundles are very common in written course management. They included both instructional registers and student advising/management registers such as office hours, class management talk, written syllabi, etc. They suggested that both course management and institutional writing use a large number of lexical bundle types. However, the functional distribution of bundles was strikingly different in these two registers. In written course management, over half of all bundle types were stance bundles. Referential bundles were relatively common in that register. On the other hand, in institutional writing, almost 70% of all bundle types were referential; stance bundles and discourse organizers were considerably less common in that register.

Research questions
As an exploration of the discourse functions of lexical bundles in written registers, this study addresses the following four research questions:
1. Do all of the three functions exist in the written register?
2. If so, among the three main discourse functions of lexical bundles, which is the most frequent one in written registers?
3. Which sub-category is the most frequent one among the three main discourse functions?
4. What are the most frequent lexical bundles creating each main discourse function?

METHODOLOGY
The present study parallels these earlier studies in that it also investigates the frequency of these bundles. However, it goes one step further by examining the frequencies of bundles with respect to their discourse functions and exploring the most common ones. More specifically, the study examines a written register: theses of 72 EFL students, submitted as a final project required in the course "research methods".
With respect to participants, all of them were from the Foreign Language Teaching Department of the university. All of the participants had two years of education in the EFL department. Among them, 44 were females and 28 were males. Their ages ranged from 19 to 22 (mean: 20.5).
As for data collection, in order to investigate lexical bundles in one disciplinary domain, educational sciences, 10 theses were selected from the EFL Department. All studies were submitted between 2013 and 2014. First, the theses were collected and their introduction and conclusion sections were taken into account. The categorization was based on the functional framework of lexical bundles in discourse mentioned earlier (Biber et al., 2004). Every 4-word sequence in the theses was identified.
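The identification step described above can be illustrated with a short sketch. It is not the procedure actually used in the study, which is not specified in detail; the tokenization rule, the function names, the sample sentences and the cut-off `min_freq` are all assumptions made for illustration. Note also that this simple version lets sequences run across punctuation and sentence boundaries.

```python
import re
from collections import Counter


def four_word_sequences(text):
    """Return every contiguous 4-word sequence in a text, lowercased.

    Tokenization is an assumption: words are runs of letters,
    optionally with one internal apostrophe (e.g. "don't").
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [" ".join(tokens[i:i + 4]) for i in range(len(tokens) - 3)]


def count_bundles(texts, min_freq=2):
    """Count 4-word sequences over all texts and keep the recurrent ones.

    min_freq is a hypothetical cut-off; the study does not report one.
    """
    counts = Counter()
    for text in texts:
        counts.update(four_word_sequences(text))
    return {seq: n for seq, n in counts.items() if n >= min_freq}


# Invented sample sentences, not taken from the theses.
sample_texts = [
    "On the other hand, the results of the study were mixed.",
    "The results of the survey, on the other hand, were clear.",
]
bundles = count_bundles(sample_texts)
for sequence, frequency in sorted(bundles.items()):
    print(sequence, frequency)
```

In this toy input only two sequences recur, so only those two are kept as candidate bundles; every other 4-word sequence occurs once and is discarded.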
When it comes to data analysis, lexical bundles were identified by two scorers. One of them was the researcher and the other was another instructor. This determined the quantitative mode of the research. To limit the scope of the investigation, only recurrent 4-word sequences were analyzed in detail. They were classified according to their function. A frequency count was then applied to compare the lexical bundles in the theses of the selected disciplinary domain. The main categorization was formed first and was followed by sub-categorization. The function of each bundle was assigned by the two scorers, both of whom used the categorization of Biber et al. (2004) to code the data. They showed an 81% agreement rate in the categorization of bundles. They then met to reach agreement on the bundles they had initially coded differently.
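How an agreement rate of this kind can be obtained is shown in the sketch below. It assumes simple percent agreement (the share of bundles to which both scorers assigned the same category); the study does not state which agreement measure was used, and the function name and the five example codings are invented for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items to which two scorers assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both scorers must code the same number of items.")
    if not codes_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical codings of five bundles by two scorers.
scorer_1 = ["referential", "stance", "discourse", "referential", "discourse"]
scorer_2 = ["referential", "stance", "referential", "referential", "discourse"]

agreement = percent_agreement(scorer_1, scorer_2)

# Positions where the scorers differ and would need to reach a joint decision.
disagreements = [i for i, (a, b) in enumerate(zip(scorer_1, scorer_2)) if a != b]
print(agreement, disagreements)
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected measure would be a different design choice.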

Distribution of lexical bundles
Table 2 displays the distribution of bundle types in the written register according to their frequencies. At this stage, their sub-functions are not considered; the bundles are ordered by frequency only.
Table 2 shows that the most frequent 4-word bundles recur often in the theses. Among these bundles, the most common ones are: was one of the (10.1%), one of the most (8.4%), the results of the (7.6%), as part of the (5.9%), on the other hand (5%), and the fact that the (5%), respectively.

Functional analysis
Table 3 shows the frequency of each main function. In academic writing, referential bundles are extremely common (50.8%). They are followed by discourse organizing bundles (41.5%) and stance bundles (7.6%).

It can be seen from Table 3 that in the first main category, stance bundles, the only sub-category for which lexical bundles were found is "epistemic bundles". The second main category, discourse organizers, includes three sub-categories, and bundles were detected for all of them. The final main category, referential bundles, has only one sub-category, imprecision bundles, for which no bundles were found. Each category, along with its sub-categories, is discussed separately in the next section.

Stance expressions
Table 4 shows that, among the sub-categories of stance expressions, epistemic stance bundles are the only ones detected. The other sub-categories were not detected in the theses.
Among the stance bundles, 'can be used to' is the most common one (2.5%).

Discourse organizers
Table 5 shows that topic introduction bundles and identification/focus bundles are the frequent ones. Topic elaboration/clarification bundles are much less common.
Among the discourse organizers, 'one of the most' is the most frequent one (8.4%).
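The relative weight of the three sub-categories can be checked with a short calculation. The sketch below is illustrative only: it takes the sub-category frequencies reported for discourse organizers in this study (25, 2 and 22) and expresses each as a share of the category total. The function name and the choice of denominator are assumptions; the percentages quoted in the text appear to be computed against all bundles rather than within a single category, so the figures are not directly comparable.

```python
def category_shares(frequencies):
    """Convert raw frequencies into percentages of their own total."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("Frequencies must not sum to zero.")
    return {name: round(100 * n / total, 1) for name, n in frequencies.items()}


# Frequencies reported for the discourse-organizer sub-categories.
discourse_organizers = {
    "topic introduction": 25,
    "topic elaboration/clarification": 2,
    "identification/focus": 22,
}

shares = category_shares(discourse_organizers)
for name, share in shares.items():
    print(f"{name}: {share}%")
```

Within the category, topic introduction and identification/focus bundles together account for nearly all occurrences, which matches the description above.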

Referential expressions
Table 6 shows that bundles specifying attributes are the most common (11). They are followed by time/place/text-deixis bundles (5). Imprecision bundles were not detected in the theses.
Among the referential bundles, 'was one of the' is the most common one (10.1%).

DISCUSSION
The first research question investigated whether all of the three functions exist in the written register. The results in Table 3 showed that all of the three functions exist to some extent.
The next question asked for the most frequent main function in the written register. The results revealed that referential bundles are the most frequent main function (16). They are followed by discourse organizing bundles (11) and stance bundles (4).
The third research question went one step further by asking which sub-category is the most common. The results suggested that among the sub-categories of stance expressions, epistemic stance bundles are the most frequent ones. With respect to discourse organizers, topic introduction bundles and identification/focus bundles are the frequent ones. Finally, in referential expressions, bundles specifying attributes are the most common.

Discourse organizers (freq)
Topic introduction bundles (25): the results of the, as a result of, on the other hand, with respect to the, in the form of
Topic elaboration/clarification bundles (2): as well as the
Identification/focus bundles (22): it is important to, these results suggest that, to that of the, one of the most, the fact that the
Referential bundles
Bundles specifying attributes: the aim of the, the nature of the, the relationship between the, the number of the, in each of the, was one of the, as part of the, in the case of, on the basis of, to the extent that, the role of the
Time/place/text-deixis bundles (15): at the end of, in the present study, the end of each, at the time of, at the beginning of
The last research question was concerned with the most frequent lexical bundles creating each main discourse function. As for stance expressions, 'can be used to' (3) is the most frequent one. With respect to discourse organizers, 'one of the most' (6) is the most common one.
In the States are not formally employed in farm work . . .
She's in that.. uh.. office down there.. at the end of the hall . . .
As shown in Figure 4.4, . . .

Table 2. Most frequent 4-word bundles in the theses.

Table 3. Overall frequency of each main category and its sub-functions.

Table 4. Overall frequency of each sub-category related to stance expressions.

Table 5. Overall frequency of each sub-category related to discourse organizers.

Table 6. Overall frequency of each sub-category related to referential expressions.